Metadata retrieval
chaininglib.search.metadata.get_available_metadata(resource_name, resource_type=None)
Return all possible metadata fields for a lexicon or corpus.
Parameters:
- resource_name – Name of the lexicon or corpus
- resource_type – (optional) Either 'lexicon' or 'corpus'. Use it to disambiguate when the same name refers to both a lexicon and a corpus
Returns: For a corpus, a dictionary with the keys 'document' and 'token', each holding a list of metadata field names; for a lexicon, a single list of metadata field names
>>> corpus_metadata = get_available_metadata("zeebrieven")
>>> print(corpus_metadata)
{'document': ['aantal_paginas', 'aantal_woorden', ..., 'witnessYear_from', 'witnessYear_to'], 'token': ['word', 'lemma', 'pos', 'punct', 'starttag']}
>>> lexicon_metadata = get_available_metadata("molex")
>>> print(lexicon_metadata)
['lemEntryId', 'lemma', 'lemPos', 'wordformId', 'wordform', 'hyphenation', 'wordformPos', 'Gender', 'Number']
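Because the return value is a dictionary for a corpus and a plain list for a lexicon, calling code may want to handle both shapes uniformly. The following is a minimal sketch of one way to do that. It is not part of chaininglib: the helper name flatten_metadata_fields and the 'lexicon' level label are hypothetical, and the sample values are shortened from the example output so that the block runs without a network connection.

```python
def flatten_metadata_fields(available):
    """Turn a get_available_metadata() result into (level, field) pairs.

    Hypothetical helper, not part of chaininglib. Assumes `available` is
    either a dict of lists (corpus) or a plain list (lexicon), as the
    documented return value describes.
    """
    if isinstance(available, dict):
        # Corpus: keep the level ('document' or 'token') next to each field.
        return [(level, field)
                for level, fields in available.items()
                for field in fields]
    # Lexicon: there is only one level, labelled 'lexicon' here by assumption.
    return [("lexicon", field) for field in available]


# Abbreviated sample values taken from the documented example output.
corpus_sample = {
    "document": ["aantal_paginas", "aantal_woorden"],
    "token": ["word", "lemma", "pos"],
}
lexicon_sample = ["lemEntryId", "lemma", "lemPos"]

corpus_pairs = flatten_metadata_fields(corpus_sample)
lexicon_pairs = flatten_metadata_fields(lexicon_sample)

for level, field in corpus_pairs + lexicon_pairs:
    print(f"{level}: {field}")
```

In real use the samples would be replaced by the result of get_available_metadata(resource_name), passing resource_type='corpus' or resource_type='lexicon' when a name is ambiguous.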