MULTEXT-East Morphosyntactic Specifications, Version 4

1.4. Organisation of the language-specific chapters

The language-specific parts present the specifications category by category and are structured as follows:

  1. A table giving the features and values for the category. For some languages the tables give additional information, such as the localisation of attribute and value names and possibly the codes of the values. Note that with Version 4 the positions of the attributes in the language-specific sections need not be the same as those in the common tables. This leads to two possible MSD tagsets per language: a language-specific tagset, where the attributes are ordered optimally for the language in question, and the common one, where the MSDs can be mixed with MSDs for other languages.
  2. A table providing the allowed combinations of values for the particular language. The tables give the allowed combinations as simple regular expressions, where each value is represented either as a literal or as a list of alternative literals enclosed in square brackets, e.g. N[cp][mfn].
  3. A table giving all the valid MSDs for the language, i.e. specifying the MSD tagset for the language. These lists were typically generated from a lexicon or corpus, and give the expansion of each MSD into a feature structure, the number of occurrences and examples of usage.
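
The combination patterns described in item 2 can be read mechanically. The following Python sketch is an illustration only and is not part of the specifications: the function names expand_pattern and is_allowed are hypothetical, and the code assumes that a pattern contains nothing but single literal characters and bracketed lists of literals, as in the example N[cp][mfn].

```python
import re
from itertools import product


def expand_pattern(pattern: str) -> list[str]:
    """Expand a simple pattern such as 'N[cp][mfn]' into every string it allows.

    Assumption: each position is either one literal character or a
    bracketed list of literal characters; no other regex syntax is handled.
    """
    # Each match is a pair: (contents of a bracketed list, single literal).
    # Exactly one element of the pair is non-empty.
    positions = re.findall(r"\[([^\]]+)\]|(.)", pattern)
    choices = [list(group) if group else [single] for group, single in positions]
    # The Cartesian product of the per-position choices gives all combinations.
    return ["".join(combo) for combo in product(*choices)]


def is_allowed(pattern: str, msd: str) -> bool:
    """Check whether an MSD matches a pattern, treating the pattern as a regex."""
    return re.fullmatch(pattern, msd) is not None


print(expand_pattern("N[cp][mfn]"))
# ['Ncm', 'Ncf', 'Ncn', 'Npm', 'Npf', 'Npn']
print(is_allowed("N[cp][mfn]", "Npf"))  # True
print(is_allowed("N[cp][mfn]", "Nxm"))  # False
```

Expanding a pattern in this way gives the combinations that the pattern permits, which is not necessarily the same as the list of valid MSDs in item 3, since that list was typically derived from a lexicon or corpus.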

Date: 2010-05-12
This work is licensed under the Creative Commons licence Attribution-ShareAlike 3.0.