MULTEXT-East Morphosyntactic Specifications

2.3. Attributes and values

The common MULTEXT-East tables of attribute values are given for all categories above and have a rigid structure, which makes them suitable for automatically verifying the conformance of a particular morphosyntactic description with the tables, or for expanding a morphosyntactic description into its more verbose form.

This formal part is given as a table, having the following columns:

  1. Position gives the position of the attribute in the string of the morphosyntactic description;
  2. Attribute gives the name of the attribute;
  3. Value gives the name and one-letter code of the attribute-value;
  4. a column for each of the languages; for easier comparison, these columns are grouped by language family.
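
As a rough illustration of how the Position, Attribute and Value columns can drive the expansion of a morphosyntactic description (MSD) into its more verbose form, the following Python sketch uses a small excerpt of the Noun table from 2.3.1 (positions 1 to 4 only, with a subset of the Case values). The names `NOUN_EXCERPT` and `expand_msd` are hypothetical, and treating '-' as a placeholder for an attribute that does not apply is an assumption that is not stated in this section.

```python
# Excerpt of the Noun table: one (attribute, {code: value}) pair per
# position, starting at position 1. Only positions 1-4 are included.
NOUN_EXCERPT = [
    ("Type", {"c": "common", "p": "proper", "g": "gerund"}),
    ("Gender", {"m": "masculine", "f": "feminine", "n": "neuter",
                "c": "common"}),
    ("Number", {"s": "singular", "p": "plural", "d": "dual",
                "t": "count", "l": "collective"}),
    ("Case", {"n": "nominative", "g": "genitive", "d": "dative",
              "a": "accusative", "v": "vocative", "l": "locative",
              "i": "instrumental"}),
]


def expand_msd(msd):
    """Expand a Noun MSD such as 'Ncmsn' into attribute=value pairs."""
    if not msd.startswith("N"):
        raise ValueError("this sketch only covers the Noun category (code N)")
    codes = msd[1:]
    if len(codes) > len(NOUN_EXCERPT):
        raise ValueError("MSD is longer than the excerpt covers")
    parts = ["Noun"]
    for (attribute, values), code in zip(NOUN_EXCERPT, codes):
        if code == "-":  # assumed placeholder: attribute not applicable
            continue
        if code not in values:
            raise ValueError(f"unknown code {code!r} for {attribute}")
        parts.append(f"{attribute}={values[code]}")
    return " ".join(parts)


print(expand_msd("Ncmsn"))
```

The character at position 0 selects the category, and each later character is looked up in the value list of the attribute at the same position, which is why the order of the rows in the tables matters.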

The language columns define, by a mark in the cell (in the tables below, the code of the language), in the first line, whether the category is used by the language and, in subsequent lines, which attribute-values a particular language uses.
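
The conformance check mentioned earlier can be sketched in the same spirit. The Python example below copies the language marks for positions 1 to 3 of the Noun table in 2.3.1 and tests whether every value in an MSD is used by a given language. The names `NOUN_ROWS` and `conforms` are hypothetical, the excerpt stops at position 3, and the '-' placeholder for a non-applicable attribute is again an assumption rather than something this section defines.

```python
# Language sets copied from the Noun table (CATEGORY row and positions 1-3).
ALL = set("en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam "
          "sq fa et hu ce ka".split())
GENDERED = set("en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam "
               "sq".split())

NOUN_LANGS = ALL  # first line of the table: languages that use the category
NOUN_ROWS = [
    ("Type", {"c": ALL - {"bg-dam"}, "p": ALL - {"bg-dam"},
              "g": {"pl", "ka"}}),
    ("Gender", {"m": GENDERED, "f": GENDERED, "n": GENDERED,
                "c": {"ru", "uk"}}),
    ("Number", {"s": ALL, "p": ALL,
                "d": {"cs", "sl", "sl-rozaj", "bg-dam"},
                "t": {"mk", "bg"}, "l": {"sl-rozaj"}}),
]


def conforms(msd, lang):
    """Return True if every value in a Noun MSD is marked for `lang`."""
    if not msd.startswith("N") or lang not in NOUN_LANGS:
        return False
    codes = msd[1:]
    if len(codes) > len(NOUN_ROWS):
        return False  # positions beyond this excerpt are not checked
    for (_attribute, values), code in zip(NOUN_ROWS, codes):
        if code == "-":  # assumed placeholder: attribute not applicable
            continue
        if lang not in values.get(code, set()):
            return False
    return True


print(conforms("Ncmd", "sl"))  # dual is marked for Slovene
print(conforms("Ncmd", "en"))  # dual is not marked for English
```

A complete checker would load all rows of all categories in the same shape; the excerpt is kept short so that each language set can be compared against the table by eye.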

2.3.1. Noun

Common specifications for Noun
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Noun N en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type common c en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq fa et hu ce ka
proper p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq fa et hu ce ka
gerund g pl ka
2 Gender masculine m en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
feminine f en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
neuter n en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
common c ru uk
3 Number singular s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
plural p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
dual d cs sl sl-rozaj bg-dam
count t mk bg
collective l sl-rozaj
4 Case nominative n pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ce ka
genitive g pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq fa et hu ce ka
dative d pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq hu ce ka
accusative a pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq hu
vocative v ro pl cs sk hbs sr-tor ru uk mk bg bg-dam fa ka
locative l pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ce
instrumental i pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam hu ce ka
direct r ro
oblique o ro mk bg-dam
partitive 1 et
illative x et hu
inessive 2 et hu
elative e et hu
allative t et hu ce
adessive 3 et hu
ablative b sq et hu
translative 4 et
terminative 9 et hu
essive w et hu ka
abessive 5 et
komitative k et
aditive 7 et
temporalis m hu
causalis c hu
sublative s hu
delative h hu
sociative q hu
factive y hu
superessive p hu
distributive u hu
essive-formal f hu
ergative z ce ka
lative j ce
comparison 8 ce
5 Definiteness no n ro mk bg sq fa
yes y ro mk bg sq fa
short-art s bg
full-art f bg
proximal p mk
distal d mk
6 Clitic no n ro ka
yes y ro ka
7 Animate no n pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ka
yes y pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ka
8 Owner_Number singular s hu
plural p hu
9 Owner_Person first 1 hu
second 2 hu
third 3 hu
10 Owned_Number singular s hu
plural p hu
11 Case2 partitive p ru
locative l ru
12 Human no n pl
yes y pl
13 Aspect progressive p pl
perfective e pl
14 Negation no n pl
yes y pl
15 Class bu b ce
vu v ce
du d ce
ju j ce
16 Article t-form t sr-tor
v-form v sr-tor
n-form n sr-tor

2.3.1.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.
  • In the Macedonian case system the value 'oblique' conflates archaic forms of 'genitive', 'dative' and 'accusative'.
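
When MSDs from several languages are compared, the conflated values in these notes may need to be mapped back to the cases they cover. The following Python sketch records the two notes as a lookup table; the names `CONFLATED_CASES`, `PLAIN_CASES` and `underlying_cases` are hypothetical, and only a handful of plain case codes from the Noun table are included.

```python
# Conflated case values, keyed by (language code, case code).
CONFLATED_CASES = {
    ("ro", "r"): ("nominative", "accusative"),          # 'direct'
    ("ro", "o"): ("genitive", "dative"),                # 'oblique'
    ("mk", "o"): ("genitive", "dative", "accusative"),  # 'oblique' (archaic forms)
}

# A few non-conflated case codes from the Noun table.
PLAIN_CASES = {
    "n": "nominative", "g": "genitive", "d": "dative",
    "a": "accusative", "v": "vocative",
}


def underlying_cases(lang, code):
    """Return the case names covered by a case code in a given language."""
    if (lang, code) in CONFLATED_CASES:
        return CONFLATED_CASES[(lang, code)]
    # Raises KeyError for codes that this excerpt does not cover.
    return (PLAIN_CASES[code],)


print(underlying_cases("ro", "o"))
print(underlying_cases("mk", "o"))
```

Keying the table on the language as well as the code keeps the Romanian and Macedonian readings of 'oblique' apart, since the same code 'o' covers different cases in the two languages.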

2.3.2. Verb

Common specifications for Verb
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Verb V en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type main m en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
auxiliary a en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
modal o en ro cs sk sl-rozaj mk sq fa et
copula c ro cs sk sl-rozaj hbs sr-tor fa
base b en
light l fa
2 VForm indicative i en ro pl cs sk sl-rozaj ru uk mk bg bg-dam sq fa et hu ce ka
subjunctive s ro sl-rozaj sq fa ka
imperative m ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
conditional c en pl cs sk sl ru bg-dam et hu ka
infinitive n en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam et hu ce
participle p en ro cs sk sl sl-rozaj hbs sr-tor ru bg bg-dam sq fa et ce
gerund g ro pl ru uk bg et ce
supine u sl sl-rozaj et
transgressive t cs sk
quotative q et
impersonal o pl uk
present r sl hbs sr-tor
future f sl hbs sr-tor
interrogative v ce
realistic_conditional k ce
unrealistic_conditional h ce
causative z ce ka
potential x ce
aorist a hbs sr-tor
imperfect e hbs sr-tor
admirative d sq
optative y sq
3 Tense present p en ro pl cs sk sl-rozaj ru uk mk bg bg-dam sq fa et hu ce ka
imperfect i ro sl-rozaj mk bg bg-dam sq et ce ka
future f pl cs sk sl-rozaj ru uk ce ka
past s en ro pl cs sk sl-rozaj ru uk bg fa et hu
pluperfect l ro ce ka
aorist a sl-rozaj mk bg bg-dam sq ka
recent_past r ce
evident_past e ce
perfective_past t ce
compound c mk
perfect n ka
4 Person first 1 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu
second 2 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu
third 3 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu
5 Number singular s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu
plural p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu
dual d sl sl-rozaj bg-dam
collective l sl-rozaj
6 Gender masculine m ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg
feminine f ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg
neuter n ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg
7 Voice active a cs sl-rozaj ru bg sq et ka
passive p cs sl-rozaj ru bg sq et ka
medial m ru
autoactive c ka
inactive i ka
mediopassive d ka
8 Negative no n cs sk sl sl-rozaj hbs sr-tor mk fa et
yes y cs sk sl sl-rozaj hbs sr-tor mk fa et
9 Definiteness no n bg sq hu
yes y bg hu
short-art s pl ru bg
full-art f pl ru bg
1s2s 2 hu
1sd 0 sq
1sd3sa 1 sq
1sd3pa 3 sq
3sd 4 sq
3sd3sa 5 sq
1pd 6 sq
3pd 7 sq
3sa 8 sq
3pd3sa 9 sq
10 Clitic no n ro pl fa ka
yes y ro pl fa ka
agglutinant a pl
demanding d pl
11 Case nominative n ru
genitive g ru
dative d ru
accusative a ru
locative l ru
instrumental i ru
illative x et
inessive 2 et
elative e et
translative 4 et
abessive 5 et
12 Animate no n cs
yes y cs
13 Clitic_s no n cs
yes y cs
14 Aspect progressive p pl sk sl sl-rozaj ru uk mk fa ka
perfective e pl sk sl sl-rozaj ru uk mk bg-dam ka
biaspectual b sl ru uk mk ce
ambivalent a sk
iterative r ce
semelfactive f ce
imperfective i bg-dam
15 Courtesy no n sl-rozaj fa
yes y sl-rozaj fa
16 Transitive no n fa
yes y fa
17 Human no n pl
yes y pl
18 Class bu b ce
vu v ce
du d ce
ju j ce
19 Subject_Person first 1 ka
second 2 ka
third 3 ka
20 Direct_Object_Person first 1 ka
second 2 ka
third 3 ka
21 Indirect_Object_Person first 1 ka
second 2 ka
third 3 ka
22 Subject_Number singular s ka
plural p ka
23 Direct_Object_Number singular s ka
plural p ka
24 Indirect_Object_Number singular s ka
plural p ka
25 Subject_Case nominative n ka
ergative z ka
dative d ka
26 Direct_Object_Case nominative n ka
dative d ka
27 Indirect_Object_Case dative d ka

2.3.3. Adjective

Common specifications for Adjective
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Adjective A en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type qualificative f en ro pl cs sk sl-rozaj ru uk mk fa hu
indefinite i
possessive s cs sk sl sl-rozaj hbs sr-tor ru mk
ordinal o sl-rozaj uk mk
participle p pl sl hbs sr-tor uk mk ka
general g sl hbs sr-tor mk sq ka
preposed r sq
2 Degree positive p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk fa et hu ce ka
comparative c en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk fa et hu ce ka
superlative s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk fa et hu ka
elative e sl-rozaj
diminutive d sl-rozaj ka
3 Gender masculine m ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
feminine f ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
neuter n ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam
common c uk
4 Number singular s ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ka
plural p ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ka
dual d cs sl sl-rozaj bg-dam
collective l sl-rozaj
5 Case nominative n pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq et hu ce ka
genitive g pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq fa et hu ka
dative d pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq hu ka
accusative a pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq hu
vocative v ro cs sk hbs sr-tor bg-dam ka
locative l pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam
instrumental i pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam hu ka
direct r ro
oblique o ro bg-dam
partitive 1 et
illative x et hu
inessive 2 et hu
elative e et hu
allative t et hu
adessive 3 et hu
ablative b sq et hu
translative 4 et
terminative 9 et hu
essive w et hu ka
abessive 5 et
komitative k et
aditive 7 et
temporalis m hu
causalis c hu
sublative s hu
delative h hu
sociative q hu
factive y hu
superessive p hu
distributive u hu
essive-formal f hu
other 6 ce
ergative z ka
6 Definiteness no n ro sl sl-rozaj hbs sr-tor mk bg bg-dam sq fa
yes y ro sl sl-rozaj hbs sr-tor mk bg bg-dam sq fa
short-art s pl ru uk bg
full-art f pl ru uk bg
proximal p mk
distal d mk
7 Clitic no n ro ka
yes y ro ka
8 Animate no n pl cs sk sl-rozaj hbs sr-tor uk
yes y pl cs sk sl-rozaj hbs sr-tor uk
9 Formation nominal n cs
compound c cs
10 Owner_Number singular s hu
plural p hu
11 Owner_Person first 1 hu
second 2 hu
third 3 hu
12 Owned_Number singular s hu
plural p hu
13 Aspect progressive p pl uk
perfective e pl uk
biaspectual b uk
14 Voice active a pl uk
passive p pl uk
15 Tense present p uk
past s uk
16 Human no n pl
yes y pl
17 Negation no n pl
yes y pl
18 Class bu b ce
vu v ce
du d ce
ju j ce
19 Articulation articulated a sq
unarticulated u sq
20 Article t-form t sr-tor
v-form v sr-tor
n-form n sr-tor

2.3.3.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.
  • For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal).

2.3.4. Pronoun

Common specifications for Pronoun
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Pronoun P en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type personal p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
demonstrative d ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
indefinite i ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
possessive s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg sq et hu ce ka
interrogative q en pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
relative r en pl cs sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ka
exclamative e
reflexive x en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce
reciprocal y fa et hu ka
negative z ro pl cs sk sl sl-rozaj ru uk mk bg bg-dam ce ka
general g en pl cs sk sl sl-rozaj uk mk bg
int-rel w ro
determinal m et ka
ex-there t en
nonspecific n ru
emphatic h uk
definite f ce
2 Person first 1 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
second 2 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
third 3 en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
3 Gender masculine m en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
feminine f en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq
neuter n en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam
4 Number singular s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
plural p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
dual d cs sl sl-rozaj bg-dam
paucal c
collective l sl-rozaj
5 Case nominative n en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ce ka
genitive g ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq fa et hu ce ka
dative d ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq hu ce ka
accusative a en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa hu
vocative v ro sk hbs sr-tor ru ka
locative l pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ce
instrumental i pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam hu ce ka
direct r ro
oblique o ro bg-dam
partitive 1 et
illative x et hu
inessive 2 et hu
elative e et hu
allative t et hu ce
adessive 3 et hu
ablative b sq et hu
translative 4 et
terminative 9 et hu
essive w et hu ka
abessive 5 et
komitative k et
aditive 7 et
temporalis m hu
causalis c hu
sublative s hu
delative h hu
sociative q hu
factive y hu
superessive p hu
distributive u hu
essive-formal f hu
ergative z ce ka
lative j ce
comparison 8 ce
6 Owner_Number singular s en ro cs sk sl sl-rozaj sq hu
plural p en ro cs sk sl sl-rozaj sq hu
dual d sl sl-rozaj
7 Owner_Gender masculine m en cs sl sl-rozaj sq
feminine f en cs sl sl-rozaj sq
neuter n cs sl sl-rozaj sq
8 Clitic no n ro pl cs sk sl-rozaj mk bg sq fa ka
yes y ro pl cs sk sl sl-rozaj mk bg sq fa ka
bound b sl sl-rozaj sq
agglutinant a pl
9 Referent_Type personal p pl cs sk sl-rozaj bg
possessive s pl cs sk sl-rozaj uk bg
attributive a bg
quantitative q bg
10 Syntactic_Type nominal n pl cs sk sl-rozaj ru uk sq
adjectival a pl cs sk sl-rozaj ru uk sq
adverbial r pl ru uk
11 Definiteness no n sl-rozaj mk bg sq
yes y sl-rozaj mk bg sq
short-art s pl bg
full-art f pl bg
proximal p mk sq
distal d mk sq
12 Animate no n pl cs sk sl-rozaj hbs sr-tor ru uk
yes y pl cs sk sl-rozaj hbs sr-tor ru uk
13 Clitic_s yes y cs
no n cs
14 Pronoun_Form strong s ro sq
weak w ro sq
15 Owner_Person first 1 hu
second 2 hu
third 3 hu
16 Owned_Number singular s hu
plural p hu
17 Wh_Type relative r en
question q en
18 Courtesy no n fa
yes y fa
19 Human no n pl
yes y pl
20 Inclusion inclusive i ce
exclusive e ce
21 Article t-form t sr-tor
v-form v sr-tor
n-form n sr-tor

2.3.4.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.
  • For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal).

2.3.5. Determiner

Common specifications for Determiner
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Determiner D en ro fa
1 Type demonstrative d en ro fa
indefinite i en ro fa
possessive s en ro
interrogative q fa
relative r
exclamative e fa
article a fa
general g en
int-rel w ro
negative z ro
emphatic h ro
exceptional x fa
2 Person first 1 en ro
second 2 en ro
third 3 en ro
3 Gender masculine m ro
feminine f ro
neuter n ro
4 Number singular s en ro fa
plural p en ro fa
5 Case direct r ro
oblique o ro
6 Owner_Number singular s en ro
plural p en ro
7 Owner_Gender masculine m en
feminine f en
neuter n en
8 Clitic no n ro
yes y ro
9 Modific_Type prenomin e ro
postnomin o ro
10 Wh_Type relative r en
question q en

2.3.5.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.

2.3.6. Article

Common specifications for Article
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Article T ro sl-rozaj sq hu
1 Type definite f ro sl-rozaj hu
indefinite i ro sl-rozaj sq hu
possessive s ro sq
demonstrative d ro
nominal n sq
adjectival a sq
numerical m sq
pronominal p sq
2 Gender masculine m ro sl-rozaj
feminine f ro sl-rozaj
neuter n ro sl-rozaj
3 Number singular s ro sl-rozaj
plural p ro sl-rozaj
dual d sl-rozaj
collective l sl-rozaj
4 Case nominative n sl-rozaj
genitive g sl-rozaj
dative d sl-rozaj
accusative a sl-rozaj
locative l sl-rozaj
instrumental i sl-rozaj
direct r ro
oblique o ro
5 Clitic no n ro
yes y ro
6 Animate no n sl-rozaj
yes y sl-rozaj

2.3.6.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.

2.3.7. Adverb

Common specifications for Adverb
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Adverb R en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type general g ro cs sl sl-rozaj hbs sr-tor mk bg hu
particle p ro hu
causal o hu ka
negative z ro
adjectival a mk bg
verbal v mk hu
modifier m en ro sq hu ka
specifier s en sq ka
int-rel w ro
portmanteau c ro
interrogative q sq hu ka
participle r sl hbs sr-tor
modal d mk
local l ka
temporal t ka
quantitative u ka
relative e ka
2 Degree positive p en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk sq fa hu
comparative c en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk fa hu
superlative s en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk sq hu
elative e sl-rozaj
3 Clitic no n ro pl hu ka
yes y ro pl hu ka
agglutinant a pl
burkinostka u pl
4 Number singular s hu
plural p hu
5 Person first 1 hu
second 2 hu
third 3 hu
6 Wh_Type relative r en
question q en
7 Case genitive g fa

2.3.8. Adposition

Common specifications for Adposition
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Adposition S en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type preposition p en ro pl cs sk sl-rozaj ru uk mk bg sq fa et
postposition t en fa et hu ce ka
2 Formation simple s ro cs sk sl-rozaj ru uk mk fa
compound c ro cs sk sl-rozaj ru uk mk fa
3 Case nominative n sl sl-rozaj sq ka
genitive g ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ka
dative d ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ka
accusative a ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam sq
locative l pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam
instrumental i pl cs sk sl sl-rozaj hbs sr-tor ru uk bg-dam ka
ablative b sq
essive w ka
ergative z ka
vocative v ka
4 Clitic no n ro ka
yes y ro ka

2.3.9. Conjunction

Common specifications for Conjunction
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Conjunction C en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type coordinating c en ro cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq fa et hu ka
subordinating s en ro cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq fa et hu ka
portmanteau r ro
2 Formation simple s ro sl-rozaj hbs ru uk mk bg fa hu
compound c ro sl-rozaj hbs ru uk mk bg fa hu
3 Coord_Type simple s ro
repetit r ro
correlat c ro
sentence p ru hu
words w ru hu
initial i en
non-initial n en
4 Sub_Type negative z ro ru
positive p ro ru
5 Clitic no n ro
yes y ro
6 Number singular s cs
plural p cs
7 Person first 1 cs
second 2 cs
third 3 cs

2.3.10. Numeral

Common specifications for Numeral
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Numeral M en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type cardinal c en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq fa et hu ce ka
ordinal o en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg fa et hu ce ka
fractal f ro sq fa hu ka
multiple m ro cs sk hbs ru sq ka
collect l ro pl ru sq
special s cs sk sl hbs sr-tor mk
ordinal2 r fa
pronominal p sl mk sq
approximative a ka
2 Gender masculine m ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq
feminine f ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg sq
neuter n ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg
3 Number singular s ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg sq et hu ce ka
plural p ro pl cs sk sl sl-rozaj hbs sr-tor ru uk bg sq et hu ce ka
dual d cs sl sl-rozaj
collective l sl-rozaj
4 Case nominative n pl cs sk sl sl-rozaj hbs sr-tor ru uk sq et hu ce ka
genitive g pl cs sk sl sl-rozaj hbs sr-tor ru uk sq fa et hu ce ka
dative d pl cs sk sl sl-rozaj hbs sr-tor ru uk sq hu ce ka
accusative a pl cs sk sl sl-rozaj hbs sr-tor ru uk sq hu
vocative v sk hbs sr-tor ka
locative l pl cs sk sl sl-rozaj hbs sr-tor ru uk ce
instrumental i pl cs sk sl sl-rozaj hbs sr-tor ru uk hu ce ka
direct r ro
oblique o ro
partitive 1 et
illative x et hu
inessive 2 et hu
elative e et hu
allative t et hu ce
adessive 3 et hu
ablative b sq et hu
translative 4 et
terminative 9 et hu
essive w et hu ka
abessive 5 et
komitative k et
aditive 7 et
temporalis m hu
causalis c hu
sublative s hu
delative h hu
sociative q hu
factive y hu
superessive p hu
distributive u hu
essive-formal f hu
multiplicative 6 hu
ergative z ce ka
lative j ce
comparison 8 ce
5 Form digit d ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ce ka
roman r ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ce ka
letter l ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq et hu ce ka
both b ro sq
m-form m bg
approx a bg
alphabetic c bg-dam ka
6 Definiteness no n ro sl mk bg fa
yes y ro sl mk bg fa
short-art s bg
full-art f bg
proximal p mk
distal d mk
7 Clitic no n ro ka
yes y ro ka
8 Class definite f pl cs sk
definite1 1 cs sk
definite2 2 cs
definite34 3 pl cs
definite234 4 sk
demonstrative d cs sk
indefinite i cs sk
interrogative q cs sk
relative r cs
9 Animate no n pl cs sk sl-rozaj hbs sr-tor ru uk
yes y pl cs sk sl-rozaj hbs sr-tor ru uk
10 Owner_Number singular s hu
plural p hu
11 Owner_Person first 1 hu
second 2 hu
third 3 hu
12 Owned_Number singular s hu
plural p hu
13 Human no n pl
yes y pl
14 Article t-form t sr-tor
v-form v sr-tor
n-form n sr-tor

2.3.10.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.

2.3.11. Particle

Common specifications for Particle
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Particle Q ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq ce ka
1 Type negative z ro hbs sr-tor bg bg-dam sq
infinitive n ro sq
subjunctive s ro sq
aspect a ro
future f ro
general g bg bg-dam sq
comparative c bg bg-dam sq
verbal v bg sq
interrogative q hbs sr-tor bg bg-dam sq
modal o hbs sr-tor bg bg-dam
affirmative r hbs sr-tor sq
definitive d bg-dam sq
2 Formation simple s ru mk bg
compound c ru mk bg
3 Clitic no n ro pl
yes y ro pl
agglutinant a pl
demanding d pl

2.3.12. Interjection

Common specifications for Interjection
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Interjection I en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka
1 Type mood m sq hu
other o sq hu
2 Formation simple s ru bg
compound c ru bg

2.3.13. Abbreviation

Common specifications for Abbreviation
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Abbreviation Y en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ka
1 Syntactic_Type nominal n ro ru et
verbal v ro et
adjectival a ro et
adverbial r ro ru et
pronominal p ro
2 Gender masculine m ro ru
feminine f ro ru
neuter n ro ru
3 Number singular s ro ru et
plural p ro ru et
paucal c ru
4 Case nominative n ru et
genitive g ru et
dative d ru
accusative a ru
locative l ru
instrumental i ru
direct r ro
oblique o ro
vocative v ro
partitive 1 et
illative x et
inessive 2 et
elative e et
allative t et
adessive 3 et
ablative b et
translative 4 et
terminative 9 et
essive w et
abessive 5 et
komitative k et
aditive 7 et
5 Definiteness yes y ro
no n ro

2.3.13.1. Notes

  • In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative', while the value 'oblique' conflates 'genitive' and 'dative'.

2.3.14. Residual

Common specifications for Residual
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Residual X en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce
1 Type foreign f sl hbs sr-tor mk sq ce
typo t sl mk sq
program p sl mk sq
web w sl hbs sr-tor mk sq
emo e sl hbs sr-tor mk sq
hashtag h sl hbs sr-tor mk sq
at a sl hbs sr-tor mk sq

2.3.14.1. Notes

  • For Slovene the Type attribute has been introduced on Residual, which distinguishes the values of "foreign", to mark a word in a stretch of foreign-language text, "typo", a mistyped word, and "program", where the tokenisation program made a mistake. The second and especially the third value are useful for hand-annotation of corpora.
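
A minimal sketch of how a hand-annotation tool might build Residual MSDs from the three values described in this note is given below. The names `RESIDUAL_TYPES` and `residual_msd` are hypothetical, and using the bare category code 'X' when no Type is recorded is an assumption.

```python
# Type codes for the three values described in the note, taken from
# the Residual table.
RESIDUAL_TYPES = {"foreign": "f", "typo": "t", "program": "p"}


def residual_msd(kind=None):
    """Build a Residual MSD, optionally with a Type value."""
    if kind is None:
        return "X"  # assumed: category code alone when no Type is recorded
    return "X" + RESIDUAL_TYPES[kind]


print(residual_msd("foreign"))
print(residual_msd("typo"))
print(residual_msd("program"))
```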

2.3.15. Punctuation

Common specifications for Punctuation
P Attribute Value Code English Romanian Polish Czech Slovak Slovene Resian Serbo-Croatian Torlak dialect of Serbian Russian Ukrainian Macedonian Bulgarian Damaskini Albanian Persian Estonian Hungarian Chechen Georgian
0 CATEGORY Punctuation Z en ro pl cs sk sl sl-rozaj hbs sr-tor ru uk mk bg bg-dam sq fa et hu ce ka

2.3.15.1. Notes

  • Due to popular demand (i.e. Vladimir Benko:), punctuation was introduced into the specifications in Version 5. Even though the MULTEXT(-East) specifications and the MSDs are meant to describe the morphosyntax of words rather than act directly as corpus tags, they are nevertheless often used for exactly this purpose. This meant that punctuation was tagged differently in each corpus, leading to inconsistencies. Therefore, the Z category was introduced.
Date: 2022-06-24
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International.