Up: 2. Definitions of Morphosyntactic Categories Previous: 2.11. Abbreviation
P | Attribute (sl) | Value (sl) | Code (sl) | Attribute (en) | Value (en) | Code (en) |
0 | besedna_vrsta | neuvrščeno | N | CATEGORY | Residual | X |
1 | vrsta | tujejezično | j | Type | foreign | f |
1 | vrsta | tipkarska | t | Type | typo | t |
1 | vrsta | program | p | Type | program | p |
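The table above can be read as a positional decoding scheme: position 0 holds the category code, and position 1 holds the code for the Type attribute. The following is a minimal sketch in Python, assuming only the English codes listed in this table. The names `CATEGORY`, `RESIDUAL_TYPES` and `expand_msd` are hypothetical and are not part of the specification.

```python
# Illustrative only: codes copied from the English side of the table above.
CATEGORY = {"X": "Residual"}
# Position 1 of a Residual MSD encodes the Type attribute.
RESIDUAL_TYPES = {"f": "foreign", "t": "typo", "p": "program"}


def expand_msd(msd: str) -> str:
    """Expand an English Residual MSD such as 'Xf' into its feature string."""
    if not msd or msd[0] not in CATEGORY:
        raise ValueError(f"not a Residual MSD: {msd!r}")
    if len(msd) > 2:
        raise ValueError(f"Residual MSDs have at most two positions: {msd!r}")
    features = [CATEGORY[msd[0]]]
    if len(msd) == 2:
        code = msd[1]
        if code not in RESIDUAL_TYPES:
            raise ValueError(f"unknown Type code: {code!r}")
        features.append(f"Type={RESIDUAL_TYPES[code]}")
    return " ".join(features)


if __name__ == "__main__":
    for msd in ("X", "Xf", "Xt", "Xp"):
        print(msd, "->", expand_msd(msd))
```

A Slovene variant would use the same logic with the codes N, j, t and p and the Slovene attribute and value names.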
This index gives the complete list of morphosyntactic descriptions (MSDs) and their features, in Slovene and English. The first and third columns give the MSD in Slovene and English respectively, and the second and fourth give its expansion to features. The fifth and sixth columns give the number of word tokens and word types tagged with this MSD in the partially hand-validated 1-million-word jos1M corpus. The last column gives up to 10 examples of the usage of the MSD in the form word-form/lemma. Where the word-form and lemma are identical, the lemma is written as an equals sign. The examples were automatically extracted from (1) the jos1M corpus, (2) the lexicon of closed-class words and (3) the lexicon derived from the FidaPLUS corpus. The examples are ordered by their number of occurrences in (1), followed by examples from (2) and (3). Examples from (2) and (3) have not been attested in the base corpus and are therefore crossed out. Note that both (1) and (3) contain some tagging or lemmatisation errors, so not all examples are necessarily correct.
MSD (sl) | Features (sl) | MSD (en) | Features (en) | Tokens | Types | Examples of usage |
N | neuvrščeno | X | Residual | 376 | 309 | D12/=, V6/=, K6-2/=, G400/=, C2/=, A4/=, x86/=, pre/=, kb128/=, V8/= |
Nj | neuvrščeno vrsta=tujejezično | Xf | Residual Type=foreign | 4425 | 2514 | de/=, of/=, The/=, the/=, and/=, in/=, la/=, a/=, La/=, on/= |
Nt | neuvrščeno vrsta=tipkarska | Xt | Residual Type=typo | 1332 | 1238 | o/=, po/=, e/=, a/=, na/=, Cemi1/=, za/=, do/=, pri/=, no/= |
Np | neuvrščeno vrsta=program | Xp | Residual Type=program | 1878 | 992 | 1/=, 2/=, 3/=, a/=, e/=, §/=, www./=, 4/=, ja/=, ju/on |
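The "Examples of usage" column can be unpacked mechanically using the word-form/lemma convention described above. The sketch below is one possible way to do this in Python; the function name `parse_examples` is hypothetical, and the code assumes that entries are separated by a comma and a space and that the last slash in each entry separates the word-form from the lemma.

```python
def parse_examples(cell: str) -> list[tuple[str, str]]:
    """Split an examples cell into (word-form, lemma) pairs.

    An equals sign in the lemma slot means the lemma is identical
    to the word-form, so it is replaced by the word-form itself.
    """
    pairs = []
    for item in cell.split(", "):
        item = item.strip()
        if not item:
            continue
        # Split on the last slash so a slash inside the word-form is tolerated.
        form, sep, lemma = item.rpartition("/")
        if not sep:
            raise ValueError(f"entry is not in word-form/lemma form: {item!r}")
        pairs.append((form, form if lemma == "=" else lemma))
    return pairs


if __name__ == "__main__":
    # Sample input taken from the Xp row of the table above.
    print(parse_examples("www./=, 4/=, ja/=, ju/on"))
```

This does not recover the crossed-out marking of unattested examples, which is not visible in a plain-text rendering of the table.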