JOS morphosyntactic specifications for Slovene

2.12. Residual


Table 23. Attribute-Value Table for Residual
P  Attribute (sl)  Value (sl)   Code (sl)  Attribute (en)  Value (en)  Code (en)
0  neuvrščeno                   N          Residual                    X
1  vrsta           tujejezično  j          Type            foreign     f
                   tipkarska    t                          typo        t
                   program      p                          program     p
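
The positional encoding in Table 23 can be illustrated with a short decoder. This is only a minimal sketch: the function name `expand_residual_msd` and the dictionary layout are assumptions made for illustration and are not part of the JOS specifications. It takes position 0 as the category code and position 1, if present, as the value code of the single attribute.

```python
# Codes transcribed from Table 23. The data layout and the function name
# are illustrative assumptions, not part of the JOS specifications.
RESIDUAL_TABLES = {
    "en": {
        "category": ("X", "Residual"),
        "attribute": "Type",
        "values": {"f": "foreign", "t": "typo", "p": "program"},
    },
    "sl": {
        "category": ("N", "neuvrščeno"),
        "attribute": "vrsta",
        "values": {"j": "tujejezično", "t": "tipkarska", "p": "program"},
    },
}


def expand_residual_msd(msd: str, lang: str = "en") -> str:
    """Expand a Residual MSD (e.g. 'Xf' or 'Nj') into its feature string."""
    table = RESIDUAL_TABLES[lang]
    code, name = table["category"]
    # Position 0 must be the category code; at most one attribute follows.
    if not msd or msd[0] != code or len(msd) > 2:
        raise ValueError(f"not a Residual MSD for lang={lang!r}: {msd!r}")
    features = [name]
    if len(msd) == 2:
        value_code = msd[1]
        if value_code not in table["values"]:
            raise ValueError(f"unknown value code: {value_code!r}")
        features.append(f"{table['attribute']}={table['values'][value_code]}")
    return " ".join(features)
```

For example, `expand_residual_msd("Xf")` yields the English expansion of the foreign-word MSD, and `expand_residual_msd("Nj", lang="sl")` yields the Slovene one.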

2.12.1. Lexicon

This index gives the complete list of morphosyntactic descriptions (MSDs) and their features, in Slovene and English. The first and third columns give the MSD, and the second and fourth their expansion into features. The fifth and sixth columns give the number of word tokens and word types tagged with this MSD in the partially hand-validated 1-million-word jos1M corpus. The last column gives up to 10 examples of usage of the MSD in the form word-form/lemma. Where the word-form and lemma are identical, the lemma is written as an equals sign. The examples were automatically extracted from (1) the jos1M corpus, (2) the lexicon of closed-class words and (3) the lexicon derived from the FidaPLUS corpus. The examples are ordered by the number of occurrences in (1), followed by examples from (2) and (3). Examples from (2) and (3) have not been attested in the base corpus, and are therefore crossed out. It should be noted that both (1) and (3) contain some tagging or lemmatisation errors, so not all examples are necessarily correct.

Table 24. MSDs (4)
MSD (sl)  Features (sl)                 MSD (en)  Features (en)          Tokens  Types  Examples of usage
N         neuvrščeno                    X         Residual               376     309    D12/=, V6/=, K6-2/=, G400/=, C2/=, A4/=, x86/=, pre/=, kb128/=, V8/=
Nj        neuvrščeno vrsta=tujejezično  Xf        Residual Type=foreign  4425    2514   de/=, of/=, The/=, the/=, and/=, in/=, la/=, a/=, La/=, on/=
Nt        neuvrščeno vrsta=tipkarska    Xt        Residual Type=typo     1332    1238   o/=, po/=, e/=, a/=, na/=, Cemi1/=, za/=, do/=, pri/=, no/=
Np        neuvrščeno vrsta=program      Xp        Residual Type=program  1878    992    1/=, 2/=, 3/=, a/=, e/=, §/=, www./=, 4/=, ja/=, ju/on
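
The word-form/lemma notation in the last column can be expanded mechanically. The following is a minimal sketch, assuming that examples are separated by a comma and a space and that the last slash in each example separates word-form from lemma. The function name `parse_examples` is hypothetical and not part of the JOS resources.

```python
def parse_examples(cell: str) -> list[tuple[str, str]]:
    """Split an 'Examples of usage' cell into (word_form, lemma) pairs.

    Assumes ', ' separates examples and the last '/' separates word-form
    from lemma; '=' stands for a lemma identical to the word-form.
    """
    pairs = []
    for item in cell.split(", "):
        word_form, sep, lemma = item.rpartition("/")
        if not sep or not word_form:
            raise ValueError(f"not in word-form/lemma notation: {item!r}")
        if lemma == "=":
            # The equals sign abbreviates a lemma equal to the word-form.
            lemma = word_form
        pairs.append((word_form, lemma))
    return pairs
```

Applied to the tail of the Np row, `parse_examples("4/=, ja/=, ju/on")` gives the pairs `("4", "4")`, `("ja", "ja")` and `("ju", "on")`. A word-form that itself contains a comma followed by a space would be split incorrectly under these assumptions.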



Tomaž Erjavec, Simon Krek, Špela Arhar, Darja Fišer, Nina Ledinek, Amanda Saksida, Breda Sivec, Blaž Trebar. Date: 2009-02-11
This work is licensed under the Creative Commons Attribution 3.0 Slovenia licence.