Here you can tokenise, tag and lemmatise Slovene texts. The tags (morphosyntactic descriptions, MSDs) follow the JOS morphosyntactic specifications and can be shown either in Slovene (e.g. Gp-g = glagol pomožni pogojnik) or in English (e.g. Va-c = Verb auxiliary conditional).

The output file is in "vertical" format, suitable for use in Sketch Engine and CWB. Each line is either an XML tag (<doc>, <p>, <s> and </s>, </p>, </doc>) or an annotated token. Token lines are tab-separated and contain 1) the token, 2) the lemma (base form) of the word, and 3) the MSD tag. For punctuation, the lemma and MSD fields are identical to the token. The MSDs can be converted into various other formats with the JOS MSD conversion tables.
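The vertical format described above is simple to read programmatically. The following is a minimal sketch in Python, not part of the service itself: the names `Token`, `parse_vertical` and `SAMPLE` are hypothetical, and the sample text is an illustrative fragment written for this example rather than actual tool output. It assumes that any line containing a tab is a token line with exactly three fields and that every other non-empty line is a structural XML tag.

```python
from typing import Iterable, Iterator, NamedTuple, Union


class Token(NamedTuple):
    """One annotated token line: token, lemma, MSD (hypothetical helper type)."""
    form: str
    lemma: str
    msd: str


def parse_vertical(lines: Iterable[str]) -> Iterator[Union[Token, str]]:
    """Yield Token objects for token lines and plain strings for XML tag lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if "\t" in line:
            fields = line.split("\t")
            if len(fields) != 3:
                raise ValueError(f"expected 3 tab-separated fields, got: {line!r}")
            yield Token(*fields)
        else:
            # Assumption: a line without tabs is a structural tag such as <s> or </p>.
            yield line


# Illustrative fragment only (Slovene MSDs); not real output of the service.
SAMPLE = "<doc>\n<p>\n<s>\nbi\tbiti\tGp-g\n.\t.\t.\n</s>\n</p>\n</doc>\n"

if __name__ == "__main__":
    for item in parse_vertical(SAMPLE.splitlines()):
        print(item)
```

Checking for a tab before checking for angle brackets means that a token which itself looks like markup is still treated as a token, as long as it carries its lemma and MSD fields.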

The text can be analysed and the output returned either as the result itself or as compressed files, with the MSDs in Slovene or English.

