Evropski sporazum - Priloga II
Europe Agreement - Annex II

Jasna Belc, SVEZ: Zagotovitev digitalnega originala / Provision of digital original
Špela Vintar, FF: Poravnava / Alignment
Peter Holozan, Amebis: Leksikalne oznake / Lexical annotation
Tomaž Erjavec, IJS: Tokenizacija, tagiranje, pretvorba v TEI / Tokenisation, tagging, conversion to TEI

Version 2.0
483 Kb, 25 kW

Odsek za inteligentne sisteme, Institut "Jožef Stefan", Jamova 39, 1000 Ljubljana
Dept. of Intelligent Systems, Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia

This bi-text of the IJS-ELAN corpus is freely available, provided that the sources described in this Header are acknowledged. Copyright of the two digital originals for this corpus is held by SVEZ: Office of the Government of the Republic of Slovenia for European Affairs.

To vzporedno poravnano besedilo korpusa IJS-ELAN je prosto dostopno, pod pogojem, da se citira njegove vire, dokumentirane v tej glavi. Avtorske pravice nad digitalnima originaloma tega besedila pripadajo SVEZ: Služba Vlade RS za evropske zadeve

Evropski sporazum o pridružitvi med Republiko Slovenijo na eni strani in Evropskimi skupnostmi in njihovimi državami članicami, ki delujejo v okviru Evropske unije, na drugi strani. 10. junij 1996, Luksemburg.

Europe Agreement Establishing an Association Between the European Communities and their Member States, Acting within the Framework of the European Union, of the One Part, and the Republic of Slovenia, of the Other Part. June 10, 1996, Luxembourg.

This text is part of the LJU1 site contribution to the EU MLIS project ELAN: European Language Activity Network. For more information see the IJS-ELAN homepage.

Typography codes and ToC information were removed; quotes were normalised to ", list bullets to -.
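The normalisation described above can be illustrated with a minimal sketch. It is not the procedure actually used for this corpus: the function name normalise and the sets of quote and bullet characters are assumptions chosen for illustration, since the header does not list the characters that were replaced.

```python
import re

# Assumed typographic quote characters; the header does not enumerate them.
QUOTE_CHARS = "\u201c\u201d\u201e\u00ab\u00bb"
# Assumed bullet characters; likewise hypothetical.
BULLET_CHARS = "\u2022\u00b7\u25cf\u25aa*"

_QUOTE_RE = re.compile("[" + re.escape(QUOTE_CHARS) + "]")
# A bullet counts only at the start of a line (after optional indentation)
# and only when followed by whitespace.
_BULLET_RE = re.compile(
    r"^([ \t]*)[" + re.escape(BULLET_CHARS) + r"](?=\s)", re.MULTILINE
)


def normalise(text):
    """Map typographic quotes to " and leading list bullets to -."""
    text = _QUOTE_RE.sub('"', text)
    return _BULLET_RE.sub(r"\1-", text)
```

Restricting bullets to line-initial position keeps characters such as * untouched when they occur inside running text.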

This document was originally formatted as a table. For easier processing, large segments of text containing numerical data were omitted.

The digital original was segmented and aligned with Atril, and the alignments were hand-corrected.

Tokenisation into word and character (punctuation) elements was performed by the MULTEXT program mtlex with MULTEXT-East Slovene segmentation resources.
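As a rough illustration of splitting text into word and punctuation tokens, the sketch below uses a single regular expression. It is not mtlex and does not use the MULTEXT-East segmentation resources, which also handle cases such as abbreviations and multi-character punctuation; the function name tokenise and the labels "w" and "c" are assumptions made for this example.

```python
import re

# "w": a run of letters/digits; "c": one non-space, non-word character.
_TOKEN_RE = re.compile(r"(?P<w>\w+)|(?P<c>[^\w\s])")


def tokenise(text):
    """Return (label, token) pairs, label "w" for words, "c" for punctuation."""
    return [(m.lastgroup, m.group()) for m in _TOKEN_RE.finditer(text)]
```

For example, tokenise("Člen 1, odstavek 2.") yields four "w" tokens and two "c" tokens (the comma and the full stop).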

Words were automatically marked with a context-disambiguated lemma and MULTEXT-East morphosyntactic description. English words were additionally tagged with a BROWN-like tagset by two taggers (TnT, QTAG).

1999-05-16, Tomaž Erjavec, Editor: V1.0 Header.

1999-06-22, Tomaž Erjavec, Editor: Some more errors of tokenisation corrected; Availability statement changed.

2001-09-06, Tomaž Erjavec, Editor: Slovene texts tagged with TnT trained on "1984" and improved with other resources. English texts also tagged with TnT trained on "1984" and additionally with TnT trained on the Penn Treebank and the QTag email service.

2002-04-01, Tomaž Erjavec, Editor: Recoded in P4/XML and prepared V2.0 for distribution.