Multext-East Home Page

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages

The MULTEXT-East resources are a multilingual dataset for language engineering research and development. For Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Resian, Romanian, Russian, Slovene, and Serbian, the dataset contains some or all of the following language resources: the MULTEXT-East morphosyntactic specifications, lexica, and annotated "1984" corpus; the MULTEXT-East parallel and comparable text and speech corpora; and associated documentation.

MULTEXT-East resources Version 3 (latest release: 2004-07-07)

What's new in V3:

History

The MULTEXT-East project was a spin-off of MULTEXT and ran from 1995 to 1997. MULTEXT-East developed language resources for six languages: Bulgarian, Czech, Estonian, Hungarian, Romanian, and Slovene, as well as for English, the 'hub' language of the project. It also adapted existing tools and standards to these languages. The main results of the project were an annotated multilingual corpus and lexical resources for the seven languages.

The extended results of the project were made available in 1998, first on CD-ROM and then via TRACTOR, the TELRI Research Archive of Computational Tools and Resources. This first release is also mirrored here.

Within the scope of the Concede project, a new release was made available in 2002; it contained only the (updated and corrected) morphosyntactic resources from the first release. This second release was made freely available for research use via the Web; it is available here.

Finally, the third release was made in 2004; it updates and brings together the first two, and is also available via the Web, here.

For further information on the MULTEXT-East project, its results, and their exploitation, you can consult the annotated bibliography of MULTEXT-East, available in HTML and various other formats. Documentation on Version 3 is available here.


Server statistics are available for the MULTEXT-East pages.


Page http://nl.ijs.si/ME/, last updated 2004-10-22, Tomaž Erjavec
