Expected impact


The need for widely available, reusable tools and data for corpus-based NLP research is well known. MULTEXT is answering this need for research involving EU languages by developing a comprehensive set of corpus-handling tools and providing a multilingual, partially parallel corpus in six EU languages to serve as a test-bed for these tools. MULTEXT-East will extend this effort to six CEE languages by adapting MULTEXT's tools, developing linguistic resources for these six languages, and providing a multilingual corpus comparable to the one developed for EU languages within MULTEXT. This will validate and enhance MULTEXT's tools as well as its software and markup standards. Most importantly, it will enable early use of the developing standards in CEE countries and will allow feedback on them as a result of their adaptation to a vastly different set of languages.

Tomaz Erjavec
Mon May 20 13:01:13 MDT 1996