TEI Header

^§file description

^§title statement

^§title

Multext-East cesAna: Nineteen Eighty-Four, Polish

^§statement of responsibility

^§name	Natalia Kotsyba
^§responsibility	Overall Responsibility
^§responsibility	Morphosyntactic specifications (the theoretical part and conversion of the MSD index to the format required by the specifications); tag correspondence tables from the IPIC to the MTE format, used by the converter.

^§statement of responsibility

^§name	Adam Radziszewski
^§responsibility	Preparing a list of tags and statistics of their usage from the IPIC for conversion (MSD index within the morphosyntactic specifications), conversion code, extracting the lexicon from the IPIC and recalculating statistics to fit the MTE tagset.

^§statement of responsibility

^§name	Ivan Derzhanski
^§responsibility	Morphosyntactic specifications -- categories, values, attributes, editing notes

^§statement of responsibility

^§name	Tomaž Erjavec, JSI
^§responsibility	MTE TEI P5 conversion and conformance.

^§edition statement

^§edition

MULTEXT-East, Version 4

^§publication statement

^§distributor

Institute for Interdisciplinary Studies „Artes Liberales”, Warsaw University, Warsaw, Poland

^§address

Krakowskie Przedmieście 26/28
00-046 Warszawa
Poland

^§address

natalia@ibi.uw.edu.pl

^§availability

Freely available for non-commercial use provided that this Header is included in its entirety with any copy distributed.

^§date
when = 2010-05-04

2010-05-04

^§source description

^§structured bibliographic citation

^§monographic level

^§title

1984

^§author

George Orwell

^§author

Translator: Tomasz Mirkowicz

^§imprint

date	1993
publisher	Da Capo
publication place	Warsaw, Poland

^§encoding description

^§project description

EU Capacities Project GA 211938 "MondiLex"

^§editorial practice declaration

^§interpretation

The text was tagged with the TaKIPI program (http://nlp.ipipan.waw.pl/TaKIPI/), a tagger developed for Polish that uses the IPIC (IIS PAS Corpus: http://korpus.pl) tagset and is based on the Morfeusz Morphosyntactic Analyzer for Polish (http://nlp.ipipan.waw.pl/~wolinski/morfeusz/). A tag converter was then used to recode the TaKIPI output into the MTE format. To meet MTE's main requirements, the converter describes some parts of speech in more detail, groups parts of speech differently, and applies considerably different word segmentation principles. A detailed description of the correspondences between tags can be found at http://www.domeczek.pl/~natko/papers/MTE-pl_Ljub.pdf. The conversion method is implemented in Python; the code and the data are available online at http://domeczek.pl/~polukr/mte-conv/.
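
As a rough illustration of table-driven tag recoding, a minimal sketch follows. It is not the project's converter: the table entries, the function names and the fallback value "X" are hypothetical placeholders chosen for this example, and the sketch does not attempt the changes in word segmentation mentioned above.

```python
# Illustrative sketch only. The entries below are hypothetical placeholders,
# not the project's actual IPIC-to-MTE correspondence tables.
TAG_TABLE = {
    "subst:sg:nom:f": "Ncfsn",
    "subst:pl:acc:f": "Ncfpa",
}


def convert_tag(ipic_tag, table=TAG_TABLE, unknown="X"):
    """Return the MTE-style MSD for an IPIC-style tag.

    Tags missing from the table map to `unknown` (an assumption made
    for this sketch, not documented behaviour of the real converter).
    """
    return table.get(ipic_tag, unknown)


def convert_tokens(tokens, table=TAG_TABLE):
    """Recode a list of (word, ipic_tag) pairs into (word, msd) pairs."""
    return [(word, convert_tag(tag, table)) for word, tag in tokens]


if __name__ == "__main__":
    sample = [("kobieta", "subst:sg:nom:f"), ("kobiety", "subst:pl:acc:f")]
    for word, msd in convert_tokens(sample):
        print(word, msd)
```

Keeping the correspondences in a plain lookup table separates the linguistic decisions from the code that applies them, so the table can be revised without touching the code.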

^§tagging declaration

^§namespace
name = http://www.tei-c.org/ns/1.0

^§tag usage gi = text occurs = 1	text
^§tag usage gi = body occurs = 1	text body
^§tag usage gi = div occurs = 27	text division
^§tag usage gi = p occurs = 1401	paragraph
^§tag usage gi = s occurs = 6666	s-unit
^§tag usage gi = w occurs = 79772	word
^§tag usage gi = c occurs = 17641	character
^§tag usage gi = back occurs = 1	back matter
^§tag usage gi = docAuthor occurs = 1	document author
^§tag usage gi = docDate occurs = 1	document date
^§tag usage gi = f occurs = 169	feature
^§tag usage gi = fLib occurs = 1	feature library
^§tag usage gi = fs occurs = 1324	feature structure
^§tag usage gi = fvLib occurs = 1	feature-value library
^§tag usage gi = head occurs = 2	heading
^§tag usage gi = item occurs = 3	item
^§tag usage gi = label occurs = 3	label
^§tag usage gi = list occurs = 1	list
^§tag usage gi = ref occurs = 1	reference
^§tag usage gi = symbol occurs = 169	symbolic value

^§revision description

^§change	2010-04-23_<date>Natalia Kotsyba_<name>, Tomaž Erjavec_<name> Changes to a couple of MSDs. Final for Version 4.
^§change	2010-02-28_<date>Natalia Kotsyba_<name> Some orthographic and annotation mistakes were corrected, and part and chapter numbers were introduced.
^§change	2009-11-04_<date>Tomaž Erjavec_<name> Draft P5 version.
^§change	2009-09-07_<date>Natalia Kotsyba_<name> Text of novel in TEI P4.