<cesHeader
version="4.1"
type="text"
lang=en
creator=OCS
status="update"
date.created="1997-11-24"
date.updated="1997-12-21"
>
<filedesc>
<titlestmt>
<h.title>Multext-East cesAna: Nineteen Eighty-Four, Hungarian</h.title>
<respstmt>
<respname>Csaba Oravecz</respname>
<resptype>Overall Responsibility</resptype>
<respname>Vladimír Petkevič</respname>
<resptype>Conversion to cesAna DTD</resptype>
</respstmt>
</titlestmt>
<editionstmt version="1.0">MTE Final Release</editionstmt>
<extent>
<wordCount>80705</wordCount>
<byteCount units="MB">18.4</byteCount>
<extnote>wordCount represents the number of TOK TYPE=WORD
elements in the text. byteCount is in megabytes.</extnote>
</extent>
<publicationstmt>
<distributor>
Research Institute for Linguistics, Hungarian Academy of Sciences
</distributor>
<pubaddress> Budapest, Színház u. 5-9.</pubaddress>
<eaddress type="email">oravecz@nytud.hu</eaddress>
<eaddress type="www">http://www.nytud.hu</eaddress>
<availability status="restricted">
Available for research purposes upon receipt of signed agreement
</availability>
<pubDate value="1998-01-01">January 1st, 1998</pubDate>
</publicationstmt>
<sourcedesc>
<biblStruct>
<monogr>
<h.title>1984</h.title>
<h.author>George Orwell</h.author>
<imprint>
<pubdate>1989</pubdate>
<publisher>Európa Könyvkiadó</publisher>
<pubplace>Budapest</pubplace>
</imprint>
</monogr>
</biblStruct>
</sourcedesc>
</filedesc>
<encodingdesc>
<projectdesc>
MULTEXT-East:
Multilingual Text Tools and Corpora for Central and Eastern
European Languages.
EU Copernicus Project COP106
</projectdesc>
<editorialdecl>
<transduction>
In the cesDoc to cesAna conversion, DIV, QUOTE, Q tags and
HEAD, POEM, LIST elements have been omitted. cesDoc P
elements are encoded as PAR, and S as S.
cesDoc sub-S level tags are omitted: DATE, NAME, ABBR, etc.
</transduction>
<quotation>
Q and QUOTE tags from the cesDoc source are not retained.
</quotation>
<segmentation>
S segmentation same as in cesDoc source (hand-validated).
TOK segmentation performed with mtseg and manually corrected.
</segmentation>
</editorialdecl>
<tagsdecl>
<tagusage gi=chunkList occurs=1>
Element corresponds to TEXT of the cesDoc source
</tagusage>
<tagusage gi=chunk occurs=1>
Element corresponds to BODY of the cesDoc source
</tagusage>
<tagusage gi=par occurs=1303>
Elements correspond to P elements of the cesDoc source.
The FROM attribute gives the reference to the ID of the
corresponding cesDoc P element.
</tagusage>
<tagusage gi=s occurs=6768>
Elements correspond to S elements of the cesDoc source.
The FROM attribute gives the reference to the ID of the
corresponding cesDoc S element.
</tagusage>
<tagusage gi=tok occurs=98426>
Tokens are of TYPE=WORD or PUNCT, with the CLASS attribute
giving the mtseg class of the token.
</tagusage>
<tagusage gi=orth occurs=98426>
Contains the orthography of the token, as found in the
cesDoc source.
</tagusage>
<tagusage gi=disamb occurs=80705>
Contains disambiguated lexical information.
</tagusage>
<tagusage gi=lex occurs=111945>
Contains undisambiguated lexical information.
</tagusage>
<tagusage gi=base occurs=192650>
Base or lemma of a token.
</tagusage>
<tagusage gi=msd occurs=192650>
Morphosyntactic description of a token.
</tagusage>
<tagusage gi=ctag occurs=98426>
Corpus tag.
</tagusage>
</tagsdecl>
</encodingdesc>
<profiledesc>
<creation date="1997-11-04">
</creation>
<langusage>
<![ %ONECOMPONENT [ &ISOlang; ]]>
<language id=ns-hu iso639=hu>Newspeak Hungarian</language>
</langusage>
</profiledesc>
<revisiondesc>
<change>
<changedate>1997-11-24</changedate>
<respname>Csaba Oravecz, RIL</respname>
<h.item>Initial header</h.item>
</change>
<change>
<changedate>1997-12-21</changedate>
<respname>Tomaz Erjavec, IJS</respname>
<h.item>Converted from ISO Latin-2 to SGML entities</h.item>
<h.item>Changed ... to …</h.item>
<h.item>Modified EDITIONSTMT, BYTECOUNT</h.item>
</change>
</revisiondesc>
</cesheader>