TEI Header

§file description
§title statement
§title
id = mteo-et.title
Multext-East cesDoc corpus: Nineteen Eighty-Four, Estonian
§statement of responsibility
§name Viire Villandi
§responsibility entered and validated the text
§statement of responsibility
§name Heili Orav
§responsibility added CES tags
§statement of responsibility
§name Heiki-Jaan Kaalep
§responsibility supervised the work
§statement of responsibility
§name Heiki-Jaan Kaalep
§responsibility modified the tags and header for version 4
§statement of responsibility
§name Leho Paldre
§responsibility modified the tags and header for version 4.1
§statement of responsibility
§name Greg Priest-Dorman
§responsibility Added tagging of sentences in paragraphs using MtSgml and Estonian resources.
§statement of responsibility
§name Leho Paldre
§responsibility Manually checked automatic tagging of sentences. Corrected 48 typos.
§statement of responsibility
§name Heiki-Jaan Kaalep
§responsibility Corrected the final bytecount and wordcount.
§statement of responsibility
§name Tomaž Erjavec
§responsibility Conversion to XML/TEI P5
§edition statement
§edition MULTEXT-East, Version 4
§extent 79334 <measure> WordCount represents the number of words in this text, exclusive of tags and header information. ByteCount reflects the approximate size of the file containing the doctype and cesDoc element, including all text, tags and header information.
§publication statement
§address http://nl.ijs.si/ME/V4/
§distributor TÜ arvutuslingvistika uurimisgrupp
§address Tiigi 78-232, Tartu, Estonia
§address eAddress: hkaalep@psych.ut.ee
§date
when = 2010-05-09
2010-05-09
§source description
§fully-structured bibliographic citation
§title statement
§title Multext-East CES1: Nineteen Eighty-Four, Estonian
§statement of responsibility
name Viire Villandi
responsibility entered and validated the text
§statement of responsibility
name Heili Orav
responsibility added CES tags
§statement of responsibility
name Heiki-Jaan Kaalep
responsibility supervised the work
§statement of responsibility
name Heiki-Jaan Kaalep
responsibility modified the tags and header for version 4
§statement of responsibility
name Leho Paldre
responsibility modified the tags and header for version 4.1
§statement of responsibility
name Greg Priest-Dorman
responsibility Added tagging of sentences in paragraphs using MtSgml and Estonian resources.
§statement of responsibility
name Leho Paldre
responsibility Manually checked automatic tagging of sentences. Corrected 48 typos.
§statement of responsibility
name Heiki-Jaan Kaalep
responsibility Corrected the final bytecount and wordcount.
§edition statement

MTE Final Release

§publication statement
§distributor TÜ arvutuslingvistika uurimisgrupp
§address Tiigi 78-232, Tartu, Estonia
§address eAddress: hkaalep@psych.ut.ee
§availability

Freely available

§date
when = 1997-10-01
October 1, 1997
§source description
§structured bibliographic citation
monographic level
title 1984
author George Orwell
statement of responsibility
name Elias Treeman
responsibility Translator from English
edition Loomingu Raamatukogu 1990 nr. 48-51
imprint
publisher Perioodika
publication place Tallinn
date 1990
§encoding description
§project description

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. EU Copernicus Project COP106

§editorial practice declaration
§normalization

Corpus Encoding Standard, Version 4.1 CES LEVEL: 1

§correction principles

§segmentation

Marked up to the level of the paragraph: P and QUOTE, plus marking of the sub-paragraph element Q, including broken Qs. Some marking of particular sub-paragraph elements. BODY, DIV, HEAD, ITEM, L, LIST, NOTE, P, POEM, PTR, QUOTE and TEXT are used so as to be in harmony with the English electronic version of 1984 for MULTEXT-EAST v. 4; the differences are due only to the differences between the English electronic and Estonian printed versions. ABBR, DATE, FOREIGN, HI, MENTIONED, NAME, NUM, Q and TITLE are used sloppily.

§hyphenation

No end-of-line hyphenation

§tagging declaration
§namespace
name = http://www.tei-c.org/ns/1.0
§tag usage
gi = abbr occurs = 73
abbreviation
§tag usage
gi = body occurs = 1
text body
§tag usage
gi = date occurs = 18
date
§tag usage
gi = div occurs = 28
text division
§tag usage
gi = foreign occurs = 93
foreign
§tag usage
gi = head occurs = 5
heading
§tag usage
gi = hi occurs = 183
highlighted
§tag usage
gi = item occurs = 4
item
§tag usage
gi = l occurs = 32
verse line
§tag usage
gi = list occurs = 1
list
§tag usage
gi = mentioned occurs = 44
mentioned
§tag usage
gi = name occurs = 2457
name
§tag usage
gi = note occurs = 2
note
§tag usage
gi = num occurs = 14
number
§tag usage
gi = p occurs = 1289
paragraph
§tag usage
gi = lg occurs = 10
line group
§tag usage
gi = ptr occurs = 2
pointer
§tag usage
gi = q occurs = 2192
separated from the surrounding text with quotation marks
§tag usage
gi = quote occurs = 35
quotation
§tag usage
gi = s occurs = 6658
s-unit
§tag usage
gi = text occurs = 1
text
§tag usage
gi = title occurs = 29
title
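
The tag usage entries above pair an element name (gi) with a declared occurrence count (occurs). As a minimal sketch of how such a declaration could be read programmatically, the Python below parses a small hand-written fragment and collects the counts. The fragment is an assumption made for illustration: it reproduces only three of the entries listed above, and the surrounding tagsDecl and namespace element layout follows general TEI P5 practice rather than being copied from the corpus file. The function name tag_counts is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace named in the tagging declaration of this header.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: only three of the tag usage entries, written by hand.
SAMPLE = f"""
<tagsDecl xmlns="{TEI_NS}">
  <namespace name="{TEI_NS}">
    <tagUsage gi="p" occurs="1289">paragraph</tagUsage>
    <tagUsage gi="q" occurs="2192">separated from the surrounding text with quotation marks</tagUsage>
    <tagUsage gi="s" occurs="6658">s-unit</tagUsage>
  </namespace>
</tagsDecl>
"""


def tag_counts(xml_text):
    """Return {element name: declared occurrence count} for every tagUsage."""
    root = ET.fromstring(xml_text.strip())
    counts = {}
    for usage in root.iter(f"{{{TEI_NS}}}tagUsage"):
        counts[usage.get("gi")] = int(usage.get("occurs"))
    return counts


counts = tag_counts(SAMPLE)
print(counts)
```

Comparing such declared counts with counts taken from the text body itself would be one way to check that the tag usage figures are still up to date after an edit, as the revision entries that mention updating TAGUSAGE suggest was done by hand.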
§text-profile description
§language usage
§language
ident = ns-et
Newspeak Estonian
§text classification
§category reference
target = orwl
§revision description
§change 1996-10-31<date>Heiki-Jaan Kaalep, UT<name>Changed the header to conform to the new CES version
§change 1997-02-19<date>Leho Paldre, UT<name>Identified broken Qs, removed redundant HEADs, checked MENTIONED tags; updated the header
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Normalisation of corpus component CESHEADER elements: CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>ISO LANGUAGEs implemented as marked section PUBLIC ent
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Language (WSDs) implemented as PUBLIC entities
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Newspeak LANGUSAGE/LANGUAGE IDs now ns-xx for lang xx
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Now every QUOTE in 1984 has at least one P
§change 1997-04-04<date>Greg Priest-Dorman<name>inserted S tags in the locations given by MtSeg
§change 1997-04-04<date>Greg Priest-Dorman<name> inserted Q and HI tags where necessary as a result of S tag insertion
§change 1997-04-04<date>Greg Priest-Dorman<name>updated TAGUSAGE
§change 1997-06-18<date>Leho Paldre, UT<name>Checked S tagging manually; removed 2 redundant HEADs
§change 1997-06-18<date>Leho Paldre, UT<name>Corrected 48 typos manually; added 2.5 missing sentences.
§change 1997-06-18<date>Leho Paldre, UT<name>updated TAGUSAGE
§change 1997-08-06<date>Tomaž Erjavec<name>Removed empty S Oet.2.8.1.3, Oet.2.10.7.10.1 and empty Q Oet.1.9.58.7.1, Oet.4.15.4.1
§change 1997-08-06<date>Tomaž Erjavec<name>normalised some HI, FOREIGN so that RE does not occur in tag
§change 1997-08-06<date>Tomaž Erjavec<name>updated TAGUSAGE for Q and S, BYTECOUNT
§change 1997-08-18<date>Heiki-Jaan Kaalep<name>Removed 2 S which contained random keystrokes and nothing more
§change 1997-08-18<date>Heiki-Jaan Kaalep<name>normalised one HI so that RE does not occur in tag
§change 1997-08-18<date>Heiki-Jaan Kaalep<name>updated TAGUSAGE for S, BYTECOUNT
§change 1997-09-25<date>Tomaž Erjavec<name>Changed editionStmt, byteCount, pubDate to final form
§change 1997-12-03<date>Tomaž Erjavec<name>Fixed mdash ents that were not terminated by ;
§change 2004-05-10<date>Tomaž Erjavec<name>Converted to TEI P4, prepared for MTE V3
§change 2010-05-09<date>Tomaž Erjavec<name>Conversion to MULTEXT-East TEI P5.