TEI Header

§file description
§title statement
§title
id = mteo-bg.title
Multext-East cesDoc corpus: Nineteen Eighty-Four, Bulgarian
§statement of responsibility
§name Lydia Sinapova, Ludmila Dimitrova, Kiril Simov
§responsibility Typing-in '1984', inserting paragraph and some sub-paragraph level tagging.
§statement of responsibility
§name Lydia Sinapova
§responsibility Modified full Orwell markup down to sub-paragraph level to conform to CES V4.0, using the English version as a base
§statement of responsibility
§name Greg Priest-Dorman
§responsibility Added tagging of sentences in paragraphs using MtSgml and Bulgarian resources.
§statement of responsibility
§name Tomaž Erjavec
§responsibility Conversion to XML/TEI P5
§edition statement
§edition MULTEXT-East, Version 4
§extent 87235<measure> WordCount represents the number of words in this text exclusive of tags and header information. Microsoft Word 6.0 was used to count words. ByteCount reflects the approximate size of the file containing the doctype and cesDoc element including all text, tags and header information.
§publication statement
§address http://nl.ijs.si/ME/V4/
§distributor Institute of Mathematics, Bulgarian Academy of Sciences
§address Acad G. Bonchev st. bl.8 1113 Sofia, Bulgaria
§address eAddress: mult@ling.math.acad.bg
§date
when = 2010-05-09
2010-05-09
§source description
§fully-structured bibliographic citation
§title statement
§title Multext-East CES1: Nineteen Eighty-Four, Bulgarian
§statement of responsibility
name Lydia Sinapova, Ludmila Dimitrova, Kiril Simov
responsibility Typing-in '1984', inserting paragraph and some sub-paragraph level tagging.
§statement of responsibility
name Lydia Sinapova
responsibility Modified full Orwell markup down to sub-paragraph level to conform to CES V4.0, using the English version as a base
§statement of responsibility
name Greg Priest-Dorman
responsibility Added tagging of sentences in paragraphs using MtSgml and Bulgarian resources.
§edition statement

MTE Final Release

§publication statement
§distributor Institute of Mathematics, Bulgarian Academy of Sciences
§address Acad G. Bonchev st. bl.8 1113 Sofia, Bulgaria
§address eAddress: mult@ling.math.acad.bg
§availability

Available for research purposes upon receipt of signed agreement

§date
when = 1997-10-01
October 1, 1997
§source description
§structured bibliographic citation
monographic level
title Nineteen Eighty Four (Bulgarian)
author George Orwell
statement of responsibility
responsibility Translator:
name Lydia Bozhilova
imprint
date 1989
publisher Profizdat
publication place Sofia
§encoding description
§project description

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. EU Copernicus Project COP106

§editorial practice declaration
§normalization

Corpus Encoding Standard, Version 4.0 CES LEVEL: 1

§quotation
form = std

No quotation marks are preserved in text. Rendition attribute values on Q and QUOTE tags are adapted from ISOpub and ISOnum standard entity set names. Two rendition short-cuts are used: 'rend=mdash' stands for 'rend="PRE mdash POST mdash"', and 'rend=dblq' stands for 'rend="PRE ldquo POST rdquo"'. 'rend="PRE mdash"' (or "PRE ldquo") is used when the quoted dialogue ends with the paragraph (there is no other typographical distinction). 'rend="POST mdash"' (or "POST rdquo") is used when there is no typographical distinction (except ordinary punctuation) for the beginning of the quoted dialogue. No default rendition is used.
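
The short-cut convention described in this declaration can be illustrated with a minimal sketch. It assumes that a rend value is either one of the two short-cuts or a whitespace-separated sequence of position/entity pairs; the names REND_SHORTCUTS, expand_rend and parse_rend are hypothetical and are not part of the corpus tooling.

```python
# Short-cut values as stated in the quotation declaration.
REND_SHORTCUTS = {
    "mdash": "PRE mdash POST mdash",
    "dblq": "PRE ldquo POST rdquo",
}


def expand_rend(value):
    """Return the full rendition string for a rend attribute value.

    Values that are not short-cuts are returned unchanged.
    """
    return REND_SHORTCUTS.get(value, value)


def parse_rend(value):
    """Split a rend value into a {position: entity} mapping.

    Assumption: the expanded value consists of pairs such as
    "PRE mdash" or "POST rdquo", separated by whitespace.
    """
    tokens = expand_rend(value).split()
    if len(tokens) % 2 != 0:
        raise ValueError(f"unpaired rend token in {value!r}")
    return {tokens[i]: tokens[i + 1] for i in range(0, len(tokens), 2)}
```

Under these assumptions, parse_rend("PRE ldquo") yields only an opening mark, which corresponds to the case where the quoted dialogue runs to the end of the paragraph.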

§segmentation

Marked up to the level of paragraph: P, QUOTE, POEM, NOTE, plus marking of sub-paragraph element Q. Some marking of particular sub-paragraph elements: NAME, DATE, TIME, MENTIONED, FOREIGN, ABBR.

§hyphenation

No end-of-line hyphenation present.

§tagging declaration
§namespace
name = http://www.tei-c.org/ns/1.0
§tag usage
gi = abbr occurs = 28
abbreviation
§tag usage
gi = body occurs = 1
text body
§tag usage
gi = date occurs = 40
date
§tag usage
gi = div occurs = 28
text division
§tag usage
gi = foreign occurs = 29
foreign
§tag usage
gi = head occurs = 1
heading
§tag usage
gi = hi occurs = 103
highlighted
§tag usage
gi = item occurs = 4
item
§tag usage
gi = l occurs = 26
verse line
§tag usage
gi = list occurs = 1
list
§tag usage
gi = mentioned occurs = 256
mentioned
§tag usage
gi = name occurs = 1704
name
§tag usage
gi = note occurs = 8
note
§tag usage
gi = num occurs = 34
number
§tag usage
gi = p occurs = 1321
paragraph
§tag usage
gi = lg occurs = 7
line group
§tag usage
gi = ptr occurs = 8
pointer
§tag usage
gi = q occurs = 2203
separated from the surrounding text with quotation marks
§tag usage
gi = quote occurs = 34
quotation
§tag usage
gi = s occurs = 6649
s-unit
§tag usage
gi = text occurs = 1
text
§tag usage
gi = title occurs = 41
title
§text-profile description
§language usage
§language
ident = bg-cl
Bulgarian colloquial
§language
ident = ns-bg
Newspeak Bulgarian
§language
ident = ns-jg-bg
Newspeak official jargon Bulgarian
§text classification
§category reference
target = orwl
§revision description
§change 1996-10-25<date>Lydia Sinapova<name>Replaced Q tags with MENTIONED tags where appropriate
§change 1996-10-25<date>Lydia Sinapova<name>linked broken Q tags with "prev" and "next" attributes
§change 1996-10-25<date>Lydia Sinapova<name>all occurrences of "..." have been replaced with the ISO_8879:1986 Publishing entity "hellip"
§change 1996-02-20<date>Lydia Sinapova<name>Replaced Q and HI tags with MENTIONED tags in accordance with the English tagging where appropriate
§change 1996-02-20<date>Lydia Sinapova<name>Changes in the use of NAME tag with TYPE=PLACE - removed where previously used for names of rivers and oceans
§change 1996-02-20<date>Lydia Sinapova<name>Tagged using Bulgarian jargon with LANG=BG-CL corresponding to English LANG=EN-CK
§change 1996-02-20<date>Lydia Sinapova<name>Using LIST tag corresponding to the English tagging
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Normalisation of corpus component CESHEADER elements: CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>ISO LANGUAGEs implemented as marked section PUBLIC ent
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Language (WSDs) implemented as PUBLIC entities
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Newspeak LANGUSAGE/LANGUAGE IDs now ns-xx for lang xx
§change 1997-03-20<date>Tomaz Erjavec, IJS<name>Now every QUOTE in 1984 has at least one P
§change 1997-03-27<date>Tomaz Erjavec, IJS<name>Substituted IGCY entity with JCY
§change 1997-04-04<date>Greg Priest-Dorman<name>inserted S tags in the locations given by MtSeg
§change 1997-04-04<date>Greg Priest-Dorman<name> inserted Q tags where necessary as a result of S tag insertion
§change 1997-04-04<date>Greg Priest-Dorman<name>updated TAGUSAGE for Q and S
§change 1997-08-06<date>Tomaž Erjavec<name>Removed empty S Obg.1.2.27.2.1 and empty Q Obg.1.1.34.11.1
§change 1997-08-06<date>Tomaž Erjavec<name>updated TAGUSAGE for Q and S, BYTECOUNT
§change 1997-09-25<date>Tomaž Erjavec<name>Changed editionStmt, byteCount, pubDate, Availability to final form
§change 2004-05-10<date>Tomaž Erjavec<name>Converted to TEI P4, prepared for MTE V3
§change 2010-05-09<date>Tomaž Erjavec<name>Conversion to MULTEXT-East TEI P5.