CES header in HTML

This document is an HTML 3.2 rendering of a Corpus Encoding Specification DTD document,
produced by Fred, using the ceshdr2html_tmap.fred translation map.

Note that this HTML translation does not contain all the information from the original document.

Uses ISO 8859-1 (Latin-1) encoding.

CES header

Version: 4.1, Type: text, Language: en,
Creator: SIT, Status: update, Created: 1997-09-27, Updated: 1997-12-20

File Description

Title Statement

Title: CES: Nineteen Eighty-Four, Russian
Responsibility: Paul Sokolovsky, Sergey Sryvkin (Proofreading, hyphenation deletion, formatting, inserting paragraph and sub-paragraph level tagging.)

Edition:

MTE Final Release

Extent:

76469 words, 2.2 MB
Note: WordCount represents the number of words in this text exclusive of tags and header information, counted before the markup process. ByteCount reflects the approximate size of the file containing the doctype and cesDoc element, including all text, tags and header information.

Publication Statement

Distributor: Severodonetsk Institute of Technology, East-Ukraine State University
Address: Sovetsky st., bl. 3a, Severodonetsk, Lugansk reg., Ukraine
Electronic address: Paul.Sokolovsky@technologist.com
Availability: Available for research purposes upon receipt of signed agreement
Publication date: January 1st, 1998

Source Description

Full Bibliography

Title Statement

Title: Orwell's 1984, Russian: plaintext electronic edition
Responsibility: Maxim Moshkov's Library (Made the electronic edition available on the Internet)

Publication Statement

Distributor: Maxim Moshkov's Library
Address: http://www.moshkow.orc.ru/koi http://www.alkar.net/moshkow/html-KOI
Availability:
Publication date: Unknown

Source Description

Structured Bibliography

Monograph

Title:

Nineteen Eighty Four (Russian)

Author:

George Orwell

Author:

Translator: V. Golyshev

Imprint

Publication date: Unknown
Publisher: Unknown
Place: Unknown

Encoding Description

Project Description:

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. EU Copernicus Project COP106. This text is a volunteer contribution to the project.

Editorial declaration:

Conformance: Corpus Encoding Standard, Version 4.0
Correction:
Quotation: No quotation marks are preserved in the text. Following the conventions of written Russian, only double quotes are used in the rendition ("PRE ldquo POST rdquo").
Segmentation: Marked up to the paragraph level: P, QUOTE, NOTE, plus marking of the sub-paragraph element Q. Some marking of particular sub-paragraph elements: NAME, DATE, TIME, MENTIONED, FOREIGN, ABBR.
Hyphenation: No hyphenation marks are present in the text.

Tag declaration:

abbr = 11: All abbreviations are marked.
body = 1
date = 36: All dates which contain one or more digits (the characters 0-9) are marked, including dates specifying day/month/year and dates consisting only of a year. The attribute 'iso8601' is used consistently. If a phrase contains two dates, one written in digits and the other in words, the latter is marked up too, e.g. "in 1944 and forty-five". No attempt was made to identify or mark dates in other forms.
div = 28
foreign = 348: Some of the other mteO-??.ces files note that only words highlighted typographically in the text were marked. We instead mark up Newspeak words if, for some reason (mostly morphological), they cannot be correct Russian. A typical example is the translation of 'telescreen'. We think the Newspeak idea was carried over into it wonderfully. Instead of literally translating "telescreen" as "теле-экран", the translator contracted 'е' and 'э' (an unusual phenomenon in Russian), resulting in "телекран". It sounds even more awful given that the stem "кран" has no semantic relation to the original "экран". So this word cannot be plain Russian: it is 100% Newspeak Russian!
head = 1
hi = 10: This applies to the rend attribute of other tags too. As our primary source for markup was an electronic plaintext version, no character-level typographical rendition except capitalization was present in the original. Paragraph-level rendition included only line breaking and centering. Although we consulted a printed book during the process, we decided not to add character-level rendition, because the book is a different version and because the CES guidelines require that all rendition be resolved into descriptive tags, with no rendition attributes left whose sole purpose is to recreate the original appearance. So only the CA and CE values of rend are used. As in the other mteO-?? files, capitalized text was decapitalized.
item = 4
l = 39
list = 1
mentioned = 216: CES and TEI give rather vague criteria for what to mark as mentioned. We tried to carry over the occurrences from Oen, though in places this may be inconsistent.
name = 2105: All names of people, places, organizations, products, and events are marked. Person names in the genitive are not marked. All names of countries and towns are marked with type=place. Names of rivers and oceans are also marked with type=place. Some other proper nouns (or noun groups) denoting places were marked as well, e.g. Golden Country and Chestnut Tree Café.
note = 2: Strangely, the electronic version had no notes, although they exist in the printed reference edition. We have reinserted them.
num = 12: Numbers are marked only if the corresponding number in the English version is marked too, so only some occurrences are marked.
poem = 10
ptr = 2
q = 2160: The Q tag is used to mark slogans and quoted dialogue. The attribute "broken=yes" is currently not inserted when no sentence terminating punctuation (either inside the Q itself or in the intervening text between two Qs) appears between two dialogue fragments by the same speaker.
quote = 30: QUOTE marks quotations from outside sources, including extensive quotations from Winston's diary and Goldstein's treatise.
ref = 1
s = 0: S tags have not yet been inserted.
text = 1
title = 44
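
The per-tag counts listed above can in principle be re-derived from the marked-up text. The following is a minimal Python sketch of such a tally, not the tool actually used for this header. It assumes that elements are written as ordinary SGML start tags; the sample fragment, its ID value and the form of the iso8601 value are hypothetical and are not taken from the corpus.

```python
import re
from collections import Counter

# Matches a start tag such as <name type=place> and captures the element name.
# End tags (</name>) and declarations (<!DOCTYPE ...>) do not match, because
# the character after "<" must be a letter.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>")

def count_tags(text: str) -> Counter:
    """Return a case-insensitive tally of start tags in SGML-like text."""
    return Counter(m.group(1).lower() for m in START_TAG.finditer(text))

# Hypothetical fragment for illustration only.
sample = (
    '<p id="Oru.1.1.1"><name type=place>Лондон</name>, '
    '<date iso8601="1984-04-04">4 апреля 1984 года</date>. '
    '<q>Свобода это рабство</q></p>'
)

counts = count_tags(sample)
for tag, n in sorted(counts.items()):
    print(f"{tag} = {n}")
```

Counting only start tags keeps each element from being counted twice, which is what a figure such as "name = 2105" presumably reflects.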

Revision Description

Date: 27 Sep 1997 (Team)

Started -- Automatically marked some names

Date: 2 Oct 1997 (Team)

Created header

Date: 6 Oct 1997 (Team)

Hand-marking most of first chapter: quotations, names, etc.

Date: 8 Oct 1997 (Sergey Sryvkin)

Completed first chapter: quotations, names, etc.

Date: 9 Oct 1997 (Sergey Sryvkin)

Checked and cleaned up the first chapter.

Date: 16 Nov 1997 (Paul Sokolovsky)

tagusage, entities encoding

Date: 19 Nov 1997 (Sergey Sryvkin)

Completed second chapter.

Date: 23 Nov 1997 (Sergey Sryvkin)

Completed the third and fourth chapters of Part 1. Quotations and mentions marked without attributes. Marked up some similar phrases throughout the text.

Date: 2 Dec 1997 (Sergey Sryvkin)

Completed the fifth, sixth and seventh chapters of Part 1. Quotations and mentions marked without attributes. Marked up almost all similar phrases throughout the text.

Date: 6 Dec 1997 (Sergey Sryvkin)

Completed Part 1 and Part 2. Chapter 1 of Part 3 has been completed too. Quotations and mentions marked without attributes. Marked up all similar phrases throughout the text.

Date: 12 Dec 1997 (Sergey Sryvkin)

Completed the whole text. Quotations and mentions marked without attributes.

Date: 15 Dec 1997 (Paul Sokolovsky)

Proofreading. Correcting typos and tagging. Part 1.

Date: 16 Dec 1997 (Paul Sokolovsky)

Proofreading. Correcting typos and tagging. Part 2. Linking broken q's.

Date: 16 Dec 1997 (Paul Sokolovsky)

Proofreading completed.
"..." changed to entity "hellip".
Missing footnotes inserted.

Date: 20 Dec 1997 (Tomaz Erjavec)

Changed PUBDATE in PUBLICATIONSTMT
Changed AVAILABILITY
Changed SOURCEDESC
Changed some other minor things in the header
Inserted BYTECOUNT
Inserted missing TAGUSAGEs
Inserted OCCURS in all TAGUSAGEs
Changed ID prefix 'ORWru' to 'Oru'
Inserted IDs for P, POEM, LIST, ITEM, L
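
Two of the revisions recorded above, the change of "..." to the entity "hellip" and the change of the ID prefix 'ORWru' to 'Oru', are simple text substitutions. The header does not say whether they were made by hand or by script; the Python sketch below only illustrates how they could be scripted. The function names are hypothetical, and it is an assumption that IDs appear as id attribute values.

```python
import re

def replace_ellipsis(text: str) -> str:
    """Replace each literal three-dot sequence with the hellip entity reference."""
    return text.replace("...", "&hellip;")

# Assumption: IDs occur as id="ORWru..." or id=ORWru... attribute values.
ID_PREFIX = re.compile(r'(\b[Ii][Dd]\s*=\s*"?)ORWru')

def rename_id_prefix(text: str) -> str:
    """Rewrite the old ID prefix 'ORWru' to 'Oru' inside id attributes only."""
    return ID_PREFIX.sub(r"\1Oru", text)

# Hypothetical line for illustration only.
line = '<p id="ORWru.1.2">Он подумал... и замолчал.</p>'
fixed = rename_id_prefix(replace_ellipsis(line))
print(fixed)
```

Restricting the prefix change to id attributes avoids touching any running text that happens to contain the same string.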

Meta-Made by et