This document is an HTML 3.2 rendering of a Corpus Encoding Specification
DTD document, produced in the scope of the MULTEXT-East project by Fred.
Note that this HTML translation does not contain all the information from the cesHeader.
CES header
Creator: LD
Created: 1996-05-14
Updated: 1997-09-25
File Description
- Title Statement
- Title:
- Multext-East CES1: Fiction, Bulgarian
- Responsibility
- Ludmila Dimitrova
(
Inserting paragraph and some sub-paragraph level tagging
)
Lydia Sinapova
(
Correcting spelling of the electronic text.
Inserting additional paragraph and some
sub-paragraph level tagging.
)
- Responsibility
- Ludmila Dimitrova
(Comparing the electronic version with
the printed publications of the novels
'PASSION or the Death of Alice' by Emilia Dvorianova
and 'I want, I believe, I can' by Julia Berberyan
and checking the electronic versions. Modifying the full
Bulgarian Fiction corpus to conform to CES V4.0,
inserting paragraph and sub-paragraph level tagging.
Checked and modified markup down to sub-paragraph level.
)
- Edition:
- MTE Final Release
- Extent:
- 97251 words
2689133 bytes
Note:
WordCount represents the number of words in this text
exclusive of tags and header information.
Microsoft Word 6.0 was used to count words.
ByteCount reflects the approximate size of the file
containing the doctype and cesDoc element including
all text, tags and header information.
- Publication Statement
- Distributor:
-
Institute of Mathematics,
Bulgarian Academy of Sciences
- Address:
-
Acad G. Bonchev st. bl.8
1113 Sofia, Bulgaria
- Electronic address:
- mult@ling.math.acad.bg
- Availability:
-
Available for research purposes upon receipt of signed agreement
- Publication date:
- October 1, 1997
- Source Description
- Full Bibliography
- Title Statement
- Title:
-
Electronic form of the novel
"Passion or the death of Alice"
- Responsibility
-
Publishing house "OBSIDIAN"
(
Provided the electronic version of the novel
"Passion or the death of Alice"
)
- Publication Statement
- Distributor:
-
Publishing house "OBSIDIAN"
- Address:
-
Sofia 1124, Dobromir Hriz st. 31, Bulgaria
- Availability:
-
Available for internal use only
by the publishing house
- Publication date:
- 1995
- Source Description
- Structured Bibliography
- Monograph
- Title:
-
Passion или смъртта на Алиса
- Author:
- Емилия Дворянова
- Imprint
- Publication date:
- 1995
- Publisher:
- Publishing House "OBSIDIAN"
- Place:
- Sofia
- Full Bibliography
- Title Statement
- Title:
-
Electronic form of the novel
"I Want, I believe, I can"
- Responsibility
-
Publishing house "ABAGAR HOLDING"
(
Provided the electronic version of the novel
"I Want, I believe, I can"
)
- Publication Statement
- Distributor:
-
Publishing house "ABAGAR HOLDING"
- Address:
-
Sofia 1124, Dobromir Hriz st. 31, Bulgaria
- Availability:
-
Available for internal use only
by the publishers
- Publication date:
- 1995
- Source Description
- Structured Bibliography
- Monograph
- Title:
-
Искам, вярвам, мога
- Author:
- Юлия Берберян
- Imprint
- Publication date:
- 1995
- Publisher:
- Publishing House "ABAGAR HOLDING"
- Place:
- Sofia
Encoding Description
- Project Description:
-
MULTEXT-East:
Multilingual Text Tools and Corpora for Central and
Eastern European Languages.
EU Copernicus Project COP106
- Tag declaration:
- abbr = 952
All abbreviations are marked.
- body = 1
- byline = 2
- closer = 2
- date = 110
All dates which contain one or more digits (the characters 0-9) are
marked, including dates specifying day/month/year and dates consisting
only of a year. No attempt was made to identify or mark dates in other forms.
- dateline = 2
- div = 20
- docAuthor = 2
- foreign = 4
Only Latin words are marked as FOREIGN.
- head = 18
- hi = 233
The highlighting tag is used to mark words and phrases which were
typographically distinguished, and
for which no other more precise tag is applicable. In most of these
cases, such highlighting signifies emphasis.
- l = 2
- mentioned = 206
- name = 3410
All names of people, places, organizations,
products, and events are marked.
Some person names in the genitive are also marked.
- num = 996
Anything containing one or more digits (the characters 0-9) that is
not part of a date is marked as a number.
Numbers appearing in the form 6/4 or 10:30 are tagged
separately (scores of sports games are represented
this way in the tagged text).
- opener = 2
- p = 1332
- poem = 1
- q = 675
The Q tag is used to mark quoted dialogue.
- quote = 57
QUOTE marks quotations from outside sources.
- text = 1
- time = 74
- title = 52
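The tag declaration above lists how many times each element occurs in the corpus. A minimal sketch of how such a tally could be produced from marked-up text is shown below. It is not the tool used by the project: the function name count_tags, the regular expression, and the sample string are all illustrative assumptions, and a regular expression is only a rough stand-in for a real SGML parser.

```python
import re
from collections import Counter

# Matches start tags only: "</p>" is skipped because "/" is not a letter.
# Assumption: element names consist of letters, digits and dots.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.]*)[^>]*>")


def count_tags(markup: str) -> Counter:
    """Count start tags per element name, case-insensitively."""
    return Counter(m.group(1).lower() for m in START_TAG.finditer(markup))


# Hypothetical sample, not taken from the corpus.
sample = (
    '<p><name type="person">Alice</name> arrived at '
    "<time>10:30</time> in <date>1995</date>.</p>"
)

counts = count_tags(sample)
for tag in sorted(counts):
    # Same "tag = count" layout as the tag declaration.
    print(f"{tag} = {counts[tag]}")
```

Run over the full corpus file, a tally of this kind would be expected to give figures comparable to those listed above, provided that markup inside the header is excluded.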
Revision Description
- Date: 1996-10-25
Ludmila Dimitrova
- The tags: ABBR with EXPAN tag; NAME with type attributes PERSON,
ORG, PLACE, PRODUCT, EVENT; NUM; DATE; TIME; TITLE and FOREIGN
have been inserted appropriately
- Replaced Q and HI tags with MENTIONED tags where appropriate
- All occurrences of "..." have been replaced with the
ISO_8879:1986 Publishing entity "hellip"
- All occurrences of "%" have been replaced with the
ISO_8879:1986 Publishing entity "percnt"
- Date: 1996-11-12
Lydia Sinapova
- Correction in comments
- Date: 1996-11-15
Lydia Sinapova
- tagging abbreviated initials of personal names
- Date: 1997-03-20
Tomaž Erjavec, IJS
- Normalisation of corpus component CESHEADER elements:
CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
- ISO LANGUAGEs implemented as marked section PUBLIC ent
- Language (WSDs) implemented as PUBLIC entities
- Date: 1997-03-27
Tomaž Erjavec, IJS
- Substituted IGCY entity with JCY (345 occurrences)
- Date: 1997-09-25
Tomaž Erjavec
- Changed editionStmt, byteCount, pubDate, Availability
to final form
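The revision of 1996-10-25 records that every "..." and "%" in the text was replaced with the ISO 8879:1986 publishing entities "hellip" and "percnt". A minimal sketch of that substitution follows, assuming the entities are written in the usual SGML reference form (&hellip; and &percnt;). The function name to_entities is hypothetical, and the actual conversion procedure used for the corpus is not described in the header.

```python
def to_entities(text: str) -> str:
    """Replace literal "..." and "%" with SGML entity references.

    Assumption: plain string replacement is sufficient, i.e. the input
    contains no markup in which these characters must stay literal.
    """
    return text.replace("...", "&hellip;").replace("%", "&percnt;")


# Hypothetical sample sentence, not taken from the corpus.
print(to_entities("Wait... 50% done"))
```

The ellipsis is replaced first so that a run of three full stops is always consumed as one unit; neither replacement string contains the other pattern, so the order does not otherwise affect the result.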