This document is an HTML 3.2 rendering of a
Corpus Encoding Specification
DTD document, produced in the scope of the
MULTEXT-East
project, by
Fred.
Note that this HTML translation does not contain all the information from the cesHeader.
CES header
Creator: ET
Created: 1996-05-07
Updated: 1997-09-25
File Description
- Title Statement
- Title:
- Multext-East CES1: Newspapers, Slovene
- Responsibility
-
Tomaž Erjavec,
LST group,
Dept. for Intelligent Systems
Jožef Stefan Institute
(
CES1 conformance.
)
Miro Romih
Amebis d.o.o.
(
Up-translation from diskette format,
typographical error correction.
)
- Edition:
- MTE Final Release
- Extent:
- 101749 words
808924 bytes
- Publication Statement
- Distributor:
-
Dept. for Intelligent Systems
Jožef Stefan Institute,
- Address:
-
Jamova 39,
Ljubljana, Slovenia
- Electronic address:
-
tomaz.erjavec@ijs.si
- Electronic address:
-
http://nl.ijs.si/ME
- Availability:
-
Available for research purposes upon receipt of signed agreement
- Publication date:
- October 1, 1997
- Source Description
- Full Bibliography
- Title Statement
- Title:
-
Original digital form of the 'Dnevnik' articles:
editor's diskettes with idiosyncratic markup
- Responsibility
-
The 'Dnevnik' Daily
(
Collected and edited the texts from authors
)
- Publication Statement
- Distributor:
-
The 'Dnevnik' Daily
- Address:
-
Ljubljana, Slovenia
- Availability:
-
- Publication date:
-
Unknown
- Source Description
- Structured Bibliography
- Monograph
- Title:
-
45 articles from the 'Dnevnik' Daily
- Imprint
- Publisher:
-
Dnevnik
- Publication date:
-
8--10 1995
- Place:
-
Ljubljana, Slovenia
Encoding Description
- Project Description:
-
MULTEXT-East:
Multilingual Text Tools and Corpora for Central and Eastern
European Languages.
EU Copernicus Project COP106
- Tag declaration:
- body = 1
- byline = 54
- date = 45
- div = 396
- docauthor = 55
- figdesc = 67
- figure = 67
- head = 379
- name = 83
- opener = 45
- p = 1204
- q = 881
- text = 1
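Per-element counts like those in the tag declaration above can be reproduced by tallying start tags in the corpus file. The following is only a minimal sketch, not the tool used by the project: the function name `count_tags` and the sample string are hypothetical, and it assumes fully written-out start tags (no SGML tag minimization).

```python
import re
from collections import Counter

# Matches a start tag and captures the element name. End tags (</p>),
# declarations (<!DOCTYPE ...>) and comments are not matched, because the
# character after "<" must be a letter.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.\-]*)(?:\s[^<>]*)?>")


def count_tags(text: str) -> Counter:
    """Return element names (lower-cased) mapped to their start-tag counts."""
    return Counter(m.group(1).lower() for m in START_TAG.finditer(text))


# Hypothetical miniature document, for illustration only.
sample = (
    "<text><body><div><head>Naslov</head>"
    "<p>Prvi.</p><p>Drugi <q>citat</q>.</p></div></body></text>"
)

for name, n in sorted(count_tags(sample).items()):
    print(f"{name} = {n}")
```

Lower-casing the names makes the tally insensitive to tag case, which suits SGML, where element names are normally case-insensitive.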
Revision Description
- Date: 1996-05-06
Amebis d.o.o.
-
Corrected spelling mistakes that could be caught with
a spelling checker;
Up-translated to almost-CES
- Date: 1996-05-07
Tomaž Erjavec, IJS
- Made header
- Glued the articles received from Amebis together
- Fixed some mistakes in Amebis encoding
- Date: 1996-08-08
Tomaž Erjavec, IJS
- Revised header CES version numbers and made doctype PUBLIC
- Word segmentation revealed some more typos,
e.g. '0Nekateri', '4O'; corrected these silently.
- Converted from ISO-2 to SGML entities
- Date: 1996-10-30
Tomaž Erjavec, IJS
- Found that all Č and č were switched; corrected
- Revised header CES version and packed for IM3
- Date: 1997-03-20
Tomaž Erjavec, IJS
- Normalisation of corpus component CESHEADER elements:
CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
- ISO LANGUAGEs implemented as marked section PUBLIC entity
- Language (WSDs) implemented as PUBLIC entities
- Date: 1997-09-25
Tomaž Erjavec
- Changed editionStmt, Extent, pubDate, Availability
to final form
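
Two steps in the revision history (the 1996-08-08 conversion to SGML entities and the 1996-10-30 correction of switched Č and č) can be illustrated with a short sketch. It rests on assumptions: 'ISO-2' is taken to mean ISO 8859-2, the entity names are those of the ISO 8879 Latin 2 set, and the function names `iso2_to_sgml` and `swap_c_caron` are hypothetical. The tools actually used are not recorded in this header.

```python
import re

# Slovene letters outside ASCII, mapped to ISO 8879 (ISOlat2) entity names.
# Any other character is passed through unchanged.
ENTITIES = {
    "Č": "&Ccaron;", "č": "&ccaron;",
    "Š": "&Scaron;", "š": "&scaron;",
    "Ž": "&Zcaron;", "ž": "&zcaron;",
}


def iso2_to_sgml(raw: bytes) -> str:
    """Decode ISO 8859-2 bytes and replace mapped letters with SGML entities."""
    text = raw.decode("iso8859_2")
    return "".join(ENTITIES.get(ch, ch) for ch in text)


def swap_c_caron(text: str) -> str:
    """Exchange &Ccaron; and &ccaron; in one pass, undoing a switched mapping."""
    return re.sub(
        r"&([Cc])caron;",
        lambda m: "&" + m.group(1).swapcase() + "caron;",
        text,
    )


# 0xC8 is Č and 0xE8 is č in ISO 8859-2.
print(iso2_to_sgml(b"\xc8as je \xe8ist"))
print(swap_c_caron("&ccaron;as je &Ccaron;ist"))
```

Doing the swap in a single regular-expression pass avoids the pitfall of two sequential replacements, where the second would undo the first.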