TEI Header

§file description
§title statement
§title
id = mten-cs.title
Multext-East cesDoc corpus: Newspapers, Czech
§statement of responsibility
§name Vladimír Petkevič
§responsibility Checked and modified markup for correctness down to the subparagraph level
§statement of responsibility
§name Tomaž Erjavec
§responsibility Conversion to XML/TEI P5
§edition statement
§edition MULTEXT-East, Version 4
§extent
§measure
type = words
90410
§publication statement
§address http://nl.ijs.si/ME/V4/
§distributor Institute of Theoretical and Computational Linguistics, Faculty of Philosophy, Charles University, Czech Republic ÚTKL FFUK
§address Celetná 13, Prague, Czech Republic
§address eAddress: Vladimir.Petkevic@ff.cuni.cz
§address eAddress: ftp: ucnk.ff.cuni.cz directory: pub/corpora/ME
§date
when = 2010-05-09
2010-05-09
§source description
§fully-structured bibliographic citation
§title statement
§title Multext-East CES1: Newspapers, Czech
§statement of responsibility
name Vladimír Petkevič
responsibility Checked and modified markup for correctness down to the subparagraph level
§edition statement

MTE Final Release

§publication statement
§distributor Institute of Theoretical and Computational Linguistics, Faculty of Philosophy, Charles University, Czech Republic ÚTKL FFUK
§address Celetná 13, Prague, Czech Republic
§address eAddress: Vladimir.Petkevic@ff.cuni.cz
§address eAddress: ftp: ucnk.ff.cuni.cz directory: pub/corpora/ME
§availability

Available for research purposes upon receipt of signed agreement

§date
when = 1997-10-01
October 1, 1997
§source description
§fully-structured bibliographic citation
title statement
title AA Lidové noviny - collection of articles, 1991-1994; Obtained in electronic form (WordPerfect format)
statement of responsibility
name publisher: Lidové noviny, Praha
responsibility typed in electronic form (WordPerfect format)
publication statement
distributor publisher: Lidové noviny, Praha; distributor of the paper version of the newspaper articles. The electronic texts were made available to the Institute of Theoretical and Computational Linguistics, Faculty of Philosophy, Charles University, Czech Republic ÚTKL FFUK for research purposes
address Prague, Czech Republic
availability

Electronic form available for non-profit purposes. It was made available to: Institute of Theoretical and Computational Linguistics, Faculty of Philosophy, Charles University, Czech Republic ÚTKL FFUK

date 1991-1994
source description
structured bibliographic citation
monographic level
title Lidové noviny - collection of 451 articles from the 1991-1994 period
author various newspapermen
imprint
date 1991-1994
publisher Lidové noviny
publication place Praha
§encoding description
§project description

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. EU Copernicus Project COP106

§editorial practice declaration
§normalization

Corpus Encoding Standard, Version 4.0 CES LEVEL: 1

§correction principles

§hyphenation

The texts contain no hyphens

§segmentation

One level of DIV is used: each DIV denotes a separate article. The text is marked up down to the paragraph level, with some subparagraph-level elements. Sentences are not marked up.
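
The segmentation described above (one DIV per article, paragraphs inside each DIV) can be walked with a standard XML parser. The following is a minimal sketch only: the SAMPLE document and the function name paragraphs_per_article are hypothetical, and the only detail taken from this header is the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Namespace as declared in the tagging declaration of this header.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature document standing in for the corpus text;
# it is not an excerpt from the real corpus.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0"><body>
<div><head>A</head><p>One.</p><p>Two.</p></div>
<div><head>B</head><p>Three.</p></div>
</body></text>"""


def paragraphs_per_article(xml_text):
    """Return the number of <p> children for each article-level <div>."""
    root = ET.fromstring(xml_text)
    # Only direct children of <body> are taken, because the header
    # states that a single level of DIV is used.
    articles = root.findall("tei:body/tei:div", TEI_NS)
    return [len(div.findall("tei:p", TEI_NS)) for div in articles]


print(paragraphs_per_article(SAMPLE))
```

Because sentences are not marked up, any finer segmentation than the paragraph would have to be computed separately.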

§tagging declaration
§namespace
name = http://www.tei-c.org/ns/1.0
§tag usage
gi = abbr occurs = 1239
abbreviation
§tag usage
gi = body occurs = 1
text body
§tag usage
gi = byline occurs = 76
byline
§tag usage
gi = date occurs = 547
date
§tag usage
gi = dateline occurs = 189
dateline
§tag usage
gi = div occurs = 450
text division
§tag usage
gi = docAuthor occurs = 74
docAuthor
§tag usage
gi = foreign occurs = 9
foreign
§tag usage
gi = head occurs = 537
heading
§tag usage
gi = hi occurs = 59
highlighted
§tag usage
gi = name occurs = 802
name
§tag usage
gi = num occurs = 1225
number
§tag usage
gi = opener occurs = 189
opener
§tag usage
gi = p occurs = 1360
paragraph
§tag usage
gi = q occurs = 587
separated from the surrounding text with quotation marks
§tag usage
gi = text occurs = 1
text
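
The occurs values listed above could in principle be rechecked by counting elements in the corpus text. The following is a minimal sketch under stated assumptions: the SAMPLE document and the function name tag_usage are hypothetical, and only the namespace comes from this header.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Namespace as declared in the tagging declaration, in the
# "{uri}local" form that ElementTree uses for element tags.
TEI_PREFIX = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical miniature document; not an excerpt from the real corpus.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0"><body>
<div><head>A</head><p>One.</p><p>Two.</p></div>
<div><head>B</head><p>Three.</p></div>
</body></text>"""


def tag_usage(xml_text):
    """Count TEI elements by local name, mirroring gi/occurs pairs."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        tag = element.tag
        # Skip anything outside the TEI namespace.
        if isinstance(tag, str) and tag.startswith(TEI_PREFIX):
            counts[tag[len(TEI_PREFIX):]] += 1
    return counts


for gi, occurs in sorted(tag_usage(SAMPLE).items()):
    print(f"gi = {gi} occurs = {occurs}")
```

Run over the actual corpus text, a count like this would be compared against the values recorded in the tag usage entries.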
§text-profile description
§creation
§date 1996-05-10
§text classification
§category reference
target = news
§revision description
§change 1996-05-10; Vladimír Petkevič, ÚTKL FFUK, Praha; Corrected some spelling errors
§change 1996-05-10; Vladimír Petkevič, ÚTKL FFUK, Praha; Marked-up to CES1 compliance:
§change 1996-05-10; Vladimír Petkevič, ÚTKL FFUK, Praha; created header
§change 1996-05-10; Vladimír Petkevič, ÚTKL FFUK, Praha; inserted DIV (one level)
§change 1996-05-10; Vladimír Petkevič, ÚTKL FFUK, Praha; inserted subparagraph tags, such as P, Q, NAME, ABBR, FOREIGN, BYLINE tags etc.
§change 1996-10-22; Vladimír Petkevič, ÚTKL FFUK, Praha; Corrected the header so as to meet the requirements imposed by creating the corpus containing all corpus components as one SGML document
§change 1997-03-17; Tomaž Erjavec, IJS; Normalisation: DIV/COMPLETE=Y deleted; is default
§change 1997-03-20; Tomaž Erjavec, IJS; Normalisation of corpus component CESHEADER elements: CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
§change 1997-03-20; Tomaž Erjavec, IJS; ISO LANGUAGEs implemented as marked section PUBLIC ent
§change 1997-03-20; Tomaž Erjavec, IJS; Language (WSDs) implemented as PUBLIC entities
§change 1997-09-25; Tomaž Erjavec; Changed editionStmt, Extent, pubDate, Availability to final form
§change 1997-09-26; Vladimír Petkevič; Corrected some remaining typos
§change 2004-05-10; Tomaž Erjavec; Converted to TEI P4, prepared for MTE V3
§change 2010-05-09; Tomaž Erjavec; Conversion to MULTEXT-East TEI P5.