TEI Header

§file description
§title statement
id = mtef-sl.title
Multext-East cesDoc corpus: Fiction, Slovene
§statement of responsibility
§name Tomaž Erjavec
§responsibility Error correction and CES conformance.
§statement of responsibility
§name Tomaž Erjavec
§responsibility Conversion to XML/TEI P5
§edition statement
§edition MULTEXT-East, Version 4
type = words
§publication statement
§address http://nl.ijs.si/ME/V4/
§distributor Dept. of Knowledge Technologies, Jožef Stefan Institute
§address Jamova 39, Ljubljana, Slovenia
§address eAddress: tomaz.erjavec@ijs.si
§address eAddress: http://nl.ijs.si/ME
when = 2010-05-09
§source description
§fully-structured bibliographic citation
§title statement
§title Multext-East CES1: Fiction, Slovene
§statement of responsibility
name Tomaž Erjavec, Dept. for Intelligent Systems, Jožef Stefan Institute
responsibility Error correction and CES conformance.
§edition statement

MTE Final Release

§publication statement
§distributor Dept. of Knowledge Technologies, Jožef Stefan Institute
§address Jamova 39, Ljubljana, Slovenia
§address eAddress: tomaz.erjavec@ijs.si
§address eAddress: http://nl.ijs.si/ME

Available for research purposes upon receipt of signed agreement

when = 1997-10-01
October 1, 1997
§source description
§fully-structured bibliographic citation
title statement
title Digital form of 'Galjot', obtained via OCR
statement of responsibility
name The Slovene Society for Blind and Visually Impaired
responsibility OCR'ed the novel
publication statement
distributor The Slovene Society for Blind and Visually Impaired
address Ljubljana

date Unknown
source description
structured bibliographic citation
monographic level
title Galjot
author Jančar, Drago
publisher Mladinska knjiga
date 1984
publication place Ljubljana, Slovenia
§encoding description
§project description

MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. EU Copernicus Project COP106

§editorial practice declaration

Corpus Encoding Standard, Version 4.0. CES LEVEL: 1

§correction principles

The OCR'ed text of the novel has been automatically spell-checked.

form = std

Rendition attribute values on Q and QUOTE tags are adapted from ISOpub and ISOnum standard entity set names. Spoken passages are marked by Q even where there are no typographical marks to denote them.


All text was semi-automatically dehyphenated; errors are possible where both parts of a hyphenated word are themselves valid words.


Two levels of DIV are used: the first denotes the chapters, the second divisions which are marked by spacing in the original text. DIV type=chapter is usually followed by a HEAD and OPENER. Marked up to the level of paragraph plus marking of particular sub-paragraph elements: Q, ABBR.
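As a rough illustration of the two-level DIV structure described above, the following Python sketch parses a small, made-up TEI P5 fragment and lists each chapter with its subdivisions. The fragment itself, the `type="section"` value on the inner divisions, the `n` values, and the function name `outline` are assumptions made for illustration; only `type=chapter`, HEAD, OPENER, the paragraph level, and Q come from the description.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical fragment; the real corpus text is not reproduced here.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="chapter" n="1">
    <head>1</head>
    <opener>Opening line</opener>
    <div type="section" n="1.1">
      <p>First paragraph with <q>a spoken passage</q>.</p>
      <p>Second paragraph.</p>
    </div>
    <div type="section" n="1.2">
      <p>Third paragraph.</p>
    </div>
  </div>
</body>
"""


def outline(xml_text):
    """Return one (chapter n, [(subdivision n, paragraph count), ...]) pair per chapter."""
    body = ET.fromstring(xml_text.strip())
    result = []
    # First level: chapters directly under the body.
    for chapter in body.findall("tei:div[@type='chapter']", NS):
        # Second level: divisions nested directly inside the chapter.
        subdivisions = [
            (sub.get("n"), len(sub.findall("tei:p", NS)))
            for sub in chapter.findall("tei:div", NS)
        ]
        result.append((chapter.get("n"), subdivisions))
    return result


print(outline(SAMPLE))
```

Matching on the TEI namespace rather than on bare element names keeps the sketch from picking up elements of other vocabularies, should a file mix them.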

§tagging declaration
name = http://www.tei-c.org/ns/1.0
§tag usage
gi = abbr occurs = 31
§tag usage
gi = body occurs = 1
text body
§tag usage
gi = cell occurs = 75
§tag usage
gi = div occurs = 208
text division
§tag usage
gi = foreign occurs = 5
§tag usage
gi = head occurs = 52
§tag usage
gi = hi occurs = 47
§tag usage
gi = item occurs = 11
§tag usage
gi = l occurs = 4
verse line
§tag usage
gi = list occurs = 1
§tag usage
gi = opener occurs = 25
§tag usage
gi = p occurs = 1850
§tag usage
gi = lg occurs = 2
line group
§tag usage
gi = q occurs = 903
separated from the surrounding text with quotation marks
§tag usage
gi = quote occurs = 4
§tag usage
gi = row occurs = 15
§tag usage
gi = table occurs = 3
§tag usage
gi = text occurs = 1
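Counts like those in the tag usage list above could be recomputed from the corpus file. The following is a minimal sketch, assuming the text is available as a TEI P5 XML string in the namespace given in the tagging declaration; the function name `tag_usage` and the sample fragment are hypothetical and do not reproduce the corpus.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment used only to exercise the function.
SAMPLE = """
<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <div type="chapter">
      <head>1</head>
      <p>A paragraph with <q>a spoken passage</q>.</p>
      <p>Another paragraph.</p>
    </div>
  </body>
</text>
"""


def tag_usage(xml_text):
    """Count occurrences of each TEI element, keyed by its local name (gi)."""
    prefix = "{" + TEI_NS + "}"
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    # iter() visits the root element and every descendant element.
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith(prefix):
            counts[element.tag[len(prefix):]] += 1
    return counts


for gi, occurs in sorted(tag_usage(SAMPLE).items()):
    print(f"gi = {gi} occurs = {occurs}")
```

Run against the full corpus file, the output could be compared line by line with the list above to check that the header and the text are still in step.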
§text-profile description
§date 1996-04-18
§text classification
§category reference
target = fict
§revision description
§change 1996-03-15<date>Amebis d.o.o.<name>Corrected spelling mistakes that could be caught with a spelling checker; converted OCR codes into quasi-SGML markup.
§change 1996-04-18<date>Tomaž Erjavec, IJS<name>Corrected some spelling errors and hyphenations.
§change 1996-04-18<date>Tomaž Erjavec, IJS<name>Marked up to CES1 compliance: created header; inserted div (two levels) with head and opener; inserted q, quotes; inserted table, poem, list; inserted some abbr, foreign.
§change 1996-05-02<date>Tomaž Erjavec, IJS<name> Corrected the header, so it better corresponds to CES recommendations
§change 1996-05-02<date>Tomaž Erjavec, IJS<name>Fixed n, id values in DIVs
§change 1996-05-02<date>Tomaž Erjavec, IJS<name>mdash entity is now used only for sentential punctuation
§change 1996-08-08<date>Tomaž Erjavec, IJS<name>Corrected the header - made preamble PUBLIC
§change 1996-11-02<date>Tomaž Erjavec, IJS<name>Corrected the header to IM3 specs
§change 1997-03-20<date>Tomaž Erjavec, IJS<name>Normalisation of corpus component CESHEADER elements: CESHEADER, EDITIONSTMT, TITLESTMT/H.TITLE
§change 1997-03-20<date>Tomaž Erjavec, IJS<name>ISO LANGUAGEs implemented as marked section PUBLIC ent
§change 1997-03-20<date>Tomaž Erjavec, IJS<name>Language (WSDs) implemented as PUBLIC entities
§change 1997-09-25<date>Tomaž Erjavec<name>Changed editionStmt, Extent, pubDate, Availability to final form
§change 2004-05-10<date>Tomaž Erjavec<name>Converted to TEI P4, prepared for MTE V3
§change 2010-05-09<date>Tomaž Erjavec<name>Conversion to MULTEXT-East TEI P5.