Institut für Informationsverarbeitung
Geisteswissenschaftliche Fakultät
Karl-Franzens-Universität Graz
Academic year 2007/2008

Standards for digital encoding

Tomaž Erjavec

Page http://nl.ijs.si/et/teach/graz07/standards last updated 2008-01-10
Summary: The course addresses digital encoding of language resources, such as corpora, lexica, or complex digital editions. This area is becoming increasingly important due to the growing size and variety of "digital language" and its potential for interchange and exploitation, and concerns both humanities computing and human language technologies. The course concentrates on the Text Encoding Initiative Guidelines and their application, and also introduces XSLT. Examples studied include annotated multilingual corpora, dictionaries, and complex digital editions. Lectures are accompanied by hands-on sessions. The course should enable students to understand, produce, and use TEI encoded texts of various types and for various purposes.
Related course: Introduction to Human Language Technologies

Timetable "Standards for digital encoding" 2007/2008

Lectures and lab sessions are on Fridays 9.30 - 12.30 (3 x 45 minutes + breaks). Consultations are in the breaks between the lectures or by appointment.

Week Date Topics Lecture materials Lab session Assignment
1 9/11/07 Discussion Exercises
2 16/11/07 Background: XML, Namespaces, XPath Introduction to Oxygen XML editor: making a valid XML document (recipes), exercises with XPath.
3 23/11/07 Introduction to TEI: motivation, history, structure of documents Marking up a text in TEI Lite, basic XSLT transforms
4 30/11/07 TEI: namespaces, Roma Transforming documents into TEI Student presentations: areas of interest
5 7/12/07 The TEI class system Student presentations: presentation of materials
6 14/12/07 Modifying the TEI Student presentations: TEI encodings
7 11/1/08 TEI stylesheets Student presentations

Assessment and Due Dates

The course score is computed on the basis of:

Reading materials

The course materials are heavily based on the following sources, by Syd Bauman, Lou Burnard, Matthew Driscoll, Julia Flanders, and Sebastian Rahtz:

Thanks to all the people who have put work into preparing the above materials, special thanks for making them available on the web, and a very special <thanks>thanks for allowing others to teach from them!</thanks>

Useful links:

