Institut für Informationsverarbeitung
Geisteswissenschaftliche Fakultät
Karl-Franzens-Universität Graz
Academic year 2006/2007

Standards for digital encoding

Tomaž Erjavec

Page http://nl.ijs.si/et/teach/graz06/standards last updated 2006-12-02
Summary: The course addresses the digital encoding of language resources, such as corpora, lexica, or complex digital editions. This area is becoming increasingly important due to the growing size and variety of "digital language" and its potential for interchange and exploitation, and it concerns both humanities computing and human language technologies. The course concentrates on the Text Encoding Initiative Guidelines and their application, and also introduces XSLT. Examples studied include annotated multilingual corpora, dictionaries, and complex digital editions. Lectures are accompanied by hands-on sessions. The course should enable students to understand, produce, and use TEI-encoded texts of various types and for various purposes.
Related course: Annotating language data

Timetable "Standards for digital encoding" 2006/2007

Lectures and lab sessions are on Fridays 9am-2pm (5 x 45 minutes + breaks). Consultations are in the breaks between the lectures or by appointment.

Week | Date | Topics | Lecture materials | Lab session | Assignment
1 | 3/11/06 | Background: XML, Namespaces, XPath | | Introduction to Oxygen XML editor: making a valid XML document, exercises with XPath |
2 | 10/11/06 | Introduction to TEI: motivation, history, structure of documents | | Marking up a text in TEI Lite, basic XSLT transforms | Assignment 1
3 | 17/11/06 | TEI structure: architecture and parameterisation of P5, textual criticism | | An exercise with Roma (local) | Information on student projects
4 | 24/11/06 | TEI and corpora: intro to computer corpora, TEI encoding of linguistic analyses, the TEI header | | XSLT exercises: making an index, pointers | Assignment 2
5 | 1/12/06 | Wrap-up: linking, ODD, TEI bits and bobs | Selected talks from TEI workshops at Brown University, August 2006 and Oxford, September 2006 | Project presentations |

Assessment and Due Dates

The course score is computed on the basis of:

Reading materials

The course materials are heavily based on the following sources, by Syd Bauman, Lou Burnard, Matthew Driscoll, Julia Flanders, and Sebastian Rahtz:

Thanks to all the people who have put work into preparing the above materials, special thanks for making them available on the web, and a very special <thanks>thanks for allowing others to teach by them!</thanks>

Useful links:

