Summary: The course addresses the digital encoding of language
resources, such as corpora, lexica, or complex digital editions. This
area is becoming increasingly important due to the growing size and
variety of "digital language" and its potential for interchange and
exploitation, and concerns both humanities computing and human
language technologies. The course concentrates on the Text Encoding
Initiative (TEI) Guidelines and their application, and also introduces XSLT.
Examples studied include annotated multilingual corpora, dictionaries,
and complex digital editions. Lectures are accompanied by hands-on sessions. The
course should enable students to understand, produce, and use TEI
encoded texts of various types and for various purposes.
Lectures and lab sessions are on Fridays 9.30 - 12.30 (3 x 45 minutes + breaks).
Consultations are in the breaks between the lectures or by appointment.
Week | Date | Topics | Lecture materials | Lab session | Assignment
1 | 9/11/07 | Discussion | | | Exercises
2 | 16/11/07 | Background: XML, Namespaces, XPath | | Introduction to Oxygen XML editor: making a valid XML document (recipes), exercises with XPath | 
3 | 23/11/07 | Introduction to TEI: motivation, history, structure of documents | | Marking up a text in TEI Lite, basic XSLT transforms | 
4 | 30/11/07 | TEI: namespaces, Roma | | Transforming documents into TEI; student presentations: areas of interest | 
5 | 7/12/07 | The TEI class system | | Student presentations: presentation of materials | 
6 | 14/12/07 | Modifying the TEI | | Student presentations: TEI encodings | 
7 | 11/1/08 | TEI stylesheets | | Student presentations | 
Assessment and Due Dates
The course score is computed on the basis of:
- Assignments (30%): two assignments, each to be handed in within one
week, or at most two weeks, of receiving it.
- Project (70%): practical work plus a written report, formatted as a
standard conference paper. The project work is
to be presented at the last lecture (1.12.2006) and the
report handed in by the end of the term (1.2.2007), at the latest.
Reading materials
The course materials are heavily based on the following sources, by
Syd Bauman, Lou Burnard, Matthew Driscoll, Julia Flanders, and Sebastian Rahtz:
Thanks to all the people who put work into preparing the above materials,
special thanks for making them available on the web, and a
very special <thanks>thanks for allowing others to teach by them!</thanks>
Useful links: