Summary: The course addresses digital encoding of language
resources, such as corpora, lexica, or complex digital editions. This
area is becoming increasingly important due to the growing size and
variety of "digital language" and its potential for interchange and
exploitation, and concerns both humanities computing and human
language technologies. The course concentrates on the Text Encoding
Initiative (TEI) Guidelines and their application, and also introduces XSLT.
Examples studied
include annotated multilingual corpora, dictionaries, and complex
digital editions. Lectures are accompanied by hands-on sessions. The
course should enable students to understand, produce, and use TEI
encoded texts of various types and for various purposes.
Lectures and lab sessions are on Fridays 9am-2pm (5 x 45 minutes + breaks).
Consultations are in the breaks between the lectures or by appointment.
Week | Date | Topics | Lecture materials | Lab session | Assignment |
1 | 3/11/06 | Background: XML, Namespaces, XPath | | Introduction to Oxygen XML editor: making a valid XML document, exercises with XPath | |
2 | 10/11/06 | Introduction to TEI: motivation, history, structure of documents | | Marking up a text in TEI Lite, basic XSLT transforms | Assignment 1 |
3 | 17/11/06 | TEI structure: architecture and parameterisation of P5, textual criticism | | An exercise with Roma (local) | Information on student projects |
4 | 24/11/06 | TEI and corpora: intro to computer corpora, TEI encoding of linguistic analyses, the TEI header | | XSLT exercises: making an index, pointers | Assignment 2 |
5 | 1/12/06 | Wrap-up: linking, ODD, TEI bits and bobs | Selected talks from TEI workshops at Brown University, August 2006 and Oxford, September 2006 | | Project presentations |
Assessment and Due Dates
The course score is computed on the basis of:
- Assignments (30%): two assignments, each to be handed in within one
week, two at most, of receiving it.
- Project (70%): consists of practical work plus a written report
formatted as a standard conference paper. The project work is
to be presented at the last lecture (1.12.2006), and the
report handed in by the end of the term (1.2.2007) at the latest.
Reading materials
The course materials are heavily based on the following sources, by
Syd Bauman, Lou Burnard, Matthew Driscoll, Julia Flanders, and Sebastian Rahtz:
Thanks to all the people who have put work into preparing the above materials,
special thanks for making them available on the web, and a
very special <thanks>thanks for allowing others to teach by them!</thanks>
Useful links: