Institut für Informationsverarbeitung
Geisteswissenschaftliche Fakultät
Karl-Franzens-Universität Graz
Academic year 2007/2008

Introduction to Human Language Technologies

Tomaž Erjavec

Page last updated 2008-01-24
Summary: The course introduces the field of human language technologies, also known as natural language processing. It gives an overview of application areas, such as machine translation, speech synthesis, and corpus linguistics, and of the methods and tools used for the automatic analysis of language data, such as regular expressions, part-of-speech tagging, syntactic parsing, and alignment of parallel corpora. The course combines lectures with practical work in NLTK.
Related course: Standards for digital encoding

Timetable "Introduction to Human Language Technologies" 2007/2008

Lectures and lab sessions are on Fridays 2pm-5pm (3 x 45 minutes + breaks). Consultations are in the breaks between the lectures or by appointment. The detailed syllabus, slides, readings, and assignments will be posted here as the course progresses.

Week | Date | Topics | Lecture | Lab session | Assignment
1 | 9/11/07 | Introduction: what are HLT | Slides .ppt, handout .pdf | NLTK overview | -
2 | 16/11/07 | Computer corpora | Slides .ppt, handout .pdf. See also: S. Schulte im Walde, H. Zinsmeister, ESSLLI 2006 course, part 1: Introduction | NLTK: First steps with Python | -
3 | 23/11/07 | NLTK | - | NLTK: Python: control structures and dictionaries | -
4 | 30/11/07 | NLTK | - | NLTK: Python: regular expressions | Exercises
5 | 7/12/07 | NLTK | - | NLTK: Words I. | Homework: Exercises 7 and 10 from NLTK book, 3.2.4 (p. 76)
6 | 14/12/07 | NLTK | - | NLTK: Words II. | -
7 | 11/1/08 | Character sets | Slides .ppt, handout .pdf | NLTK: Words II. | -

Assessment and Due Dates

Written exam, 1.5 hours.
Exam questions are here