Overview | ↑ |
What is a corpus? | ← ↑ → |
Using corpora | ← ↑ → |
Characteristics of a corpus | ← ↑ → |
Typology of corpora | ← ↑ → |
History | ← ↑ → |
Literature on corpora | ← ↑ → |
Slovene language corpora | ← ↑ → |
Compilation of corpora | ↑ |
Steps in the preparation of a corpus | ← ↑ → |
What annotation can be added to the text of the corpus? | ← ↑ → |
Markup Methods | ← ↑ → |
Computer coding of corpora | ← ↑ → |
Examples of TEI encoding in corpora: meta-data | ← ↑ → |
<teiHeader id="ecmr.H" type="text" lang="sl-en" creator=ET status="update" date.created="1999-04-13" date.updated="1999-06-22" > <fileDesc> <titleStmt> <title lang="sl">Ekonomsko ogledalo; 13 številk 98/99</title> <title lang="en">Slovenian Economic Mirror; 13 issues, 98/99</title> <respstmt> <name>Andrej Skubic, FF</name> <resp lang="sl">Zagotovitev digitalnega originala, poravnava</resp> <resp lang="en">Provision of digital original, alignment</resp> <name>Tomaž Erjavec, IJS</name> <resp lang="sl">Tokenizacija, pretvorba v TEI</resp> <resp lang="en">Tokenisation, conversion to TEI</resp> </respStmt> </titleStmt> ...
Examples of TEI encoding in corpora: Structure of the text | ← ↑ → |
<quote id="Osl.1.8.18" rend="center;it"> <lg id="Osl.1.8.18.1"> <l id="Osl.1.8.18.1.1">Tam pod kostanjevim drevesom</l> <l id="Osl.1.8.18.1.2">izdala si me,</l> <l id="Osl.1.8.18.1.3">izdal sem te,</l> <l id="Osl.1.8.18.1.4">ne da bi trenila z očesom.</l> </lg> </quote> <p id="Osl.1.8.19"> <s id="Osl.1.8.19.1">Trije možje se niso niti ganili.</s> <s id="Osl.1.8.19.2">Toda ko je <name>Winston</name> znova pogledal v Rutherfordov propadli obraz, je opazil, da so njegove oči polne solz.</s> ...
Examples of TEI encoding in corpora: Morphosyntactic descriptions | ← ↑ → |
<s id="Osl.1.2.2.1"> <w lemma="biti" ana="Vcps-sma">Bil</w> <w lemma="biti" ana="Vcip3s--n">je</w> <w lemma="jasen" ana="Afpmsnn">jasen</w><c>,</c> <w lemma="mrzel" ana="Afpmsnn">mrzel</w> <w lemma="aprilski" ana="Aopmsn">aprilski</w> <w lemma="dan" ana="Ncmsn">dan</w> <w lemma="in" ana="Ccs">in</w> <w lemma="ura" ana="Ncfpn">ure</w> <w lemma="biti" ana="Vcip3p--n">so</w> <w lemma="biti" ana="Vmps-pfa">bile</w> <w lemma="trinajst" ana="Mcnpnl">trinajst</w><c>.</c> </s> <fs id="Vcps-sma" select="sl" feats="V0. V1.c V2.p V3.s V5.s V6.m V7.a"/> <fs id="Vcps-sman----n" select="cs" feats="V0. V1.c V2.p V3.s V5.s V6.m V7.a V8.n V13.n"/> <fs id="Vcps-smay----n" select="cs" feats="V0. V1.c V2.p V3.s V5.s V6.m V7.a V8.y V13.n"/> <fs id="Vcps-sna" select="sl" feats="V0. V1.c V2.p V3.s V5.s V6.n V7.a"/> <fs id="Vcps-snan----n" select="cs" feats="V0. V1.c V2.p V3.s V5.s V6.n V7.a V8.n V13.n"/> <fLib type="Verb"> <f id="V0." select="en ro sl cs bg et hu hr sr sl-rozaj" name="PoS"><sym value="Verb"/></f> <f id="V1.m" select="en ro sl cs bg et hu hr sr sl-rozaj" name="Type"><sym value="main"/></f> <f id="V1.a" select="en ro sl cs bg et hu hr sr sl-rozaj" name="Type"><sym value="auxiliary"/></f> <f id="V1.o" select="en ro sl cs et hr sr sl-rozaj" name="Type"><sym value="modal"/></f> <f id="V1.c" select="ro sl cs hr sr sl-rozaj" name="Type"><sym value="copula"/></f> <f id="V1.b" select="en" name="Type"><sym value="base"/></f>
Examples of TEI encoding in corpora: Alignment | ← ↑ → |
<linkGrp id="Oslen.1" type="body" targtype="s" domains="Oen Osl"> <link xtargets="Osl.1.2.2.1 ; Oen.1.1.1.1"> <link xtargets="Osl.1.2.2.2 ; Oen.1.1.1.2"> <link xtargets="Osl.1.2.3.1 ; Oen.1.1.2.1"> <link xtargets="Osl.1.2.3.2 ; Oen.1.1.2.2"> ... <link xtargets="Osl.1.2.6.5 ; Oen.1.1.5.5"> <link xtargets="Osl.1.2.6.6 ; Oen.1.1.5.6 Oen.1.1.5.7"> <link xtargets="Osl.1.2.6.7 ; Oen.1.1.5.8"> ...
Examples of use | ↑ |
Lexicology | ← ↑ → |
Automatic translation | ← ↑ → |
Slovene sentence: evropi vlada veliki brat ELAN model: europe government big brother Bible model: evropi brother chief upright . Czech translation: evropi vláda velké bratr .
The future of corpus and data-driven linguistics | ↑ |
The future of corpus and data-driven linguistics | ← ↑ → |