Abstract
Overview
TEI History: Establishment and Motivations
TEI History: Basics and First Drafts
TEI Guidelines: P3
What is XML?
TEI Guidelines: P4 (& P5)
The TEI Consortium
Projects Using the TEI
TEI in Japan
Why not more?
The TEI Guidelines
Structure of the TEI DTD
The Core Tagset
Base Tagsets
Additional Tagsets
Examples of TEI Use
<div1 type="story">
<head rend="large underlined" type="sub">
President pledges safeguards for 2,400 British troops
in Bosnia
</head>
<head rend="very large bold" type="main">
Major agrees to enforced no-fly zone
</head>
<byline>
By George Jones, Political Editor, in Washington
</byline>
<p>
Greater Western intervention in the conflict in
former Yugoslavia was pledged by President Bush ...
</p>
</div1>
TEI.analysis Example
<seg id="orwl.en.24" corresp="orwl.sl.24">
<s id="Oen.1.1.4.5">
<q>
<w ana="Af" lemma="big">Big</w>
<w ana="Ncms" lemma="brother">Brother</w>
<w ana="Vaip3s" lemma="be">is</w>
<w ana="Vmpp" lemma="watch">watching</w>
<w ana="Pp2" lemma="you">you</w>
</q>
<w ana="Dd" lemma="the">the</w>
<w ana="Ncns" lemma="caption">caption</w>
<w ana="Vmis" lemma="say">said</w>
<w ana="Cs" lemma="while">while</w>
<w ana="Dd" lemma="the">the</w>
<w ana="Af" lemma="dark">dark</w>
<w ana="Ncnp" lemma="eye">eyes</w>
<w ana="Vmis" lemma="look">looked</w>
<w ana="Rmp" lemma="deep">deep</w>
<w ana="Sp" lemma="into">into</w>
<w ana="Np" lemma="winston">Winston</w>
<w type="rsplit" ana="St" lemma="'s">'s</w>
<w ana="Ps3" lemma="own">own</w>
<c ctag=".">.</c>
</s>
</seg>
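Markup of this kind is straightforward to read back with a generic XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the `read_tokens` helper and the shortened `SAMPLE` string are illustrative only and are not part of the original example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sentence on this slide (the quoted clause only).
SAMPLE = """
<s id="Oen.1.1.4.5">
  <q>
    <w ana="Af" lemma="big">Big</w>
    <w ana="Ncms" lemma="brother">Brother</w>
    <w ana="Vaip3s" lemma="be">is</w>
    <w ana="Vmpp" lemma="watch">watching</w>
    <w ana="Pp2" lemma="you">you</w>
  </q>
  <c ctag=".">.</c>
</s>
"""


def read_tokens(xml_text):
    """Return (form, lemma, ana) for every <w> element, in document order."""
    root = ET.fromstring(xml_text)
    return [(w.text, w.get("lemma"), w.get("ana")) for w in root.iter("w")]


tokens = read_tokens(SAMPLE)
for form, lemma, ana in tokens:
    print(f"{form}\t{lemma}\t{ana}")
```

Punctuation is carried by `<c>` rather than `<w>`, so it is skipped here; iterating over both element names would include it.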
TEI.fs Example
<fsLib>
<fs type="Noun" id="Ncfda" feats="N1.c N2.f N3.d N4.a"/>
<fs type="Noun" id="Ncfdd" feats="N1.c N2.f N3.d N4.d"/>
<fs type="Noun" id="Ncfdg" feats="N1.c N2.f N3.d N4.g"/>
...
</fsLib>
<fLib>
<f id="N1.c" select="en ro sl cs bg et hu hr" name="Type">
<sym value="common"/>
</f>
<f id="N1.p" select="en ro sl cs bg et hu hr" name="Type">
<sym value="proper"/>
</f>
<f id="N2.m" select="en ro sl cs bg hr" name="Gender">
<sym value="masculine"/>
</f>
<f id="N2.f" select="en ro sl cs bg hr" name="Gender">
<sym value="feminine"/>
</f>
<f id="N2.n" select="en ro sl cs bg hr" name="Gender">
<sym value="neuter"/>
</f>
...
</fLib>
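The `feats` attribute of each `<fs>` is a list of pointers into the feature library, so a tag such as `Ncfda` can be expanded into named features by looking up each pointer. The sketch below assumes Python's standard `xml.etree.ElementTree`; the `<lib>` wrapper element and the `expand` function are hypothetical and exist only to make the fragment parseable on its own. Features elided on the slide (`N3.d`, `N4.a`) are reported as unresolved rather than guessed.

```python
import xml.etree.ElementTree as ET

# <lib> is a hypothetical wrapper so the two libraries share one root.
SAMPLE = """
<lib>
  <fsLib>
    <fs type="Noun" id="Ncfda" feats="N1.c N2.f N3.d N4.a"/>
  </fsLib>
  <fLib>
    <f id="N1.c" select="en ro sl cs bg et hu hr" name="Type">
      <sym value="common"/>
    </f>
    <f id="N2.f" select="en ro sl cs bg hr" name="Gender">
      <sym value="feminine"/>
    </f>
  </fLib>
</lib>
"""


def expand(xml_text):
    """Map each <fs> id to (type, resolved features, unresolved pointers)."""
    root = ET.fromstring(xml_text)

    # Index the feature library: id -> (feature name, symbolic value).
    features = {}
    for f in root.iter("f"):
        sym = f.find("sym")
        value = sym.get("value") if sym is not None else None
        features[f.get("id")] = (f.get("name"), value)

    expanded = {}
    for fs in root.iter("fs"):
        resolved, unresolved = {}, []
        for ref in fs.get("feats", "").split():
            if ref in features:
                name, value = features[ref]
                resolved[name] = value
            else:
                unresolved.append(ref)  # pointer with no <f> in this sample
        expanded[fs.get("id")] = (fs.get("type"), resolved, unresolved)
    return expanded


result = expand(SAMPLE)
print(result["Ncfda"])
```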
TEI Lite
The Advantages of Using TEI
The Disadvantages of Using TEI
The GENIA Project
The GENIA Corpus
GPML and TEI
Implementing the Conversion
The TEI.GENIA DTD
<!DOCTYPE teiCorpus.2
PUBLIC "-//TEI P4//DTD Main Document Type//EN"
"http://www.tei-c.org/P4X/DTD/tei2.dtd" [
<!ENTITY % TEI.XML "INCLUDE" >
<!ENTITY % TEI.general "INCLUDE">
<!ENTITY % TEI.prose "INCLUDE">
<!ENTITY % TEI.dictionaries "INCLUDE">
<!ENTITY % TEI.terminology "INCLUDE">
<!ENTITY % TEI.linking "INCLUDE">
<!ENTITY % TEI.analysis "INCLUDE">
<!ENTITY % TEI.fs "INCLUDE">
<!ENTITY % TEI.corpus "INCLUDE">
<!ENTITY % TEI.extensions.ent SYSTEM 'geniaex.ent'>
<!ENTITY % TEI.extensions.dtd SYSTEM 'geniaex.dtd'>
]>
Overall Corpus Structure
<!DOCTYPE teiCorpus.2 SYSTEM "genia-tei.dtd">
<teiCorpus.2>
<teiHeader type="corpus">*Corpus_header*</teiHeader>
<TEI.2 id="*MEDLINE_ID*">
<teiHeader type="text">*Article_header*</teiHeader>
<text>
<body>
<div type="abstract">
<head>*Title_of_article*</head>
<p>*Abstract_of_article*</p>
</div>
<div type="ontology">*Local_ontology*</div>
<div type="lexicon">*Local_lexicon*</div>
</body>
</text>
</TEI.2>
*More_articles*
</teiCorpus.2>
The TEI Header
<encodingDesc>
<projectDesc>
<p>The GENIA project seeks to automatically extract ...</p>
</projectDesc>
<samplingDecl>
<p>The corpus consists of abstracts found by ...</p>
</samplingDecl>
<tagsDecl>
<tagUsage gi="body" occurs="670"></tagUsage>
<tagUsage gi="cl" occurs="491"></tagUsage>
<tagUsage gi="div" occurs="2010"></tagUsage>
<tagUsage gi="entry" occurs="19472"></tagUsage>
<tagUsage gi="form" occurs="19472"></tagUsage>
<tagUsage gi="head" occurs="670"></tagUsage>
<tagUsage gi="p" occurs="670"></tagUsage>
<tagUsage gi="ptr" occurs="27305"></tagUsage>
<tagUsage gi="s" occurs="5109"></tagUsage>
<tagUsage gi="term" occurs="48906"></tagUsage>
<tagUsage gi="termEntry" occurs="22707"></tagUsage>
<tagUsage gi="tig" occurs="22707"></tagUsage>
<tagUsage gi="xptr" occurs="14874"></tagUsage>
<tagUsage gi="xr" occurs="19472"></tagUsage>
</tagsDecl>
</encodingDesc>
<profileDesc>
<langUsage>
<language id="en">English</language>
<language id="la">Latin</language>
</langUsage>
</profileDesc>
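Because `<tagsDecl>` records how often each element occurs, simple corpus statistics can be read straight from the header without touching the texts. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the `tag_counts` helper is illustrative, the sample repeats only a few of the `<tagUsage>` lines above, and treating the `<body>` count as the number of abstracts is an assumption based on the corpus structure (one `<body>` per article).

```python
import xml.etree.ElementTree as ET

# A few of the <tagUsage> entries from the header above.
SAMPLE = """
<tagsDecl>
  <tagUsage gi="body" occurs="670"></tagUsage>
  <tagUsage gi="head" occurs="670"></tagUsage>
  <tagUsage gi="p" occurs="670"></tagUsage>
  <tagUsage gi="s" occurs="5109"></tagUsage>
  <tagUsage gi="term" occurs="48906"></tagUsage>
</tagsDecl>
"""


def tag_counts(xml_text):
    """Return a dict mapping element name (gi) to its occurrence count."""
    root = ET.fromstring(xml_text)
    return {t.get("gi"): int(t.get("occurs")) for t in root.iter("tagUsage")}


counts = tag_counts(SAMPLE)

# Assumption: one <body> per article, so this is the number of abstracts.
abstracts = counts["body"]
print("sentences per abstract:", counts["s"] / abstracts)
print("terms per abstract:", counts["term"] / abstracts)
```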
Text Annotation
<div type="abstract">
<head>Retinoic acid downmodulates erythroid differentiation and
<term ana="SEM-94.000">GATA1 expression</term> in
<term ana="SEM-94.001">purified adult-progenitor culture</term>.
</head>
<p>
<s>
In
<cl ana="SEM-94.011 SEM-94.012"
function="(OR SEM-94.011 SEM-94.012)">
<term ana="SEM-94.013">clonogenetic fetal calf serum</term>
<term ana="SEM-94.014">-supplemented (FCS+)</term>
or
<term ana="SEM-94.015">-nonsupplemented (FCS-)</term>
<term ana="SEM-94.016">culture</term>
</cl>
treated with saturating levels of
<term ana="SEM-94.018">interleukin-3</term>
(<term ana="SEM-94.019">IL-3</term>)
<term ana="SEM-94.020">granulocyte- macrophage
colony-stimulating factor</term> ...
</s>
...
Further Annotation
LTG XML Tools
Processing OHSUMED
Using LTG XML Tools on GENIA
<SENTENCE>
<W C='W' P='DT' C2='DD'>Some</W>
<W C='W' P='VBN' C2='VVN' LM='convert'>converted</W>
<W C='W' P='IN' C2='II'>from</W>
<W C='W' P='JJ' C2='JJ'>ventricular</W>
<W C='W' P='NN' C2='NN1' LM='fibrillation' VSTEM='fibrillate'>fibrillation</W>
<W C='W' P='TO' C2='II'>to</W>
<W C='W' P='JJ' C2='JJ' VSTEM='organize'>organized</W>
<W C='W' P='NNS' C2='NN2' LM='rhythm'>rhythms</W>
<W C='W' P='IN' C2='II'>by</W>
<W C='HYW' P='JJ'>defibrillation-trained</W>
<W C='W' P='NN' C2='NN1' LM='ambulance'>ambulance</W>
<W C='W' P='NNS' C2='NN2' LM='technician'>technicians</W>
<PHR C='BR'>
<W C='BR' P='(' C2='('>(</W>
<W C='ABBR' P='NNS' C2='NP1'>EMT-Ds</W>
<W C='BR' P=')' C2=')'>)</W>
</PHR>
<W C='W' P='MD' C2='VM' LM='will'>will</W>
<W C='W' P='VB' C2='VV0' LM='refibrillate'>refibrillate</W>
<W C='W' P='IN' C2='II'>before</W>
<W C='W' P='NN' C2='NN1' LM='hospital'>hospital</W>
<W C='W' P='NN' C2='NN1' LM='arrival' VSTEM='arrive'>arrival</W>
<W C='.' P='.' C2='.'>.</W>
</SENTENCE>
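Output in this shape can be flattened into word/tag pairs for further processing. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; reading `P` as the part-of-speech tag and `LM` as the lemma is an interpretation of the attribute names in the output above, and `tagged_words` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the tagger output above, including the nested <PHR>.
SAMPLE = """
<SENTENCE>
<W C='W' P='DT' C2='DD'>Some</W>
<W C='W' P='VBN' C2='VVN' LM='convert'>converted</W>
<PHR C='BR'>
<W C='BR' P='(' C2='('>(</W>
<W C='ABBR' P='NNS' C2='NP1'>EMT-Ds</W>
<W C='BR' P=')' C2=')'>)</W>
</PHR>
<W C='.' P='.' C2='.'>.</W>
</SENTENCE>
"""


def tagged_words(xml_text):
    """Return (form, P, lemma) for every <W>, including those inside <PHR>."""
    root = ET.fromstring(xml_text)
    # Assumption: LM holds the lemma; fall back to the surface form if absent.
    return [(w.text, w.get("P"), w.get("LM", w.text)) for w in root.iter("W")]


words = tagged_words(SAMPLE)
for form, pos, lemma in words:
    print(f"{form}/{pos}\t{lemma}")
```

`root.iter("W")` walks the whole tree in document order, so words inside `<PHR>` keep their position in the sentence.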