In this section the MULTEXT set of lexicon
specifications is applied to
Italian (Calzolari and Monachini 1994).
The language-specific values added for Italian are highlighted with
the code `l-spec'.
Furthermore, a preliminary tagset for Italian is proposed. It is based on the tagset used by our tagger, but also takes into account the criteria stated above for the construction of the tagset and the results of a first cycle of experiments with the MULTEXT tagger.
Each tag is presented in a table together with its translation into a regular expression and its definition, e.g.
TAG Reg.expr. Definition
NMS Ncms- Common noun, masc.sing.
A table displaying the mapping between lexicon specifications and corpus
tags is also provided, along with examples.
5.1.1 Nouns (N)
---------------
5.1.1.1 Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type common libro c
proper Gianni p
------------ ----------- ----------- ----
Gender masculine uomo m
feminine donna f
l-spec common insegnante c
------------ ----------- ----------- ----
Number singular uomo s
plural donne p
l-spec invariant attivita' n
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
============ =========== =========== ====
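The positional codes in the table above can be read mechanically. The
following is a minimal sketch, assuming that a noun description is the
category letter N followed by one character per attribute in the order
Type, Gender, Number, Case (the order implied by strings such as Ncms-).
The names NOUN_ATTRIBUTES and decode_noun are hypothetical and not part
of the specification.

```python
# Attribute order and code letters are taken from the noun lexicon table;
# the function and variable names are illustrative only.
NOUN_ATTRIBUTES = [
    ("Type", {"c": "common", "p": "proper"}),
    ("Gender", {"m": "masculine", "f": "feminine", "c": "common"}),
    ("Number", {"s": "singular", "p": "plural", "n": "invariant"}),
    ("Case", {}),  # not applicable for Italian, always "-"
]


def decode_noun(description):
    """Expand a description such as 'Ncms-' into attribute/value pairs."""
    if len(description) != 1 + len(NOUN_ATTRIBUTES) or description[0] != "N":
        raise ValueError(f"not a noun description: {description!r}")
    decoded = {}
    for (attribute, values), code in zip(NOUN_ATTRIBUTES, description[1:]):
        if code == "-":
            continue  # attribute not specified or not applicable
        if code not in values:
            raise ValueError(f"unknown {attribute} code {code!r}")
        decoded[attribute] = values[code]
    return decoded


print(decode_noun("Ncms-"))  # e.g. libro
print(decode_noun("Nccn-"))  # e.g. sosia
```

Unspecified positions ("-") are simply skipped, so the result lists only
the attributes that carry a value.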
5.1.1.2 Corpus
======= ================== ====================================
Tag Regular expression Definition
======= ================== ====================================
NMS Ncms- Common noun, masc. sing.
NMP Ncmp- Common noun, masc. plur.
NMN Ncmn- Common noun, masc. invar.
NFS Ncfs- Common noun, fem. sing.
NFP Ncfp- Common noun, fem. plur.
NFN Ncfn- Common noun, fem. invar.
NNS Nccs- Common noun, comm. sing.
NNP Nccp- Common noun, comm. plur.
NNN Nccn- Common noun, comm. invar.
NP Np..- Proper noun
======= ================== ====================================
5.1.1.3 Combinations
========= ======= =============================================
Lexicon Corpus Example
========= ======= =============================================
Ncms- NMS libro
Ncmp- NMP libri
Ncmn- NMN re, caffe' (il/i)
Ncfs- NFS casa
Ncfp- NFP case
Ncfn- NFN attivita' (la/le)
Nccs- NNS insegnante (un/una)
Nccp- NNP insegnanti (gli/le)
Nccn- NNN sosia (il/la, i/le)
Np..- NP Mario, Maria, Borboni
========= ======= =============================================
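The mapping from lexicon descriptions to corpus tags can be sketched by
trying each regular expression of the corpus table in turn. This is only
an illustration, assuming that an expression must match the whole
description; NOUN_TAGS and corpus_tag are hypothetical names, and the
description Npms- for a proper noun is an assumed example.

```python
import re

# (corpus tag, regular expression) pairs copied from the noun corpus table.
NOUN_TAGS = [
    ("NMS", "Ncms-"), ("NMP", "Ncmp-"), ("NMN", "Ncmn-"),
    ("NFS", "Ncfs-"), ("NFP", "Ncfp-"), ("NFN", "Ncfn-"),
    ("NNS", "Nccs-"), ("NNP", "Nccp-"), ("NNN", "Nccn-"),
    ("NP", "Np..-"),
]


def corpus_tag(description):
    """Return the first corpus tag whose expression matches the whole description."""
    for tag, pattern in NOUN_TAGS:
        if re.fullmatch(pattern, description):
            return tag
    return None  # no noun tag applies


print(corpus_tag("Ncfn-"))  # attivita'
print(corpus_tag("Npms-"))  # a proper noun, assuming it is coded Npms-
```

Because the proper-noun expression uses wildcards for Gender and Number,
every proper noun receives the single tag NP regardless of those values.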
5.1.1.4 Some observations for the corpus tagset
The idea of the French group to tag Proper Nouns simply with NP
(collapsing the information on gender and number) seems the best
solution.
5.1.2 Verb (V)
--------------
5.1.2.1 Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type main amare m
auxiliary avere a
------------ ----------- ----------- ----
Mood/VForm indicative amo i
subjunctive ami s
imperative ama m
conditional amerei c
infinitive amare n
participle amato p
gerund amando g
------------ ----------- ----------- ----
Tense present amo p
imperfect amavo i
future amero' f
past amai s
------------ ----------- ----------- ----
Person first amo 1
second ami 2
third ama 3
------------ ----------- ----------- ----
Number singular amo s
plural amiamo p
------------ ----------- ----------- ----
Gender masculine amato m
feminine amata f
l-spec common amante c
============ =========== =========== ====
5.1.2.2 Corpus
========= ======================= ======================================
Tag Regular Expression Definition
========= ======================= ======================================
VAS1IP Vaip1s- Aux. Verb, 1st pers.sing., pres.indic.
VAS2IP Vaip2s- Aux. Verb, 2nd pers.sing., pres.indic.
VAS3IP Vaip3s- Aux. Verb, 3rd pers.sing., pres.indic.
VAP2IP Vaip2p- Aux. Verb, 2nd pers.plur., pres.indic.
VAP3IP Vaip3p- Aux. Verb, 3rd pers.plur., pres.indic.
VAP1ICP Va[is]p1p- Aux. Verb, 1st pers.plur., pres.indic./subj.
VAY^2IP Vaip(1s|3p)- Aux. Verb, 1st sing./3rd plur., pres.indic
VAS1II Vaii1s- Aux. Verb, 1st pers.sing., impf.indic.
VAS2II Vaii2s- Aux. Verb, 2nd pers.sing., impf.indic.
VAS3II Vaii3s- Aux. Verb, 3rd pers.sing., impf.indic.
VAP1II Vaii1p- Aux. Verb, 1st pers.plur., impf.indic.
VAP2II Vaii2p- Aux. Verb, 2nd pers.plur., impf.indic.
VAP3II Vaii3p- Aux. Verb, 3rd pers.plur., impf.indic.
VAS1IF Vaif1s- Aux. Verb, 1st pers.sing., fut. indic.
VAS2IF Vaif2s- Aux. Verb, 2nd pers.sing., fut. indic.
VAS3IF Vaif3s- Aux. Verb, 3rd pers.sing., fut. indic.
VAP1IF Vaif1p- Aux. Verb, 1st pers.plur., fut. indic.
VAP2IF Vaif2p- Aux. Verb, 2nd pers.plur., fut. indic.
VAP3IF Vaif3p- Aux. Verb, 3rd pers.plur., fut. indic.
VAS1IR Vais1s- Aux. Verb, 1st pers.sing., past indic.
VAS2IR Vais2s- Aux. Verb, 2nd pers.sing., past indic.
VAS3IR Vais3s- Aux. Verb, 3rd pers.sing., past indic.
VAP1IR Vais1p- Aux. Verb, 1st pers.plur., past indic.
VAP3IR Vais3p- Aux. Verb, 3rd pers.plur., past indic.
VAP2ICR Va(is)|(si)2p- Aux. Verb, 2nd p.pl., past indic./impf.subj.
VASXCP Vasp.s- Aux. Verb, 1/2/3 p. sing., pres.subj.
VAP2CMP Va[sm]p2p- Aux. Verb, 2nd pers.plur., pres.subj./imper.
VAP3CP Vasp3p- Aux. Verb, 3rd pers.plur., pres.subj.
VAS^3CI Vasi^3s- Aux. Verb, 1/2 pers.sing., impf.subj.
VAS3CI Vasi3s- Aux. Verb, 3rd pers.sing., impf.subj.
VAP1CI Vasi1p- Aux. Verb, 1st pers.plur., impf.subj.
VAP3CI Vasi3p- Aux. Verb, 3rd pers.plur., impf.subj.
VAS2MP Vamp2s- Aux. Verb, 2nd pers.sing., pres.impr.
VAS2MPE Vamp2s-y Aux. Verb, 2nd pers.sing., pres.impr. + clit.
VAP2MPE Vamp2p-y Aux. Verb, 2nd pers.plur., pres.impr. + clit.
VAS1DP Vacp1s- Aux. Verb, 1st pers.sing., pres.cond.
VAS2DP Vacp2s- Aux. Verb, 2nd pers.sing., pres.cond.
VAS3DP Vacp3s- Aux. Verb, 3rd pers.sing., pres.cond.
VAP1DP Vacp1p- Aux. Verb, 1st pers.plur., pres.cond.
VAP2DP Vacp2p- Aux. Verb, 2nd pers.plur., pres.cond.
VAP3DP Vacp3p- Aux. Verb, 3rd pers.plur., pres.cond.
VAF Vanp--- Aux. Verb, infinitive
VAFE Vanp--cy Aux. Verb, infinitive + clitic
VANSPP Vapp-sc Aux. Verb, comm.sing., pres.part.
VANPPP Vapp-pc Aux. Verb, comm.plur., pres.part.
VAMSPR Vaps-sm Aux. Verb, masc.sing., past part.
VAMPPR Vaps-pm Aux. Verb, masc.plur., past part.
VAFSPR Vaps-sf Aux. Verb, femm.sing., past part.
VAFPPR Vaps-pf Aux. Verb, femm.plur., past part.
VAMSPRE Vaps-smy Aux. Verb, masc.sing., past part. + clitic
VAMPPRE Vaps-pmy Aux. Verb, masc.plur., past part. + clitic
VAFSPRE Vaps-sfy Aux. Verb, femm.sing., past part. + clitic
VAFPPRE Vaps-pfy Aux. Verb, femm.plur., past part. + clitic
VAG Vagp--- Aux. Verb, gerund
VAGE Vagp---y Aux. Verb, gerund + clitic
VS1IP Vmip1s- Main Verb, 1st pers.sing., pres.indic
VS3IP Vmip3s- Main Verb, 3rd pers.sing., pres.indic
VP3IP Vmip3p- Main Verb, 3rd pers.plur., pres.indic
VP1ICP Vm[is]p1p- Main Verb, 1st pers.plur., pres.indic./subj.
VP2IMPP Vm([im]p2p-)|(ps-pf) M.V., 2nd pl., pres.indic/imper|pstprt f.pl.
VP2IMP Vm([im]p2p)- Main Verb, 2nd pl., pres.indic/imper
VSXICP Vm(sp.s)|(ip2s)- M.V., 1/2/3 sg.,pres.subj.|2ndsg. pres.indic.
VS^1IMP Vm[im]p^1s- Main Verb, not 1st sg., pres.indic./imper.
VS2IMP Vm[im]p2s- Main Verb, 2nd sg., pres.indic/imper
VP2IMCPP Vm([ims]p2p-)|(ps-pf) M.V., 2pl., pr.ind/imp/sub|pst.prt f.pl.
VS1II Vmii1s- Main Verb, 1st pers.sing., impf.indic.
VS2II Vmii2s- Main Verb, 2nd pers.sing., impf.indic.
VS3II Vmii3s- Main Verb, 3rd pers.sing., impf.indic.
VP1II Vmii1p- Main Verb, 1st pers.plur., impf.indic.
VP2II Vmii2p- Main Verb, 2nd pers.plur., impf.indic.
VP3II Vmii3p- Main Verb, 3rd pers.plur., impf.indic.
VS1IF Vmif1s- Main Verb, 1st pers.sing., fut. indic.
VS2IF Vmif2s- Main Verb, 2nd pers.sing., fut. indic.
VS3IF Vmif3s- Main Verb, 3rd pers.sing., fut. indic.
VP1IF Vmif1p- Main Verb, 1st pers.plur., fut. indic.
VP2IF Vmif2p- Main Verb, 2nd pers.plur., fut. indic.
VP3IF Vmif3p- Main Verb, 3rd pers.plur., fut. indic.
VS1IR Vmis1s- Main Verb, 1st pers.sing., past indic.
VS2IR Vmis2s- Main Verb, 2nd pers.sing., past indic.
VS3IR Vmis3s- Main Verb, 3rd pers.sing., past indic.
VP1IR Vmis1p- Main Verb, 1st pers.plur., past indic.
VP3IR Vmis3p- Main Verb, 3rd pers.plur., past indic.
VP2ICR Vm(is)|(si)2p- Main Verb, 2nd p.pl., past indic./impf.subj.
VP2CP Vmsp2p- Main Verb, 2nd pers.plur., pres.subj. amiate
VP3CP Vmsp3p- Main Verb, 3rd pers.plur., pres.subj. amino
VSXCP Vmsp.s- Main Verb, 1/2/3 p. sing., pres.subj.
VS^3CI Vmsi^3s- Main Verb, 1/2 pers.sing., impf.subj.
VS3CI Vmsi3s- Main Verb, 3rd pers.sing., impf.subj.
VP1CI Vmsi1p- Main Verb, 1st pers.plur., impf.subj.
VP3CI Vmsi3p- Main Verb, 3rd pers.plur., impf.subj.
VS2MPE Vmmp2s-y Main Verb, 2nd pers.sing., pres.impr. + clit.
VP2MPE Vmmp2p-y Main Verb, 2nd pers.plur., pres.impr. + clit.
VS1DP Vmcp1s- Main Verb, 1st pers.sing., pres.cond.
VS2DP Vmcp2s- Main Verb, 2nd pers.sing., pres.cond.
VS3DP Vmcp3s- Main Verb, 3rd pers.sing., pres.cond.
VP1DP Vmcp1p- Main Verb, 1st pers.plur., pres.cond.
VP2DP Vmcp2p- Main Verb, 2nd pers.plur., pres.cond.
VP3DP Vmcp3p- Main Verb, 3rd pers.plur., pres.cond.
VF Vmnp--- Main Verb, infinitive
VFE Vmnp---y Main Verb, infinitive + clitic
VNSPP Vmpp-sc Main Verb, comm.sing., pres.part.
VNPPP Vmpp-pc Main Verb, comm.plur., pres.part.
VMSPR Vmps-sm Main Verb, masc.sing., past part.
VMPPR Vmps-pm Main Verb, masc.plur., past part.
VFSPR Vmps-sf Main Verb, femm.sing., past part.
VFPPR Vmps-pf Main Verb, femm.plur., past part.
VMSPRE Vmps-smy Main Verb, masc.sing., past part. + clitic
VMPPRE Vmps-pmy Main Verb, masc.plur., past part. + clitic
VFSPRE Vmps-sfy Main Verb, femm.sing., past part. + clitic
VFPPRE Vmps-pfy Main Verb, femm.plur., past part. + clitic
VG Vmgp--- Main Verb, gerund
VGE Vmgp---y Main Verb, gerund + clitic
-------------------- more collapsed tagset -----------------------
VA1P Va[iscm][pifs]1p-- Aux. verb, 1st person plur.
VA1S Va[iscm][pifs]1s-- Aux. verb, 1st person sing.
VA2P Va[iscm][pifs]2p-- Aux. verb, 2nd person plur.
VA2S Va[iscm][pifs]2s-- Aux. verb, 2nd person sing.
VA3P Va[iscm][pifs]3p-- Aux. verb, 3rd person plur.
VA3S Va[iscm][pifs]3s-- Aux. verb, 3rd person sing.
VAFPPS Vaps-pf- Aux. verb, fem. plur., past part.
VAFSPS Vaps-sf- Aux. verb, fem. sing., past part.
VAMPPS Vaps-pm- Aux. verb, masc. plur., past part.
VAMSPS Vaps-sm- Aux. verb, masc. sing., past part.
VAF Vanp---- Aux. verb, infinitive
VAFE Vanp----y Aux. Verb, infinitive + enclitic
VAG Vagp---- Aux. Verb, gerund
VAGE Vagp----y Aux. Verb, gerund + enclitic
VAPP Vapp-..- Aux. verb, pres. participle
V1P Vm[iscm][pifs]1p-- Main Verb, 1st person plur.
V1S Vm[iscm][pifs]1s-- Main Verb, 1st person sing.
V2P Vm[iscm][pifs]2p-- Main Verb, 2nd person plur.
V2S Vm[iscm][pifs]2s-- Main Verb, 2nd person sing.
V3P Vm[iscm][pifs]3p-- Main Verb, 3rd person plur.
V3S Vm[iscm][pifs]3s-- Main Verb, 3rd person sing.
VFPPS Vmps-pf- Main Verb, fem. plur., past part.
VFSPS Vmps-sf- Main Verb, fem. sing., past part.
VMPPS Vmps-pm- Main Verb, masc. plur., past part.
VMSPS Vmps-sm- Main Verb, masc. sing., past part.
VF Vmnp---- Main Verb, infinitive
VFE Vmnp----y Main Verb, infinitive + enclitic
VG Vmgp---- Main Verb, gerund
VGE Vmgp----y Main Verb, gerund + enclitic
VPP Vmpp-..- Main Verb, pres. participle
--------------------- more collapsed end ----------------------
====== =================== ===================================
5.1.2.3 Combinations
============ ======== =============================================
Lexicon Corpus Example
============ ======== =============================================
Vaip1s- +++ VAS1IP ho
Vaip2s- VAS2IP hai, sei
Vaip3s- VAS3IP ha, e'
Vaip2p- VAP2IP avete, siete
Vaip3p- +++ VAP3IP hanno
Vaip1p- VAP1ICP abbiamo, siamo
Vaip1s- +++ VAY^2IP sono
Vaip3p- +++ VAY^2IP sono
Vaii1s- VAS1II avevo, ero
Vaii2s- VAS2II avevi, eri
Vaii3s- VAS3II aveva, era
Vaii1p- VAP1II avevamo, eravamo
Vaii2p- VAP2II avevate, eravate
Vaii3p- VAP3II avevano, erano
Vaif1s- VAS1IF avro', saro'
Vaif2s- VAS2IF avrai, sarai
Vaif3s- VAS3IF avra', sara'
Vaif1p- VAP1IF avremo, saremo
Vaif2p- VAP2IF avrete, sarete
Vaif3p- VAP3IF avranno, saranno
Vais1s- VAS1IR ebbi, fui
Vais2s- VAS2IR avesti, fosti
Vais3s- VAS3IR ebbe, fu
Vais1p- VAP1IR avemmo, fummo
Vais3p- VAP3IR ebbero, furono
Vais2p- VAP2ICR aveste, foste
Vasp1s- VASXCP abbia, sia
Vasp2s- VASXCP abbia, sia
Vasp3s- VASXCP abbia, sia
Vasp1p- VAP1ICP abbiamo, siamo
Vasp2p- VAP2CMP abbiate, siate
Vasp3p- VAP3CP abbiano, siano
Vasi1s- VAS^3CI avessi, fossi
Vasi2s- VAS^3CI avessi, fossi
Vasi3s- VAS3CI avesse, fosse
Vasi1p- VAP1CI avessimo, fossimo
Vasi2p- VAP2ICR aveste, foste
Vasi3p- VAP3CI avessero, fossero
Vamp2s- VAS2MP abbi, sii
Vamp2s-y VAS2MPE abbilo, siilo
Vamp2p- VAP2CMP abbiate, siate
Vamp2p-y VAP2MPE abbiatelo, siatelo
Vacp1s- VAS1DP avrei, sarei
Vacp2s- VAS2DP avresti, saresti
Vacp3s- VAS3DP avrebbe, sarebbe
Vacp1p- VAP1DP avremmo, saremmo
Vacp2p- VAP2DP avreste, sareste
Vacp3p- VAP3DP avrebbero, sarebbero
Vanp--- VAF avere, essere
Vanp--cy VAFE averlo, esserlo
Vapp-sc VANSPP avente, essente
Vapp-pc VANPPP aventi, essenti
Vaps-sm VAMSPR avuto, stato
Vaps-pm VAMPPR avuti, stati
Vaps-sf VAFSPR avuta, stata
Vaps-pf VAFPPR avute, state
Vaps-smy VAMSPRE avutolo
Vaps-pmy VAMPPRE avutili
Vaps-sfy VAFSPRE avutala
Vaps-pfy VAFPPRE avutele
Vagp--- VAG avendo, essendo
Vagp---y VAGE avendolo, essendolo
Vmip1s- VS1IP amo, leggo, servo
Vmip2s- +++ VSXICP ami
Vmip2s- +++ VS2IMP leggi, servi
Vmip3s- --- VS^1IMP ama
Vmip3s- --- VS3IP legge, serve
Vmip1p- VP1ICP amiamo, leggiamo, serviamo
Vmip2p- *** VP2IMPP amate, servite
Vmip2p- *** VP2IMP leggete
Vmip2p- *** VP2IMCPP premiate
Vmip3p- VP3IP amano, leggono, servono
Vmii1s- VS1II amavo
Vmii2s- VS2II amavi
Vmii3s- VS3II amava
Vmii1p- VP1II amavamo
Vmii2p- VP2II amavate
Vmii3p- VP3II amavano
Vmif1s- VS1IF amero'
Vmif2s- VS2IF amerai
Vmif3s- VS3IF amera'
Vmif1p- VP1IF ameremo
Vmif2p- VP2IF amerete
Vmif3p- VP3IF ameranno
Vmis1s- VS1IR amai
Vmis2s- VS2IR amasti
Vmis3s- VS3IR amo'
Vmis1p- VP1IR amammo
Vmis2p- VP2ICR amaste, leggeste, serviste
Vmis3p- VP3IR amarono
Vmsp1s- +++ VSXCP legga
Vmsp1s- +++ VSXICP ami
Vmsp2s- --- VSXCP legga
Vmsp2s- --- VSXICP ami
Vmsp3s- *** VSXCP legga
Vmsp3s- *** VSXICP ami
Vmsp1p- VP1ICP amiamo, leggiamo, serviamo
Vmsp2p- """ VP2CP amiate, leggiate, serviate
Vmsp2p- """ VP2IMCPP premiate
Vmsp3p- VP3CP amino, leggano, servano
Vmsi1s- VS^3CI amassi, leggessi, servissi
Vmsi2s- VS^3CI amassi, leggessi, servissi
Vmsi3s- VS3CI amasse, leggesse, servisse
Vmsi1p- VP1CI amassimo
Vmsi2p- VP2ICR amaste, leggeste, serviste
Vmsi3p- VP3CI amassero
Vmmp2s- +++ VS^1IMP ama
Vmmp2s- +++ VS2IMP leggi, servi
Vmmp2p- --- VP2IMPP amate
Vmmp2p- --- VP2IMP leggete, servite
Vmmp2p- --- VP2IMCPP premiate
Vmmp2s-y VS2MPE amalo, leggilo, servilo
Vmmp2p-y VP2MPE amatelo, leggetelo, servitelo
Vmcp1s- VS1DP amerei
Vmcp2s- VS2DP ameresti
Vmcp3s- VS3DP amerebbe
Vmcp1p- VP1DP ameremmo
Vmcp2p- VP2DP amereste
Vmcp3p- VP3DP amerebbero
Vmnp--- VF amare
Vmnp---y VFE amarlo
Vmpp-sc VNSPP amante
Vmpp-pc VNPPP amanti
Vmps-sm VMSPR amato, letto, servito
Vmps-pm VMPPR amati, letti, serviti
Vmps-sf VFSPR amata, letta, servita
Vmps-pf +++ VP2IMCPP premiate
Vmps-pf +++ VP2IMPP amate, servite
Vmps-pf +++ VFPPR lette
Vmps-smy VMSPRE amatolo
Vmps-pmy VMPPRE amatili
Vmps-sfy VFSPRE amatala
Vmps-pfy VFPPRE amatele
Vmgp--- VG amando
Vmgp---y VGE amandolo
----------------------- more collapsed tagset ------------------
Vaip1s- VA1S ho
Vaip2s- VA2S hai
Vaip3s- VA3S ha
Vaip1p- VA1P abbiamo
Vaip2p- VA2P avete
Vaip3p- VA3P hanno
Vaii1s- VA1S avevo
Vaii2s- VA2S avevi
Vaii3s- VA3S aveva
Vaii1p- VA1P avevamo
Vaii2p- VA2P avevate
Vaii3p- VA3P avevano
Vaif1s- VA1S avro'
Vaif2s- VA2S avrai
Vaif3s- VA3S avra'
Vaif1p- VA1P avremo
Vaif2p- VA2P avrete
Vaif3p- VA3P avranno
Vais1s- VA1S ebbi
Vais2s- VA2S avesti
Vais3s- VA3S ebbe
Vais1p- VA1P avemmo
Vais2p- VA2P aveste
Vais3p- VA3P ebbero
Vasp1s- VA1S abbia
Vasp2s- VA2S abbia
Vasp3s- VA3S abbia
Vasp1p- VA1P abbiamo
Vasp2p- VA2P abbiate
Vasp3p- VA3P abbiano
Vasi1s- VA1S avessi
Vasi2s- VA2S avessi
Vasi3s- VA3S avesse
Vasi1p- VA1P avessimo
Vasi2p- VA2P aveste
Vasi3p- VA3P avessero
Vamp2s- VA2S abbi
Vamp2p- VA2P abbiate
Vacp1s- VA1S avrei
Vacp2s- VA2S avresti
Vacp3s- VA3S avrebbe
Vacp1p- VA1P avremmo
Vacp2p- VA2P avreste
Vacp3p- VA3P avrebbero
Vanp--- VAF avere
Vanp---y VAFE averlo
Va-cspp VANSPP avente
Va-cppp VANPPP aventi
Va-msps VAMSPR avuto
Va-mpps VAMPPR avuti
Va-fsps VAFSPR avuta
Va-fpps VAFPPR avute
Va-gp-- VAG avendo
Va-gp--y VAGE avendolo
Vmip1s- V1S amo
Vmip2s- V2S ami
Vmip3s- V3S ama
Vmip1p- V1P amiamo
Vmip2p- V2P amate
Vmip3p- V3P amano
Vmii1s- V1S amavo
Vmii2s- V2S amavi
Vmii3s- V3S amava
Vmii1p- V1P amavamo
Vmii2p- V2P amavate
Vmii3p- V3P amavano
Vmif1s- V1S amero'
Vmif2s- V2S amerai
Vmif3s- V3S amera'
Vmif1p- V1P ameremo
Vmif2p- V2P amerete
Vmif3p- V3P ameranno
Vmis1s- V1S amai
Vmis2s- V2S amasti
Vmis3s- V3S amo'
Vmis1p- V1P amammo
Vmis2p- V2P amaste
Vmis3p- V3P amarono
Vmsp1s- V1S ami
Vmsp2s- V2S ami
Vmsp3s- V3S ami
Vmsp1p- V1P amiamo
Vmsp2p- V2P amiate
Vmsp3p- V3P amino
Vmsi1s- V1S amassi
Vmsi2s- V2S amassi
Vmsi3s- V3S amasse
Vmsi1p- V1P amassimo
Vmsi2p- V2P amaste
Vmsi3p- V3P amassero
Vmmp2s- V2S ama
Vmmp2p- V2P amate
Vmcp1s- V1S amerei
Vmcp2s- V2S ameresti
Vmcp3s- V3S amerebbe
Vmcp1p- V1P ameremmo
Vmcp2p- V2P amereste
Vmcp3p- V3P amerebbero
Vmnp--- VF amare
Vmnp---y VFE amarlo
Vm-cspp VNSPP amante
Vm-cppp VNPPP amanti
Vm-msps VMSPR amato
Vm-mpps VMPPR amati
Vm-fsps VFSPR amata
Vm-fpps VFPPR amate
Vm-gp-- VG amando
Vm-gp--y VGE amandolo
========= ======= ========================================
5.1.2.4 Some observations for corpus tagset
An observation concerns the special marking for the auxiliaries: the
taggers are in general not able to disambiguate the cases in which the
auxiliaries are used as full verbs ("io ho un cane" , "i bambini sono
nel prato") from the cases when they are auxiliaries. The distinction
of the auxiliaries is used only in order to isolate 'avere' and
'essere' from the other verbs.
For verbs, two different sets of tags are proposed: the first is more
fine-grained and allows more accurate distinctions, while the second is
more coarse-grained and follows the approach proposed by the French
group.
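For finite forms, the coarse-grained tags in the collapsed part of the
tables keep only the auxiliary/main distinction, Person and Number. The
following is a minimal sketch of that collapsing, assuming the
description layout V, Type, Mood, Tense, Person, Number, Gender used in
the Combinations table; the function name collapse_finite_verb is
hypothetical.

```python
FINITE_MOODS = set("iscm")  # indicative, subjunctive, conditional, imperative


def collapse_finite_verb(description):
    """Map a finite verb description to its coarse tag, e.g. 'Vmsi2p-' -> 'V2P'."""
    if len(description) < 6:
        raise ValueError(f"description too short: {description!r}")
    category, verb_type, mood, _tense, person, number = description[:6]
    if category != "V" or mood not in FINITE_MOODS:
        raise ValueError(f"not a finite verb description: {description!r}")
    prefix = "VA" if verb_type == "a" else "V"  # auxiliaries keep the A mark
    return prefix + person + number.upper()


print(collapse_finite_verb("Vmsi2p-"))  # amaste
print(collapse_finite_verb("Vaif3s-"))  # avra'
```

Mood and Tense are read but deliberately discarded, which is exactly the
information the fine-grained tagset retains.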
The collapsing of Moods and Tenses proposed by the French group appears
restrictive when measured against the performance of our tagger:
for many unambiguous tenses and moods, the Italian tagger is able to
produce the correct analysis (e.g. conditional, imperfect subjunctive,
past indicative, etc.), and these distinctions are, in our
opinion, worth maintaining. It should be noted that the
ambiguities between verb forms also depend on the individual lexical verb.
In Italian, the major ambiguities concern the 2nd person singular and
plural of the present indicative and imperative: ama-amate; leggi-leggete.
However, this is again not a general rule.
Another very common ambiguity is between the 2nd person of the
present indicative and the 1st, 2nd and 3rd person of the present
subjunctive (e.g. ami). It is therefore not always possible to decide
unambiguously on the person.
Some of the most frequent homographs in Italian are listed below:
VP1ICP amiamo
VP2IMP leggete
VP2IMPP amate
VP2IMCPP premiate
VP2ICR amaste
VS^3CI amassi
VSXCP legga
VAY^2IP sono
VSXICP ami
VS2IMP leggi
VS^1IMP ama
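Ambiguity classes of this kind can be derived from a lexicon by grouping
the analyses of each word form. The following is a minimal sketch,
assuming the lexicon is available as (form, description) pairs; the
sample pairs are copied from the verb Combinations table, and the names
ANALYSES and ambiguity_classes are hypothetical.

```python
from collections import defaultdict

# (word form, lexicon description) pairs taken from the Combinations table.
ANALYSES = [
    ("amo", "Vmip1s-"),
    ("ami", "Vmip2s-"), ("ami", "Vmsp1s-"), ("ami", "Vmsp2s-"), ("ami", "Vmsp3s-"),
    ("ama", "Vmip3s-"), ("ama", "Vmmp2s-"),
    ("leggi", "Vmip2s-"), ("leggi", "Vmmp2s-"),
    ("amate", "Vmip2p-"), ("amate", "Vmmp2p-"), ("amate", "Vmps-pf"),
]


def ambiguity_classes(analyses):
    """Group descriptions by form and keep only forms with several analyses."""
    by_form = defaultdict(set)
    for form, description in analyses:
        by_form[form].add(description)
    return {
        form: sorted(descriptions)
        for form, descriptions in by_form.items()
        if len(descriptions) > 1
    }


for form, descriptions in ambiguity_classes(ANALYSES).items():
    print(form, descriptions)
```

Each resulting set of descriptions corresponds to one candidate corpus
tag that covers all readings of the form, such as VP2IMPP for amate.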
In the design of corpus tagsets for verbs, careful attention should be
given to enclitics: at present our tagger is able to recognize the
presence of clitics, which is signalled by the addition of the mark
"+E" (plus clitic) to the regular verb tag.
5.1.3 Adjectives (A)
--------------------
5.1.3.1 Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type - - -
------------ ----------- ----------- ----
Degree positive buono p
comparative migliore c
superlative buonissimo s
------------------------------------ ----
Gender masculine buono m
feminine buona f
l-spec common dolce c
------------ ----------- ----------- ----
Number singular buono s
plural buoni p
l-spec invariant pari n
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
============ =========== =========== ====
5.1.3.2 Corpus
======= ================== ====================================
Tag Regular expression Definition
======= ================== ====================================
AFP A-.fp- Adjective fem. plur.
AFS A-.fs- Adjective fem. sing.
AFN A-.fn- Adjective fem. invar.
AMP A-.mp- Adjective masc. plur.
AMS A-.ms- Adjective masc. sing.
AMN A-.mn- Adjective masc. invar.
ANP A-.cp- Adjective comm. plur.
ANS A-.cs- Adjective comm. sing.
ANN A-.cn- Adjective comm. invar.
======= ================== ====================================
5.1.3.3 Combinations
========= ======= =============================================
Lexicon Corpus Example
========= ======= =============================================
A-pms- AMS vero
A-pmp- AMP veri
A-pmn- AMN oggetto (complemento/i oggetto: grammatical language)
A-pfs- AFS vera
A-pfp- AFP vere
A-pfn- AFN valore (clausola valore: juridical language)
A-pcs- ANS dolce (biscotto, torta)
A-pcp- ANP dolci (biscotti, torte)
A-pcn- ANN pari (risultato/i, somma/e)
A-sms- AMS verissimo
A-smp- AMP verissimi
A-sfs- AFS verissima
A-sfp- AFP verissime
========= ======= =============================================
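In the combinations above the corpus tag depends only on Gender and
Number: Degree is dropped, and common gender is written N. The following
is a minimal sketch of this mapping, assuming the description layout
A, Type, Degree, Gender, Number, Case; adjective_tag and GENDER_LETTER
are hypothetical names.

```python
# Common gender (code c in the lexicon) appears as N in the corpus tags.
GENDER_LETTER = {"m": "M", "f": "F", "c": "N"}


def adjective_tag(description):
    """Map an adjective description such as 'A-sfp-' to its corpus tag."""
    if len(description) != 6 or description[0] != "A":
        raise ValueError(f"not an adjective description: {description!r}")
    _type, _degree, gender, number = description[1:5]
    return "A" + GENDER_LETTER[gender] + number.upper()


print(adjective_tag("A-pms-"))  # vero
print(adjective_tag("A-sms-"))  # verissimo, same tag as vero
print(adjective_tag("A-pcn-"))  # pari
```

Positive and superlative forms therefore share one tag, as the rows for
vero and verissimo show.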
5.1.3.4 Observations
The comparative Degree applies only to a closed set of adjectives (e.g.
maggiore, migliore, etc.). All other adjectives form their comparatives
with "piu'" + adjective (e.g. piu' forte). The superlative also has an
analytical form (il piu' forte), but can also be formed synthetically:
grandissimo, massimo.
5.1.4 Pronouns (P)
------------------
5.1.4.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type personal io p
demonstrat. quello d
indefinite chiunque i
possessive mio s
interrog. chi t
relative che r
exclamative quanto e
------------ ----------- ----------- ----
Person first io 1
second tu 2
third egli 3
------------ ----------- ----------- ----
Gender masculine questo m
feminine questa f
l-spec common io c
------------ ----------- ----------- ----
Number singular questo s
plural questi p
l-spec invariant che n
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
------------ ----------- ----------- ----
Possessor - - -
============ =========== =========== ====
5.1.4.2 Corpus
======= ========= ====================================
Tag Reg.Expr. Definition
======= ========= ====================================
PDMS Pd-ms-- Demonstrative pronoun masc.sing.
PDMP Pd-mp-- Demonstrative pronoun masc.plur.
PDFS Pd-fs-- Demonstrative pronoun femm.sing.
PDFP Pd-fp-- Demonstrative pronoun femm.plur.
PDNS Pd-cs-- Demonstrative pronoun comm.sing.
PDNP Pd-cp-- Demonstrative pronoun comm.plur.
PIMS Pi-ms-- Indefinite pronoun masc.sing.
PIMP Pi-mp-- Indefinite pronoun masc.plur.
PIFS Pi-fs-- Indefinite pronoun femm.sing.
PIFP Pi-fp-- Indefinite pronoun femm.plur.
PINS Pi-cs-- Indefinite pronoun comm.sing.
PINP Pi-cp-- Indefinite pronoun comm.plur.
PPMS Ps.ms-- Possessive pronoun, masc.sing.
PPMP Ps.mp-- Possessive pronoun, masc.plur.
PPFS Ps.fs-- Possessive pronoun, femm.sing.
PPFP Ps.fp-- Possessive pronoun, femm.plur.
PPNP Ps.cp-- Possessive pronoun, comm.plur.
PWNS P[tre]-cs-- Interr./Rel./Excl. pronoun, comm.sing.
PWNP P[tre]-cp-- Interr./Rel./Excl. pronoun, comm.plur.
PWNN P[tre]-cn-- Interr./Rel./Excl. pronoun, comm.inv.
PWMS P[tre]-ms-- Interr./Rel./Excl. pronoun, masc.sing.
PWMP P[tre]-mp-- Interr./Rel./Excl. pronoun, masc.plur.
PWFS P[tre]-fs-- Interr./Rel./Excl. pronoun, femm.sing.
PWFP P[tre]-fp-- Interr./Rel./Excl. pronoun, femm.plur.
PQNS1 Pp1cs-- Personal pronoun, 1st pers., comm.sing.
PQNS2 Pp2cs-- Personal pronoun, 2nd pers., comm.sing.
PQMS3 Pp3ms-- Personal pronoun, 3rd pers., masc.sing.
PQFS3 Pp3fs-- Personal pronoun, 3rd pers., femm.sing.
PQNN3 Pp3cn-- Personal pronoun, 3rd pers., comm.inv.
PQNP1 Pp1cp-- Personal pronoun, 1st pers., comm.plur.
PQNP2 Pp2cp-- Personal pronoun, 2nd pers., comm.plur.
PQNP3 Pp3cp-- Personal pronoun, 3rd pers., comm.plur.
PQMP3 Pp3mp-- Personal pronoun, 3rd pers., masc.plur.
PQFP3 Pp3fp-- Personal pronoun, 3rd pers., femm.plur.
--------- more collapsed ----------------
PFP P..fp--- Pronoun, fem. plur.
PFS P..fs--- Pronoun, fem. sing.
PMP P..mp--- Pronoun, masc. plur.
PMS P..ms--- Pronoun, masc. sing.
PNS P..cs--- Pronoun, comm. sing.
PNP P..cp--- Pronoun, comm. plur.
PNN P..cn--- Pronoun, comm. inv.
---------- more collapsed end -----------
====== ========== ==========================================
5.1.4.3 Combinations
========= ======= =============================================
Lexicon Corpus Example
========= ======= =============================================
Pd-ms-- PDMS quello, costui
Pd-mp-- PDMP quelli
Pd-fs-- PDFS quella
Pd-fp-- PDFP quelle
Pd-cs-- PDNS cio'
Pd-cp-- PDNP coloro
Pi-ms-- PIMS ognuno
Pi-mp-- PIMP alcuni
Pi-fs-- PIFS ognuna
Pi-fp-- PIFP alcune
Pi-cs-- PINS chiunque, tale
Pi-cp-- PINP tali
Ps1ms-- PPMS mio, nostro
Ps1mp-- PPMP miei
Ps1fs-- PPFS mia
Ps1fp-- PPFP mie
Ps2ms-- PPMS tuo, vostro
Ps2mp-- PPMP tuoi
Ps2fs-- PPFS tua
Ps2fp-- PPFP tue
Ps3ms-- PPMS suo
Ps3mp-- PPMP suoi
Ps3fs-- PPFS sua
Ps3fp-- PPFP sue
Ps3cp-- PPNP loro
Pt-cs-- PWNS chi? quale?
Pt-cp-- PWNP quali?
Pt-cn-- PWNN che?
Pt-ms-- PWMS quanto?
Pt-mp-- PWMP quanti?
Pt-fs-- PWFS quanta?
Pt-fp-- PWFP quante?
Pr-cn-- PWNN cui
Pr-ms-- PWMS quanto
Pr-mp-- PWMP quanti
Pr-fs-- PWFS quanta
Pr-fp-- PWFP quante
Pr-cs-- PWNS chi, quale
Pr-cp-- PWNP quali
Pe-ms-- PWMS quanto!
Pe-mp-- PWMP quanti!
Pe-fs-- PWFS quanta!
Pe-fp-- PWFP quante!
Pe-cs-- PWNS quale!
Pe-cp-- PWNP quali!
Pe-cn-- PWNN che!
Pp1cs-- PQNS1 io, me, mi
Pp2cs-- PQNS2 tu, te, ti
Pp3ms-- PQMS3 egli, lui, esso, gli, lo
Pp3fs-- PQFS3 ella, lei, essa, le, la
Pp3cn-- PQNN3 si
Pp1cp-- PQNP1 noi, ci
Pp2cp-- PQNP2 voi, vi
Pp3cp-- PQNP3 loro
Pp3mp-- PQMP3 essi, li
Pp3fp-- PQFP3 esse, le
--------------------- more collapsed -----------------------------
P..fp--- PFP mie, queste, quante etc.
P..fs--- PFS mia, questa, quanta etc.
P..mp--- PMP miei, questi, quanti etc.
P..ms--- PMS mio, questo, quanto etc.
P..cs--- PNS quale
P..cp--- PNP quali
P..cn--- PNN che, cui, altrui
-------------------- more collapsed end --------------------
==============================
5.1.4.4 Observations
For pronouns, the same strategy is followed: two different tagsets are
proposed, one more fine-grained and the other more collapsed.
As far as the pronominal paradigm is concerned, Case is not encoded at
present in our DMI (Calzolari et al. 1983).
Personal pronouns are not lemmatized: 'gli' is not considered the
dative form of the base pronoun 'egli' (he), but constitutes a
separate entry.
The Italian pronominal paradigm is the following:
'forme toniche' (strong forms): subj (io, egli), compl (me, lui)
ama me / da' a me -- dir-obj/prep-obj --
(he loves me / he gives to me)
ama lui / da' a lui -- dir-obj/prep-obj --
(she loves him / she gives to him)
'forme atone' (weak forms): - compl (mi, gli/lo)
mi da' / mi ama -- ind-obj/dir-obj --
(he gives me / he loves me)
gli da' -- ind-obj --
(he gives him)
lo ama -- dir-obj --
(she loves him)
This paradigm can be mapped on the Case system proposed by the French
group, in the following way:
io, egli = subj = nom
mi/me = dir-obj/ind-obj/prep-obj = obj -> acc, dat, prep+obl
lui = dir-obj/prep-obj = obj -> acc, prep+obl
gli = ind-obj = dat
lo = dir-obj = acc
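The mapping above can be held as a small lookup table. This is only a
sketch of the correspondences just listed, keeping mi and me grouped as
in the text; the names CASE_MAPPING and cases_for are hypothetical, and
the case labels are the abbreviations used above.

```python
# (pronoun forms, case values) rows copied from the mapping above.
CASE_MAPPING = [
    (("io", "egli"), ("nom",)),
    (("mi", "me"), ("acc", "dat", "prep+obl")),
    (("lui",), ("acc", "prep+obl")),
    (("gli",), ("dat",)),
    (("lo",), ("acc",)),
]


def cases_for(form):
    """Return the case values listed for a pronoun form, or () if it is not listed."""
    for forms, cases in CASE_MAPPING:
        if form in forms:
            return cases
    return ()


print(cases_for("lui"))
print(cases_for("gli"))
```

Forms outside the five rows return an empty tuple, since the text gives
no mapping for them.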
5.1.5 Determiners (Pronominal Adjectives) (D)
---------------------------------------------
5.1.5.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type demonstrat. questo d
indefinite ogni i
possessive mio s
interrogat. che t
exclamative quanto e this value has been added
relative quanto r this value has been added
------------ ----------- ----------- ----
Person first mio 1
second tuo 2
third suo 3
------------ ----------- ----------- ----
Gender masculine questo m
feminine questa f
l-spec common ogni c
------------ ----------- ----------- ----
Number singular quello s
plural quelli p
l-spec invariant altrui n
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
------------ ----------- ----------- ----
Possessor - - -
============ =========== =========== ====
5.1.5.2 Corpus
======= ============ ============================================
Tag Regular exp. Definition
======= ============ ============================================
DDNS Dd-cs-- Demonstrative pron.adj. comm.sing.
DDNP Dd-cp-- Demonstrative pron.adj. comm.plur.
DDMS Dd-ms-- Demonstrative pron.adj. masc.sing.
DDMP Dd-mp-- Demonstrative pron.adj. masc.plur.
DDFS Dd-fs-- Demonstrative pron.adj. femm.sing.
DDFP Dd-fp-- Demonstrative pron.adj. femm.plur.
DIMS Di-ms-- Indefinite pron.adj. masc.sing.
DIMP Di-mp-- Indefinite pron.adj. masc.plur.
DIFS Di-fs-- Indefinite pron.adj. femm.sing.
DIFP Di-fp-- Indefinite pron.adj. femm.plur.
DINS Di-cs-- Indefinite pron.adj. comm.sing.
DINP Di-cp-- Indefinite pron.adj. comm.plur.
DPMS Ds.ms-- Possessive pron.adj., masc.sing.
DPMP Ds.mp-- Possessive pron.adj., masc.plur.
DPFS Ds.fs-- Possessive pron.adj., femm.sing.
DPFP Ds.fp-- Possessive pron.adj., femm.plur.
DPNN Ds-cn-- Possessive pron.adj., comm.inv.
DWNN D[tre]-cn-- Interr./Relat./Excl. pron.adj., comm.inv.
DWMS D[tre]-ms-- Interr./Relat./Excl. pron.adj., masc.sing.
DWMP D[tre]-mp-- Interr./Relat./Excl. pron.adj., masc.plur.
DWFS D[tre]-fs-- Interr./Relat./Excl. pron.adj., femm.sing.
DWFP D[tre]-fp-- Interr./Relat./Excl. pron.adj., femm.plur.
DWNS D[tre]-cs-- Interr./Relat./Excl. pron.adj., comm.sing.
DWNP D[tre]-cp-- Interr./Relat./Excl. pron.adj., comm.plur.
--------- more collapsed ----------------
DFP D..fp--- Determiner, fem. plur.
DFS D..fs--- Determiner, fem. sing.
DMP D..mp--- Determiner, masc. plur.
DMS D..ms--- Determiner, masc. sing.
DNS D..cs--- Determiner, comm. sing.
DNP D..cp--- Determiner, comm. plur.
DNN D..cn--- Determiner, comm. inv.
--------- more collapsed end --------------
======= ============ ================================================
5.1.5.3 Combinations
========= ========== =================================
Lexicon Corpus Example
========= ========== =================================
Dd-cs-- DDNS tale
Dd-cp-- DDNP tali
Dd-ms-- DDMS quello
Dd-mp-- DDMP quelli
Dd-fs-- DDFS quella
Dd-fp-- DDFP quelle
Di-ms-- DIMS nessun
Di-mp-- DIMP alcuni
Di-fs-- DIFS nessuna
Di-fp-- DIFP alcune
Di-cs-- DINS ogni
Di-cp-- DINP quali
Ds1ms-- DPMS mio, nostro
Ds1mp-- DPMP miei
Ds1fs-- DPFS mia
Ds1fp-- DPFP mie
Ds2ms-- DPMS tuo, vostro
Ds2mp-- DPMP tuoi
Ds2fs-- DPFS tua
Ds2fp-- DPFP tue
Ds3ms-- DPMS suo
Ds3mp-- DPMP suoi
Ds3fs-- DPFS sua
Ds3fp-- DPFP sue
Ds-cn-- DPNN altrui
Dr-cn-- DWNN cui
Dr-ms-- DWMS quanto
Dr-mp-- DWMP quanti
Dr-fs-- DWFS quanta
Dr-fp-- DWFP quante
Dr-cs-- DWNS quale
Dr-cp-- DWNP quali
Dt-cn-- DWNN che
Dt-ms-- DWMS quanto
Dt-mp-- DWMP quanti
Dt-fs-- DWFS quanta
Dt-fp-- DWFP quante
Dt-cs-- DWNS quale
Dt-cp-- DWNP quali
De-cn-- DWNN che
De-cp-- DWNP quali
De-cs-- DWNS quale
De-ms-- DWMS quanto
De-mp-- DWMP quanti
De-fs-- DWFS quanta
De-fp-- DWFP quante
----------------------- more collapsed -----------------------------
D..fp--- DFP mie, queste, quante etc.
D..fs--- DFS mia, questa, quanta etc.
D..mp--- DMP miei, questi, quanti etc.
D..ms--- DMS mio, questo, quanto etc.
D..cs--- DNS quale
D..cp--- DNP quali
D..cn--- DNN altrui
----------------------- more collapsed end -----------------------
========= ========== =================================
5.1.5.4 Observations
Following the strategy adopted for Pronouns, two tagsets are also
proposed for Determiners.
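The relation between the two determiner tagsets can be sketched as
follows: the fine-grained tag keeps the Type, while the collapsed tag
keeps only Gender and Number. The sketch assumes the seven-character
description layout D, Type, Person, Gender, Number, Case, Possessor seen
in the Combinations table; determiner_tags and the two lookup tables are
hypothetical names.

```python
# Type letters as they appear in the fine-grained corpus tags:
# interrogative, relative and exclamative all share the letter W.
TYPE_LETTER = {"d": "D", "i": "I", "s": "P", "t": "W", "r": "W", "e": "W"}
# Common gender (code c in the lexicon) appears as N in the corpus tags.
GENDER_LETTER = {"m": "M", "f": "F", "c": "N"}


def determiner_tags(description):
    """Return (fine-grained tag, collapsed tag) for a description such as 'Ds1ms--'."""
    if len(description) != 7 or description[0] != "D":
        raise ValueError(f"not a determiner description: {description!r}")
    det_type, gender, number = description[1], description[3], description[4]
    suffix = GENDER_LETTER[gender] + number.upper()
    return "D" + TYPE_LETTER[det_type] + suffix, "D" + suffix


print(determiner_tags("Ds1ms--"))  # mio
print(determiner_tags("Dt-cn--"))  # che
```

Person is ignored in both tagsets, so mio, tuo and suo share the same
fine-grained tag.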
5.1.6 Articles (T)
------------------
5.1.6.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type definite il d
indefinite un i
------------ ----------- ----------- ----
Gender masculine il m
feminine la f
l-spec common l' c
------------ ----------- ----------- ----
Number singular la s
plural le p
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
============ =========== =========== ====
5.1.6.2. Corpus
======== ========== ==========================================
Tag Reg.Expr. Definition
======== ========== ==========================================
RMS Tdms- Article, definite, masc.sing.
RMP Tdmp- Article, definite, masc.plur.
RFS Tdfs- Article, definite, fem.sing.
RFP Tdfp- Article, definite, fem.plur.
RNS Tdcs- Article, definite, comm.sing.
RIMS Tims- Article, indefinite, masc.sing.
RIFS Tifs- Article, indefinite, fem.sing.
======== ========== ==========================================
5.1.6.3. Combinations
========= ======== ==========================================
Lexicon Corpus Example
========= ======== ==========================================
Tdms- RMS il, lo
Tdmp- RMP i, gli
Tdfs- RFS la
Tdfp- RFP le
Tdcs- RNS l' (amico/a)
Tims- RIMS un, uno
Tifs- RIFS una, un'
========= ======== ==========================================
5.1.7 Adverbs (R)
-----------------
5.1.7.1 Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type - - -
------------ ----------- ----------- ----
Degree positive bene p
superlative benissimo s
============ =========== =========== ====
5.1.7.2 Corpus
======= ================== ===========================
Tag Regular Expression Definition
======= ================== ===========================
B R-p Adverb positive
BS R-s Adverb superlative
======= ================== ===========================
5.1.7.3 Combinations
========= =========== ============================
Lexicon Corpus Example
========= =========== ============================
R-p B fortemente
R-s BS fortissimamente
========= =========== ============================
5.1.7.4. Observations
The feature Type is not encoded in the Italian lexicon.
5.1.8. Adposition (S)
---------------------
5.1.8.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type preposition di, a, da p
------------ ----------- ----------- ----
Formation simple di s
compound dello c
------------ ----------- ----------- ----
Gender masculine dello m This attribute and values
feminine alla f have been added
l-spec common dell' c
------------ ----------- ----------- ----
Number singular al s This attribute and values
plural ai p have been added
============ =========== =========== ====
5.1.8.2 Corpus
======= ================== =====================
Tag Regular Expression Definition
======= ================== =====================
E Sp- Preposition simple
EA Spc.. Preposition compound
======= ================== =====================
5.1.8.3 Combinations
========= ================ =======================
Lexicon Corpus Example
========= ================ =======================
Sp E di
Spcfs EA della
Spcfp EA delle
Spcms EA del, dello
Spcmp EA dei, degli
Spccn EA dell'
========= ================ =======================
5.1.8.4 Observations
The Italian policy for encoding fused prepositions (preposition plus
article) is to attach the morphological information of the article,
i.e. gender and number, to the preposition tag.
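A minimal sketch of how such a code could be unpacked, based only on the Lexicon and Combinations tables above. The function name `decode_adposition` and the returned dictionary layout are hypothetical. The sketch also assumes that a simple preposition may be written either as `Sp` (as in the Combinations table) or as `Sps` (as the Formation attribute suggests), and it takes the invariant number value `n` from the `Spccn` row.

```python
GENDER = {"m": "masculine", "f": "feminine", "c": "common"}
NUMBER = {"s": "singular", "p": "plural", "n": "invariant"}


def decode_adposition(code):
    """Decode a preposition lexicon code into attributes plus corpus tag."""
    if not code.startswith("Sp"):
        raise ValueError(f"not a preposition code: {code!r}")
    rest = code[2:]
    # Simple prepositions carry no article information (assumed spellings).
    if rest in ("", "s"):
        return {"formation": "simple", "tag": "E"}
    # Fused prepositions: formation 'c', then the article's gender and number.
    if len(rest) != 3 or rest[0] != "c":
        raise ValueError(f"unexpected preposition code: {code!r}")
    return {
        "formation": "compound",
        "gender": GENDER[rest[1]],
        "number": NUMBER[rest[2]],
        "tag": "EA",
    }


print(decode_adposition("Sp"))     # di
print(decode_adposition("Spcms"))  # del, dello
print(decode_adposition("Spccn"))  # dell'
```

Note that the corpus tag EA is the same for every fused form, so the gender and number of the article are recoverable from the lexicon code only.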
5.1.9 Conjunctions (C)
----------------------
5.1.9.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type coordinat. e c
subordinat. perche' s
============ =========== =========== ====
5.1.9.2 Corpus
======= ================== =========================
Tag Regular Expression Definition
======= ================== =========================
CC Cc Coordinative conjunction
CS Cs Subordinative conjunction
======= ================== =========================
5.1.9.3 Combinations
========= =========== ============================
Lexicon Corpus Example
========= =========== ============================
Cc CC ma
Cs CS perche'
========= =========== ============================
5.1.10 Numerals (M)
-------------------
5.1.10.1. Lexicon
============ =========== =========== ====
Attribute Value Example Code
============ =========== =========== ====
Type cardinal cento c
ordinal primo o
------------ ----------- ----------- ----
Gender masculine primo m
feminine prima f
------------ ----------- ----------- ----
Number singular secondo s
plural secondi p
------------ ----------- ----------- ----
Case (n.a.) (n.a.) -
============ =========== =========== ====
5.1.10.2 Corpus
======= ================== ============================
Tag Regular Expression Definition
======= ================== ============================
NMS M.ms- Numeral, masc.sing.
NFS M.fs- Numeral, fem.sing.
NMP M.mp- Numeral, masc.plur.
NFP M.fp- Numeral, fem.plur.
N Mc--- Numeral cardinal
======= ================== ============================
5.1.10.3 Combinations
========= ========= ===============================
Lexicon Corpus Example
========= ========= ===============================
M.ms- NMS primo
M.fs- NFS prima
M.mp- NMP primi
M.fp- NFP prime
Mc--- N zero, cento
========= ========= ===============================
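The regular expressions in the Corpus table can be applied to lexicon codes directly. The sketch below is an illustration under stated assumptions rather than the tagger's actual lookup: the name `numeral_tag` is hypothetical, the patterns are tried in table order with the first full match winning, and a code matching no pattern yields `None`.

```python
import re

# (regular expression, corpus tag) pairs copied from the Corpus table.
# "." matches any single value; "-" is a literal unused position.
NUMERAL_TAGS = [
    ("M.ms-", "NMS"),
    ("M.fs-", "NFS"),
    ("M.mp-", "NMP"),
    ("M.fp-", "NFP"),
    ("Mc---", "N"),
]


def numeral_tag(code):
    """Return the corpus tag of the first pattern matching the whole code."""
    for pattern, tag in NUMERAL_TAGS:
        if re.fullmatch(pattern, code):
            return tag
    return None  # no pattern in the table covers this code


print(numeral_tag("Moms-"))  # ordinal, masc.sing., e.g. primo
print(numeral_tag("Mc---"))  # cardinal without gender/number, e.g. cento
```

Because the type position is a wildcard in the first four patterns, a cardinal that carries gender and number would receive the same tag as the corresponding ordinal under this reading of the table.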
5.1.11 Interjection (I)
-----------------------
5.1.11.1. Corpus
======= =========== =====================================
Tag Reg. Expr. Definition
======= =========== =====================================
I I Interjection
======= =========== =====================================
5.1.11.2. Combinations
======= =========== =====================================
Lexicon Corpus Example
======= =========== =====================================
I I oh
======= =========== =====================================
5.1.12 Unique membership class (U)
----------------------------------
None
5.1.13. Residual (X)
--------------------
5.1.13.2 Corpus
======= =================== ====================
Tag Regular Expression Definition
======= =================== ====================
NY ??? "Guessed" Noun
AY ??? "Guessed" Adjective
======= =================== ====================
5.1.13.3 Combinations
========= ========= ===============================
Lexicon Corpus Example
========= ========= ===============================
??? NY bit
??? AY computerizzato
========= ========= ===============================
5.1.13.4 Observations
At corpus level, the tag SY is used to mark symbols, letters, acronyms,
foreign words, toponyms etc., and in general unknown words, for which
a "guess" is provided.
5.1.14 Punctuation
========= ============================
Tag Example
========= ============================
punct .,;:?! etc.
========= ============================