In this section the MULTEXT set of lexicon
specifications is applied to
Italian (Calzolari and Monachini 1994).
The language-specific values added for Italian are highlighted with
the code `l-spec'.
Furthermore, a preliminary tagset for Italian is proposed. It is based on the tagset used by our tagger, but also takes into account the criteria expressed above for the construction of the tagset, and the results of a first cycle of experiments with the MULTEXT tagger.
A table giving the translation of each tag into its regular expression, together with its definition, is presented, i.e.

  TAG    Reg.expr.   Definition
  NMS    Ncms-       Common noun, masc.sing.

A table displaying the mapping between lexicon specifications and corpus tags is also provided, along with examples.
5.1.1 Nouns (N)
---------------

5.1.1.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         common      libro       c
             proper      Gianni      p
------------ ----------- ----------- ----
Gender       masculine   uomo        m
             feminine    donna       f
l-spec       common      insegnante  c
------------ ----------- ----------- ----
Number       singular    uomo        s
             plural      donne       p
l-spec       invariant   attivita'   n
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
============ =========== =========== ====

5.1.1.2 Corpus

======= ================== ====================================
Tag     Regular expression Definition
======= ================== ====================================
NMS     Ncms-              Common noun, masc. sing.
NMP     Ncmp-              Common noun, masc. plur.
NMN     Ncmn-              Common noun, masc. invar.
NFS     Ncfs-              Common noun, fem. sing.
NFP     Ncfp-              Common noun, fem. plur.
NFN     Ncfn-              Common noun, fem. invar.
NNS     Nccs-              Common noun, comm. sing.
NNP     Nccp-              Common noun, comm. plur.
NNN     Nccn-              Common noun, comm. invar.
NP      Np..-              Proper noun
======= ================== ====================================

5.1.1.3 Combinations

========= ======= =============================================
Lexicon   Corpus  Example
========= ======= =============================================
Ncms-     NMS     libro
Ncmp-     NMP     libri
Ncmn-     NMN     re, caffe' (il/i)
Ncfs-     NFS     casa
Ncfp-     NFP     case
Ncfn-     NFN     attivita' (la/le)
Nccs-     NNS     insegnante (un/una)
Nccp-     NNP     insegnanti (gli/le)
Nccn-     NNN     sosia (il/la, i/le)
Np..-     NP      Mario, Maria, Borboni
========= ======= =============================================

5.1.1.4 Some observations for the corpus tagset

The idea of the French group to tag Proper Nouns simply with NP (collapsing the information on gender and number) seems the best solution.
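The way a corpus tag is obtained from a lexicon description can be illustrated with the noun tags. The following Python sketch is only an illustration and is not part of the MULTEXT tools: the names NOUN_TAGS and corpus_tag are hypothetical, and it assumes that each regular expression must match the whole lexicon description.

```python
import re

# Corpus tag -> regular expression over lexicon descriptions,
# copied from the noun rows of the corpus table (5.1.1.2).
NOUN_TAGS = [
    ("NMS", r"Ncms-"), ("NMP", r"Ncmp-"), ("NMN", r"Ncmn-"),
    ("NFS", r"Ncfs-"), ("NFP", r"Ncfp-"), ("NFN", r"Ncfn-"),
    ("NNS", r"Nccs-"), ("NNP", r"Nccp-"), ("NNN", r"Nccn-"),
    ("NP", r"Np..-"),
]


def corpus_tag(lexicon_code):
    """Return the first corpus tag whose expression matches the whole code, else None."""
    for tag, pattern in NOUN_TAGS:
        if re.fullmatch(pattern, lexicon_code):
            return tag
    return None


if __name__ == "__main__":
    print(corpus_tag("Ncms-"))  # description of "libro"
    print(corpus_tag("Npfs-"))  # a proper noun: gender and number are collapsed
```

Because the proper noun expression uses "." for Gender and Number, every proper noun description maps to the single tag NP, which is the collapsing discussed in 5.1.1.4.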
5.1.2 Verb (V) -------------- 5.1.2.1 Lexicon ============ =========== =========== ==== Attribute Value Example Code ============ =========== =========== ==== Type main amare m auxiliary avere a ------------ ----------- ----------- ---- Mood/VForm indicative amo i subjunctive ami s imperative ama m conditional amerei c infinitive amare n participle amato p gerund amando g ------------ ----------- ----------- ---- Tense present amo p imperfect amavo i future amero' f past amai s ------------ ----------- ----------- ---- Person first amo 1 second ami 2 third ama 3 ------------ ----------- ----------- ---- Number singular amo s plural amiamo p ------------ ----------- ----------- ---- Gender masculine amato m feminine amata f l-spec common amante c ============ =========== =========== ==== 5.1.2.2 Corpus ========= ======================= ====================================== Tag Regular Expression Definition ========= ======================= ====================================== VAS1IP Vaip1s- Aux. Verb, 1st pers.sing., pres.indic. VAS2IP Vaip2s- Aux. Verb, 2nd pers.sing., pres.indic. VAS3IP Vaip3s- Aux. Verb, 3rd pers.sing., pres.indic. VAP2IP Vaip2p- Aux. Verb, 2nd pers.plur., pres.indic. VAP3IP Vaip3p- Aux. Verb, 3rd pers.plur., pres.indic. VAP1ICP Va[is]p1p- Aux. Verb, 1stpers.plur.,pres.indic/cong VAY^2IP Vaip(1s|3p)- Aux. Verb, 1st sing./3rd plur., pres.indic VAS1II Vaii1s- Aux. Verb, 1st pers.sing., impf.indic. VAS2II Vaii2s- Aux. Verb, 2nd pers.sing., impf.indic. VAS3II Vaii3s- Aux. Verb, 3rd pers.sing., impf.indic. VAP1II Vaii1p- Aux. Verb, 1st pers.plur., impf.indic. VAP2II Vaii2p- Aux. Verb, 2nd pers.plur., impf.indic. VAP3II Vaii3p- Aux. Verb, 3rd pers.plur., impf.indic. VAS1IF Vaif1s- Aux. Verb, 1st pers.sing., fut. indic. VAS2IF Vaif2s- Aux. Verb, 2nd pers.sing., fut. indic. VAS3IF Vaif3s- Aux. Verb, 3rd pers.sing., fut. indic. VAP1IF Vaif1p- Aux. Verb, 1st pers.plur., fut. indic. VAP2IF Vaif2p- Aux. Verb, 2nd pers.plur., fut. indic. 
VAP3IF Vaif3p- Aux. Verb, 3rd pers.plur., fut. indic. VAS1IR Vais1s- Aux. Verb, 1st pers.sing., past indic. VAS2IR Vais2s- Aux. Verb, 2nd pers.sing., past indic. VAS3IR Vais3s- Aux. Verb, 3rd pers.sing., past indic. VAP1IR Vais1p- Aux. Verb, 1st pers.plur., past indic. VAP3IR Vais3p- Aux. Verb, 3rd pers.plur., past indic. VAP2ICR Va(is)|(si)2p- Aux. Verb, 2nd p.pl., past indic./pres.cong VASXCP Vacp.s- Aux. Verb, 1/2/3 p. sing., pres.subj. VAP2CMP Va[sm]p2p- Aux. Verb, 2nd pers.plur., pres.subj./imper. VAP3CP Vasp3p- Aux. Verb, 3rd pers.plur., pres.subj. VAS^3CI Vasi^3s- Aux. Verb, 1/2 pers.sing., impf.subj. VAS3CI Vasi3s- Aux. Verb, 3rd pers.sing., impf.subj. VAP1CI Vasi1p- Aux. Verb, 1st pers.plur., impf.subj. VAP3CI Vasi3p- Aux. Verb, 3rd pers.plur., impf.subj. VAS2MP Vamp2s- Aux. Verb, 2nd pers.sing., pres.impr. VAS2MPE Vamp2s-y Aux. Verb, 2nd pers.sing., pres.impr. + clit. VAP2MPE Vamp2p-y Aux. Verb, 2nd pers.plur., pres.impr. + clit. VAS1DP Vacp1s- Aux. Verb, 1st pers.sing., pres.cond. VAS2DP Vacp2s- Aux. Verb, 2nd pers.sing., pres.cond. VAS3DP Vacp3s- Aux. Verb, 3rd pers.sing., pres.cond. VAP1DP Vacp1p- Aux. Verb, 1st pers.plur., pres.cond. VAP2DP Vacp2p- Aux. Verb, 2nd pers.plur., pres.cond. VAP3DP Vacp3p- Aux. Verb, 3rd pers.plur., pres.cond. VAF Vanp--- Aux. Verb, infinitive VAFE Vanp--cy Aux. Verb, infinitive + clitic VANSPP Vapp-sc Aux. Verb, comm.sing., pres.part. VANPPP Vapp-pc Aux. Verb, comm.plur., pres.part. VAMSPR Vaps-sm Aux. Verb, masc.sing., past part. VAMPPR Vaps-pm Aux. Verb, masc.plur., past part. VAFSPR Vaps-sf Aux. Verb, femm.sing., past part. VAFPPR Vaps-pf Aux. Verb, femm.plur., past part. VAMSPRE Vaps-smy Aux. Verb, masc.sing., past part. + clitic VAMPPRE Vaps-pmy Aux. Verb, masc.plur., past part. + clitic VAFSPRE Vaps-sfy Aux. Verb, femm.sing., past part. + clitic VAFPPRE Vaps-pfy Aux. Verb, femm.plur., past part. + clitic VAG Vagp--- Aux. Verb, gerund VAGE Vagp---y Aux. 
Verb, gerund + clitic VS1IP Vmip1s- Main Verb, 1st pers.sing., pres.indic VS3IP Vmip3s- Main Verb, 3rd pers.sing., pres.indic VP3IP Vmip3p- Main Verb, 3rd pers.plur., pres.indic VP1ICP Vm[is]p1p Main Verb,1stpers.plur.,pres.indic/cong VP2IMPP Vm([im]p2p-)|(ps-pf) M.V., 2nd pl., pres.indic/imper|pstprt f.pl. VP2IMP Vm([im]p2p)- Main Verb, 2nd pl., pres.indic/imper VSXICP Vm(sp.s)|(ip2s)- M.V., 1/2/3 sg.,pres.subj.|2ndsg. pres.indic. VS^1IMP Vm[im]^1s- Main Verb, not 1stsg.,pres.indic./imper. VS2IMP Vm[im]p2s- Main Verb, 2nd sg., pres.indic/imper VP2IMCPP Vm([ims]p2p-)|(ps-pf) M.V., 2pl., pr.ind/imp/sub|pst.prt f.pl. VS1II Vmii1s- Main Verb, 1st pers.sing., impf.indic. VS2II Vmii2s- Main Verb, 2nd pers.sing., impf.indic. VS3II Vmii3s- Main Verb, 3rd pers.sing., impf.indic. VP1II Vmii1p- Main Verb, 1st pers.plur., impf.indic. VP2II Vmii2p- Main Verb, 2nd pers.plur., impf.indic. VP3II Vmii3p- Main Verb, 3rd pers.plur., impf.indic. VS1IF Vmif1s- Main Verb, 1st pers.sing., fut. indic. VS2IF Vmif2s- Main Verb, 2nd pers.sing., fut. indic. VS3IF Vmif3s- Main Verb, 3rd pers.sing., fut. indic. VP1IF Vmif1p- Main Verb, 1st pers.plur., fut. indic. VP2IF Vmif2p- Main Verb, 2nd pers.plur., fut. indic. VP3IF Vmif3p- Main Verb, 3rd pers.plur., fut. indic. VS1IR Vmis1s- Main Verb, 1st pers.sing., past indic. VS2IR Vmis2s- Main Verb, 2nd pers.sing., past indic. VS3IR Vmis3s- Main Verb, 3rd pers.sing., past indic. VP1IR Vmis1p- Main Verb, 1st pers.plur., past indic. VP3IR Vmis3p- Main Verb, 3rd pers.plur., past indic. VP2ICR Vm(is)|(si)2p- Main Verb, 2nd p.pl., past indic./pres.subj. VP2CP Vmsp2p- Main Verb, 2nd pers.plur., pres.subj. amiate VP3CP Vmsp3p- Main Verb, 3rd pers.plur., pres.subj. amino VSXCP Vmcp.s- Main Verb, 1/2/3 p. sing., pres.subj. VS^3CI Vmsi^3s- Main Verb, 1/2 pers.sing., impf.subj. VS3CI Vmsi3s- Main Verb, 3rd pers.sing., impf.subj. VP1CI Vmsi1p- Main Verb, 1st pers.plur., impf.subj. VP3CI Vmsi3p- Main Verb, 3rd pers.plur., impf.subj. 
VS2MPE Vmmp2s-y Main Verb, 2nd pers.sing., pres.impr. + clit. VP2MPE Vmmp2p-y Main Verb, 2nd pers.plur., pres.impr. + clit. VS1DP Vmcp1s- Main Verb, 1st pers.sing., pres.cond. VS2DP Vmcp2s- Main Verb, 2nd pers.sing., pres.cond. VS3DP Vmcp3s- Main Verb, 3rd pers.sing., pres.cond. VP1DP Vmcp1p- Main Verb, 1st pers.plur., pres.cond. VP2DP Vmcp2p- Main Verb, 2nd pers.plur., pres.cond. VP3DP Vmcp3p- Main Verb, 3rd pers.plur., pres.cond. VF Vmnp--- Main Verb, infinitive VFE Vmnp---y Main Verb, infinitive + clitic VNSPP Vmpp-sc Main Verb, comm.sing., pres.part. VNPPP Vmpp-pc Main Verb, comm.plur., pres.part. VMSPR Vmps-sm Main Verb, masc.sing., past part. VMPPR Vmps-pm Main Verb, masc.plur., past part. VFSPR Vmps-sf Main Verb, femm.sing., past part. VFPPR Vmps-pf Main Verb, femm.plur., past part. VMSPRE Vmps-smy Main Verb, masc.sing., past part. +c VMPPRE Vmps-pmy Main Verb, masc.plur., past part. +c VFSPRE Vmps-sfy Main Verb, femm.sing., past part. +c VFPPRE Vmps-pfy Main Verb, femm.plur., past part. +c VG Vmgp--- Main Verb, gerund VGE Vmgp---y Main Verb, gerund + clitic -------------------- more collapsed tagset ----------------------- VA1P Va[iscm][pifs]1p-- Aux. verb, 1st person plur. VA1S Va[iscm][pifs]1s-- Aux. verb, 1st person sing. VA2P Va[iscm][pifs]2p-- Aux. verb, 2nd person plur. VA2S Va[iscm][pifs]2s-- Aux. verb, 2nd person sing. VA3P Va[iscm][pifs]3p-- Aux. verb, 3rd person plur. VA3S Va[iscm][pifs]3s-- Aux. verb, 3rd person sing. VAFPPS Vaps-pf- Aux. verb, fem. plur., past part. VAFSPS Vaps-sf- Aux. verb, fem. sing., past part. VAMPPS Vaps-pm- Aux. verb, masc. plur., past part. VAMSPS Vaps-sm- Aux. verb, masc. sing., past part. VAN Vanp---- Aux. verb, infinitive VAFE Vanp---- Aux. Verb, infinitive + enclitic VAG Vagp---- Aux. Verb, gerund VAGE Vagp---- Aux. Verb, gerund + enclitic VAPP Vapp-..- Aux. verb, pres. participle V1P Vm[iscm][pifs]1p-- Main Verb, 1st person plur. V1S Vm[iscm][pifs]1s-- Main Verb, 1st person sing. 
V2P Vm[iscm][pifs]2p-- Main Verb, 2nd person plur. V2S Vm[iscm][pifs]2s-- Main Verb, 2nd person sing. V3P Vm[iscm][pifs]3p-- Main Verb, 3rd person plur. V3S Vm[iscm][pifs]3s-- Main Verb, 3rd person sing. VFPPS Vmps-pf- Main Verb, fem. plur., past part. VFSPS Vmps-sf- Main Verb, fem. sing., past part. VMPPS Vmps-pm- Main Verb, masc. plur., past part. VMSPS Vmps-sm- Main Verb, masc. plur., past part. VF Vmnp---- Main Verb, infinitive VFE Vmnp----y Main Verb, infinitive + enclitic VG Vmgp---- Main Verb, gerund VGE Vmgp----y Main Verb, gerund + enclitic VPP Vmpp-..- Main Verb, pres. participle --------------------- more collapsed end ---------------------- ====== =================== =================================== 5.1.2.3 Combinations ============ ======== ============================================= Lexicon Corpus Example ============ ======== ============================================= Vaip1s- +++ VAS1IP ho Vaip2s- VAS2IP hai, sei Vaip3s- VAS3IP ha, e' Vaip2p- VAP2IP avete, siete Vaip3p- +++ VAP3IP hanno Vaip1p- VAP1ICP abbiamo, siamo Vaip1s- +++ VAY^2IP sono Vaip3p- +++ VAY^2IP sono Vaii1s- VAS1II avevo, ero Vaii2s- VAS2II avevi, eri Vaii3s- VAS3II aveva, era Vaii1p- VAP1II avevamo, eravamo Vaii2p- VAP2II avevate, eravate Vaii3p- VAP3II avevano, erano Vaif1s- VAS1IF avro', saro' Vaif2s- VAS2IF avrai, sarai Vaif3s- VAS3IF avra', sara' Vaif1p- VAP1IF avremo, saremo Vaif2p- VAP2IF avrete, sarete Vaif3p- VAP3IF avranno, saranno Vais1s- VAS1IR ebbi, fui Vais2s- VAS2IR avesti, fosti Vais3s- VAS3IR ebbe, fu Vais1p- VAP1IR avemmo, fummo Vais3p- VAP3IR ebbero, furono Vais2s- VAP2ICR aveste, foste Vasp1s- VASXCP abbia, sia Vasp2s- VASXCP abbia, sia Vasp3s- VASXCP abbia, sia Vasp1p- VAP1ICP abbiamo, siamo Vasp2p- VAP2CMP abbiate, siate Vasp3p- VAP3CP abbiano, siano Vasi1s- VAS^3CI avessi, fossi Vasi2s- VAS^3CI avessi, fossi Vasi3s- VAS3CI avesse, fosse Vasi1p- VAP1CI avessimo, fossimo Vasi2s- VAP2ICR aveste, foste Vasi3p- VAP3CI avessero, fossero Vamp2s- VAS2MP abbi, 
sii Vamp2s-y VAS2MPE abbilo, siilo Vamp2p- VAP2CMP abbiate, siate Vamp2p-y VAP2MPE abbiatelo, siatelo Vacp1s- VAS1DP avrei, sarei Vacp2s- VAS2DP avresti, saresti Vacp3s- VAS3DP avrebbe, sarebbe Vacp1p- VAP1DP avremmo, saremmo Vacp2p- VAP2DP avreste, sareste Vacp3p- VAP3DP avrebbero, sarebbero Vanp--- VAF avere, essere Vanp--cy VAFE averlo, esserlo Vapp-sc VANSPP avente, essente Vapp-pc VANPPP aventi, essenti Vaps-sm VAMSPR avuto, stato Vaps-pm VAMPPR avuti, stati Vaps-sf VAFSPR avuta, stata Vaps-pf VAFPPR avute, state Vaps-smy VAMSPRE avutolo Vaps-pmy VAMPPRE avutili Vaps-sfy VAFSPRE avutala Vaps-pfy VAFPPRE avuteli Vagp--- VAG avendo, essendo Vagp---y VAGE avendolo, essendolo Vmip1s- VS1IP amo, leggo, servo Vmip2s- +++ VSXICP ami Vmip2s- +++ VS2IMP leggi, servi Vmip3s- --- VS^1IMP ama Vmip3s- --- VS3IP legge, serve Vmip1p- VP1ICP amiamo, leggiamo, serviamo Vmip2p- *** VP2IMPP amate, servite Vmip2p- *** VP2IMP leggete Vmip2p- *** VP2IMCPP premiate Vmip3p- VP3IP amano, leggono, servono Vmii1s- VS1II amavo, Vmii2s- VS2II amavi, Vmii3s- VS3II amava Vmii1p- VP1II amavano Vmii2p- VP2II amavate Vmii3p- VP3II amavano Vmif1s- VS1IF amero' Vmif2s- VS2IF amerai Vmif3s- VS3IF amera' Vmif1p- VP1IF ameremo Vmif2p- VP2IF amerete Vmif3p- VP3IF ameranno Vmis1s- VS1IR amai Vmis2s- VS2IR amasti Vmis3s- VS3IR amo' Vmis1p- VP1IR amammo Vmis2p- VP2ICR amaste, leggeste, serviste Vmis3p- VP3IR amarono Vmsp1s- +++ VSXCP legga Vmsp1s- +++ VSXICP ami Vmsp2s- --- VSXCP legga Vmsp2s- --- VSXICP ami Vmsp3s- *** VSXCP legga Vmsp3s- *** VSXICP ami Vmsp1p- VP1ICP amiamo, leggiamo, serviamo Vmsp2p- """ VP2CP amiate, leggiate, serviate Vmsp2p- """ VP2ICMPP premiate Vmsp3p- VP3CP amino, leggano, servano Vmsi1s- VS^3CI amassi, leggessi, servissi Vmsi2s- VS^3CI amassi, leggessi, servissi Vmsi3s- VS3CI amasse, leggesse, servisse Vmsi1p- VP1CI amassimo Vmsi2p- VP2ICR amaste, leggeste, serviste Vmsi3p- VP3CI amassero Vmmp2s- +++ VS^1IMP ama Vmmp2s- +++ VS2IMP leggi, servi Vmmp2p- --- VP2IMPP amate 
Vmmp2p- --- VP2IMP leggete, servite Vmmp2p- --- VP2IMCPP premiate Vmmp2s-y VS2MPe amalo, leggilo, servilo Vmmp2p-y VP2MPe amatelo, leggetelo, servitelo Vmcp1s- VS1DP amerei Vmcp2s- VS2DP ameresti Vmcp3s- VS3DP amarebbe Vmcp1p- VP1DP ameremmo Vmcp2p- VP2DP amereste Vmcp3p- VP3DP amerebero Vmnp--- VF amare Vmnp---y VFE amarlo Vmpp-sc VNSPP amante Vmpp-pc VNPPP amanti Vmps-sm VMSPR amato, letto, servito Vmps-pm VMPPR amati, letti, serviti Vmps-sf VFSPR amata, letta, servita Vmps-pf +++ VP2IMCPP premiate Vmps-pf +++ VP2IMPP amate, servite Vmps-pf +++ VFPPR lette Vmps-smy VMSPRE amatolo Vmps-pmy VMPPRE amatili Vmps-sfy VFSPRE amatala Vmps-pfy VFPPRE amatele Vmgp--- VG amando Vmgp---y VGE amandolo ----------------------- more collapsed tagset ------------------ Vaip1s- VA1S ho Vaip2s- VA2S hai Vaip3s- VA3S ha Vaip1p- VA1P abbiamo Vaip2p- VA2P avete Vaip3p- VA3P hanno Vaii1s- VA1S avevo Vaii2s- VA2S avevi Vaii3s- VA3S aveva Vaii1p- VA1P avevamo Vaii2p- VA2P avevate Vaii3p- VA3P avevano Vaif1s- VA1S avro' Vaif2s- VA2S avrai Vaif3s- VA3S avra' Vaif1p- VA1P avremo Vaif2p- VA2P avrete Vaif3p- VA3P avranno Vais1s- VA1S ebbi Vais2s- VA2S avesti Vais3s- VA3S ebbe Vais1p- VA1P avemmo Vais2p- VA2P aveste Vais3p- VA3P ebbero Vasp1s- VA1S abbia Vasp2s- VA2S abbia Vasp3s- VA3S abbia Vasp1p- VA1P abbiamo Vasp2p- VA2P abbiate Vasp3p- VA3P abbiano Vasi1s- VA1S avessi Vasi2s- VA2S avessi Vasi3s- VA3S avesse Vasi1p- VA1P avessimo Vasi2p- VA2P aveste Vasi3p- VA3P avessero Vamp2s- VA2S abbi Vamp2p- VA2P abbiate Vacp1s- VA1S avrei Vacp2s- VA2S avresti Vacp3s- VA3S avrebbe Vacp1p- VA1P avremmo Vacp2p- VA2P avreste Vacp3p- VA3P avrebbero Vanp--- VAF avere Vanp---y VAFE averlo Va-cspp VANSPP avente Va-cppp VANPPP aventi Va-msps VAMSPR avuto Va-mpps VAMPPR avuti Va-fsps VAFSPR avuta Va-fpps VAFPPR avute Va-gp-- VAG avendo Va-gp--y VAGE avendolo Vmip1s- V1S amo Vmip2s- V2S ami Vmip3s- V3S ama Vmip1p- V1P amiamo Vmip2p- V2P amate Vmip3p- V3P amano Vmii1s- V1S amavo Vmii2s- V2S amavi Vmii3s- V3S 
amava Vmii1p- V1P amavamo Vmii2p- V2P amavate Vmii3p- V3P amavano Vmif1s- V1S amero' Vmif2s- V2S amerai Vmif3s- V3S amera' Vmif1p- V1P ameremo Vmif2p- V2P amerete Vmif3p- V3P ameranno Vmis1s- V1S amai Vmis2s- V2S amasti Vmis3s- V3S amo' Vmis1p- V1P amammo Vmis2p- V2P amaste Vmis3p- V3P amarono Vmsp1s- V1S ami Vmsp2s- V2S ami Vmsp3s- V3S ami Vmsp1p- V1P amiamo Vmsp2p- V2P amiate Vmsp3p- V3P amino Vmsi1s- V1S amassi Vmsi2s- V2S amassi Vmsi3s- V3S amasse Vmsi1p- V1P amassimo Vmsi2p- V2P amaste Vmsi3p- V3P amassero Vmmp2s- V2S ama Vmmp2p- V2P amate Vmcp1s- V1S amerei Vmcp2s- V2S ameresti Vmcp3s- V3S amerebbe Vmcp1p- V1P ameremmo Vmcp2p- V2P amereste Vmcp3p- V3P amerebbero Vmnp--- VF amare Vmnp---y VFE amarlo Vm-cspp VNSPP amante Vm-cppp VNPPP amanti Vm-msps VMSPR amato Vm-mpps VMPPR amati Vm-fsps VFSPR amata Vm-fpps VFPPR amate Vm-gp-- VG amando Vm-gp--y VGE amandolo ========= ======= ======================================== 5.1.2.4 Some observations for corpus tagset An observation concerns the special marking for the auxiliaries: the taggers are in general not able to disambiguate the cases in which the auxiliaries are used as full verbs ("io ho un cane" , "i bambini sono nel prato") from the cases when they are auxiliaries. The distinction of the auxiliaries is used only in order to isolate 'avere' and 'essere' from the other verbs. For verbs, two different sets of tags are proposed, the first more fine-grained for more accurate distinctions and the latter more coarse-grained, which follows the approach proposed by the French group. The collapsing proposed by the French group of Moods and Tenses, if considered wrt to the performances of our tagger, appears restrictive: for many unambiguous tenses and moods, the Italian tagger is able to formulate the correct analysis (e.g. conditional, subjunctive imperfect, indicative past etc.) and these distinctions are, in our opinion, worth being maintained. 
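The coarse-grained tags for finite verb forms keep only the auxiliary/main distinction, the person and the number, so they can be derived mechanically from the lexicon description. A minimal Python sketch, assuming the position order category, Type, Mood, Tense, Person, Number that appears in the combinations table; collapse_finite is a hypothetical helper and not part of our tagger.

```python
FINITE_MOODS = "iscm"  # indicative, subjunctive, conditional, imperative


def collapse_finite(code):
    """Map a finite verb description such as 'Vmip1s-' to a coarse tag such as 'V1S'.

    Returns None for anything that is not a finite verb description
    (infinitives, participles, gerunds, other categories).
    """
    if len(code) < 6 or code[0] != "V" or code[1] not in "am":
        return None
    mood, person, number = code[2], code[4], code[5]
    if mood not in FINITE_MOODS or person not in "123" or number not in "sp":
        return None
    prefix = "VA" if code[1] == "a" else "V"
    return prefix + person + number.upper()


if __name__ == "__main__":
    for code in ("Vaip1s-", "Vmsi3p-", "Vmmp2s-", "Vmnp---"):
        print(code, collapse_finite(code))
```

In this sketch the mood and tense positions are read but then discarded, which is exactly the information that the fine-grained tagset retains.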
It should be noted that the ambiguities between verb forms also depend on the individual lexical verb. In Italian, the major ambiguities concern the 2nd person singular and plural of the present indicative and imperative (ama-amate; leggi-leggete). However, this is again not a general rule. Another very common ambiguity is between the 2nd person of the indicative and the 1st, 2nd and 3rd persons of the present subjunctive. It is therefore not always possible to decide unambiguously on the person. Some of the more frequent typical homographs in Italian are listed below:

  VP1ICP     amiamo
  VP2IMP     leggete
  VP2IMPP    amate
  VP2IMCPP   premiate
  VP2ICR     amaste
  VS^3CI     amassi
  VSXCP      legga
  VAY^2IP    sono
  VSXICP     ami
  VS2IMP     leggi
  VS^1IMP    ama

In the design of corpus tagsets for verbs, careful attention should be given to enclitics: at present our tagger is able to recognize the presence of clitics, which is signalled by the addition of the mark "+E" (plus clitic) to the regular verb tag.
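The homograph tags can be read as names for sets of lexicon descriptions that a single word form may carry. The Python sketch below illustrates this reading with a few sets taken from the combinations table; the table AMBIGUITY_TAGS and the function ambiguity_tag are hypothetical names introduced here, and the lookup by exact set is an assumption about how such a table could be used, not a description of our tagger.

```python
# Set of lexicon descriptions shared by one word form -> corpus tag.
# The sets are read off the combinations table (5.1.2.3).
AMBIGUITY_TAGS = {
    frozenset({"Vmip1p-", "Vmsp1p-"}): "VP1ICP",                        # amiamo
    frozenset({"Vmip2s-", "Vmmp2s-"}): "VS2IMP",                        # leggi
    frozenset({"Vmip3s-", "Vmmp2s-"}): "VS^1IMP",                       # ama
    frozenset({"Vmip2s-", "Vmsp1s-", "Vmsp2s-", "Vmsp3s-"}): "VSXICP",  # ami
    frozenset({"Vmis2p-", "Vmsi2p-"}): "VP2ICR",                        # amaste
    frozenset({"Vaip1s-", "Vaip3p-"}): "VAY^2IP",                       # sono
}


def ambiguity_tag(analyses):
    """Return the tag naming exactly this set of analyses, or None if no tag is listed."""
    return AMBIGUITY_TAGS.get(frozenset(analyses))


if __name__ == "__main__":
    print(ambiguity_tag(["Vmip3s-", "Vmmp2s-"]))  # analyses of "ama"
    print(ambiguity_tag(["Vmip1s-"]))             # unambiguous form: not in this table
```

The example also shows why the ambiguity depends on the lexical verb: "ami" and "leggi" share the description Vmip2s- but belong to different sets and therefore receive different tags.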
5.1.3 Adjectives (A)
--------------------

5.1.3.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         -           -           -
------------ ----------- ----------- ----
Degree       positive    buono       p
             comparative migliore    c
             superlative buonissimo  s
------------ ----------- ----------- ----
Gender       masculine   buono       m
             feminine    buona       f
l-spec       common      dolce       c
------------ ----------- ----------- ----
Number       singular    buono       s
             plural      buoni       p
l-spec       invariant   pari        n
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
============ =========== =========== ====

5.1.3.2 Corpus

======= ================== ====================================
Tag     Regular expression Definition
======= ================== ====================================
AFP     A-.fp-             Adjective fem. plur.
AFS     A-.fs-             Adjective fem. sing.
AFN     A-.fn-             Adjective fem. invar.
AMP     A-.mp-             Adjective masc. plur.
AMS     A-.ms-             Adjective masc. sing.
AMN     A-.mn-             Adjective masc. invar.
ANP     A-.cp-             Adjective comm. plur.
ANS     A-.cs-             Adjective comm. sing.
ANN     A-.cn-             Adjective comm. invar.
======= ================== ====================================

5.1.3.3 Combinations

========= ======= =============================================
Lexicon   Corpus  Example
========= ======= =============================================
A-pms-    AMS     vero
A-pmp-    AMP     veri
A-pmn-    AMN     oggetto (complemento/i oggetto: grammatical language)
A-pfs-    AFS     vera
A-pfp-    AFP     vere
A-pfn-    AFN     valore (clausola valore: juridical language)
A-pcs-    ANS     dolce (biscotto, torta)
A-pcp-    ANP     dolci (biscotti, dolci)
A-pcn-    ANN     pari (risultato/i, somma/e)
A-sms-    AMS     verissimo
A-smp-    AMP     verissimi
A-sfs-    AFS     verissima
A-sfp-    AFP     verissime
========= ======= =============================================

5.1.3.4 Observations

The comparative Degree applies only to a closed set of adjectives (e.g. maggiore, migliore, etc.). All other adjectives form their comparatives with "piu'" + adjective (e.g., piu' forte).

The superlative also has an analytic form (il piu' forte), but can also be formed synthetically: grandissimo, massimo.
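The positional encoding of the adjective descriptions can be illustrated with a small decoder. This Python sketch is not part of the MULTEXT tools; the function decode_adjective and the dictionary names are hypothetical, and the position order (category, Type, Degree, Gender, Number, Case) is inferred from the examples in 5.1.3.3.

```python
# Code -> value tables, copied from the adjective lexicon table (5.1.3.1).
DEGREE = {"p": "positive", "c": "comparative", "s": "superlative"}
GENDER = {"m": "masculine", "f": "feminine", "c": "common"}
NUMBER = {"s": "singular", "p": "plural", "n": "invariant"}


def decode_adjective(code):
    """Decode a six-character adjective description such as 'A-pms-'."""
    if len(code) != 6 or code[0] != "A":
        raise ValueError("not an adjective description: %r" % code)
    # Positions 1 (Type) and 5 (Case) are not used for Italian and hold '-'.
    return {
        "degree": DEGREE[code[2]],
        "gender": GENDER[code[3]],
        "number": NUMBER[code[4]],
    }


if __name__ == "__main__":
    print(decode_adjective("A-pms-"))  # description of "vero"
    print(decode_adjective("A-pcn-"))  # description of "pari"
```

The corpus tags AMS, AFP, ANN and so on keep only the gender and number returned here; the degree is matched by "." in the regular expressions and is therefore dropped.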
5.1.4. Pronouns
---------------

5.1.4.1. Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         personal    io          p
             demonstrat. quello      d
             indefinite  chiunque    i
             possessive  mio         s
             interrog.   chi         t
             relative    che         r
             exclamative quanto      e
------------ ----------- ----------- ----
Person       first       io          1
             second      tu          2
             third       egli        3
------------ ----------- ----------- ----
Gender       masculine   questo      m
             feminine    questa      f
l-spec       common      io          c
------------ ----------- ----------- ----
Number       singular    questo      s
             plural      questi      p
l-spec       invariant   che         n
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
------------ ----------- ----------- ----
Possessor    -           -           -
============ =========== =========== ====

5.1.4.2 Corpus

======= =========== ==========================================
Tag     Reg.Expr.   Definition
======= =========== ==========================================
PDMS    Pd-ms--     Demonstrative pronoun masc.sing.
PDMP    Pd-mp--     Demonstrative pronoun masc.plur.
PDFS    Pd-fs--     Demonstrative pronoun fem.sing.
PDFP    Pd-fp--     Demonstrative pronoun fem.plur.
PDNS    Pd-cs--     Demonstrative pronoun comm.sing.
PDNP    Pd-cp--     Demonstrative pronoun comm.plur.
PIMS    Pi-ms--     Indefinite pronoun masc.sing.
PIMP    Pi-mp--     Indefinite pronoun masc.plur.
PIFS    Pi-fs--     Indefinite pronoun fem.sing.
PIFP    Pi-fp--     Indefinite pronoun fem.plur.
PINS    Pi-cs--     Indefinite pronoun comm.sing.
PINP    Pi-cp--     Indefinite pronoun comm.plur.
PPMS    Ps.ms--     Possessive pronoun, masc.sing.
PPMP    Ps.mp--     Possessive pronoun, masc.plur.
PPFS    Ps.fs--     Possessive pronoun, fem.sing.
PPFP    Ps.fp--     Possessive pronoun, fem.plur.
PPNP    Ps.cp--     Possessive pronoun, comm.plur.
PWNS    P[tre]-cs-- Interr./Rel./Excl. pronoun, comm.sing.
PWNP    P[tre]-cp-- Interr./Rel./Excl. pronoun, comm.plur.
PWNN    P[tre]-cn-- Interr./Rel./Excl. pronoun, comm.inv.
PWMS    P[tre]-ms-- Interr./Rel./Excl. pronoun, masc.sing.
PWMP    P[tre]-mp-- Interr./Rel./Excl. pronoun, masc.plur.
PWFS    P[tre]-fs-- Interr./Rel./Excl. pronoun, fem.sing.
PWFP    P[tre]-fp-- Interr./Rel./Excl. pronoun, fem.plur.
PQNS1   Pp1cs--     Personal pronoun, 1st pers., comm.sing.
PQNS2   Pp2cs--     Personal pronoun, 2nd pers., comm.sing.
PQMS3   Pp3ms--     Personal pronoun, 3rd pers., masc.sing.
PQFS3   Pp3fs--     Personal pronoun, 3rd pers., fem.sing.
PQNN3   Pp3cn--     Personal pronoun, 3rd pers., comm.inv.
PQNP1   Pp1cp--     Personal pronoun, 1st pers., comm.plur.
PQNP2   Pp2cp--     Personal pronoun, 2nd pers., comm.plur.
PQNP3   Pp3cp--     Personal pronoun, 3rd pers., comm.plur.
PQMP3   Pp3mp--     Personal pronoun, 3rd pers., masc.plur.
PQFP3   Pp3fp--     Personal pronoun, 3rd pers., fem.plur.
--------- more collapsed ----------------
PFP     P..fp---    Pronoun, fem. plur.
PFS     P..fs---    Pronoun, fem. sing.
PMP     P..mp---    Pronoun, masc. plur.
PMS     P..ms---    Pronoun, masc. sing.
PNS     P..cs---    Pronoun, comm. sing.
PNP     P..cp---    Pronoun, comm. plur.
PNN     P..cn---    Pronoun, comm. inv.
---------- more collapsed end -----------
======= =========== ==========================================

5.1.4.3 Combinations

========= ======= =============================================
Lexicon   Corpus  Example
========= ======= =============================================
Pd-ms--   PDMS    quello, costui
Pd-mp--   PDMP    quelli
Pd-fs--   PDFS    quella
Pd-fp--   PDFP    quelle
Pd-cs--   PDNS    cio'
Pd-cp--   PDNP    coloro
Pi-ms--   PIMS    ognuno
Pi-mp--   PIMP    alcuni
Pi-fs--   PIFS    ognuna
Pi-fp--   PIFP    alcune
Pi-cs--   PINS    chiunque, tale
Pi-cp--   PINP    tali
Ps1ms--   PPMS    mio, nostro
Ps1mp--   PPMP    miei
Ps1fs--   PPFS    mia
Ps1fp--   PPFP    mie
Ps2ms--   PPMS    tuo, vostro
Ps2mp--   PPMP    tuoi
Ps2fs--   PPFS    tua
Ps2fp--   PPFP    tue
Ps3ms--   PPMS    suo
Ps3mp--   PPMP    suoi
Ps3fs--   PPFS    sua
Ps3fp--   PPFP    sue
Ps3cp--   PPNP    loro
Pt-cs--   PWNS    chi? quale?
Pt-cp--   PWNP    quali?
Pt-cn--   PWNN    che?
Pt-ms--   PWMS    quanto?
Pt-mp--   PWMP    quanti?
Pt-fs--   PWFS    quanta?
Pt-fp--   PWFP    quante?
Pr-cn--   PWNN    cui
Pr-ms--   PWMS    quanto
Pr-mp--   PWMP    quanti
Pr-fs--   PWFS    quanta
Pr-fp--   PWFP    quante
Pr-cs--   PWNS    chi, quale
Pr-cp--   PWNP    quali
Pe-ms--   PWMS    quanto!
Pe-mp--   PWMP    quanti!
Pe-fs--   PWFS    quanta!
Pe-fp--   PWFP    quante!
Pe-cs--   PWNS    quale!
Pe-cp--   PWNP    quali!
Pe-cn--   PWNN    che!
Pp1cs--   PQNS1   io, me, mi
Pp2cs--   PQNS2   tu, te, ti
Pp3ms--   PQMS3   egli, lui, esso, gli, lo
Pp3fs--   PQFS3   ella, lei, essa, le, la
Pp3cn--   PQNN3   si
Pp1cp--   PQNP1   noi, ci
Pp2cp--   PQNP2   voi, vi
Pp3cp--   PQNP3   loro
Pp3mp--   PQMP3   essi, li
Pp3fp--   PQFP3   esse, le
--------------------- more collapsed -----------------------------
P..fp---  PFP     mie, queste, quante etc.
P..fs---  PFS     mia, questa, quanta etc.
P..mp---  PMP     miei, questi, quanti etc.
P..ms---  PMS     mio, questo, quanto etc.
P..cs---  PNS     quale
P..cp---  PNP     quali
P..cn---  PNN     che, cui, altrui
-------------------- more collapsed end --------------------
========= ======= =============================================

5.1.4.4 Observations

For pronouns, the strategy of proposing two different tagsets, one more collapsed and the other more fine-grained, is followed.

As far as the pronominal paradigm is concerned, Case is not encoded at present in our DMI (Calzolari et al. 1983). Personal pronouns are not lemmatized: 'gli' is not considered the dative form of the base pronoun 'egli' (he), but constitutes a separate entry.

The Italian pronominal paradigm is the following:

'forme toniche' (strong forms): subj (io, egli), compl (me, lui)

  ama me / da' a me     -- dir-obj/prep-obj -- (he loves me / he gives to me)
  ama lui / da' a lui   -- dir-obj/prep-obj -- (she loves him / she gives to him)

'forme atone' (weak forms): - compl (mi, gli/lo)

  mi da' / mi ama       -- ind-obj/dir-obj -- (he gives me / he loves me)
  gli da'               -- ind-obj -- (he gives him)
  lo ama                -- dir-obj -- (she loves him)

This paradigm can be mapped onto the Case system proposed by the French group in the following way:

  io, egli = subj                      = nom
  mi/me    = dir-obj/ind-obj/prep-obj  = obj -> acc, dat, prep+obl
  lui      = dir-obj/prep-obj          = obj -> acc, prep+obl
  gli      = ind-obj                   = dat
  lo       = dir-obj                   = acc

5.1.5 Determiners (Pronominal Adjectives) (D)
---------------------------------------------

5.1.5.1. Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         demonstrat. questo      d
             indefinite  ogni        i
             possessive  mio         s
             interrogat. che         t
             exclamative quanto      e     this value has been added
             relative    quanto      r     this value has been added
------------ ----------- ----------- ----
Person       first       mio         1
             second      tuo         2
             third       suo         3
------------ ----------- ----------- ----
Gender       masculine   questo      m
             feminine    questa      f
l-spec       common      ogni        c
------------ ----------- ----------- ----
Number       singular    quello      s
             plural      quelli      p
l-spec       invariant   altrui      n
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
------------ ----------- ----------- ----
Possessor    -           -           -
============ =========== =========== ====

5.1.5.2 Corpus

======= ============ ============================================
Tag     Regular exp. Definition
======= ============ ============================================
DDNS    Dd-cs--      Demonstrative pron.adj. comm.sing.
DDNP    Dd-cp--      Demonstrative pron.adj. comm.plur.
DDMS    Dd-ms--      Demonstrative pron.adj. masc.sing.
DDMP    Dd-mp--      Demonstrative pron.adj. masc.plur.
DDFS    Dd-fs--      Demonstrative pron.adj. fem.sing.
DDFP    Dd-fp--      Demonstrative pron.adj. fem.plur.
DIMS    Di-ms--      Indefinite pron.adj. masc.sing.
DIMP    Di-mp--      Indefinite pron.adj. masc.plur.
DIFS    Di-fs--      Indefinite pron.adj. fem.sing.
DIFP    Di-fp--      Indefinite pron.adj. fem.plur.
DINS    Di-cs--      Indefinite pron.adj. comm.sing.
DINP    Di-cp--      Indefinite pron.adj. comm.plur.
DPMS    Ds.ms--      Possessive pron.adj., masc.sing.
DPMP    Ds.mp--      Possessive pron.adj., masc.plur.
DPFS    Ds.fs--      Possessive pron.adj., fem.sing.
DPFP    Ds.fp--      Possessive pron.adj., fem.plur.
DPNN    Ds-cn--      Possessive pron.adj., comm.inv.
DWNN    D[tre]-cn--  Interr./Rel./Excl. pron.adj., comm.inv.
DWMS    D[tre]-ms--  Interr./Rel./Excl. pron.adj., masc.sing.
DWMP    D[tre]-mp--  Interr./Rel./Excl. pron.adj., masc.plur.
DWFS    D[tre]-fs--  Interr./Rel./Excl. pron.adj., fem.sing.
DWFP    D[tre]-fp--  Interr./Rel./Excl. pron.adj., fem.plur.
DWNS    D[tre]-cs--  Interr./Rel./Excl. pron.adj., comm.sing.
DWNP    D[tre]-cp--  Interr./Rel./Excl. pron.adj., comm.plur.
--------- more collapsed ----------------
DFP     D..fp---     Determiner, fem. plur.
DFS     D..fs---     Determiner, fem. sing.
DMP     D..mp---     Determiner, masc. plur.
DMS     D..ms---     Determiner, masc. sing.
DNS     D..cs---     Determiner, comm. sing.
DNP     D..cp---     Determiner, comm. plur.
DNN     D..cn---     Determiner, comm. inv.
--------- more collapsed end --------------
======= ============ ============================================

5.1.5.3 Combinations

========= ========== =================================
Lexicon   Corpus     Example
========= ========== =================================
Dd-cs--   DDNS       tale
Dd-cp--   DDNP       tali
Dd-ms--   DDMS       quello
Dd-mp--   DDMP       quelli
Dd-fs--   DDFS       quella
Dd-fp--   DDFP       quelle
Di-ms--   DIMS       nessun
Di-mp--   DIMP       alcuni
Di-fs--   DIFS       nessuna
Di-fp--   DIFP       alcune
Di-cs--   DINS       ogni
Di-cp--   DINP       quali
Ds1ms--   DPMS       mio, nostro
Ds1mp--   DPMP       miei
Ds1fs--   DPFS       mia
Ds1fp--   DPFP       mie
Ds2ms--   DPMS       tuo, vostro
Ds2mp--   DPMP       tuoi
Ds2fs--   DPFS       tua
Ds2fp--   DPFP       tue
Ds3ms--   DPMS       suo
Ds3mp--   DPMP       suoi
Ds3fs--   DPFS       sua
Ds3fp--   DPFP       sue
Ds-cn--   DPNN       altrui
Dr-cn--   DWNN       cui
Dr-ms--   DWMS       quanto
Dr-mp--   DWMP       quanti
Dr-fs--   DWFS       quanta
Dr-fp--   DWFP       quante
Dr-cs--   DWNS       quale
Dr-cp--   DWNP       quali
Dt-cn--   DWNN       che
Dt-ms--   DWMS       quanto
Dt-mp--   DWMP       quanti
Dt-fs--   DWFS       quanta
Dt-fp--   DWFP       quante
Dt-cs--   DWNS       quale
Dt-cp--   DWNP       quali
De-cn--   DWNN       che
De-cp--   DWNP       quali
De-cs--   DWNS       quale
De-ms--   DWMS       quanto
De-mp--   DWMP       quanti
De-fs--   DWFS       quanta
De-fp--   DWFP       quante
----------------------- more collapsed -----------------------------
D..fp---  DFP        mie, queste, quante etc.
D..fs---  DFS        mia, questa, quanta etc.
D..mp---  DMP        miei, questi, quanti etc.
D..ms---  DMS        mio, questo, quanto etc.
D..cs---  DNS        quale
D..cp---  DNP        quali
D..cn---  DNN        altrui
----------------------- more collapsed end -----------------------
========= ========== =================================

5.1.5.4 Observations

Following the strategy adopted for Pronouns, two tagsets are also proposed for Determiners.

5.1.6 Articles (T)
------------------

5.1.6.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         definite    il          d
             indefinite  un          i
------------ ----------- ----------- ----
Gender       masculine   il          m
             feminine    la          f
      l-spec common      l'          c
------------ ----------- ----------- ----
Number       singular    la          s
             plural      le          p
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
============ =========== =========== ====

5.1.6.2 Corpus

======== ========== ==========================================
Tag      Reg.Expr.  Definition
======== ========== ==========================================
RMS      Tdms-      Article, definite, masc.sing.
RMP      Tdmp-      Article, definite, masc.plur.
RFS      Tdfs-      Article, definite, femm.sing.
RFP      Tdfp-      Article, definite, femm.plur.
RNS      Tdcs-      Article, definite, comm.sing.
RIMS     Tims-      Article, indefinite, masc.sing.
RIFS     Tifs-      Article, indefinite, femm.sing.
======== ========== ==========================================

5.1.6.3 Combinations

========= ======== ==========================================
Lexicon   Corpus   Example
========= ======== ==========================================
Tdms-     RMS      il, lo
Tdmp-     RMP      i, gli
Tdfs-     RFS      la
Tdfp-     RFP      le
Tdcs-     RNS      l' (amico/a)
Tims-     RIMS     un, uno
Tifs-     RIFS     una, un'
========= ======== ==========================================

5.1.7 Adverbs (R)
-----------------

5.1.7.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         -           -           -
------------ ----------- ----------- ----
Degree       positive    bene        p
             superlative benissimo   s
============ =========== =========== ====

5.1.7.2 Corpus

======= ================== ===========================
Tag     Regular Expression Definition
======= ================== ===========================
B       R-p                Adverb positive
BS      R-s                Adverb superlative
======= ================== ===========================

5.1.7.3 Combinations

========= =========== ============================
Lexicon   Corpus      Example
========= =========== ============================
R-p       B           fortemente
R-s       BS          fortissimamente
========= =========== ============================

5.1.7.4 Observations

The feature Type is not encoded in the Italian lexicon.

5.1.8 Adposition (S)
--------------------

5.1.8.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         preposition di, a, da   p
------------ ----------- ----------- ----
Formation    simple      di          s
             compound    dello       c
------------ ----------- ----------- ----
Gender       masculine   dello       m     This attribute and values
             feminine    alla        f     have been added
      l-spec common      dell'       c
------------ ----------- ----------- ----
Number       singular    al          s     This attribute and values
             plural      ai          p     have been added
============ =========== =========== ====

5.1.8.2 Corpus

======= ================== =====================
Tag     Regular Expression Definition
======= ================== =====================
E       Sp-                Preposition simple
EA      Spc..              Preposition compound
======= ================== =====================

5.1.8.3 Combinations

========= ================ =======================
Lexicon   Corpus           Example
========= ================ =======================
Sp        E                di
Spcfs     EA               della
Spcfp     EA               delle
Spcms     EA               del, dello
Spcmp     EA               dei, degli
Spccn     EA               dell'
========= ================ =======================

5.1.8.4 Observations

The Italian policy for encoding fused prepositions is to attach the morphological information of the article to the preposition tag.

5.1.9 Conjunctions (C)
----------------------

5.1.9.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         coordinat.  e           c
             subordinat. perche'     s
============ =========== =========== ====

5.1.9.2 Corpus

======= ================== =========================
Tag     Regular Expression Definition
======= ================== =========================
CC      Cc                 Coordinative conjunction
CS      Cs                 Subordinative conjunction
======= ================== =========================

5.1.9.3 Combinations

========= =========== ============================
Lexicon   Corpus      Example
========= =========== ============================
Cc        CC          ma
Cs        CS          perche'
========= =========== ============================

5.1.10 Numerals (M)
-------------------

5.1.10.1 Lexicon

============ =========== =========== ====
Attribute    Value       Example     Code
============ =========== =========== ====
Type         cardinal    cento       c
             ordinal     primo       o
------------ ----------- ----------- ----
Gender       masculine   primo       m
             feminine    prima       f
------------ ----------- ----------- ----
Number       singular    secondo     s
             plural      secondi     p
------------ ----------- ----------- ----
Case         (n.a.)      (n.a.)      -
============ =========== =========== ====

5.1.10.2 Corpus

======= ================== ============================
Tag     Regular Expression Definition
======= ================== ============================
NMS     M.ms-              Numeral, masc.sing.
NFS     M.fs-              Numeral, femm.sing.
NMP     M.mp-              Numeral, masc.plur.
NFP     M.fp-              Numeral, femm.plur.
N       Mc---              Numeral cardinal
======= ================== ============================

5.1.10.3 Combinations

========= ========= ===============================
Lexicon   Corpus    Example
========= ========= ===============================
M.ms-     NMS       primo
M.fs-     NFS       prima
M.mp-     NMP       primi
M.fp-     NFP       prime
Mc---     N         zero, cento
========= ========= ===============================

5.1.11 Interjection (I)
-----------------------

5.1.11.1 Corpus

======= =========== =====================================
Tag     Reg. Expr.  Definition
======= =========== =====================================
I       I           Interjection
======= =========== =====================================

5.1.11.2 Combinations

======= =========== =====================================
Lexicon Corpus      Example
======= =========== =====================================
I       I           oh
======= =========== =====================================

5.1.12 Unique membership class (U)
----------------------------------

None

5.1.13 Residual (X)
-------------------

5.1.13.2 Corpus

======= =================== ====================
Tag     Regular Expression  Definition
======= =================== ====================
NY      ???                 "Guessed" Noun
AY      ???                 "Guessed" Adjective
======= =================== ====================

5.1.13.3 Combinations

========= ========= ===============================
Lexicon   Corpus    Example
========= ========= ===============================
???       NY        bit
???       AY        computerizzato
========= ========= ===============================

5.1.13.4 Observations

At corpus level, the tag SY is used to mark symbols, letters, acronyms, foreign words, toponyms, etc.: in general, unknown words for which a "guess" is provided.

5.1.14 Punctuation
------------------

========= ============================
Tag       Example
========= ============================
punct     .,;:?! etc.
========= ============================
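
As a rough illustration of how the regular-expression columns of the Corpus tables above can be read, the following Python sketch maps a lexicon description to a corpus tag by trying each pattern in turn. It covers only the article, adverb, conjunction and numeral rows listed in this section. The names `TAG_PATTERNS` and `corpus_tag` are hypothetical, and reading `.` as "any single attribute code" and `-` as a literal character is an assumption about the notation, not a description of the MULTEXT tools.

```python
import re

# (corpus tag, pattern) pairs copied from the Corpus tables above.
# Assumption: '.' stands for any single attribute code and '-' is literal,
# which coincides with ordinary regular-expression syntax.
TAG_PATTERNS = [
    ("RMS", "Tdms-"),   # articles
    ("RMP", "Tdmp-"),
    ("RFS", "Tdfs-"),
    ("RFP", "Tdfp-"),
    ("RNS", "Tdcs-"),
    ("RIMS", "Tims-"),
    ("RIFS", "Tifs-"),
    ("B", "R-p"),       # adverbs
    ("BS", "R-s"),
    ("CC", "Cc"),       # conjunctions
    ("CS", "Cs"),
    ("NMS", "M.ms-"),   # numerals
    ("NFS", "M.fs-"),
    ("NMP", "M.mp-"),
    ("NFP", "M.fp-"),
    ("N", "Mc---"),
]

COMPILED = [(tag, re.compile(pattern)) for tag, pattern in TAG_PATTERNS]


def corpus_tag(lexicon_description):
    """Return the first corpus tag whose pattern matches the whole
    lexicon description, or None if no listed pattern applies."""
    for tag, pattern in COMPILED:
        if pattern.fullmatch(lexicon_description):
            return tag
    return None
```

With these patterns, `corpus_tag("Tdcs-")` gives `RNS` (the article l') and `corpus_tag("Mc---")` gives `N`; a description with no listed pattern gives `None`.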