# Define morph
Verb
<head sem pred> =<form>
<head sem voice> =active/passive
!prefixe
<head agr> ==VAgr
<head agr num> =singular/plural
<head agr pers> =1/2/3
<head agr gen> =masculine/feminine
<head tensed> =no/yes
<head prd> =no/yes
<head type> =aux/main
<bar> =0
type(X)<head type> =X
rol(X) <head sem nom> =X
function(X) <head sem functional> =X
...
my_VAgr(N,P,G)
<head agr num> = N
<head agr pers> = P
<head agr gen> = G
#paradigm verb_suf4
!Encl(_,imperson)
!common
!rol(denom)
ar n {+a}{-b} $nom_fem8
ăr n {-a}{+b} $nom_fem8
#paradigm verb1
- !Verb !type(main)
$indic_prez_1
$indic_imperf_1
$indic_perfsim_1
$indic_mmcperf_1
$conj_prez_1
$imper_prez_1
$infin_prez_1
$part_1
$gerund_2
# paradigm indic_mmcperf_1
!VTensed(past-perfect,indicative)
asem v {+past} !my_VAgr(singular,1,_)
aseşi v {+past} !my_VAgr(singular,2,_)
ase v {+past} !my_VAgr(singular,3,_)
aserăm v {+past} !my_VAgr(plural,1,_)
aserăţi v {+past} !my_VAgr(plural,2,_)
aseră v {+past} !my_VAgr(plural,3,_)
# Lexicon base
...
abroga * v/n/adj !pref(none)
...
# Lexicon vform
...
abrog v @abroga\base !allcases $verb1
abrog n @abroga\base !rol(denom/patient) $verb_suf4
$verb_suf_part1
...
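As a rough illustration of how a paradigm such as indic_mmcperf_1 combines with a stem from the vform lexicon, the following Python sketch simply concatenates the stem "abrog" with the six endings listed above. It is not the ELU generator: the names `PLUPERFECT_ENDINGS` and `apply_paradigm` are hypothetical, and the `{+past}` annotations and any spelling rules are ignored.

```python
# Hypothetical sketch: plain concatenation of a stem with paradigm endings.
# Endings and agreement values are copied from the indic_mmcperf_1 paradigm;
# the {+past} annotations and any spelling rules are deliberately ignored.

PLUPERFECT_ENDINGS = [
    ("asem", "singular", 1),
    ("aseşi", "singular", 2),
    ("ase", "singular", 3),
    ("aserăm", "plural", 1),
    ("aserăţi", "plural", 2),
    ("aseră", "plural", 3),
]


def apply_paradigm(stem, endings):
    """Return (word form, number, person) triples for one stem."""
    return [(stem + ending, num, pers) for ending, num, pers in endings]


forms = apply_paradigm("abrog", PLUPERFECT_ENDINGS)
for form, num, pers in forms:
    print(form, num, pers)
```

Each stem in the vform lexicon references several paradigms (verb1 alone includes nine sub-paradigms), which is one reason the generated word-form list grows so quickly.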
We ran the word-form generator (part of mac-ELU) on this lexicon and obtained almost 1,300,000 word forms.
The information displayed by the word-form generator for one inflected form is shown below:
mergăr+eţ+ilor
<6>
bar = 0
cat = adj/n
form = merge
head : agr : (NomAgr)
pers = 3
num = plural
gen = masculine
case = dative/genitive/vocative
encl = yes
hum = person
intensify = none
pos = after/before
prefix = none
sem : nom = actor
pred = merge
type = common
Because the mac-ELU implementation of Romanian morphology covers not only inflectional morphology but also the regular derivatives (see the example above), the number of lemmas in mac-ELU is smaller than in MULTEXT.
The information generated as above was automatically translated into the corresponding MULTEXT entries, discarding the features not included in MULTEXT and, for derivatives, changing the lemma form to the inflectional lemma.
From the representation above, the translator generated the MULTEXT entries:
mergăreţilor mergăreţ Ncmpony
mergăreţilor mergăreţ Ncmpoyy
mergăreţilor mergăreţ Ncmpvny
mergăreţilor mergăreţ Ncmpvyy
mergăreţilor mergăreţ Afpmpony
mergăreţilor mergăreţ Afpmpyy
mergăreţilor mergăreţ Afpmpnvy
mergăreţilor mergăreţ Afpmpvyy
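One step that such a translation presumably involves is expanding the disjunctive values of the ELU representation (for example `cat = adj/n` and `case = dative/genitive/vocative`) into fully specified readings. The Python sketch below shows only this expansion, on a subset of the features displayed above. The function name `expand` is hypothetical, and the mapping from ELU values to MULTEXT codes (which evidently merges some values, since the entries above use only the case letters o and v) is not reproduced.

```python
from itertools import product

# Feature values copied from the display for "mergăr+eţ+ilor";
# only a subset of the features is kept to keep the example short.
features = {
    "cat": "adj/n",
    "num": "plural",
    "gen": "masculine",
    "case": "dative/genitive/vocative",
}


def expand(fs):
    """Yield one dict per combination of the '/'-separated alternatives."""
    keys = list(fs)
    alternatives = [fs[key].split("/") for key in keys]
    for combo in product(*alternatives):
        yield dict(zip(keys, combo))


readings = list(expand(features))
print(len(readings))  # 2 categories x 3 cases = 6
for reading in readings:
    print(reading)
```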
Because the "clitic" attribute generates a large amount of redundancy for most of the grammar categories, we would like to modify our dictionary so that the "clitic" attribute is used explicitly only when it is responsible for a graphemic modification in the spelling of the word forms. Using a "don't care" value ("-") in all the other cases would significantly reduce the number of corpus-useful entries (by almost 35%).
The mac-ELU system is fully functionally equivalent to the SUN-OS ELU implementation, and both systems are jointly distributed by ISSCO-Geneve and RACAI-Bucharest.
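The proposed reduction can be sketched in Python as follows. Entries that share the same word form and lemma and differ only in one attribute position are merged into a single entry carrying "-" at that position. If the word form is identical for every value of the attribute, the attribute has no graphemic effect. Which position of the code holds the "clitic" attribute is an assumption made here purely for illustration (`CLITIC_POS`), as are the names `collapse` and `CLITIC_VALUES`. The four noun entries are taken from the example above.

```python
from collections import defaultdict

# ASSUMPTION (for illustration only): the "clitic" value is the character
# at 0-based position 5 of the code, and its possible values are n and y.
CLITIC_POS = 5
CLITIC_VALUES = {"n", "y"}

entries = [
    ("mergăreţilor", "mergăreţ", "Ncmpony"),
    ("mergăreţilor", "mergăreţ", "Ncmpoyy"),
    ("mergăreţilor", "mergăreţ", "Ncmpvny"),
    ("mergăreţilor", "mergăreţ", "Ncmpvyy"),
]


def collapse(entries, pos=CLITIC_POS, values=CLITIC_VALUES):
    """Merge entries that differ only at `pos` into one "-" entry."""
    groups = defaultdict(set)
    for form, lemma, code in entries:
        masked = code[:pos] + "-" + code[pos + 1:]
        groups[(form, lemma, masked)].add(code[pos])

    result = []
    for (form, lemma, masked), seen in groups.items():
        if seen == values:
            # Same spelling for every value: the attribute is redundant.
            result.append((form, lemma, masked))
        else:
            # The value matters for this spelling: keep explicit entries.
            for value in sorted(seen):
                code = masked[:pos] + value + masked[pos + 1:]
                result.append((form, lemma, code))
    return result


collapsed = collapse(entries)
for entry in collapsed:
    print(*entry)
```

On this small sample the four noun entries reduce to two. The figure of almost 35% quoted above refers to the whole dictionary and is not something this sketch reproduces.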