[developers] PET morphology

Stephan Oepen oe at ifi.uio.no
Wed Nov 18 23:23:36 CET 2009


from what you sent, there appears to be a serious bug in PET, but it
is not actually chaining orthographemic rules.  the [incr tsdb()]
derivation shows the actual tree, and if one knows how to read the
verbose debugging output carefully, it shows exactly the same tree.
the lists of rules in square brackets indicate the remaining
orthographemic rules needed for ‘completion’, i.e. to satisfy the
segmentation hypothesis generated at the string level prior to
parsing.  so the surprising observation is that an edge with a
non-empty inflr_todo value (the list in brackets) can qualify as a result.
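
as a rough illustration of that reading (this is not PET code: the line format is only assumed from the verbose output quoted below, and the helper names are hypothetical), one could sketch in python the check that appears to be missing, treating the last bracketed list on an edge line as its inflr_todo value:

```python
import re

# Assumption: on a verbose edge line, the LAST [...] group is the inflr_todo
# list; on the root edge an earlier [...] group names the root condition.
# How PET prints an empty todo list is not confirmed by the thread.
BRACKET = re.compile(r"\[([^\]]*)\]")


def inflr_todo(edge_line):
    """Return the pending orthographemic rules listed on an edge line."""
    groups = BRACKET.findall(edge_line)
    return groups[-1].split() if groups else []


def is_complete(edge_line):
    """An edge should only qualify as a result if nothing is left to do."""
    return not inflr_todo(edge_line)


# Root edge copied from the verbose output in the quoted message.
root = ("(2654 v-slash-none 0 0 1 [root-inf-final] "
        "[ir-verb-npd-pr-1-sg ir-adj-mw-*-*-pl]")

print(inflr_todo(root))   # rules still pending on the root edge
print(is_complete(root))  # False: this edge should not count as a result
```

under these assumptions the root edge 2654 still carries two pending rules, which is exactly why it is surprising to see it reported as a result.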

if you checked the current grammar version into the LOGON tree and  
emailed instructions on how to replicate the problem, maybe peter  
could be enticed to have a look?

best, oe




On 18. nov. 2009, at 21.26, Bart Cramer <bart.cramer at gmail.com> wrote:

> Dear all,
>
> I am currently facing the following issue, and I was hoping one of  
> you could help. A feature of the system we have is that  
> morphological rules can form chains. However, I would like to  
> prevent that, and wrote the following supertype for all inflectional  
> rules:
>
> lex-infl-rule := lex-rule &
> [ ARGS   < [ SYNSEM #synsem,
>             INFL   - ] >,
>  SYNSEM #synsem,
>  INFL   + ].
>
> However, I still get the chains. For instance, with the following
> entries in the irregs.tab file, "weißen" is ultimately recognised as
> a form of the verb "wissen", after applying the two inflectional
> rules:
>
> "
> weißen IR-ADJ-MW-*-*-PL weiß
> weiß IR-VERB-NPD-PR-1-SG wissen
> "
>
> with verbose output, I get:
>
> (2654 v-slash-none 0 0 1 [root-inf-final] [ir-verb-npd-pr-1-sg ir-adj-mw-*-*-pl]
>   (2651 v-branch-right 0 0 1 [ir-verb-npd-pr-1-sg ir-adj-mw-*-*-pl]
>     (2650 ir-verb-npd-inf 0 0 1 [ir-verb-npd-pr-1-sg ir-adj-mw-*-*-pl]
>       (4 verb-wissen-21/lt-verb-reg-npnom 0 0 1 [ir-verb-npd-inf ir-verb-npd-pr-1-sg ir-adj-mw-*-*-pl]
>         (1 "weißen" 0 0 1 <0:1>)))))
>
> An unexpected result, as the output of the adjectival rule can be  
> chained to other rules, even though that shouldn't unify (because of  
> the INFL feature).
>
> When outputting in tsdb, that same reading only 'sees' the last  
> application of a morphological rule (that might be a separate bug):
>
> 1@9@-1@-1@-1@-1@-1@-1@-1@-1@(root-inf-final (2654 v-slash-none 0 0 1 (2651 v-branch-right 0 0 1 (2650 ir-verb-npd-inf 0 0 1 (4 verb-wissen-21 0 0 1 ("weißen" 1 "\\"weißen\\""))))))@@@@
>
> The whole thing works fine in LKB, and turning chart mapping on or  
> off doesn't help.
>
> Can anybody give me a hint as to what could be wrong, either in the
> grammar itself, the settings files, or in PET? I have a small,
> separate tree in which this issue is isolated (small lexicon), which
> might speed up debugging. Any help is welcome!
>
> Best,
>
> Bart.



