[pet] [delph-in] Poll to identify actively used functionality in PET

Rebecca Dridan bec.dridan at gmail.com
Wed Jul 7 02:03:17 CEST 2010


I couldn't attend the PetRoadMap discussion. Is there a summary of the
discussion on the wiki, or at least a list of the decisions that were made?

> Input formats we'd like to discard:
>
> - pic / pic_counts
> - yy_counts
> - smaf
> - fsr
>

In particular, what is the plan for inputs? FSC seemed to do everything I
had needed from PIC, but at the time it was undocumented, experimental 
code. Will FSC be the default input format when annotation beyond POS 
tags is needed?

>
> -default-les=traditional  determine default les by posmapping for all
>                           lexical gaps

Does this mean that we can either hypothesise every generic entry for 
every token (and then filter them), or not use generic entries at all? I 
found this to be a major efficiency issue when large numbers of generic 
entries were available. I don't have a problem with defaulting to the 
current "all" setting, but I think there are still possible 
configurations where one would like to react only when lexical gaps are
found.

>
> Because these are the only modules that require the inclusion of ECL,
> support for ECL in PET will also be removed.
I celebrate the removal of ECL, but will there be any way of doing more 
than whitespace tokenisation natively in PET, or was the decision made
that PET will always be run in conjunction with an LKB pre-processing step?

Rebecca



