[developers] [pet] [delph-in] Poll to identify actively used functionality in PET
ulrich.schaefer at dfki.de
Wed Jul 7 18:34:40 CEST 2010
From my / Heart of Gold's point of view, it would be OK to have a PET
without PIC and SMAF. REPP isn't used in any of the hybrid workflows AFAIR.
It implies that we will probably lose the integrations for Italian and
Norwegian (no harm, as the grammars are no longer actively developed, I
guess) and maybe also Greek, but we would hopefully gain cleaner and more
uniform configuration settings.
I hope we will also be able to integrate Spanish with FreeLing via FSC soon.
I would also like to replace the current stdin/stderr communication in
PetModule by XML-RPC as soon as possible.
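To make the XML-RPC idea concrete, here is a minimal sketch using only the
Python standard library. It is not the PetModule or PET interface: the
method name `parse`, its return structure, and the whitespace "parser" are
hypothetical placeholders, chosen only to show a structured request and
reply replacing text exchanged over stdin/stderr.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def parse(sentence):
    """Hypothetical stand-in for a parser call; it only splits on whitespace."""
    tokens = sentence.split()
    return {"tokens": tokens, "count": len(tokens)}


def start_server():
    # Port 0 lets the OS pick a free port; logRequests=False keeps stderr quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(parse, "parse")
    # Serve in a background thread so the client can run in the same process.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_parse(sentence):
    """Start the toy server, make one remote call, and shut the server down."""
    server = start_server()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            # The reply arrives as a typed structure, not as a text stream
            # that the caller has to scan for markers.
            return proxy.parse(sentence)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_parse("the dog barks"))
```

The point of the design is that errors and results travel as typed values
(or XML-RPC faults) instead of being interleaved on stdout and stderr.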
On 07.07.2010 12:56, Stephan Oepen wrote:
> i guess you're asking about FSR (aka FSPP) and REPP? the latter now
> supersedes FSPP in the LKB, and for all i know the existing FSPP
> support in PET (based on ECL) is not Unicode-enabled and builds on the
> deprecated SMAF. hence, no practical loss purging that from PET now,
> i'd think? REPP, on the other hand, should be natively supported in
> PET, in my view. i seem to recall that you had a C++ implementation
> of REPP? woodley has a C implementation. maybe sometime this fall we
> could jointly look at the choices (and remaining limitations: i
> believe none of the existing implementations is perfect in terms of
> characterization corner cases), and then add native REPP support to PET?
> as for FSC, there is pretty good documentation on the wiki now, and it
> seems the format is reasonably stable. i am inclined to preserve YY
> format, as the non-XML alternative to inputting a PoS-annotated token
> finally, i see your point about efficiency losses in -default-les=all
> mode when combined with a very large number of generics (i.e. one per
> LE type); personally, i'd think lexical instantiation can be optimized
> to alleviate these concerns. i personally find the limitations in the
> old generics mode so severe that i can't imagine going back to that
> mode. but if there were active users who'd be badly affected by its
> removal prior to optimizing -default-les=all further, i have no
> opinion on when best to ditch the old mode.
> best, oe
> On 7 July 2010, at 02:03, Rebecca Dridan <bec.dridan at gmail.com> wrote:
>> I couldn't attend the PetRoadMap discussion - is there any summary of
>> the discussion, or at least what decisions were made on the wiki?
>>> Input formats we'd like to discard:
>>> - pic / pic_counts
>>> - yy_counts
>>> - smaf
>>> - fsr
>> Particularly, what is the plan for inputs? FSC seemed to do
>> everything I had needed from PIC, but at the time it was
>> undocumented, experimental code. Will FSC be the default input format
>> when annotation beyond POS tags is needed?
>>> -default-les=traditional determine default les by posmapping for all
>>> lexical gaps
>> Does this mean that we can either hypothesise every generic entry for
>> every token (and then filter them), or not use generic entries at
>> all? I found this to be a major efficiency issue when large numbers
>> of generic entries were available. I don't have a problem with
>> defaulting to the current "all" setting, but I think there are still
>> possible configurations where one would like to react only when
>> lexical gaps were found.
>>> Because these are the only modules that require the inclusion of ECL,
>>> support for ECL in PET will also be removed.
>> I celebrate the removal of ECL, but will there be any way of doing
>> more than white space tokenisation natively in PET, or was the
>> decision made that PET will always be run in conjunction with an LKB
>> pre-processing step?
Dr. Ulrich Schaefer http://dfki.de/~uschaefer phone:+49681857755154
DFKI Language Technology Lab, D-66123 Saarbruecken, Germany
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Management: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Chairman), Dr. Walter Olthoff. Chairman of the Supervisory Board:
Prof. Dr. h.c. Hans A. Aukes. Amtsgericht Kaiserslautern, HRB 2313