[developers] Punctuation and "-default-les" type mapping in PET/ERG
Christopher Rupp
Christopher.Rupp at cl.cam.ac.uk
Mon Apr 14 12:28:49 CEST 2008
Hi Richard,
I saw your message on Saturday, but I didn't find time to put in my
contribution.
Essentially, I would have reiterated part of my message to this list on March
28th.
Christopher.Rupp at cl.cam.ac.uk said:
> As I think I mentioned, I was getting WARNING messages pretty similar to the
> ones Richard showed, when using -default-les as a PET option with a smaf-conf
> file and posmapping settings. I managed to suppress these messages by
> modifying the definitions in the smaf.conf file, but I think most of the
> changes removed information. (I did put some information back via gen-lex
> entries, but that's not a direct fix.)
I think we need to see both the posmapping definition and the file defined
under smaf-conf to understand where possible conflicts occur and whether they
affect the (potential) coverage of a given grammar. I think Dan essentially got
to the same point. (If I'd been a bit quicker I could probably have saved him
the route there.)
As far as I understand it, using the SMAF interface requires a smaf-conf
definition. You may have inherited such a file with the name saf.conf instead.
Unfortunately, I don't know of any adequate documentation of the notation used
in these files. The examples I have are the same as those on the Wiki page:
http://wiki.delph-in.net/moin/SmafPet
I believe there has been some change in the way these files are interpreted.
Some of the warnings I saw were explicable in terms of mismatching types and
path settings. I expect the CARG warnings (Richard's type-I) fall into this
class. Others appear more worrying to me, e.g. Richard's type-II:
; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)
Or an instance I had, like:
; WARNING: failed to create dag for new path-value ("2,2-dipyridylamine = "2,2-dipyridylamine")
These look more like a data type mismatch, which can't have been intended in
the original code. I also note that one of the main changes I made involved
simplifying the path along which type definitions were made:
Old> pos.[tag='JJ'] -> gMap.type='aj_-_i-unk_le'
New< pos.[tag='JJ'] -> gMap='aj_-_i-unk_le'
To me this implies that there used to be a mechanism that stood back from the
feature structure information, and that this step has now been collapsed.
However, I say "implies" because I have neither knowledge of the code nor
adequate documentation.
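For anyone wanting to try the same simplification on their own file, here is a minimal sketch of the rewrite in Python. It assumes the rules look like the two lines quoted above, i.e. that the type name is assigned via gMap.type='...'; the function name is made up for illustration, and I can't say whether this is the right change for every grammar.

```python
import re

# Assumption: in the old notation the lexical type is assigned through
# the path "gMap.type=", as in the "Old>" rule quoted above.
OLD_TYPE_PATH = re.compile(r"\bgMap\.type=")


def simplify_type_path(rule: str) -> str:
    """Rewrite gMap.type='X' to gMap='X'; other gMap paths are left alone."""
    return OLD_TYPE_PATH.sub("gMap=", rule)


old_rule = "pos.[tag='JJ'] -> gMap.type='aj_-_i-unk_le'"
print(simplify_type_path(old_rule))
```

Rules that set other paths, such as gMap.carg, pass through unchanged, so the function can be applied line by line to a copy of the .conf file and the result diffed against the original.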
From the results that I have, I haven't been able to show any reduction in
coverage by the steps I took to suppress these warning messages. I suggest you
let me know which posmapping and .conf file definitions you are using on the
SMAF interface. Then we can see if the sources of these conflicts are apparent.
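To make such a comparison easier, the warnings could be tallied from a captured log before and after a change to the .conf file. The following is only a sketch: it assumes the warnings appear one per line in exactly the form quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: each warning sits on one line, in the form quoted above:
# ; WARNING: failed to create dag for new path-value (...)
WARNING = re.compile(
    r"^; WARNING: failed to create dag for new path-value \((.*)\)\s*$"
)


def tally_path_value_warnings(lines):
    """Count the warnings, keyed by the text inside the parentheses."""
    counts = Counter()
    for line in lines:
        match = WARNING.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_log = [
    '; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)',
    "some other output line",
    '; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)',
]
for path_value, count in tally_path_value_warnings(sample_log).items():
    print(count, path_value)
```

Running this over the logs from two configurations and comparing the counts would show which warnings a given change actually removes.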
More generally, we need to know what operations are permitted on the SMAF
interface. For example, in the case of morphological variants, I have a rule:
ersatz.[] -> edgeType='tok+morph' stem=content.name tokenStr=content.name gMap.carg=content.surface inject='t' analyseMorph='t'
But I have no information about what the flags "inject='t'" or
"analyseMorph='t'" actually do, assuming they are working correctly.
I'm aware that there are plans to update the functionality for the input of
information at the token level from outside the grammar framework. However,
some of us need to work with the existing functionality while those mechanisms
are in development. In the short term we can probably patch up this issue of
unexpected warning messages, but we need better support and documentation of
the existing interfaces and more information about the functionality and
development schedule for their successors.
Sorry, I only got to send this today. This information might have saved a
couple of messages over the weekend, but I still regard this as patched rather
than solved.
Cheers,
C.J.