[developers] Punctuation and "-default-les" type mapping in PET/ERG

Christopher Rupp Christopher.Rupp at cl.cam.ac.uk
Mon Apr 14 12:28:49 CEST 2008


Hi Richard,

I saw your message on Saturday, but I didn't find time to put in my
contribution. Essentially, I would have reiterated part of my message to this
list on March 28th.

Christopher.Rupp at cl.cam.ac.uk said:
> As I think I mentioned, I was getting WARNING messages pretty similar to the
> ones Richard showed, when using -default-les as a PET option with a smaf-conf
> file and posmapping settings. I managed to suppress these messages by
> modifying the definitions in the smaf.conf file, but I think most of the
> changes removed information. (I did put some information back via gen-lex
> entries, but that's not a direct fix.) 

I think we need to see both the posmapping definition and the file defined
under smaf-conf, to understand where possible conflicts occur and whether they
affect the (potential) coverage of a given grammar. I think Dan essentially got
to the same point. (If I'd been a bit quicker I could probably have saved him
the route there.)

As far as I understand it, using the SMAF interface requires a smaf-conf
definition. You may have inherited such a file with the name saf.conf instead.
Unfortunately, I don't know of any adequate documentation of the notation used
in these files. The examples I have are the same as those on the Wiki page:

http://wiki.delph-in.net/moin/SmafPet

I believe there has been some change in the way these files are interpreted.
Some of the warnings I saw were explicable in terms of mismatching types and
path settings. I expect the CARG warnings (Richard's type-I) fall into this
class. Others appear more worrying to me, e.g. Richard's type-II:

; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)

Or the instance I had like:

; WARNING: failed to create dag for new path-value ("2,2-dipyridylamine = "2,2-dipyridylamine")

These look more like a data type mismatch which can't have been intended in the
original code. I also note that one of the main changes I made involved
simplifying the path along which type definitions were made:

Old> pos.[tag='JJ'] -> gMap.type='aj_-_i-unk_le'

New< pos.[tag='JJ'] -> gMap='aj_-_i-unk_le'

To me this implies that there used to be a mechanism standing one step back
from the feature structure information, and that this step has now been
collapsed. However, I say "implies" because I have neither knowledge of the
code nor adequate documentation.
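
For anyone who wants to try the same Old/New change on their own file, here is
a rough Python sketch of the rewrite. It is not part of PET, and the function
names are made up for illustration. It assumes each rule sits on a single line
of the form "selector -> settings" and that the old-style target is spelled
exactly "gMap.type="; given the lack of documentation, the result should be
checked by hand.

```python
def simplify_gmap_type(line):
    """Rewrite gMap.type='X' to gMap='X' on the right-hand side of '->'.

    Hypothetical helper: assumes one rule per line, as in the Old/New
    example. Lines without '->' are returned unchanged.
    """
    lhs, arrow, rhs = line.partition("->")
    if not arrow:
        return line
    # Only the right-hand side is touched; other gMap paths such as
    # gMap.carg are left alone because the match includes ".type=".
    return lhs + arrow + rhs.replace("gMap.type=", "gMap=")


def rewrite_conf(text):
    """Apply the rewrite to every line of a smaf.conf-style file."""
    return "\n".join(simplify_gmap_type(line) for line in text.split("\n"))
```

Keeping a copy of the original file makes it possible to compare coverage
before and after the change.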

From the results that I have, I haven't been able to show any reduction in
coverage by the steps I took to suppress these warning messages. I suggest you
let me know which posmapping and .conf file definitions you are using on the
SMAF interface. Then we can see if the sources of these conflicts are apparent.
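
To compare warnings before and after a change to the .conf file, a small
tally along the following lines might help. This is only a sketch: it assumes
the PET output has been captured to a text file with one warning per line, in
the "; WARNING: ..." form quoted above, and the function names are my own.

```python
import re
from collections import Counter

# Matches lines of the form "; WARNING: <message>".
WARNING = re.compile(r"^;\s*WARNING:\s*(.*)$")


def warning_kind(line):
    """Return the warning text without its trailing (...) argument,
    or None if the line is not a warning."""
    m = WARNING.match(line.strip())
    if m is None:
        return None
    # Drop the parenthesised argument so that warnings of the same
    # kind but with different tokens are counted together.
    return re.sub(r"\s*\(.*\)\s*$", "", m.group(1))


def tally_warnings(lines):
    """Count warnings by kind over an iterable of log lines."""
    kinds = (warning_kind(line) for line in lines)
    return Counter(k for k in kinds if k is not None)
```

Running tally_warnings over the output of two runs and comparing the counts
shows which kinds of warning a given change actually removed.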

More generally, we need to know what operations are permitted on the SMAF 
interface. For example, in the case of morphological variants, I have a rule:

ersatz.[] -> edgeType='tok+morph' stem=content.name tokenStr=content.name gMap.carg=content.surface inject='t' analyseMorph='t'

But I have no information about what the flags "inject='t'" or
"analyseMorph='t'" actually do, assuming they are working correctly.

I'm aware that there are plans to update the functionality for the input of
information at the token level from outside the grammar framework. However,
some of us need to work with the existing functionality while those mechanisms
are in development. In the short term we can probably patch up this issue of
unexpected warning messages, but we need better support and documentation of
the existing interfaces and more information about the functionality and
development schedule for their successors.

Sorry, I only got to send this today. This information might have saved a
couple of messages over the weekend, but I still regard this as patched rather
than solved.

Cheers,

C.J.