[developers] A Question on the PET Parser

Dan Flickinger danf at stanford.edu
Sat Dec 11 22:48:51 CET 2010


I guess the missing piece is that this line needs to be added to the XX.set file, where XX is whatever your PET grammar loading file is:

postload-lisp-files := "mrsglobals.lsp".

Of course, if this file is located inside a subdirectory called `lkb' in your grammar (as is typically the case for Matrix grammars), then the above line would need to include this path:

postload-lisp-files := "lkb/mrsglobals.lsp".
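A quick way to check whether a settings file already carries this line is sketched below. It is only a rough helper: it assumes settings are written one per line as `name := value.` and that `;` starts a comment, and the function names read_settings and postload_files are made up for illustration (they are not part of PET).

```python
import re

# Assumption: one setting per line, written as `name := value.`,
# with `;` starting a comment (not a full parser for .set files).
SETTING_RE = re.compile(r'^\s*([A-Za-z][\w-]*)\s*:=\s*(.*?)\s*\.\s*$')


def read_settings(text):
    """Return a dict mapping each setting name to its raw value string."""
    settings = {}
    for line in text.splitlines():
        line = line.split(';', 1)[0]  # drop anything after a comment marker
        match = SETTING_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def postload_files(text):
    """Return the quoted file names listed under postload-lisp-files."""
    value = read_settings(text).get('postload-lisp-files', '')
    return re.findall(r'"([^"]*)"', value)


if __name__ == '__main__':
    sample = 'postload-lisp-files := "lkb/mrsglobals.lsp".\n'
    print(postload_files(sample))
```

An empty list from postload_files means the setting is missing or commented out in the text you passed in.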

I have attached an adapted version of the `g8gap' grammar, to which I have added the `pet' subdirectory and the file `g8gap.tdl' for PET, along with the file pet/mrs.set. That file is not necessary for the traditional MRS output from PET, but it is needed to get the XML MRS output.  With this grammar, the following should enable you to see traditional MRSs:

cheap -t -mrs g8gap

And the following produces XML MRSs:

cheap -t -mrs=new g8gap
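If you want to run both variants from a script and compare the results, something like the following might do. It is a sketch only: the helper names cheap_command and run_cheap are hypothetical, and it assumes that cheap is on your PATH, that it reads one sentence per line from standard input, and that the MRSs arrive on standard output.

```python
import subprocess


def cheap_command(grammar, xml=False):
    """Build the argument list for one of the two invocations above."""
    mrs_flag = "-mrs=new" if xml else "-mrs"
    return ["cheap", "-t", mrs_flag, grammar]


def run_cheap(sentences, grammar, xml=False):
    """Run cheap on the given sentences and return its standard output.

    Assumptions: cheap reads one sentence per line from standard input
    and writes the MRSs to standard output.
    """
    result = subprocess.run(
        cheap_command(grammar, xml),
        input="\n".join(sentences) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the two command lines without running anything.
    print(" ".join(cheap_command("g8gap")))
    print(" ".join(cheap_command("g8gap", xml=True)))
```

Calling run_cheap(["the dog barks"], "g8gap") and run_cheap(["the dog barks"], "g8gap", xml=True) would then give the two outputs side by side; the example sentence is arbitrary.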

You should be able to compare the changes I made to g8gap (relative to the LKB distribution) for guidance on what you'll need in your grammar.

  Dan

----- Original Message -----
From: "Francis Bond" <bond at ieee.org>
To: "이 용훈" <yleeuiuc at hanmail.net>
Cc: "developers" <developers at delph-in.net>
Sent: Saturday, December 11, 2010 10:46:01 AM
Subject: Re: [developers] A Question on the PET Parser

G'day,

I have CCed this to the developers list, as they may be able to answer
questions that I cannot.

> I am now developing a small grammar for English and Korean using the LKB system, but the grammar itself is not as big as the ERG. However, my grammar is a little different from ERG. Recently, I tried to convert my grammar in order to use the grammar in the PET parser. I succeeded in parsing the English sentences using the PET parser, but the result contains no MRS structure as follows, even though I set the value for ‘label-path’ and ‘label-path-tail’ in the ‘pet.set’ file.
>
>
>
> [
>   RELS:<>
>   HCONS: <> ].
>
>
>
> Even LTOP and INDEX don’t appear in the MRS structure.
>
>
>
> I think that the problems come from the setting environments which are encoded in the four setting files: flop.set, global.set, pet.set, and qc.set. That’s why I send a mail to ask some questions on the PET system.
>
>
>
> As you know, the LKB system contains a small grammar for English. For example, it contains ‘g8gap’, which is the implementation of Copestake (2002). (I attached the grammar for your convenience.) This grammar contains MRS and the gap feature which can handle LDD. As you know, this grammar is very small compared with the ERG. My main question is whether I can run such a small grammar using the PET parser or not. If possible, how can I set these files (flop.set, global.set, pet.set, and qc.set) in order to make the grammar run in the PET system? If there is a detailed reference on how to set these files, please let me know.
>
>
> As I understand it, if the PET parser works properly, a small grammar such as ‘g8gap’ can be handled by the PET parser just as the ERG can. (Of course, I think that the PET system can also handle my own grammar, if the PET parser works properly.) However, there are several differences between the grammar files in ‘g8gap’ and those of the ERG. Therefore, I want to know how I can set the files (flop.set, global.set, pet.set, and qc.set) in order to make the PET parser use the grammar ‘g8gap’ to process English sentences.
>
>
> My questions are as follows:
>
> [1] If the grammar itself does not fit into the format of Grammar Matrix (for example, ‘g8gap’), is it impossible to run the grammar using the PET parser? If possible, how can I set the files (flop.set, global.set, pet.set, and qc.set) in order to see the MRS structure in the results of the ‘cheap’ parsing?

As long as the grammar is not wildly different it should be possible.
Note that PET doesn't handle defaults, so if you use them then you
cannot use PET.

> [2] The grammar ‘g8gap’ (lexicon.tdl) contains ‘SEM.RELS.LIST.FIRST.PRED’ for relations in the MRS structure. The ‘pet.set’ file, however, specifies ‘label-path := "SYNSEM.LKEYS.--KEYREL.WLINK"’ and ‘label-path-tail := "WLINK"’. If I change the specification to ‘label-path := "ORTH.LIST.FIRST"’ and ‘label-path-tail := "ORTH.LIST.FIRST"’, does it work or not? If not, why? What is the function of ‘WLINK’?

I think WLINK is not used any more.  Which version of the ERG are you
looking at?  It may be easier to look at a newer, simpler grammar, like
the KRG2:
http://krg.khu.ac.kr/

> [3] How can I set the parameter for ‘qc-structure :=’ in the ‘pet.set’ file?

qc-structure := $qc_unif_set_pack.

This seems to be the most popular setting, but I think you can just comment it out.

> [4] What is the function of ‘keyarg-marker-path’ and how can I set this parameter in the ‘global.set’ file? (The grammar ‘g8gap’ does not contain the attribute/feature ‘KEY-ARG’)

This is used for optimising the parser.  If you don't have KEY-ARG
set in your grammar, then you can comment it out.

I don't know the best citation for a deeper explanation, perhaps
someone else can help us out.  There is some discussion at
<http://wiki.delph-in.net/moin/JacyPerformance> but it assumes you
know a lot already.


> [5] The grammar ‘g8gap’ (lexicon.tdl) contains ‘ORTH.LIST.FIRST’ for orthography. The ‘global.set’ file, however, specifies ‘orth-path := STEM’. If I change the specification as in ‘orth-path := ORTH.LIST.FIRST’, does it work or not? If not, why?

It depends on where the orthographic string is in your grammar.  In Jacy
it is STEM, but it can differ for different grammars.  If your
grammar were available online, we could perhaps take a look.

> [6] What is the function of ‘special-name-cons’ and the type ‘cons’ in the ‘flop.set’ file? (The grammar ‘g8gap’ does not contain the type ‘cons’)

It is an internal function which you should not have to worry about
(it should not be exposed in flop.set in my opinion).

> [7] How can I set the setting parameter for ‘pseudo-types :=’ in the ‘flop.set’ file?

I think you can leave this undefined.

> [8] What is the function of ‘qc.tdl’ and how can I set this file?

qc is the quick check (called check paths in the LKB); see:
http://wiki.delph-in.net/moin/JacyPerformance

For a small grammar you don't need it (just comment it out).
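The "comment it out" suggestions in the answers to [3], [4] and [8] can be done by hand in an editor, or roughly like this. This is a sketch, not a PET tool: it assumes each setting sits on a single line as `name := value.` and that `;` is the comment character in the settings files; comment_out is a made-up helper name.

```python
import re


def comment_out(text, names):
    """Prefix every line that sets one of `names` with ';; '.

    Assumptions: each setting sits on one line as `name := value.`,
    and `;` starts a comment in the settings file.
    """
    alternatives = '|'.join(re.escape(name) for name in names)
    pattern = re.compile(r'^\s*(' + alternatives + r')\s*:=')
    out = []
    for line in text.splitlines():
        out.append(';; ' + line if pattern.match(line) else line)
    return '\n'.join(out)


if __name__ == '__main__':
    example = 'qc-structure := $qc_unif_set_pack.'
    print(comment_out(example, ['qc-structure', 'keyarg-marker-path']))
```

Lines that are already commented out are left alone, since they no longer start with the setting name.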

> Before sending you this mail, I tried to find the answers to the above questions in the DELPH-IN wiki and several other websites, but I could not find any answers to these questions. That is why I need your help.

The documentation is still sadly incomplete.

> I want to use ‘g8gap’ to teach my graduate students how we can make use of the PET parser. I am very sorry to bother you, but please answer my questions. I am very grateful to you for reading my e-mail and answering my questions.

I hope that this helped a little.

--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University

-------------- next part --------------
A non-text attachment was scrubbed...
Name: g8gap.tgz
Type: application/x-compressed-tar
Size: 18756 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20101211/131e55ae/attachment.bin>

