[developers] A Question on the PET Parser
Francis Bond
bond at ieee.org
Sat Dec 11 10:46:01 CET 2010
G'day,
I have CCed this to the developers list, as they may be able to answer
questions that I cannot.
> I am developing a small grammar for English and Korean using the LKB system. It is not as big as the ERG, and it differs a little from the ERG. Recently, I tried to convert my grammar so that I could use it with the PET parser. I succeeded in parsing English sentences with the PET parser, but the result contains no MRS structure, as shown below, even though I set the values for ‘label-path’ and ‘label-path-tail’ in the ‘pet.set’ file.
>
> [
> RELS:<>
> HCONS: <> ].
>
> Even LTOP and INDEX don’t appear in the MRS structure.
>
> I think that the problems come from the settings in the four settings files: flop.set, global.set, pet.set, and qc.set. That is why I am writing to ask some questions about the PET system.
>
> As you know, the LKB system contains a small grammar for English. For example, it contains ‘g8gap’, which is the implementation of Copestake (2002). (I have attached the grammar for your convenience.) This grammar contains MRS and the gap feature, which can handle LDD. It is, of course, very small compared with the ERG. My main question is whether I can run such a small grammar using the PET parser. If so, how should I set these files (flop.set, global.set, pet.set, and qc.set) to make the grammar run in the PET system? If there is a detailed reference on how to set these files, please let me know.
>
> As I understand it, if the PET parser works properly, it should handle a small grammar such as ‘g8gap’ as well as the ERG. (Of course, I think that the PET system can also handle my own grammar, if the PET parser works properly.) However, there are several differences between the grammar files in ‘g8gap’ and those of the ERG. Therefore, I want to know how to set the files (flop.set, global.set, pet.set, and qc.set) so that the PET parser can use ‘g8gap’ to process English sentences.
>
> My questions are as follows:
>
> [1] If the grammar itself does not fit into the format of Grammar Matrix (for example, ‘g8gap’), is it impossible to run the grammar using the PET parser? If possible, how can I set the files (flop.set, global.set, pet.set, and qc.set) in order to see the MRS structure in the results of the ‘cheap’ parsing?
As long as the grammar is not wildly different from a Matrix-style
grammar, it should be possible. Note that PET doesn't handle defaults,
so if your grammar uses them you cannot use PET.
> [2] The grammar ‘g8gap’ (lexicon.tdl) contains ‘SEM.RELS.LIST.FIRST.PRED’ for relations in the MRS structure. The ‘pet.set’ file, however, specifies ‘label-path := "SYNSEM.LKEYS.--KEYREL.WLINK"’ and ‘label-path-tail := "WLINK"’. If I change the specification to ‘label-path := "ORTH.LIST.FIRST"’ and ‘label-path-tail := "ORTH.LIST.FIRST"’, does it work or not? If not, why? What is the function of ‘WLINK’?
I think WLINK is not used any more. Which version of the ERG are you
looking at? It may be easier to look at a newer, simpler grammar, like
the KRG2:
http://krg.khu.ac.kr/
> [3] How can I set the parameter for ‘qc-structure :=’ in the ‘pet.set’ file?
qc-structure := $qc_unif_set_pack.
seems to be the most popular setting, but I think you can just comment it out.
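As a rough sketch of what commenting it out could look like in pet.set,
assuming that a semicolon starts a comment in PET settings files as it
does in TDL (please check this against your own files):

```text
;; quick check switched off for a small grammar
;; qc-structure := $qc_unif_set_pack.
```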
> [4] What is the function of ‘keyarg-marker-path’ and how can I set this parameter in the ‘global.set’ file? (The grammar ‘g8gap’ does not contain the attribute/feature ‘KEY-ARG’)
This is used for optimising the parser. If you don't have KEY-ARG
set in your grammar, then you can comment it out.
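For illustration only, the commented-out line in global.set might look
like the sketch below. The value "KEY-ARG" is taken from the feature
name in your question and the semicolon comment syntax is an assumption,
so keep whatever value your own file already has:

```text
;; no KEY-ARG feature in this grammar, so the marker path is disabled
;; keyarg-marker-path := "KEY-ARG".
```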
I don't know the best citation for a deeper explanation, perhaps
someone else can help us out. There is some discussion at
<http://wiki.delph-in.net/moin/JacyPerformance> but it assumes you
know a lot already.
> [5] The grammar ‘g8gap’ (lexicon.tdl) contains ‘ORTH.LIST.FIRST’ for orthography. The ‘global.set’ file, however, specifies ‘orth-path := STEM’. If I change the specification as in ‘orth-path := ORTH.LIST.FIRST’, does it work or not? If not, why?
It depends where the orthographic string is in your grammar. In Jacy
it is STEM, but it can differ for different grammars. If your
grammar were available online, we could perhaps take a look.
> [6] What is the function of ‘special-name-cons’ and the type ‘cons’ in the ‘flop.set’ file? (The grammar ‘g8gap’ does not contain the type ‘cons’)
It is an internal function which you should not have to worry about
(it should not be exposed in flop.set in my opinion).
> [7] How can I set the setting parameter for ‘pseudo-types :=’ in the ‘flop.set’ file?
I think you can leave this undefined.
> [8] What is the function of ‘qc.tdl’ and how can I set this file?
qc is the quick check (called check paths in the LKB); see:
http://wiki.delph-in.net/moin/JacyPerformance
For a small grammar you don't need it (just comment it out).
> Before writing to you, I tried to find answers to the above questions in the DELPH-IN wiki and several other websites, but I could not find any. That is why I need your help.
The documentation is still sadly incomplete.
> I want to use ‘g8gap’ to teach my graduate students how to make use of the PET parser. I am very sorry to bother you, but I would appreciate your answers to my questions. Thank you very much for reading my e-mail.
I hope that this helped a little.
--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University