[developers] Using the ERG for research

Kelly, Matthew Alexander mak582 at psu.edu
Tue Jul 2 22:34:18 CEST 2019


My thanks to both of you.

I managed to get both of your examples working.

Unfortunately, the generate function doesn’t reliably generate the prepositional object variation given the double object version of a sentence, or vice versa. Sometimes it only generates passive and active voice variations.

I think my best approach may be to not use the generate function and simply use the parse to identify the sentence structure and then swap the order of the NPs for the two objects, inserting or removing a “to” between the noun phrases as necessary.
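
A minimal sketch of that swap, assuming the character spans of the two object NPs are already known (for example, from the parse). The names swap_dative_objects and span_of are hypothetical helpers for illustration, not part of PyDelphin or ACE, and the sketch only handles the simple case where the two NPs are adjacent or separated by a bare "to":

```python
def span_of(sentence, phrase):
    """Return the (start, end) character span of the first match of phrase."""
    start = sentence.index(phrase)
    return (start, start + len(phrase))


def swap_dative_objects(sentence, first_np, second_np):
    """Swap two object NPs given as (start, end) character spans.

    If the NPs are separated by "to", the sentence is treated as a
    prepositional object dative and the "to" is removed; if they are
    adjacent, it is treated as a double object dative and "to" is added.
    """
    (s1, e1), (s2, e2) = first_np, second_np
    if not (0 <= s1 < e1 <= s2 < e2 <= len(sentence)):
        raise ValueError('spans must be ordered and non-overlapping')
    np1, np2 = sentence[s1:e1], sentence[s2:e2]
    between = sentence[e1:s2].strip()
    if between == 'to':
        middle = ' '        # prepositional object -> double object
    elif between == '':
        middle = ' to '     # double object -> prepositional object
    else:
        raise ValueError('unexpected material between the NPs: %r' % between)
    return sentence[:s1] + np2 + middle + np1 + sentence[e2:]


sent = 'The gracious hostess offered the special guest her seat.'
print(swap_dative_objects(sent,
                          span_of(sent, 'the special guest'),
                          span_of(sent, 'her seat')))
```

Modifiers that follow the second NP, or a recipient introduced by a preposition other than "to", would need extra handling.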

Is there a reference page that explains the syntax of the parse tree and how to traverse and manipulate it using Python, as in the find_v_dat_dlr function below? I’d like to identify the boundaries of the two noun phrases at the end of each sentence.

Thanks again,

Matthew.
-----
Matthew A. Kelly, PhD
E370 Westgate Building
College of Information Science and Technology
The Pennsylvania State University


On Jul 2, 2019, at 6:05 AM, goodman.m.w at gmail.com wrote:

Hello Matthew,

I can offer a prototype using PyDelphin v0.9.2. The following Python 3 code assumes you have PyDelphin (https://github.com/delph-in/pydelphin) and ACE (http://sweaglesw.org/linguistics/ace/) installed and a copy of the ERG .dat file (from ACE's website).

First, if you want to detect types in the syntax like 'v_dat_dlr', you probably want to look through the derivation tree. The following function descends through the tree and looks for the type:

    from delphin import derivation

    def find_v_dat_dlr(d):
        if isinstance(d, derivation.UdfNode):
            if d.entity == 'v_dat_dlr':
                return d.terminals()
            else:
                xs = []
                for dtr in d.daughters:
                    xs.extend(find_v_dat_dlr(dtr))
                return xs
        else:
            return []

This function could probably be improved but it seems to work for now.
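
One possible generalization, as a rough sketch: take the entity name as a parameter and return the (start, end) span of every matching node rather than its terminals. The function find_spans and the stand-in Node type are hypothetical; the sketch assumes only that nodes expose entity, start, end, and daughters attributes, as PyDelphin's UdfNode does. Unlike the function above, it keeps descending below a match. Note that the spans in a derivation are most likely chart (token) positions rather than character offsets, so check this against your own output.

```python
from collections import namedtuple

# Stand-in for a derivation node so the sketch runs without PyDelphin;
# with PyDelphin you would pass the derivation object itself.
Node = namedtuple('Node', 'entity start end daughters')


def find_spans(node, entity):
    """Return (start, end) for every node in the tree labeled `entity`."""
    spans = []
    if getattr(node, 'entity', None) == entity:
        spans.append((node.start, node.end))
    # Terminals have no daughters, so fall back to an empty sequence.
    for dtr in getattr(node, 'daughters', None) or ():
        spans.extend(find_spans(dtr, entity))
    return spans


tree = Node('root', 0, 3, [
    Node('other', 0, 1, []),
    Node('v_dat_dlr', 1, 2, []),
])
print(find_spans(tree, 'v_dat_dlr'))
```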

Now if you want to parse a sentence, generate from its MRS, and inspect each derivation tree, you can do this:

    from delphin.interfaces import ace

    grm = '/home/mwg/grammars/erg-2018-x86-64-0.9.30.dat'  # adjust as necessary
    sent = 'The gracious hostess offered the special guest her seat.'
    parse_response = ace.parse(grm, sent)
    first_result = parse_response.result(0)
    mrs = first_result['mrs']

    gen_response = ace.generate(grm, mrs)
    if gen_response:
        for result in gen_response.results():
            drv = result.derivation()
            dative_terminals = find_v_dat_dlr(drv)
            if dative_terminals:
                print(result['surface'])
                print('  ({})'.format(
                    ', '.join(tml.form for tml in dative_terminals)))
                print()

Running this prints matching sentences and the token that went through the dative lexical rule.

    Her seat was offered by the gracious hostess to the special guest.
      (offered)

    The gracious hostess offered her seat to the special guest.
      (offered)

I hope that is enough to get started.

On Tue, Jul 2, 2019 at 3:59 AM Alexandre Rademaker <arademaker at gmail.com> wrote:
Hi Matthew,

Maybe someone with more experience than me can add something here, but as far as I remember from the LREC tutorial in 2016, we can easily make a pipeline of two ACE calls.

I also found this page: http://moin.delph-in.net/AceUse


$ cat test.txt
The gracious hostess offered the special guest her seat.

$ cat test.txt  | ~/hpsg/ace/ace -g ~/hpsg/ace/erg.dat -Tf1 | ~/hpsg/ace/ace -g ~/hpsg/ace/erg.dat -e
NOTE: 1 readings, added 2940 / 844 edges to chart (306 fully instantiated, 210 actives used, 238 passives used) RAM: 7523k
NOTE: parsed 1 / 1 sentences, avg 7523k, time 0.04242s
Her seat was offered by the gracious hostess to the special guest.
The gracious hostess offered her seat to the special guest.
The special guest was offerred her seat by the gracious hostess.
The gracious hostess offered the special guest her seat.
NOTE: 1063 passive, 833 active edges in final generation chart; built 1264 passives total. [4 results]

NOTE: generated 1 / 1 sentences, avg 13582k, time 0.10198s
NOTE: transfer did 1025 successful unifies and 992 failed ones


Using PyDelphin (http://pydelphin.readthedocs.io) may make it easier to collect the outputs and control the interaction.
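
For batch use without PyDelphin, the same two-step pipeline could be driven from Python with the standard subprocess module. This is only a sketch: the helper names are hypothetical, the paths are placeholders, and the flags (-g, -Tf1, -e) are simply the ones from the shell command above. It assumes ACE's NOTE lines go to stderr, and filters them from stdout as well in case they do not.

```python
import subprocess


def ace_commands(ace_bin, grammar):
    """Build the argument lists for the parse step and the generate step."""
    parse_cmd = [ace_bin, '-g', grammar, '-Tf1']
    gen_cmd = [ace_bin, '-g', grammar, '-e']
    return parse_cmd, gen_cmd


def clean_output(text):
    """Keep non-empty lines that are not ACE status messages."""
    return [line for line in text.splitlines()
            if line.strip() and not line.startswith('NOTE:')]


def paraphrase(sentence, ace_bin, grammar):
    """Parse one sentence, then generate from the resulting MRS."""
    parse_cmd, gen_cmd = ace_commands(ace_bin, grammar)
    parsed = subprocess.run(parse_cmd, input=sentence + '\n',
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            text=True, check=True)
    generated = subprocess.run(gen_cmd, input=parsed.stdout,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL,
                               text=True, check=True)
    return clean_output(generated.stdout)
```

Calling paraphrase(sentence, ace_bin, grammar) once per input sentence, with your own paths to the ACE binary and the ERG .dat file, would then return the generated surface strings as a list.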


Best,


--
Alexandre Rademaker
http://arademaker.github.io


> On 1 Jul 2019, at 13:34, Ann Copestake <aac10 at cl.cam.ac.uk> wrote:
>
> in case anyone has time to respond to Matthew
>
>
> -------- Forwarded Message --------
> Subject:      Using the ERG for research
> Date: Fri, 28 Jun 2019 17:12:35 -0400
> From: Matthew Kelly <mak582 at psu.edu>
> To:   lingo at delph-in.net
>
>
> Dear LinGO lab,
>
> I'm a post-doctoral researcher at Penn State looking to use your ERG software package to convert a few thousand dative double object sentences into dative prepositional object sentences and vice versa.
>
> For example, I'd like to convert the prepositional object sentence "The gracious hostess offered her seat to the special guest." to the double object sentence "The gracious hostess offered the special guest her seat."
>
> I am able to perform this conversion by using the online interface to "analyze" a sentence and then pressing "generate" to produce alternatives. How could I do this in batch, automatically?
>
> Thank you,
> Matthew.
>
> --
> Matthew A. Kelly, Ph.D.
> E370 Westgate Building
> College of Information Sciences and Technology
> The Pennsylvania State University
>




--
-Michael Wayne Goodman
