[developers] Hacking the DELPH-IN framework for a null morpheme: a semi-success

Emily M. Bender ebender at uw.edu
Fri May 10 18:48:00 CEST 2019

Thanks, David. We have switched it for a different character and ace is

On Fri, May 10, 2019 at 9:11 AM David Inman <davinman at uw.edu> wrote:

> No, I just ended up falling back to the LKB.
> The ! represents a [+glottalic] component on the preceding segment, and
> it's possible to use another symbol to indicate it if you need ace to work.
> David Inman
> PhD Candidate
> University of Washington Linguistics
> On Fri, May 10, 2019 at 7:07 AM Emily M. Bender <ebender at uw.edu> wrote:
>> We just hit this same problem with the Nuuchahnulth grammar in 567 --- it
>> contains irules like this one:
>> noun-pc105_lrt1-suffix :=
>> %suffix (* =\!iˑ)
>> noun-pc105_lrt1-lex-rule.
>> ... where ace doesn't seem to be respecting the \ escaping the !. David,
>> did you ever find a resolution here?
>> Emily
>> On Fri, Apr 12, 2019 at 12:33 PM David Inman <davinman at uw.edu> wrote:
>>> As Emily indicated, the prefixes always attach first, followed by the
>>> suffixes. The two LKB parses are identical.
>>> I'm having difficulty parsing with Ace, apparently due to special
>>> symbols. Once I introduce clitics with = I get the following:
>>> ERROR: morpho: no such letter set as `a'
>>> I had other problems in ace previously so just switched to LKB.
>>> David Inman
>>> PhD Candidate
>>> University of Washington Linguistics
>>> On Fri, Apr 12, 2019 at 9:44 AM Woodley Packard <sweaglesw at sweaglesw.org>
>>> wrote:
>>>> Hi Stephan,
>>>> I agree that the ambiguity you reference, which David and Emily
>>>> believe they are experiencing and which I also alluded to a few emails ago,
>>>> exists in principle and might be interesting in some situations.  I would
>>>> be surprised to see it result in two identical uninflected lexical edges in
>>>> the chart, however, at least in ACE’s current implementation.  The
>>>> ambiguity arises as a result of two ways to apply the first rule, not from
>>>> two different lexical starting places.  Furthermore I didn’t (?) think ACE
>>>> was set up to generate multiple output edges from a single rule/daughter
>>>> combination, as I maintain would be required for this ambiguity.
>>>> To the question of whether ace blocks that ambiguity: the only
>>>> difference would be in the ORTH feature of the two intermediate edges, I
>>>> suppose.  Until recently ACE did not even write orthographemic changes to
>>>> that feature.  I suppose at the moment it arbitrarily picks one or the
>>>> other.  Just now it seems that approach could theoretically result in
>>>> inadvertently taking an orthographemic dead-end, although I’m not aware of
>>>> that issue ever coming up in practice.  I think I could be persuaded that
>>>> the right thing to do would be to generate both variants (from just one
>>>> source lexical edge), but before implementing that I think an appropriate
>>>> improvement to UDF would be in order.
>>>> Enjoy the train ride!
>>>> Woodley
>>>> On Apr 12, 2019, at 9:15 AM, Stephan Oepen <oe at ifi.uio.no> wrote:
>>>> hi woodley, and all,
>>>> i would have thought our current engines are quite capable of
>>>> generating seemingly duplicate derivations, owing to incomplete information
>>>> being recorded about orthographemic segmentation.  in fact, i believe i
>>>> recall at least JaCY exposing this phenomenon in the LKB and PET.
>>>> assume two orthographemic rules, each with two subrules (which both can
>>>> apply to some stem), maybe something like:
>>>> one :=
>>>> %suffix (* s) (e ss)
>>>> [...].
>>>> two :=
>>>> %suffix (!s !sed) (s ssed)
>>>> [...].
>>>> not sure the above is quite right, but i want to allow two chains
>>>> through these rules that only differ in the internal string segmentation,
>>>> e.g. assuming a stem ‘fore’ and a final surface form ‘foressed’:
>>>> fore one fores two foressed
>>>> fore one foress two foressed
>>>> from what i recall about at least the PET implementation, i would
>>>> expect the above to result in two distinct lexical sub-trees whose
>>>> derivations in current UDX will look alike (and which are guaranteed to
>>>> immediately pack)
>>>> would you expect to block such ambiguity (in ACE)?  if so, on what
>>>> basis, and which of the two variants should prevail?
>>>> —ever since yi and others started retracing (with some effort) the
>>>> exact string-level effects of orthographemic nodes in our derivations i
>>>> have been thinking we should extend UDX and probably both record which
>>>> affixation sub-rule applied, and what the strings at the ‘top’ and the
>>>> ‘bottom’ looked like.
>>>> would this extra information seem adequate and sufficient to you (and
>>>> others)?  if so, i would like to try and work out how to extend the UDX
>>>> syntax, while maintaining backwards compatibility.
>>>> greetings from the train to finse 1222!  oe
>>>> On Fri, 12 Apr 2019 at 06:51 Woodley Packard <sweaglesw at sweaglesw.org>
>>>> wrote:
>>>>> Hi David,
>>>>> If I understand correctly you have a lexical entry whose orthography
>>>>> in the lexicon is “=0” but which only ever appears in combination with the
>>>>> prefix or suffix or both, which lets you cover up the fact that the =0 was
>>>>> ever there.  Sounds reasonable to me.
>>>>> Forgive me if this is too obvious and not what’s going on, but:
>>>>> getting two parses when both suffix and prefix are present seems likely to
>>>>> be caused by unconstrained order of application of those rules.  Have you
>>>>> checked whether the two parses you get have the rules applying in opposite
>>>>> orders?  If so, the solution is simply to constrain things so that one of
>>>>> them cannot consume the other’s output.
>>>>> If on the other hand the two parses have identical derivations then
>>>>> the result is unexpected — at least under the currently used definition of
>>>>> derivation trees.  There have been suggestions that derivations with
>>>>> different internal inflected string values resulting from different
>>>>> subrules of the %prefix and %suffix mechanisms should be considered
>>>>> distinct (and that those should be recorded as part of the derivation
>>>>> tree), but to my knowledge none of our systems supports that yet, nor do I
>>>>> believe a format has been decided upon.
>>>>> Best,
>>>>> Woodley
>>>>> On Apr 11, 2019, at 8:56 PM, David Inman <davinman at uw.edu> wrote:
>>>>> Hello developers,
>>>>> I am using the irules to define a null morpheme by having prefixes and
>>>>> suffixes overwrite a string (=0, 3rd person marking on a clitic complex)
>>>>> when they attach to it. The irules look like this:
>>>>> past-prefix-2 :=
>>>>> %prefix (* =int) (=0 =int)
>>>>> past-lex-rule.
>>>>> clitic-plural-suffix :=
>>>>> %suffix (* =ʔał) (=0 =ʔał)
>>>>> clitic-plural-lex-rule.
>>>>> This works and generates strings that are lacking the =0 morpheme.
>>>>> Except that in the case where both a prefix and a suffix attach, the parser
>>>>> enters two =0 morphemes into the parse chart and will parse it doubly.
>>>>> (This does not happen for contentful roots.) If the =0 has only "suffixes"
>>>>> after it, then I get one parse. If it has only "prefixes" then I also get
>>>>> one parse. I think the parser sees that =0 can be overwritten either by the
>>>>> prefix or the suffix so it hypothesizes it twice. I'm using the morph rules
>>>>> a bit differently than intended, but is this a case that should be
>>>>> supported? Is there any way around this so that I limit the parser's
>>>>> behavior and get one parse?
>>>>> David Inman
>>>>> PhD Candidate
>>>>> University of Washington Linguistics
>> --
>> Emily M. Bender
>> Professor, Department of Linguistics
>> University of Washington
>> Twitter: @emilymbender

Emily M. Bender
Professor, Department of Linguistics
University of Washington
Twitter: @emilymbender
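
The double analysis David describes at the bottom of the thread can be
reproduced in a toy model. The following is only a sketch under stated
assumptions, not how the LKB or ACE actually implement orthographemics:
it treats each %prefix/%suffix subrule as a plain (pattern, replacement)
string rewrite, stands in the empty string for the * wildcard, ignores
letter sets, and applies the prefix rule before the suffix rule, as
David reports. The function names (apply_prefix, apply_suffix, chains)
are hypothetical. Under those assumptions, the stem =0 reaches the
surface form =int=ʔał by two chains that differ only in which rule
overwrites the =0.

```python
from typing import List, Tuple

# A subrule is (pattern, replacement). ASSUMPTION: "" stands in for the
# * wildcard of the %prefix/%suffix notation; letter sets are not modelled.
Subrule = Tuple[str, str]


def apply_prefix(form: str, subrules: List[Subrule]) -> List[Tuple[str, Subrule]]:
    """Return one output per subrule whose pattern matches the start of form."""
    out = []
    for pat, rep in subrules:
        if form.startswith(pat):
            out.append((rep + form[len(pat):], (pat, rep)))
    return out


def apply_suffix(form: str, subrules: List[Subrule]) -> List[Tuple[str, Subrule]]:
    """Return one output per subrule whose pattern matches the end of form."""
    out = []
    for pat, rep in subrules:
        if form.endswith(pat):
            out.append((form[:len(form) - len(pat)] + rep, (pat, rep)))
    return out


def chains(stem: str, surface: str,
           prefix_rule: List[Subrule],
           suffix_rule: List[Subrule]) -> List[Tuple[Subrule, str, Subrule]]:
    """Enumerate prefix-then-suffix chains that turn stem into surface.

    Each result records the prefix subrule used, the intermediate string,
    and the suffix subrule used.
    """
    found = []
    for mid, p_sub in apply_prefix(stem, prefix_rule):
        for final, s_sub in apply_suffix(mid, suffix_rule):
            if final == surface:
                found.append((p_sub, mid, s_sub))
    return found


# The two irules from David's message, transcribed as subrule lists.
PAST_PREFIX = [("", "=int"), ("=0", "=int")]
PLURAL_SUFFIX = [("", "=ʔał"), ("=0", "=ʔał")]

for p_sub, mid, s_sub in chains("=0", "=int=ʔał", PAST_PREFIX, PLURAL_SUFFIX):
    print(p_sub, "->", mid, "->", s_sub)
```

In this model one chain passes through the intermediate string =int=0
(the suffix overwrites =0) and the other through =int (the prefix
overwrites =0). The rule sequence is the same in both, so a derivation
tree that records only rule names cannot tell them apart, which is the
situation Woodley and Stephan discuss.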