[developers] Hacking the DELPH-IN framework for a null morpheme: a semi-success

Stephan Oepen oe at ifi.uio.no
Fri Apr 12 18:15:21 CEST 2019


hi woodley, and all,

i would have thought our current engines are quite capable of generating
seemingly duplicate derivations, owing to incomplete information being
recorded about orthographemic segmentation.  in fact, i believe i recall at
least JaCY exposing this phenomenon in the LKB and PET.

assume two orthographemic rules, each with two subrules (which both can
apply to some stem), maybe something like:

one :=
%suffix (* s) (e ss)
[...].


two :=
%suffix (!s !sed) (s ssed)
[...].

not sure the above is quite right, but i want to allow two chains through
these rules that only differ in the internal string segmentation, e.g.
assuming a stem ‘fore’ and a final surface form ‘foressed’:

fore one fores two foressed
fore one foress two foressed
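
a minimal python sketch of the situation, under loud assumptions: it is a toy model, not how the LKB, PET, or ACE implement %suffix. it ignores letter sets and any precedence among subrules, and simply tries every subrule. the subrules in RULES are hypothetical stand-ins (not the ones quoted above), chosen so that the two chains really do arise in this simplified model.

```python
# Toy model of suffixing subrules: each subrule is (pattern, replacement),
# and "" plays the role of "*" (the empty suffix). Hypothetical rules only.
RULES = {
    "one": [("", "s"), ("", "ss")],
    "two": [("s", "ssed"), ("", "ed")],
}


def apply_subrule(form, pattern, replacement):
    """Rewrite the end of `form`, or return None if the pattern does not match."""
    if not form.endswith(pattern):
        return None
    return form[: len(form) - len(pattern)] + replacement


def chains(stem, rule_names, surface):
    """Enumerate every way to get from `stem` to `surface` via `rule_names`."""
    results = []

    def walk(form, remaining, trace):
        if not remaining:
            if form == surface:
                results.append(trace)
            return
        name = remaining[0]
        for index, (pattern, replacement) in enumerate(RULES[name]):
            out = apply_subrule(form, pattern, replacement)
            if out is not None:
                # record rule name, subrule index, bottom string, top string
                walk(out, remaining[1:], trace + [(name, index, form, out)])

    walk(stem, list(rule_names), [])
    return results


found = chains("fore", ["one", "two"], "foressed")
for trace in found:
    print(" ".join(f"{name}#{index}:{bottom}>{top}"
                   for name, index, bottom, top in trace))
```

both traces use the rule sequence one, two and end in the same surface string; they differ only in the subrule indices and in the intermediate string, which is exactly the information that a derivation recording only rule names would lose.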

from what i recall about at least the PET implementation, i would expect
the above to result in two distinct lexical sub-trees whose derivations in
current UDX will look alike (and which are guaranteed to immediately pack).

would you expect to block such ambiguity (in ACE)?  if so, on what basis,
and which of the two variants should prevail?

—ever since yi and others started retracing (with some effort) the exact
string-level effects of orthographemic nodes in our derivations i have been
thinking we should extend UDX and probably both record which affixation
sub-rule applied, and what the strings at the ‘top’ and the ‘bottom’ looked
like.
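
to make the idea concrete, here is a small python sketch of an in-memory record carrying that extra information. everything in it is hypothetical: the class name OrthNode, the field names subrule, bottom, and top, and the subrule numbers are illustrative only, and none of this is a proposal for the actual UDX syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OrthNode:
    """Hypothetical derivation node; the last three fields are the extension."""
    rule: str                              # rule name, or the stem at the leaf
    daughter: Optional["OrthNode"] = None
    subrule: Optional[int] = None          # which affixation subrule applied
    bottom: Optional[str] = None           # string below this node
    top: Optional[str] = None              # string above this node

    def plain(self):
        """Only rule names and structure, ignoring the extra fields."""
        below = self.daughter.plain() if self.daughter else None
        return (self.rule, below)


def chain(stem, steps):
    """Build a unary chain from (rule, subrule, top) steps over a stem."""
    node = OrthNode(rule=stem, top=stem)
    for rule, subrule, top in steps:
        node = OrthNode(rule, node, subrule, node.top, top)
    return node


first = chain("fore", [("one", 0, "fores"), ("two", 0, "foressed")])
second = chain("fore", [("one", 1, "foress"), ("two", 1, "foressed")])

print(first.plain() == second.plain())  # rule names alone
print(first == second)                  # with subrule and strings recorded
```

with only rule names the two chains compare equal; once the subrule index and the bottom and top strings are part of the node, they compare as distinct.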

would this extra information seem adequate and sufficient to you (and
others)?  if so, i would like to try and work out how to extend the UDX
syntax, while maintaining backwards compatibility.

greetings from the train to finse 1222!  oe




On Fri, 12 Apr 2019 at 06:51 Woodley Packard <sweaglesw at sweaglesw.org>
wrote:

> Hi David,
>
> If I understand correctly you have a lexical entry whose orthography in
> the lexicon is “=0” but which only ever appears in combination with the
> prefix or suffix or both, which lets you cover up the fact that the =0 was
> ever there.  Sounds reasonable to me.
>
> Forgive me if this is too obvious and not what’s going on, but:  getting
> two parses when both suffix and prefix are present seems likely to be
> caused by unconstrained order of application of those rules.  Have you
> checked whether the two parses you get have the rules applying in opposite
> orders?  If so, the solution is simply to constrain things so that one of
> them cannot consume the other’s output.
>
> If on the other hand the two parses have identical derivations then the
> result is unexpected — at least under the currently used definition of
> derivation trees.  There have been suggestions that derivations with
> different internal inflected string values resulting from different
> subrules of the %prefix and %suffix mechanisms should be considered
> distinct (and that those should be recorded as part of the derivation
> tree), but to my knowledge none of our systems supports that yet, nor do I
> believe a format has been decided upon.
>
> Best,
> Woodley
>
> On Apr 11, 2019, at 8:56 PM, David Inman <davinman at uw.edu> wrote:
>
> Hello developers,
>
> I am using the irules to define a null morpheme by having prefixes and
> suffixes overwrite a string (=0, 3rd person marking on a clitic complex)
> when they attach to it. The irules look like this:
>
> past-prefix-2 :=
> %prefix (* =int) (=0 =int)
> past-lex-rule.
>
> clitic-plural-suffix :=
> %suffix (* =ʔał) (=0 =ʔał)
> clitic-plural-lex-rule.
>
> This works and generates strings that are lacking the =0 morpheme. Except
> that in the case where both a prefix and a suffix attach, the parser enters
> two =0 morphemes into the parse chart and will parse it doubly. (This does
> not happen for contentful roots.) If the =0 has only "suffixes" after it,
> then I get one parse. If it has only "prefixes" then I also get one parse.
> I think the parser sees that =0 can be overwritten either by the prefix or
> the suffix so it hypothesizes it twice. I'm using the morph rules a bit
> differently than intended, but is this a case that should be supported? Is
> there any way around this so that I limit the parser's behavior and get one
> parse?
>
> David Inman
> PhD Candidate
> University of Washington Linguistics
>
>