<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    That could be helpful - thanks!  <br>
    <br>
    I would like to see a decoupling of the predicate normalisation from
    the question of what ACE or the LKB or whatever does.  I think
    predicate normalisation can perfectly well be treated as a *MRS
    &lt;-&gt; *MRS conversion which could be provided by an external
    tool.  The realiser itself should work with whatever predicates come
    out of the unknown word mechanism.  In fact, I think that for what
    Alex is doing, not normalising would be perfectly fine.<br>
    <br>
    That said, it would be good to provide a method for predicate
    normalisation using the rules the grammar writer defines already in
    conjunction with a large stem list, defaulting to no stemming for
    words which aren't on the list.  My reasoning is:<br>
    <br>
    - we want a solution which works for languages other than English<br>
    <br>
    - as far as possible, we want this to be under grammar writer
    control.  e.g., in the case where the grammar writer finds a
    particular stem which is treated incorrectly, they can add it to the
    list appropriately (or the irregs file).<br>
    <br>
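    <div>A minimal sketch of this list-backed normalisation, written in
      Python for illustration: candidate stems come from the grammar's
      own rules, the stem list chooses among them, and a form with no
      listed candidate is returned unstemmed.  The function names, the
      toy -ed rule and the sample entries are assumptions made for this
      example, not part of any existing DELPH-IN tool.</div>
    <pre>
```python
def strip_ed(form):
    """Toy stand-in for a grammar-defined past-tense rule (hypothetical)."""
    if not form.endswith("ed"):
        return []
    # Undo -d, -ed, and -ed plus a doubled consonant, in that order.
    return [form[:-1], form[:-2], form[:-3]]


def normalise_form(form, candidate_stems, stem_list, irregs=None):
    """Pick a stem for `form`, falling back to the unchanged form.

    candidate_stems: function mapping a surface form to candidate stems
    stem_list:       set of known stems (grammar-writer controlled)
    irregs:          optional dict of irregular form -> stem overrides
    """
    irregs = irregs or {}
    if form in irregs:
        return irregs[form]
    for stem in candidate_stems(form):
        if stem in stem_list:
            return stem
    # No candidate is on the list: default to no stemming.
    return form


stems = {"combust"}
print(normalise_form("combusted", strip_ed, stems))
print(normalise_form("zanned", strip_ed, stems))
print(normalise_form("went", strip_ed, stems, {"went": "go"}))
```
</pre>
    <div>Because the rule is passed in as a function, the same wrapper
      would work for a grammar of any language; only the rule and the
      lists change.</div>
    <br>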
    The main applications I can see for normalisation are cases where
    there is some external resource of some description which needs
    stem-based predicates - e.g., automatically created transfer rules. 
    I think it's only needed for regeneration in the situation where one
    needs to change tense or plurality, etc., and where the predicate
    normalisation is part of the mechanism for telling the morphology
    generation what the stem is.  <br>
    <br>
    I may well be missing something, though.  It's one of these cases
    where I remember being involved in a discussion, possibly in Jerez,
    but not the content of the discussion.<br>
    <br>
    All best,<br>
    <br>
    Ann<br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 08/02/2016 16:14, John Carroll
      wrote:<br>
    </div>
    <blockquote
      cite="mid:CCCBD12B-45B7-4DEA-9603-BDF61052B756@sussex.ac.uk"
      type="cite">Would the morpha and morphg tools at <a
        moz-do-not-send="true"
        href="http://users.sussex.ac.uk/%7Ejohnca/morph.html">http://users.sussex.ac.uk/~johnca/morph.html</a>
      be appropriate for predicate normalisation for parsing and
      generation? They are inverses of each other, i.e.
      <div><br>
      </div>
      <div>
        <div>$ echo "zanned_VBD" | ./morpha.ix86_darwin -actf
          verbstem.list</div>
        <div>zann+ed_VBD<br>
          $ echo zann+ed_VBD | ./morphg.ix86_darwin -ctf verbstem.list<br>
          zanned_VBD<br>
          <br>
        </div>
        <div><br>
        </div>
        <div>John</div>
        <div><br>
          <div>
            <div>On 7 Feb 2016, at 10:38, Ann Copestake wrote:</div>
            <br class="Apple-interchange-newline">
            <blockquote type="cite">
              <div bgcolor="#FFFFFF" text="#000000"> for realization at
                least, isn't it adequate to use a lemma list extracted
                from (say) WordNet to support predicate normalisation? 
                <br>
                <br>
                But, the application that Alex is interested in is a
                form of regeneration.  So I think that as long as the
                generator accepts what the parser outputs for unknown
                words, it really doesn't matter whether or not it's
                normalised.  I don't know whether or not anyone is using
                the realiser for applications which are broad-coverage
                (hence need unknown words) and where the *MRS is
                constructed from scratch (hence need to use lemmas for
                the predicates).  Excluding MT, of course.<br>
                <br>
                All best,<br>
                <br>
                Ann<br>
                <br>
                <div class="moz-cite-prefix">On 07/02/2016 10:15,
                  Stephan Oepen wrote:<br>
                </div>
                <blockquote
cite="mid:CA+_Fm6JCwB9Rb9W7yaA97ccU=gJzJFGEjqvfEP1Nv=H05vaUKw@mail.gmail.com"
                  type="cite">there actually are two separate
                  mechanisms to discuss: (a) lexical instantiation for
                  unknown predicates (in realization) and (b) predicate
                  normalization for unknown words (in parsing).
                  <div><br>
                  </div>
                  <div>as for (a), i find the current LKB mechanism
                    about as generic as i can imagine (and consider
                    appropriate).  the grammar provides an inventory of
                    generic lexical entries for realization (these are
                    in part distinct from the parsing ones, in the ERG,
                    because the strategies for dealing with inflection
                    are different).  for each such entry, the
                    grammar declares which MRS predicate activates it
                    and how to determine its orthography.  the former is
                    accomplished via a regular expression, e.g.
                    something like /^named$/ or /^_([^_]+)/.  the latter
                    either comes from the (unique) parameter of the
                    relation with the unknown predicate (CARG in the
                    ERG) or from the part of the predicate matched
                    as the above capture group (the lemma field).  there
                    is no provision for generic lexical entries with
                    decomposed semantics (in realization).</div>
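                  <div>The lookup described in (a) could be sketched in
                    Python roughly as follows.  The two patterns are the
                    ones quoted above; the entry names and the function
                    are hypothetical and do not reflect the actual LKB
                    or ERG code.</div>
                  <pre>
```python
import re

# Hypothetical inventory: (trigger pattern, entry name, orthography source).
# Only the two regular expressions come from the message; the rest is invented.
GENERIC_ENTRIES = [
    (re.compile(r"^named$"), "generic_proper_noun", "carg"),
    (re.compile(r"^_([^_]+)"), "generic_open_class", "lemma"),
]


def instantiate(predicate, carg=None):
    """Return (entry name, orthography) for the first matching entry, else None."""
    for pattern, entry, source in GENERIC_ENTRIES:
        match = pattern.search(predicate)
        if match is None:
            continue
        if source == "carg":
            # Orthography comes from the relation's unique parameter (CARG).
            if carg is None:
                continue
            return entry, carg
        # Orthography comes from the capture group, i.e. the lemma field.
        return entry, match.group(1)
    return None


print(instantiate("named", carg="Kim"))
print(instantiate("_zann_v_unknown_rel"))
print(instantiate("udef_q_rel"))
```
</pre>
                  <div>In this sketch a predicate that matches no
                    pattern simply yields no generic entry, so nothing
                    is instantiated for it.</div>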
                  <div><br>
                  </div>
                  <div>regarding (b), the ERG in parsing outputs
                    predicates like the ones alex had noticed.  these
                    are not fully normalized because there is no
                    reliable lemmatization facility for unknown
                    words inside the parser (and, thus, generic entries
                    for parsing predominantly are full forms).  what is
                    recorded in the ‘lemma’ field is the actual surface
                    form, concatenated with the PoS that activated the
                    generic entry.  the ERG provides a mechanism for
                    post-parsing normalization, again in mostly
                    declarative and general form: triggered by regular
                    expressions looking for PTB PoS tags in predicate
                    names, an orthographemic rule of the grammar can
                    (optionally) be invoked on the remainder of the
                    ‘lemma’ field.  if i recall correctly, we
                    ‘disambiguate’ lemmatization naïvely and take the
                    first output from the set of matches of that rule.
                     the resulting string is injected into a
                    predicate template, e.g. something like
                    "_~a_n_unknown_rel".</div>
                  <div><br>
                  </div>
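                  <div>A rough Python sketch of the post-parsing
                    normalization in (b), under stated assumptions: the
                    predicate pattern, the verb template and the toy
                    past-tense rule are all invented for illustration
                    (the message only gives the noun template), and the
                    candidate order is arbitrary.  Taking the first
                    candidate mirrors the naïve disambiguation
                    mentioned above.</div>
                  <pre>
```python
import re

# Surface form, "/", PTB tag, then the unknown-word suffix (pattern is assumed).
UNKNOWN = re.compile(r"^_([^_/]+)/([A-Z$]+)_u_unknown(_rel)?$")


def past_tense_stems(form):
    """Toy stand-in for an orthographemic past-tense rule (hypothetical)."""
    if not (len(form) > 3 and form.endswith("ed")):
        return []
    base = form[:-2]
    candidates = [form[:-1], base]        # undo -d, then undo -ed
    if base[-1] == base[-2]:
        candidates.append(base[:-1])      # undo consonant doubling
    return candidates


# PTB tag -> (predicate template, rule); both entries are assumptions.
RULES = {
    "VBD": ("_{}_v_unknown_rel", past_tense_stems),
    "VBN": ("_{}_v_unknown_rel", past_tense_stems),
}


def normalise(predicate):
    """Rewrite an unknown-word predicate, or return it unchanged."""
    match = UNKNOWN.match(predicate)
    if match is None:
        return predicate
    form, tag = match.group(1), match.group(2)
    if tag not in RULES:
        return predicate
    template, rule = RULES[tag]
    stems = rule(form)
    if not stems:
        return predicate
    # Naive disambiguation: take the first candidate the rule proposes.
    return template.format(stems[0])


print(past_tense_stems("zanned"))
print(normalise("_zanned/VBD_u_unknown"))
print(normalise("_porcelain/NN_u_unknown_rel"))
```
</pre>
                  <div>With this toy rule the first candidate is not
                    always the intended stem, which is the kind of
                    heuristic behaviour at issue here.</div>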
                  <div>i believe, at the time, i did not
                    want to enable predicate normalization as part of
                    the standard parsing set-up because of its heuristic
                    (naïve disambiguation) nature.  for an input of,
                    say, ‘they zanned’, our current parsers have no
                    knowledge beyond the surface form and its tag VBD;
                    hence, we provide what we know as
                    ‘_zanned/VBD_u_unknown’.  the past tense
                    orthographemic rule of the ERG will hypothesize
                    three candidate stems (‘zanne’, ‘zann’, or ‘zan’).
                     it would require more information than is in the
                    grammar to do a better job of lemmatization than my
                    current heuristic.</div>
                  <div><br>
                  </div>
                  <div>—having refreshed my memory of the issues, i
                    retract my suggestion to enable predicate
                    normalization (in its current form) in MRS
                    construction after parsing.  i wish someone would
                    work on providing a broader-coverage solution to
                    this problem.  but we have added an input fix-up
                    transfer step to realization in the meantime,
                    and that would seem like a good place for heuristic
                    predicate normalization, for the time being.  it
                    would enable round-trip parsing and generation, yet
                    preserve exact information in parser outputs for
                    someone to put a better normalization module there.</div>
                  <div><br>
                  </div>
                  <div>best wishes, oe</div>
                  <div><br>
                  </div>
                  <div><br>
                    On Sunday, February 7, 2016, Woodley Packard &lt;<a
                      moz-do-not-send="true"
                      class="moz-txt-link-abbreviated"
                      href="mailto:sweaglesw@sweaglesw.org">sweaglesw@sweaglesw.org</a>&gt;
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div style="word-wrap:break-word">Hello Alex,
                        <div><br>
                        </div>
                        <div>This is a corner of the generation game
                          that is not yet implemented in ACE.  It’s been
                          on the ToDo list for years but nobody has
                          bugged me about it so it has been sitting at
                          low priority.  As Stephan mentioned, the
                          mechanism to make it work in the LKB is both
                          somewhat fiddly and covered in a few cobwebs,
                          so I had somewhat aloofly hoped that over the
                          years someone would have straightened things
                          out to where generation from unknown
                          predicates had a canonical approach (e.g.
                          implemented for multiple grammars or multiple
                          platforms).  I would be interested to hear
                          whether Glenn Slayden (who is on this list)
                          has implemented this in the Agree generator?</div>
                        <div><br>
                        </div>
                        <div>I’m willing to put the hour or two it would
                          take to make this work, but wonder if other
                          DELPH-IN developers/grammarians have ideas
                          about ways in which the current setup (as
                          implemented in the ERG’s custom lisp code that
                          patches into the LKB, if memory serves) could
                          be improved upon in the process?</div>
                        <div><br>
                        </div>
                        <div>Regards,</div>
                        <div>-Woodley</div>
                        <div><br>
                          <div>
                            <blockquote type="cite">
                              <div>On Feb 6, 2016, at 2:48 AM, Alexander
                                Kuhnle &lt;<a moz-do-not-send="true"
                                  class="moz-txt-link-abbreviated"
                                  href="mailto:aok25@cam.ac.uk">aok25@cam.ac.uk</a>&gt;

                                wrote:</div>
                              <br>
                              <div>
                                <div
style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">Dear

                                    all,</div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"> </div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">We

                                    came across the problem of
                                    generating from MRS involving
                                    unknown words, for instance, in the
                                    sentence “I like porcelain.”
                                    (parsing gives
                                    "_porcelain/NN_u_unknown_rel"). Is
                                    there an option for ACE so that
                                    these cases can be handled?</div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">Moreover,

                                    we came across the example “The
                                    phosphorus self-combusts.” vs ?“The
                                    phosphorus is self-combusted.” Where
                                    the first doesn’t parse, the second
                                    does, but doesn’t generate (again
                                    presumably because of
                                    "_combusted/VBN_u_unknown_rel"). It
                                    seems to not recognise verbs with a
                                    “self-” prefix, but does for past
                                    participles.</div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"> </div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">Many

                                    thanks,</div>
                                  <div style="margin:0cm 0cm
                                    0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">Alex</div>
                                </div>
                              </div>
                            </blockquote>
                          </div>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                </blockquote>
                <br>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>