<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Good points.  <br>
      <br>
      In this application, as of now, we only send tags within a single
      "basic" part of speech (e.g., NN.*, VB.*, JJ.*, RB.*, DT, IN). 
      I'd like not to be limited to choosing between NN and VBG, for
      example, though.<br>
      <br>
      I guess we could add a feature, within or besides +TNT, for each
      of the PTB tags, + or - (or perhaps, better, w/ a probability)....
      How does that sound?  (This will interact with the type hierarchy
      that the FSC tokenizer uses in PET.  Actually, maybe not: maybe a
      rule per tag in pos.tdl?)<br>
      <br>
      Any further thoughts on multi-token lexemes would be most
      sincerely appreciated.  (I'm assuming that they would be in a
      different cell/context of the chart.)<br>
      <br>
      This is working satisfactorily (preliminarily).<br>
      <br>
      Thanks MUCH!<br>
      Paul<br>
      <br>
      <br>
      On 9/18/2013 11:28 AM, Bec Dridan wrote:<br>
    </div>
    <blockquote
cite="mid:CAKRPO=N6a0+i-pffDVXcmh4onKiAcK-xaxrT-aGWVaYhw6RLwg@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>Hi Paul,<br>
          <br>
        </div>
        <div>People more expert in the chart mapping rules and the
          grammar might want to chime in, but broadly speaking, your
          rule looks like it will work. There are a couple of ways you may
          run into issues:<br>
          <br>
        </div>
        <div> * if you input multiple tags for the same token, rules get
          complicated<br>
        </div>
        <div> * you may get unexpected results when the ERG native token
          is a multi-token entry (like "for example")<br>
        </div>
        <div> * as I said before, sometimes the mapping between PTB and
          ERG types is not what you'd expect<br>
          <br>
        </div>
        <div>But if you are limiting the places and tags where you try
          to restrict, you should be able to come up with a workable
          solution this way, I think.<br>
          <br>
        </div>
        <div>Rebecca<br>
        </div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Wed, Sep 18, 2013 at 5:04 PM, Paul
          Haley <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:paul@haleyai.com" target="_blank">paul@haleyai.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Your comments have been quite helpful in getting me
                headed in what appears to be the right direction...<br>
                <br>
                I now think the default LEs setting (whether none or only
                for gaps) has little bearing as long as a tag is provided (at
                least that is what I am observing in the behavior).<br>
                <br>
                I have modified lfr.tdl as below and confirmed that I no
                longer get the native verbal LE for "array" when I send any
                of NN, NNS, NNPS, NNP (it looks like I need to send $
                instead of S for two of those, though).<br>
                <br>
                What do you think?  Do I need a bunch of the latter?<br>
                <br>
                Thanks again!<br>
                Paul<br>
                <br>
                <br>
                #|
                <div class="im"><br>
                  generic_non_ne+native_lfr := lexical_filtering_rule
                  &amp;<br>
                  [ +CONTEXT &lt; [ SYNSEM.PHON.ONSET con_or_voc ] &gt;,<br>
                    +INPUT &lt; [ SYNSEM.PHON.ONSET unk_onset,
                  ORTH.CLASS non_ne ] &gt;,<br>
                    +OUTPUT &lt; &gt;,<br>
                    +POSITION "I1@C1" ].<br>
                </div>
                |#<br>
                <br>
                exclude_verbal_given_nominal_lfr :=
                lexical_filtering_rule &amp;<br>
                [ +CONTEXT &lt; [ +TNT.+TAGS &lt; ^N.*$ &gt; ]&gt;,<br>
                  +INPUT &lt; [ SYNSEM basic_verb_synsem ] &gt;,
                <div class="im"><br>
                    +OUTPUT &lt; &gt;,<br>
                    +POSITION "I1@C1" ].<br>
                  <br>
                  <br>
                </div>
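                <div>As a rough illustration of what the
                  exclude_verbal_given_nominal_lfr rule above is meant to
                  do, here is a toy Python model. It is not PET or ERG
                  code: the function name, the (name, category) pairs and
                  the "verb"/"noun" labels are all assumptions made for
                  the sketch. It mirrors the one-element +TAGS list and
                  the ^N.*$ pattern from the rule.<br>
<pre>
```python
import re

# Same pattern as in the rule: any PTB tag that starts with N.
NOMINAL_TAG = re.compile(r"^N.*$")


def filter_native_entries(entries, tags):
    """Drop verbal entries when the token carries exactly one nominal tag.

    entries: list of (name, category) pairs, a hypothetical stand-in
             for the native lexical entries in one chart cell.
    tags:    list of PTB tags supplied for the token.
    """
    # The rule matches a one-element tag list, so require exactly one tag.
    if len(tags) == 1 and NOMINAL_TAG.match(tags[0]):
        return [(name, cat) for name, cat in entries if cat != "verb"]
    # Otherwise nothing is filtered.
    return list(entries)


if __name__ == "__main__":
    cell = [("array_n1", "noun"), ("array_v1", "verb")]
    print(filter_native_entries(cell, ["NN"]))
    print(filter_native_entries(cell, ["VB"]))
```
</pre>
                  In this model a token tagged VB keeps both entries,
                  which is the case the question about needing more such
                  rules is asking about.<br>
                </div>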
                <div>
                  <div class="h5"> On 9/18/2013 10:50 AM, Bec Dridan
                    wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div class="h5">
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div>
                        <div>Hi Paul,<br>
                        </div>
                        <div><br>
                          DEFAULT_LES controls when we use the default
                          generics rather than, or possibly alongside,
                          the native entry.<br>
                          The options mean, as far as I understand them:<br>
                          <br>
                          NO_DEFAULT_LES: if there is no native entry,
                          do nothing, ignore tags, parse will fail.<br>
                          DEFAULT_LES_ALL: always create a generic entry
                          from any input POS tags (although these can be
                          filtered out later)<br>
                          DEFAULT_LES_POSGAPS_LEXGAPS: create a generic
                          entry from any input POS tags only where there
                          was no native entry available<br>
                          <br>
                        </div>
                        <div>None of them have anything to do with
                          restricting native entries.<br>
                          <br>
                        </div>
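                        <div>To make the three options concrete, here is
                          a small Python sketch of the behaviour as
                          described above. It only models that
                          description and is not PET's implementation;
                          the enum, the function name and its parameters
                          are assumptions for illustration.<br>
<pre>
```python
from enum import Enum


class DefaultLes(Enum):
    """The three option names from the description above."""
    NO_DEFAULT_LES = "no_default_les"
    DEFAULT_LES_ALL = "default_les_all"
    DEFAULT_LES_POSGAPS_LEXGAPS = "default_les_posgaps_lexgaps"


def generic_entries(option, has_native, pos_tags):
    """Return the POS tags for which a generic entry would be created.

    option:     one of the DefaultLes members.
    has_native: True if the lexicon has a native entry for the token.
    pos_tags:   POS tags supplied on the input for the token.
    """
    if option is DefaultLes.NO_DEFAULT_LES:
        # Tags are ignored; no generic entry is ever created.
        return []
    if option is DefaultLes.DEFAULT_LES_ALL:
        # A generic entry per tag, whether or not a native entry exists.
        return list(pos_tags)
    # DEFAULT_LES_POSGAPS_LEXGAPS: generics only fill lexical gaps.
    return [] if has_native else list(pos_tags)


if __name__ == "__main__":
    for option in DefaultLes:
        print(option.name, generic_entries(option, True, ["NN"]))
```
</pre>
                          Note that has_native only affects whether
                          generics are added; in none of the branches is
                          a native entry removed.<br>
                        </div>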
                        Restricting lexical entries the way you want is
                        generally called supertagging, although the term
                        "supertag" also refers to the fact that the tags
                        generally used in this manner are more
                        fine-grained than standard POS tags.
                        Unfortunately, that's not in the mainstream PET
                        release so far, because it is not that
                        straightforward. There are several development
                        implementations around that might do what you
                        want, but they would all need to be configured
                        to your particular setup. For one thing, the
                        mapping from PTB tags isn't always clear-cut -
                        the ERG lexical entries don't always align
                        exactly with the PTB distinctions and so most
                        (all?) work has been based on restricting by
                        tags related to the lexical entries.  As far as
                        I know, there are no current implementations that
                        can restrict by PTB POS tags, although others
                        might know?<br>
                        <br>
                      </div>
                      Rebecca<br>
                      <div><br>
                        <br>
                        <br>
                        <br>
                        <br>
                      </div>
                    </div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On Wed, Sep 18, 2013 at
                        4:12 PM, Paul Haley <span dir="ltr">&lt;<a
                            moz-do-not-send="true"
                            href="mailto:paul@haleyai.com"
                            target="_blank">paul@haleyai.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div>I should correct my prior message...  <br>
                              <br>
                              It is not that the native LEs are taking
                              precedence, but that native LEs that are
                              not consistent with the input PoS are
                              still being added to the chart.  <br>
                              <br>
                              For example, if I pass in "array" with
                              "NN", I'm still getting array_v1 in the
                              chart.  I want array_n1 in the chart.  So,
                              what I'm after is pruning the native LEs
                              to those that are consistent with the
                              input PoS (or living with the generics in
                              the case of no natives).<br>
                              <br>
                              Does that sound like what you called
                              super-tagging?<span><font color="#888888"><br>
                                  <br>
                                  Paul</font></span>
                              <div>
                                <div><br>
                                  <br>
                                  On 9/18/2013 10:04 AM, Paul Haley
                                  wrote:<br>
                                </div>
                              </div>
                            </div>
                            <div>
                              <div>
                                <blockquote type="cite">
                                  <div>I had that fear, too!  Which is
                                    why I asked.<br>
                                    <br>
                                    I gave it a try with no default
                                    LEs.  To my surprise, the native
                                    lexical entries are still taking
                                    precedence!  (So I must be missing
                                    something.)<br>
                                    <br>
                                    On 9/18/2013 9:42 AM, Bec Dridan
                                    wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">
                                      <div>
                                        <div>
                                          <div>Hi Paul,<br>
                                            <br>
                                          </div>
                                          The POS input to PET is only
                                          designed for unknown word
                                          handling (i.e., when there are no
                                          corresponding ERG LEs, as you
                                          noticed).  It sounds like what
                                          you are after is more like
                                          supertagging, restricting the
                                          lexical types used according
                                          to some tags on the input?
                                          I've played around a bit with
                                          different methods to do that,
                                          but none of them are currently
                                          in the main branch of PET.  <br>
                                          <br>
                                        </div>
                                        What you propose with the
                                        filtering rule will, I think,
                                        force the grammar to use generic
                                        types everywhere, rather than
                                        use what's in the lexicon. I
                                        very much doubt that is what you
                                        want to do?<br>
                                        <br>
                                      </div>
                                      Rebecca<br>
                                    </div>
                                    <div class="gmail_extra"><br>
                                      <br>
                                      <div class="gmail_quote">On Wed,
                                        Sep 18, 2013 at 3:26 PM, Paul
                                        Haley <span dir="ltr">&lt;<a
                                            moz-do-not-send="true"
                                            href="mailto:paul@haleyai.com"
                                            target="_blank">paul@haleyai.com</a>&gt;</span>
                                        wrote:<br>
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex">
                                          <div text="#000000"
                                            bgcolor="#FFFFFF">
                                            <div>Hello,<br>
                                              <br>
                                              I may be making some
                                              conceptual progress on
                                              this...<br>
                                              <br>
                                              I went back to the chart
                                              mapping tutorial (<a
                                                moz-do-not-send="true"
                                                href="http://moin.delph-in.net/Chart_Mapping"
                                                target="_blank">http://moin.delph-in.net/Chart_Mapping</a>)
                                              and found myself looking
                                              at the following lexical
                                              filtering rule from the
                                              ERG's lfr.tdl:<br>
                                              <blockquote> ;; throw out
                                                generic whenever a
                                                native entry is
                                                available, unless the
                                                token is<br>
                                                ;; a named entity (which
                                                now includes names
                                                activated because of
                                                mixed case or<br>
                                                ;; non-sentence-initial
                                                capitalization).<br>
                                                ;;<br>
                                                generic_non_ne+native_lfr
                                                :=
                                                lexical_filtering_rule
                                                &amp;<br>
                                                [ +CONTEXT &lt; [
                                                SYNSEM.PHON.ONSET
                                                con_or_voc ] &gt;,<br>
                                                  +INPUT &lt; [
                                                SYNSEM.PHON.ONSET
                                                unk_onset, ORTH.CLASS
                                                non_ne ] &gt;,<br>
                                                  +OUTPUT &lt; &gt;,<br>
                                                  +POSITION "I1@C1" ].<br>
                                                <br>
                                              </blockquote>
                                              Is it the case that I want
                                              the +CONTEXT and +INPUT to
                                              be exactly reversed with
                                              NO_DEFAULT_LES or
                                              DEFAULT_LES_POSGAPS_LEXGAPS?<br>
                                              <br>
                                              Thank you,<br>
                                              Paul
                                              <div>
                                                <div><br>
                                                  <br>
                                                  On 9/17/2013 4:54 PM,
                                                  Paul Haley wrote:<br>
                                                </div>
                                              </div>
                                            </div>
                                            <div>
                                              <div>
                                                <blockquote type="cite">Hi,
                                                  <br>
                                                  <br>
                                                  It seems that when I
                                                  send FSC w/ TNT tags
                                                  for some but not all
                                                  tokens I get ERG LEs
                                                  that do not satisfy
                                                  the provided tags when
                                                  using any of
                                                  NO_DEFAULT_LES,
                                                  DEFAULT_LES_ALL, or
                                                  DEFAULT_LES_POSGAPS_LEXGAPS. 
                                                  It does respect these
                                                  tags when there are no
                                                  corresponding ERG LEs,
                                                  however, which is
                                                  good. <br>
                                                  <br>
                                                  Is there a way that I
                                                  can get PET w/ the ERG
                                                  to respect the TNT
                                                  tags when provided but
                                                  otherwise use the ERG
                                                  LEs? <br>
                                                  <br>
                                                  Thank you, <br>
                                                  Paul <br>
                                                  <br>
                                                </blockquote>
                                                <br>
                                              </div>
                                            </div>
                                          </div>
                                        </blockquote>
                                      </div>
                                      <br>
                                    </div>
                                  </blockquote>
                                  <br>
                                </blockquote>
                                <br>
                              </div>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>