<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Sorry, I forgot one more point:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">10. Disallow unification of strings or regexes with anything</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">This follows a conversation on Emily's student list about strings being primitive types. Currently they are just one term that's possible in a conjunction, so the syntax allows this:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"> a := b & [ ATTR "string" & type & < list, ... > & "another string" ].</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">This also applies to ^regex$ patterns. In the other thread we concluded that strings are of type 'string', where this type may be defined separately from the grammar, or may exist in a type hierarchy, but all quoted "strings" in TDL are like instances of that type and don't create new hierarchy entries (I'm not sure what type the regexes are, though). Furthermore, these strings should probably never appear in a conjunction with other types, nor with features. The only other term that makes sense in a conjunction with a string is a coreference, and indeed we see this in the ERG's mtr.tdl:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"> ... 
PRED #pred & "~._v_", ...</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Anyway, the question is whether we enforce this in the TDL syntax, somewhere else, or not at all. Similarly, do we enforce in the syntax that regexes are not valid in type files (which is the case according to a comment in lkb/src/io-tdl/tdltypeinput.lsp)?<br></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Sep 6, 2018 at 4:29 PM <a href="mailto:goodman.m.w@gmail.com">goodman.m.w@gmail.com</a> <<a href="mailto:goodman.m.w@gmail.com">goodman.m.w@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hi all,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">There are some remaining issues with TDL that I'd like to clean up. First I will summarize some decisions made (or at least not rejected) in previous email threads:<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">1. Supertypes appear before other terms in a conjunction only by convention (not enforced in the syntax)<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">2. Docstrings are triple-quoted and may appear before any top-level term or before the final . terminator</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">3. 
Comments may appear in definitions anywhere that spaces can, except within strings/regexes/affixing-patterns</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The following changes are things I think people agree with, so I'd like to consider them as decided:<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">4. Removal of the :< operator (if accepted as a variant of :=, throw a warning)</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">5. Removal of 'single-quoted-symbols</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">6. Removal of double-quoted "docstrings"<br></div><div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">7. Removal of non-regex uses of ^ (otherwise any BNF of TDL is necessarily incomplete because the "extended-syntax" use of ^ is open-ended)<br></div></div><div><br></div><div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">And there's at least one point I don't think we reached a decision on:</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"><br></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">8. Instances must have exactly 1 "supertype" (which is really just a type and not a supertype, i.e., it doesn't change the type hierarchy)<br></div></div><div><br></div><div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">Also:</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"><br></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">9. Does anyone know how wild-cards differ from letter-sets? 
I see HaG has a wild-card and suffix pattern like these:</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"><br></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"> %(wild-card (?g ui))</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"> ...<br></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"> %suffix (!c!v !c!vn) (!v?g !vn)<br></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"></div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">My guess is that wild-cards match but are not used in the replacement, which I can imagine is useful if you want the replacement to use the second of two matches but not the first. It makes me wonder why we don't just use regex substitutions for these things.<br></div></div><div><br></div><div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">If nobody responds about (1)--(7), I'll make sure the syntax description on the TdlRfc wiki reflects those decisions.</div><br></div>-- <br><div dir="ltr" class="m_9202793427700650718gmail_signature">-Michael Wayne Goodman</div></div></div></div>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">-Michael Wayne Goodman</div>