<div dir="ltr">Thanks, Berthold, for the explanation. This may turn out to be useful<div>for Lushootseed and Nuu-chah-nulth, though the situation there is somewhat</div><div>different. In particular, I'm intrigued that you are treating __REDUP__ and</div><div>the base as separate lexical items. Is this because of the whitespace</div><div>conventions of the language? Or because it was easier to do things this</div><div>way than completely within the morphology? </div><div> </div><div>Also, you said (in a separate exchange we had) that the older solution was </div><div>non-compositional ... presumably because you were using the native lex entry </div><div>for the redup form and then squashing its EPs. Am I understanding correctly </div><div>that this current solution is compositional because the ersatz item is </div><div>semantically empty?</div><div> </div><div>Emily</div><div> </div></div><div class="gmail_extra"> <div class="gmail_quote">On Sun, Sep 13, 2015 at 6:45 PM, Francis Bond <<a href="mailto:bond@ieee.org" target="_blank">bond@ieee.org</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Thanks Berthold (and Woodley). This looks very promising, if a bit daunting (my mastery of the chart mapping machinery is still shaky). We will try it out, and probably pester you with more questions. <div class="HOEnZb"><div class="h5"> On Mon, Sep 14, 2015 at 8:00 AM, Berthold Crysmann <<a href="mailto:berthold.crysmann@gmail.com">berthold.crysmann@gmail.com</a>> wrote: > Hi Francis, Emily, and who else might be listening on this, > > I have now got a version of my revised analysis working (thanks also to > Woodley for fixing a critical issue in ace for me). > > Here's a brief description (for more details look at the HaG code > $LOGONROOT/llf/hag, or ask me): > > Hausa has some 4 inflectional classes that use total reduplication. 
> While tonal patterns may differ between base and reduplicant, the reduplicant does
> not undergo inflection different from the base. Thus, on the string level
> (once we have dealt with suprasegmental markings), the reduplicant is
> faithful to the base (lucky for me). We do get further inflectional marking
> on the base, however.
>
> E.g. in plural class 12, we get
>
> nâs nâs `nurses'
> sìkêt sìkêt `skirts'
> jōjì jōjì `judges'
>
> The latter can be morphologically possessed, yielding "jōjì jōjiìnmù" (our
> judges).
>
> So in this class, the reduplicant is just a copy of the base lexeme's orth
> value (+ tone + length).
>
> A more interesting case is that of augmentative adjectives. Here the base for the
> productive formation is partially reduplicated in the masc and fem singular,
> but one drops the partial reduplication in the plural, to have total
> reduplication of a clipped base instead.
>
> E.g. class 14:
>
> mālàm bundumēmḕ -> mā̀làmai bundumā bùndùmā̀
>
> Which can also undergo type raising, giving
>
> bundumā bùndùmàn mā̀làmai
>
> Note that the inflection -n on the base does not get copied to the
> reduplicant.
>
>
> As for the analysis, here is what I now do (in ace):
>
> 1. Parsing
>
> As a first step, I trigger an ersatz, in token mapping, on every item in the
> chart that is followed by some other item.
> The surface string I substitute is __REDUP__, which corresponds to exactly
> one lexical item (redup_n).
> I tried filtering already at this first step on partial identity of form
> (with regexps), but that did not seem to work. The ersatzing records the
> original string in +CARG. The rule applies after processing of
> suprasegmentals (tone and length), and copies that information over as well.
> See tmr/redup.tdl.
>
> In a second step, I shall use lexical filtering, which applies after lookup
> and morphological processing, to get rid of unlicensed reduplication entries
> in the chart. I'll probably implement that at the beginning of next week.
>
> During morphological processing, I introduce constraints for the reduplicant
> as part of the morphological rules applying to the base. In particular, this
> enables me to select precisely at which step in the derivation I want to
> memoise the identity of the base. All the constraints on the reduplicant are
> collected in MORPH.MCLASS.--REDUP. See rule n_pl12_lr (irules.tdl), or, even
> better, n_pl14_lr.
>
> In syntax, I use a binary rule to combine the reduplicant ersatz with the
> base and impose all the constraints the base has for the shape of the
> reduplicant (rule n-pl-reduplication in rules.tdl).
>
> Different plural formation patterns require different levels of identity:
> e.g. class 12 copies the segments and suprasegmentals of the lexeme, whereas
> class 14 copies the segments of the derived plural and imposes a fixed H+
> pattern on the reduplicant, and a fixed L+ tone pattern on the base. Either
> of them exempts further inflectional markings from reduplication.
>
> This is all done by the morphological rules applying to the base, which
> also store the constraints re the reduplicant in MORPH.--REDUP, having a
> value for orthography (--STEM string) and suprasegmentals (--SUPRA supra).
> See the rules for class 12 and class 14 in irules.tdl. The binary
> reduplication rule then imposes these constraints on the reduplicant.
>
> Some sample sentences for you to test:
>
> joji jojinmu sun zo
> joji jojin sun zo
> malamai bunduma bunduma sun zo
>
> In the hag directory, you can test with
>
> ace -l -g ace/hausa.dat
>
> for parsing
>
> or with
>
> ace -T -g ace/hausa.dat|ace -e -l --show-gen-chart -g ace/hausa.g.dat
>
> for generation.
>
>
> 2. Generation
>
> Using a single generic entry for all reduplicants, I can now trigger the one
> I need based on very general properties, i.e. plural in Hausa. Look at
> reduplication.mtr for reference. Since there's only one rule left now, it is
> soon going to move into the main trigger.mtr.
>
> Using an ersatz, I have to replace the generic __REDUP__ phonology with
> something sensible at some point: I use post-generation chart mapping to
> copy over the relevant string from the base. The relevant rule is the third
> or fourth one down in tmr/post-generation.tdl. Likewise, I copy over the
> constraints on tone and vowel length (SUPRA) as well, which are imposed in
> the --REDUP.--SUPRA features.
>
>
> 3. Conclusion
>
> To summarise, the new solution scales up to open classes, and it is fully
> supported in ace. As for the LKB, you can test with the ersatz as input,
> e.g. "__redup__ joji sun zo" parses and generates. Generation will only use
> the ersatz. I shall look into refining my user function to get proper
> generation output.
>
> For the moment, I kept the old analysis alive for testing in the LKB: if you
> type in "nas nas sun zo", you should still get an analysis, and you can even
> generate from it (using the ersatz). In ace, this "native" analysis is
> disabled by requiring an empty RELS on the reduplicant (as a type addendum
> on the phrasal type), which will filter out the native contentful entry, and
> keep the semantically empty one (the ersatz), thus avoiding any spurious
> ambiguity. Since the LKB analyses are a superset of the ace analyses, there
> will not be any issue w.r.t. treebanking.
>
> As I see it, total reduplication is the killer argument for having
> chart mapping in the LKB, or else for having a complete ace-based
> development environment.
>
> 4. Outlook: Chinese and Indonesian
>
> As far as I can tell, my approach for Hausa should be straightforward to
> port to these two languages. There is the question of the X: you would
> probably treat that as a morphophonological effect of some rule application
> (I guess). So, given our technologies, the most straightforward thing to do
> is to place all morphophonological changes on the base.
> That leaves you free to impose simple total identity on the reduplicant: just choose the right
> moment in the derivation (thanks to Woodley, who fixed the recording
> of orth values in ace just an hour or two ago). If that should not be viable
> (there are languages like that), one needs to replicate some of the
> morphology in token mapping. For reference, look at how the ERG deals with
> plurals of unknown words.
>
>
> All the best,
>
> Berthold
>
>
> On 08/09/15 16:26, Berthold Crysmann wrote:
>>
>> Hi Francis,
>>
>> I guess you only worry about total reduplication. In principle, in
>> Chinese you could get away with using string unification (thanks to the
>> script, word length is limited), but specifying the letter set will not be
>> much fun for either humans or machines....
>>
>> I am currently working on total reduplication in my Hausa grammar. I had a
>> first solution that is non-compositional in the semantics. I.e. I just use a
>> binary rule that glues the stuff together, conditioned on identity of
>> predicates, but throws away the semantic contribution in the reduplicant.
>> Works in parsing, but needs *item-specific* trigger rules in generation.
>>
>> Right now, I am experimenting with ersatzing. Things should work well as long
>> as only one of the reduplicants undergoes additional morphology. Otherwise,
>> you'll have to memoise parts of the original string, so you can apply
>> regular morphophonological changes to the reduplicant. Seems to work in the
>> LKB.
>>
>> I shall commit that new analysis very soon. I shall also send you a
>> detailed description.
>>
>> Cheers,
>>
>> Berthold
>>
>> On 08/09/15 15:28, Francis Bond wrote:
>>>
>>> G'day,
>>>
>>> we are working with a couple of languages where we would like to be
>>> able to write lexical rules that do things like:
>>>
>>> Take a two character word in Chinese (AB)
>>> AB -> ABAB
>>> AB -> AABB
>>> AB -> AAB
>>> AB -> ABXAB (where X is fixed by the rule)
>>> AB -> AXAB
>>> AB -> AXB
>>>
>>> Take a one character word (A):
>>> A -> AA
>>> A -> AXA
>>> A -> AAX
>>>
>>> In Indonesian we want to take an arbitrary word and produce a duplicate
>>> w -> w-w (kasus -> kasus-kasus)
>>>
>>> More examples here:
>>> <a href="http://moin.delph-in.net/LADChineseReduplication" rel="noreferrer" target="_blank">http://moin.delph-in.net/LADChineseReduplication</a>
>>> and
>>> <a href="http://moin.delph-in.net/LADChineseAnotA" rel="noreferrer" target="_blank">http://moin.delph-in.net/LADChineseAnotA</a>
>>>
>>> Is this (or some of this) possible with the DELPH-IN tools? If so,
>>> can someone explain how to do it (or point to a paper or website that
>>> tells us how to do it)?
>>>
>>> Thanks in advance,
>>>
>>>
>>
>>
>
>
> --
> Berthold Crysmann<<a href="mailto:crysmann@linguist.jussieu.fr">crysmann@linguist.jussieu.fr</a>>
> CNRS, Laboratoire de linguistique formelle (UMR 7110), U Paris Diderot
> Case 7031, 5 rue Thomas Mann, 75205 Paris cedex 13
> Bureau 545, bâtiment Olympe de Gouges, rue Albert Einstein, 75013 Paris
>
</div></div><div class="HOEnZb"><div class="h5">--
Francis Bond <<a href="http://www3.ntu.edu.sg/home/fcbond/" rel="noreferrer" target="_blank">http://www3.ntu.edu.sg/home/fcbond/</a>>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
</div></div></blockquote></div> <div> </div>--
<div class="gmail_signature"><div dir="ltr">Emily M. Bender
Professor, Department of Linguistics
Check out CLMS on facebook! <a href="http://www.facebook.com/uwclma" target="_blank">http://www.facebook.com/uwclma</a> </div></div> </div>
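<div>The surface patterns listed in Francis's original question (AB -> ABAB, A -> AXA, w -> w-w, and so on) can be spelled out as plain string templates. The following is only a minimal Python sketch of what those patterns produce on the string level; it is not how the DELPH-IN tools or the Hausa grammar implement reduplication, which is done with token mapping and an ersatz as described in the thread. The function names <code>apply_template</code> and <code>total_reduplication</code> are hypothetical, and "PQ" and "x" are placeholders for a two-character word and a rule-specific filler.</div>

```python
def apply_template(word: str, template: str, filler: str = "") -> str:
    """Expand a reduplication template such as "ABAB" or "AXA".

    "A" and "B" stand for the first and second character of `word`;
    "X" stands for a fixed filler chosen by the (hypothetical) rule.
    """
    slots = {"A": word[:1], "X": filler}
    if len(word) == 2:
        slots["B"] = word[1]
    try:
        return "".join(slots[symbol] for symbol in template)
    except KeyError as missing:
        # E.g. a template mentioning "B" applied to a one-character word.
        raise ValueError(
            f"template {template!r} needs slot {missing} for {word!r}"
        )


def total_reduplication(word: str, joiner: str = "-") -> str:
    """Copy an arbitrary word, as in the Indonesian pattern w -> w-w."""
    return f"{word}{joiner}{word}"


if __name__ == "__main__":
    # Two-character patterns, with "PQ" as a placeholder word.
    for template in ("ABAB", "AABB", "AAB", "ABXAB", "AXAB", "AXB"):
        print(template, apply_template("PQ", template, filler="x"))
    # One-character patterns, with "P" as a placeholder word.
    for template in ("AA", "AXA", "AAX"):
        print(template, apply_template("P", template, filler="x"))
    # The Indonesian example from the original question.
    print(total_reduplication("kasus"))
```

<div>For example, <code>apply_template("PQ", "AABB")</code> gives "PPQQ" and <code>total_reduplication("kasus")</code> gives "kasus-kasus". A template approach like this only enumerates surface forms; it says nothing about tone, inflection on the base, or generation, which are the parts the ersatz analysis addresses.</div>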