<div dir="ltr">Thanks, Berthold, for the explanation. This may turn out to be useful<div>for Lushootseed and Nuu-chah-nulth, though the situation there is somewhat</div><div>different. In particular, I'm intrigued that you are treating __REDUP__ and</div><div>the base as separate lexical items. Is this because of the whitespace</div><div>conventions of the language? Or because it was easier to do things this</div><div>way than completely within the morphology? </div><div><br></div><div>Also, you said (in a separate exchange we had) that the older solution was </div><div>non-compositional ... presumably because you were using the native lex entry </div><div>for the redup form and then squashing its EPs. Am I understanding correctly </div><div>that this current solution is compositional because the ersatz item is </div><div>semantically empty?</div><div><br></div><div>Emily</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Sep 13, 2015 at 6:45 PM, Francis Bond <span dir="ltr"><<a href="mailto:bond@ieee.org" target="_blank">bond@ieee.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Thanks Berthold (and Woodley).<br>
<br>
This looks very promising, if a bit daunting (my mastery of the chart<br>
mapping machinery is still shaky). We will try it out, and probably pester<br>
you with more questions.<br>
<div class="HOEnZb"><div class="h5"><br>
On Mon, Sep 14, 2015 at 8:00 AM, Berthold Crysmann<br>
<<a href="mailto:berthold.crysmann@gmail.com">berthold.crysmann@gmail.com</a>> wrote:<br>
> Hi Francis, Emily, and who else might be listening on this,<br>
><br>
> I have now got a version of my revised analysis working (thanks also to<br>
> Woodley for fixing a critical issue in ace for me).<br>
><br>
> Here's a brief description (for more details look at the HaG code<br>
> $LOGONROOT/llf/hag, or ask me):<br>
><br>
> Hausa has some 4 inflectional classes that use total reduplication. While<br>
> tonal patterns may differ between base and reduplicant, the reduplicant does<br>
> not undergo inflection different from the base. Thus, on the string level<br>
> (once we have dealt with suprasegmental markings), the reduplicant is<br>
> faithful to the base (lucky for me). We do get further inflectional marking<br>
> on the base, however.<br>
><br>
> E.g. in plural class 12, we get<br>
><br>
> nâs nâs `nurses'<br>
> sìkêt sìkêt `skirts'<br>
> jōjì jōjì `judges'<br>
><br>
> The latter can be morphologically possessed, yielding "jōjì jōjiìnmù" (our<br>
> judges).<br>
><br>
> So in this class, the reduplicant is just a copy of the base lexeme's orth<br>
> value (+ tone + length).<br>
><br>
> A more interesting case is that of augmentative adjectives. Here the base for the<br>
> productive formation is partially reduplicated in the masc and fem singular,<br>
> but one drops the partial reduplication in the plural, to have total<br>
> reduplication of a clipped base instead.<br>
><br>
> E.g. class 14:<br>
><br>
> mālàm bundumēmḕ -> mā̀làmai bundumā bùndùmā̀<br>
><br>
> Which can also undergo type raising, giving<br>
><br>
> bundumā bùndùmàn mā̀làmai<br>
><br>
> Note that the inflection -n on the base does not get copied to the<br>
> reduplicant.<br>
><br>
><br>
> As for the analysis, here is what I now do (in ace):<br>
><br>
> 1. Parsing<br>
><br>
> As a first step, I trigger an ersatz, in token mapping, on every item in the<br>
> chart that is followed by some other item.<br>
> The surface string I substitute is __REDUP__, which corresponds to exactly<br>
> one lexical item (redup_n).<br>
> I tried filtering at this first step already, on partial identity of form<br>
> (with regexps), but that did not seem to work. The ersatzing records the<br>
> original string in +CARG. The rule applies after processing of<br>
> suprasegmentals (tone and length), and copies that information over as well.<br>
> See tmr/redup.tdl.<br>
><br>
> In a second step, I shall use lexical filtering, which applies after lookup<br>
> and morphological processing, to get rid of unlicensed reduplication entries<br>
> in the chart. I'll probably implement that at the beginning of next week.<br>
><br>
> During morphological processing, I introduce constraints for the reduplicant<br>
> as part of the morphological rules applying to the base. In particular, this<br>
> enables me to select precisely at which step in the derivation I want to<br>
> memoise the identity of the base. All the constraints on the reduplicant are<br>
> collected in MORPH.MCLASS.--REDUP. See rule n_pl12_lr (irules.tdl), or, even<br>
> better, n_pl14_lr.<br>
><br>
> In syntax, I use a binary rule to combine the reduplicant ersatz with the<br>
> base and impose all the constraints the base has for the shape of the<br>
> reduplicant (rule n-pl-reduplication in rules.tdl).<br>
><br>
> Different plural formation patterns require different levels of identity:<br>
> e.g. class 12 copies the segments and suprasegmentals of the lexeme, whereas<br>
> class 14 copies the segments of the derived plural and imposes a fixed H+<br>
> pattern on the reduplicant, and a fixed L+ tone pattern on the base. Either<br>
> of them exempts further inflectional markings from reduplication.<br>
><br>
> This is all done by the morphological rules applying to the base, which<br>
> also store the constraints on the reduplicant in MORPH.--REDUP, having a<br>
> value for orthography (--STEM string) and suprasegmentals (--SUPRA supra).<br>
> See the rules for class 12 and class 14 in irules.tdl. The binary<br>
> reduplication rule then imposes these constraints on the reduplicant.<br>
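><br>
> A minimal Python sketch of the parsing steps above, not the HaG code itself:<br>
> it only mimics the logic of ersatzing every token that has a successor and<br>
> then checking the recorded string against the stem the base asks for. The<br>
> names Token, add_ersatz_tokens and licensed are hypothetical, and tone and<br>
> length (SUPRA) are ignored for simplicity.<br>
<pre>
```python
from dataclasses import dataclass
from typing import List, Optional

ERSATZ = "__REDUP__"  # surface form of the single generic reduplicant entry


@dataclass
class Token:
    form: str                   # string used for lexical lookup
    carg: Optional[str] = None  # original surface string, kept by the ersatz rule


def add_ersatz_tokens(words: List[str]) -> List[List[Token]]:
    """Return, for each position, the alternative tokens in the chart.

    Every word that is followed by another word gets an extra ersatz
    alternative that remembers the original string in carg.
    """
    chart = []
    for i, word in enumerate(words):
        cell = [Token(form=word)]
        if i + 1 != len(words):  # the last word has no following item
            cell.append(Token(form=ERSATZ, carg=word))
        chart.append(cell)
    return chart


def licensed(redup: Token, base_stem: str) -> bool:
    """Stand-in for the binary rule: the string recorded on the ersatz
    must equal the stem that the base memoised for its reduplicant."""
    return redup.form == ERSATZ and redup.carg == base_stem


chart = add_ersatz_tokens(["joji", "jojinmu", "sun", "zo"])
# The base "jojinmu" memoised "joji" before the possessive suffix was added.
print(licensed(chart[0][1], "joji"))
```
</pre>
> Here the ersatz over the first "joji" is licensed because the base requires<br>
> "joji", whereas the ersatz over "jojinmu" would be rejected by "sun".<br>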
><br>
> Some sample sentences for you to test:<br>
><br>
> joji jojinmu sun zo<br>
> joji jojin sun zo<br>
> malamai bunduma bunduma sun zo<br>
><br>
> In the hag directory, you can test with<br>
><br>
> ace -l -g ace/hausa.dat<br>
><br>
> for parsing<br>
><br>
> or with<br>
><br>
> ace -T -g ace/hausa.dat|ace -e -l --show-gen-chart -g ace/hausa.g.dat<br>
><br>
> for generation.<br>
><br>
><br>
> 2. Generation<br>
><br>
> Using a single generic entry for all reduplicants, I can now trigger the one<br>
> I need based on very general properties, i.e. plural in Hausa. Look at<br>
> reduplication.mtr for reference. Since there's only one rule left now, it is<br>
> soon going to move into the main trigger.mtr.<br>
><br>
> Using an ersatz, I have to replace the generic __REDUP__ phonology with<br>
> something sensible at some point: I use post-generation chart mapping to<br>
> copy over the relevant string from the base. The relevant rule is the third<br>
> or fourth one down in tmr/post-generation.tdl. Likewise, I copy over the<br>
> constraints on tone and vowel length (SUPRA) as well, which are imposed in<br>
> the --REDUP.--SUPRA features.<br>
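><br>
> As a rough illustration of the post-generation step, here is a small Python<br>
> sketch. It is not the rule in tmr/post-generation.tdl; the function name and<br>
> the pair representation of tokens are assumptions made for the example, and<br>
> tone and vowel length are again left out.<br>
<pre>
```python
from typing import List, Optional, Tuple

ERSATZ = "__REDUP__"

# (surface form, stem the item requires of its reduplicant, or None)
GenToken = Tuple[str, Optional[str]]


def realise_reduplicants(tokens: List[GenToken]) -> List[str]:
    """Replace each ersatz by the string its following base asks for."""
    out = []
    for i, (form, _) in enumerate(tokens):
        has_next = i + 1 != len(tokens)
        if form == ERSATZ and has_next and tokens[i + 1][1] is not None:
            out.append(tokens[i + 1][1])  # copy the stem over from the base
        else:
            out.append(form)              # ordinary token, or nothing to copy
    return out


generated = [(ERSATZ, None), ("jojinmu", "joji"), ("sun", None), ("zo", None)]
print(" ".join(realise_reduplicants(generated)))
```
</pre>
> An ersatz with no following base is left untouched in this sketch.<br>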
><br>
><br>
> 3. Conclusion<br>
><br>
> To summarise, the new solution scales up to open classes, and it is fully<br>
> supported in ace. As for the LKB, you can test with the ersatz as input,<br>
> e.g. "__redup__ joji sun zo" parses and generates. Generation will only use<br>
> the ersatz. I shall look into refining my user function to get proper<br>
> generation output.<br>
><br>
> For the moment, I kept the old analysis alive for testing in the LKB: if you<br>
> type in "nas nas sun zo", you should still get an analysis, and you can even<br>
> generate from it (using the ersatz). In ace, this "native" analysis is<br>
> disabled by requiring an empty RELS on the reduplicant (as a type addendum<br>
> on the phrasal type), which will filter out the native contentful entry, and<br>
> keep the semantically empty one (the ersatz), thus avoiding any spurious<br>
> ambiguity. Since the LKB analyses are a superset of the ace analyses, there<br>
> will not be any issue w.r.t. treebanking.<br>
><br>
> As I see it, total reduplication is the killer argument for having<br>
> chart mapping in the LKB, or else for having a complete ace-based<br>
> development environment.<br>
><br>
> 4. Outlook: Chinese and Indonesian<br>
><br>
> As far as I can tell, my approach for Hausa should be straightforward to<br>
> port to these two languages. There is the question of the X: you would<br>
> probably treat that as a morphophonological effect of some rule application<br>
> (I guess). So, given our technologies, the most straightforward thing to do<br>
> is place all morphophonological changes on the base. That leaves you free to<br>
> impose simple total identity on the reduplicant: just choose the right<br>
> moment in the derivation (thanks to Woodley who has just fixed the recording<br>
> of orth values in ace an hour or two ago). If that should not be viable<br>
> (there are languages like that), one needs to replicate some of the<br>
> morphology in token mapping. For reference, look at how the ERG deals with<br>
> plurals of unknown words.<br>
><br>
><br>
> All the best,<br>
><br>
> Berthold<br>
><br>
><br>
> On 08/09/15 16:26, Berthold Crysmann wrote:<br>
>><br>
>> Hi Francis,<br>
>><br>
>> I guess you only worry about total reduplication. In principle, in<br>
>> Chinese you could get away with using string unification (thanks to the<br>
>> script, word length is limited), but specifying the letter set will not be<br>
>> much fun for either humans or machines....<br>
>><br>
>> I am currently working on total reduplication in my Hausa grammar. I had a<br>
>> first solution that is non-compositional in the semantics. I.e. I just use a<br>
>> binary rule that glues the stuff together, conditioned on identity of<br>
>> predicates, but throws away the semantic contribution in the reduplicant.<br>
>> Works in parsing, but needs *item-specific* trigger rules in generation.<br>
>><br>
>> Right now, I am experimenting with ersatzing. Things should work well as long<br>
>> as only one of the reduplicants undergoes additional morphology. Otherwise,<br>
>> you'll have to memoise parts of the original string, so you can apply<br>
>> regular morphophonological changes to the reduplicant. Seems to work in the<br>
>> LKB.<br>
>><br>
>> I shall commit that new analysis very soon. I shall also send you a<br>
>> detailed description.<br>
>><br>
>> Cheers,<br>
>><br>
>> Berthold<br>
>><br>
>> On 08/09/15 15:28, Francis Bond wrote:<br>
>>><br>
>>> G'day,<br>
>>><br>
>>> we are working with a couple of languages where we would like to be<br>
>>> able to write lexical rules that do things like:<br>
>>><br>
>>> Take a two character word in Chinese (AB)<br>
>>> AB -> ABAB<br>
>>> AB -> AABB<br>
>>> AB -> AAB<br>
>>> AB -> ABXAB (where X is fixed by the rule)<br>
>>> AB -> AXAB<br>
>>> AB -> AXB<br>
>>><br>
>>> Take a one character word (A):<br>
>>> A -> AA<br>
>>> A -> AXA<br>
>>> A -> AAX<br>
>>><br>
>>> In Indonesian we want to take an arbitrary word and produce a duplicate<br>
>>> w -> w-w (kasus -> kasus-kasus)<br>
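>>><br>
>>> To make the intended string mappings concrete, here is a small Python<br>
>>> sketch of the templates above. It only operates on strings and says nothing<br>
>>> about how a grammar would implement them; the function names and the filler<br>
>>> parameter are hypothetical.<br>
<pre>
```python
def reduplicate(word: str, template: str, filler: str = "") -> str:
    """Expand a template such as "ABAB" or "AXB" over a one- or
    two-character word.

    "A" and "B" stand for the first and second character of the word,
    and "X" for the fixed material supplied by the rule (filler).
    """
    slots = {"A": word[0], "X": filler}
    if len(word) == 2:
        slots["B"] = word[1]
    # A template that mentions "B" for a one-character word raises KeyError.
    return "".join(slots[symbol] for symbol in template)


def reduplicate_total(word: str, separator: str = "-") -> str:
    """Total reduplication of an arbitrary word: w -> w-w."""
    return word + separator + word


print(reduplicate("ab", "AABB"))
print(reduplicate_total("kasus"))
```
</pre>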
>>><br>
>>> More examples here:<br>
>>> <a href="http://moin.delph-in.net/LADChineseReduplication" rel="noreferrer" target="_blank">http://moin.delph-in.net/LADChineseReduplication</a><br>
>>> and<br>
>>> <a href="http://moin.delph-in.net/LADChineseAnotA" rel="noreferrer" target="_blank">http://moin.delph-in.net/LADChineseAnotA</a><br>
>>><br>
>>> Is this (or some of this) possible with the DELPH-IN tools? If so,<br>
>>> can someone explain how to do it (or point to a paper or website that<br>
>>> tells us how to do it)?<br>
>>><br>
>>> Thanks in advance,<br>
>>><br>
>>><br>
>><br>
>><br>
><br>
><br>
> --<br>
> Berthold Crysmann<<a href="mailto:crysmann@linguist.jussieu.fr">crysmann@linguist.jussieu.fr</a>><br>
> CNRS, Laboratoire de linguistique formelle (UMR 7110), U Paris Diderot<br>
> Case 7031, 5 rue Thomas Mann, 75205 Paris cedex 13<br>
> Bureau 545, bâtiment Olympe de Gouges, rue Albert Einstein, 75013 Paris<br>
><br>
<br>
<br>
<br>
</div></div><div class="HOEnZb"><div class="h5">--<br>
Francis Bond <<a href="http://www3.ntu.edu.sg/home/fcbond/" rel="noreferrer" target="_blank">http://www3.ntu.edu.sg/home/fcbond/</a>><br>
Division of Linguistics and Multilingual Studies<br>
Nanyang Technological University<br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Emily M. Bender<br>Professor, Department of Linguistics<br>Check out CLMS on facebook! <a href="http://www.facebook.com/uwclma" target="_blank">http://www.facebook.com/uwclma</a><br></div></div>
</div>