[developers] LKB splitting on apostrophe

Stephan Oepen oe at ifi.uio.no
Sun Jan 22 09:03:07 CET 2012


i would be surprised if the LKB had changed in this respect in recent years.  anyway, the REPP pre-processing language is stable (see ReppTop on the wiki) and supported in any self-respecting system (LKB, PET, ACE, agree).  so i would maybe start having Matrix-derived grammars include a vanilla ‘tokenizer.rpp’.  this way, things will be more transparent, and students who need it will have full control over string-level pre-processing.
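
to make the idea of explicit string-level pre-processing concrete, here is a minimal python sketch.  it is not REPP syntax and not what the LKB does internally; the names (REWRITE_RULES, TOKEN_SEPARATOR, tokenize) are made up for illustration.  the only point is that once the rules live in a file the grammar owns, whether ' splits a word is decided by a visible rule rather than by a system default.

```python
import re

# hypothetical stand-in for a string-level pre-processing file:
# an ordered list of (pattern, replacement) rewrite rules, plus one
# pattern that says where tokens are separated.
REWRITE_RULES = [
    (r"([.,!?])", r" \1 "),  # pad sentence punctuation so it becomes its own token
]
TOKEN_SEPARATOR = r"\s+"     # split on whitespace only; the apostrophe is left alone


def tokenize(text, rules=REWRITE_RULES, separator=TOKEN_SEPARATOR):
    """apply each rewrite rule in order, then split on the separator pattern."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    # drop the empty strings that leading or trailing whitespace produces
    return [token for token in re.split(separator, text) if token]


# vanilla behaviour: "didn't" stays one token
vanilla = tokenize("the dog didn't bark.")

# a grammar that does want to split on the apostrophe adds one explicit rule
split_apostrophe = tokenize(
    "the dog didn't bark.",
    rules=REWRITE_RULES + [(r"'", r" ' ")],
)
```

with the vanilla rules the apostrophe never triggers a split; splitting is opt-in, one line, and easy for a student to find and change.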

best, oe



On 22. jan. 2012, at 03:41, "Emily M. Bender" <ebender at uw.edu> wrote:

> Dear all,
> 
> Students in my class this term are reporting that the LKB is splitting
> words on the character ', even when it's not in the value of
> *punctuation-characters*.  Any idea why this might be?  Anything
> we can do about it?
> 
> Thanks,
> Emily
> 
> -- 
> Emily M. Bender
> Associate Professor
> Department of Linguistics
> Check out CLMS on facebook! http://www.facebook.com/uwclma



