[developers] TDL syntax extension for regular expressions

Francis Bond fcbond at gmail.com
Thu Jan 29 01:20:05 CET 2009


G'day,


> jointly with peter and dan, we are nearing completion on a first public
> release of the new `chart mapping' machinery (see Adolphs et al., 2008,
> at LREC).  in this context, we would like to add a syntax extension for
> regular expressions to TDL.  we were tempted to use /[a-z]+/, much like
> in awk, perl, et al.  but the slash is already in use for defaults (in
> the LKB).
>
> our current proposal is ^[a-z]+$, i.e. delimit regular expressions with
> an opening cap and a closing dollar sign.  the rationale, here, is that
> we assume regular expressions to be implicitly anchored anyway (hence,
> to match a sub-string, a pattern will have to be padded: ^.*[a-z]+.*$).


Do we gain anything by having separate symbols for start and end?  I would
have thought ^regexp^ can do the job just as well, without taking up another
precious character :-).
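
To make the two delimiter choices concrete, here is a minimal sketch in
Python.  It is not the TDL reader or the chart mapping code; Python's re
module merely stands in for whatever regex engine ends up being used, and
the names compile_anchored and matches are hypothetical.  It assumes a
regex literal is an opening ^, a pattern body, and a closing delimiter
($ in the current proposal, ^ in the alternative), and that the body is
always matched against the whole string.

```python
import re


def compile_anchored(literal: str, close: str = "$") -> re.Pattern:
    """Strip the delimiters from a (hypothetical) regex literal and compile the body.

    `close` is the closing delimiter: "$" for ^...$ or "^" for ^...^.
    """
    if len(literal) < 2 or not literal.startswith("^") or not literal.endswith(close):
        raise ValueError(f"not a regex literal: {literal!r}")
    return re.compile(literal[1:-1])


def matches(literal: str, text: str, close: str = "$") -> bool:
    """Implicit anchoring: the pattern must cover the whole of `text`."""
    return compile_anchored(literal, close).fullmatch(text) is not None


if __name__ == "__main__":
    print(matches("^[a-z]+$", "abc"))              # whole string is lower-case letters
    print(matches("^[a-z]+$", "abc123"))           # no sub-string matching
    print(matches("^.*[a-z]+.*$", "123abc456"))    # padded pattern for sub-strings
    print(matches("^[a-z]+^", "abc", close="^"))   # same body, ^...^ delimiters
```

Under these assumptions both spellings carry the same information, since
the delimiters are stripped before the body reaches the regex engine; the
choice only affects which characters the TDL syntax has to reserve.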



-- 
Francis Bond <http://www2.nict.go.jp/x/x161/en/member/bond/>
NICT Language Infrastructure Group