[developers] TDL syntax extension for regular expressions
fcbond at gmail.com
Thu Jan 29 01:20:05 CET 2009
> jointly with peter and dan, we are nearing completion on a first public
> release of the new `chart mapping' machinery (see Adolphs et al., 2008,
> at LREC). in this context, we would like to add a syntax extension for
> regular expressions to TDL. we were tempted to use /[a-z]+/, much like
> in awk, perl, et al. but the slash is already in use for defaults (in
> the LKB).
> our current proposal is ^[a-z]+$, i.e. delimit regular expressions with
> an opening cap and a closing dollar sign. the rationale, here, is that
> we assume regular expressions to be implicitly anchored anyway (hence,
> to match a sub-string, a pattern will have to be padded: ^.*[a-z]+.*$).
Do we gain anything by having separate symbols for start and end? I would
have thought ^regexp^ could do the job just as well, without taking up another
precious character :-).
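
To make the comparison concrete, here is a rough sketch in Python (not TDL, and not the actual chart mapping code) of how either delimiter choice could be read and how implicit anchoring behaves. The helper names `strip_delimiters` and `matches` are made up for illustration, and Python's `re.fullmatch` merely stands in for whatever regular expression engine the implementation really uses.

```python
import re


def strip_delimiters(token: str, close: str = "$") -> str:
    """Return the pattern body of a token written as ^body$ (or ^body^ if close='^').

    Hypothetical helper: only the first and last characters are treated as
    delimiters, so a caret inside the body (e.g. [^a-z]) is left alone.
    """
    if len(token) < 2 or not token.startswith("^") or not token.endswith(close):
        raise ValueError(f"not a delimited regular expression: {token!r}")
    return token[1:-1]


def matches(token: str, text: str, close: str = "$") -> bool:
    """Match text against a delimited pattern, anchored at both ends."""
    # re.fullmatch models the implicit anchoring: the whole string must match.
    return re.fullmatch(strip_delimiters(token, close), text) is not None


print(matches("^[a-z]+$", "abc"))              # whole string is lower-case letters
print(matches("^[a-z]+$", "abc123"))           # trailing digits break the anchored match
print(matches("^.*[a-z]+.*$", "123abc456"))    # padded pattern matches a sub-string
print(matches("^[a-z]+^", "abc", close="^"))   # same pattern, ^regexp^ delimiters
```

In this sketch the two conventions differ only in the closing character that gets stripped, so the matching behaviour is identical either way.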
Francis Bond <http://www2.nict.go.jp/x/x161/en/member/bond/>
NICT Language Infrastructure Group