[developers] TDL syntax extension for regular expressions

Stephan Oepen oe at ifi.uio.no
Wed Jan 28 13:40:57 CET 2009

dear all,

jointly with peter and dan, we are nearing completion on a first public
release of the new `chart mapping' machinery (see Adolphs et al., 2008,
at LREC).  in this context, we would like to add a syntax extension for
regular expressions to TDL.  we were tempted to use /[a-z]+/, much like
in awk, perl, et al.  but the slash is already in use for defaults (in
the LKB).

our current proposal is ^[a-z]+$, i.e. delimit regular expressions with
an opening caret and a closing dollar sign.  the rationale, here, is that
we assume regular expressions to be implicitly anchored anyway (hence,
to match a sub-string, a pattern will have to be padded: ^.*[a-z]+.*$).
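
to illustrate the implicit anchoring described above, here is a minimal
sketch in python.  it is not the PET or LKB implementation; the function
names parse_delimited() and matches() are hypothetical, and the sketch
assumes that the text between ^ and $ is handed to the regex engine
unchanged and must match the whole string.

```python
import re


def parse_delimited(token):
    """Return the regex body of a token written as ^...$, else None.

    Hypothetical helper: only checks the proposed delimiters and does
    not handle corner cases such as an escaped dollar sign at the end.
    """
    if len(token) >= 2 and token.startswith("^") and token.endswith("$"):
        return token[1:-1]
    return None


def matches(token, string):
    """Match string against a ^...$ token, anchored at both ends."""
    body = parse_delimited(token)
    if body is None:
        raise ValueError("not a ^...$ delimited regular expression: %r" % token)
    # fullmatch() requires the pattern to cover the entire string,
    # which models the implicit anchoring assumed in the proposal.
    return re.fullmatch(body, string) is not None


if __name__ == "__main__":
    print(matches("^[a-z]+$", "abc"))            # whole string is lowercase
    print(matches("^[a-z]+$", "abc123"))         # trailing digits: no match
    print(matches("^.*[a-z]+.*$", "123abc456"))  # padded pattern: sub-string
```

under these assumptions, ^[a-z]+$ accepts "abc" but rejects "abc123",
while the padded form ^.*[a-z]+.*$ accepts any string that contains at
least one lowercase letter.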

as far as i know, the ^ has no special meaning in TDL currently, while $
used to have special meaning in template definitions.  templates are no
longer supported in the DELPH-IN formalism (to my knowledge, the LKB has
no support for TDL templates today).  hence, there should be no reason
to not re-use the $ as a meaningful character.  ann and bernd, can you
agree with this interpretation of DELPH-IN history?

beyond DELPH-IN, i am wondering about other usages of PET, specifically
in the context of SProUT.  ulrich, if we were to eliminate the template
support in flop, would anyone notice?

                            with thanks in advance; best wishes  -  oe

+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
