[developers] TDL syntax extension for regular expressions

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Wed Jan 28 19:56:57 CET 2009


Actually, there is (or was) a use of ^ in the LKB TDL.  It was used in the
munging rules for Verbmobil etc. - you can see it in
erg/data/alex.rules (I think this is in the distributed ERG).
Obviously this file is way out of date, and that functionality is
presumably unnecessary given the MT code.  So I don't think there's a
problem with what you're suggesting; I'm just noting it in case anyone
else is using it for something.  The code that handles it will need to
be removed.

TDL templates as used in the ERG were supported in the LKB code for a
little while, but they were never documented and were explicitly
discouraged in other grammars.  I believe I removed the LKB support
quite a few years ago.  It looks like $ is still a break character,
but that's trivial.

Ann

> 
> dear all,
> 
> jointly with peter and dan, we are nearing completion on a first public
> release of the new `chart mapping' machinery (see Adolphs et al., 2008,
> at LREC).  in this context, we would like to add a syntax extension for
> regular expressions to TDL.  we were tempted to use /[a-z]+/, much like
> in awk, perl, et al.  but the slash is already in use for defaults (in
> the LKB).
> 
> our current proposal is ^[a-z]+$, i.e. delimit regular expressions with
> an opening caret and a closing dollar sign.  the rationale, here, is that
> we assume regular expressions to be implicitly anchored anyway (hence,
> to match a sub-string, a pattern will have to be padded: ^.*[a-z]+.*$).
> 
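To make the implicit anchoring in the proposal above concrete, here is a
minimal sketch in Python.  It is only an illustration: the message does
not say which regular-expression dialect the chart mapping machinery
uses, so Python's re module stands in for it, and the helper names
parse_regex_token and matches are hypothetical, not part of the LKB, PET,
or any TDL reader.

```python
import re


def parse_regex_token(token: str) -> str:
    """Strip the proposed ^...$ delimiters and return the bare pattern.

    Hypothetical helper: it only illustrates the delimiter convention.
    """
    if len(token) < 2 or not (token.startswith("^") and token.endswith("$")):
        raise ValueError(f"not a ^...$ delimited regular expression: {token!r}")
    return token[1:-1]


def matches(token: str, text: str) -> bool:
    """Implicit anchoring: the pattern has to cover the whole string."""
    # Assumption: re.fullmatch stands in for the real matcher's semantics.
    return re.fullmatch(parse_regex_token(token), text) is not None


print(matches("^[a-z]+$", "abc"))            # whole string matches
print(matches("^[a-z]+$", "abc123"))         # trailing digits block the match
print(matches("^.*[a-z]+.*$", "123abc456"))  # padded pattern finds a sub-string
```

Under these assumptions, the unpadded pattern accepts only strings that
consist entirely of lower-case letters, while the padded variant accepts
any string that contains at least one.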
> for all i know, the ^ has no special meaning in TDL currently, while $
> used to have special meaning in template definitions.  templates are no
> longer supported in the DELPH-IN formalism (for all i know, the LKB has
> no support for TDL templates today).  hence, there should be no reason
> to not re-use the $ as a meaningful character.  ann and bernd, can you
> agree with this interpretation of DELPH-IN history?
> 
> beyond DELPH-IN, i am wondering about other usages of PET, specifically
> in the context of SProUT.  ulrich, if we were to eliminate the template
> support in flop, would anyone notice?
> 
>                             with thanks in advance; best wishes  -  oe
> 
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
> +++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
> +++       --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




