[developers] sentence splitting
Ann.Copestake at cl.cam.ac.uk
Wed May 9 14:25:00 CEST 2007
Is anyone using an XML-aware sentence splitter? By this I mean something that
can be given XML marked-up text and parameterized to put sentence boundaries
in 'sensible' places given the markup.
For example, with a file containing:
This sentence is partly in <IT>italics.</IT>
we would like the sentence boundary to be inserted after the </IT>.
I am aware there are lots of issues in doing this properly, but leads would be
appreciated. As usual, we're interested in solutions that would be generally
available, preferably Open Source, not proprietary software.
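
As a rough illustration of the behaviour asked for above (a sketch, not an
existing tool), the following Python treats any run of closing tags that
directly follows sentence-final punctuation as part of the sentence, so the
boundary lands after the </IT>. The names BOUNDARY and split_sentences are
hypothetical, and so is the second sentence in the usage note. It assumes
boundaries are marked only by ., ! or ? and makes no attempt to handle
abbreviations, block-level tags, or per-tag parameterization.

```python
import re

# Sentence-final punctuation, then any run of closing tags (e.g. </IT>),
# then whitespace or end of input. Group 1 keeps the closing tags with
# the sentence they end. This pattern is an illustrative assumption.
BOUNDARY = re.compile(r'([.!?]+(?:\s*</[A-Za-z][^>]*>)*)(?:\s+|$)')


def split_sentences(text):
    """Split marked-up text, placing each boundary after trailing closing tags."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        # End the sentence after the punctuation and its closing tags.
        sentence = text[start:match.end(1)].strip()
        if sentence:
            sentences.append(sentence)
        # Resume after the whitespace that follows the boundary.
        start = match.end()
    # Keep any trailing text that has no final punctuation.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Calling split_sentences("This sentence is partly in <IT>italics.</IT> Here is
another one.") yields two items, the first ending in </IT>. A more robust
approach would presumably walk a parsed XML tree and consult a per-tag
configuration rather than rely on a regular expression.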