[developers] PET scripting (REPP)

Megan Schneider caelum at gmail.com
Tue May 28 08:38:50 CEST 2013


Thank you! Processing is so much faster now :)


Megan


On Sun, May 26, 2013 at 11:17 PM, Woodley Packard <sweaglesw at sweaglesw.org> wrote:

> You could `cat' a file with one sentence per line into that same command,
> e.g.:
>
> $ cat test.txt
> "Squeak!" said the mouse.
> The dog said, "Woof."
> $ cat test.txt | ./logon/bin/cheap -t -repp -preprocess-only=yy logon/lingo/erg/english
> [.....]
> (1, 0, 1, <0:1>, 1, "“", 0, "null")
> (2, 1, 2, <1:7>, 1, "Squeak", 0, "null")
> (3, 2, 3, <7:8>, 1, "!", 0, "null")
> (4, 3, 4, <8:9>, 1, "”", 0, "null")
> (5, 4, 5, <10:14>, 1, "said", 0, "null")
> (6, 5, 6, <15:18>, 1, "the", 0, "null")
> (7, 6, 7, <19:24>, 1, "mouse", 0, "null")
> (8, 7, 8, <24:25>, 1, ".", 0, "null")
> (9, 0, 1, <0:3>, 1, "The", 0, "null")
> (10, 1, 2, <4:7>, 1, "dog", 0, "null")
> (11, 2, 3, <8:12>, 1, "said", 0, "null")
> (12, 3, 4, <12:13>, 1, ",", 0, "null")
> (13, 4, 5, <14:15>, 1, "“", 0, "null")
> (14, 5, 6, <15:19>, 1, "Woof", 0, "null")
> (15, 6, 7, <19:20>, 1, ".", 0, "null")
> (16, 7, 8, <20:21>, 1, "”", 0, "null")
>
> You can separate the sentences by detecting where the "from" vertex
> identifier (the second field of each tuple) resets to 0.
>
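The vertex-reset idea can be sketched in Python. This is a minimal sketch and not part of the original thread: it assumes cheap prints one YY token per line, as in the sample output, and that each sentence yields a single linear chain of tokens, so that only the first token of a sentence starts at vertex 0. The names `Token` and `split_sentences` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    ident: int   # running token id (keeps counting across sentences)
    start: int   # "from" vertex
    end: int     # "to" vertex
    cfrom: int   # character offset where the token starts
    cto: int     # character offset where the token ends
    form: str    # surface form as printed between double quotes


# Matches the leading fields of a tuple such as
#   (2, 1, 2, <1:7>, 1, "Squeak", 0, "null")
# Assumption: fields appear in exactly this order, one tuple per line.
_YY = re.compile(
    r'\((\d+),\s*(\d+),\s*(\d+),\s*<(\d+):(\d+)>,\s*\d+,\s*"((?:[^"\\]|\\.)*)"'
)


def split_sentences(lines: Iterable[str]) -> List[List[Token]]:
    """Group YY token lines into sentences, starting a new group
    whenever the "from" vertex is 0. Non-token lines are skipped."""
    sentences: List[List[Token]] = []
    for line in lines:
        m = _YY.search(line)
        if m is None:
            continue  # log output, blank lines, etc.
        ident, start, end, cfrom, cto = (int(g) for g in m.groups()[:5])
        tok = Token(ident, start, end, cfrom, cto, m.group(6))
        if tok.start == 0 or not sentences:
            sentences.append([])
        sentences[-1].append(tok)
    return sentences
```

Piping the cheap output into a script that calls `split_sentences(sys.stdin)` would then give one token list per line of test.txt. If the tokenizer ever produced several alternative tokens starting at vertex 0, this simple test would split too eagerly.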
> For an entirely different approach, you could try the -Ev options with
> ACE.  The output contains the same data, but it is printed in a different
> format:
>
> $ cat test.txt | ~/cdev/ace/ace -g ~/cdev/ace/erg.dat -Ev 2>/dev/null | grep -v '^NOTE'
> “<0:1> Squeak<1:7> !<7:8> ”<8:9> said<10:14> the<15:18> mouse<19:24> .<24:25>
>
>
> The<0:3> dog<4:7> said<8:12> ,<12:13> “<14:15> Woof<15:19> .<19:20> ”<20:21>
>
>
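The ACE format can be read back with a similarly small Python sketch. It is not part of the original thread and assumes that ACE prints each sentence on a single line (the line breaks in the sample come from mail wrapping), with tokens separated by whitespace and each written as `form<from:to>`. The name `parse_ace_line` is hypothetical.

```python
import re
from typing import List, Tuple

# One token looks like  Squeak<1:7>  : a run of non-space characters
# followed by <from:to>, ending at whitespace or end of line.
_TOKEN = re.compile(r'(\S+)<(\d+):(\d+)>(?=\s|$)')


def parse_ace_line(line: str) -> List[Tuple[str, int, int]]:
    """Return (form, from, to) triples for one line of token output.
    Lines without tokens (e.g. blank lines) give an empty list."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in _TOKEN.finditer(line)
    ]
```

In the sample, the offsets index into the original input line, so slicing that line with a token's `from` and `to` recovers the raw text, for example the straight quote behind the curly “ at `<14:15>`.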
> Good luck,
> Woodley
>
> On May 26, 2013, at 11:04 PM, Megan Schneider wrote:
>
> > Does anyone know of a good way to get bulk REPP tokenization for a set
> > of sentences? The one-by-one method appears to be:
> >
> > echo <sentence> | ./logon/bin/cheap -t -repp -preprocess-only=yy ./logon/lingo/erg/english
> >
> > Is there a good way to do this without needing to reload the rules/types
> > every sentence? Not looking for a functional difference, just an
> > efficiency difference.
> >
> >
> > Thanks!
> > Megan
>
>