<div dir="ltr"><div>Thank you! Processing so much faster now :) </div>Megan </div><div class="gmail_extra"> <div class="gmail_quote">On Sun, May 26, 2013 at 11:17 PM, Woodley Packard <<a href="mailto:sweaglesw@sweaglesw.org" target="_blank">sweaglesw@sweaglesw.org</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">You could `cat' a file with one sentence per line into that same command, e.g.:<br>
<br>
$ cat test.txt<br>
"Squeak!" said the mouse.<br>
The dog said, "Woof."<br>
$ cat test.txt | ./logon/bin/cheap -t -repp -preprocess-only=yy logon/lingo/erg/english<br>
[.....]<br>
(1, 0, 1, <0:1>, 1, "“", 0, "null")<br>
(2, 1, 2, <1:7>, 1, "Squeak", 0, "null")<br>
(3, 2, 3, <7:8>, 1, "!", 0, "null")<br>
(4, 3, 4, <8:9>, 1, "”", 0, "null")<br>
(5, 4, 5, <10:14>, 1, "said", 0, "null")<br>
(6, 5, 6, <15:18>, 1, "the", 0, "null")<br>
(7, 6, 7, <19:24>, 1, "mouse", 0, "null")<br>
(8, 7, 8, <24:25>, 1, ".", 0, "null")<br>
(9, 0, 1, <0:3>, 1, "The", 0, "null")<br>
(10, 1, 2, <4:7>, 1, "dog", 0, "null")<br>
(11, 2, 3, <8:12>, 1, "said", 0, "null")<br>
(12, 3, 4, <12:13>, 1, ",", 0, "null")<br>
(13, 4, 5, <14:15>, 1, "“", 0, "null")<br>
(14, 5, 6, <15:19>, 1, "Woof", 0, "null")<br>
(15, 6, 7, <19:20>, 1, ".", 0, "null")<br>
(16, 7, 8, <20:21>, 1, "”", 0, "null")<br>
<br>
I guess you can separate the sentences by seeing when the "from" vertex identifier resets to 0.<br>
<br>
For an entirely different approach, you could try the -Ev options with ACE. The output contains the same data, but it is printed in a different format:<br>
<br>
$ cat test.txt | ~/cdev/ace/ace -g ~/cdev/ace/erg.dat -Ev 2>/dev/null | grep -v '^NOTE'<br>
“<0:1> Squeak<1:7> !<7:8> ”<8:9> said<10:14> the<15:18> mouse<19:24> .<24:25><br>
The<0:3> dog<4:7> said<8:12> ,<12:13> “<14:15> Woof<15:19> .<19:20> ”<20:21><br>
<br>
Good luck,<br>
Woodley<br>
<div class="HOEnZb"><div class="h5"> On May 26, 2013, at 11:04 PM, Megan Schneider wrote:<br>
<br>
> Does anyone know of a good way to get bulk REPP tokenization for a set of sentences?
The one-by-one method appears to be:<br>
><br>
> echo &lt;sentence&gt; | ./logon/bin/cheap -t -repp -preprocess-only=yy ./logon/lingo/erg/english<br>
><br>
> Is there a good way to do this without needing to reload the rules/types every sentence? Not looking for a functional difference, just an efficiency difference.<br>
><br>
> Thanks!<br>
> Megan </div></div></blockquote></div> </div>
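Woodley's suggestion of splitting the bulk output wherever the "from" vertex identifier resets to 0 could be sketched roughly as follows. This is only an illustration, not part of PET or ACE: the names `TOKEN_RE` and `split_sentences` are hypothetical, the field layout is inferred solely from the sample output quoted in the thread, and it assumes each sentence is a simple token chain with exactly one token leaving vertex 0.

```python
import re

# Leading fields of one YY token, as they appear in the sample output:
# (id, from_vertex, to_vertex, <char_from:char_to>, paths, "form", ...
# Assumption: this layout is inferred from the thread, not from a format spec.
TOKEN_RE = re.compile(
    r'\((\d+),\s*(\d+),\s*(\d+),\s*<(\d+):(\d+)>,\s*\d+,\s*"((?:[^"\\]|\\.)*)"'
)


def split_sentences(yy_output):
    """Group tokens into sentences, starting a new one when from_vertex is 0.

    Returns a list of sentences; each sentence is a list of
    (form, char_from, char_to) tuples.
    """
    sentences = []
    for match in TOKEN_RE.finditer(yy_output):
        from_vertex = int(match.group(2))
        token = (match.group(6), int(match.group(4)), int(match.group(5)))
        # A "from" vertex of 0 marks the first token of a new sentence.
        if from_vertex == 0 or not sentences:
            sentences.append([])
        sentences[-1].append(token)
    return sentences


# The cheap output quoted in the thread, used here as sample input.
SAMPLE = '''
(1, 0, 1, <0:1>, 1, "“", 0, "null")
(2, 1, 2, <1:7>, 1, "Squeak", 0, "null")
(3, 2, 3, <7:8>, 1, "!", 0, "null")
(4, 3, 4, <8:9>, 1, "”", 0, "null")
(5, 4, 5, <10:14>, 1, "said", 0, "null")
(6, 5, 6, <15:18>, 1, "the", 0, "null")
(7, 6, 7, <19:24>, 1, "mouse", 0, "null")
(8, 7, 8, <24:25>, 1, ".", 0, "null")
(9, 0, 1, <0:3>, 1, "The", 0, "null")
(10, 1, 2, <4:7>, 1, "dog", 0, "null")
(11, 2, 3, <8:12>, 1, "said", 0, "null")
(12, 3, 4, <12:13>, 1, ",", 0, "null")
(13, 4, 5, <14:15>, 1, "“", 0, "null")
(14, 5, 6, <15:19>, 1, "Woof", 0, "null")
(15, 6, 7, <19:20>, 1, ".", 0, "null")
(16, 7, 8, <20:21>, 1, "”", 0, "null")
'''

# Print one space-separated token sequence per recovered sentence.
for sentence in split_sentences(SAMPLE):
    print(" ".join(form for form, _, _ in sentence))
```

Because the pattern is applied with `finditer` over the whole text, it should not matter whether the tokens arrive one per line or several per line. If the tokenizer ever produced a lattice with more than one token starting at vertex 0, this simple reset check would split too eagerly and would need a different boundary test.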