[developers] [incr tsdb()] batch batch processing

Tue Nov 30 00:50:55 CET 2010

Thanks, Francis!  Some follow-up questions, since I've
been wondering about this along with Mike.

On Mon, Nov 29, 2010 at 3:46 PM, Francis Bond <bond at ieee.org> wrote:
> G'day,
>
> On 30 November 2010 07:13, Michael Wayne Goodman
> <goodmami at u.washington.edu> wrote:
>> Hi folks,
>>
>> I'm trying to increase the speed of multiple calls to [incr tsdb()]'s
>> batch processing, and could use some help.
>>
>> Background: Part of Matrix development involves running a series of
>> regression tests where we customize grammars and compare (using [incr
>> tsdb()]) their parsing results to a gold standard. We have well over a
>> hundred test grammars, and running the whole set takes some time.
>>
>> While we're switching to parsing with PET, we found that most of the
>> time is spent loading LOGON's default Index.lsp. How can we prevent
>> [incr tsdb()] from loading this file on startup? I've had success
>> moving Index.lsp so it cannot be found, but this is not a very good
>> solution.
>
> * To set the database home:
>
> (tsdb:tsdb :home "/home/oe/src/itsdb/src/tsdb/home")
>
> * To set the location of skeletons:
>
> (tsdb:tsdb :skeletons "/home/oe/src/lkb/src/tsdb/skeletons/english")
>
>
> If you point the skeletons to a directory with none, then it will be
> quicker :-).

But I think we need to point the skeletons to a directory with
skeletons, because we are using them to create profiles.

> You can put these in your .tsdbrc.  See:
> http://wiki.delph-in.net/moin/ItsdbCustomization
>
>> Also, do you have any tips for running batches of batch parsing jobs?
>> Specifically, can we load and parse all grammars within a single
>> running lisp instance? Currently we are sending lisp commands to the
>> LOGON scripts for each grammar separately, meaning all the time spent
>> starting up [incr tsdb()] happens for each grammar.
>>
>> Thanks,
>
> If the MRS globals and other settings are compatible, then you can do
> this, but our experience was that it can be the source of weirdness
> that is extremely hard to track down, so it is not recommended.  For
> different versions of the same grammar, though, it should be OK.

They are different grammars, but they all have the same collateral
files, since they're all generated by the customization system.

The question is: how do we do it?

Emily

-- 
Emily M. Bender
Associate Professor
Department of Linguistics
Check out CLMA on facebook! http://www.facebook.com/uwclma