[developers] parsers & performance

Nurit Melnik nuritme at openu.ac.il
Mon Oct 26 11:40:01 CET 2015


Thanks, Woodley, for the additional information, and my apologies for not replying sooner.
You have both been extremely helpful.

Best wishes,
Nurit

-----Original Message-----
From: Woodley Packard [mailto:sweaglesw at sweaglesw.org] 
Sent: Friday, October 23, 2015 9:12 AM
To: Stephan Oepen
Cc: Nurit Melnik; developers at delph-in.net
Subject: Re: [developers] parsers & performance

Thanks indeed Stephan — did you say "brevity"?  I would say you gave a rather substantial (if necessarily non-exhaustive) summary of the history of DELPH-IN parsing efficiency :-)

Since you thoughtfully left room for a few words on ACE, I’ll point out a couple of places where I found room to sing a slightly different tune than PET and the LKB.

* The ACE implementation of hyperactive parsing is largely what is described in Oep:Cal:02.pdf, as you attached.  However, ACE has a configuration parameter to specify that *some* rules should be dealt with hyperactively (i.e. reproducing the AVM for the active edge on demand instead of copying it) while others should be processed in a vanilla active parsing strategy (i.e. keeping a copy around).  The determination of which rules go in which category is empirical; see the first sketch below.

* Extraction of quickcheck vectors is optimized and compiled into native code.

* Careful book-keeping of the scratch slots called for by the Tomabechi unifier allows the "new type" and "forward" slots to share the same memory space (see the second sketch below).

* Packing under subsumption as described by Oepen & Carroll is extended to allow two edges to pack, under some circumstances, even when neither subsumes the other.  I call this packing under *generalization* (i.e. an AVM is constructed that subsumes both); see the third sketch below.
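
Since the first of these may be easier to see in code than in prose, here is a rough sketch of the per-rule choice.  Every name in it (unifies(), make_edge(), the hyperactive flag, and so on) is invented for illustration and does not reflect ACE's actual internals:

    /* hypothetical types and helpers; ACE's real internals differ */
    struct avm;                        /* typed feature structure */
    struct rule { int hyperactive; };  /* flag set per rule, empirically */
    struct edge {
        struct rule *rule;
        struct edge *daughter;         /* kept around for re-unification */
        struct avm  *avm;              /* NULL for a hyperactive edge */
    };

    extern int          unifies(struct rule *, struct edge *);
    extern struct edge *make_edge(struct rule *, struct edge *);
    extern struct avm  *copy_result(struct rule *, struct edge *);

    /* instantiate rule r with first daughter d, yielding an active edge */
    struct edge *start_active(struct rule *r, struct edge *d)
    {
        if (!unifies(r, d))            /* quickcheck, then unification */
            return NULL;
        struct edge *e = make_edge(r, d);
        if (r->hyperactive)
            e->avm = NULL;             /* reproduced on demand whenever e
                                          is extended by a next daughter */
        else
            e->avm = copy_result(r, d);   /* vanilla: keep a copy around */
        return e;
    }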

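And here is a rough sketch of the shared scratch slot, again with invented names.  In a Tomabechi-style quasi-destructive unifier, a forwarded node is always dereferenced before its type is consulted, so the "forward" and "new type" slots are never live at the same time and can occupy the same storage:

    struct type;                       /* forward declarations only */
    struct arc;
    struct dag_node {
        struct type *type;             /* permanent type */
        struct arc  *arcs;             /* permanent outgoing arcs */
        unsigned     generation;       /* validates the scratch fields */
        int          forwarded;        /* discriminates the union below */
        union {
            struct dag_node *forward;  /* node this one was merged into */
            struct type     *new_type; /* type refined during unification */
        } scratch;
        /* copy pointer, comp-arcs list, etc. omitted */
    };
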
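Finally, a very rough sketch of where generalization packing departs from the plain subsumption test, reusing the edge type from the first sketch.  Here worth_packing() is an invented stand-in for the actual (unpublished) criteria, and the surrounding mechanics are only my shorthand:

    extern struct avm  *generalize(struct avm *, struct avm *);
    extern int          worth_packing(struct avm *g, struct avm *a,
                                      struct avm *b);
    extern struct edge *edge_with_avm(struct avm *);
    extern void         pack(struct edge *host, struct edge *alternative);

    /* neither a nor b subsumes the other: try a common generalization */
    void try_generalization_packing(struct edge *a, struct edge *b)
    {
        struct avm *g = generalize(a->avm, b->avm);  /* subsumes both */
        if (worth_packing(g, a->avm, b->avm)) {
            struct edge *host = edge_with_avm(g);
            pack(host, a);             /* both edges survive only as  */
            pack(host, b);             /* alternatives under the host */
        }
    }
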
None of these extensions are described in publications, unfortunately.  The first two, for all I recall, only give quite minor speedups.  The third yielded a roughly 20% reduction in memory usage for ACE.  The last is roughly neutral on short and medium-length sentences, but can yield dramatic speedups (and, considerably less often, dramatic slowdowns in the unpacking phase) on long sentences.

Sorry there’s nothing to cite!  Regards, Woodley

> On Oct 22, 2015, at 1:56 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
> 
> nurit,
> 
>> Can anyone point me to citable information about the efficiency of 
>> HPSG parsers and how it has improved over the years?
> 
> that is an interesting question :-).  exact figures may be non-trivial 
> to come by, and a lot will depend on choices in set-up, e.g. the 
> specific grammar(s) and test data assumed.
> 
> i believe the oldest such figures we have available are from the 
> mid-1990s, comparing the PAGE, LKB, and PET parsers, using different 
> ERG versions and a smallish sample of Verbmobil inputs dubbed ‘aged’.
> Oepen & Callmeier (2002;
> http://link.springer.com/chapter/10.1007%2F1-4020-2295-6_19) report
> 36.7 seconds per input for PAGE in mid-1996 and 0.14 seconds for PET 
> in mid-1999 (on a 300-megahertz ultrasparc server).  taking into 
> account increased grammatical coverage between the two measurements, 
> we conjectured an effective speed-up of three orders of magnitude; see 
> Section 4 in the first attachment below.  the main technological 
> innovations over that period, i would say, were greatly improved 
> unifier throughput, avoidance of unnecessary unifications, parsing 
> strategy optimizations, and careful engineering (moving from Lisp to
> C++).
> 
> for all i recall, additional major speed-ups were later obtained from 
> the introduction of (a) subsumption-based ambiguity packing, (b) 
> n-best selective unpacking, and (c) ubertagging.  the first two were
> originally implemented in the LKB and later imported into PET; 
> ubertagging was first developed in PET and recently integrated with 
> ACE.  unfortunately, i doubt the (relatively simple) ‘aged’ data is 
> complex enough to make these advances fully visible, and a
> head-to-head comparison to the mid-1990s numbers seems impossible for
> other reasons anyway.  as i parse ‘aged’ today (on my 2-gigahertz
> laptop) using PET and ACE, i see average parse times per input of 0.23 
> and 0.15 seconds, respectively.  comparing to Table 1.5 of that old 
> paper, i estimate that enlarged grammatical coverage has increased the 
> search space by a factor of about twenty: there are on average 137 
> readings per input now, compared to seven back then.
> 
> as for published figures on improvements since around 2000, Oepen & 
> Carroll (2000; https://aclweb.org/anthology/A/A00/A00-2022.pdf) report 
> a speed-up of up to a factor of 11 (in the LKB implementation) for 
> subsumption-based packing, using a larger sample of Verbmobil 
> sentences with substantially higher average length (called ‘blend’); 
> please see Table 2 in the above paper.  i seem to recall that one 
> order of magnitude speed-up (on sufficiently complex inputs) was 
> preserved when importing this technique into PET, but i cannot easily 
> think of a publication for that result.
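> 
> to make the packing decision concrete, here is a minimal sketch of
> the proactive and retroactive cases from that paper; subsumes(),
> pack(), freeze(), and the chart interface are all hypothetical
> stand-ins, not the actual LKB or PET code:
> 
>   /* a new passive edge e is about to enter the chart */
>   void insert_edge(struct chart *c, struct edge *e)
>   {
>       for (struct edge *old = edges_for_span(c, e->span);
>            old != NULL; old = old->next) {
>           if (subsumes(old->avm, e->avm)) {
>               pack(old, e);      /* proactive: file e under old, */
>               return;            /* and never extend e itself    */
>           }
>           if (subsumes(e->avm, old->avm)) {
>               pack(e, old);      /* retroactive: file old under e */
>               freeze(old);       /* invalidate work built on old  */
>           }
>       }
>       add_to_chart(c, e);
>   }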
> 
> as for the introduction of selective unpacking, Zhang et al. (2007;
> https://aclweb.org/anthology/W/W07/W07-2207.pdf) report a speed-up by 
> a little more than a factor of two, on the JH4 section of the LOGON 
> corpus (see Table 2 and Figure 4), but much like for ambiguity packing 
> the relative benefits of selective unpacking would be amplified by 
> more complex inputs, for example the venerable WSJ text.
> 
> up to this point, the methods referred to above maintain what one 
> might call the ‘exact inference’ property of parsing with DELPH-IN 
> grammars, i.e. the parser will consider the full search space 
> (constructing a complete forest) and guarantee retrieval of the 
> globally top-ranked n-best trees from the forest.  with the addition 
> of lexical pruning based on ubertagging, Dridan (2013;
> https://aclweb.org/anthology/D/D13/D13-1120.pdf) introduces more 
> approximative techniques into our tool chain—of the kind that has
> long been common in almost all statistical parsing work and in
> systems like Enju and
> C&C—and, thus, there are more choices and trade-offs to consider:
> giving up various degrees of accuracy or coverage (or both), in 
> principle, one can make a parser arbitrarily fast :-).
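> 
> the core of such lexical pruning is easy to sketch; what follows is a
> minimal, hypothetical version that thresholds each token's lexical
> items against the best-scoring one (in the actual set-up, scores come
> from a sequence model over lexical entries, not a per-token
> classifier):
> 
>   #include <math.h>
> 
>   struct lexitem;
> 
>   /* drop lexical items whose tagger log-probability falls more than
>      -log_beta below the best item for the same token */
>   void prune_token(struct lexitem **items, const double *logp,
>                    int n, double log_beta /* <= 0 */)
>   {
>       double best = -HUGE_VAL;
>       for (int i = 0; i < n; i++)
>           if (logp[i] > best) best = logp[i];
>       for (int i = 0; i < n; i++)
>           if (logp[i] < best + log_beta)
>               items[i] = NULL;       /* never enters the chart */
>   }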
> 
> for the configuration that Dridan (2013) considers a good balance 
> along these dimensions, she reports speed-ups of around a factor of 
> three (on WSJ, CB, and WeScience data), while maintaining or 
> moderately improving accuracy and coverage (see Table 5).  still, 
> ubertagging has not quite turned into the ‘standard’ parsing set-up 
> for the ERG or other DELPH-IN grammars, for all i can see—possibly 
> because of the extra model preparation overhead, possibly because of 
> its approximative nature.  when in doubt, i would recommend its use in 
> efficiency-critical settings.
> 
> finally, PET and ACE implement the same general inventory of 
> techniques (including all mentioned above), but i believe ACE has seen 
> some additional optimization of data structures and critical code, as 
> well as several refinements to hyper-active parsing and ambiguity 
> packing; to date, i think there is no summary publication of these 
> improvements and associated efficiency gains.  i shall let woodley 
> speak more for ACE, but i would generally expect it to be somewhat 
> faster (and possibly a little more memory-demanding) than PET, maybe 
> on average up to a factor of two.
> 
> —so, in summary, DELPH-IN parsers to date may be about 1000 x 10 x 2
> x 3 x 2 (i.e. some 120,000) times faster than the mid-1990s PAGE
> system, or about five orders of magnitude, not taking into account
> hardware evolution :-).
> at the same time, our grammars have grown much bigger and more 
> complex, and for the ERG at least i would personally say that current 
> parsing efficiency is sufficient for many tasks but inadequate for 
> others, in particular ‘web-scale’ applications.  we used to feel 
> ‘slow’ compared to, say, the Charniak & Johnson parser ten or fifteen 
> years ago, and i would say we have caught up to C&J (while still 
> solving a deeply more interesting problem, in my view).  but in 
> comparison to Enju, recent CCG work, or a system like TurboParser (or 
> what we know about the dependency parser used internally at Google), i 
> am afraid one would have to consider HPSG parsing comparatively 
> expensive, still.  you get what you pay for.
> 
> on this note, let me take the opportunity to share an unpublished 
> manuscript (from five years ago; see the second attachment), where i 
> tried to reflect on technology development in DELPH-IN since the 
> mid-1990s.  this was originally written for a festschrift volume 
> (which sadly never materialized), so please pardon the style.  while
> writing this email (and also the draft festschrift contribution), i 
> worry that my subjective looking back is bound to omit important 
> evolutionary steps; there has been lots more work on efficiency and 
> robustness, and not all of it has been synthesized into the ‘standard’
> DELPH-IN parsing tools.  thus, all, please correct and complement as 
> you see fit, and please forgive the brevity and haste of this message!
> 
> best wishes, oe
> <Oep:Cal:02.pdf><uszkoreit.pdf>



