[developers] Extracting and representing tree structures

Petter Haugereid petterha at gmail.com
Tue Nov 27 07:48:19 CET 2012


Hi everybody,

As some of you know, the grammar I am writing has rather
unconventional (left-branching) tree structures. I am currently
writing a grammar for Hebrew, and the need for more readable tree
representations that show the constituent structure has become
apparent. (The constituent structure is retrievable from the AVM of
the parsed string.) I have written a Python script that reads the AVM
text files exported from itsdb and produces LaTeX trees. Ideally, I
would have liked the LKB to produce the trees itself. The AVMs I have
look something like this:

[ popping-rule
   HEAD  @1 verb
   STACK @3< >
   SLASH  < >
   ARGS  <
[ HEAD  compl
   STACK @2< [ HEAD  @1 ] > $\oplus$ @3
   SLASH  < >
   ARGS <
[ HEAD  compl
   STACK @2
   SLASH  < >
   ARGS <
[ embedding-rule
   HEAD  compl
   STACK @2< [ HEAD  @1 ] > $\oplus$ @3
   SLASH  < >
   ARGS <
[ HEAD  @1 verb
   STACK  @3<>
   SLASH  < >
   ARGS <
[ HEAD   verb
   STACK  @3<>
   SLASH  < @4 >
   ARGS < @4[ ORTH   Jon
             HEAD   noun ],
          [ ORTH   sier
             HEAD   verb ] > ] > ],
          [ ORTH   at
             HEAD   compl ] > ],
          [ ORTH   han
             HEAD   noun ] > ],
          [ ORTH   sov
             HEAD   verb ] > ] > ]

The constituent structure can be retrieved by looking at the STACK
feature of the mother of each word (the STACK shows the path to the
root):

[ VP
  [ N Jon ]
  [ V sier ]
  [ CP
    [ C at ]
    [ NP han ]
    [ V sov ] ] ]
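
For illustration only (this is not the script mentioned above), here is
a minimal Python sketch of the last step: turning such a constituent
structure into LaTeX. It assumes the tree has already been recovered as
nested (label, children) pairs, and it emits the bracket notation of the
LaTeX forest package. The function name to_forest and the tuple encoding
are hypothetical, and special characters in labels are not escaped.

```python
# Hypothetical sketch: nested (label, children) pairs -> forest bracket notation.
# A leaf (a word) is a plain string; an inner node is a (label, children) tuple.

def to_forest(node):
    """Return the forest-style bracket string for a tree node."""
    if isinstance(node, str):
        # A word becomes a childless node.
        return "[" + node + "]"
    label, children = node
    # An inner node is its label followed by its rendered children.
    return "[" + label + " " + " ".join(to_forest(c) for c in children) + "]"


# The example tree from this mail, encoded by hand.
tree = ("VP", [("N", ["Jon"]),
               ("V", ["sier"]),
               ("CP", [("C", ["at"]),
                       ("NP", ["han"]),
                       ("V", ["sov"])])])

# Wrap the bracket string in a forest environment for use in a LaTeX file.
latex = "\\begin{forest}\n" + to_forest(tree) + "\n\\end{forest}"
print(latex)
```

The resulting text can be pasted into a document that loads the forest
package. Deriving the nested pairs from the STACK values is the harder
part and is not covered by this sketch.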

Is there a way to do this with the LKB? And can it be done without
changing the source code?
(I apologize for my previous unfinished mail, if you received it.)

Best,

Petter
-- 
Petter Haugereid
Postdoctoral Researcher
Department of Computer Science
University of Haifa

