[lkb] Parsing rich data with LKB

Glenn Slayden glenn at thai-language.com
Sat Mar 20 20:28:24 CET 2010


I'm certainly no expert on the LKB, but two approaches come to mind.
Both are hacky.

1. You could attach your metadata to lemmas as coded suffixes, and then use
the morphological machinery of the LKB to map these into the appropriate
feature structures. This approach would let you use a static grammar to
parse unseen sentences, as you probably want.

2. You could programmatically generate TDL with the feature structures for a
particular input, and then load that as part of a "grammar" that is
specific to parsing that one input.
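
In case it helps, here is a rough Python sketch of the preprocessing that
either approach would need. It is only an illustration under assumptions of
my own, not anything the LKB provides: the Token and Gesture records, the
"_AG"-style suffix code, and the TDL names (word-lex, STEM, PROSODY.ACCENT,
GESTURE.OVERLAP) are all hypothetical and would have to match whatever your
grammar actually defines. It also assumes that temporal overlap is computed
outside the grammar and handed to the LKB as a plain flag.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str             # orthographic word
    start: float          # onset time in seconds
    end: float            # offset time in seconds
    accent: str = "none"  # pitch accent label, e.g. "H*" (hypothetical scheme)


@dataclass
class Gesture:
    label: str
    start: float
    end: float


def overlaps(tok: Token, ges: Gesture) -> bool:
    """True if the word and gesture intervals share any stretch of time."""
    return tok.start < ges.end and ges.start < tok.end


def has_gesture(tok: Token, gestures: list[Gesture]) -> bool:
    """True if at least one gesture overlaps the word."""
    return any(overlaps(tok, g) for g in gestures)


def encode_suffix(tok: Token, gestures: list[Gesture]) -> str:
    """Approach 1: pack the metadata into a coded suffix on the word form.

    The code is hypothetical: A/U = accented/unaccented,
    G/N = overlapping gesture / no gesture.
    """
    accent = "A" if tok.accent != "none" else "U"
    gesture = "G" if has_gesture(tok, gestures) else "N"
    return f"{tok.form}_{accent}{gesture}"


def to_tdl(index: int, tok: Token, gestures: list[Gesture]) -> str:
    """Approach 2: emit a per-input lexical entry as TDL text.

    Type and feature names are placeholders, not from a real grammar.
    """
    overlap = "+" if has_gesture(tok, gestures) else "-"
    return (
        f"tok_{index} := word-lex &\n"
        f'  [ STEM < "{tok.form}" >,\n'
        f'    PROSODY.ACCENT "{tok.accent}",\n'
        f"    GESTURE.OVERLAP {overlap} ].\n"
    )


if __name__ == "__main__":
    tokens = [
        Token("put", 0.0, 0.3, "H*"),
        Token("it", 0.3, 0.4),
        Token("there", 0.4, 0.9, "H*"),
    ]
    gestures = [Gesture("point", 0.35, 0.8)]

    # Approach 1: one pre-tokenized string to feed to a static grammar.
    print(" ".join(encode_suffix(t, gestures) for t in tokens))

    # Approach 2: a throwaway TDL file for this one utterance.
    for i, t in enumerate(tokens):
        print(to_tdl(i, t, gestures))
```

With approach 1 the grammar would still need morphological rules that strip
each suffix code and add the matching constraints, and you would want to
check that the LKB's tokenization leaves the chosen separator character
intact. With approach 2 the generated entries would be loaded alongside the
fixed part of the grammar for each utterance.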

Best,

Glenn

-----Original Message-----
From: lkb-bounces at emmtee.net [mailto:lkb-bounces at emmtee.net] On Behalf Of
Katya Alahverdzhieva
Sent: Wednesday, March 17, 2010 6:48 AM
To: lkb at delph-in.net
Subject: [lkb] Parsing rich data with LKB

Dear LKB people,

How would you go about using the LKB to parse data that is richer than just 
text, and also to define temporal constraints? How do I parse data which 
comes not as a stream of tokens, but as a list of feature structures?

I have a corpus of transcriptions of spoken text, annotated with gesture 
and prosody information, including the time of their performance. I'm 
trying to write a grammar in the LKB whose rules take into account the 
timestamps, the pitch accents and the gestures, represented as sets of 
feature-value pairs.

For instance, I need to somehow capture in my grammar rules the notion 
of temporal overlap, i.e., whether a gesture is happening at the same 
time as a word/sequence of words. Also, I am trying to parse richer data 
where words are not just tokens, but whole feature structures 
(containing prosody, timestamps and gesture description).

In practical terms, what would everyone's recommended approach be to 
parsing structured data like this and to comparing temporal 
performances? Are there plugins or such software for this? Does anyone 
know of any examples that I could look at to examine how it's done?

Thanks in advance for any hints!

Cheers
Katya




