[lkb] Parsing rich data with LKB

Katya Alahverdzhieva K.Alahverdzhieva at sms.ed.ac.uk
Wed Mar 17 14:48:24 CET 2010


Dear LKB people,

How would you go about using LKB to parse data that is richer than just 
text, and also to define temporal constraints? How do I parse data which 
comes not as a stream of tokens, but as a list of feature structures?

I have a corpus of transcriptions of spoken text, annotated with gesture 
and prosody information, including the time of their performance. I'm 
trying to write a grammar in LKB whose rules take into account the 
timestamps, the pitch accents and the gestures represented as sets of 
feature-value pairs.

For instance, I need to somehow capture in my grammar rules the notion 
of temporal overlap, i.e., whether a gesture is happening at the same 
time as a word/sequence of words. Also, I am trying to parse richer data 
where words are not just tokens, but whole feature structures 
(containing prosody, timestamps and gesture description).
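
As a rough illustration of what temporal overlap means for input of this 
kind, here is a minimal sketch in Python. It is not LKB or TDL code and 
says nothing about how LKB would handle such input. The class and 
function names (Interval, Token, Gesture, overlaps, words_overlapping) 
are hypothetical, and the words, timestamps and feature values are 
invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Interval:
    start: float  # seconds from the start of the recording
    end: float


@dataclass
class Token:
    # A word as a small feature structure rather than a bare string.
    form: str
    span: Interval
    pitch_accent: Optional[str] = None  # None if the word is unaccented


@dataclass
class Gesture:
    span: Interval
    features: Dict[str, str] = field(default_factory=dict)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share a stretch of time of non-zero length."""
    return a.start < b.end and b.start < a.end


def words_overlapping(gesture: Gesture, tokens: List[Token]) -> List[str]:
    """Forms of the words whose time span overlaps the gesture's span."""
    return [t.form for t in tokens if overlaps(gesture.span, t.span)]


# Invented example data, for illustration only.
tokens = [
    Token("put", Interval(0.00, 0.25)),
    Token("it", Interval(0.25, 0.35)),
    Token("there", Interval(0.35, 0.80), pitch_accent="H*"),
]
gesture = Gesture(Interval(0.30, 0.90), {"hand-shape": "index-extended"})

result = words_overlapping(gesture, tokens)
print(result)
```

With these invented values the gesture overlaps "it" and "there" but not 
"put". Intervals that merely touch at an endpoint are treated as 
non-overlapping here, which is a design choice rather than a requirement.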

In practical terms, what approach would you recommend for parsing 
structured data like this and for comparing the timing of words and 
gestures? Are there plugins or other software for this? Does anyone 
know of any examples that I could look at to see how it's done?

Thanks in advance for any hints!

Cheers
Katya

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



