[developers] Searching treebanks

Ned Letcher ned at nedned.net
Fri May 22 03:57:00 CEST 2020


Heya Francis,

I surveyed syntactic querying tools for treebank search in my thesis.
During development of Typediff <https://github.com/ned2/typediff>, I needed
to embed an interactive querying interface for DELPHIN treebanks, and came
to the conclusion that Fangorn was the best tool for the job. Sadly there
is no live version of Typediff currently.

I found that Fangorn <https://github.com/sparcs/fangorn> itself wasn't too
hard to get running, and as part of Typediff I created a tool
<https://github.com/ned2/typediff/blob/af2d91c3221182ddb0c8cf55db4127c5c5587544/typediff/parseit.py>
for converting DELPHIN treebanks into the format that Fangorn expects,
which you might be able to use.
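
As a rough illustration of what such a conversion involves (this is not the
code in parseit.py), here is a minimal sketch. It assumes Fangorn indexes
Penn Treebank-style bracketed trees, and it uses a hypothetical in-memory
representation where each node is a (label, children) tuple and each leaf is
a plain token string; a real DELPHIN derivation would first need to be mapped
into that shape.

```python
def to_bracketed(node):
    """Render a (label, children) tuple as a Penn-style bracketed string.

    The (label, children) layout is a hypothetical stand-in for a real
    derivation tree; leaves are plain token strings.
    """
    if isinstance(node, str):
        # Escape parentheses in tokens so they are not read as tree brackets.
        return node.replace("(", "-LRB-").replace(")", "-RRB-")
    label, children = node
    return "(" + label + " " + " ".join(to_bracketed(c) for c in children) + ")"


# Toy tree for "I start lecturing", using Penn-style labels for illustration.
tree = ("S", [
    ("NP", [("PRP", ["I"])]),
    ("VP", [
        ("VB", ["start"]),
        ("S", [("VP", [("VBG", ["lecturing"])])]),
    ]),
])

print(to_bracketed(tree))
# (S (NP (PRP I)) (VP (VB start) (S (VP (VBG lecturing)))))
```

Writing one such string per tree gives a plain-text file that a bracketed-tree
indexer could read, under the assumptions above.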

I have been hoping to get a version of Typediff up and running somewhere
but it's not something I've been able to prioritise. If I do, I will be
sure to let you know :)

Cheers,
Ned

On Thu, 27 Feb 2020 at 01:09, Emily M. Bender <ebender at uw.edu> wrote:

> For search over semantic representations (MRS, DM, EDS) there's WeSearch:
>
> http://wesearch.delph-in.net/
>
> ... which indexes DeepBank and WikiWoods.
>
> Emily
>
> On Wed, Feb 26, 2020 at 5:29 AM Francis Bond <bond at ieee.org> wrote:
>
>> Thanks for the tip. If only we all sensibly annotated our corpora with
>> typecraft.
>>
>> On Wed, Feb 26, 2020 at 9:21 PM Lars Hellan <lars.hellan at ntnu.no> wrote:
>>
>>> Hi Francis,
>>>
>>> For Norwegian you can do such things through
>>> https://typecraft.org/tc2wiki/Norwegian_Valency_Corpus, a corpus of
>>> about 20,000 sentences.
>>>
>>>
>>> (Not right on your mark, but perhaps not too far from the sphere of
>>> "anything" ...)
>>>
>>>
>>> Best
>>>
>>> Lars
>>> ------------------------------
>>> *From:* developers-bounces at emmtee.net <developers-bounces at emmtee.net>
>>> on behalf of Francis Bond <bond at ieee.org>
>>> *Sent:* Wednesday, February 26, 2020 2:02:28 PM
>>> *To:* Stephan Oepen; developers at delph-in.net; Rebecca Dridan; Timothy
>>> Baldwin
>>> *Subject:* [developers] Searching treebanks
>>>
>>> G'day,
>>>
>>> does anyone know of any way to search Redwoods (or DELPHIN treebanks in
>>> general) for trees of a certain type (using something like the Fangorn
>>> interface)? For example, I want to find how often in the treebank 'start'
>>> is intransitive vs NP V VP-ving vs NP V VP-to vs NP V NP (I start; I
>>> start lecturing; I start to lecture; I start a lecture).
>>>
>>> In fangorn this was "//VP/VB/start[->S/VP/VBG" for NP V VP-ving, ...
>>>
>>> I would be ecstatic if there were an online search I can point my
>>> students at, but would be interested in anything.
>>>
>>>
>>>
>>> --
>>> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
>>> Division of Linguistics and Multilingual Studies
>>> Nanyang Technological University
>>>
>>
>
>
> --
> Emily M. Bender (she/her)
> Howard and Frances Nostrand Endowed Professor
> Department of Linguistics
> Faculty Director, CLMS
> University of Washington
> Twitter: @emilymbender
>


-- 
nedned.net