[developers] WikiWoods

Stephan Oepen oe at ifi.uio.no
Tue Jun 11 01:03:41 CEST 2019


hi alexandre,

which specific version of WikiWoods are you looking at?

starting from 1212 (i.e. the larger and cleaner GML version of the
texts), article names (into the corresponding wikipedia dump) should
be encoded using ⌊δ ... δ⌋ tags.
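
as a rough illustration, the article names could be pulled out of the
text with a regular expression. this is only a sketch: it assumes the
tags appear literally as ⌊δ and δ⌋ around the article name, and the
names ARTICLE_TAG, article_names and sample are made up for this
example (they are not part of any WikiWoods tooling).

```python
import re

# Assumption: an article name is wrapped literally as ⌊δ name δ⌋.
# The exact layout inside the real GML files may differ.
ARTICLE_TAG = re.compile(r"⌊δ(.*?)δ⌋", re.DOTALL)


def article_names(text):
    """Return the article names found in text, in order of appearance."""
    return [name.strip() for name in ARTICLE_TAG.findall(text)]


# Hypothetical sample input, not taken from the corpus.
sample = (
    "⌊δAlbert Einsteinδ⌋\n"
    "He was a physicist.\n"
    "⌊δ Oslo δ⌋\n"
    "Oslo is a city.\n"
)

if __name__ == "__main__":
    for name in article_names(sample):
        print(name)
```

the text between one tag and the next would then belong to the article
named by the first tag, again assuming the files are laid out that way.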

best wishes, oe

On Mon, Jun 10, 2019 at 11:58 PM Alexandre Rademaker
<arademaker at gmail.com> wrote:
>
>
> Does anyone know if, in the files of the WikiWoods corpus (http://moin.delph-in.net/WikiWoods), we can identify the original Wikipedia page of each sentence? That is, can we reconstruct the text of the Wikipedia page?
>
> Best,
>
> --
> Alexandre Rademaker
> http://arademaker.github.io
>
>
>


