[developers] Parsing with ACE

Petter Haugereid petterha at gmail.com
Wed Jun 24 14:36:03 CEST 2015


Hi,

I am trying to load my Norwegian grammar into ACE, but I run into some
issues when I try to parse a sentence.

Loading the grammar seems to go fine (the config file is based on that of
Jacy):

petter@tor:~/tools/ace-0.9.21$ ./ace -G norsyg.dat -g ../../logon/petter/norsyg/ace/config.tdl
reading configuration       from `../../logon/petter/norsyg/ace/config.tdl'
reading instance            from `../../logon/petter/norsyg/ace/../pet/qc.tdl'
reading types               from `../../logon/petter/norsyg/ace/../mtr.tdl'
grammar version             Norsyg (1206)
format error: unknown type `+'.
reading grammar             from `../../logon/petter/norsyg/ace/../norwegian.tdl'
reading lexical-filtering-rulefrom `../../logon/petter/norsyg/ace/../lfr.tdl'
reading types               from `../../logon/petter/norsyg/ace/../matrix.tdl'
reading types               from `../../logon/petter/norsyg/ace/../nor.tdl'
reading types               from `../../logon/petter/norsyg/ace/../infl-codes.tdl'
reading types               from `../../logon/petter/norsyg/ace/../tmt.tdl'
reading types               from `../../logon/petter/norsyg/ace/../unknown.tdl'
reading lexical entries     from `../../logon/petter/norsyg/ace/../tiny-lex.tdl'
reading token-mapping-rule  from `../../logon/petter/norsyg/ace/../tmr/prelude.tdl'
reading token-mapping-rule  from `../../logon/petter/norsyg/ace/../tmr/pos.tdl'
reading token-mapping-rule  from `../../logon/petter/norsyg/ace/../tmr/pos-ipa.tdl'
reading token-mapping-rule  from `../../logon/petter/norsyg/ace/../tmr/finis.tdl'
reading generic-lex-entry   from `../../logon/petter/norsyg/ace/../gle.tdl'
reading rules               from `../../logon/petter/norsyg/ace/../rules.tdl'
reading lexical rules       from `../../logon/petter/norsyg/ace/../tiny-irules.tdl'
reading instance            from `../../logon/petter/norsyg/ace/../labels.tdl'
reading instance            from `../../logon/petter/norsyg/ace/../roots.tdl'
checking for glbs...        0.53 sec
processing constraints...   0.67 sec
processing rules            35 ms
processing lex-rules        0 ms
reading irregular forms     from ../irregs.tab
processing lexicon...       1 ms
simple lexemes              0 / 3 = 0.00%
3336 types (1501 glb), 3 lexemes, 77 rules, 1 orules, 983 instances, 722 strings, 234 features
loading maxent model        0 ms
reading tree labels         from `../../logon/petter/norsyg/ace/../labels.tdl'
loading tree-node-labels
rule filter...              83.3% blocked (39.1% ss)
rule filter...              83.3% blocked (39.0% ss)
rule filter...              83.3% blocked (39.0% ss)
rf-transitive closure...    1 ms
loaded grammar in 2.41391s
 types: 33.9M rules: 8.4M lex-info: 500
 miscellaneous: 62K lex-dgs: 71K miscellaneous: 13.7M sem-index: 85K stochastic-model: 0 latmap rules: 18K
 ... freezing 55.8M to file map 0x6000000000


But when I try to parse the sentence "Jon sover", I get an error message:

petter@tor:~/tools/ace-0.9.21$ ./ace -g norsyg.dat -Tf1
Jon sover
ERROR: toklist or toklast missing on a token
NOTE: lexemes do not span position 0 `^'!
NOTE: post reduction gap
SKIP: Jon sover
NOTE: ignoring `Jon sover'

It should be noted that I use REPP to add "^ " at the beginning of every
input string, so the string the grammar attempts to parse is "^ Jon sover".
("^" has a lexical entry.)
I don't quite understand the meaning of the ERROR message. I have tried to
find out whether any TOKENS features are missing from the grammar, but I
don't know what ACE expects of the grammar. I am attaching a stripped-down
version of the grammar in case anyone would like to try to find out
what goes wrong. (The config file is in ace/.)
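
To make the token positions in the NOTE lines concrete, here is a minimal Python sketch of what the preprocessing step described above does to the input. It is only an illustration: the function names are hypothetical, plain whitespace splitting is an assumption, and neither REPP nor ACE's actual tokenizer is modelled here.

```python
def prepend_start_marker(text: str, marker: str = "^") -> str:
    """Put the marker and a space in front of the input string.

    This only imitates the effect described in the message
    ("^ " added at the start of every input); it is not REPP.
    """
    return f"{marker} {text}"


def tokenize(text: str) -> list[tuple[int, str]]:
    """Split on whitespace and number the tokens from 0.

    Whitespace splitting is an assumption made for this sketch.
    """
    return list(enumerate(text.split()))


if __name__ == "__main__":
    preprocessed = prepend_start_marker("Jon sover")
    for position, form in tokenize(preprocessed):
        print(position, form)
```

Under these assumptions, position 0 holds "^", which is the position the NOTE line complains about.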

Best regards,

Petter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: norsyg_2015-06-24.tgz
Type: application/x-gzip
Size: 557909 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20150624/c3104675/attachment-0001.tgz>

