[developers] Parsing with ACE
Petter Haugereid
petterha at gmail.com
Wed Jun 24 14:36:03 CEST 2015
Hi,
I am trying to load my Norwegian grammar into ACE, but parsing fails with
an error as soon as I try a sentence.
Loading the grammar seems to go fine (the config file is based on that of
Jacy):
petter@tor:~/tools/ace-0.9.21$ ./ace -G norsyg.dat -g ../../logon/petter/norsyg/ace/config.tdl
reading configuration from `../../logon/petter/norsyg/ace/config.tdl'
reading instance from
`../../logon/petter/norsyg/ace/../pet/qc.tdl'
reading types from `../../logon/petter/norsyg/ace/../mtr.tdl'
grammar version Norsyg (1206)
format error: unknown type `+'.
reading grammar from
`../../logon/petter/norsyg/ace/../norwegian.tdl'
reading lexical-filtering-rulefrom
`../../logon/petter/norsyg/ace/../lfr.tdl'
reading types from
`../../logon/petter/norsyg/ace/../matrix.tdl'
reading types from `../../logon/petter/norsyg/ace/../nor.tdl'
reading types from
`../../logon/petter/norsyg/ace/../infl-codes.tdl'
reading types from `../../logon/petter/norsyg/ace/../tmt.tdl'
reading types from
`../../logon/petter/norsyg/ace/../unknown.tdl'
reading lexical entries from
`../../logon/petter/norsyg/ace/../tiny-lex.tdl'
reading token-mapping-rule from
`../../logon/petter/norsyg/ace/../tmr/prelude.tdl'
reading token-mapping-rule from
`../../logon/petter/norsyg/ace/../tmr/pos.tdl'
reading token-mapping-rule from
`../../logon/petter/norsyg/ace/../tmr/pos-ipa.tdl'
reading token-mapping-rule from
`../../logon/petter/norsyg/ace/../tmr/finis.tdl'
reading generic-lex-entry from `../../logon/petter/norsyg/ace/../gle.tdl'
reading rules from
`../../logon/petter/norsyg/ace/../rules.tdl'
reading lexical rules from
`../../logon/petter/norsyg/ace/../tiny-irules.tdl'
reading instance from
`../../logon/petter/norsyg/ace/../labels.tdl'
reading instance from
`../../logon/petter/norsyg/ace/../roots.tdl'
checking for glbs... 0.53 sec
processing constraints... 0.67 sec
processing rules 35 ms
processing lex-rules 0 ms
reading irregular forms from ../irregs.tab
processing lexicon... 1 ms
simple lexemes 0 / 3 = 0.00%
3336 types (1501 glb), 3 lexemes, 77 rules, 1 orules, 983 instances, 722
strings, 234 features
loading maxent model 0 ms
reading tree labels from
`../../logon/petter/norsyg/ace/../labels.tdl'
loading tree-node-labels
rule filter... 83.3% blocked (39.1% ss)
rule filter... 83.3% blocked (39.0% ss)
rule filter... 83.3% blocked (39.0% ss)
rf-transitive closure... 1 ms
loaded grammar in 2.41391s
types: 33.9M rules: 8.4M lex-info: 500
miscellaneous: 62K lex-dgs: 71K miscellaneous: 13.7M sem-index: 85K
stochastic-model: 0 latmap rules: 18K
... freezing 55.8M to file map 0x6000000000
But when I try to parse the sentence "Jon sover", I get an error message:
petter@tor:~/tools/ace-0.9.21$ ./ace -g norsyg.dat -Tf1
Jon sover
ERROR: toklist or toklast missing on a token
NOTE: lexemes do not span position 0 `^'!
NOTE: post reduction gap
SKIP: Jon sover
NOTE: ignoring `Jon sover'
Note that I use REPP to prepend "^ " to every input string, so the string
the grammar actually attempts to parse is "^ Jon sover". ("^" has a
lexical entry.)
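To make the preprocessing step concrete, here is a minimal Python sketch of the effect the REPP rule has on the input. It is only an illustration of the string rewrite, not REPP syntax and not the rule from the grammar; the function name `prepend_caret` is hypothetical.

```python
import re


def prepend_caret(text: str) -> str:
    """Mimic the effect described above: insert "^ " at the start of the input.

    Assumption: the real REPP rule is a regex substitution anchored at the
    beginning of the string; this only reproduces its visible result.
    """
    # The pattern "^" matches the empty string at position 0; count=1 keeps
    # the substitution to that single position.
    return re.sub(r"^", "^ ", text, count=1)


if __name__ == "__main__":
    rewritten = prepend_caret("Jon sover")
    print(rewritten)
    # Whitespace tokenization of the rewritten string puts "^" first,
    # i.e. at the position the NOTE in the parser output refers to.
    print(rewritten.split())
```

With this rewrite, "^" becomes the first whitespace-separated token of the input, ahead of "Jon" and "sover".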
I don't quite understand the ERROR message. I have tried to find out
whether any TOKENS features are missing from the grammar, but I don't know
what ACE expects of the grammar. I am attaching a stripped-down version of
the grammar in case anyone would like to try to find out what goes wrong.
(The config file is in ace/.)
Best regards,
Petter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: norsyg_2015-06-24.tgz
Type: application/x-gzip
Size: 557909 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20150624/c3104675/attachment-0001.tgz>