<div dir="ltr">Thanks a lot! I am now an ACE user.</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 25, 2015 at 12:19 AM, Woodley Packard <span dir="ltr">&lt;<a href="mailto:sweaglesw@sweaglesw.org" target="_blank">sweaglesw@sweaglesw.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Petter,<br>
<br>
I notice "format error: unknown type `+'." in the grammar loading log. The log does not say where that comes from, but in fact it refers to line 53 of rpp/lkb.rpp, where a rule starts with '+' when ACE ungenerously believes it ought to start with '!'.<br>
<br>
The next problem I found is that lexemes have no TOKENS feature. This feature is introduced on the type `word' by a type addendum in tmt.tdl, but lexemes do not inherit from `word'. With a token-aware workflow, the output of the token mapping phase is unified into the TOKENS feature of lexemes; when that feature is missing or not appropriate, ACE ends up in a situation it does not expect.<br>
<br>
Additionally, the token mapping rule "generic_name_tmr" is defaulting all tokens to [ +TRAIT: generic_trait ], which means they are incompatible with native lexical entries. Since there are no POS tags, the generic lexical entries are also incompatible, so you get no lexemes and no parse.<br>
<br>
Finally, the tiny-lex.tdl lexicon has a start-of-string lexeme whose orthography is "START" rather than "^", which makes it unable to match the "^" introduced by the REPP rules.<br>
<br>
I took the liberty of making three changes: changing tmt.tdl to introduce TOKENS and the accompanying constraints on word-or-lexrule instead of word, commenting out generic_name_tmr, and rewriting "START" to "^" in tiny-lex.tdl. With these changes I can parse "Jon sover" and get a plausible-looking MRS out.<br>
<br>
I hope that is helpful advice,<br>
-Woodley<br>
<div><div class="h5"><br>
> On Jun 24, 2015, at 5:36 AM, Petter Haugereid &lt;<a href="mailto:petterha@gmail.com">petterha@gmail.com</a>&gt; wrote:<br>
><br>
> Hi,<br>
><br>
> I am trying to load my Norwegian grammar into ACE, but I run into some issues when I try to parse a sentence.<br>
><br>
> Loading the grammar seems to go fine (the config file is based on that of Jacy):<br>
><br>
> petter@tor:~/tools/ace-0.9.21$ ./ace -G norsyg.dat -g ../../logon/petter/norsyg/ace/config.tdl<br>
> reading configuration from `../../logon/petter/norsyg/ace/config.tdl'<br>
> reading instance from `../../logon/petter/norsyg/ace/../pet/qc.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../mtr.tdl'<br>
> grammar version Norsyg (1206)<br>
> format error: unknown type `+'.<br>
> reading grammar from `../../logon/petter/norsyg/ace/../norwegian.tdl'<br>
> reading lexical-filtering-rulefrom `../../logon/petter/norsyg/ace/../lfr.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../matrix.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../nor.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../infl-codes.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../tmt.tdl'<br>
> reading types from `../../logon/petter/norsyg/ace/../unknown.tdl'<br>
> reading lexical entries from `../../logon/petter/norsyg/ace/../tiny-lex.tdl'<br>
> reading token-mapping-rule from `../../logon/petter/norsyg/ace/../tmr/prelude.tdl'<br>
> reading token-mapping-rule from `../../logon/petter/norsyg/ace/../tmr/pos.tdl'<br>
> reading token-mapping-rule from `../../logon/petter/norsyg/ace/../tmr/pos-ipa.tdl'<br>
> reading token-mapping-rule from `../../logon/petter/norsyg/ace/../tmr/finis.tdl'<br>
> reading generic-lex-entry from `../../logon/petter/norsyg/ace/../gle.tdl'<br>
> reading rules from `../../logon/petter/norsyg/ace/../rules.tdl'<br>
> reading lexical rules from `../../logon/petter/norsyg/ace/../tiny-irules.tdl'<br>
> reading instance from `../../logon/petter/norsyg/ace/../labels.tdl'<br>
> reading instance from `../../logon/petter/norsyg/ace/../roots.tdl'<br>
> checking for glbs... 0.53 sec<br>
> processing constraints... 0.67 sec<br>
> processing rules 35 ms<br>
> processing lex-rules 0 ms<br>
> reading irregular forms from ../irregs.tab<br>
> processing lexicon... 1 ms<br>
> simple lexemes 0 / 3 = 0.00%<br>
> 3336 types (1501 glb), 3 lexemes, 77 rules, 1 orules, 983 instances, 722 strings, 234 features<br>
> loading maxent model 0 ms<br>
> reading tree labels from `../../logon/petter/norsyg/ace/../labels.tdl'<br>
> loading tree-node-labels<br>
> rule filter... 83.3% blocked (39.1% ss)<br>
> rule filter... 83.3% blocked (39.0% ss)<br>
> rule filter... 83.3% blocked (39.0% ss)<br>
> rf-transitive closure... 1 ms<br>
> loaded grammar in 2.41391s<br>
> types: 33.9M rules: 8.4M lex-info: 500<br>
> miscellaneous: 62K lex-dgs: 71K miscellaneous: 13.7M sem-index: 85K stochastic-model: 0 latmap rules: 18K<br>
> ... freezing 55.8M to file map 0x6000000000<br>
><br>
><br>
> But when I try to parse the sentence "Jon sover", I get an error message:<br>
><br>
> petter@tor:~/tools/ace-0.9.21$ ./ace -g norsyg.dat -Tf1<br>
> Jon sover<br>
> ERROR: toklist or toklast missing on a token<br>
> NOTE: lexemes do not span position 0 `^'!<br>
> NOTE: post reduction gap<br>
> SKIP: Jon sover<br>
> NOTE: ignoring `Jon sover'<br>
><br>
> It should be noted that I use REPP to add "^ " at the beginning of every input string, so the string the grammar attempts to parse is "^ Jon sover". ("^" has a lexical entry.)<br>
> I don't quite understand the meaning of the ERROR message. I have tried to find out whether any TOKENS features are missing in the grammar, but I don't know what ACE expects of the grammar. I am attaching a stripped-down version of the grammar in case anyone would like to try to find out what goes wrong. (The config file is in ace/.)<br>
><br>
> Best regards,<br>
><br>
> Petter<br>
</div></div>> &lt;norsyg_2015-06-24.tgz&gt;<br>
<br>
</blockquote></div><br></div>