[developers] chart mapping missing applicable lexical filtering rule?
paul at haleyai.com
Sat Jul 7 21:20:35 CEST 2018
Dear Developers,
In one use case, it would be nice to limit the use of capitalized proper nouns to cases in which the input is capitalized. I have been successful in doing so, with some exceptions such as the one shown below.
I am surprised by the following behavior: either I have something to learn, or perhaps there is a bug in PET's chart mapping.
Regards,
Paul
Given a capitalized lexical entry such as:
Bank_NNP := n_-_pn_le & [ORTH <"Bank">,SYNSEM [LKEYS.KEYREL.CARG "Bank",PHON.ONSET con]].
The following lexical filtering rule (simplified for demonstration purposes in this email):
veto_capitalized_native_uncapitalized_lfr := lexical_filtering_rule & [+CONTEXT <>,+INPUT <[ORTH.FIRST ^[[:upper:]].*$]>,+OUTPUT <>].
will 'correctly' remove Bank_NNP from the chart when the input is "it is the bank", but will fail to do so when a period is appended.
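As a sanity check on the pattern itself, here is a minimal sketch in Python of which strings the rule's regular expression would accept. It assumes the usual POSIX meaning of [[:upper:]]; since Python's re module has no POSIX character classes, [A-Z] is used as an ASCII stand-in. The helper name is_capitalized is hypothetical, and this says nothing about how PET itself evaluates the rule.

```python
import re

# Stand-in for the rule's pattern ^[[:upper:]].*$ .
# Assumption: [A-Z] is equivalent to [[:upper:]] for the ASCII strings below.
PATTERN = re.compile(r"^[A-Z].*$")


def is_capitalized(value: str) -> bool:
    """Return True if the string starts with an upper-case ASCII letter."""
    return PATTERN.match(value) is not None


# Strings taken from the lexical entry and the token logs in this email.
candidates = {
    "ORTH.FIRST of Bank_NNP": "Bank",
    "token form without period": "bank",
    "token form with period": "bank.",
}

for label, value in candidates.items():
    print(f"{label}: {value!r} matches = {is_capitalized(value)}")
```

Under these assumptions the pattern accepts the ORTH.FIRST value "Bank" and rejects both token forms, so the appended period should not affect whether the pattern matches the orthography of the lexical item.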
PET's logging of lexical rules shows as follows for the first case:
[cm] veto_capitalized_native_uncapitalized_lfr fired: I1:85
L [85 2-3 the_pn_np1_no (1) -0.1123 {} { : } {}] < blk: 2 dtrs: 50 parents: >
[cm] veto_capitalized_native_uncapitalized_lfr fired: I1:92
L [92 3-4 Bank_NNP (1) 0 {} { : } {}] < blk: 2 dtrs: 51 parents: 98 >
[cm] veto_capitalized_native_uncapitalized_lfr fired: I1:98
P [98 3-4 n_sg_ilr (1) 0 {} { : } {}] < blk: 2 dtrs: 92 parents: >
Surprisingly, only the first of these three rule firings occurs in the second case.
I don't think it matters, but in our case, input is via FSC in which the period is a token. Thus, the following token mapping rule applies in the second case only:
[cm] suffix_punctuation_tmr fired: C1:50 I1:48 O1:51
I [50 () -1--1 <14:15> "" "." { : } {}] < blk: 0 >
I [48 () -1--1 <10:14> "" "bank" { : } {}] < blk: 2 >
I [51 () -1--1 <10:15> "" "bank." { : } {}] < blk: 0 >
A redacted AVM for the surviving lexical item follows. As far as I can tell, it matches the lexical filtering rule above and thus should not remain in the chart.
L [103 3-4 Bank_NNP (1) 0 {} { : w_period_plr} {}] < blk: 0 dtrs: 63 parents: 110 >
n_-_pn_le
[ ...
SYNSEM ...
PHON phon
[ ONSET con
[ --TL #16:native_token_cons
[ FIRST token
[ +CLASS #17:alphabetic
[ +CASE non_capitalized+lower,
+INITIAL - ],
+FROM #3,
+FORM #18:"bank.",
+TO "15",
+CARG "bank",
...
REST native_token_null ] ] ],
LKEYS lexkeys_norm
[ KEYREL named_nom_relation
[ CFROM #3,
CTO #29:"15",
PRED named_rel,
LBL #15,
LNK *list*,
ARG0 #14,
CARG "Bank" ],
...
ORTH orthography
[ FIRST "Bank",
REST *null*,
FROM #3,
CLASS #17,
...
TOKENS tokens
[ +LIST #16,
+LAST token
[ +CLASS #17,
+FROM "10",
+FORM "bank.",
+TO #29,
+CARG "bank",
...