#!/opt/bin/perl
use 5.011;
use strict;
use warnings;
use English qw(-no_match_vars);
use Parse::Marpa;
my $grammar_description='
start symbol is englishsentence.
semantics are perl5. version is 1.004000.
default lex prefix is /\s+|\A/.
concatenate lines is q{ (scalar @_) ? (join "-", (grep { $_ } @_)) : undef; }.
default action is concatenate lines.
englishsentence: subject, verb, conjunction, object.
englishsentence: subject, verb, object.
specializernoun: noun. q{ "specializernoun($_[0])" }.
ordinarynoun: noun. q{ "ordinarynoun($_[0])" }.
subject: specializernoun, ordinarynoun. q{ "spsub($_[0]+$_[1])" }.
subject: noun. q{ "subject($_[0])" }.
noun: nounlex.
nounlex matches /fruit|banana|time|arrow|flies/.
verb: verblex. q{ "verb($_[0])" }.
verblex matches /like|flies/.
object: preposition, noun. q{ "ob(prep($_[0])+n($_[1]))" }.
conjunction: /like/. q{ "conjunction($_[0])" }.
preposition: prepositionlex.
prepositionlex matches /a\b|an/.
';
my $data1='time like an arrow.';
my @values1=Parse::Marpa::mdl(\$grammar_description,\$data1);
for my $i(@values1) { say $$i; }
my $data2='time flies like an arrow.';
my @values2=Parse::Marpa::mdl(\$grammar_description,\$data2);
for my $i(@values2) { say $$i; }
Output:
subject(time)-verb(like)-ob(prep(an)+n(arrow))
subject(time)-verb(flies)-conjunction(like)-ob(prep(an)+n(arrow))
subject(time)-verb(flies)-conjunction(like)-ob(prep(an)+n(arrow))
subject(time)-verb(flies)-conjunction(like)-ob(prep(an)+n(arrow))
subject(time)-verb(flies)-conjunction(like)-ob(prep(an)+n(arrow))
spsub(specializernoun(time)+ordinarynoun(flies))-verb(like)-ob(prep(an)+n(arrow))
spsub(specializernoun(time)+ordinarynoun(flies))-verb(like)-ob(prep(an)+n(arrow))
spsub(specializernoun(time)+ordinarynoun(flies))-verb(like)-ob(prep(an)+n(arrow))
spsub(specializernoun(time)+ordinarynoun(flies))-verb(like)-ob(prep(an)+n(arrow))
PS: If I replace the conjunction definition (conjunction: qr{like}. q{ "conjunction($_[0])" }.) with a terminal (conjunction matches /like/.), the quadruplication stops and I get exactly one result for each of the two interpretations (but I lose the action on the conjunction).
Using
conjunction: conjunctionlex. q{ "conjunction($_[0])" }.
conjunctionlex matches /like/.
I get the unwanted quadruplication back.
1 comment:
Thanks for your interest in Parse::Marpa. The quadruplication of parses is a (mis)feature of the Aycock-Horspool algorithm. The algorithm uses a finite automaton to group rules into what I'll call Aycock-Horspool states (AH-states), similar to LR states. Parsing is by AH-state. A state can represent several grammar rules, and the same grammar rule can occur in several states.
In my evaluator, I make sure that every rule in every state counts as a unique parse. If you regard tree traversals involving different AH-states as different "parses", the results you are getting are correct. Two rules each occur in two different AH-states, so each parse that is unique from the grammar-rule perspective appears 2 × 2 = 4 times in the AH-state-based counting of parses.
However, from the grammar writer's point of view, the AH-states are arbitrary. The grammar-rule perspective is the one the grammar writer expects and can use.
I am working on a new version of this parser, simply called Marpa. Marpa will replace Parse::Marpa. Marpa has a completely new evaluator, and will enumerate parses in the more natural way.
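Until then, one possible workaround is to collapse the duplicates after parsing. The following is only a sketch, not part of Parse::Marpa: `unique_parses` is a hypothetical helper, and it assumes the results are references to defined strings, which is how the script above dereferences them with `$$i`. The stand-in data copies the duplicated output from the post, so the sketch runs without the parser installed.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Marpa): keep only the first
# occurrence of each distinct parse value, preserving the original order.
# Assumes each argument is a reference to a defined string.
sub unique_parses {
    my @value_refs = @_;
    my %seen;
    return grep { !$seen{ ${$_} }++ } @value_refs;
}

# Stand-in data copying the duplicated results shown in the post.
my $conjunction_parse =
    'subject(time)-verb(flies)-conjunction(like)-ob(prep(an)+n(arrow))';
my $spsub_parse =
    'spsub(specializernoun(time)+ordinarynoun(flies))-verb(like)-ob(prep(an)+n(arrow))';
my @strings = ( ($conjunction_parse) x 4, ($spsub_parse) x 4 );
my @values  = map { \$_ } @strings;

# Print each distinct parse value once.
say ${$_} for unique_parses(@values);
```

With the eight stand-in results this prints two lines, one per interpretation. Note the limitation: comparing by value treats two structurally different parses as the same whenever their actions happen to produce identical strings, so this only helps when the value identifies the parse.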