<p>Can we stop looking ahead during ATN simulation or is there some
uncertainty as to which alternative we will ultimately pick, after
consuming more input? Even if there are partial conflicts, we might know
that everything is going to resolve to the same minimum alternative. That
means we can stop since no more lookahead will change that fact. On the
other hand, there might be multiple conflicts that resolve to different
minimums. That means we need more look ahead to decide which of those
alternatives we should predict.</p>
<p>The basic idea is to split the set of configurations {@code C}, into
conflicting subsets {@code (s, _, ctx, _)} and singleton subsets with
non-conflicting configurations. Two configurations conflict if they have
identical {@link ATNConfig#state} and {@link ATNConfig#context} values
but different {@link ATNConfig#alt} value, e.g. {@code (s, i, ctx, _)}
and {@code (s, j, ctx, _)} for {@code i!=j}.</p>
<p>Reduce these configuration subsets to the set of possible alternatives.
You can compute the alternative subsets in one pass as follows:</p>
<p>{@code A_s,ctx = {i | (s, i, ctx, _)}} for each configuration in
{@code C} holding {@code s} and {@code ctx} fixed.</p>
<p>Or in pseudo-code, for each configuration {@code c} in {@code C}:</p>
<pre>
mapc U= c.{@link ATNConfig#alt alt} # map hash/equals uses s and x, not
alt and not pred
</pre>
<p>The values in {@code map} are the set of {@code A_s,ctx} sets.</p>
<p>If {@code |A_s,ctx|=1} then there is no conflict associated with
{@code s} and {@code ctx}.</p>
<p>Reduce the subsets to singletons by choosing a minimum of each subset. If
the union of these alternative subsets is a singleton, then no amount of
more lookahead will help us. We will always pick that alternative. If,
however, there is more than one alternative, then we are uncertain which
alternative to predict and must continue looking for resolution. We may
or may not discover an ambiguity in the future, even if there are no
conflicting subsets this round.</p>
<p>The biggest sin is to terminate early because it means we've made a
decision but were uncertain as to the eventual outcome. We haven't used
enough lookahead. On the other hand, announcing a conflict too late is no
big deal; you will still have the conflict. It's just inefficient. It
might even look until the end of file.</p>
<p>No special consideration for semantic predicates is required because
predicates are evaluated on-the-fly for full LL prediction, ensuring that
no configuration contains a semantic context during the termination
check.</p>
<p><strong>CONFLICTING CONFIGS</strong></p>
<p>Two configurations {@code (s, i, x)} and {@code (s, j, x')}, conflict
when {@code i!=j} but {@code x=x'}. Because we merge all
{@code (s, i, _)} configurations together, that means that there are at
most {@code n} configurations associated with state {@code s} for
{@code n} possible alternatives in the decision. The merged stacks
complicate the comparison of configuration contexts {@code x} and
{@code x'}. Sam checks to see if one is a subset of the other by calling
merge and checking to see if the merged result is either {@code x} or
{@code x'}. If the {@code x} associated with lowest alternative {@code i}
is the superset, then {@code i} is the only possible prediction since the
others resolve to {@code min(i)} as well. However, if {@code x} is
associated with {@code j>i} then at least one stack configuration for
{@code j} is not in conflict with alternative {@code i}. The algorithm
should keep going, looking for more lookahead due to the uncertainty.</p>
<p>For simplicity, I'm doing a equality check between {@code x} and
{@code x'} that lets the algorithm continue to consume lookahead longer
than necessary. The reason I like the equality is of course the
simplicity but also because that is the test you need to detect the
alternatives that are actually in conflict.</p>
<p><strong>CONTINUE/STOP RULE</strong></p>
<p>Continue if union of resolved alternative sets from non-conflicting and
conflicting alternative subsets has more than one alternative. We are
uncertain about which alternative to predict.</p>
<p>The complete set of alternatives, {@code [i for (_,i,_)]}, tells us which
alternatives are still in the running for the amount of input we've
consumed at this point. The conflicting sets let us to strip away
configurations that won't lead to more states because we resolve
conflicts to the configuration with a minimum alternate for the
conflicting set.</p>
<p><strong>CASES</strong></p>
<ul>
<li>no conflicts and more than 1 alternative in set => continue</li>
<p>If all states report the same conflicting set of alternatives, then we
know we have the exact ambiguity set.</p>
<p><code>|A_<em>i</em>|>1</code> and
<code>A_<em>i</em> = A_<em>j</em></code> for all <em>i</em>, <em>j</em>.</p>
<p>In other words, we continue examining lookahead until all {@code A_i}
have more than one alternative and all {@code A_i} are the same. If
{@code A={{1,2}, {1,3}}}, then regular LL prediction would terminate
because the resolved set is {@code {1}}. To determine what the real
ambiguity is, we have to know whether the ambiguity is between one and
two or one and three so we keep going. We can only stop prediction when
we need exact ambiguity detection when the sets look like
{@code A={{1,2}}} or {@code {{1,2},{1,2}}}, etc...</p>
Full LL prediction termination.
<p>Can we stop looking ahead during ATN simulation or is there some uncertainty as to which alternative we will ultimately pick, after consuming more input? Even if there are partial conflicts, we might know that everything is going to resolve to the same minimum alternative. That means we can stop since no more lookahead will change that fact. On the other hand, there might be multiple conflicts that resolve to different minimums. That means we need more look ahead to decide which of those alternatives we should predict.</p>
<p>The basic idea is to split the set of configurations {@code C}, into conflicting subsets {@code (s, _, ctx, _)} and singleton subsets with non-conflicting configurations. Two configurations conflict if they have identical {@link ATNConfig#state} and {@link ATNConfig#context} values but different {@link ATNConfig#alt} value, e.g. {@code (s, i, ctx, _)} and {@code (s, j, ctx, _)} for {@code i!=j}.</p>
<p>Reduce these configuration subsets to the set of possible alternatives. You can compute the alternative subsets in one pass as follows:</p>
<p>{@code A_s,ctx = {i | (s, i, ctx, _)}} for each configuration in {@code C} holding {@code s} and {@code ctx} fixed.</p>
<p>Or in pseudo-code, for each configuration {@code c} in {@code C}:</p>
<pre> mapc U= c.{@link ATNConfig#alt alt} # map hash/equals uses s and x, not alt and not pred </pre>
<p>The values in {@code map} are the set of {@code A_s,ctx} sets.</p>
<p>If {@code |A_s,ctx|=1} then there is no conflict associated with {@code s} and {@code ctx}.</p>
<p>Reduce the subsets to singletons by choosing a minimum of each subset. If the union of these alternative subsets is a singleton, then no amount of more lookahead will help us. We will always pick that alternative. If, however, there is more than one alternative, then we are uncertain which alternative to predict and must continue looking for resolution. We may or may not discover an ambiguity in the future, even if there are no conflicting subsets this round.</p>
<p>The biggest sin is to terminate early because it means we've made a decision but were uncertain as to the eventual outcome. We haven't used enough lookahead. On the other hand, announcing a conflict too late is no big deal; you will still have the conflict. It's just inefficient. It might even look until the end of file.</p>
<p>No special consideration for semantic predicates is required because predicates are evaluated on-the-fly for full LL prediction, ensuring that no configuration contains a semantic context during the termination check.</p>
<p><strong>CONFLICTING CONFIGS</strong></p>
<p>Two configurations {@code (s, i, x)} and {@code (s, j, x')}, conflict when {@code i!=j} but {@code x=x'}. Because we merge all {@code (s, i, _)} configurations together, that means that there are at most {@code n} configurations associated with state {@code s} for {@code n} possible alternatives in the decision. The merged stacks complicate the comparison of configuration contexts {@code x} and {@code x'}. Sam checks to see if one is a subset of the other by calling merge and checking to see if the merged result is either {@code x} or {@code x'}. If the {@code x} associated with lowest alternative {@code i} is the superset, then {@code i} is the only possible prediction since the others resolve to {@code min(i)} as well. However, if {@code x} is associated with {@code j>i} then at least one stack configuration for {@code j} is not in conflict with alternative {@code i}. The algorithm should keep going, looking for more lookahead due to the uncertainty.</p>
<p>For simplicity, I'm doing a equality check between {@code x} and {@code x'} that lets the algorithm continue to consume lookahead longer than necessary. The reason I like the equality is of course the simplicity but also because that is the test you need to detect the alternatives that are actually in conflict.</p>
<p><strong>CONTINUE/STOP RULE</strong></p>
<p>Continue if union of resolved alternative sets from non-conflicting and conflicting alternative subsets has more than one alternative. We are uncertain about which alternative to predict.</p>
<p>The complete set of alternatives, {@code [i for (_,i,_)]}, tells us which alternatives are still in the running for the amount of input we've consumed at this point. The conflicting sets let us to strip away configurations that won't lead to more states because we resolve conflicts to the configuration with a minimum alternate for the conflicting set.</p>
<p><strong>CASES</strong></p>
<ul>
<li>no conflicts and more than 1 alternative in set => continue</li>
<li> {@code (s, 1, x)}, {@code (s, 2, x)}, {@code (s, 3, z)}, {@code (s', 1, y)}, {@code (s', 2, y)} yields non-conflicting set {@code {3}} U conflicting sets {@code min({1,2})} U {@code min({1,2})} = {@code {1,3}} => continue </li>
<li>{@code (s, 1, x)}, {@code (s, 2, x)}, {@code (s', 1, y)}, {@code (s', 2, y)}, {@code (s'', 1, z)} yields non-conflicting set {@code {1}} U conflicting sets {@code min({1,2})} U {@code min({1,2})} = {@code {1}} => stop and predict 1</li>
<li>{@code (s, 1, x)}, {@code (s, 2, x)}, {@code (s', 1, y)}, {@code (s', 2, y)} yields conflicting, reduced sets {@code {1}} U {@code {1}} = {@code {1}} => stop and predict 1, can announce ambiguity {@code {1,2}}</li>
<li>{@code (s, 1, x)}, {@code (s, 2, x)}, {@code (s', 2, y)}, {@code (s', 3, y)} yields conflicting, reduced sets {@code {1}} U {@code {2}} = {@code {1,2}} => continue</li>
<li>{@code (s, 1, x)}, {@code (s, 2, x)}, {@code (s', 3, y)}, {@code (s', 4, y)} yields conflicting, reduced sets {@code {1}} U {@code {3}} = {@code {1,3}} => continue</li>
</ul>
<p><strong>EXACT AMBIGUITY DETECTION</strong></p>
<p>If all states report the same conflicting set of alternatives, then we know we have the exact ambiguity set.</p>
<p><code>|A_<em>i</em>|>1</code> and <code>A_<em>i</em> = A_<em>j</em></code> for all <em>i</em>, <em>j</em>.</p>
<p>In other words, we continue examining lookahead until all {@code A_i} have more than one alternative and all {@code A_i} are the same. If {@code A={{1,2}, {1,3}}}, then regular LL prediction would terminate because the resolved set is {@code {1}}. To determine what the real ambiguity is, we have to know whether the ambiguity is between one and two or one and three so we keep going. We can only stop prediction when we need exact ambiguity detection when the sets look like {@code A={{1,2}}} or {@code {{1,2},{1,2}}}, etc...</p>