A partial retraction to my last entry
Posted by josh at July 29th, 2008
From a theoretical standpoint, my last entry was incorrect when I said that my new algorithm in Gazelle can decide whether a grammar is strong-LL(k) or strong-LL(*). I had proofs against me — these problems have actually been proven undecidable.
Once I heard this result, I spent hours trying to think of a grammar that is strong-LL(k), but that Gazelle will say is not. I am fairly certain that Gazelle’s algorithm will always terminate, so if the proof is to be believed, there must be a grammar that Gazelle falsely claims is not strong-LL(k) or strong-LL(*).
After many hours, I came up with one. Here it is:
s -> "X"* "Y" "Y" "Z"| "X" c;
c -> "Y" c "Y" | "Q";
This grammar is strong-LL(5), but Gazelle falsely claims it is not strong-LL(k) for any k. ANTLR can handle it (as long as the recursion depth is set high enough).
I’m thinking, though, that this isn’t really a problem. My intuition says that it will be rare that anyone actually tries to write a grammar that fails in this way — after all, it took an evening of specifically trying to exercise this case before I could find a grammar that did. I don’t think real grammars will run into this problem.
So even though I can’t handle all of strong-LL(*), I think I can handle the grammars that will actually come up in practical use, and I still have the extremely desirable property of being able to decide whether a grammar fits or not without forcing the user to tweak recursion depth constants. So although my result isn’t as theoretically significant as I thought, I think that from a practical standpoint I’m still in good shape: I have a terminating algorithm that can handle a very useful set of grammars.
“So although my result isn’t as theoretically significant as I thought, I think that from a practical standpoint I’m still in good shape: I have a terminating algorithm that can handle a very useful set of grammars.”
If you’re after theoretically significant, all you need to do is come up with a name (”LL(haberman)”) and ideally some proofs or properties for the set of grammars your new algorithm does recognize.
Matt Brubeck