Gazelle Manual
Posted by josh at February 25th, 2008
Check out the Gazelle manual that I just put online. I’ve put a ton of work into it, and it’s surprisingly substantial for a project at this stage. I invested the work now because I want something I can point people to that demonstrates my plan for Gazelle, even if the implementation isn’t there yet. For people who are veterans in the parsing field, I want to be able to point them here when they ask the question “so what kind of algorithm are you using?” For people who are skeptical that I can really improve on existing tools, I want to point them here to demonstrate my concrete plans for doing so.
Think of it as “the anti-fluff piece.” And as a bonus, when Gazelle is actually ready for general consumption, it will have a great manual ready-to-go.
Hey Josh -
I’ll have you know that if I get a bad grade on my upcoming AI midterm, I’m blaming you because I wasted a chunk of class reading about parser algorithms and not learning highly important logical inference type stuff. Parser stuff is just so fun (as I’m sure you well know)!
Two quick thoughts:
1. I was disappointed you didn’t include any of the stuff from your “Why Gazelle Matters” post in there. I think that really helps motivate *why* a person ought to be concerned with, say, stacks of DFAs and how they work.
2. If I had a vote for the next section to get elaborated on, it would definitely be the “Introductory Tour” example. Exact specifics on how to run and use the system are what I’m looking for when I check out a new project for the first time.
It was very neat to hear about how all that parsing magic is going on!
Buffalo
I’m very glad that you found the manual engaging enough reading that you made it all the way through! Though I suppose it could just be that your AI class is particularly boring.
And I’m also glad that it made enough sense that you feel like you learned something.
I will definitely have the “Why Gazelle Matters” kind of advocacy material *somewhere*, but I’m not sure if it will go in the manual yet. But maybe the manual is the perfect place for it. I hadn’t thought too much about it yet, so thanks for the suggestion.
And yes, I agree that the introductory tour is totally key. One reason it’s not written yet is because the implementation is in so much flux that I can’t really give instructions there that will still make sense in a month or two.
Thanks for reading, and for your comments!
josh
Josh:
Thank you so much. I was having a hard time working with Gazelle by trial and error (at least I installed it and read the readme). Between work, school and life there is so little time. Some of the upcoming weekend shall be devoted to reading the manual.
Thanks again.
we just have little to say… but thanks
by the way there are more than 4 readers of your little blog
a'abel
I heartily second the need for more introduction/tutorial on what we can do with Gazelle in its current state.
Catherine
Josh-
I haven’t made it through the entire manual yet, but I’m a bit wary of this syntax:
object -> "{" (string ":" value) *(",") "}";
How do you know which set of parentheses the ‘*’ goes with? What if I want to say zero or more instances of string “:” value?
It is also interesting to note that section 2.3.2.4 of “Parsing Techniques - A Practical Guide” says:
“… These facilities are very useful and allow the Book grammar to be written more efficiently (Figure 2.11). Some styles even allow constructions like Something+4, meaning “one or more Somethings with a maximum of 4”, or Something+, meaning “one or more Somethings separated by commas”; this seems to be a case of overdoing a good thing.”
I would have to agree with the book and say that Something+, is “overdoing a good thing”… it also seems like Something+, could be confused with meaning “Something one or more times followed by a comma”.
Just some food for thought :-).
Cheers,
-Brian
Brian Maher
@Catherine: wait for Gazelle 0.2 (which hopefully will come before long) — it will actually have enough functionality and support that someone other than me can do something useful with it, *and* as a bonus the manual will tell you how.
@Brian: I’m glad you decided to go for “Parsing Techniques: A Practical Guide”! If the repetition-with-separator operator was nothing but syntactic sugar, I might agree with Grune and Jacobs here, but since the grammar is also the blueprint for how the parse tree is pulled apart at parse time, I strongly believe that the repetition-with-separator is an important and significant tool.
Just for discussion, here’s what the same grammar fragment looks like without the repetition-with-separator operator:
object -> "{" string ":" value ("," string ":" value)* "}";
Imagine registering callbacks: if you wanted a callback to get called for every key/value pair, you’d have to register the callback twice, once for the first occurrence and once for the repeated ones. Imagine that you were using a Gazelle visualization tool to display a parse tree; the tool will know to group the second occurrence onward in a list, but the first will show up totally separately.
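To make the callback point concrete, here is a rough sketch in plain Python. It is not Gazelle's API: `CallbackTable`, `register`, `fire`, the two `walk_*` functions, and the position labels are all hypothetical names invented for this illustration. It only models the fact that the expanded grammar puts key/value pairs at two different grammar positions, while the separator form puts them all at one.

```python
# Hypothetical illustration only: none of these names come from Gazelle.
from collections import defaultdict


class CallbackTable:
    """Maps a grammar position (a plain label here) to registered callbacks."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, position, fn):
        self._callbacks[position].append(fn)

    def fire(self, position, *args):
        for fn in self._callbacks[position]:
            fn(*args)


def walk_expanded(pairs, table):
    """Shape: "{" string ":" value ("," string ":" value)* "}".
    The first pair and the repeated pairs are different grammar positions."""
    for i, (key, value) in enumerate(pairs):
        position = "first_pair" if i == 0 else "repeated_pair"
        table.fire(position, key, value)


def walk_separated(pairs, table):
    """Shape: "{" (string ":" value) *(",") "}".
    Every pair is the same grammar position."""
    for key, value in pairs:
        table.fire("pair", key, value)


pairs = [("a", 1), ("b", 2), ("c", 3)]

# Expanded grammar: two registrations are needed to see every pair.
seen_expanded = []
expanded = CallbackTable()
expanded.register("first_pair", lambda k, v: seen_expanded.append((k, v)))
expanded.register("repeated_pair", lambda k, v: seen_expanded.append((k, v)))
walk_expanded(pairs, expanded)

# Separator grammar: one registration covers the whole list.
seen_separated = []
separated = CallbackTable()
separated.register("pair", lambda k, v: seen_separated.append((k, v)))
walk_separated(pairs, separated)
```

Both `seen_expanded` and `seen_separated` end up holding all three pairs, but a client of the expanded form that registers only on `"first_pair"` would silently miss every pair after the first.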
A guiding principle in Gazelle’s design is that the tool should let you model the actual problem as closely as possible. If the whole toolchain understands that this is a list of things separated by commas, there are so many places where this knowledge can be put to good use. Every time I find myself writing something in a Gazelle grammar that is a clear concept when you explain it to a person, but can only be awkwardly approximated when you explain it to Gazelle, I’ve asked myself “is there a better way to say this?”
Another example is operator parsing: most language descriptions will say “here are the operators and here is their precedence.” Why then, should I have to say something awkward and clumsy like:
expr -> factor ("+" factor)*;
factor -> term ("*" term)*;
term -> NUMBER | "(" expr ")";
Why should it be so roundabout to tell Gazelle that there are two operators, both of which are left-associative, with multiplication having higher precedence than addition? It shouldn’t, which is why I’m working on a syntax for operator precedence. Check out my current idea for this in my sketch for Lua grammar:
http://repo.or.cz/w/gazelle.git?a=blob;f=sketches/lua.gzl;h=5cd5e8e5e375bfcbf4e518f1eec09eead691a601;hb=e66d67c40b3032279f1630f6d8f5455dfdb611ec
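For readers who want to see what "just list the operators and their precedence" can look like in practice, here is a minimal Python sketch of precedence climbing driven by an operator table. This is a swapped-in textbook technique, not Gazelle's syntax or its parsing algorithm; `OPERATORS`, `tokenize`, `parse_expr`, `parse_atom`, and `parse` are names made up for the example, and the tokenizer simply skips anything that is not a number, `+`, `*`, or a parenthesis.

```python
import re

# Assumed operator table: symbol -> (precedence, associativity).
# A higher precedence number binds tighter.
OPERATORS = {
    "+": (1, "left"),
    "*": (2, "left"),
}


def tokenize(text):
    # Numbers, the two operators, and parentheses; everything else is ignored.
    return re.findall(r"\d+|[+*()]", text)


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, pos=0, min_prec=1):
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        op = tokens[pos]
        prec, assoc = OPERATORS[op]
        if prec < min_prec:
            break
        # For a left-associative operator the right operand may only
        # contain operators that bind strictly tighter.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse_expr(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this table, `parse("1+2*3")` returns `("+", 1, ("*", 2, 3))` and `parse("1+2+3")` returns `("+", ("+", 1, 2), 3)`. Adding an operator means adding one table entry rather than inventing another layer of nonterminals.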
So anyway, that’s my long answer to a short question. With respect to your feeling uncomfortable about the syntax, you have a point inasmuch as:
foo -> bar *(",");
…is currently parsed differently than:
foo -> bar * (",");
The first means “0 or more bars separated by commas.” The second means “0 or more bars followed by a single comma.” That’s probably a problem. I should rethink this syntax. Thanks for the comment!
josh