Archive for July 10th, 2007

The Road to 0.1

July 10th, 2007 Josh 2 comments

I want to make a release of my parsing engine within a month. The target feature set is:

  • a better name: “my parsing engine” is a boring mouthful. Tentative idea: “Greyhound,” in the spirit of Yacc, Bison, and ANTLR, but really fast and lightweight. Downside: the association with buses.
  • the ability to parse grammar files written in the yacc-like grammar language I designed
  • the ability to parse input text using said grammar, for LL(1) grammars
  • the ability to call whatever callbacks were registered for a run of the parser
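To make the last two bullets concrete, here is a minimal sketch of table-driven LL(1) parsing with registered callbacks. It is only an illustration, not the engine's actual design: the toy grammar (nested lists of integers), the hand-written `TABLE`, and the names `tokenize`, `parse`, and `callbacks` are all assumptions made up for this example, and a real engine would build the table from a grammar file rather than by hand.

```python
import re

# Hypothetical toy grammar (not the grammar language from this post):
#   value -> NUM | "[" elems "]"
#   elems -> value rest | (empty)
#   rest  -> "," value rest | (empty)
# LL(1) table: (nonterminal, lookahead token kind) -> right-hand side.
TABLE = {
    ("value", "NUM"): ["NUM"],
    ("value", "["): ["[", "elems", "]"],
    ("elems", "NUM"): ["value", "rest"],
    ("elems", "["): ["value", "rest"],
    ("elems", "]"): [],
    ("rest", ","): [",", "value", "rest"],
    ("rest", "]"): [],
}
NONTERMINALS = {"value", "elems", "rest"}
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([\[\],]))")


def tokenize(text):
    """Split text into (kind, lexeme) pairs, ending with an end marker."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group(1) is not None:
            tokens.append(("NUM", m.group(1)))
        else:
            tokens.append((m.group(2), m.group(2)))
        pos = m.end()
    tokens.append(("$", ""))
    return tokens


def parse(text, callbacks):
    """Predictive parse; call callbacks[kind](lexeme) for each matched token."""
    tokens = tokenize(text)
    stack = ["$", "value"]  # end marker at the bottom, start symbol on top
    i = 0
    while stack:
        top = stack.pop()
        kind, lexeme = tokens[i]
        if top in NONTERMINALS:
            # One token of lookahead picks the rule: that is the LL(1) part.
            rule = TABLE.get((top, kind))
            if rule is None:
                raise SyntaxError(f"unexpected {kind!r} while parsing {top}")
            stack.extend(reversed(rule))
        elif top == kind:
            handler = callbacks.get(kind)
            if handler is not None:
                handler(lexeme)
            i += 1
        else:
            raise SyntaxError(f"expected {top!r}, got {kind!r}")


# Usage: collect every number seen during one run of the parser.
nums = []
parse("[1, [2, 3]]", {"NUM": lambda s: nums.append(int(s))})
print(sum(nums))  # prints 6
```

The sketch stops at the first syntax error, which lines up with error recovery being out of scope for a first release.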

My target application for the first release is radically faster versions of recs-join and recs-collate. Only my Amazon friends will know these by name, but they are utilities that operate on streams of JSON files. They are extremely useful, but the current Perl-based implementations are extremely slow (about 100 times slower than wc).

Explicitly out-of-scope for the first release:

  • grammars more complicated than LL(1)
  • error recovery
  • character set support
  • the JIT (you might think I’m crazy for attempting this at all, but I plan to use the extremely awesome DynASM framework for JITting).

One month. Let’s see if I can make it.

Categories: Gazelle