The Road to 0.1

I want to make a release of my parsing engine within a month. The target feature set is:

  • a better name. “My parsing engine” is a boring mouthful. Tentative idea: “Greyhound,” in the spirit of Yacc, Bison, and ANTLR, but really fast and lightweight. Downside: the association with buses.
  • the ability to parse grammar files written in the yacc-like grammar language I designed
  • the ability to parse input text using said grammar, for LL(1) grammars
  • the ability to call whatever callbacks were registered for a run of the parser
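
To make the last two bullets concrete, here is a minimal sketch of table-driven LL(1) parsing with callbacks, in Python. It is only an illustration under assumptions: the toy grammar, the hand-written parse table, and the names `parse`, `TABLE`, and `callbacks` are all hypothetical, not the engine's actual grammar language or API.

```python
# Hypothetical toy grammar (NOT the engine's grammar language):
#   list  -> "[" items "]"
#   items -> item items | (empty)
#   item  -> "n" | list
EPSILON = ()
END = "$"  # end-of-input marker; assumed not to appear in the input
NONTERMINALS = {"list", "items", "item"}

# LL(1) parse table: (nonterminal, lookahead token) -> right-hand side.
# One token of lookahead is always enough to pick the production.
TABLE = {
    ("list", "["): ("[", "items", "]"),
    ("items", "n"): ("item", "items"),
    ("items", "["): ("item", "items"),
    ("items", "]"): EPSILON,
    ("item", "n"): ("n",),
    ("item", "["): ("list",),
}

def parse(tokens, callbacks=None, start="list"):
    """Parse tokens, calling callbacks[rule](pos) whenever a rule is predicted."""
    callbacks = callbacks or {}
    tokens = list(tokens) + [END]
    pos = 0
    stack = [END, start]  # top of the stack is the end of the list
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look!r} at {pos} in {top}")
            if top in callbacks:
                callbacks[top](pos)
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1  # terminal (or END) matched; consume it
        else:
            raise SyntaxError(f"expected {top!r}, got {look!r} at {pos}")

# Usage: register callbacks for the rules of interest and run the parser.
events = []
parse("[n[n]]", {
    "list": lambda pos: events.append(("list", pos)),
    "item": lambda pos: events.append(("item", pos)),
})
```

The point of the sketch is that the parser never backtracks: every decision is a single table lookup on the next token, which is what makes LL(1) a reasonable first target.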

My target application for the first release is radically faster versions of recs-join and recs-collate. Only my Amazon friends will know these by name, but they are utilities that operate on streams of JSON files. They are extremely useful, but the current Perl-based implementations are extremely slow (about 100 times slower than wc).

Explicitly out-of-scope for the first release:

  • grammars more complicated than LL(1)
  • error recovery
  • character set support
  • the JIT (you might think I’m crazy for attempting this at all, but I plan to use the extremely awesome DynASM framework for JITting).

One month. Let’s see if I can make it.

This entry was posted in Gazelle.

2 Responses to The Road to 0.1

  1. Phil Edry says:

    This website names 15 animals faster than a greyhound.
    http://www.infoplease.com/ipa/A0004737.html

    Turns out a Mongolian wild ass is .65 mph faster than a greyhound, so you could call your parser Mongolian Wild Ass. Or perhaps Mongolian Wild Arse to be more polite. Wait! Mongolian Wild Parser!

  2. Buffalo says:

    Hmmm… I am not down with greyhound, and I am trying to think of a good alternative for you, Josh. But as usual, naming is a dicey business. Here’s my list of characteristics of an ideal name:

    1. Unique and easy to search for
    2. Easy to Spell
    3. Cool sounding
    4. Tells you at least a little bit about what it’s talking about

    I’d say greyhound is only 1/4 (and even then I’m tempted to write grayhound, though at least spellcheck catches it). After hearing about WireShark I was thinking a two-word name might suit you better. You could do animal-adjective combos, though, and then you might be able to choose something a little more bovine, which seems to be common in the parser space.

    NimbleCow
    LightningOx

    In particular I like “Bull” because of its simultaneous crap/cow/talk connotations.

    BullTongue
    LexiBull
    LinguaBull
    BullBrogue (although the desire to be alliterative is almost overwhelming here I feel that it just invites Ubuntu confusion)

    And then there’s the possibility of just dropping this cow stuff altogether and trying to give some parser type reference that might be a little more obvious:

    Hypersonic Glyph
    LightLingua
    NeatoParse
    LexParseAndMore
    ParseSynthesis
    ParseParty

    Any thoughts?
