100 lines of C that can parse any Protocol Buffer
Posted by josh at July 12th, 2008
There’s a lot of misinformation flying around the blogosphere about Google’s Protocol Buffers. One common claim is that you can’t parse a protobuf without having the .proto file. This is false, as demonstrated by this 100-line C program that does just that. It can parse an arbitrary protobuf into its field numbers and wire types. This is roughly equivalent to what you get from a generic XML parser, except that with XML you get names for the keys (elements) instead of numbers, and strings instead of the four or so wire types that are defined by Protocol Buffers.
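To give a feel for what a generic parse like this involves, here is a minimal sketch. It is not the program linked above: the names read_varint and dump_fields are made up for illustration, and groups (wire types 3 and 4) are rejected rather than handled, to keep it short. Each field starts with a varint key whose low three bits are the wire type and whose remaining bits are the field number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one base-128 varint starting at buf[*pos].
 * Returns 0 and advances *pos on success, -1 on truncated or overlong input. */
static int read_varint(const uint8_t *buf, size_t len, size_t *pos, uint64_t *out)
{
    assert(buf != NULL && pos != NULL && out != NULL);
    uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (*pos >= len)
            return -1;                        /* ran off the end of the buffer */
        uint8_t byte = buf[(*pos)++];
        value |= (uint64_t)(byte & 0x7f) << shift;
        if (!(byte & 0x80)) {                 /* high bit clear: last byte */
            *out = value;
            return 0;
        }
    }
    return -1;                                /* more than 10 bytes */
}

/* Print the field number and wire type of every top-level field.
 * Returns the number of fields seen, or -1 if the input is malformed
 * (or uses groups, which this sketch does not handle). */
static int dump_fields(const uint8_t *buf, size_t len, FILE *out)
{
    size_t pos = 0;
    int count = 0;
    while (pos < len) {
        uint64_t key, n;
        if (read_varint(buf, len, &pos, &key) != 0)
            return -1;
        unsigned wire_type = (unsigned)(key & 7);
        unsigned long long field = (unsigned long long)(key >> 3);
        switch (wire_type) {
        case 0:                               /* varint */
            if (read_varint(buf, len, &pos, &n) != 0)
                return -1;
            fprintf(out, "field %llu: varint %llu\n", field, (unsigned long long)n);
            break;
        case 1:                               /* fixed 64-bit */
            if (len - pos < 8)
                return -1;
            pos += 8;
            fprintf(out, "field %llu: 64-bit\n", field);
            break;
        case 2:                               /* length-delimited */
            if (read_varint(buf, len, &pos, &n) != 0 || n > len - pos)
                return -1;
            pos += (size_t)n;
            fprintf(out, "field %llu: %llu bytes\n", field, (unsigned long long)n);
            break;
        case 5:                               /* fixed 32-bit */
            if (len - pos < 4)
                return -1;
            pos += 4;
            fprintf(out, "field %llu: 32-bit\n", field);
            break;
        default:                              /* groups (3, 4) or unknown */
            return -1;
        }
        count++;
    }
    return count;
}
```

Calling dump_fields on the twelve bytes 08 96 01 12 07 followed by the ASCII string "testing" reports field 1 as the varint 150 and field 2 as 7 length-delimited bytes. Without a .proto file, the sketch cannot say whether those 7 bytes are a string, raw bytes, or a nested message.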
In both the XML and the Protocol Buffer case, you want to have more information if you’re going to actually write programs that consume application-domain data. You want documentation that specifies exactly what all the fields mean, and enough information to turn the on-the-wire values into actual numbers where appropriate. It just so happens that Protocol Buffers specify this information in a structured format called a .proto file.
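As a small illustration of why the extra information matters, the sketch below takes raw wire values and turns them into numbers in two different ways. The helper names zigzag_decode and bits_to_float are hypothetical, and the code assumes a platform where float is a 32-bit IEEE 754 value. Which interpretation is correct for a given field is exactly what the .proto file tells you.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float has the same size as a 32-bit wire value. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* ZigZag decoding, used for the sint32/sint64 field types:
 * raw 0, 1, 2, 3, ... maps to 0, -1, 1, -2, ... */
static int64_t zigzag_decode(uint64_t n)
{
    /* 0 - (n & 1) is all ones when the low bit is set, zero otherwise. */
    return (int64_t)((n >> 1) ^ (0 - (n & 1)));
}

/* Reinterpret a 32-bit wire value (already assembled from its
 * little-endian bytes) as a float instead of an integer. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The raw varint 150 is 150 if the field is declared as an ordinary integer, but 75 if it is declared as sint32 or sint64. Likewise the 32-bit value 0x3f800000 is 1065353216 as a fixed32 but 1.0 as a float.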
Update, 12:50PM July 12: apparently I wasn’t clear enough: my 100 lines of C do not use any Protocol Buffer library. They implement the decoding themselves. My point is that the format is so simple that you can parse it generically in 100 lines of C. (If you’re wondering: you’d be lucky to get within a factor of 10 for a bare-bones XML parser.)
This should be much shorter. Create a new FileDescriptorProto with one empty DescriptorProto. From that, create a FileDescriptor, look up the “empty message” descriptor, create a DynamicMessage from that, and parse the input. The unknown field set contains the fields.
Actually, in Java you can directly parse an UnknownFieldSet.
Matthias
@Matthias,
Maybe I wasn’t clear enough: my 100 lines of C do not use any protobuf library. They implement the decoding of a protobuf from scratch, and have no external dependencies except for the standard C library.
josh
That .c source code was the most boring game I’ve ever played. It only outputted some text with no instructions of how to play or anything. Do you even get to save any highscores if you figure out how to play??
Ronald