Porting upb to C++?
I am on the verge of trying something I never thought I’d do. I’m considering porting upb to C++.
My reasons aren’t ideological; they are highly practical. Basically I am realizing that while object-oriented C is OK for a while, it’s very weak at inheritance. Inheritance in C involves a lot of casting, duplicated code and/or macros, and careful discipline. The main problems with this are:
- the code gets longer and less readable
- the code involves more possibly-unsafe operations like casts
Both of these problems make the code ultimately more difficult to audit for security. And getting upb audited for security is something I plan to do very soon.
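To make the casting problem concrete, here is a minimal sketch. It is not upb's actual code: every name in it (`c_def`, `c_msgdef`, `Def`, `MsgDef`, the `field_count` member) is hypothetical and chosen only for illustration. The first half shows the usual C pattern of embedding a base struct as the first member and casting by hand; the second half shows the same hierarchy with C++ inheritance, using a hand-rolled type tag so that it would still work with RTTI disabled.

```cpp
#include <cassert>

// Hypothetical type tag shared by both versions.
enum DefType { kMsg = 1, kEnum = 2 };

// ---- C style: the "base class" is embedded as the first member ----
struct c_def {
  DefType type;
};

struct c_msgdef {
  c_def base;       // must stay first, or the downcast below is invalid
  int field_count;
};

inline c_def *c_upcast(c_msgdef *m) { return &m->base; }

// Only the tag check and programmer discipline keep this cast safe.
inline c_msgdef *c_downcast_msgdef(c_def *d) {
  return d->type == kMsg ? reinterpret_cast<c_msgdef *>(d) : nullptr;
}

// ---- C++ style: the compiler knows about the relationship ----
class Def {
 public:
  explicit Def(DefType type) : type_(type) {}
  DefType type() const { return type_; }

 private:
  DefType type_;
};

class MsgDef : public Def {
 public:
  explicit MsgDef(int field_count) : Def(kMsg), field_count_(field_count) {}
  int field_count() const { return field_count_; }

 private:
  int field_count_;
};

// Checked downcast without RTTI: static_cast only compiles for related types.
inline MsgDef *DowncastMsgDef(Def *d) {
  return d->type() == kMsg ? static_cast<MsgDef *>(d) : nullptr;
}

inline int field_count_c_style() {
  c_msgdef m = {{kMsg}, 3};
  c_def *base = c_upcast(&m);               // explicit helper needed to upcast
  c_msgdef *again = c_downcast_msgdef(base);
  assert(again != nullptr);
  return again->field_count;
}

inline int field_count_cpp_style() {
  MsgDef m(3);
  Def *base = &m;                           // upcast is implicit and checked
  MsgDef *again = DowncastMsgDef(base);
  assert(again != nullptr);
  return again->field_count();
}
```

Both versions still need a tag check for the downcast, but in the C++ version the upcast needs no helper at all, and a downcast to an unrelated type is rejected at compile time instead of silently compiling.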
I am coming to believe that porting to C++ would make upb smaller (in lines of code) and easier to verify for security. However, there are a few major disadvantages that are giving me pause:
- there are still some contexts in which C++ is a no-go, like the Linux kernel, embedded systems that only have a C compiler (but no C++), or projects that want to stay C-only. Doing this port would make upb unsuitable for these contexts.
- projects that are currently C-only would need to create C++ source files to call upb APIs, and will have to link in the C++ runtime
- (possible) C++ could result in a larger binary.
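As a rough illustration of the second point, a C-only project would need something like the following shim: a C++ source file that exposes plain C functions around a C++ class. This is only a sketch under assumed names (`sketch::Decoder` and the `sketch_decoder_*` functions are hypothetical, not upb APIs). Note that the plain `new` and `delete` it uses are part of what pulls in the C++ runtime at link time.

```cpp
#include <assert.h>
#include <stddef.h>

namespace sketch {

// Hypothetical C++ class standing in for a ported upb type.
class Decoder {
 public:
  Decoder() : bytes_seen_(0) {}
  void Feed(const char *buf, size_t len) {
    (void)buf;            // a real decoder would parse the bytes here
    bytes_seen_ += len;
  }
  size_t bytes_seen() const { return bytes_seen_; }

 private:
  size_t bytes_seen_;
};

}  // namespace sketch

// The part a C caller would see (normally in a header): an opaque handle
// plus free functions with C linkage.
extern "C" {
typedef struct sketch_decoder sketch_decoder;
sketch_decoder *sketch_decoder_new(void);
void sketch_decoder_feed(sketch_decoder *d, const char *buf, size_t len);
size_t sketch_decoder_bytes_seen(const sketch_decoder *d);
void sketch_decoder_free(sketch_decoder *d);
}

// The part that has to live in a C++ source file.
struct sketch_decoder {
  sketch::Decoder impl;
};

extern "C" sketch_decoder *sketch_decoder_new(void) {
  return new sketch_decoder;  // operator new comes from the C++ runtime
}

extern "C" void sketch_decoder_feed(sketch_decoder *d, const char *buf,
                                    size_t len) {
  assert(d != NULL);
  d->impl.Feed(buf, len);
}

extern "C" size_t sketch_decoder_bytes_seen(const sketch_decoder *d) {
  assert(d != NULL);
  return d->impl.bytes_seen();
}

extern "C" void sketch_decoder_free(sketch_decoder *d) {
  delete d;
}
```

A C caller would only ever include the declarations in the `extern "C"` block and link against the object file compiled from this source, along with the C++ runtime.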
When I look at the downsides though, they don’t seem to pertain to my initial goals of making upb useful for Python, Lua, Ruby, etc. extensions, and for use inside Google. Being useful for really restricted embedded systems is a far-off use case. So it’s sounding like porting to C++ is the right thing to do.
I hope it significantly reduces the line count, as I expect it will. That will make me feel better about giving up the minimalism of C. I will definitely be compiling with -fno-exceptions -fno-rtti -fvisibility-inlines-hidden on gcc. I also won’t be using any of the C++ standard library (not even <string>).
please don’t :’(
@Alejandro: If you want to influence my decision, you should tell me why you don’t want me to.
I definitely have resisted, but practical reasons are demanding that I do. But maybe you are thinking of practical reasons that I’m missing.
@Josh: Hi! You mentioned my reason… I work on a CXX-free embedded controller. I’m moving from ASN.1 to protocol buffers at the IPC level (hate threads) thanks to your awesome upb, and I was starting to write the Lua bindings to finish the migration, integrating the HTTP front-end and some Lua-based helpers running inside the controller. So porting upb to C++ will hit me… badly
… your implementation is great! I understand porting it to C++ will save you some lines… but maybe some refactoring can help to reduce the complexity without having to switch to C++? You can do all the bindings you are aiming for while keeping it in C… with all its current benefits
@Alejandro: Wow, I didn’t realize there was already someone wanting to use upb on an embedded platform! I definitely want to keep supporting your use case if I can. I think the best way forward will be for me to get a C++ to C translator like Comeau C++. Then I can keep writing in C++, but ship C files that you can build with a plain C compiler. Do you think that will work for you?
assuming the generated C code is good, why not?
thanks!
Have you checked with the security team about this yet? C++ might reduce the amount of potentially unsafe manual casting that goes on in the code, but it also introduces a whole host of (often very subtle) security risks of its own… (For example, see: http://chargen.matasano.com/chargen/2009/10/9/a-c-challenge.html and some of the links there.) Depending on how much you’re planning on changing, and depending on your security folks, it might well be faster to tighten up the current codebase than to rewrite it. (It’s been a little while since I looked at the code, and I certainly haven’t tried to do a formal review, but I didn’t notice too many scary things going on in there…)
On a more selfish note — I too was eyeing upb for use in a (pure C) project of mine, and would just as soon see it stay C… So take my comments with a grain of salt.
-sq
I’ve also been lazily contemplating upb for use in my projects. Since I tend to write my code in strange languages like D and Haskell, I’ll second the others’ comments! I would not mind having a thin C++ wrapper though.