
Torn over the C++ question

December 2nd, 2009 Josh

I am having a very difficult time deciding whether to go through with the C++ port of upb or to stay in C.

I’ve ported about one third of upb to C++, on a branch, to see how it would turn out. It was a ton of work. Here are my current observations:

  • The C++ is cleaner, more readable, less error-prone code. It’s just a fact. Compare for yourself (C: upb_def.h, upb_def.c; C++: upb_def.h, upb_def.cc). This is due to numerous factors:
    • type-safe containers mean fewer casts.
    • “public” and “private” keywords make it easy to separate the private parts of your interface, without having to specify in comments which is which.
    • namespaces and class scope mean that I don’t have to write out my identifiers like upb_fielddef_dothis(); I can just write DoThis().
    • real inheritance and member classes mean I don’t have to explicitly call all the right constructors/destructors, or write explicit casts for upcasts.
    • destructors that are guaranteed to run on scope exit mean I can use RAII patterns, like mutex locks that are released automatically when the scope is exited.
  • The source got shorter; the portion I ported went from 1483 lines to 1133, a reduction of about 24%.
  • The binary got a LOT bigger. I had one function get literally 5x as big. I haven’t figured out why this happened yet. I used templates to make the table generic, but I was extremely careful to make sure that the template only generated a small amount of code — basically just the hash lookup routine, which is small (note: the hash function for strings was not templated or inlined). But another issue is that the C++ compiler appears to emit multiple copies of the same function in the same object file! For example, I found some virtual destructors emitted literally three times in the same file. Why is this?
  • I just heard back from a security guru from the Google security team, who said that C is often easier to audit than C++ because it’s easier to figure out what is actually going on, without having to dig through layers of abstraction. This surprised me (maybe it shouldn’t have, since Sam Quigley said the same thing in a comment on my last entry), but I was also a little bit relieved.
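To make the readability points in the list above concrete, here is a minimal sketch of what they look like in practice. It is not code from upb: the namespace `sketch`, the class `FieldTable`, and its methods are hypothetical names invented for illustration, and it assumes a C++11 compiler for `std::mutex` and `std::lock_guard`.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace, not upb's

// Hypothetical table mapping field names to field numbers.
class FieldTable {
 public:
  // Registers a field. Returns false if the name is already present.
  bool Add(const std::string& name, int number) {
    assert(number >= 0);  // -1 is reserved as the "not found" result
    std::lock_guard<std::mutex> lock(mu_);  // released on every return path
    return fields_.insert(std::make_pair(name, number)).second;
  }

  // Returns the field number, or -1 if the name is unknown.
  int Lookup(const std::string& name) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = fields_.find(name);
    return it == fields_.end() ? -1 : it->second;  // typed value, no casts
  }

 private:
  // Callers cannot touch these, so no "private, do not use" comments needed.
  mutable std::mutex mu_;
  std::map<std::string, int> fields_;
};

}  // namespace sketch
```

A C version of the same thing would typically need prefixed function names, a `void*`-based container with casts at each lookup, and a manual unlock before every `return`.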

I’m leaning towards sticking with C, for the following reasons:

  • C++ compilers aren’t very good at keeping things small, even when you are judicious with your use of templates.
  • C++ compilers are much more complicated than C compilers, and therefore not as ubiquitous or as easy to trust generally.
  • C isn’t harder to audit for security than C++, and may actually be easier.
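On the template size concern, one common way to limit how much code each instantiation generates is to keep the real logic in a non-template core and let the template add only casts. The sketch below shows that technique under stated assumptions: it is not how upb's table was written, the names `RawIntTable` and `IntTable` are hypothetical, and a linear scan stands in for a real hash lookup to keep the example short.

```cpp
#include <cassert>
#include <vector>

// Non-template core. In a real project the member functions would be defined
// in a single .cc file so that their code exists exactly once in the binary.
class RawIntTable {
 public:
  // Inserts or overwrites the value stored under `key`.
  void Insert(unsigned key, void* value) {
    assert(value != nullptr);  // nullptr is reserved for "not found"
    for (Entry& e : entries_) {
      if (e.key == key) { e.value = value; return; }
    }
    entries_.push_back(Entry{key, value});
  }

  // Returns the stored pointer, or nullptr if the key is absent.
  void* Lookup(unsigned key) const {
    for (const Entry& e : entries_) {
      if (e.key == key) return e.value;
    }
    return nullptr;
  }

 private:
  struct Entry { unsigned key; void* value; };
  std::vector<Entry> entries_;
};

// Thin typed wrapper. Each instantiation contributes only trivial forwarding
// functions, which a compiler can usually inline away.
template <typename T>
class IntTable {
 public:
  void Insert(unsigned key, T* value) { raw_.Insert(key, value); }
  T* Lookup(unsigned key) const { return static_cast<T*>(raw_.Lookup(key)); }

 private:
  RawIntTable raw_;
};
```

Whether this actually shrinks a given binary depends on the compiler and flags, so it is worth measuring. With GNU binutils, `nm -S --size-sort` on the object file lists symbols by size and makes duplicated or unexpectedly large functions easy to spot.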

I’ll try to take some of the lessons I learned from my partial C++ port to make the C more readable.

  1. Alejandro M.
    December 3rd, 2009 at 06:47 | #1

    great choice! :-)

  2. George P.
    December 8th, 2009 at 18:59 | #2

    I have a hard time accepting the comment that C++ compilers aren’t very good at keeping things small. Which compiler have you used? I would agree that C++ nuances are complicated and that you need to read volumes of books to better grasp the details of optimization.

  3. nobody
    December 14th, 2009 at 15:09 | #3

    GCC 4.5 should get link-time optimization, which will hopefully help binary size a lot. Ultimately though, I still find myself reaching for C, for the ease of binding to other languages and the smaller size. I still like to keep my code C++ compatible and run it through the C++ compiler occasionally though, just because it’s much stricter on typing.
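As an illustration of the stricter typing mentioned in the comment above, here is a small sketch written in the common subset of C and C++. The function name `dup_string` is hypothetical. The one place the two languages disagree is the result of `malloc()`: C converts `void*` to `char*` implicitly, while C++ requires the cast.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper that compiles as both C and C++.
   Returns a heap-allocated copy of s, or NULL if allocation fails.
   The caller releases the result with free(). */
char *dup_string(const char *s) {
  size_t n;
  char *copy;

  assert(s != NULL);
  n = strlen(s) + 1; /* include the terminating NUL */

  /* C would accept `copy = malloc(n);`. C++ rejects the implicit
     void* to char* conversion, so the cast is spelled out. */
  copy = (char *)malloc(n);
  if (copy == NULL) {
    return NULL; /* report allocation failure instead of crashing */
  }

  memcpy(copy, s, n);
  return copy;
}
```

Code written this way keeps the C ABI and binding story intact while still letting a C++ compiler act as an extra type checker.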

  4. blankthemuffin
    December 27th, 2009 at 15:04 | #4

    I vote to stick with C for another reason, integration with other languages. If your project is a nice C library, it’s vastly easier for people using other languages, C, Python, whatever, to interface with your library rather than having to re-implement it.

    C++ is imo an awful choice for a library because it ends up restricting this to other things written in C++ (unless of course you wrap it up in a C abi but that is messy).

    I’d prefer a base in C with a C++ wrapper as required.
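The "base in C with a C++ wrapper" arrangement suggested above can be sketched roughly as follows. Everything here is hypothetical: the `counter_*` functions stand in for a C library's public API and are not upb's, and the C implementation is placed in the same file only so that the example is self-contained. It assumes C++11 for `= delete` and `nullptr`.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// --- Hypothetical C API, as it would appear in the library's header ---
extern "C" {
typedef struct counter counter;  // opaque to callers
counter* counter_new(void);      // returns NULL on allocation failure
void counter_free(counter* c);
void counter_add(counter* c, int amount);
int counter_value(const counter* c);
}

// --- Implementation; in a real project this would live in a .c file ---
struct counter { int value; };

extern "C" counter* counter_new(void) {
  counter* c = static_cast<counter*>(std::malloc(sizeof(counter)));
  if (c != nullptr) c->value = 0;
  return c;
}
extern "C" void counter_free(counter* c) { std::free(c); }
extern "C" void counter_add(counter* c, int amount) {
  assert(c != nullptr);
  c->value += amount;
}
extern "C" int counter_value(const counter* c) {
  assert(c != nullptr);
  return c->value;
}

// --- Optional C++ convenience wrapper, header-only ---
class Counter {
 public:
  Counter() : c_(counter_new()) {
    if (c_ == nullptr) throw std::bad_alloc();
  }
  ~Counter() { counter_free(c_); }  // runs on scope exit, so no leak

  // Non-copyable: exactly one wrapper owns the C object.
  Counter(const Counter&) = delete;
  Counter& operator=(const Counter&) = delete;

  void Add(int amount) { counter_add(c_, amount); }
  int value() const { return counter_value(c_); }

 private:
  counter* c_;
};
```

Bindings for Python or other languages would call the `counter_*` functions directly, while C++ users get scope-based cleanup from the wrapper without the library itself depending on a C++ compiler.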

  5. Kate
    January 6th, 2010 at 11:01 | #5

    Hi. Don’t forget to check the return value of malloc(). You might also enjoy this:
    http://iq0.com/notes/deep.nesting.html

  6. Anonymous
    February 19th, 2010 at 15:33 | #6

    It’s trivial to wrap your C implementation so that it can be called directly from C++. So translating your implementation from C to C++ provides no real benefit; it’s more like busy work. Plus C++ is such an ugly language that it’s hard to imagine that it won’t die out over the next decade, replaced by C#/Objective C/Java etc. C, on the other hand, has virtues of simplicity, clarity, portability, and performance that other programming languages have yet to supplant.
