Defending RPC

Posted by josh at May 23rd, 2008

Steve Vinoski has come out very vocally against RPC in the last few days: see this blog entry and this mailing list post. The blog entry (which I read first) made him sound like someone who just hasn’t been around large systems much, but then I was surprised to see that he’s a senior fellow or architect or something along those lines at a company that does distributed systems.

His blog entry basically makes fun of Cisco for inventing/releasing another RPC system. It’s not clear exactly what he thinks they should have done instead. What is strange about this criticism is that tons of technology companies have developed their own RPC system — Facebook and Cisco publicly, and other technology companies I am familiar with in a not-so-public way. Guess what: large commercial distributed systems are built largely on RPC. Is he arguing that all of the engineers at these companies simultaneously got the bad idea of investing in something they don’t need? If RPC is such a bad idea, then why is everybody doing it?

“Everybody’s doing it” obviously isn’t a justification on its own, but it definitely puts the onus on the person making the critique to show why it’s a bad idea. I got a better idea of where he was coming from when I read the mailing list post. Here’s the heart of his argument:

the fundamental problem is that RPC tries to make a distributed invocation look like a local one. This can’t work because the failure modes in distributed systems are quite different from those in local systems, so you find yourself having to introduce more and more infrastructure that tries to hide all the hard details and problems that lurk beneath. That’s how we got Apollo NCS and Sun RPC and DCE and CORBA and DSOM and DCOM and EJB and SOAP and JAX-RPC, to name a few off the top of my head, each better than what came before in some ways but worse in other ways, especially footprint and complexity. But it’s all for naught because no amount of infrastructure can ever hide those problems of distribution. Network partitions are real, timeouts are real, remote host and service crashes are real, the need for piecemeal system upgrade and handling version differences between systems is real, etc. The distributed systems programmer *must* deal with these and other issues because they affect different applications very differently; no amount of hiding or abstraction can make these problems disappear.

Finally something we can agree on! Yes, on a network shit happens, and no sane RPC system will try to hide this from you.

But then again, I don’t know of any RPC system that tries to hide this from you except possibly CORBA. Maybe there’s a horrible history here I don’t know about, but no RPC system I have ever encountered tries to hide from you the fact that on a network, shit happens.
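
To make that concrete, here is a minimal sketch in Python of what a client-side stub that does not hide the network might look like. Every name in it (`RpcError`, `DeadlineExceeded`, `RemoteUnavailable`, `call_remote`, `get_user_name`, and the `transport` callable) is hypothetical, made up for illustration rather than taken from Cisco's, Facebook's, or any other real RPC framework.

```python
class RpcError(Exception):
    """Base class for failures that a local call could never produce."""

class DeadlineExceeded(RpcError):
    """The reply did not arrive within the caller's deadline."""

class RemoteUnavailable(RpcError):
    """The remote host or service could not be reached."""

def call_remote(transport, method, args, timeout_s):
    """Invoke `method` through `transport`, a hypothetical callable that sends
    the request and returns the reply, raising TimeoutError or ConnectionError
    when the network misbehaves."""
    try:
        return transport(method, args, timeout_s)
    except TimeoutError as exc:
        raise DeadlineExceeded("%s: no reply within %.1fs" % (method, timeout_s)) from exc
    except ConnectionError as exc:
        raise RemoteUnavailable("%s: %s" % (method, exc)) from exc

def get_user_name(transport, user_id):
    """Application code decides what a failure means for this particular call."""
    try:
        return call_remote(transport, "GetUserName", {"id": user_id}, timeout_s=0.5)
    except DeadlineExceeded:
        return "<unknown>"  # acceptable here: degrade instead of waiting forever
    # RemoteUnavailable is deliberately not caught, so the caller sees it.
```

The call site reads like a function call, but the deadline is an explicit argument and the failure modes are explicit exception types, so nothing about the network is being swept under the rug.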

So what are his other criticisms?

RPC systems in C++, Java, etc. also tend to introduce higher degrees of coupling than one would like in a distributed system. Typically you have some sort of IDL that’s used to generate stubs/proxies/skeletons — code that turns the local calls into remote ones, which nobody wants to write or maintain by hand. The IDL is often simple, but the generated code is usually not. That code is normally compiled into each app in the system. Change the IDL and you have to regenerate the code, recompile it, and then retest and redeploy your apps, and you typically have to do that atomically, either all apps or none, because versioning is not accounted for.

Yay, we can agree again. RPC systems that make you do an “all at once” upgrade are a bad idea. But again, no RPC system I have encountered makes you do this. Does this mean that the RPC system guarantees for you that the old and new protocols are compatible? Of course not — you don’t want your framework to be some big “I know what’s best for you” mommy that does really expensive things to solve this problem, like loading both versions of your code at the same time. But any RPC framework worth its salt makes it possible to have different interface versions interoperate. Adding a new parameter? No problem, old servers simply won’t see it. Completely changing the semantics of your call? No problem — just give the new call a new name.
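
A rough sketch of those two compatibility moves, in Python and with hypothetical names throughout (`HANDLERS`, `dispatch`, `get_user`, `get_user_v2`, and the dict-shaped request are assumptions for illustration, not any real framework's wire format): a handler reads only the fields it knows about, so a newer client's extra parameter is simply ignored, and a call whose semantics change is registered under a new name next to the old one.

```python
def get_user(params):
    # Old handler: reads only the field it knows; extra fields are never looked at.
    return {"name": "user-%d" % params["id"]}

def get_user_v2(params):
    # Changed semantics (returns a list), so it is published under a new name.
    # The new optional parameter gets a default, so callers that omit it still work.
    limit = params.get("limit", 1)
    return {"names": ["user-%d" % params["id"]] * limit}

HANDLERS = {"GetUser": get_user, "GetUserV2": get_user_v2}

def dispatch(request):
    """Route a decoded request to its handler; unknown methods fail loudly."""
    handler = HANDLERS.get(request["method"])
    if handler is None:
        return {"error": "unknown method: %s" % request["method"]}
    return {"result": handler(request["params"])}
```

Old and new clients can then talk to the same server during a rolling upgrade: `dispatch({"method": "GetUser", "params": {"id": 7, "locale": "en"}})` succeeds even though the old handler has never heard of `locale`.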

Steve’s criticism amounts to “sucky RPC systems suck.” Yes Steve, yes they do. But a lot of the technology world is running on non-sucky RPC systems, and from time to time you get a glimpse of that when a company like Facebook or Cisco releases their internal RPC system to the outside world. Did Steve check to see if Cisco’s new RPC system is subject to any of his critiques? I haven’t, but I would suspect it isn’t.

Posted in Uncategorized| 10 Comments | 

The future of automatic memory management

Posted by josh at April 9th, 2008

Observation #1: stop-the-world garbage collection is a thorn in the side of latency-sensitive applications.

Observation #2: we will very soon have more cores than we know what to do with.

Prediction: fully concurrent garbage collection is the future of automatic memory management. I’m talking garbage collectors that run in other threads and clean up after me without ever stopping me in the middle of what I’m doing.

It will almost certainly be more expensive in terms of total CPU time, and probably can’t be as aggressive in terms of what it can reclaim at any point in time, but for most applications the latency guarantees will far outweigh those costs.

Discuss.

Posted in Uncategorized| 4 Comments | 

Python threading blues

Posted by josh at March 20th, 2008

Some Python fan please tell me that I’m missing something.

Is this really the boilerplate necessary for creating even the simplest thread in Python?

import threading

class MyThread(threading.Thread):
  def __init__(self, arg, **kwargs):
    threading.Thread.__init__(self, **kwargs)
    self.arg = arg

  def run(self):
    # Parenthesized so the line works as a Python 2 statement and a Python 3 call.
    print("I'm running in a thread, with arg %d!" % self.arg)

thread = MyThread(5)
thread.start()
 

This is making me miss Ruby, for which the equivalent is:

thread = Thread.new(5) { |arg|
  puts "I'm running in a thread, with arg #{arg}!"
}
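
For comparison, the standard library's `threading.Thread` constructor also accepts a `target` callable and an `args` tuple, which avoids the subclass entirely. A minimal sketch (the `work` function and the `results` list are just illustrative names):

```python
import threading

results = []

def work(arg):
    # Runs in the new thread; records its message so the main thread can read it.
    results.append("I'm running in a thread, with arg %d!" % arg)

thread = threading.Thread(target=work, args=(5,))
thread.start()
thread.join()  # wait for the thread to finish before reading results
print(results[0])
```

That is still a named function plus a constructor call rather than an inline block, so it narrows the gap with the Ruby version without quite closing it.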
 

P.S. Gazelle 0.2 is making a lot of progress, but unfortunately won’t hit the 1 month mark I hoped for. Surprise surprise. But when it does come, it’s going to be awesome.

Posted in Uncategorized| 3 Comments | 

More thoughts on good programmers

Posted by josh at February 12th, 2008

I’m finally responding to Buffalo’s perspective on my last post about “Brilliant programmers.” I didn’t say anything at first because I couldn’t think of anything insightful to say. I still can’t, so I’m going to have to just make shit up.

Buffalo approaches the question from an educator’s perspective:

The real million dollar question in my mind, Josh, is what are these super-programmers doing that others are not? Even if there’s just something irreproducible in their genes - shouldn’t this be the sort of thing we could detect somehow? Or if there are strategies involved, could we use them to make the 90 percentile 5x better than everybody else?

[…]

Which of course brings me around to my research - say you had a group of the 50 cs students who enter an average program around the country. And say your goal was to get absolutely as many of them as possible into that 5% and screw everything else. What would you say to them? What would you do with them?

I guess it’s depressing to put this way, but my true belief is that programmers are born, not made. So I think the absolute number one thing a CS education can do for its students is help them understand whether they were born programmers or not.

This isn’t a binary thing, of course, and I don’t even think it’s single-dimensional. Everyone has areas of strength and weakness. And it’s a big world out there — succeeding at Amazon or Microsoft or Google is different than succeeding in the consulting world or at a startup or the technology department of an unrelated industry. I know at least one person who is a decent programmer but is doing quite well by combining good business understanding and fantastic people skills with his competent but not outstanding technical background. There are a lot of paths for people to take.

Then there is the question of what productivity is. In a corporate setting, productivity is making progress toward keeping your customer happy, whoever your customer may be and however bizarre their wants. Well, I doubt that writing a self-compiling C compiler or figuring out how to calculate pi was at the top of any customer’s wishlist. Fabrice Bellard, for all his badassery, could end up being fairly unremarkable when you put him in a cube and tell him to write web applications. Or he might find it excruciating and spend the whole time inventing a framework that lets him write web apps the way he thinks. After all, that’s what Paul Graham did.

So disclaim, disclaim, disclaim. My point so far is just to highlight what is probably already obvious: that the landscape of talent is complex and multidimensional.

That said, I think it’s really key that students understand their strengths and weaknesses as early as possible. It will help them decide whether CS is right for them and, if it is, what direction they should go within CS. And the best way to achieve this understanding is to expose students to lots of different things. As a bonus, this is also exactly what the overachievers need and want: more problems to feast their minds on. So if I could sum it up in a sentence, I think the number one thing that a university can do is expose young programmers to lots of topics and give them feedback about how they do in each one.

As to how we can help budding programmers be in the super-productive elite? I’m somewhat hesitant to answer that question, because I feel the answer reveals what the answerer thinks makes him/her a member of this elite class. But my biased answer is that the #1 most valuable trait a programmer can have is resourcefulness. It’s knowing what’s out there — what tools, strategies, file formats, blogs, or best practices — and knowing how they apply to the problem in front of you. If some programmers truly are outperforming others by 20x, it’s not because they’re typing 20x faster — it’s because they have such a solid understanding of the problem’s context that they have laser-like focus on the shortest possible path between them and their goal.

On the flip side, I think the worst thing that can happen to programmers is to get caught in their own little worlds where they only know one way of dealing with problems. Java-only programmers are the easiest example of this vice to pick on, but not the only one. You can probably get by only using/liking Java if you work at an all-Java shop where you are a cog in a machine, but you’ll never have any perspective on the big picture. I think we all fall prey to getting comfortable with our favorite languages/tools and thinking of everything through those lenses, but the more we can avoid that, the better we do IMO.

So that’s my answer: show students what’s out there, and encourage them to be resourceful. If there is any trace of programmer in them, it will come to life and demand that you feed it more. Anyone who doesn’t self-motivate at that point is not destined to be a programmer. They might not have the talent, or they might have the talent but not the motivation; either way, they probably should find something else instead.

Posted in Uncategorized| 1 Comment | 

Brilliant programmers

Posted by josh at February 2nd, 2008

Last November Bruce Eckel (the “Thinking in $FOO” guy) gave a speech claiming that 5% of programmers are 20x more productive than the other 95%. Some people flat out denied that was possible. I think it’s a mistake to think of “20x more productive” as “well, Billy wrote class Foo in 3 hours so I guess a superstar programmer would have written the same class in 9 minutes.” It’s not like that at all.

To me, what makes that 5% so bad-ass is that they write code that the mediocre programmer will never write, not if you let them work until the heat death of all the stars in the sky.

I want to take a moment and recognize a few of the people I consider to be the most bad-ass programmers around. This will necessarily be people whose bad-ass work is open source.

  • Fabrice Bellard. This guy is an animal. He wrote QEMU, a hardware emulator that achieves near-native performance by using dynamic translation (translating code from one instruction set to another on-the-fly). He’s won the Obfuscated C Code Contest twice, once with a C compiler capable of compiling itself (in less than 4k, according to the rules of the contest). He later adapted this compiler to a boot disk in which the boot loader loads and compiles the entire Linux kernel (in approximately 15 seconds on modern hardware) and proceeds to boot from it. Also, though it’s not exactly computer programming, in 1997 he discovered the fastest known algorithm for computing an arbitrary digit of pi.
  • Julian Seward. He wrote Valgrind, which started as the best memory debugger money can buy, and has grown into far more. He also wrote bzip2 and has worked on GHC, the most popular native code compiler for Haskell.
  • Mike Pall. You want this guy working on your programming language. He created LuaJIT, one of the best and most lightweight JITs available for any dynamic language today. And his plans for where he’s going to take LuaJIT in the next year are going to put Lua so far ahead of other dynamic languages’ implementations that it hurts.
  • Linus Torvalds. This one is almost too obvious to include, but I have to mention him because with Git he showed that Linux wasn’t just a fluke. He didn’t just get lucky by being in the right place at the right time. This guy’s the real deal.
  • Anyone who has ever won the Obfuscated C Code Contest. Maybe not the most useful code ever, but what these people manage to pull off in less than 4k of code impresses me beyond measure. Special mention to Brian Westley, who has won it nine times! My favorite entry: A letter from Charlie to Charlotte.
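
For the curious, the pi result mentioned in the Bellard entry is usually written as the series summed below, which yields binary (not decimal) digits. This sketch only adds up the first few terms with exact fractions to check that the series converges to pi; it is not the digit-extraction algorithm itself, and the function name `bellard_pi` is made up here.

```python
from fractions import Fraction
import math

def bellard_pi(terms):
    """Partial sum of Bellard's series for pi, using exact rational arithmetic."""
    total = Fraction(0)
    for n in range(terms):
        bracket = (
            - Fraction(2**5, 4*n + 1)
            - Fraction(1, 4*n + 3)
            + Fraction(2**8, 10*n + 1)
            - Fraction(2**6, 10*n + 3)
            - Fraction(2**2, 10*n + 5)
            - Fraction(2**2, 10*n + 7)
            + Fraction(1, 10*n + 9)
        )
        # Each term is scaled by (-1)^n / 2^(10n), so it adds about ten bits.
        total += Fraction((-1)**n, 2**(10*n)) * bracket
    return total / 2**6

approx = float(bellard_pi(6))
print(abs(approx - math.pi) < 1e-12)
```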

Guys like this singlehandedly turn out code that 99.9% of programmers will simply never write. More time won’t help, and putting them into teams of people all working on the problem certainly won’t help. There’s just a very small subset of people who will ever create such brilliant solutions to such hard problems, and I have nothing but respect and awe for people of this caliber.

Posted in Uncategorized| 3 Comments | 
