Giving up on AT&T style assembler syntax

July 31st, 2009 Josh No comments

Until recently I had been pretty agnostic about Intel vs. AT&T style assembler syntax. I always noticed that people who had a strong opinion almost always preferred Intel-style, but I didn’t care too much one way or the other.

gcc was my first real compiler, and this was back before binutils supported Intel syntax like they do now. So I read Brennan’s Guide to Inline Assembly (which I still reference frequently), and didn’t worry too much about it.

One thing that always bugged me a little bit was how the instruction names weren’t exactly the same. AT&T made you put these suffixes on your instructions, so mov would become movl. The main problem with this is Googleability.

But today what was previously an annoyance reached the level of being a serious problem. I was looking at an instruction listing and saw the instruction movslq. First I Googled for movsl (presuming that the “q” was a “quadword” suffix), but that yielded nothing. Then I tried Googling for movslq in its entirety, still nothing that seemed to define the instruction.

What I eventually discovered is that movslq in AT&T syntax corresponds to movsxd in Intel syntax. The moment I discovered this, it became quite clear to me that AT&T syntax was a dead end. “-M intel” will be my default parameter to objdump from now on.
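For anyone who wants to see the instruction for themselves, here is a minimal sketch that should provoke it. The function name widen is made up for illustration. The assumption is an x86-64 target with optimization turned on; the exact registers will depend on your compiler and ABI.

```c
#include <stdint.h>

/* Widening a signed 32-bit value to 64 bits needs a sign extension.
 * On x86-64, gcc typically emits a single instruction for this, which
 * objdump shows as "movslq %edi,%rax" by default (AT&T syntax) and as
 * "movsxd rax,edi" with -M intel. */
int64_t widen(int32_t x) {
  return x;
}
```

Read this way, both trailing letters in movslq are AT&T size suffixes: sign-extend a 32-bit “long” into a 64-bit “quad”. That is why searching for movsl alone goes nowhere.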

Categories: Uncategorized Tags:

Git needs a new interface

July 30th, 2009 Josh 5 comments

I’ve been a git advocate for a while, and I use git in two different projects. I think git is an impressive technical accomplishment, but I think its interface (“porcelain”) is not ready for prime-time. I really hope some UI-focused person will design a “v2” for the git interface so that someday git can be the obvious choice for version control for any project.

Specific problems:

“checkout” is a destructive command

I seriously have no idea what Linus was thinking. It is insanity that:

$ git checkout foo.c

…will overwrite any local modifications you may have to foo.c without asking.

You can’t merge upstream changes into your local, uncommitted modifications

Suppose I cloned some repository and I started hacking it up. My changes are still hacky and not ready to be committed. Say a few days later I want to pull upstream changes, but without committing my hacky changes to my local repository:

$ git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /tmp/foo
   e726259..e20c51a  master     -> origin/master
Updating e726259..e20c51a
error: Entry 'foo.c' not uptodate. Cannot merge.

Git is refusing to perform a merge of my local modifications with the upstream changes. It wants me to commit my local changes first. This is annoying. Every version control system I have ever used supports this, except Git. Asking me to commit my hacky changes is unreasonable; they’re hacky and unfinished. They might not even compile!

Yes, I could do:

$ git stash
$ git pull
$ git stash apply

But why should I have to do this? CVS, SVN, and P4 don’t make me.

Git’s merge conflict resolution workflow is unintuitive

Continuing with the above example, now suppose I committed my local changes and then did a pull, but the changes were conflicting:

$ git pull
Auto-merged foo.c
CONFLICT (content): Merge conflict in foo.c
Automatic merge failed; fix conflicts and then commit the result.

Ok, git is somewhat helpful here, I’ll fix the conflicts in foo.c and commit the result:

$ vim foo.c
$ git commit
foo.c: needs merge
foo.c: unmerged (f388ef85dd65c39e4c76f5e597d3b67f7d1a0726)
foo.c: unmerged (6f4bf54585ae256236c0d6cfa9f114affb94313f)
foo.c: unmerged (06c974ebbfc04394f4fad8a6dcb31e64866fa1bf)
error: Error building trees

Ok, maybe it’s obvious to experienced git users what my error is here, but git’s error message is worse than unhelpful: it’s downright confusing. I think I’ve resolved the conflict, but all git can think to do is tell me that “foo.c needs merge” and spit some SHA1s at me. It gives me absolutely no help about what I need to do to fix the problem. (The missing step, it turns out, is to run git add foo.c, which marks the conflict as resolved, before committing.)

Suppose that I want to resolve the merge by using either my version or their version verbatim (“accept mine”/“accept theirs”):

$ git checkout foo.c
error: path 'foo.c' is unmerged

Again, unhelpful (and in this case, what I’m trying to say actually makes sense, git just won’t let me do it).

Interface for working with the index almost universally confusing

I understand the difference between the working directory, the index, and the committed tree pretty well. But I cannot for the life of me remember the difference between:

$ git reset --soft
$ git reset --hard
$ git reset --mixed

I can barely keep them straight while I’m reading the manpage. “Soft” resets HEAD but not the working directory or index. “Mixed” resets HEAD and the index, but not the working directory. “Hard” resets HEAD, the index, and the working directory.

In conclusion, this isn’t meant to be an exhaustive list of problems with git’s interface, it’s more meant to be a microcosm. Git’s interface is not intuitive or easy to learn, and its error messages are not helpful. Which is too bad, because as I said I think Git is solid technology. I just hope someone writes a better porcelain for it. I’m not talking about evolutionary changes, I think the suite of top-level commands (checkout, branch, merge, pull, reset, commit, etc) needs to be redesigned from scratch.


The Perils of Writing Good Documentation

July 30th, 2009 Josh 2 comments

I’ve been thinking about documentation lately, and I feel unsatisfied with the options I currently have available to me for writing and publishing documents. This dissatisfaction is not too well defined; I can’t put my finger on exactly what I want, but when I look at my options I’m not too excited about any of them.

When I say “documentation,” I am talking about several slightly different things:

  • Project Homepage for projects like upb and Gazelle. The goal of a project homepage is to answer the question “what is this project and why should I use it?” It should also be attractive enough for a person to feel like this project is high-quality stuff. And of course it should point them to the relevant resources (downloads, source tree, bug tracker, etc). Good examples: http://git-scm.com/, http://www.ruby-lang.org/en/, http://www.gazelle-parser.org.
  • Manuals for projects like upb and Gazelle. The goal of the manual is to provide both tutorial-like and reference-like information about how to use the software. Manuals have a lot of structure and are a bit more formal, since they are intended to precisely explain how the software should be used. They tend to track the software itself more closely than the other types of documentation, and are often even checked into the source tree. For example, Gazelle’s Manual.
  • Design discussions/rationale. These aren’t quite like a manual because instead of describing how the software works, they describe why the software is the way it is. What are the alternatives to your approach and why did you pick the one you did? What are the trade-offs? I don’t think we see as much of this documentation as we should in the open-source world, but one good example is the Python PEP process.
  • General articles about a particular subject. I mean to write some documents that explain the basic ideas of parsing in a more approachable way than most parsing literature. The literature can be a bit oblique, and I think I could do a good job of explaining it in a way that anyone can understand.

The main options that I see available to me are:

  • Plain HTML. Even I have come to the conclusion that this isn’t a good choice any more. Too much work to write, too little flexibility, not enough bang for the buck. Of the four documentation kinds above, the only one it remotely makes sense for is the “Project Homepage” case, but even that is too much work for me. Creating the Gazelle homepage took too much work, and it’s not even that awesome.
  • Personal Wiki / MarkDown. By “Personal Wiki” I mean a wiki that you run yourself. I put this in the same category as MarkDown because the two tend to have the same advantages/disadvantages. The advantages are that you can get a reasonably attractive product with minimal effort, and they are fairly customizable. The big disadvantage is that no two of these lightweight markup languages are compatible, and there are so many to choose from (seriously: MarkDown, reStructuredText, Textile, AsciiDoc, and those are just the ones I know off the top of my head). It’s slightly scary to invest a lot into a format that is one of many possible contenders.
  • Hosted Wiki, like the Google Code wiki or the GitHub wiki. In this case hosting is taken care of, but you have less control over the look and more stuff cluttering your page. Also, I can’t figure out why, but something about the design of Google Code makes me totally uninspired to write any documents in its wiki. Another thing to note is that if a hosted wiki disappears (GitHub is only a startup, it could totally go under), it’s not clear what happens to your documents!
  • DocBook, which is a little better than a MarkDown scheme because DocBook seems to have gained some critical mass. Still, the DocBook people seem to have a mild-to-moderate case of XML-itis, and the DocBook homepage seems more concerned with spitting acronyms at you than telling you if DocBook is capable of something basic like theming your document in different ways.

So as you can see, I’m not super satisfied with any of my options. The Gazelle Manual uses AsciiDoc, which seems to work ok, and I would probably choose it again. I guess I’d be most inclined to choose either AsciiDoc or DocBook for writing general articles (I like this article about Python types and objects which was made using DocBook and is attractive).

I can’t decide what to do for the Project Homepage or the Design Discussions case. I really want to have attractive Project Homepages, but I don’t have too much web design talent and HTML is too much work for me. For Design Discussions I guess I’m leaning towards the GitHub wiki just because it pairs with the project hosting nicely, though I am somewhat uncomfortable with the idea that GitHub could disappear one day, and that moving my documents from the GitHub wiki somewhere else sounds like a headache.


Don’t forget -march!

July 19th, 2009 Josh 1 comment

The -march flag itself is GCC-specific, but the general advice is universal: don’t forget to tell your compiler that it can take full advantage of your spiffy new CPU! I should know better but I’ve been forgetting to specify -march when compiling upb.

Here’s an extreme example of why. Take an innocent-looking function like:

int float_to_int(float f) {
  return (int)f;
}

Looks simple enough, right? Unfortunately, float -> int casts are stupidly expensive on x86. Without any -m flags, gcc compiles this to:

sub      $0x8, %esp       ; allocate stack space
fnstcw   0x6(%esp)        ; save floating-point control word
flds     0xc(%esp)        ; push floating-point param onto fp stack
movzwl   0x6(%esp), %eax  ; move prev fp control word into %eax
mov      $0xc, %ah        ; set rounding mode of control word to "truncate"
mov      %ax, 0x4(%esp)   ; save it *back* to the stack
fldcw    0x4(%esp)        ; set the floating-point control word to truncate
fistp    0x2(%esp)        ; store integer from the fp stack to the stack
fldcw    0x6(%esp)        ; set the fp control word back to what it was
movzwl   0x2(%esp), %eax  ; read the value into eax (the return value)
add      $0x8, %esp       ; give the stack space back
ret

This would be funny if it weren’t so sad. All these gymnastics are required because the C standard requires the cast to truncate (round toward zero), but the x86’s floating point unit normally runs in round-to-nearest mode. So the code has to switch the rounding mode, do the conversion, and then switch the mode back.
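A quick sanity check of what the cast actually has to produce, using the same function as above. The sample values are just ones I picked for illustration.

```c
/* The cast discards the fractional part, i.e. it rounds toward zero.
 * Round-to-nearest would instead turn 2.7 into 3 and -2.7 into -3,
 * which is why the x87 code has to change rounding modes first. */
int float_to_int(float f) {
  return (int)f;
}
```

So float_to_int(2.7f) is 2 and float_to_int(-2.7f) is -2. Note that the negative case moves toward zero, not toward negative infinity.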

Compiling exactly the same code with -msse2 allows the compiler to take advantage of an SSE-only instruction, and the above is replaced with:

cvttss2si  0x4(%esp), %eax     ; convert value to integer with truncation
ret

The difference in this case is astounding. Hopefully this will motivate you never to forget the -march flag!

The right thing to do in my case is compile with -march=core2. When I compile with -march=core2 or -msse3, the compiler emits the not-quite-as-terse:

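sub     $0x4,%esp         ; allocate stack space
flds    0x8(%esp)         ; push floating-point param onto fp stack
fisttpl (%esp)            ; store as integer with truncation, popping the fp stack
mov     (%esp),%eax       ; read the value into eax (the return value)
add     $0x4,%esp         ; give the stack space back
ret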

I’m really not sure why gcc prefers this version when sse3 is available. It seems to be more work than the sse2 version. I tend to believe gcc knows what it’s doing here, but I’d love to learn why.


Gazelle is going to love SSE 4.2

July 18th, 2009 Josh 3 comments

SSE 4.2 includes text processing instructions. In the words of Ars Technica:

Intel has added a number of new instructions to Nehalem and it has sped up others. The 4.2 version of Intel’s SSE vector extensions takes the x86 ISA back to the future just a bit by adding new string manipulation instructions. I say “back to the future” because ISA-level support for string processing is a hallmark of CISC architectures that was actively deprecated in the post-RISC years; typically, when a writer wants to give an example of crufty old corners of the x86 ISA that have caused pain for chip architects, string manipulation instructions are what he or she reaches for. But the new SSE 4.2 string instructions are aimed at accelerating XML processing, which makes them Web-friendly and therefore modern (i.e., not crufty).

I chuckled a bit when I read this. I’m not very purist when it comes to hardware. If these instructions will make my parsers faster, then they sound great to me!

The four new instructions are:

  • pcmpestri: packed compare of explicit length strings, returning index
  • pcmpestrm: packed compare of explicit length strings, returning mask
  • pcmpistri: packed compare of implicit length strings, returning index
  • pcmpistrm: packed compare of implicit length strings, returning mask

The variants are as follows:

  • implicit length strings are null-terminated, while explicit length strings have their lengths passed separately (in the EAX and EDX registers).
  • they can return an index into the source string (if you were searching for something) or a mask (if you wanted to test each character of the input).

All four let you scan a 128-bit SSE register (treating it as either 16 8-bit characters or 8 16-bit characters) and perform all kinds of searches/comparisons. The instructions are configurable; you supply a control word that specifies all of the different variations of the instructions. For example, are the input values signed or unsigned, are we comparing against ranges or specific values, etc.
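To make that a bit more concrete, here is a plain C model of one configuration as I understand it: pcmpistri with 8-bit characters in “equal any” mode, returning the lowest matching index. This is only a scalar sketch of the semantics, not the instruction itself, and the name find_any16 is made up. In real code you would reach for the _mm_cmpistri intrinsic from <nmmintrin.h> instead.

```c
/* Scalar model of pcmpistri, "equal any" mode, 8-bit characters:
 * return the index of the first character of `text` that appears
 * anywhere in `set`, or 16 if there is none. Both operands stand in
 * for 16-byte registers, and a null byte ends either one early
 * (that's the "implicit length" part). */
int find_any16(const char *set, const char *text) {
  for (int i = 0; i < 16 && text[i] != '\0'; i++) {
    for (int j = 0; j < 16 && set[j] != '\0'; j++) {
      if (text[i] == set[j]) {
        return i;
      }
    }
  }
  return 16;  /* no match within this 16-byte block */
}
```

For a parser, this is the “skip ahead to the next whitespace or delimiter” loop, except that the hardware does all 16 positions with a single instruction.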

The throughput of these instructions is good (one can be issued every 2 cycles), but the latency is annoyingly high (9 cycles). This means that you have to wait nine cycles after issuing the instruction before you can use the result. It’s hard to think of too many useful things you can execute in parallel while you’re waiting for that answer. As a side note, these figures come from the Intel® 64 and IA-32 Architectures Optimization Reference Manual, which says that the latency number is a worst case estimate:

Actual performance of these instructions by the out-of-order core execution unit can range from somewhat faster to significantly faster than the latency data shown in these tables.

I’m not enough of a hardware geek to know what to actually expect.

Still, that’s nine cycles to wait before getting a lot of really useful information. In addition to returning the index or mask, the instructions set several of the flags in useful ways.
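Here is a back-of-the-envelope model of why the latency matters more than the throughput for a parser. It takes the two figures above (9-cycle latency, one issue every 2 cycles) at face value and ignores everything else the core is doing, so treat it as an idealized sketch. The function names are made up.

```c
/* Figures quoted above, taken at face value. */
enum { LATENCY = 9, ISSUE_INTERVAL = 2 };

/* n compares that don't depend on each other: they overlap in the
 * pipeline, so only the last one's latency is fully exposed. */
unsigned independent_cycles(unsigned n) {
  return n == 0 ? 0 : ISSUE_INTERVAL * (n - 1) + LATENCY;
}

/* n compares where each one needs the previous result before it can
 * start: the latencies simply add up. */
unsigned dependent_cycles(unsigned n) {
  return LATENCY * n;
}
```

Under this model, four independent compares finish in 15 cycles, while four dependent ones take 36. A parser tends to look like the second case, since what it does next depends on what it just found.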

So what processors have SSE 4.2? Or in other words, how long will my impatient self have to wait to try them out? Penryn, the second-gen Core 2 that debuted in 2007/2008, only goes as far as SSE 4.1. (It uses a “45 nm process”, which I’m sure means something to hardware geeks but not to me.) In any case, it’s not the Core 2 that’s inside the MacBook Pro sitting on my lap. SSE 4.2 first shows up in the new Nehalem.

Categories: Gazelle, Hardware Tags: