Anyone who has used GNU Make on a nontrivial project surely has at some point wanted to separate inputs from outputs and intermediates. For example, take a C++ shared library using GCC (using -MMD to autogenerate GNU make dependency information):

  • baz.cxx
  • bar.cxx
  • baz.o
  • baz.dep
  • bar.o
  • bar.dep
  • (symlink to
  • (symlink to

First, anyone who hasn’t already should read How Not to Use VPATH. Now, suppose that I want to put intermediates into $(CURDIR)/tmp and outputs into $(CURDIR)/lib, and those directories must be created as a part of the build. That means that at some point in time, the build script needs to execute mkdir tmp and mkdir lib, because GCC will not auto-create them. These are the solutions I’ve come upon (aside from the obvious but useless “don’t use make”):

  1. Order the rules so that the mkdirs occur before the g++s. Of course, this breaks parallel builds, so it’s not really an option. But it’s the most obvious solution, and it doesn’t break in serial builds.
  2. Modify the rule to depend upon the directory (tmp/%.o : %.cxx tmp). There’s only one call to mkdir, but now we get a new sort of error. Whenever the directory gets modified, it triggers recompilation of all of the objects!
  3. Make the compilation rule (tmp/%.o : %.cxx) execute the mkdir before executing the g++. There’ll be a lot of spurious calls to mkdir, but it should never break.
  4. Make a local g++ wrapper script that creates the directory before compiling. Invoke the wrapper script from the Makefile instead of invoking the raw compiler.

Personally, I don’t think that any of those solutions is that great. The problem, as I see it, is the combination of how Make handles dependencies (by modified-timestamp) and how directory timestamps work on Linux (if you add or remove files, it “modifies” the directory). I’ve dealt with it using solutions #1-3, but I haven’t tried #4. I suppose for a large project that #4 isn’t such a bad idea… you can wrap up all of your system-wide rules into it, and then the make output gets shorter, as well. Take this as an example:


#!/bin/bash
# Create the directory named by the -o argument before compiling.
prev=""
for arg in "$@" ; do
    if [ "$prev" = "-o" ] ; then
        mkdir -p "$(dirname "$arg")"
    fi
    prev="$arg"
done

g++ -c -ansi -pedantic -std=c++98 \
    -O3 -m{arch=core2,sse2,fpmath=sse,inline-all-stringops} \
    -W{all,extra,format=2,write-strings,init-self,error} \
    -W{cast-align,cast-qual,pointer-arith,old-style-cast,overloaded-virtual} \
    -f{omit-frame-pointer,strict-aliasing,fast-math,tracer} \
    -I{/usr/local/include,/usr/java/jdk1.5/include,/usr/java/jdk1.5/include/linux} \
    "$@"

Ok, now I’m convinced. I do like #4… It seems like a lot of extra work for small one-off projects, but it’s a marginal cost for a larger environment (where you end up doing more complex tasks like code generation, combining disparate projects into a single build location, automated source-control integration).

I’m moving this week, which means tearing down my living room server. So much for my uptime!

 17:59:28 up 310 days, 22:29,  4 users,  load average: 0.41, 1.09, 1.33

In the meantime, I’ll be backing up my files over SSH to my dad’s homemade NAS box… we’ll see how long that takes with my RCN connection.

Here are a few tidbits that I’ve run into when building latency-sensitive applications in Linux…

gcc -c -fno-guess-branch-probability

If GCC is not provided with branch probability data (either from profile feedback via -fprofile-generate and -fprofile-use, or from __builtin_expect), it will run its own estimate. Unfortunately, very minor changes in source code can cause very major changes in these estimates, which can foil someone attempting to measure the effect of source-level optimizations. Personally, I prefer -fprofile-generate and -fprofile-use for maximum performance, but the next best thing is to have consistent results while developing.

Also, see this brief gcc mailing list discussion about the topic.
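For reference, here is a minimal sketch of what supplying a branch hint by hand looks like. The LIKELY/UNLIKELY macro names and the sum_positive function are made up for illustration; __builtin_expect itself is a GCC (and Clang) builtin, so this is not portable to other compilers.

```cpp
#include <cassert>

// Hypothetical convenience macros around the GCC builtin.
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Sums the non-negative entries, telling the compiler that negative
// entries are expected to be rare.
int sum_positive(const int *values, int count) {
    assert(values != 0 || count == 0); // precondition
    int total = 0;
    for (int i = 0; i < count; ++i) {
        if (UNLIKELY(values[i] < 0)) continue; // hinted as the cold path
        total += values[i];
    }
    return total;
}
```

The hint only changes how the code is laid out, never what it computes, so a wrong hint costs speed but not correctness.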

ld -z now
Equivalent to gcc -Wl,-z,now (which is probably preferable, since I write C++ and always use the g++ frontend), this tells the dynamic linker to resolve symbols on program start instead of on first use. I’ve noticed that sometimes, the first run through a particular code path results in a latency of 800 milliseconds. That’s huge; that’s big enough for a human to notice with the naked eye. For server processes, that generally also means that the first request serviced on startup blows chunks, latency-wise. While it may not make a huge difference in the grand scheme of things, it’s nice to be able to pay the cost of symbol lookup up front instead of at first use.

Most software developers have stumbled across the humble C atoi() function. And for most simple jobs, it is sufficient. However, I’ve actually run into cases where it becomes the performance bottleneck, so I’ve spent a little bit of time playing around with various optimizations.

So, what are the problems with the standard atoi() that would cause me to even think about reimplementing it myself for a custom case?

  • It doesn’t provide useful error indicators. It just returns 0 on failure. It also doesn’t consider trailing non-digits to be errors, so its return code can’t really be trusted for total conversion, anyways.
  • It’s at least an order of magnitude slower than rolling your own.

Overall, I’d probably stick with boost::lexical_cast or std::istringstream for the average case, where the performance cost is negligible. Since I always use C++, I don’t see any reason to drop down to strtol, but I suppose I would use that if I were writing C code.
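To make the error-reporting complaint concrete, here is a rough sketch of a checked conversion built on the standard strtol. The parse_int name and its bool-plus-out-parameter shape are my own invention for illustration, not part of any library.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: succeeds only if the whole string is a base-10
// integer that fits in an int. Unlike atoi(), failure is distinguishable
// from a legitimate 0, and trailing non-digits are rejected.
bool parse_int(const char *str, int &out) {
    assert(str != 0); // precondition: a real C string
    errno = 0;
    char *end = 0;
    const long v = std::strtol(str, &end, 10);
    if (end == str) return false;                 // no digits at all
    if (*end != '\0') return false;               // trailing non-digits
    if (errno == ERANGE) return false;            // does not fit in a long
    if (v < INT_MIN || v > INT_MAX) return false; // does not fit in an int
    out = static_cast<int>(v);
    return true;
}
```

Note that strtol still skips leading whitespace, so " 42" is accepted here; tightening that would take one more check.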

A Simple Replacement

For your edification, here’s a simple replacement function that handles most of the stock C version:

int atoi(const char *str) {
    while (*str == ' ') ++str; // skip leading whitespace
    int sign = 1;
    switch (*str) { // handle leading +/-
        case '-': ++str; sign = -1; break;
        case '+': ++str; break;
    }
    int value = 0;
    while (char c = *str++) {
        if (c < '0' || c > '9' || value < 0) return 0; // non-digit, or overflow
        value *= 10;
        value += c - '0';
    }
    if (value < 0) return 0; // integer overflow
    return value * sign;
}

Optimizing my Special Case

In the cases I care about, there are a few additional constraints I can typically take advantage of:

  • Negative numbers can show up, but positive numbers never have a leading + character. Leading 0s never show up.
  • Size of the string is known a priori, so I can work from the back to the front.
  • Maximum size of the string is known (a 32-bit integer isn’t ever going to be more than 10 characters), so I can manually unroll the loop.
  • The caller has validated that there are no non-digits (or doesn’t care if an input error cascades to an output error silently), so I don’t need to check if all values are between 0 and 9 inclusive.

So here’s the optimized version. For some reason, wrapping this up in a function object (called many times per instantiation) performed significantly better than using a plain old global function.

class atoi_func
{
public:
    atoi_func(): value_() {}

    inline int value() const { return value_; }

    inline bool operator() (const char *str, size_t len)
    {
        value_ = 0;
        int sign = 1;
        if (str[0] == '-') { // handle negative
            sign = -1;
            ++str; // skip the sign so that len counts digits only
            --len;
        }

        switch (len) { // handle up to 10 digits, assume we're 32-bit
            // every case deliberately falls through to the next one
            case 10:    value_ += (str[len-10] - '0') * 1000000000;
            case  9:    value_ += (str[len- 9] - '0') * 100000000;
            case  8:    value_ += (str[len- 8] - '0') * 10000000;
            case  7:    value_ += (str[len- 7] - '0') * 1000000;
            case  6:    value_ += (str[len- 6] - '0') * 100000;
            case  5:    value_ += (str[len- 5] - '0') * 10000;
            case  4:    value_ += (str[len- 4] - '0') * 1000;
            case  3:    value_ += (str[len- 3] - '0') * 100;
            case  2:    value_ += (str[len- 2] - '0') * 10;
            case  1:    value_ += (str[len- 1] - '0');
                value_ *= sign;
                return value_ > 0; // as written, only positive results report success
            default:
                return false;
        }
    }

private:
    int value_;
};


I compared this optimized version with the standard library function by running each 1 million times in a row. I know that this is a poor benchmark, particularly because a tight loop hides the impact of instruction cache misses. However, I expect that this only pessimizes my specialization, which I would expect to exhibit much better locality due to aggressive inlining. Here are the times:

standard atoi()
79142 milliseconds, or approximately 79 nanoseconds per iteration amortized.
class atoi_func
131 milliseconds. On my machine, that works out to about 4 clock cycles, which can’t be right. My best guess is that the numbers are too small for my clock resolution. I’ll take this to work and try on our machines there, they have much higher resolution for clock_gettime(CLOCK_REALTIME).
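For anyone who wants to repeat the measurement, a timing loop along these lines is roughly what I mean by running a function many times in a row. This is only a sketch: bench_atoi_ns is a made-up name, and it uses std::chrono::steady_clock (C++11) rather than clock_gettime. Rotating through several inputs is my own precaution, since a compiler may be allowed to hoist a repeated call with identical arguments out of the loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>

// Hypothetical harness: average nanoseconds per std::atoi call.
// The checksum is handed back to the caller so the calls cannot be
// discarded as dead code.
double bench_atoi_ns(const char *const *inputs, std::size_t input_count,
                     long iterations, long &checksum) {
    assert(input_count > 0 && iterations > 0); // preconditions
    typedef std::chrono::steady_clock clock_type;
    checksum = 0;
    const clock_type::time_point start = clock_type::now();
    for (long i = 0; i < iterations; ++i) {
        checksum += std::atoi(inputs[i % input_count]);
    }
    const clock_type::time_point stop = clock_type::now();
    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return total_ns / static_cast<double>(iterations);
}
```

Dividing the total by the iteration count gives an amortized figure only; it says nothing about the cost of a single cold call.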

Sometime in the recent past, I ran across a bit of code that looked something like this:

// C++ code to parse an ASCII message sent across the network
void parseMessage(std::string serializedObject)
{
    std::string sString = serializedObject.substr(0, 1); // first character codifies the data type
    if (sString == "1") { ... }
    else if (sString == "2") { ... }
    else if (sString == "3") { ... }
    else { throw std::runtime_error("invalid message '" + serializedObject + "'"); }
}

Seriously? Let’s enumerate the ways that this could be improved:

  1. The first character describes the object type. A std::string isn’t necessary, a single character will do just fine.
  2. if/elseif/elseif/else should really be replaced with a switch block. If I were writing this in Python, I’d probably create a Dict of first-class functions and call the appropriate one. But in C++, switch is the closest thing; it’s more efficient than chained ifs, and (more importantly) IMHO its intent is clearer.
  3. The declaration std::string sString tells us three times that it is a string, but it never tells us what it means. At best, it’s a combination of misuse of Hungarian Notation and a lazy variable name. What’s worse—because it’s only ever a single character (and not a full string), the declaration is lying to us three times.
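As an aside on item 2, the Python-style table of functions has a rough C++ counterpart: a std::map from the type character to a function pointer. The handler names and their string return values below are invented purely so the sketch can be exercised; it is one possible shape, not a recommendation over switch.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical handlers, one per message type.
std::string handleFoo(const std::string &body) { return "foo:" + body; }
std::string handleBar(const std::string &body) { return "bar:" + body; }
std::string handleBaz(const std::string &body) { return "baz:" + body; }

typedef std::string (*Handler)(const std::string &);
typedef std::map<char, Handler> HandlerTable;

// Built on first use so that there is no static initialization order issue.
const HandlerTable &handlerTable() {
    static HandlerTable table;
    if (table.empty()) {
        table['1'] = handleFoo;
        table['2'] = handleBar;
        table['3'] = handleBaz;
    }
    return table;
}

// Looks up the handler for the first character and passes it the rest.
std::string dispatchMessage(const std::string &serializedObject) {
    if (serializedObject.empty())
        throw std::runtime_error("empty message");
    const HandlerTable &table = handlerTable();
    const HandlerTable::const_iterator it = table.find(serializedObject[0]);
    if (it == table.end())
        throw std::runtime_error("invalid message '" + serializedObject + "'");
    assert(it->second != 0); // the table never stores null handlers
    return it->second(serializedObject.substr(1));
}
```

The lookup costs more than a switch, which is part of why I still reach for switch in C++.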

So what would I do to clean this up?

// C++ code to parse an ASCII message sent across the network
void parseMessage(std::string serializedObject)
{
    const char TYPE_FOO = '1'; // maybe these are defined elsewhere...
    const char TYPE_BAR = '2';
    const char TYPE_BAZ = '3';

    // an empty message maps to '\0', which falls through to the default case
    char messageType = serializedObject.size() ? serializedObject.front() : '\0';
    switch (messageType)
    {
        case TYPE_FOO: /* ... */ break;
        case TYPE_BAR: /* ... */ break;
        case TYPE_BAZ: /* ... */ break;
        default:
            throw std::runtime_error("invalid message '" + serializedObject + "'");
    }
}

Ok, so that rant hit more than just variable names. I guess you could argue with me about some of the other style issues, but there’s no forgiveness for sString. That’s a lazy mind, and IMHO I’d never want to work next to someone who bothered checking that into source control.

For some more references on naming variables, functions, etc… I’d recommend checking out these pages:

  • Tim Ottinger’s Rules for Variable and Class Naming. I don’t recall where I ran into this, but I agree with most of what he’s saying. The main place where I would extend what he says is when he describes using names from either the Problem Domain or the Solution Domain. It’s quite likely that I just haven’t worked on systems of the same scope as him, but my general opinion is thus:
    • If it’s relevant to the business, it should always be described in terms of the problem domain.
    • If it’s a necessary piece of getting the business parts to work, but it is not at all relevant to the business, it should always be described in terms of the solution domain.
    • Some code may fit in between… choose the solution domain, if possible. But for christsake, don’t call something a FactoryManager!

    That way, properly-layered software can be described at a high level entirely using business-relevant concepts and structure. Anyone looking at the implementation details and lower levels will naturally need to understand basic software development concepts such as data structures and algorithms. It’s really a matter of identifying your audience and writing specifically to them. As to the people who name their design patterns explicitly… fie on you. It’s a pattern because of its behavior, not because of its name. The executable code should eschew pattern-speak if possible, and focus more on the problem domain.

  • Ubiquitous Language by Eric Evans in his book Domain-Driven Design (also see the page, spartan though it may be). Evans’ entire book is based around making the problem domain explicit in any software implementation, and a Ubiquitous Language is the first way to describe it. I worked at an ISP in the past, and differentiating between Customer, Account, Device, Access Point, etc. was critically important for clarifying how the system was supposed to work, particularly when the time came to add multiple Devices per Customer, and the structural code changes practically wrote themselves. The best code that I wrote there all embedded that knowledge into the code itself, so anyone reading it would learn that part of the domain by learning the code.

I’ve spent the better part of the evening hacking on my guitar and browsing around YouTube. I stumbled across the old “Canon in D” mystery metal guitar movie, which prompted me to make a little list here of some of the better guitarists I’ve seen on YouTube.

Canon in D

  • JerryC, the original arranger.
  • funtwo, which was the first version that I remember seeing. Check out his Wikipedia entry for the full story.
  • Not sure who this is. There are more notes, but that doesn’t make it any better.
  • Rob Paravonian (comedian) rants about Canon in D as a grade-school cellist, and its omnipresence in pop music.
  • Guitar Hero parody… When the funtwo cover came out, about 10 million amateur guitarists decided to post their versions of the Pachelbel classic as well. I remember a really good parody slipped in between piles of utter garbage, of one guy playing a Guitar Hero guitar to the same audio track. I can’t seem to find it anymore—either it’s been lost in the bowels of the YouTube archives, someone pulled it, or it’s really poorly titled (and thus not searchable).

Kurikinton Fox

Being a diehard Final Fantasy fan, I of course ended up at some point searching for guitarists shredding Final Fantasy… Lo and behold, I found Kurikinton Fox. Check these out:

Kurikinton Fox has a few of his own songs posted up on YouTube as well.

Zack Kim

This guy is nuts. He plays two guitars at once like a piano (tapping on the fretboard), and he has a sweet haircut.


Here are a few other guitar tidbits I’ve stumbled upon that are worth sharing:

I got into a discussion with a friend today about some C++ hackery using either template meta-programming or clever macro tricks. Essentially, I’m abusing the heck out of a set of header files by defining my own grammar to create constants. It takes 3 passes to go through it entirely.

  1. Interface, constants declared as extern. Everyone can include this.
  2. Implementation, constants defined locally. Only one file should use this. If you use switch statements on these constants, they must be defined in the same compilation unit, so whatever source code wants to use switch statements should have this.
  3. Stringification, where I use global static structures with constructor side-effects to populate a std::map, and a single function that maps unsigned integer values to the variable names (useful mostly for debugging). This part is definitely the biggest hack.

Why didn’t I just use enums? Because I want users to be able to inject their own constants into this set. Logically, it is an enum, but the implementation jumps through these silly hoops in order to make it extensible and add the name lookups.

This is a hack. It is a weakness that the language doesn’t provide an easier way to do extensible enumerations. It is awful to explain to a new person. But it’s beautiful in how it actually ends up working.
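To give a flavor of the third pass, here is a stripped-down sketch of the registration trick. It is not my actual header grammar: DEFINE_CONSTANT, ConstantRegistrar, constantNames and the color constants are all invented for this example, and the three passes are collapsed into one macro.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<unsigned int, std::string> NameMap;

// Function-local static, so the map exists before any registrar runs.
NameMap &constantNames() {
    static NameMap names;
    return names;
}

// The constructor's side effect is the whole point: it records the name.
struct ConstantRegistrar {
    ConstantRegistrar(unsigned int value, const char *name) {
        assert(name != 0);
        constantNames()[value] = name;
    }
};

// Hypothetical one-pass version: define the constant and register its name.
#define DEFINE_CONSTANT(NAME, VALUE)                 \
    const unsigned int NAME = VALUE;                 \
    static ConstantRegistrar registrar_##NAME(VALUE, #NAME)

DEFINE_CONSTANT(COLOR_RED, 1);
DEFINE_CONSTANT(COLOR_GREEN, 2);
DEFINE_CONSTANT(USER_COLOR_MAUVE, 100); // "injected" by a user of the set

// Maps a value back to its variable name, mostly useful for debugging.
std::string constantName(unsigned int value) {
    const NameMap::const_iterator it = constantNames().find(value);
    if (it == constantNames().end()) return std::string("<unknown>");
    return it->second;
}
```

Because each constant is a const unsigned int defined in the same translation unit, it can still be used as a case label in a switch.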

The next logical step, of course, is to add a way to convert these integers into their ASCII representation. The most obvious approach goes something like this:

std::string itoa(unsigned int i) {
    std::ostringstream os; // needs <sstream>; an output stream, since we write into it
    os << i;
    return os.str();
}

But that’s dog-slow. I want something faster (ok… maybe I don’t need anything faster, but it’s time to throw down the gauntlet and try!) Here’s where the template meta-programming comes in. I need two things:

  • A count of the digits of a given number (e.g. count(9999) = 4)
  • A way to represent fixed-precision numbers (up to the maximum number of digits representable in 32-bits, in this case).

Solution, Part 1

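template <unsigned int N>
struct count_digits
{
    const static unsigned int value = count_digits<N/10>::value + 1;
};

#define COUNT_DIGITS_BASE_CASE(N)           \
template <>                                 \
struct count_digits<N>                      \
{                                           \
    const static unsigned int value = 1;    \
}

COUNT_DIGITS_BASE_CASE(0); COUNT_DIGITS_BASE_CASE(1); COUNT_DIGITS_BASE_CASE(2); COUNT_DIGITS_BASE_CASE(3); COUNT_DIGITS_BASE_CASE(4);
COUNT_DIGITS_BASE_CASE(5); COUNT_DIGITS_BASE_CASE(6); COUNT_DIGITS_BASE_CASE(7); COUNT_DIGITS_BASE_CASE(8); COUNT_DIGITS_BASE_CASE(9);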

Ok. 10 base cases because we’re counting in base-10, and a recursive case that divides by 10 and adds one. Sounds like a logarithm, right? Good.
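As a cross-check on the recursion, the same digit count can be written as a constexpr function and verified at compile time. This assumes a C++11 compiler, which is newer than what the rest of this post targets, and count_digits_cx is a name I made up to avoid clashing with the template.

```cpp
// C++11 sketch: the same "divide by 10 and add one" recursion, with the
// ten single-digit base cases folded into one comparison.
constexpr unsigned int count_digits_cx(unsigned int n) {
    return n < 10 ? 1 : count_digits_cx(n / 10) + 1;
}

// Checked by the compiler, so nothing here costs anything at run time.
static_assert(count_digits_cx(0) == 1, "a lone zero is one digit");
static_assert(count_digits_cx(9) == 1, "largest single-digit value");
static_assert(count_digits_cx(10) == 2, "smallest two-digit value");
static_assert(count_digits_cx(9999) == 4, "the example from the list above");
static_assert(count_digits_cx(4294967295u) == 10, "largest 32-bit value");
```

The result is floor(log10(n)) + 1 for n > 0, which is the logarithm hiding in the recursion.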

Solution, Part 2

Now I need to generate the actual strings:

template <unsigned int N>
struct static_itoa
{
    // individual digits
    const static char digits[11];

    // string length
    const static unsigned int size = count_digits<N>::value;

    // valid NULL-terminated string
    const static char *value;
};

template <unsigned int N>
const char static_itoa<N>::digits[11] = {
    ((N / 1000000000) % 10) + '0',
    ((N / 100000000) % 10) + '0',
    ((N / 10000000) % 10) + '0',
    ((N / 1000000) % 10) + '0',
    ((N / 100000) % 10) + '0',
    ((N / 10000) % 10) + '0',
    ((N / 1000) % 10) + '0',
    ((N / 100) % 10) + '0',
    ((N / 10) % 10) + '0',
    ((N / 1) % 10) + '0',
    '\0'
};

// skip the leading zeros: point at the last `size` digits before the terminator
template <unsigned int N>
const char *static_itoa<N>::value = digits + sizeof(digits) - size - 1;

There it is. Works with GCC 4.0.3 when being referenced in a single location. I need to play more to see if I can make it work as a general header file without incurring the wrath of the linker (multiply-defined symbols!), but this works quite nicely:

#include <cstdio>

int main() {
    const unsigned int x = 16;
    printf("%u: size = %u, str = %s\n", x,
        static_itoa<x>::size, static_itoa<x>::value);
    return 0;
}

The other day I was thinking wayyyy too hard about nothing in particular, and I thought about the simple ambiguity of naming ourselves.

From the movie Orgasmo:

Maxxx Orbison:
What’s your name, again?
I am Sancho.
Maxxx Orbison:
Look, I get a lot of people auditioning all the time. What makes you think that you’d be good enough for porno?
I am Sancho.
Maxxx Orbison:
Great… but what do you do?
What do I do? I am Sancho.
Maxxx Orbison:
And there are many Jeffs in the world, and many Toms as well. But I… am Sancho.
Maxxx Orbison:
Are you Sancho? No you are not. Neither is Scott Baio Sancho. Frank Gifford is not Sancho. But I…
Maxxx Orbison:
You… are Sancho!
That’s right.
Maxxx Orbison:
Okay, you’re hired.

In a nutshell, Sancho is stating something more abstract than just his name. I can say “I am Tom Barta”, but typically I just mean “My name is Tom Barta.” This guy isn’t just named Sancho, he is Sancho, and there is no other.

Think about it: what do you really mean if you go up to a stranger and ask, “Who are you?” The answer could be “John”, “a businessman”, “your neighbor”, or “a proud Republican”. Similarly, if you approach someone and say “Hi, I am John,” I can imagine it would take a tiny bit more processing than saying “Hi, my name is John.” If your name is rare (like Moon Unit or Apple), it’s even more important to remove any ambiguity. “I am Apple” sounds like nonsense or pidgin.

There’s a Nerd-Tangent Hidden in Every Real-Life Thought

There is a parallel between this and programming. The advent of object-oriented programming has popularized the notion of “object identity” versus “object equality”. I can have two objects sitting in memory with identical data. Sometimes, that’s just a programming error, and the same object has been copied unnecessarily (this happens frequently with caching or persistent systems). Sometimes, the two objects genuinely are different. How can I tell if one is just an alias, or if they are logically distinct?

It depends, of course. If I am looking at Value Objects, there’s generally not a reason to distinguish them by identity. The color Red is always Red, even if I have two Reds. However, with Entities, identity is of utmost importance. Two John Smiths in a customer database represent different people. Another way to think about it is in the context of the Flyweight pattern. The two Reds could be replaced with a flyweight without affecting the program. However, the John Smiths couldn’t.
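Here is a small C++ sketch of that distinction. The Color and Customer types, and the choice of a numeric id as the Entity's identity, are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Value Object: two Colors with the same data are interchangeable.
struct Color {
    int r, g, b;
};
bool operator==(const Color &a, const Color &b) {
    return a.r == b.r && a.g == b.g && a.b == b.b;
}

// Object identity in C++: are these the very same object in memory?
bool isSameObject(const Color &a, const Color &b) {
    return &a == &b;
}

// Entity: the id, not the data, says who this is.
struct Customer {
    unsigned int id;
    std::string name;
};
bool sameCustomer(const Customer &a, const Customer &b) {
    return a.id == b.id;
}

void identityDemo() {
    const Color red1 = {255, 0, 0};
    const Color red2 = {255, 0, 0};
    assert(red1 == red2);              // equal...
    assert(!isSameObject(red1, red2)); // ...but two distinct objects

    const Customer john1 = {1, "John Smith"};
    const Customer john2 = {2, "John Smith"};
    assert(!sameCustomer(john1, john2)); // same data, different people
}
```

Replacing red2 with a reference to red1 (a flyweight, in effect) would not change any answer the program cares about, whereas merging the two Customers would.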

Enter Programming Languages

Of course, programming languages that use objects must have some way of distinguishing object identity from object equality.

Language | Identity | Equality
C, C++ | &a == &b | a == b
Java | Anyone care to fill this one in for me? I’m unaware of the semantics of == and equals(). |
PHP | nothing! | a == b (coerce types to match) or a === b (check types)
Python | a is b | a == b

Whoops! Looks like PHP doesn’t even have object identity! I’d like for someone to be able to refute this, but I haven’t been able to figure it out. PHP documentation claims === means identical and == means equal, but that certainly doesn’t match the notion of object identity I just explained. Sadly, this essentially means that object identity will never truly work in PHP. Instead, we are left with “equal” and “more equal”.

Does it Matter?

In the big picture, I don’t think it hurts PHP programmers to lose object identity. Most PHP applications are business-logic interfaces sitting on top of relational databases. What’s special about the RDBMS in this context? Well, object identity doesn’t exist. I know Postgres has oid and there are probably others, but using them for general applications seems to be unfavorable. In a database table, objects (tuples/records/rows) are identified by a primary key that disallows duplicates and frequently uses auto-incrementing integers. It’s a trivial solution, really, to just assign a number to everyone who walks in the door (until you run out of numbers, of course).

Since the database enforces this uniqueness, I know that two customers both named John Smith will at least have different customer IDs. Social Security, credit cards, university student IDs, and phone numbers all revolve around this notion of unique numeric identity. Consequently, almost any PHP application using a RDBMS can simply piggyback upon the database’s IDs and trivially state that === is now identical to ==.

I’m going to San Francisco for a few days to see my uncle graduate (his 3rd master’s degree). I’m definitely looking forward to another chance to go to Hunan Home’s restaurant for their pot stickers, and of course, Jackie Chan’s (self-proclaimed) brother.

Recently, I took a look at ADOdb, ADOdb-Lite, and PDO to find a replacement for PEAR::DB. I’ve looked at PEAR::MDB2, and not been happy with it for the same reasons I’m not happy with PEAR::DB:

  1. Since the application is tied to Postgres, there’s not much benefit from getting a database-agnostic driver. I’d rather have something that supports my database well, instead of all databases in a mediocre and outdated fashion.
  2. Not directly a complaint about PEAR itself, but my manager prefers wrapping existing OSS solutions to modifying the source (less concern about patches). Wrapping around PEAR::DB introduces all sorts of efficiency problems, since whatever data processing happens gets done twice. So it is no fault of PEAR’s, but it doesn’t fit with the development method of “wrap” vs “inject”.
  3. Following from #1, PEAR supports older versions of PHP/Postgres than I need, with the result that it won’t use all of the modern functionality I want.

For those of you who are curious, I did discover that ADOdb does use the existing pg_*() functions. Plus, the ADOdbs both target database independence through API-level SQL generation, instead of raw SQL. I ended up going with PDO for my project with a few modifications:

  • Exceptions are enabled by default
  • query() can take parametrized statements (why isn’t this done by default?)
  • Fetchmode is associative by default, so I can foreach a record
  • Statement execute() returns $this instead of an error code (I’m using exceptions anyways) so I can chain other functions onto it
  • I’ve added a few fetch*() functions for clarity in common use-cases

All of these modifications are intended to improve PDO’s overall usability (interfaces aren’t just for end users).

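class PDO_ extends PDO {

  function __construct($dsn, $username, $password) {
    parent::__construct($dsn, $username, $password);
    // assumed: the defaults listed above (exceptions on, associative fetches)
    $this->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $this->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);
  }

  function prepare($sql) {
    $stmt = parent::prepare($sql, array(
      PDO::ATTR_STATEMENT_CLASS => array('PDOStatement_')
    ));
    return $stmt;
  }

  function query($sql, $params = array()) {
    $stmt = $this->prepare($sql);
    $stmt->execute($params);
    return $stmt;
  }
}

class PDOStatement_ extends PDOStatement {

  function execute($params = array()) {
    parent::execute($params);
    return $this; // return the statement itself so calls can be chained
  }

  function fetchSingle() {
    return $this->fetchColumn(0);
  }

  function fetchAssoc() {
    $data = array();
    // numeric fetch here, since the rows are indexed by position below
    while ($row = $this->fetch(PDO::FETCH_NUM)) {
      $data[$row[0]] = $row[1];
    }
    return $data;
  }
}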

The end result of this is that I can use the connection much more naturally. It’s not quite a fluent interface, but you can see that it’s improved:

chained fetches after statement execution:
$stmt = $db->prepare(
    'SELECT first, last FROM users WHERE uid = ?');
$user1 = $stmt->execute(array(1))->fetch();
$user2 = $stmt->execute(array(2))->fetch();
parametrized direct querying:
$uid = $db->query(
    'SELECT uid FROM users WHERE first = ? AND last = ?',
    array($first, $last));
fetching an associative array of $uid => $username:
$users = $db->query('SELECT uid, username FROM users')
    ->fetchAssoc();
selecting a single aggregate:
$count = $db->query('SELECT COUNT(*) FROM users')
    ->fetchSingle();

I also add RAII-style transactional programming. This object can be created at the top of a function and committed at the bottom of the function. If an exception is thrown in between, it will automatically roll back the transaction that it started upon creation. This is useful in cases where exceptions are not being handled, but must bubble up to higher layers in the code.

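class Transaction {

  private $db = NULL;
  private $finished = FALSE;

  function __construct($db) {
    $this->db = $db;
    $this->db->beginTransaction();
  }

  function __destruct() {
    // still open when the object goes away: undo everything
    if (!$this->finished) {
      $this->db->rollBack();
    }
  }

  function commit() {
    $this->finished = TRUE;
    $this->db->commit();
  }

  function rollback() {
    $this->finished = TRUE;
    $this->db->rollBack();
  }
}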

This little guy is really convenient. Any model-level code that works with the database typically doesn’t need to address database errors (actually, I usually let them bubble up to the global error handler, which can then log it and print an apologetic 500 HTTP response).

function addUser($db, $username, $friend_users) {
  $txn = new Transaction($db);

  $db->query('INSERT INTO users(username) VALUES (?)', array($username));
  $uid = $db->query('SELECT currval(?)', array('username_id_seq'))
    ->fetchSingle();

  $stmt = $db->prepare('INSERT INTO friendships(uid1, uid2) VALUES (?, ?)');
  foreach ($friend_users as $fuid => $fname) {
    $stmt->execute(array($uid, $fuid));
  }

  $txn->commit();
}

No exception handling required at all. If anything throws an error (e.g. an invalid friend), the stack unwinds and $txn automatically rolls back the transaction in its destructor.
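RAII is a C++ idiom to begin with, so for comparison here is roughly the same guard sketched in C++. FakeDb is a made-up stand-in that only records which calls were made; a real connection class would go in its place.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a database connection; it only logs calls.
struct FakeDb {
    std::vector<std::string> log;
    void begin()    { log.push_back("BEGIN"); }
    void commit()   { log.push_back("COMMIT"); }
    void rollback() { log.push_back("ROLLBACK"); }
};

// Begins a transaction on construction; rolls back on destruction unless
// commit() was called first.
class Transaction {
public:
    explicit Transaction(FakeDb &db) : db_(db), finished_(false) { db_.begin(); }
    ~Transaction() { if (!finished_) db_.rollback(); }
    void commit() {
        assert(!finished_); // committing twice is a programming error
        finished_ = true;
        db_.commit();
    }
private:
    Transaction(const Transaction &);            // non-copyable
    Transaction &operator=(const Transaction &);
    FakeDb &db_;
    bool finished_;
};

// Mirrors addUser above: no try/catch, the guard handles the failure path.
void addUser(FakeDb &db, bool friendIsInvalid) {
    Transaction txn(db);
    if (friendIsInvalid) throw std::runtime_error("invalid friend");
    txn.commit();
}
```

The main difference from the PHP version is timing: a C++ destructor is guaranteed to run during stack unwinding, whereas PHP runs it once the last reference to the object goes away.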