You are currently browsing the monthly archive for May 2007.

The other day I was thinking wayyyy too hard about nothing in particular, and I thought about the simple ambiguity of naming ourselves.

From the movie Orgasmo:

Maxxx Orbison:
What’s your name, again?
Sancho:
I am Sancho.
Maxxx Orbison:
Look, I get a lot of people auditioning all the time. What makes you think that you’d be good enough for porno?
Sancho:
I am Sancho.
Maxxx Orbison:
Great… but what do you do?
Sancho:
What do I do? I am Sancho.
Maxxx Orbison:
And…?
Sancho:
And there are many Jeffs in the world, and many Toms as well. But I… am Sancho.
Maxxx Orbison:
And…?
Sancho:
Are you Sancho? No you are not. Neither is Scott Baio Sancho. Frank Gifford is not Sancho. But I…
Maxxx Orbison:
You… are Sancho!
Sancho:
That’s right.
Maxxx Orbison:
Okay, you’re hired.

In a nutshell, Sancho is stating something more abstract than just his name. I can say “I am Tom Barta”, but typically I just mean “My name is Tom Barta.” This guy isn’t just named Sancho, he is Sancho, and there is no other.

Think about it: what do you really mean if you go up to a stranger and ask, “Who are you?” The answer could be “John”, “a businessman”, “your neighbor”, or “a pround Republican”. Similarly, if you approach someone and say “Hi, I am John,” I can imagine it would take a tiny bit more processing than saying “Hi, my name is John.” If your name is rare (like Moon Unit or Apple), it’s even more important to remove any ambiguity. “I am Apple” sounds like nonsense or pidgin.

There’s a Nerd-Tangent Hidden in Every Real-Life Thought

There is a parallel between this and programming. The advent of object-oriented programming has popularized the notion of “object identity” versus “object equality”. I can have two objects sitting in memory with identical data. Sometimes, that’s just a programming error, and the same object has been copied unnecessarily (this happens frequently with caching or persistent systems). Sometimes, the two objects genuinely are different. How can I tell if one is just an alias, or if they are logically distinct?

It depends, of course. If I am looking at Value Objects, there’s generally not a reason to distinguish them by identity. The color Red is always Red, even if I have two Reds. However, with Entities, identity is of utmost importance. Two John Smiths in a customer database represent different people. Another way to think about it is in the context of the Flyweight pattern. The two Reds could be replaced with a flyweight without affecting the program. However, the John Smiths couldn’t.

Enter Programming Languages

Of course, programming languages that use objects must have some way of distinguishing object identity from object equality.

Language Identity Equality
C, C++ &a == &b a == b
Java Anyone care to fill this one in for me? I’m unaware of the semantics of == and equals().
PHP nothing! a == b (coerce types to match) or a === b (check types)
Python a is b a == b

Whoops! Looks like PHP doesn’t even have object identity! I’d like for someone to be able to refute this, but I haven’t been able to figure it out. PHP documentation claims === means identical and == means equal, but that certainly doesn’t match the notion of object identity I just explained. Sadly, this essentially means that object identity will never truly work in PHP. Instead, we are left with “equal” and “more equal”.

Does it Matter?

In the big picture, I don’t think it hurts PHP programmers to lose object identity. Most PHP applications are business-logic interfaces sitting on top of relational databases. What’s special about the RDBMS in this context? Well, object identity doesn’t exist. I know Postgres has oid and there are probably others, but using them for general applications seems to be unfavorable. In a database table, objects (tuples/records/rows) are identified by a primary key that disallows duplicates and frequently uses auto-incrementing integers. It’s a trivial solution, really, to just assign a number to everyone who walks in the door (until you run out of numbers, of course).

Since the database enforces this uniqueness, I know that two customers both named John Smith will at least have different customer IDs. Social Security, credit cards, university student IDs, and phone numbers all revolve around this notion of unique numeric identity. Consequently, almost any PHP application using a RDBMS can simply piggyback upon the database’s IDs and trivially state that === is now identical to ==.

Advertisements

I’m going to San Francisco for a few days to see my uncle graduate (his 3rd master’s degree). I’m definitely looking forward for another chance to go to Hunan Home’s restuarant for their pot stickers, and of course, Jackie Chan’s (self-proclaimed) brother.

Recently, I took a look at ADOdb, ADOdb-Lite, and PDO to find a replacement for PEAR::DB. I’ve looked at PEAR::MDB2, and not been happy with it for the same reasons I’m not happy with PEAR::DB:

  1. Since the application is tied to Postgres, there’s not much benefit from getting a database-agnostic driver. I’d rather have something that supports my database well, instead of all databases in a mediocre and outdated fashion.
  2. Not directly a complaint of PEAR itself, but my manager prefers wrapping existing OSS solutions to modifying the source (less concern about patches). Wrapping around PEAR::DB introduces all sorts of efficiency problems, since whatever data processing happens gets done twice. So not anything for the fault of PEAR, but it doesn’t fit with the development method of “wrap” vs “inject”.
  3. Following from #1, PEAR supports older versions of PHP/Postgres than I need, with the result that it won’t use all of the modern functionality I want.

For those of you who are curious, I did discover that ADOdb does use the existing pg_*() functions. Plus, the ADOdbs both target database independence through API-level SQL generation, instead of raw SQL. I ended up going with PDO for my project with a few modifications:

  • Exceptions are enabled by default
  • query() can take parametrized statements (why isn’t this done by default?)
  • Fetchmode is associative by default, so i can foreach a record
  • Statement execute() returns $this instead of an error code (I’m using exceptions anyways) so I can chain other functions onto it
  • I’ve added a few fetch*() functions for clarity in common use-cases

All of these modifications are intended to improve PDO’s overall usability (interfaces aren’t just for end users).

class PDO_ extends PDO {

  function __construct($dsn, $username, $password) {
    parent::__construct($dsn, $username, $password);
    $this->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  }

  function prepare($sql) {
    $stmt = parent::prepare($sql, array(
      PDO::ATTR_STATEMENT_CLASS => array('PDOStatement_')
    ));

    $stmt->setFetchMode(PDO::FETCH_ASSOC);
    return $stmt;
  }

  function query($sql, $params = array()) {
    $stmt = $this->prepare($sql);
    $stmt->execute($params);
    return $stmt;
  }
}

class PDOStatement_ extends PDOStatement {

  function execute($params = array()) {
    parent::execute($params);
    return $this;
  }

  function fetchSingle() {
    return $this->fetchColumn(0);
  }

  function fetchAssoc() {
    $this->setFetchMode(PDO::FETCH_NUM);
    $data = array();
    while ($row = $this->fetch()) {
      $data[$row[0]] = $row[1];
    }
    return $data;
  }
}

The end result of this is that I can use the connection much more naturally. It’s not quite a fluent interface, but you can see that it’s improved:

chained fetches after statement execution:
$stmt = $db->prepare(
    'SELECT first, last FROM users WHERE uid = ?');
$user1 = $stmt->execute(array(1))->fetch();
$user2 = $stmt->execute(array(2))->fetch();
parametrized direct querying:
$uid = $db->query(
    'SELECT uid FROM users WHERE first = ? AND last = ?',
    array($first, $last));
fetching an associative array of $uid => $username:
$users = $db->query('SELECT uid, username FROM users')
    ->fetchAssoc();
selecting a single aggregate:
$count = $db->query('SELECT COUNT(*) FROM users')
    ->fetchSingle();

I also add RAII-style transactional programming. This object can be created at the top of a function and committed at the bottom of the function. If an exception is thrown in between, it will automatically roll back the transaction that it started upon creation. This is useful in cases where exceptions are not being handled, but must bubble up to higher layers in the code.

class Transaction {

  private $db = NULL;
  private $finished = FALSE;

  function __construct($db) {
    $this->db = $db;
    $this->db->beginTransaction();
  }

  function __destruct() {
    if (!$this->finished) {
      $this->db->rollback();
    }
  }

  function commit() {
    $this->finished = TRUE;
    $this->db->commit();
  }

  function rollback() {
    $this->finished = TRUE;
    $this->db->rollback();
  }
}

This little guy is really convenient. Any model-level code that works with the database typically doesn’t need to address database errors (actually, I usually let them bubble up to the global error handler, which can then log it and print an apologetic 500 HTTP response).

function addUser($db, $username, $friend_users) {
  $txn = new Transaction($db);

  $db->query('INSERT INTO users(username) VALUES (?)', array($username);
  $uid = $db->query('SELECT currval(?)', array('username_id_seq'))
    ->fetchSingle();

  $stmt = $db->prepare('INSERT INTO friendships(uid1, uid2) VALUES (?, ?)
  foreach ($friend_users as $fuid => $fname) {
    $stmt->execute(array($uid, $fuid));
  }
  $txn->commit();
}

No exception handling required at all. If anything throws an error (e.g. an invalid friend), the stack unwinds and $txn automatically rolls back the transaction in its destructor.

Microsoft has recently introduced Popfly, a “fun, easy way to build and share mashups, gadgets, Web pages, and applications”. While it looks like it may be interesting (I don’t have a Passport login, so I couldn’t test-drive any of it), the most humorous part of the website is the laughable About Us section.

Lofty Goals

We believe that if you can send an email, you should be able to build and personalize your own website, mashup, social networking site, or blog.

This certainly isn’t a rare thing to see nowadays. It still bothers me, though. To bring up the oft-used building architecture analogy, this looks a bit more ridiculous:

We believe that if you can climb stairs, you should be able to build and personalize your own house, apartment, office, or auditorium.

The original quotation would be much more accurate with a few revisions:

We believe that if you can send an email, you should be able to use and personalize a blogging tool, social networking site, or dashboard-widget service that software developers built.

There, much better! I think there is this huge fallacy about making programming obseleted by arbitrary design. That’ll never happen. What we are seeing more of is that domain-specific Turing-incomplete tools are becoming easier and easier to use. There’s always programmers behind those tools, and extending those tools, but there’s definitely a clear separation between them and the users—digital content creators / artists.

Managers Everywhere!

The caption for the bottom picture states:

From left to right: John Montgomery (Group Program Manager), Andy Sterland (Program Manager), Alpesh Gaglani (Developer), Tim Rice (Developer), Suzanne Hansen (Program Manager), Steven Wilssens (Program Manager), Vinay Deo (Engineering Manager), Michael Leonard (Test Developer), Jianchun Xu (Developer), Dan Fernandez (Product Manager), Adam Nathan (Developer), Wes Hutchins (Program Manager), Aaron Brethorst (Program Manager), Paramesh Vaidyanathan (Product Unit Manager), and Murali Potluri (Developer).

So what’s the breakdown of their team?

  • 5 developers
  • 1 test developer
  • 5 program managers
  • 1 group program manager (What is that? Something like a super-program-manager?)
  • 1 engineering manager
  • 1 product manager
  • 1 product unit manager

If it were a real startup, I think this would be quite a bit more realistic:

  • 2 business/legal/financial (the pair of guys that make sure VC money is rolling in and provides safe harbor for the programmers)
  • 1 marketer
  • 2 designers
  • 3 front-end developers (Javascript, ASP)
  • 1 back-end developer/sysadmin
  • 1 champion developer/team lead (the guy who is both an application visionary and a technical expert)

Disingenuity

This is the best part. The images and text on this page are reminiscent of a 40-year-old trying to blend in with his teenage daughter’s friends:


If this doesn’t look like corporate coercion, I don’t know what does. You know, where your boss comes around and says “Hey! Let’s take a fun picture, so everyone can see how cool we are, and how much you all enjoy working under my hip management style!” and the only reason people follow is because their jobs depend on it.


Ok, not as ridiculously out-of-character, but everyone still looks forced (“cross your arms! everybody do it!”).


This one looks the most realistic. Notice the token Startup Guy and Girl Programmer hidden among a flock of token Fortune 500 Boring Guys, like a game of Where’s Waldo.

How about all of the self-praise?

Like most startup ventures, the team hustles for resources every day and is innovative, scrappy, and fun. Oh, and we also dream big.

Is this serious? Fun? Scrappy? Dreaming big? Most kids who feel they have to announce that they are cool to others aren’t. I never got friends by telling people how cool I was; I got friends by sharing interests and hobbies. For some contrast, this is from the smallish company Business Logic’s website:

We enjoy flexible work hours, late night brainstorming, refactoring, halo, open source, robot chicken and Leeroy Jenkins. We like to build great software and find a better way to do it every day. We want to grow and share the experience.

Now that sounds real. No one is tooting their own horn, and I can judge for myself whether or not the group is “fun”.

Score

I’d give them a 8/10 for humor (the “Army of Darkness” style of humor). I’d score them higher, except the website looked too realistic, so some people might not understand the joke of a Microsoft Startup.

Today was my last day of work at Pavlov Media. Sad day, the last day of a job. Anyways, I don’t keep this blog to talk about personal ramblings, so I’ll keep it short. One of the guys pointed out the large number of obscure tidbits I know about PostgreSQL and PHP development (most of the PHP is stuff you won’t see in a book, but as tiny repeatable timesavers or readability improvements in a large). So, I’m going to try to post random things I know about Postgres that I consider on the obscure side. Most of them are already well-documented, but stuff a newcomer would probably miss.

Postgres Tip #1: Use COALESCE and friends

The functions COALESCE(...), NULLIF(a,b), GREATEST(...), and LEAST(...) are all well-hidden ways to improve the readability of SQL queries. I’m going to talk about the first today.

COALESCE takes an arbitrary number of parameters, and simply returns the first one that’s not NULL. This has come in handy at work in two specific cases:

  1. If table vendors has field website, and table products (with a foreign key into vendors) also has field website, I can pull the most specific website of a product available with the SQL query:

    SELECT COALESCE(p.website, v.website) FROM products p JOIN vendors v...

    The alternative would have been to use a rather obtuse-looking CASE statement like this:

    SELECT CASE WHEN p.website IS NULL THEN v.website ELSE p.website END FROM products p JOIN vendors v...

  2. If a NULL value in a table is logically equivalent to another non-NULL value, it simplifies conditionals. For example, the Freeside billing system treats NULL and 0 both as “empty” in date fields, so I can check if a date is “empty” with only one comparison:

    ... WHERE COALESCE(date_field, 0) <> 0

    Again, the alternative is bulkier and harder to read:

    ... WHERE (date_field IS NOT NULL OR date_field <> 0)

    This is particularly useful for string fields, where the empty string and the NULL string are sometimes interchangeable. In this case, COALESCE really helps when dealing with someone else’s sloppy database design.

In both of these cases, the advantage is small in a toy example like this, but a SQL query with 3 default-values and 2 “maybe-NULL” conditionals, the difference is huge.