I downloaded the WordPress codebase the other day and I’ve been browsing through it. I’d like to contribute to the project, since I use it myself and am starting to encourage others to do so as well. I don’t really know where to start, but documentation and simple refactorings seem like good places to begin.

Here are a few suggestions that I’ve been able to think of while browsing. Some of them are purely style-based (I’m dreading hostile responses from the Big Ball Of Mud natives), but I’d like to clean up the code without actually changing anything.

Documentation

I like to use Doxygen even for PHP. I know PHPDoc exists, but it should be possible to write Doxygen-compatible comments that also work for PHPDoc. There is an element of object-oriented code hiding inside all of the global variables (implicit objects?) that a little bit of grouping (Doxygen’s @defgroup and @ingroup) would factor right out. Much of this is probably written up somewhere in the WordPress Codex, but I find that API documentation is much more useful when it is embedded in the code.

Here’s the basic gist of what I would do with the documentation:

  1. Go through every single file and classify it, e.g. admin-interface, client-interface, rpc-interface, function-definitions, external-libraries, installation-interface.
  2. Go through every single function (at least, anything that doesn’t look like an imported 3rd-party library). Classify functions based on implicit objects (e.g. current-post, current-page, user-settings, system-settings), or general layer (e.g. UI/formatting, data access, domain operations). Come to think of it, most of the domain operations probably operate on implicit objects.
  3. Locate every single damn global variable and document it explicitly, including the functions that represent its interface (most global variables I’ve seen have been either implicit “context” objects or in-memory caches).
  4. Figure out how to incorporate the Doxygen documentation into the Codex wiki, so duplication could be avoided. Hell, most documentation could be backported, allowing the Codex to then provide documentation for every released version of WordPress (though that’s a hell of a lot of work to backport comments).
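
As a rough sketch of what steps 2 and 3 might look like in practice, here is a made-up global cache and its interface, documented with comments that Doxygen should understand. Everything named example_* is hypothetical and not taken from WordPress, and I am assuming PHPDoc will tolerate or ignore the Doxygen-only grouping tags rather than knowing it for certain:

```php
<?php
/**
 * @defgroup example_post_cache Example post cache (implicit object)
 *
 * Hypothetical group that ties a global variable to the functions
 * forming its interface. The group name is invented for illustration.
 */

/**
 * In-memory cache of post rows, keyed by post ID.
 *
 * @ingroup example_post_cache
 * @var array
 */
$example_post_cache = array();

/**
 * Stores a post row in the cache.
 *
 * @ingroup example_post_cache
 * @param int   $id  Post ID.
 * @param array $row Post fields, e.g. array('title' => '...').
 * @return void
 */
function example_cache_post($id, array $row) {
    global $example_post_cache;
    $example_post_cache[$id] = $row;
}

/**
 * Returns the cached title for a post, or null if it is not cached.
 *
 * @ingroup example_post_cache
 * @param int $id Post ID.
 * @return string|null
 */
function example_cached_title($id) {
    global $example_post_cache;
    // isset() avoids touching an undefined index.
    return isset($example_post_cache[$id]['title'])
        ? $example_post_cache[$id]['title']
        : null;
}

// Usage: fill the cache, then read it back through the interface.
example_cache_post(7, array('title' => 'Hello world'));
echo example_cached_title(7), "\n";
```

The point of the group is that the generated documentation would list the global and its two accessor functions on one page, which is about as close to a class definition as procedural code gets.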

Obviously, this would require the help of WordPress and Codex gurus. Some stuff I could figure out on my own, but definitely not everything.

Long Lines

Yell at me now for using vi. I enjoy it, and I find I can do things rapidly in it. I cringe when I see 250+ characters in a single line. I understand that HTML output sometimes has to go past 80 columns, but there are plenty of ways around that.

In particular, this bit of code was a single line. I’ve broken it so you can read it:

<div id="login"><h1><a href="<?php
    echo apply_filters('login_headerurl',
    'http://wordpress.org/'); ?>" title="<?php
    echo apply_filters('login_headertitle',
    __('Powered by WordPress')); ?>">
    <span class="hide"><?php bloginfo('name'); ?>
    </span></a></h1>

Even with fewer line breaks, this is easier to read, I think:

<?php
    $href = apply_filters('login_headerurl',
        'http://wordpress.org/');
    $title = apply_filters('login_headertitle',
        __('Powered by WordPress'));
?>
<div id="login"><h1><a href="<?php echo $href; ?>"
    title="<?php echo $title; ?>"><span class="hide">
    <?php bloginfo('name'); ?></span></a></h1>

Making lines wrap at 80 was my original purpose for this refactoring. However, even if the lines go beyond 80 (one line for each apply_filters and one for the entire HTML block), this is more readable.

What’s the real difference? I’ve separated concerns. While I’m parsing the HTML content, I don’t have to worry about plugin behavior from apply_filters. While I’m looking at the filters, I don’t have to parse out all of the HTML code surrounding them. Getting vi to display it better is really the less significant benefit here, even though that was my original intent.

Refactoring

I don’t remember where I saw it, but one of the developers commented that the codebase contained a lot of legacy code that no one had time to clean up. I want to use the WordPress project myself, but I can’t stand going through “unclean code”. So here are my priorities:

  1. Fix E_NOTICE errors everywhere, and E_STRICT errors where possible without breaking backwards compatibility. Never use an undefined variable.
  2. Split long statements into multiple statements. Non-sucky variable naming for temporaries actually helps document the semantics, too.
  3. Pull out large bodies of HTML/text from function calls where possible. In particular, $wpdb uses some fatal-error text that could be relocated to the top of the page to increase member function readability.
  4. Reorder operations within functions so that processing occurs before display, instead of mixed like spaghetti. Because of plugins, this will probably involve a lot of ob_* usage to maintain ordering.
  5. Tidy up nonfunctional code styles (mostly brackets and whitespace).
  6. See what I can do about reducing global variable count. This one is tricky, because refactoring implies no change in behavior, and naughty plugins and templates may depend on the existence of certain globals.
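
To make items 1 and 4 a bit more concrete, here is a minimal sketch under some assumptions. The example_* functions are hypothetical stand-ins, not WordPress code: one guards against undefined indexes, the other uses the ob_* functions to capture output from code that insists on echoing, so that all processing can finish before anything is displayed.

```php
<?php
// Report everything so undefined variables and indexes show up in development.
error_reporting(E_ALL);

/**
 * Hypothetical helper: read a value from a request-like array
 * without ever touching an undefined index.
 */
function example_request_value(array $request, $key, $default = '') {
    return isset($request[$key]) ? $request[$key] : $default;
}

/**
 * Hypothetical helper: run a callback that echoes its output and
 * return that output as a string instead of sending it immediately.
 */
function example_capture_output($callback) {
    ob_start();
    call_user_func($callback);
    return ob_get_clean();
}

/**
 * Stand-in for a template tag or plugin hook that prints directly.
 */
function example_print_sidebar() {
    echo '<ul><li>Recent posts</li></ul>';
}

// Processing first: no output has been sent at this point.
$action  = example_request_value(array('page' => '2'), 'action', 'view');
$sidebar = example_capture_output('example_print_sidebar');

// Display last: one place where the markup is assembled.
echo '<div class="', $action, '">', $sidebar, '</div>', "\n";
```

Buffering keeps the relative order of whatever the callback prints, which is why I expect it to be the main tool for reordering without changing what plugins see.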