Code duplication, cheap but not free

I’m working on a large old codebase at the moment where Repeat Yourself seems to have been standard practice. Here’s a typical example:

    if (exists(Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id })
        && Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id }->{ as_of_rev } <= $current_revision) {
        $self->{cid} = Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id }->{ new_node_id };
    }
    elsif (exists(Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{''})
        && Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{''}->{ as_of_rev } <= $current_revision) {
        $self->{cid} = Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ '' }->{ new_node_id };
    }

The codebase has many, many examples of this style, written by a variety of developers.

What puzzles me is why this kind of code didn’t raise red flags for the developers at the time. It’s harder to read, harder to maintain, and slower than a simpler approach. Perhaps the elsif part was a copy-n-paste job, but that still doesn’t explain the three instances of Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id } in the first part. Perhaps even those were copy-n-pasted.

I’d always have an urge to factor out the common expression into a temporary variable for efficiency and clarity. I wonder if my sensitivity to code duplication is partly due to needing to pay close attention to efficiency for most of my career. (I started Perl programming about 15 years ago, when CPU performance was measured in MHz.)
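
To make "factor out the common expression" concrete, here is a minimal sketch of the idea, not the code that ended up in the codebase. The inline Foo::StaticData::NodeRemap package and all the data values are hypothetical stand-ins so the example runs on its own, and the variable names $remap_for_cid, $by_graph and $default are invented for illustration. It also swaps exists() for a plain truth test, which is only equivalent if every entry is a hash reference.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real Foo::StaticData::NodeRemap class,
# which is not shown in this post; the data is made up for illustration.
{
    package Foo::StaticData::NodeRemap;
    my $data = {
        42 => {
            7  => { as_of_rev => 10, new_node_id => 1001 },
            '' => { as_of_rev => 5,  new_node_id => 1002 },
        },
    };
    sub get_data { return $data }
}

# Assumed inputs, again purely illustrative.
my %args             = ( cid => 42 );
my $graph_id         = 7;
my $current_revision = 12;
my $self             = {};

# Evaluate the shared sub-expression once and keep the result.
# The "|| {}" avoids dereferencing undef when the cid is unknown.
my $remap_for_cid = Foo::StaticData::NodeRemap->get_data()->{ $args{cid} } || {};
my $by_graph      = $remap_for_cid->{$graph_id};
my $default       = $remap_for_cid->{''};

# Same conditions as the original if/elsif, minus the repetition.
if ( $by_graph && $by_graph->{as_of_rev} <= $current_revision ) {
    $self->{cid} = $by_graph->{new_node_id};
}
elsif ( $default && $default->{as_of_rev} <= $current_revision ) {
    $self->{cid} = $default->{new_node_id};
}

print "cid = $self->{cid}\n";    # prints "cid = 1001"
```

Each lookup now happens once instead of up to three times, and the conditions are short enough to compare by eye.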

I also wonder how soon tools like Perl::Critic will be able to help detect duplicate code fragments, such as common sub-expressions that may be candidates for elimination.

I’m working on optimizing the codebase at the moment. That chunk showed up as a performance issue, so I rewrote it as:

    my $NodeRemap = Foo::StaticData::NodeRemap->get_data();
    for my $id ( $graph_id, '' ) {
        my $x = $NodeRemap->{ $args{cid} }->{ $id };
        next unless $x and $x->{as_of_rev} > $current_revision;
        $self->{cid} = $x->{new_node_id};
        last;
    }

Spot my mistake?

Sidebar: This post is also an experiment in posting code to my blog. I’m trying out MarsEdit. It’s good, but I’d like to see a “Paste Preformatted” mechanism that would also HTML-escape the contents of the paste buffer. It’s scriptable, so I guess I could implement it myself in my copious spare time…

Loaded Perl: A history in 530,000 emails

MarkMail is a free service for searching mailing list archives. They’ve just loaded 530,000 emails from 75 Perl-related mailing lists into their index.

They’ve got a home page for searching these lists at http://perl.markmail.org/.

Of course, the first thing people often do with a new search engine is search for themselves. I’m no exception. Where MarkMail shines is the ability to drill down into the results in many ways with a single click (bugs, announcements, attachments, etc.). Worth a look.

The graph of messages per month is not just cute: you can click and drag over a range of bars to narrow the search to a specific period. It clearly shows my activity rising sharply in 2001 and then dropping to a lower level after 2004.

I’m particularly pleased that they’ve indexed the dbi-users, dbi-dev, and dbi-announce lists.

The Limerick Open Source Meetup

I moved to the west coast of Ireland about six years ago. Being a hermit by nature, it wasn’t until late last year that I made any real effort to connect with other techie-type people in the area.

It really started with BarCamp Galway, then OpenCoffee Limerick and Blogger Coffee Limerick. Along the way I came across meetup.com and created The Limerick Open Source Meetup. That hasn’t really taken off yet, but I remain hopeful.

In fact our first meetup is tonight, prompted by Anton Manering joining recently.

Perl Myths

Update: several more recent versions of my Perl Myths talk are available. These have significant updates. Slides can be found on slideshare.net and screencasts can be found on my blip.tv channel.

I’ve uploaded my Perl Myths presentation to slideshare.net and google video:

“Perl has its share of myths. This presentation debunks a few popular ones with hard facts. Surprise yourself with the realities.”

While I agree with Andy Lester that Good Perl code is the best form of evangelism, I wanted to put together a presentation that others could refer to when they encounter misinformation about Perl. I cover these myths that I’ve heard recently:

  • Perl is dead
  • Perl is hard to read / test / maintain
  • Perl 6 is killing Perl 5

and pull in a wealth of up-to-date information, some of it quite surprising even to those familiar with Perl and its community. There are two versions, plus a video. I recommend the one with notes, which add useful detail and context for the slides and which is best viewed as a PDF. There’s also one without notes, which I’ve embedded here:

I videoed an extended version of this presentation at IWTC in Dublin in February. The first 40 minutes or so correspond to the slides above. In the remaining 30 minutes or so I talk about Parrot and Perl 6. I’ve embedded the video below, but WordPress forces me to use a small size, so you’ll probably prefer to view it at video.google.com: