Code duplication, cheap but not free

I’m working on a large old codebase at the moment where Repeat Yourself seems to have been standard practice. Here’s a typical example:

    if (exists(Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id })
        && Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id }->{ as_of_rev } <= $current_revision) {
        $self->{cid} = Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id }->{ new_node_id };
    }
    elsif (exists(Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{''})
        && Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{''}->{ as_of_rev } <= $current_revision) {
        $self->{cid} = Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ '' }->{ new_node_id };
    }

The codebase has many, many, examples of this style written by a variety of developers.

What puzzles me is why this kind of code didn’t raise red flags for the developers at the time. It’s harder to read, harder to maintain, and slower than a simpler approach. Perhaps the elsif part was a copy-n-paste job, but that still doesn’t explain the three instances of Foo::StaticData::NodeRemap->get_data()->{ $args{cid} }->{ $graph_id } in the first part. Perhaps even those were copy-n-pasted.

I’d always have an urge to factor out the common expression into a temporary variable for efficiency and clarity. I wonder if my sensitivity to code duplication is partly due to needing to pay close attention to efficiency for most of my career. (I started Perl programming about 15 years ago, when cpu performance was measured in MHz.)

I also wonder how soon tools like Perl::Critic can help detect duplicate code fragments. Common sub-expressions that may be candidates for elimination.

I’m working on optimizing the codebase at the moment. That chunk showed up as a performance issue so I rewrote it as

    my $NodeRemap = Foo::StaticData::NodeRemap->get_data();
    for my $id ( $graph_id, '' ) {
        my $x = $NodeRemap->{ $args{cid} }->{ $id };
        next unless $x and $x->{as_of_rev} > $current_revision;
        $self->{cid} = $x->{new_node_id};
        last;
    }

Spot my mistake?

Sidebar: This post is also an experiment in posting code to my blog. I’m trying out MarsEdit. It’s good but I’d like to see a “Paste Preformatted” mechanism that would also html escape the contents of the paste buffer. It’s scriptable so I guess I could implement it myself in my copious spare time…

8 thoughts on “Code duplication, cheap but not free

  1. A couple things. First, “$x”? Seriously? It must have some sort of meaning, how about giving it a real name?

    Second, I think it’d make more sense to extract out the “->{ $args{cid} }” part too. Any reason you didn’t?

    I’m afraid I can’t spot the error though. Looking at the original code is making my brane hurt so I’m giving up now ;)

  2. Pingback: Perl Coding School » Blog Archive » perl code [2008-03-25 18:44:05]

  3. Yes, $x was a cop out. In my defence I’m not familiar enough with that part of the system to come up with a name I could be sure of, $x is at least obviously abstract, and it only has a 3 line scope,

    I opted not to factor out the “->{ $args{cid} }” because it didn’t seem worth digging deeper into the data structure for the very small gain (relative to avoiding calls to get_data). If I need to revisit the code again for performance reasons I’d change it.

    My mistake? The “> $current_revision” should have been “<= $current_revision”. I think the combined effect of changing from an ‘if’ to a ‘next’ with an ‘unless’ threw me off. Like a double negative.

  4. One thing that helps me in these cases: I usually write a comment something like:

    # at this point $current_revision is > $x, save the first one.

    then stick the ‘next’ above the comment. That gives me a reference for
    the if/or/unless/and logic — and makes it easier for whomever follows
    me to see if I did it right!

  5. “I think the combined effect of changing from an ‘if’ to a ‘next’ with an ‘unless’ threw me off. Like a double negative.”

    Perl Best Practices, page 96 & 97. Don’t use postfix-style modifiers other than if, and never use unless.

    I point that out only because you recommended that I read the book just 3 weeks ago (I did, and this is exactly what it said could happen). I do think the next belongs on the left though.

    Cheers
    adam

  6. You’re right of course Adam.

    I do use unless occasionally, but only when it reads much more naturally than any alternative, and when the conditional expression is trivial. So in this case I certainly broke the second part of my own guidelines, and I probably broke the first as well.

    When I’m trying to avoid an ‘unless’ I find myself using ‘not’ more than ‘!’ these days. So I’d probably write the expression as

    next if not $x or $x->{as_of_rev} …

Comments are closed.