Please forgive the title!
Perl has three regular expression match variables ($& $` $') which hold the string that the last regular expression matched, the string before the match, and the string after the match, respectively.
As you’re probably aware, the mere presence of any of these variables, anywhere in the code, even if never accessed, will slow down all regular expression matches in the entire program. (See the WARNING at the end of the Capture Buffers section of the perlre documentation for more information.)
Clearly this is not good.
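If all you need is the text around a match, perlvar documents how to get the same three strings from the @- and @+ offset arrays instead. Here is a minimal sketch of that approach; the helper name split_around_match is my own invention for illustration, not part of Perl or NYTProf.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): returns the
# text before the match, the matched text, and the text after the match,
# without mentioning the three special match variables anywhere.
sub split_around_match {
    my ($text, $re) = @_;
    return unless $text =~ $re;

    # $-[0] is the offset where the whole match starts,
    # $+[0] is the offset just past where it ends.
    my ($start, $end) = ($-[0], $+[0]);

    return (
        substr($text, 0, $start),             # what the prematch variable would hold
        substr($text, $start, $end - $start), # what the match variable would hold
        substr($text, $end),                  # what the postmatch variable would hold
    );
}

my @parts = split_around_match("came, saw ampersand, and conquered", qr/saw \w+/);
print join("|", @parts), "\n";
```

On perl 5.10 and later, the /p modifier with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} is another documented option.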
I’ve long planned to add detection and reporting of this to Devel::NYTProf, along with things like method cache invalidation, but it’s never risen to the top of the list. In fact, now I look, I see it never even got entered into the ever-growing collection of ideas recorded in the HACKING file.
After the 4.00 release, plus a few minor releases, I’d put NYTProf on hold and was starting to focus on my java2perl6 API translation project (more news on that soon).
Then I saw a recent blog post by Josh McAdams, one of the authors of Effective Perl Programming (along with Joseph N. Hall and brian d foy) about detecting these variables using the Devel::SawAmpersand and Devel::FindAmpersand modules. Firstly it reminded me of the issue, and then it struck me that few people would bother using those tools because they simply wouldn’t know they had the problem in the first place.
Someone with a performance problem is likely to use a profiler like NYTProf to see where time is being spent in their code. That might point out that significant time is being spent in regular expressions, but even then they might not make the leap to consider these special match variables as a possible cause. The profiler should point it out to them!
NYTProf version 4.03 didn’t. Clearly that was not good. So NYTProf version 4.04 now does!
In the list of files on the index page it highlights the file and adds a comment:
On the report page for the file itself it adds an unmissable, and hopefully self-explanatory, note to the top of the page:
I’d be very interested to hear from anyone who now discovers these problem variables lurking in their application code or any CPAN modules.
Go take a look!
4 thoughts on “NYTProf 4.04 – Came, Saw Ampersand, and Conquered”
Thanks for adding this. It will make it much easier to find these issues than the other modules available. After seeing the post about Devel::SawAmpersand and Devel::FindAmpersand, I tried to use them to check the WebGUI codebase for these issues. Unfortunately, the B::FindAmpersand module doesn’t work with a threaded perl, but B::Lint can do the same detection and did work.
I’m wondering if anyone reported a performance improvement after removing any of these variables.
I’m also wondering if anyone has simple demonstration code that shows the performance difference when benchmarked or profiled.
I don’t mean to sound ungrateful, just curious.
The WARNING in perlre has existed since 1997 (5.004), and much has changed since then.
Some work has been done over the years to reduce the cost (see the perldelta documents), but the definition of these variables requires a copy to be made. So there’s little that can be done to avoid that. You’re welcome to do some benchmarking.
In addition to changes in perl, I was thinking of changes in hardware. I did try to create a simple benchmark to demonstrate a performance improvement, but was unable to observe a significant difference.