NYTProf 2.07 – Even better, and now for Windows too

NYTProf development rolls on, with 2.07 being another significant release.

Here’s a summary of highlights from the many improvements.

Windows, at last

Windows wizard Jan Dubois has contributed patches that enable NYTProf to run on Windows. Many thanks Jan!

Corrected Subroutine Total Time

A significant error in time reported as spent in a subroutine has been fixed. It was showing the sum of the inclusive and exclusive time instead of just the inclusive time.

String Evals

One of NYTProf’s innovative features is tracking of subroutine calls by calling location. In earlier versions you couldn’t see calls made from within string evals. Now you can. Even for evals within evals:

NYTProf string eval.png

That’s a big step forward for profiling code that makes heavy use of evals.

(The use of the µ micro symbol is also new – part of some helpful interface polishing contributed by Gisle Aas. They’re not used on the statement timings yet, but will be at some point.)

Recursion

Previously the ‘inclusive time’ reported for a subroutine wasn’t useful for subroutines called recursively. The overall inclusive time included the inclusive time of nested calls, typically inflating it beyond usefulness.

Now the inclusive time only measures the inclusive time of the outermost calls. NYTProf now also tracks and reports the maximum call depth for each calling location.

Disabling the Statement Profiler

NYTProf started life as a statement profiler. I added a subroutine profiler later and was delighted with how well that worked out. The subroutine profiler is novel in a few ways and also extremely robust and efficient. Part of the efficiency comes from not streaming data to disk while the program is running. It accumulates the counts and timings in memory.

When profiling large and/or performance sensitive applications the costs of NYTProf’s statement profiler, in performance impact and disk space, can be prohibitive. So now I’ve added an option to disable the statement profiler. Setting the stmts=0 option turns NYTProf into a very lightweight subroutine profiler.

Caller Information for XSubs

NYTProf reports time spent in subroutines broken down by calling location. It’s one of its key features. The caller details are embedded as comment annotation into the report for each perl source code file.

That has worked very well for perl subs, but there’s no natural home for XS subroutines because there was no source code file available to annotate.

There’s a partial solution included in NYTProf 2.07: XS subs that are in the same package as profiled perl subs, get annotated subroutine ‘stubs’ appended to the report for the corresponding perl source file.

It works quite well, in particular the common UNIVERSAL methods get annotated:

NYTProf xsub callers.png

Also, now these subs have a place in the reports, any references to them are hyper links, naturally.

Sadly it doesn’t work so well for packages that don’t contain any perl subs, like DBI::st. I plan to add per-package report pages for them in a future release.

Clock Docs

I’ve greatly expanded the description of the various clocks available for profiling: POSIX real-time clocks, gettimeofday(), and Time::HiRes.

I’ve also added a caveats section that discusses issues with SMP systems, processor affinity, and virtual machines.

Feedback Wanted

How has NYTProf helped you optimize your applications? Let me know.

Perhaps as blog posts of your own, linking back here, or comments here, or CPAN ratings, or email to the mailing list or just to me. I’d love to hear from you.

Thanks!