Hot on the heels of NYTProf 2.04, I’ve just released NYTProf 2.05. It has a few small but significant bug fixes. From the Changes file:
Fixed alteration of $! during profiling, thanks to Slaven Rezic. Fixed clock_gettime() clock selection to fallback to CLOCK_REALTIME if CLOCK_MONOTONIC is not available, e.g, on linux 2.4. Fixed error when application ends after DB::disable_profile(). Added some docs to Devel::NYTProf::Apache Added clock=N option to enable user to select the clock.
All the bugs were long-standing ones that happened to come to light and get fixes after 2.04 was released.
The first was the cause of a couple of odd bugs were applications would behave differently when profiled. The problem was that in certain situations the profiler would alter the value of $! between statements, so code that tested for $! would think something had gone wrong.
The second caused weird timing values on some platforms. The statement times would be zero and the subroutine times would be garbage. The problem was that some systems, such as Linux 2.4, define the CLOCK_MONOTONIC macro but don’t implement that clock in the kernel. NYTProf wasn’t testing that the clock worked. Now it does. If the clock_gettime() call fails with the selected clock it falls back to CLOCK_REALTIME (and if that fails NYTProf aborts).
As a bonus I’ve added the ability to select what clock you want to use by setting
clock=N in the NYTPROF environment variable, where N is the integer id of the clock you want to use (usually you can find the CLOCK_* macros in the /usr/include/time.h file). This only works if your system supports clock_gettime(), which most modern unix-like systems do, with the notable exception of OS X.
Why would you want to use a different clock?
Well the default, CLOCK_MONOTONIC, is relatively expensive to call, and NYTProf calls it twice per statement. So it’s the dominant cost of the profiler.
Some systems, like Linux 2.6, offer other clocks, like CLOCK_MONOTONIC_FAST that returns the most recently cached value of the clock. That’s typically just 1/10th the cost of CLOCK_MONOTONIC, so your code runs much faster under the profiler.
Naturally there’s a trade-off: the accuracy of the timings drops significantly. Still far better than Devel::DProf, but far worse than normal. You’re dependent on how often the cached time value gets updated by your kernel. I saw a figure of “600 times per second” quoted for one system, but I can’t find the link now. My advice would be to steer clear of this feature unless you feel comfortable with the issues like these.