Thursday, February 18, 2010

The Windows Composite Performance Index (WCPI): An In-Depth Look at the Methodology

The Windows Composite Performance Index (WCPI) is a new industry metric built upon the foundations of one of the most robust and sophisticated performance analysis frameworks ever devised. The DMS Clarity Suite framework was created by a former Intel Corporation performance engineer who specialized in the design and implementation of next-generation benchmarking methodologies. It has been deployed commercially at some of the largest financial services companies in the world and is in use today on trading floors throughout Wall Street.

The methodology behind the framework is straightforward: Monitor critical Windows performance counters; evaluate them against a series of weighted thresholds; and flag those events wherein one or more of the counters exceeds its configured target limit. To further qualify these events – which are defined as contiguous sample periods in which a threshold has been exceeded – each one is evaluated both for severity (how far above the threshold) and duration (how long it remained that way). This has the effect of skewing the analysis away from isolated “spikes” – which are often meaningless – and towards prolonged sequences of threshold exceptions of the type that, cumulatively, can cause significant system-wide delays.
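
To make the flag-and-qualify approach concrete, here is a minimal Python sketch of the general idea: contiguous above-threshold samples are grouped into events, and each event is scored on both its peak severity and its duration. The data structures, function names and weights below are illustrative assumptions only, not the actual DMS Clarity implementation.

    from dataclasses import dataclass

    @dataclass
    class Event:
        start: int         # index of the first record over the threshold
        duration: int      # number of contiguous records over the threshold
        peak_ratio: float  # worst observed value divided by the threshold

    def find_events(samples, threshold):
        """Group contiguous above-threshold records into qualified events."""
        events, run = [], None
        for i, value in enumerate(samples):
            if value > threshold:
                if run is None:
                    run = Event(start=i, duration=0, peak_ratio=0.0)
                run.duration += 1
                run.peak_ratio = max(run.peak_ratio, value / threshold)
            elif run is not None:
                events.append(run)
                run = None
        if run is not None:  # the stream ended while still over the threshold
            events.append(run)
        return events

    def event_score(event, duration_weight=2.0, severity_weight=1.0):
        """Weight duration more heavily than severity so that a long run of
        moderate exceptions outranks a single sharp spike (weights illustrative)."""
        return duration_weight * event.duration + severity_weight * event.peak_ratio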

For example, when evaluating memory utilization, one of the metrics monitored is the Memory\Pages Input/sec counter. As the DMS Clarity framework’s analysis engine reviews sample records for a given system (over 10,000 such samples are recorded per system, per week), it documents each incident in which this counter has exceeded the user-defined threshold (in the case of WCPI, the threshold is set at 300). However, the engine also records how many contiguous (i.e. sequential in time) records show this counter – and the other counters it evaluates – running above the threshold. It will generally discount a short-duration event with a high counter value (e.g. a sudden spike from an application startup or a large memory-mapped file I/O) in favor of a longer-duration event (e.g. a sustained value associated with a prolonged disk I/O operation, such as extensive VMM paging) in which the individual counter values may be lower but their corresponding negative side effects are more detrimental to system performance.
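
Building on the hypothetical find_events/event_score helpers sketched above, the fragment below runs two invented Memory\Pages Input/sec series against the 300 threshold and shows how duration-weighted scoring discounts a single sharp spike in favor of a sustained, lower-valued run.

    # One value per stored record, Memory\Pages Input/sec, threshold = 300
    # (values invented; assumes find_events/event_score from the sketch above).
    spike     = [10, 20, 2500, 15, 12, 18, 22, 14]     # one huge, isolated spike
    sustained = [10, 20, 450, 480, 430, 470, 440, 25]  # five contiguous records over 300

    for label, series in (("spike", spike), ("sustained", sustained)):
        events = find_events(series, threshold=300)
        total = sum(event_score(e) for e in events)
        print(label, [(e.duration, round(e.peak_ratio, 2)) for e in events], round(total, 2))

    # spike     [(1, 8.33)] 10.33  <- severe, but only one record long
    # sustained [(5, 1.6)]  11.6   <- lower values, yet the longer run scores higher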

Naturally, the number of threshold violation “hits” for a given counter or group of counters correlates directly with the frequency of the sample periods and with the restrictiveness of the threshold values. In commercial deployments, the framework is configured to sample a system’s runtime environment every second, and then average the collected values every 15 seconds before storing the results as a single record. It’s also configured to use very aggressive thresholds in order to identify degraded runtime conditions that might impact a real-time computing environment (e.g. a trading workstation in a Wall Street brokerage).
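
A minimal sketch of that aggregation step, assuming raw counter values collected once per second are averaged into fixed-length records (the function name is an invention for illustration):

    def average_records(raw_per_second, interval_seconds):
        """Collapse 1-second raw samples into one averaged record per interval.
        In the commercial configuration described above, interval_seconds is 15."""
        records = []
        for start in range(0, len(raw_per_second), interval_seconds):
            window = raw_per_second[start:start + interval_seconds]
            records.append(sum(window) / len(window))
        return records

In the WCPI configuration described next, the same helper would simply be called with a 60-second interval.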

For WCPI calculations, the sample frequency is retained, but the storage schedule is widened to one record per averaged 60-second interval (vs. 15 seconds in commercial deployments). Likewise, the individual counter thresholds are loosened to reflect a more general-purpose runtime scenario. The net result is an analysis methodology that ignores events that would otherwise have triggered a violation in a commercial setting, essentially making the WCPI calculations much more lenient and forgiving of runtime conditions – an important consideration when casting such a wide net across a diverse population like the exo.performance.network.
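
To see why the wider averaging window and looser thresholds are more forgiving, the fragment below (reusing the hypothetical average_records and find_events helpers sketched earlier) pushes the same one-second stream through both configurations; the threshold values are invented for illustration.

    # A ~30-second burst of paging activity inside a two-minute window
    # (120 one-second samples; thresholds are placeholders, not WCPI's).
    raw = [50] * 40 + [600] * 30 + [50] * 50

    # Commercial-style analysis: 15-second records, aggressive threshold.
    commercial = find_events(average_records(raw, 15), threshold=200)

    # WCPI-style analysis: 60-second records, looser threshold.
    wcpi = find_events(average_records(raw, 60), threshold=300)

    print(len(commercial), len(wcpi))  # 1 event flagged commercially, 0 under WCPI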

Bottom Line: With a robust, sophisticated analysis framework behind it, and the world’s largest repository of raw Windows metrics data to work from, WCPI is by far the most accurate metric of its kind. No other industry metric provides equivalent insight into the runtime behavior of Windows systems as they exist in the real world.

The following is a technical breakdown of the three key sub-indices that are used to generate the WCPI value. These indices are recalculated daily across all participating exo.performance.network sites, and the resulting data is then combined and averaged to achieve the final WCPI number (a simplified sketch of this weighted-ratio calculation follows the list below):

  • The Peak CPU Saturation Index is calculated by comparing a series of 4 independent Windows metrics – the System\Processor Queue Length counter, the Process Instant Delay factor (a custom metric unique to DMS Clarity), the Process Cumulative Delay factor (another DMS Clarity custom metric) and the event duration – against a series of user-defined threshold values. The resulting individual ratios are then weighted and combined to create the single number Peak CPU Saturation Index.

    Note: Peak CPU Saturation is not the same thing as 100% CPU utilization. A system can use 100% of its CPU bandwidth and yet still remain responsive. Rather, it is a measure of the amount of delay, in real-time, that the system is experiencing as a result of contention for CPU resources, as well as the impact that delay will likely have on any given process running on the system under those conditions.

  • The Peak Memory Pressure Index is calculated by comparing a series of 4 independent Windows metrics – the Memory\Committed Bytes counter, the Memory\Pages Input/sec counter, the PageFile\% Usage counter and the aforementioned event duration value – against a set of user-defined threshold values. The resulting individual ratios are then weighted and combined to create the single number Peak Memory Pressure Index value.

    Note: Peak Memory Pressure is a gradual, as opposed to sudden or instantaneous, metric. The index normally builds over time as the number of active processes increases, putting pressure on the Windows VMM to re-shuffle the physical memory deck and page portions of certain processes to disk where necessary.

  • The Peak I/O Contention Index is calculated by comparing a series of 4 independent Windows metrics – the Physical Disk\Current Disk Queue Length counter, the Physical Disk\Disk Bytes/Sec counter, the Processor\Interrupts/Sec counter and the aforementioned event duration – against a set of user-defined threshold values. The resulting individual ratios are then weighted and combined to create the single number Peak I/O Contention Index.

    Note: The Peak I/O Contention Index often mirrors the Peak Memory Pressure Index during low-memory situations where VMM activity is dominating the disk channels.
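
As referenced above, here is a simplified sketch of how four per-metric ratios might be weighted into a single sub-index, and how the three sub-indices could then be combined into the final WCPI figure. The weights, values and helper name are placeholders for illustration; they are not the actual DMS Clarity formula.

    def sub_index(ratios, weights):
        """Combine per-metric ratios (observed value / threshold) into one number
        via a weighted average; ratios and weights are parallel lists."""
        return sum(r * w for r, w in zip(ratios, weights)) / sum(weights)

    # Hypothetical per-metric ratios for one system: each entry is a counter's
    # observed peak divided by its threshold, with the last entry standing in
    # for the event-duration ratio described above (all values are placeholders).
    cpu_saturation  = sub_index([1.4, 1.1, 0.9, 1.2], weights=[3, 2, 2, 1])
    memory_pressure = sub_index([1.0, 1.6, 0.8, 1.3], weights=[2, 3, 2, 1])
    io_contention   = sub_index([0.7, 1.2, 0.6, 0.9], weights=[3, 2, 1, 1])

    # In the live calculation the three sub-indices are recomputed daily and
    # averaged across every participating site; a single-system stand-in:
    wcpi = (cpu_saturation + memory_pressure + io_contention) / 3
    print(round(wcpi, 2))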

Note: You can keep tabs on all of our research findings by visiting our web site: www.xpnet.com. There you’ll find a wide selection of interactive chart objects and free monitoring tools that you can use to compare your own systems to the WCPI and similar independent research metrics published by the exo.performance.network.

10 comments:

Unknown said...

None of that explains how completely wrong your conclusions are in the last blog. We can all count; the difference, it seems, is that we know WHAT to count.

http://exo-blog.blogspot.com/2010/02/wcpi-85-of-windows-7-pcs-are.html

Unknown said...

Agreed, you need to issue a retraction of your previous blog entry

Unknown said...

You should probably read this article, for example: http://arstechnica.com/microsoft/news/2010/02/behind-the-windows-7-memory-usage-scaremongering.ars
It's just amazing that you still try to defend yourself on this...

ChrisCicc said...

Agreed. If you want to save any self respect as a person and/or company you should retract your previous blog entry and issue an apology.

Unknown said...

1) Vista and Win7 primarily run on multicore machines. 44% CPU utilization means what on an i7 chip? 10%, 5%, 2%, 3%, 10%, 14% on each core?

2) SuperFetch. Read about it. This process alone messes with your Disk I/O and Memory Usage metrics and totally invalidates them.

Thushan Fernando said...

I always chuckle when I hear n00bs talk about resource over-utilisation in post-XP operating systems. As mentioned by quite a lot of folks, and ArsTechnica – who, let's all admit, have some of the brightest and most technically minded journalists around – the benchmarks of yesteryear for finding and defining 'free' RAM are irrelevant. These days Windows caches and is far more intelligent about how it looks at and manages your memory. Do you ever wonder why your most frequently used applications load a helluva lot faster than on XP? I'll leave the authors of the blog to find out! Might teach you a thing or two.

Anonymous said...

For all of those who don't believe that Windows 7 sucks with memory management, I'd like to show you something.
This is on Windows 7 x64, and after having the PC on for a while (I usually don't turn it off often), it goes like this:

Screenshot

What I expect as a user is to be able to load heavy apps at any time.
Currently, nothing big is open in memory, but still there is no free RAM (less than 350 MB available), and the pagefile size is 5147 MB.

Windows fails to use and clean up the RAM, and swaps heavily (and look at the hard faults/s!).

Tell me now, how cool is that? W7 does manage to be fast when I first start the PC (and only 1.5GB of RAM is used), but in this case, if I want to launch Chrome, instead of loading the program in 1 second, it takes a lot longer!

orev said...

The bulk of your argument seems to be that you are using an "industry standard" test, and point to its longevity as proof of accuracy. The companies you have mentioned are not exactly well known for being nimble in the face of change (just like any large corporation). It's certainly within the realm of possibility that the test you are using has not been updated to work with Vista and Win7 (and this is likely given the test's age).

Given that most large companies have avoided deploying Vista, and are only now starting with Win7, it adds further fuel to the argument that this test is probably out of date, as they would have no reason to update a test for an OS that is not widely deployed in their environments.

You have now made 3 posts which are basically arguing against the entire Internet that you are right and everyone else is wrong. Do you really think that out of all of the smart people out there (none that I've seen yet defending your position), you are the smartest?

Given the overwhelming outcry against these results, any astute person would at least take a step back and consider the arguments. You should try some tests on your own, using multiple tools and testing methods, and see how many of them corroborate each other. Take a look at the SysInternals Process Explorer, which was written by a guy so smart about this stuff that Microsoft hired him to fix all of the internals of Windows. If, after all of that, you still have a valid point, then more power to you. However, your current stance of digging in and claiming to be right with no further data or testing just makes you look foolish.

BlakeyRat said...

This blog entry might explain your methodology in-depth (although it really doesn't, it's mostly marketing-ese)... it still doesn't even come close to addressing the main point everybody's complaining about:

YOUR METHODS DO NOT WORK IN VISTA OR WINDOWS 7.

They don't take into account SuperFetch, most importantly. And as a result, your "alarming" blog post about Windows 7 memory usage is, quite frankly, nothing more than wrong piled on top of incorrect riding in a failboat.

You should immediately add a correction to your original blog post. Then you should seek out every tech-ignorant journalist that re-published your wrong data without questioning it, and ask that they issue the same correction. Then you should publicly apologize to Microsoft.

That's the only way you can dig yourself out of this well.

(BTW, are we supposed to be impressed that ex-Intel engineers worked on it? What do Intel engineers know about Windows memory caching technologies? Might as well ask them about transmission replacements in a '96 Buick while you're at it.)

Unknown said...

This whole blog and methodology should be dismissed as an opinion. While self-proclaiming to be "the best", "no one else ...", etc. can be arrogant, the blogger is entitled to his opinion. We don't have to listen. This is almost like, well, "my chili tastes better than everyone else's."

For the memory index, the use of Memory\Pages Input/sec may be insufficient. Why not look at the whole page scan rate?

@BlakeyRat ... don't worry about that "Intel engineer" ... now. As we learned, it's probably just one ex-Intel contractor. We don't know for sure, but it's possible :)