Thursday, December 13, 2007

Office 2007 SP1 Yields Modest Performance Gains

During our most recent round of testing we covered Windows Vista with the Service Pack 1 Release Candidate bits running both Office 2007 and Office 2003. Now that Service Pack 1 for Office 2007 has been released we decided to revisit the tests with the updated bits for Word, Excel and PowerPoint in place.

Once again, we ran the tests on the same 2GHz Core 2 Duo laptop (Dell XPS M1710) with 2GB of memory. Test methodology consisted of booting Vista, running an initial pass of OfficeBench, then running OfficeBench in a 10-iteration loop and recording the average completion time across the iterations.
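The warm-up-then-average methodology is simple to picture in code. Here's a minimal sketch of the harness logic, with a trivial stand-in workload in place of the actual OfficeBench script (which is not shown here):

```python
import time

def benchmark(run_once, iterations=10):
    """One untimed warm-up pass, then the average completion
    time over N timed iterations."""
    run_once()  # initial pass primes caches, disk buffers, etc.
    times = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

# Trivial stand-in workload; the real harness would launch the
# OfficeBench test script here.
average_secs = benchmark(lambda: sum(range(10_000)))
```

The warm-up pass matters: the first run after a boot pays one-time costs (prefetch, disk cache population) that would otherwise skew the average.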

The net result: A 3% improvement in OfficeBench completion times.

It would seem that, unlike the Vista team - which has the habit of over-promising and under-delivering - when the Office team talks about a performance boost they actually provide one. And while the results are far from earth-shattering - Vista SP1 + Office 2007 SP1 is still nearly 2x slower than the combination of Windows XP SP3 + Office 2003 - it's at least a step in the right direction.

Note: You can grab your free copy of OfficeBench, plus the rest of the DMS Clarity Suite tools, from the exo.performance.network web site. Just register for your free portal account, hook up a couple of systems and help us build the world's first global repository of computer performance-related knowledge and data.

Read more...

Sunday, December 2, 2007

When Microsoft Attacks!

We at the exo.performance.network pride ourselves on having relatively "thick skins." After all, when you have the audacity to actually take a position on the issues - and then to back them up with hard data - you're bound to bring out the zealots on the other side. However, it's a rare treat when the shots sent our way hail from no less an industry authority than Microsoft.

Yes, it seems our little foray into Windows benchmarking has drawn the ire of Mr. Nick White, official "spokesmouth" for the Windows Vista team. In his blog/rant, Mr. White blabbers on a bit about unrealistic benchmarking, then proceeds to badmouth our OfficeBench test script by calling it nothing more than a "window-open, window-close" routine. He even includes a ridiculously accelerated video of our test script in action, using the "speed" at which it executes as an excuse for discrediting its validity.

Of course he's wrong, both in his description of the test script and in his justification for discrediting it. But since his agenda likely had little to do with actually clearing the air - and was more in line with a classic Microsoft "hit" piece ordered from on high - we won't bother responding to Mr. White directly. We will, however, take a moment to explain how OfficeBench works and why it's garnered so much respect over the years:

  1. OfficeBench's origins can be traced to our sister company, Competitive Systems Analysis, Inc. It was designed by Mr. Randall C. Kennedy while his company was under contract to Intel's Desktop Architecture Labs (DAL). This was way back in the 1999/2000 timeframe, when CSA was responsible for a great deal of internal benchmarking and white paper development surrounding the Pentium III and Pentium 4 CPU launches, among other projects.

    Why is this important? Because it helps to establish the origins of the technology.

  2. The OfficeBench test script was designed from the beginning to be a "run anywhere" benchmark. By "run anywhere" we mean that the script will execute reliably under almost any Windows runtime environment. At the time it was being developed, this meant Windows 2000 and Terminal Server. As Windows evolved, so did OfficeBench; it now supports every version of Windows since 2000, including XP, Vista, Server 2003, Server 2008, all flavors of Terminal Server and all known application and desktop virtualization environments. When we say "anywhere," we really mean it.

    Why is this important? Because it allows us to test across multiple generations of Windows.

  3. OfficeBench is also version independent. That is, it's designed to work with any version of Microsoft Office. When it was originally conceived, the state of the art was Office 2000. Since then, Microsoft has rolled out 3 additional versions: XP, 2003 and, most recently, 2007. Once again, OfficeBench runs unmodified across all four versions of Office. Combine this with the support for the various Windows platform releases and you begin to see why OfficeBench is so powerful: It is the only test script of its kind that allows you to compare performance across multiple generations of Windows and Office. Mix them, match them - it just works.

    Why is this important? Because it allows us to test across multiple generations of Office.

  4. Mr. White's hatchet job aside, OfficeBench is in fact a fairly complex test script. For starters, it uses OLE automation to drive the applications. This is different from most test scripts, which rely on window messages or keystroke/mouse-click simulation. Using OLE automation has numerous benefits, including allowing us to run unmodified across Office versions. It also factors out any input-related anomalies while eliminating the chance that a UI change or third-party modification will somehow break the script.

    As for Mr. White's assertion that it's a simple "window-open, window-close" script, I offer the following summary of key OfficeBench tasks:

    a. Reformat all section headers and subheads in Word.
    b. Generate multiple chart objects in Excel.
    c. Generate a complete multi-slide presentation in PowerPoint.
    d. Multi-page scroll w/copy-paste of chart objects into Word.
    e. Slide sort/apply multiple templates in PowerPoint.
    f. Multi-page scroll/print preview/print-to-file in Word.
    g. Multi-chart print preview/print-to-file in Excel.
    h. Global search/replace in Word (multiple).
    i. Multi-slide preview/print-to-file in PowerPoint.
    j. Navigate simulated research web site in IE (multiple).

    Again, the above are just some highlights. There's a lot more going on than meets the eye, and the key is that it's the exact same set of tasks executing across all versions of Office.

    Why is this important? Because it shows that OfficeBench is a sophisticated test script that does more than merely "open and close windows."
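The structure described above - one fixed, ordered task list that the harness times identically no matter which Office version sits underneath - can be sketched like so. This is a hypothetical illustration, not the actual OfficeBench source; the task names are invented stand-ins, and in the real script each callable would drive Word, Excel or PowerPoint via OLE automation:

```python
import time

# Hypothetical task registry mirroring the kinds of steps listed above.
# Each entry is a stand-in no-op so the harness logic can be shown on its own.
TASKS = [
    ("word_reformat_headers",  lambda: None),
    ("excel_generate_charts",  lambda: None),
    ("ppt_build_presentation", lambda: None),
    ("word_paste_charts",      lambda: None),
    ("ie_research_browse",     lambda: None),
]

def run_suite(tasks):
    """Execute every task in order, timing each one. The task list never
    varies across Office versions, which is what makes results comparable."""
    results = {}
    for name, task in tasks:
        start = time.perf_counter()
        task()
        results[name] = time.perf_counter() - start
    return results

results = run_suite(TASKS)
```

The key design point is that version independence lives in the automation layer, not in the task list: the same sequence runs everywhere, so any timing delta reflects the platform, not the script.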

  5. OfficeBench doesn't exist in a vacuum. It's delivered as part of a sophisticated, extensible, multi-process testing framework we call DMS Clarity Studio. With DMS Clarity Studio, we provide a variety of scalable workload objects for testing everything from client/server database connections to MAPI-based message store access to streaming multimedia. OfficeBench has been engineered to run in parallel with these workloads, providing for a rich variety of targeted test scenarios spanning the range of Windows client and server platforms. It's all coordinated through the DMS Clarity Studio framework and also seamlessly integrated with the exo.performance.network's Clarity Analysis Portal.

    Note: DMS Clarity Studio is offered for free as part of the exo.performance.network. It's also part of the larger DMS Clarity Suite framework in use across thousands of trading workstations and other mission critical systems in the financial services sector. Some of the largest trading firms in the world trust us - Devil Mountain Software, Inc. - to tell them when their systems are under-performing. Enough said.

    Why is this important? Because it shows that OfficeBench is part of a proven testing ecosystem that spans the range of Windows platforms and runtime scenarios.

In summary, OfficeBench is much more than a simple "window-open, window-close" script. It is a sophisticated, version-independent benchmarking tool that executes reliably under almost any Windows runtime environment. As such, it is the only tool of its kind that allows IT organizations to accurately assess multi-generational performance across all versions of Windows and Office.

And that's why OfficeBench scares the hell out of Microsoft. For the first time ever, the industry has the tools necessary to call the company to the mat for its bloated, CPU cycle-sucking ways.

Microsoft's response? Slam the benchmark! Try to discredit the author/source! And crank-up the FUD machine!

Sorry, guys! You can run, but you can't hide from OfficeBench.

Read more...

Thursday, November 29, 2007

Community Snapshot 01: Who's Using What?

As an ongoing service to the greater Windows IT community, we're publishing the first of our weekly "snapshots" of OS and application usage rates as measured within our own little corner of the industry. These numbers are current as of 06:00 GMT and are based on a sample set of 1,106 contributor systems:

Figure 1 - Contributor Usage by OS Type/SKU

Interpretation: While Windows XP (55%) still makes up the lion's share of our user base, Windows Vista - in its various incarnations - is running a closing second (37%). We'll be checking back each week to chart the progress of XP, Vista and their variations over time, and also to monitor adoption rates for Vista SP1 and XP SP3 when they're released early next year.

Read more...

Wednesday, November 28, 2007

How to Make Vista Run Like XP (Sort of)

In an effort to further clarify our previous test results, we decided to experiment a bit with Vista to see how it would perform with a majority of its newer UI elements and background services turned off. In the process, we believe we've come up with a roadmap of sorts for how to "make Vista run like XP" (sort of):

  1. Shut Down "Unnecessary" Services - This means killing all those new Vista goodies, like SuperFetch and the WSearch indexer. CPU cycles are a precious commodity - use them wisely.
  2. Ditch the UI - Use the Advanced System Settings dialog to change the Appearance settings to "adjust for best performance." It's like taking a trip back in time...to 1995!
  3. Drop the Resolution/Color Depth - This helps to mitigate any sluggishness in the newer Vista-model drivers. 1024x768 with 16-bit color should be good for most video adapters - and it looks "real sharp" stretched across that new 21 inch LCD!
  4. Handicap XP - Most Windows XP users are running Office 2000, XP or 2003. Upgrade the XP config with Office 2007 to ensure that you get a nice, entirely unrepresentative (of the real world) hybrid scenario.

Do the above and you'll be rewarded with a much closer net experience. Instead of being ~2x slower than XP, Vista in this new, "bare metal" configuration is *only* (drum roll, please)...40% slower!

Bottom Line: Even with the OS stripped to the core, and with all of the new eye-candy and CPU-sucking background services turned off, Vista is *still* 40% slower than XP (SP3) at a variety of business productivity tasks. Time to ask Santa for that new PC for Christmas!

Read more...

Tuesday, November 27, 2007

Update: Re-Testing Vista w/2GB RAM, Office 2003

Many of our members have requested that we re-test Vista SP1 with 2GB of RAM instead of the 1GB we used in our original tests. So, without further delay, we present our revised results table:

image

Figure 1 - Revised OfficeBench Completion Times (Seconds)

Analysis: By providing Vista (SP1) with an additional 1GB of RAM (that's a total of 2GB for those of you keeping score) we managed to achieve a "whopping" 4% improvement in OfficeBench throughput.

Note: We added the Windows XP (SP3) results to the chart to add further context to the Vista results. As before, all tests were conducted on the same Dell XPS M1710 system w/2GHz Core 2 Duo CPU and DDR-2 667MHz RAM.

A few members voiced their concerns over the use of Office 2007 under Vista. They suggested we re-test using Office 2003 on both Vista and Windows XP. Here are the results:

image

Figure 2 - Office 2003 vs. Office 2007 Completion Times

Analysis: Moving from Office 2007 to Office 2003 definitely improved Vista's showing. Instead of being over 2x slower than XP on the same OfficeBench workload, Vista is now "only" 1.8x slower.

To quote Darth Vader: "Impressive...most impressive."
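For reference, the slowdown factors quoted in these posts are simple ratios of completion times. A minimal helper makes the convention explicit (the numbers below are illustrative placeholders, not the article's actual measurements):

```python
def slowdown(candidate_secs, baseline_secs):
    """How many times slower the candidate is than the baseline.
    These are completion times, so lower is better and a ratio of
    1.0 means parity."""
    return candidate_secs / baseline_secs

# Hypothetical completion times (not the article's data):
ratio = slowdown(180.0, 100.0)  # 1.8, i.e. "1.8x slower"
```

So "over 2x slower" means the candidate's completion time is more than double the baseline's, and moving from 2.0x to 1.8x is the improvement described above.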

Read more...

Friday, November 23, 2007

Windows XP SP3 Yields Performance Gains

After a disappointing showing by Windows Vista SP1 (see previous post), we were pleasantly surprised to discover that Windows XP Service Pack 3 (v.3244) delivers a measurable performance boost to this aging desktop OS. Testing with OfficeBench showed an ~10% performance boost vs. the same configuration running under Windows XP w/Service Pack 2.

image

Figure 1 - OfficeBench Completion Times
(In Seconds - Lower is Better)

Note: As with our Vista SP1 testing, we used the identical Dell XPS M1710 test bed with 2GHz Core 2 Duo CPU, 1GB of RAM and discrete nVidia GeForce Go 7900GS video.

Since SP3 was supposed to be mostly a bug-fix/patch consolidation release - unlike w/Vista SP1, Microsoft made no promises of improved performance for XP - the unexpected speed boost comes as a nice bonus. In fact, XP SP3 is shaping up to be a "must have" update for the majority of users who are still running Redmond's not-so-latest and greatest desktop OS.

Of course, none of this bodes well for Vista, which is now more than 2x slower than the most current builds of its older sibling. Suffice it to say that performance-minded users will likely choose to stick with the now even speedier Windows XP - at least until more "Windows 7" information becomes publicly available.

Windows Vista = Windows ME "Reloaded?" You be the judge!

Read more...

Sunday, November 18, 2007

Vista SP1 a Performance Dud

Note: We've updated our data set to include results from Vista with 2GB of RAM and also with Office 2003 instead of 2007. Check out our revised numbers here.


With the initial performance characteristics of Windows Vista leaving much to be desired (see our previous post on the subject), many IT organizations have put off deploying the new OS until the first service pack (SP1) is released by Microsoft early next year. The thinking goes that SP1 will address all of these early performance issues and somehow bring Windows Vista on par with - or at least closer to - Windows XP in terms of runtime performance.

Unfortunately, this is simply not the case. Extensive testing by the exo.performance.network (www.xpnet.com) research staff shows that SP1 provides no measurable relief to users saddled with sub-par performance under Vista.

How We Tested

The above conclusion is based on an analysis of the RC0 (v.658) build of Service Pack 1 for Windows Vista. Testing was conducted on a dual-core Dell notebook with 1GB of RAM. The staff ran a variety of test scenarios against both "before" (RTM w/no updates) and "after" (RTM w/SP1 installed) configurations, using the DMS Clarity Studio testing framework to capture scenario scoring and metrics data for upload to the exo.repository.

  • During office productivity testing, the staff used the DMS Clarity Studio OfficeBench test script to drive Microsoft Office 2007 through a scripted set of productivity tasks, including creating a compound document and supporting workbooks and presentation materials.
  • To test multitasking performance, the staff used the ADO, MAPI and WMP Stress modules - all part of DMS Clarity Studio - to generate a multi-process workload scenario involving client/server database, workflow and streaming media tasks.

Note: DMS Clarity Studio is available as a free download from the exo.performance.network (www.xpnet.com) site. Simply register for your free DMS Clarity Analysis Portal account to access these and other free tools from xpnet.

Test Results

During OfficeBench testing we noted a statistically insignificant delta (~2%) in favor of the SP1-patched configuration. CPU Saturation, Memory Pressure and I/O Contention factors were all comparable, as were process specific metrics - including the Thread Utilization and Thread Growth Potential Indices.

image

Figure 1 - OfficeBench Completion Times (Seconds)

The multitasking scenario was also comparable, with the ADO and MAPI Stress workloads showing a delta of less than 1% in favor of the SP1-patched configuration. As with the OfficeBench test scenario, system and process metrics for CPU, Memory and I/O were all nearly identical between the two configurations.

image

Figure 2 - ADO and MAPI Avg. Transaction Times (Seconds)

Note: For more information on the various system and process metrics employed in this article, please log in to your private DMS Clarity Analysis Portal site and refer to the Glossary section of the Online Help.

Not yet a member of the exo.performance.network? Sign up today! It's free, and you'll be helping us to build the world's first global repository of computer performance-related knowledge and data.

Conclusions

After extensive testing of both RTM and SP1-patched versions of Windows Vista, it seems clear that the hoped-for performance fixes that Microsoft has been hinting at never materialized. Vista + SP1 is no faster than Vista from the RTM image.

Bottom Line: If you've been disappointed with the performance of Windows Vista to date, get used to it. SP1 is simply not the panacea that many predicted. In the end, it's Vista's architecture - not a lack of tuning or bug fixes - that makes it perform so poorly on systems that were "barn-burners" under Windows XP.

Read more...

Thursday, November 15, 2007

What Intel Giveth, Microsoft Taketh Away

“What Intel giveth, Microsoft taketh away.” Such has been the conventional wisdom surrounding the Windows/Intel (“Wintel”) duopoly since the early days of Windows 95. In practical terms, it means that performance advancements on the hardware side are quickly consumed by the ever-increasing complexity of the Windows/Office code base. Case in point: Microsoft Office 2007 which, when deployed on Windows Vista, consumes over 12x as much memory and nearly 3x as much processing power as the version that graced PCs just 7 short years ago (Office 2000).

But despite years of real-world experience with both sides of the duopoly, few organizations have taken the time to directly quantify what my colleagues and I at Intel used to call “The Great Moore’s Law Compensator (TGMLC).” In fact, the hard numbers above represent what is perhaps the first ever attempt to accurately measure the evolution of the Windows/Office platform in terms of real-world hardware system requirements and resource consumption.

Over the next several sections I hope to further quantify the impact of TGMLC and to track its effects across four distinct generations of Microsoft’s desktop computing software stack. To accomplish my goal I’ll be employing a cross-version test script – OfficeBench – and executing it against different combinations of Windows and Office: Windows 2000 + Office 2000; Windows XP (SP1) + Office XP; Windows XP (SP2) + Office 2003; and Windows Vista + Office 2007. Tests will first be conducted in a controlled virtual machine environment under VMware and then repeated on different generations of Intel desktop and mobile hardware to assess each stack’s impact on hardware from the corresponding era.
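The OS-plus-Office pairings and test environments described above form a simple matrix. As a sketch (names only - VM provisioning and hardware setup happen outside the script, and this is an illustration rather than the actual test harness):

```python
# The four stack generations under test, oldest to newest.
STACKS = [
    ("Windows 2000",     "Office 2000"),
    ("Windows XP (SP1)", "Office XP"),
    ("Windows XP (SP2)", "Office 2003"),
    ("Windows Vista",    "Office 2007"),
]

ENVIRONMENTS = ["VMware (controlled)", "era-appropriate hardware"]

def build_matrix(stacks, environments):
    """Every stack runs in every environment. Because OfficeBench is
    version-independent, the same script covers all cells."""
    return [(os_name, office, env)
            for os_name, office in stacks
            for env in environments]

matrix = build_matrix(STACKS, ENVIRONMENTS)  # 4 stacks x 2 environments
```

Running the identical script across every cell is what isolates the variable of interest: the software stack itself.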

Click Image to View Our Interactive Results Table

About OfficeBench: The OfficeBench test script is a version-independent benchmark tool that uses OLE automation to drive Microsoft Word, Excel, PowerPoint and Internet Explorer through a series of common business productivity tasks. These include assembling and formatting a compound document and supporting workbooks and presentation materials, as well as data-gathering through simulated browsing of a web-based research database. OfficeBench is available for free download from the exo.performance.network (http://www.xpnet.com/) web site as part of the DMS Clarity Studio testing framework.

The Stone Age

Back in 1999, when I was working as an advisor to Intel’s Desktop Architecture Labs (DAL), I remember how thrilled we all were to get our hands on Windows 2000 and Office 2000. Finally, a version of the Windows/Office stack that could leverage all of the desktop horsepower we were building into the next-generation Pentium 4 platform. I remember it was also the first time I had a fully-scriptable version of the Office suite to work with (previous versions had supported OLE automation only in Word and Excel). Shortly thereafter, the first version of OfficeBench was born and I began my odyssey of chronicling TGMLC through the years.

First off, let me characterize the state of the art at the time. The Pentium 4 CPU was about to be unveiled and the standard configuration in our test labs was a single-CPU system with 128MB of RDRAM and an IDE hard disk. While a joke by today’s standards, this was considered a true power-user configuration suitable for heavy number-crunching or even lightweight engineering workstation applications. It was also only marginally faster than the previous-generation Pentium III, a fact that Intel tried hard to hide by cranking up the CPU clock to 1.5GHz and turning its competition with rival AMD into a drag race. It’s a decision that would come back to haunt them well into the next decade.

Sadly, I didn’t have access to an original Pentium 4 system for this article. My engineering test bed was long ago scrapped for parts, and I doubt that many of these old i840 chipset-based boxes are still in use outside of the third world. However, we can at least evaluate the software stack itself. Through the magic of virtualization we can conclude that, even with only 128MB of RAM, a Windows 2000-based configuration had plenty of room to perform. During OfficeBench testing, the entire suite consumed only 9MB of RAM, while the overall OS footprint never exceeded 50% of the available memory. Clearly this was a lean, mean version of Windows/Office and it chewed through the test script a full 17% faster than its nearest competitor, Windows XP (SP1) + Office XP.

The Bronze Age

The introduction of Windows XP in 2001 marked the first mainstream (i.e. not just for business users) version of Windows to incorporate the Windows “NT” kernel. In addition to improved Plug & Play support and a host of other refinements, XP sported a revamped user interface with true-color icons and lots of shiny, beveled effects. Not wanting to look out of style, and also smelling another sell-up opportunity, the Office group rushed out Microsoft Office XP (a.k.a. “Office 10”), which was nothing more than a slightly tweaked version of Office 2000 with some UI updates.

Hardware had evolved a bit in the two years since the Windows 2000 launch. For starters, Intel had all but abandoned its ill-fated partnership with RAMBUS. New Intel designs featured the more widely supported DDR-SDRAM, while CPU frequencies were edging above 2GHz. Intel also upped the L2 cache size of the Pentium 4 core from 256KB to 512KB (i.e. the “Northwood” redesign) in an attempt to fill the chip’s stall-prone 20 stage integer pipeline. Default RAM configurations were now routinely in the 256MB range while disk drives sported ATA-100 interfaces.

Windows XP, especially in the pre-Service Pack 2 timeframe, wasn’t all that much more resource-intensive than Windows 2000. It wasn’t until later, as Microsoft piled on the security fixes and users started running anti-virus and anti-spyware tools by default, that XP began to put on significant “weight.” Also, the relatively modest nature of the changes from Office 2000 to Office XP translated into only a minimal increase in system requirements. For example, the overall working set size for the entire suite during OfficeBench testing under VMware was only 1MB higher than Office 2000, while CPU utilization actually went down 1% across the three applications (Word, Excel and PowerPoint). This did not, however, translate into equivalent performance. As I noted before, Office XP on Windows XP took 17% longer than Office 2000 on Windows 2000 to complete the same OfficeBench test script.

I was fortunate enough to be able to dig up a representative system of that era: a 2GHz Pentium 4 system with 256MB of RAM and integrated Intel Extreme graphics (another blunder by the chip maker). Running the combination of Windows XP (SP1) and Office XP on bare iron allowed me to evaluate additional metrics, including the overall stress level being placed on the CPU. By sampling the Processor Queue Length (by running the DMS Clarity Tracker Agent in parallel with Clarity Studio and OfficeBench), I was able to determine that this legacy box was only moderately stressed by the workload. With an average Queue Length of 3 ready threads, the CPU was busy but still not buried under the computing load. In other words, given the workload at hand, the hardware seemed capable of executing it while remaining responsive to the end-user (a trend I saw more of as testing progressed).

The Industrial Revolution

Office 2003 arrived during a time of real upheaval at Microsoft. The company’s next major Windows release, code named “Longhorn,” was behind schedule and the development team was being sidetracked by a string of security breaches in the Windows XP code base. The resulting fix, Windows XP Service Pack 2, was more of a re-launch than a mere update. Whole sections of the OS core were either replaced or rewritten, and new technologies – like Windows Defender and a revamped firewall – added layers of code to a rapidly bloating platform.

Into this mess walked Office 2003 which, among other things, tried to bridge the gap between Windows and the web through support for XML and the ability to store documents as HTML files. Unlike Office XP, Office 2003 was not a minor upgrade but a major overhaul of the suite. And the result was, not surprisingly, more bloating of the Windows/Office footprint. Overall memory consumption went up modestly to 13MB during OfficeBench testing while CPU utilization remained constant vs. previous builds, this despite the fact that the suite was spinning an extra 4 execution threads (overall thread count was up by 15).

Where the bloat took its toll, however, was in raw application throughput. Completion times under VMware increased another 8% vs. Office XP, putting the Windows XP (SP2) + Office 2003 combination a full 25% off the pace of the original Windows 2000/Office 2000 numbers from 3 years earlier. In other words, with all else being equal – hardware, environment, configuration – Microsoft’s desktop computing stack was losing in excess of 8% throughput per year due to increased code path complexity and other delays.
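That per-year figure can be sanity-checked two ways - simple division, or a compounded rate. A worked calculation assuming the 25% total slowdown over 3 years quoted above (illustrative arithmetic only, not a new measurement):

```python
total_slowdown = 0.25  # completion times 25% longer after 3 years
years = 3

# Simple (linear) average: just divide the total by the span.
simple_rate = total_slowdown / years  # ~8.3% per year

# Compounded rate: the per-year factor that multiplies out to 1.25.
compound_rate = (1 + total_slowdown) ** (1 / years) - 1  # ~7.7% per year
```

Either way the answer lands near the 8%-per-year mark cited in the text; the linear average is what puts it "in excess of 8%."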

Of course, all else was not equal. Windows XP (SP2) and Office 2003 were born into a world of 3GHz CPUs, 1GB RAM, SATA disks and Simultaneous Multithreading (a.k.a. Hyper-Threading). This added hardware muscle served to offset the growing complexity of Windows/Office, allowing a newer system to achieve OfficeBench times slightly better (~5%) than a legacy Pentium 4 system, despite the latter having a less demanding code path (TGMLC in action once again).

Welcome to the 21st Century

Given the extended delay of Windows Vista and its accompanying Office release, Microsoft Office System 2007, I was understandably concerned about the level of bloat that might have slipped into the code base. After all, Microsoft was promising the world with Vista, and early betas of Office showed a radically updated interface (the Office “Ribbon”) as well as a new, open file format and other nods to the anti-establishment types. Little did I know that Microsoft would eventually trump even my worst predictions: Not only is Vista + Office the most bloated desktop software stack ever to emerge from Redmond, its system requirements are so out of proportion with recent hardware trends that only the latest and greatest from Intel or AMD can support its epically porcine girth.

Let’s start with the memory footprint. The average combined working set for Word, Excel and PowerPoint when running the OfficeBench test script is 109MB. By contrast, Office 2000 consumed a paltry 9MB, which translates into a 12x increase in memory consumption (i.e. 170% per year since 2000). To be fair, previous builds of Office benefited from a peculiar behavior common to all pre-Office 12 versions: When minimized to the Task Bar, each Office application would release much of its non-critical working set memory. This resulted in a much smaller memory footprint, as measured by the Windows performance counters (which are employed by the aforementioned DMS Clarity Tracker Agent).

Microsoft has discontinued this practice with Office 2007, resulting in much higher average working set results. However, even factoring in this behavioral change, the working set for Office 2007 is truly massive. Combined with an average boot-time image of over 500MB for just the base Windows Vista code base, it seems clear that any system configuration that specifies less than 1GB of RAM is a non-starter with this version. And none of the above explains the significantly higher CPU demands of Office 2007, which are nearly double (73% vs. 39%) those of Office 2003. Likewise, the number of execution threads spawned by Office 2007 (32) is up, as is the total thread count for the entire software stack (615 vs. 370 - again, almost double the previous version).

Clearly, this latest generation of the Windows/Office desktop stack was designed with the next generation of hardware in mind. And in keeping with the TGMLC pattern, today’s latest and greatest hardware is indeed up to the challenge. Dual (or even quad) cores, combined with 4MB or more of L2 cache, have helped to sop up the nearly 2x greater thread count, while 2GB standard RAM configurations are mitigating the nearly 1GB memory footprint of Vista + Office 2007.

The net result is that, surprise, Vista + Office 2007 + state of the art hardware delivers throughput that’s nearly on par (~22% slower) with the previous generation of Windows XP + Office 2003 + the previous state of the art hardware. In other words, the hardware gets faster, the code base gets fatter and the user experience, as measured in terms of application response times and overall execution throughput, remains relatively constant. The Great Moore’s Law Compensator is vindicated.

Conclusion

As I stated in the beginning, the conventional wisdom regarding PC evolution could be summed up in this way: “What Intel giveth, Microsoft taketh away.” The testing I conducted here shows that the wisdom continues to hold true right up through the current generation of Windows Vista + Office 2007. What’s shocking, however, is the way that the IT community as a whole has grown to accept the status quo. There is a sense of inevitability attached to the concept of the “Wintel duopoly,” a feeling that the upgrade treadmill has become a part of the industry’s DNA. Forces that challenge the status quo – Linux, Google, OS X – are seen as working against the very fabric of the computing landscape.

But as recent events have shown us, the house that “Wintel” built exists largely because of a fragile balance between hardware evolution and software complexity. When that balance gets out of whack – as was the case when Vista became delayed, leaving Intel with a hard sell for many of its newer offerings – the house can quickly destabilize. And when that happens, it will be up to one of the aforementioned outside forces to seize the initiative, topple the “Wintel” structure and perhaps change the very nature of desktop computing as we know it.

Read more...

Saturday, September 22, 2007

The Great Virtual PC 2007 CPU Gobble!

One of the unexpected side effects of moving from Virtual PC 2007 to Virtual Server as a development and testing environment is lower CPU utilization. For some reason, when I load up a test scenario (two client VMs collecting data and uploading to a single server VM) in Virtual PC 2007, the virtualpc32.exe process hosting the scenario chews up 50% or more of the available CPU cycles on my dual-core workstation. By contrast, when I load up the same scenario under Virtual Server, the VMs chew up almost no CPU, which is what I would have expected given that the workloads running within them are very light (i.e. Task Manager inside the Guest OS sessions shows nearly zero CPU utilization).

I'm going to forward my findings to Ben Armstrong (i.e. the "Virtual PC Guy" at Microsoft) for analysis; however, I already have a theory about the source of the excessive CPU utilization. For starters, it seems isolated to the client VMs, both of which are running a high-resolution monitoring agent (DMS Clarity Metrics Tracker). This agent makes frequent calls to PDH, WMI and the Registry, and I'm guessing that Virtual PC 2007 generates a lot more overhead when processing these calls than Virtual Server does. The agent also uses a set of high-resolution timer objects that likewise seem to give Virtual PC fits.

As a simple test, I tried disabling the agents on each VM. CPU utilization for virtualpc32.exe immediately dropped to below 10%. Case closed.

Bottom Line: For testing applications that use high-resolution timers, or that make frequent calls to certain system libraries, Virtual Server 2005 R2 SP1 does a much better job of handling what should normally be a very fast, lightweight operation. And with VMRC Plus, you don't have to sacrifice usability in order to reap the rewards of better real-time application support under Virtual Server. Read more...
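
To see why a timer-happy agent can be so expensive under virtualization, consider that every high-resolution clock read is a privileged operation the hypervisor may need to intercept. The sketch below (plain Python, purely illustrative - it is not the DMS agent) times a burst of clock reads; run it natively and then inside a VM and compare the per-call cost.

```python
import time

def measure_call_overhead(calls=100_000):
    """Time a burst of high-resolution clock reads, similar in spirit
    to what a fine-grained monitoring agent does all day long."""
    start = time.perf_counter()
    for _ in range(calls):
        time.perf_counter()  # each read may trap to the (virtualized) clock source
    elapsed = time.perf_counter() - start
    return elapsed / calls  # average seconds per clock read

per_call = measure_call_overhead()
print(f"average cost per clock read: {per_call * 1e9:.0f} ns")
```

If Virtual PC is intercepting and emulating the underlying timer hardware on every read, the per-call figure inside the guest should be dramatically higher than on bare metal - which would square with the behavior described above.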

Friday, June 22, 2007

Vista + Virtualization = Poor Performance

Everyone knows by now that Vista is slower than Windows XP. In fact, my own testing shows it to be roughly twice as slow on the same hardware (see our upcoming Test Center study for more details). That's because Vista is a far more complex operating system, with many additional features and background services that simply don't exist under XP.

What many users don't know, however, is that this gap widens even further under virtualization. Fire up Vista under VMware or Virtual PC and you'll find that the delta is more like 3x - i.e. tasks take over three times as long to complete under virtualized Vista as they do under virtualized XP.

Let that preceding statement sink in for a minute. In real world terms it means that applications are taking as much as a 50% greater performance hit from being virtualized than would be expected given the aforementioned 2x delta in a native, non-virtualized comparison. As a veteran IT professional, I would expect to take a hit moving to Vista - just not one that's so out of whack with the established norms.
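
The arithmetic behind that 50% figure is worth making explicit. A quick sanity check (the 2x and 3x deltas are the measured values discussed above):

```python
native_delta = 2.0   # Vista/Office 2007 vs. XP/Office 2003, native hardware
virtual_delta = 3.0  # same comparison, run inside a VM

# If virtualization cost both OSes proportionally, the virtual delta would
# match the native one. The excess is Vista's extra hit under a hypervisor.
extra_hit = virtual_delta / native_delta - 1.0
print(f"extra virtualization penalty for Vista: {extra_hit:.0%}")  # → 50%
```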

Clearly, there's something going on here that makes Vista particularly difficult to virtualize. I have some theories, however, at this point I can only go on what I've observed. And that is:

1. When testing Vista on a Dell OptiPlex 745 with 4GB of RAM, the performance delta - as measured by the OfficeBench test script - is roughly 2x. The script took twice as long to complete under the Vista/Office 2007 combination as it did under Windows XP/Office 2003.

2. When repeating these tests under VMware Workstation 6.0 and Virtual PC 2007, running on two different hardware test beds (the aforementioned OptiPlex and an XPS M1710 laptop), the script took 3x longer under Vista.

3. All of the tests were conducted using OfficeBench, which is part of the Clarity Studio test framework that is freely available via the exo.performance.network web site (www.xpnet.com).

For the record, I was skeptical of the original results, so much so that I re-ran the VMware scenarios multiple times and then confirmed them against a different installation running under Virtual PC 2007. I also presented my findings to both VMware and Microsoft, but neither could explain the phenomenon I observed.

Bottom Line: Vista is significantly slower under virtualization than it should be, and I'll be damned if I know why.

Anyone have an idea what might be going on here? Anyone? Bueller? B-u-e-l-l-e-r? Read more...

Thursday, May 17, 2007

The Vista Aero vs. Battery Life Myth

Lately, there has been a rash of unscientific reporting around the issue of Windows Vista and notebook battery consumption. Some customers have apparently reported decreased battery life under Vista, however, most of these reports have been entirely anecdotal in nature: Someone quoting someone else who claims that some notebook has somehow lost some of its battery life under some version of Vista (wow, that's a lot of...err..."something").

Many pundits have pointed to Vista's Aero Glass UI as the source of the power-drain. They assume (wrongly, as it turns out) that rendering the UI via a dedicated graphics processing chip - instead of using the primary CPU to draw a series of bitmap images - somehow consumes more battery power. Since my own experiences with Vista (six months as my full-time OS on three different notebooks, with Aero enabled on all of them) seem to contradict these reports, I decided to do some objective benchmarking to set the record straight.

To ensure a representative test bed I selected two systems from different vendors operating at opposite ends of the notebook power/performance spectrum:

1. A Dell XPS M1710 with 2GHz Core 2 Duo (T7200) CPU, nVidia GeForce 7900GS graphics, 2GB of DDR-2 (667MHz) RAM and an 80GB, 7200RPM hard disk. This is hardly a "power-miser" rig. In fact, the various components - in particular, the 7900GS card - are notoriously power-hungry, at least during 3D graphics/gaming tasks.

2. A Lenovo ThinkPad R60e with 1.66GHz Core 2 Duo (T5500) CPU, integrated Intel 945 series graphics, 2GB of DDR-2 (533MHz) RAM and a 60GB, 5400 RPM hard disk. This is more of a mainstream system for business class users. It lacks the power-hungry discrete graphics, oversized LCD screen, etc.

The test consisted of multiple iterations (10x) of the OfficeBench test script (part of the DMS Clarity Suite toolset - see http://www.xpnet.com/). I configured both systems to use Vista's Power saver battery scheme and also configured OfficeBench to pause frequently (1-3 seconds per test section) to allow an opportunity for the CPU to throttle-down and/or power-management features to kick-in.

Starting from a fully-charged (100% as reported by Vista's battery meter) state, I pulled the plug on each system and allowed them to complete the test script. I then repeated this sequence, only this time I manually disabled desktop composition (i.e. turned off the Aero UI) via the Compatibility tab in the Clarity Studio application shortcut. This caused Vista to stop the "dwm" service and render the entire script workload - the application windows, dialogs, animations and transitions - using the older, non-GPU-accelerated model.

As I suspected, the battery consumption for the non-Aero scenario was within 1-2% of the consumption with Aero enabled. In other words, disabling Aero had little or no measurable impact on battery consumption under Windows Vista Ultimate when running a mix of common business productivity (Internet Explorer, Word, Excel and PowerPoint) applications.

So much for that myth... Read more...

Tuesday, April 17, 2007

Quad Core Nostalgia

As I watch Intel launch its latest quad-core CPU I can't help but wax nostalgic about my time as a contract test engineer for the company's Desktop Architecture Labs (DAL). It was early 2000 and the first Pentium 4 was still in preproduction testing. I had just received my prototype system for evaluation - an 800MHz box with dual-channel RAMBUS (remember those guys?) RDRAM. I knew they were in trouble when my first round of tests - mostly linear office productivity tasks - showed the chip to be 30-40% *slower* than its predecessor, the Pentium III. I communicated my findings back to Intel, and they blamed it on a buggy BIOS and a poorly tuned chipset.

Weeks later, as I evaluated the now 1.5GHz production-level chip, I became convinced that a traditional linear benchmark approach wasn't going to cut it. The Pentium 4's longer pipeline simply clobbered throughput, with most tests showing performance barely on par with the older P6 core. Fortunately, I was already hard at work on my first parallel-processing test suite, Benchmark Studio, and tests with multiple, concurrent tasks had the Pentium 4 pulling away from the Pentium III by a sizable margin.

Once again, I communicated my findings to Intel, even suggesting a possible marketing spin for the data: More performance for demanding workloads. It would have dovetailed nicely with the related work I had been doing around the company's "Constant Computing" initiative, however, ultimately my findings were canned. Apparently, my message of "more torque for heavy multitasking loads" (i.e. the "SUV" argument) wasn't sexy enough. They wanted a "sports car" message. Better linear performance. Ever higher clock speeds (4GHz was the long term goal). The rest, as they say, is history.

Of course, the Pentium 4 architecture (a.k.a. "NetBurst") ultimately flopped, allowing AMD to eat Intel's lunch for many years. When Intel finally dumped NetBurst in favor of a revamped Pentium III design (a.k.a. Core 2), the industry had finally caught up with where I was over 7 years ago. Now, virtually all business productivity benchmarks emphasize parallel execution performance, a necessity now that most CPUs have 2 or more cores on board. Symmetrical Multiprocessing (SMP), once the purview of engineering workstations, is now de rigueur, and the current mainstream OS - Windows XP - is completely at home on multiple CPUs.

I guess I can take some small measure of satisfaction in knowing that I was right about where benchmarking was headed, and that if Intel had followed my lead they might have fared better (at least in terms of marketing success).

Note: You can download the latest incarnation of my test suite, Clarity Studio, for free from the exo.performance.network (www.xpnet.com) web site. Read more...

Tuesday, March 20, 2007

Vista Aero: What a CPU Hog!

Microsoft's new Aero Glass GUI - one of the cornerstones of the company's Windows Vista marketing message - is a joy to behold. Aero's sleek, semi-transparent facades serve to enhance the user experience by providing a stronger sense of "depth" and cohesion. Combined with Vista's enhanced UI metaphors (love those "breadcrumbs" in Explorer), Aero is a major improvement over the XP GUI.

It's also a CPU hog. Despite Microsoft's claims about leveraging 3D accelerator technology to offload the GUI workload, Aero still chews up more CPU cycles (an average of 22%) with desktop composition enabled (i.e. 3D accelerated mode) than with it disabled (i.e. non-accelerated "legacy" mode). In other words, turn on the "bling" and you toss nearly a quarter of your CPU bandwidth out the window.

Note: I measured the above using the recently updated DMS Clarity Tracker Agent, which is now part of the new public exo.performance.network (xpnet.com) project. You can reproduce the test scenario by downloading the companion DMS Clarity Studio tool (also at xpnet.com) and running the OfficeBench test script, first with composition enabled and then with it disabled.

Other interesting tidbits:

  • The ratio of the increased CPU overhead is roughly 4:1 in favor of User vs. Privileged (i.e. kernel mode) time - a good thing.

  • That User mode bump can be traced directly to the participating applications - for example, Word 2007 used 48% more CPU time with composition enabled.

  • Enabling desktop composition also causes Windows to chew another 16% of the available physical RAM, as measured by the Committed Bytes counter.

  • Though CPU overhead was higher with composition enabled, overall script completion times - as measured by OfficeBench - were only 3% slower.

  • One alarming statistic: Processor Queue Length, a measure of how many threads are queued and waiting for CPU time, increases by 28% with composition enabled - not good, especially for multitasking.
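
For readers who want to eyeball a User vs. Privileged split on their own machine, here is a minimal sketch using only Python's standard library (the 4:1 figure above came from the DMS Clarity Tracker agent, not from this snippet - os.times() is just a crude stand-in):

```python
import os

def cpu_split(work_iterations=2_000_000):
    """Burn some user-mode CPU, then report the user vs. system (kernel) split."""
    before = os.times()
    total = 0
    for i in range(work_iterations):  # pure user-mode arithmetic, no syscalls
        total += i * i
    after = os.times()
    user = after.user - before.user
    system = after.system - before.system
    return user, system

user, system = cpu_split()
print(f"user: {user:.3f}s, system (privileged): {system:.3f}s")
```

A workload dominated by user time, like the one above, leaves the kernel free to service other threads; a heavy Privileged-mode bump would be a much bigger multitasking concern, which is why the 4:1 ratio counts as good news.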

Bottom Line: For single-tasking users running generic productivity application scenarios, the added overhead from enabling desktop composition shouldn't be much of a factor. However, more demanding users - especially those needing maximum performance - may find that turning composition off lets them regain some of those long-lost CPU cycles.

I guess the old saw still applies: What Intel (and AMD) giveth, Microsoft taketh away...

Read more...

Saturday, March 10, 2007

The Secret Life of SuperFetch

One of the more mysterious new features of Windows Vista is its SuperFetch memory management subsystem. Billed as a "smart" pre-caching mechanism, SuperFetch is supposed to improve system responsiveness by monitoring application usage patterns and then pre-loading application code in anticipation of the next task. SuperFetch uses the time of day and other behavioral markers to determine what to load when, using the Windows memory manager's Standby List as its entry point (the ever illuminating Mark Russinovich goes into more detail in his recent TechNet article).

Of course, it all sounds good on paper. But how do you quantify the impact of something that's designed to work in the background and to be essentially undetectable (beyond some vague sense that the OS is more responsive)? For starters, it helps to know where to find it. SuperFetch runs as a Windows Service under the SVCHOST alias. The actual filename is sysmain.dll, so a quick scan of Task Manager to correlate the Process ID with the instance of SVCHOST in question gets you in touch with SuperFetch.
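
If you'd rather not hunt through Task Manager by hand, the correlation step can be scripted. The sketch below parses the kind of output "tasklist /svc" produces (the sample text and PIDs here are made up for illustration - on Vista the SuperFetch service appears under the name "SysMain"):

```python
# Sample output mimicking "tasklist /svc"; the PIDs are invented.
SAMPLE_TASKLIST = """\
Image Name                     PID Services
========================= ======== ============================================
svchost.exe                    824 DcomLaunch, PlugPlay
svchost.exe                    912 RpcSs
svchost.exe                   1048 SysMain, Themes
"""

def find_service_pid(tasklist_output, service_name):
    """Return the PID of the svchost.exe instance hosting the named service."""
    for line in tasklist_output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "svchost.exe":
            pid = int(parts[1])
            # Normalize the comma-separated service list into bare names
            services = " ".join(parts[2:]).replace(",", " ").split()
            if service_name in services:
                return pid
    return None

print(find_service_pid(SAMPLE_TASKLIST, "SysMain"))  # → 1048
```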

Next, you need a way to monitor both SuperFetch's behavior and its impact on system responsiveness. The first part is easy - any Vista-compatible metrics agent will do the trick, though we prefer the one we provide through the xpnet: DMS Clarity Tracker. Measuring the second part - how SuperFetch impacts the system - is a bit trickier.

Here we used another xpnet tool, DMS Clarity Studio, to generate a scripted productivity workload spanning Microsoft Word, Excel, PowerPoint and Internet Explorer. By comparing before/after results with SuperFetch enabled/disabled (and rebooting after each test run) we were able to determine that Microsoft's new VMM magic is indeed having a positive impact on application responsiveness. With the service enabled, application startup times - as measured by the OfficeBench test script - were cut in half, this after multiple "training" runs to allow SuperFetch to map the usage pattern (i.e. "We run Office after booting").

We'll be conducting additional research and testing around SuperFetch and other new Vista technologies (Integrated Search, ReadyBoost) in future blog entries for the exo.performance.network. Stay tuned! Read more...

Friday, March 9, 2007

Monster Excel Workbooks Exposed

Everyone knows that Microsoft Office is a bit of a memory hog. In fact, few products can claim as much credit for driving the memory upgrade cycle as the ubiquitous combination of Word, Excel, PowerPoint and Outlook. However, while many power users may think they’ve pushed the envelope on one or more of these applications – massive documents, huge spreadsheets, media-rich presentations – none can compare to those captains of industry that make their home at the corner of Wall and Broadway.

I’m referring, of course, to financial services traders. The “Type A” personality crowd – risk takers, deal makers, the rock stars of wealth creation. They live life on the edge, balancing risk vs. reward in constant battle with each other and the market itself. And the fuel that drives their engines is…data. Lots and lots of data – analyzed, quantified and extrapolated in nearly every conceivable way.

Massaging that data falls on the shoulders of Microsoft Excel. Through myriad templates and macros and real-time data connections, these traders push Excel to the extreme as they tweak and tune their customized (and highly proprietary) financial models. All of which consumes a tremendous amount of hardware resources. In at least one shop we found that traders ran, on average, six concurrent instances of the Excel application process, each one occupying a peak memory footprint of 300-500MB during a normal trading session (for a total of at least 1.8GB).

CPU utilization was also high, with each instance consuming ~60% of the available CPU cycles on a 4-CPU workstation, or 300% out of a total CPU capacity of 400% (100% x 4 CPU). Then there was the thread count. At any given time these systems were asked to juggle up to 230 concurrent execution threads just from the various Excel instances.
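
Those figures multiply out quickly. A back-of-the-envelope check of the aggregate Excel footprint, using the per-instance numbers observed above:

```python
instances = 6
footprint_low_mb, footprint_high_mb = 300, 500  # peak per-instance, as observed

total_low_gb = instances * footprint_low_mb / 1024
total_high_gb = instances * footprint_high_mb / 1024
print(f"aggregate Excel footprint: {total_low_gb:.1f}-{total_high_gb:.1f} GB")
# → aggregate Excel footprint: 1.8-2.9 GB
```

The 1.8GB total cited corresponds to the low end of that range; at the 500MB peaks, the same six instances approach 3GB of committed memory from Excel alone.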

That’s approximately 1/2 the total thread workload for a typical business productivity user, yet this is just one application among many. These systems are also running proprietary trading software (plus various real-time feeds, like Bloomberg), which is why even a high-end PC isn’t adequate. Hence their reliance on top-of-the-line workstation hardware. And even then, with dual-cores and gigabytes of memory, many traders still need more than one system in order to handle their computational load. In fact, it’s not uncommon to find 3 or more high-end boxes under a trader’s desk – thanks in large part to the overhead of their massive Excel workbooks.

1.8GB of RAM. 300% CPU utilization. 230 concurrent execution threads. And you thought your spreadsheets were complicated! Read more...

Wednesday, February 21, 2007

DreamScene? More like a “Nightmare on Vista Street!”

Today I installed “DreamScene” – and kissed my CPU cycles goodbye. The official marketing moniker for Microsoft’s seductively frivolous new “motion desktop” technology (one of those Ultimate Extras we’ve been promised since Vista went RTM), DreamScene transforms Windows’ staid desktop wallpaper mechanism into a seamless, animated backdrop for your workaday tasks. And like much of the Vista Aero user experience, DreamScene is pure eye candy with little or no practical benefit.

Yes, having an MPEG or WMV video play seamlessly in the background – with full support for window transparency and other Aero goodness – is cool. Having Windows Explorer chew up 10-15% of your available CPU bandwidth in order to accommodate DreamScene’s frivolity, on the other hand, is decidedly un-cool. And the mechanism itself is buggy as hell. Case in point: Every time I suspend/resume my Dell XPS M1710 “notebrick” the driver for its integrated nVidia GeForce 7900 GS graphics adapter crashes. Or at least that’s what Vista is telling me once it recovers. What I see when it happens is a blank screen that makes me want to reach for the power switch (I assumed it had hung like XP was wont to do in similar situations – old habits die hard).

Of course, I could use the above scenario to point out one of the nicer features of the revised Vista driver model – namely, the ability to recover from many driver-related errors thanks to a modular architecture that moves much of the non-critical code outside of the kernel. But this is about DreamScene, and from where I’m standing it looks like a real nightmare for corporate IT.

My advice: Outlaw DreamScene, at least until Microsoft agrees to reimburse you for all those lost CPU cycles and panicked calls to the help desk. And to think, we gave up WinFS…for this! Read more...