Memory subsystem performance
We'll start, as ever, with some quick synthetic tests of the memory subsystem, which will help give us the lay of the land before we dive into our real-world benchmarks.

The QX9650 easily surpasses the QX6850 here, probably because it can prefetch more data into its larger L2 cache and thus effectively transfer more data. There are clear striations here among the Intel processors based on bus speed, with the CPUs on the 1066MHz bus at the back of the pack. The top spots all go to Athlon 64 processors, whose integrated memory controllers are very tough to beat with a front-side bus-based system architecture.

This useful little test gives us a look at L2 cache bandwidth. You'll notice that it's multithreaded, so systems with more cores show up as having higher L2 cache bandwidth. The test measures the combined bandwidth of every core and cache in the system, not just one processor or cache. As a result, the dual-socket quad-core Xeon X5365 (65nm) soars above everything else. We've included this system because it was marketed to enthusiasts as part of Intel's "V8" media creation platform, an answer of sorts to AMD's dual-socket Quad FX platform, represented here by the Athlon 64 FX-74. I'm happy to be able to include these systems as a curiosity, especially since the FX-74 is AMD's only quad-core solution for the desktop, but both have quirky performance drawbacks as well as benefits that I won't discuss in too much detail, lest they become a distraction. Besides, as I've mentioned before, the Xeons are total show-offs.

Back to the QX9650: its L2 cache bandwidth mirrors that of its 65nm predecessor until we reach the 16MB test block size, where its larger L2 cache grants it a slight advantage.

The QX9650's memory access latencies also mirror those of the QX6850, despite the QX9650's larger L2 cache. That's impressive, though perhaps not quite as impressive as the roughly 15ns advantage the Athlon 64 X2's integrated memory controller gives it.

We can look at this issue in a little more detail. In the graphs below, yellow represents L1 cache, light orange is L2 cache, and dark orange is main memory.

We measured the QX9650's 6MB L2 cache latency at 15 cycles, just one cycle more than the smaller 4MB L2 cache in the QX6850. Larger caches tend to bring latency penalties with them, but the smarter L2 in Penryn has barely any penalty at all. That helps explain why the QX9650's memory access latencies are effectively equivalent to those of the older chips.
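For a sense of scale, here is a quick sketch that converts those cycle counts into nanoseconds. It assumes both chips run at a 3.0GHz core clock, a figure that isn't stated in this section, and the 14-cycle number for the QX6850 is simply derived from the "one cycle more" comparison above. The function and variable names are made up for illustration.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in core clock cycles to nanoseconds."""
    # One cycle at f GHz lasts 1/f nanoseconds.
    return cycles / clock_ghz


# Assumption: 3.0GHz core clock for both processors (not stated in this section).
CLOCK_GHZ = 3.0

qx9650_l2_ns = cycles_to_ns(15, CLOCK_GHZ)  # measured: 15 cycles
qx6850_l2_ns = cycles_to_ns(14, CLOCK_GHZ)  # derived: one cycle fewer
penalty_ns = qx9650_l2_ns - qx6850_l2_ns

print(f"QX9650 L2 latency: {qx9650_l2_ns:.2f} ns")
print(f"QX6850 L2 latency: {qx6850_l2_ns:.2f} ns")
print(f"Penalty for the larger cache: {penalty_ns:.2f} ns")
```

Under that clock-speed assumption, the extra cycle works out to about a third of a nanosecond, which is tiny next to the roughly 15ns gap separating these chips from the Athlon 64 X2 in main memory latency.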

But enough of this CPU geekery! Let's play some games.

Copyright ©1999-2009 The Tech Report. All rights reserved.