Intel’s Core 2 Extreme QX6700 processor

Scott Wasson, Former Editor-in-Chief

YOU’VE GOTTA LIKE Intel’s Core 2 Duo processors. After struggling mightily with performance and power consumption problems in the latter-day Pentiums, Intel came roaring back with the Core 2 Duo, producing a chip that goes like stink without spinning the electric meter into a frenzy. Since it offers a better combination of processing power, energy efficiency, and overclocking headroom than the Athlon 64, the Core 2 Duo has quickly become an enthusiast favorite, capturing prominent spots in our system guide recommendations and prompting a new round of upgrades for many folks.

Now comes the CPU de grâce, a processor that takes advantage of the Core 2 Duo’s modest heat output by cramming two of those chips together into a single socket, a product Intel can plausibly claim is the world’s first quad-core CPU. The Core 2 Extreme QX6700 isn’t exactly cheap and doesn’t run especially cool, but it will turn your spare bedroom into the computing equivalent of a government astrophysics lab and make the neighbors terribly jealous—provided your neighbors are total geeks.

What hath Intel wrought with this quad-core beast? Do four CPU cores make sense in a desktop PC, and what sort of applications can really take advantage of such power? Let’s have a look.

Core 2 Duo times two equals Kentsfield
We won’t dwell too long on the specifics of the Core 2 Extreme QX6700. This product, which lived its early life going by the code-name Kentsfield, really is two Core 2 Duo chips mounted together on the same package. If you want to know more about the Core 2 Duo’s basic technology, I suggest you read our review of that processor. Intel has used this multi-chip packaging technique in the past to create “dual-core” processors, such as the “Presler” Pentium D. Lashing together two separate chips rather than making one large chip makes good sense from an economic standpoint, because smaller die areas tend to make for higher yields of good chips from each wafer.
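To put a rough shape on that yield argument, here is a minimal sketch using the simple Poisson yield model, in which the fraction of defect-free dies falls off exponentially with die area. The defect density and die areas are made-up numbers for illustration, not Intel's figures, and real fabs use more elaborate models.

```python
import math

def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to be defect-free under the Poisson yield model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

# Hypothetical numbers, chosen only to illustrate the trend.
DEFECTS_PER_CM2 = 0.5
SMALL_DIE_CM2 = 1.4                  # one dual-core-sized die
LARGE_DIE_CM2 = 2 * SMALL_DIE_CM2    # a monolithic quad-core die of twice the area

small_yield = poisson_yield(SMALL_DIE_CM2, DEFECTS_PER_CM2)
large_yield = poisson_yield(LARGE_DIE_CM2, DEFECTS_PER_CM2)

print(f"small die yield: {small_yield:.1%}")
print(f"large die yield: {large_yield:.1%}")
```

Under this model the double-size die's yield is the square of the small die's yield. Because good small dies can be tested first and then paired up, a two-chip package wastes far less silicon than a single large die would.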

The result of this multi-chip fusion is a processor that plugs into a regular LGA775-style socket and packs four processing cores alongside a total of 8MB of L2 cache. Cosmetically, it looks for all the world like any other recent Intel desktop CPU.


The Core 2 Extreme QX6700

But here’s a fancy illustration Intel came up with to show what’s under the hood:

The Core 2 Extreme QX6700 runs at 2.66GHz on a 1066MHz front-side bus, so its clock speed matches that of the second fastest Core 2 Duo, the E6700. (The Core 2 Extreme X6800 is the fastest at 2.93GHz.) Intel probably chose not to push any harder on clock speed in order to keep the QX6700 inside of a reasonable power envelope. The E6700’s thermal rating, or TDP, is 65W, while the X6800’s is 75W. Fittingly, the QX6700’s TDP is exactly twice that of the E6700 at 130W. That’s quite enough heat production for a desktop processor, and Intel has already established a 130W thermal envelope for Pentium Extreme Edition CPUs that use this same LGA775 infrastructure.

In fact, the QX6700 should be compatible with many existing Core 2-compatible motherboards via nothing more than a BIOS update. Some mobo makers have already published compatibility lists for Kentsfield. Then again, Intel says previous revisions of its own D975XBX “BadAxe” mobo aren’t designed for use with the QX6700, so nothing is certain. You’ll want to check with the motherboard maker to ensure compatibility before taking the plunge.

Because the QX6700 is an Extreme Edition processor, it comes with a customarily robust price tag of $999 and an unlocked upper multiplier to facilitate easy overclocking. Intel also plans to introduce a less expensive Core 2 Quad Q6600 CPU at some point in the first quarter of next year. That product will run at 2.4GHz and have a TDP rating of 105W.

Quad-core’s performance challenges
The Core 2 Extreme QX6700 may be the apex of awesomeness in processors today, but it does face some formidable performance challenges, both due to its own nature and because of external factors. As a multi-chip package, the QX6700 contains two copies of a relatively well-integrated dual-core design. The two cores on each chip share a 4MB L2 cache between them, complete with dynamic partitioning and the ability to hand off ownership of data from one core to the next. Unfortunately, the integration between the QX6700’s two chips is less than ideal.

Although they occupy the same package, their only means of communication is the system’s front-side bus. The two chips must coordinate to ensure the sanity of the contents of their respective L2 caches via this bus. That will sometimes mean writing modified data out of one chip’s cache into main memory and then reading it back into the other chip’s cache—a positively eternal operation in CPU time. Both chips use this same bus to talk with the rest of the system, including main memory and I/O devices. Also, the presence of three electrical loads on the bus—two CPU chips and the core-logic chipset’s north bridge—complicates matters. Someone looking to overclock his system’s FSB may find less success with a Core 2 Quad or QX6700 than with a standard-issue Core 2 Duo.

If all of that sounds complex, just wait until you dig into the software issues. In order to take advantage of multi-core processors, software applications must execute by means of multiple threads. Today, very few games and not many other applications are multithreaded. We do try to take advantage of multithreaded applications when possible in our CPU test suite, but that’s more difficult to do for four cores than for two. Many of the early optimizations for multi-core processors only use two threads, so their performance benefits are fully realized on a dual-core CPU.

There are reasons for this situation. For instance, one of our test apps, the MP3 encoding program LAME MT, employs a technique called linear pipelining that processes a portion of its work one frame ahead of the main thread and then buffers the result for later use. This method uses only two threads and can’t take advantage of more than two CPU cores, but it is relatively easy to program. LAME MT’s author says of linear pipelining: “In general, this approach is highly recommended, for it is exponentially harder to debug a parallel application than a linear one.” On a similar note, we have seen measurable performance gains in dual-core systems using graphics drivers that offload some vertex processing to a second thread, but Nvidia’s drivers, at least, don’t appear to benefit from the presence of more than two cores.

The thread scheduling mechanism in Windows presents another challenge for quad-core processors, because it doesn’t always make the best decisions. During our testing, for example, we found that the Core 2 Extreme QX6700 was turning in substantially lower performance running the same single-threaded task—a POV-Ray scene render—than the like-clocked Core 2 Duo E6700. This behavior was consistent across multiple benchmark runs and a little bit puzzling, until we looked at the Windows Task Manager as this process ran. Turns out the rendering work was bouncing around across all four of the QX6700’s cores, playing havoc with cache locality and the like.
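One common way to stop a single-threaded job from wandering between cores is to pin it to one core with a CPU affinity mask. The sketch below is an illustration of that idea, not a description of our test setup. It uses Python's os.sched_setaffinity, which exists on Linux but not on Windows; on Windows, the equivalent knobs are Task Manager's "Set Affinity" option and the SetProcessAffinityMask Win32 call. The helper name pin_to_one_core is our own.

```python
import os

def pin_to_one_core():
    """Restrict the current process to a single CPU core, if the OS allows it.

    Returns the core number used, or None when this platform does not
    expose os.sched_setaffinity (it is available on Linux, not on Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)   # cores this process may currently use
    core = min(allowed)                 # pick the lowest-numbered allowed core
    os.sched_setaffinity(0, {core})     # pid 0 means "the calling process"
    return core

pinned = pin_to_one_core()
if pinned is None:
    print("CPU affinity control is not available through os on this platform")
else:
    print(f"process is now restricted to core {pinned}")
```

Pinning trades scheduler flexibility for cache locality, so it only makes sense for jobs that are known to be single-threaded.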

For the most part, you can expect the Core 2 Extreme QX6700 to perform like a Core 2 Duo E6700 in applications that use only one or two threads, but the QX6700 may prove slower in some cases due to additional bus overhead or bad thread management in Windows. Of course, when applications use more than two threads or more than two apps are running at once, the QX6700 will pull the tab back and pop open a can of whupass. We have some applications like that in our test suite, so you can see quad-core’s true potential.

That potential, by the way, will almost certainly be more fully realized by future applications, especially games. Software developers know that multi-core processors are the future, and high-profile game development houses have been working on game engines that use multiple threads to handle various tasks. Heck, they practically have to given that the Xbox 360 and the PlayStation 3 have multi-core CPUs. Doing this kind of thing well is by no means a trivial undertaking, but the general trajectory seems to involve spinning off threads for specific game elements like A.I., physics, rendering, and audio. Industry giants like Microsoft and Intel have been pouring resources into helping the conversion to multithreading happen, and I’m convinced that it will.

If you’re not convinced, perhaps a couple of statements that Intel forwarded to us from key game developers will help. Here’s Tim Sweeney, Founder and President of Epic Games:

Multi-core computing is the new standard for PC games, and we at Epic are thrilled to see Intel leading the industry forward with Core 2 Extreme. Its four high-performance CPU cores enable a new level of realism in games, with realistic physics simulation, character animation, and other computationally-intensive systems.

And here’s Gabe Newell, President and co-founder of Valve:

Quad-core will change every aspect of PC gaming. It will change how we create our games, how we provision our service, and how we design our games. The scalability we’ve seen in graphics over the last few years will now extend to physics, AI, animation, and all the systems which are critical to moving beyond the era of pretty but dumb games.

I don’t think these guys are just issuing blanket statements of support in order to play nice. That’s not been their style, historically. In fact, we will have more coverage of the specifics of Valve’s multithreading efforts very soon, so stay tuned for that.

Between now and when those next-generation game engines arrive, owners of quad-core processors will have to find other ways to take full advantage of their CPUs. The test results on the following pages offer numerous examples of applications that use four threads, and beyond that, there’s always the prospect of really, really good multitasking. My initial reaction is that you don’t need four cores for good multitasking. Despite frequent abuse, my current Athlon 64 X2-based desktop system rarely slows down, and when it does, available CPU time isn’t the likely culprit. Then again, I sure wouldn’t complain about having four cores at my beck and call.

 

Test notes
First and foremost, you will see test results for the Core 2 Duo E6400 processor in this article. That processor came to us courtesy of the fine folks at NCIX. Those of you who are in Canada will definitely want to check them out as a potential source of PC hardware and related goodies. Of course, no examination of the E6400 would be complete without some overclocking action, and we intend to produce a separate article about overclocking the E6400 in the near future.

You’ll notice the presence of a CPU marked “Core 2 4MB 1.86GHz” in our results. That’s actually a Core 2 Extreme X6800 chip that I’ve clocked down to the same speed as the Core 2 Duo E6300. I wanted to see how the move from 4MB of L2 cache to 2MB impacts performance, so I set up this clock-for-clock comparison against the E6300.

Please note that the two Pentium D 900-series processors in our test are actually a Pentium Extreme Edition 965 chip that’s been set to the appropriate core and bus speeds and had Hyper-Threading disabled in order to simulate the actual products. Similarly, our Socket AM2 versions of the Athlon 64 X2 4800+, 4600+, and 4200+ are actually the Athlon 64 FX-62 and X2 5000+ clocked down to the appropriate speeds, and the Core 2 Duo E6600 is actually an underclocked Core 2 Extreme X6800. The performance of our “simulated” processor models should be identical to the actual products.
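For those wondering how one chip can stand in for another, the arithmetic is simple: core clock is the base bus clock times the CPU multiplier. Here is a minimal sketch. The 1066MHz quad-pumped bus figure comes from our test-system table; the multipliers are assumptions inferred from the published clock speeds, not settings recorded in this article.

```python
QUAD_PUMPED_FSB_MHZ = 1066                  # effective bus speed of the Core 2 systems
BASE_CLOCK_MHZ = QUAD_PUMPED_FSB_MHZ / 4    # "quad-pumped" means four transfers per clock

def core_clock_ghz(multiplier: int, base_clock_mhz: float = BASE_CLOCK_MHZ) -> float:
    """Core clock in GHz for a given CPU multiplier and base bus clock."""
    return multiplier * base_clock_mhz / 1000

# Assumed multipliers (illustrative): an unlocked X6800 dialed down to mimic slower models.
for label, multiplier in [("X6800 at stock", 11), ("X6800 as E6600", 9), ("X6800 as E6300", 7)]:
    print(f"{label}: {multiplier} x {BASE_CLOCK_MHZ:.1f}MHz = {core_clock_ghz(multiplier):.2f}GHz")
```

Because the bus speed stays the same and only the multiplier drops, the underclocked chip sees the same memory subsystem as the real product would.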

Also, I’ve placed asterisks next to the memory clock speeds of the Socket AM2 test systems in the table below. Due to limitations in AMD’s memory clocking scheme, a couple of these systems couldn’t set their memory clocks to exactly 800MHz.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Processor Pentium Extreme Edition 965 3.73GHz Core 2 Extreme QX6700 2.66GHz Pentium D 950 3.4GHz
Pentium D 960 3.6GHz
Athlon 64 X2 3800+ Energy Efficient SFF 2.0GHz
Athlon 64 X2 4200+ 2.2GHz
Athlon 64 X2 4800+ 2.4GHz
Athlon 64 X2 4600+ 2.4GHz
Athlon 64 X2 5000+ 2.6GHz
Athlon 64 FX-62 2.8GHz
Athlon 64 X2 4600+ Energy Efficient 2.4GHz
Core 2 Duo E6300 1.86GHz
Core 2 Extreme X6800 at 1.86GHz
Core 2 Duo E6600 2.4GHz
Core 2 Duo E6700 2.66GHz
Core 2 Extreme X6800 2.93GHz
Core 2 Duo E6400 2.13GHz
System bus 1066MHz (266MHz quad-pumped) 1066MHz (266MHz quad-pumped) 800MHz (200MHz quad-pumped) 1GHz HyperTransport
Motherboard Intel D975XBX Intel D975XBX2 Intel D975XBX Asus M2N32-SLI Deluxe
BIOS revision BX97510J.86A.1073.2006.0427.1210
BX97520J.86A.1024.2006.0814.1142
BX97510J.86A.1073.2006.0427.1210
0402
BX97510J.86A.1209.2006.0601.1340
BX97510J.86A.1334.2006.0714.1343
North bridge 975X MCH 975X MCH 975X MCH nForce 590 SLI SPP
South bridge ICH7R ICH7R ICH7R nForce 590 SLI MCP
Chipset drivers INF Update 7.2.2.1007
Intel Matrix Storage Manager 5.5.0.1035
INF Update 7.2.2.1007
Intel Matrix Storage Manager 5.5.0.1035
INF Update 7.2.2.1007
Intel Matrix Storage Manager 5.5.0.1035
SMBus driver 4.52
IDE/SATA driver 6.67
Memory size 2GB (2 DIMMs) 2GB (2 DIMMs) 2GB (2 DIMMs) 2GB (2 DIMMs)
Memory type Crucial Ballistix PC2-8000 DDR2 SDRAM at 800MHz
Crucial Ballistix PC2-8000 DDR2 SDRAM at 800MHz
Crucial Ballistix PC2-8000 DDR2 SDRAM at 800MHz
Corsair TWIN2X2048-8500C5 DDR2 SDRAM at 800MHz*
Corsair TWIN2X2048-8500C5 DDR2 SDRAM at 800MHz
Corsair TWIN2X2048-8500C5 DDR2 SDRAM at 800MHz
CAS latency (CL) 4 4 4 4
RAS to CAS delay (tRCD) 4 4 4 4
RAS precharge (tRP) 4 4 4 4
Cycle time (tRAS) 15 15 15 12
Audio Integrated ICH7R/STAC9221D5 with SigmaTel 5.10.4991.0 drivers
Integrated ICH7R/STAC9221D5 with SigmaTel 5.10.4991.0 drivers
Integrated ICH7R/STAC9221D5 with SigmaTel 5.10.4991.0 drivers
Integrated nForce 590 MCP/AD1988B with SoundMAX 5.10.2.4490 drivers
Hard drive Maxtor DiamondMax 10 250GB SATA 150
Graphics GeForce 7900 GTX 512MB PCI-E with ForceWare 84.25 drivers
GeForce 7900 GTX 512MB PCI-E with ForceWare 84.21 drivers (WorldBench only)
OS Windows XP Professional x64 Edition
Windows XP Professional with Service Pack 2 (WorldBench only)

Thanks to Corsair and Crucial for providing us with memory for our testing. Both of them provide products and support that are far and away superior to generic, no-name memory.

Also, all of our test systems were powered by OCZ GameXStream 700W power supply units. Thanks to OCZ for providing these units for our use in testing.

The test systems’ Windows desktops were set at 1280×1024 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Memory performance
We’ll begin with our usual set of synthetic memory tests, just to set the stage. The QX6700 doesn’t bring any additional memory bandwidth to go along with its additional CPU cores, but we can see what effect its additional bus overhead has on memory access.

All of these synthetic, single-threaded memory tests show the Core 2 Extreme QX6700 performing just like its dual-core sibling, the E6700. Our version of Linpack isn’t highly optimized, but it does give us a sense of cache size and performance.

Now for some ridiculously verbose 3D representations of cache and memory latency, just because we can. I’ve color coded the portions of these graphs to correspond with L1 cache (yellow), L2 cache (amber), and main memory (orange).

Once again, the QX6700 looks very similar to a dual-core Core 2 Extreme. Access latencies are lower on the Athlon 64 thanks to its integrated memory controller, but the Core 2 CPUs look awfully good compared to the Pentium Extreme Edition, likely due to their improved L2 cache prefetchers and their ability to move loads ahead of stores in certain cases, also known as memory disambiguation.

 

Gaming performance

Quake 4
We tested Quake 4 by running our own custom timedemo with and without its multiprocessor optimizations enabled. These can be switched on in the game console by setting the “r_usesmp” variable to “1”.

We’re testing at lower resolutions and lower graphical detail settings in order to ease any GPU bottlenecks that might mask the performance differences between the CPUs. At higher graphical detail settings or with a less powerful graphics card, of course, the graphics subsystem might become the primary performance limiter.

Above the following benchmark graph, and throughout most of the tests in this review, we’ve included Task Manager plots showing CPU utilization. These plots were captured on the Pentium Extreme Edition 965, and they should offer some indication of how much impact multithreading has on the operation of each application. Single-threaded apps may sometimes show up as spread across multiple processors in Task Manager, but the total amount of space below all four lines shouldn’t equal more than the total area of one square if the test is truly single-threaded. Anything significantly more than that is probably an indication of some multithreaded component in the execution of the test. Because WorldBench’s tests are entirely scripted, however, we weren’t able to capture Task Manager plots for them, as you’ll notice later.
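That "area under the lines" rule of thumb boils down to adding up the per-core utilization. The sketch below shows the idea with made-up readings; the 0.15 tolerance standing in for "significantly more" is our own arbitrary assumption.

```python
def busy_core_equivalents(per_core_utilization):
    """Sum per-core utilization fractions (0.0 to 1.0) into 'cores kept busy'."""
    return sum(per_core_utilization)

def looks_multithreaded(per_core_utilization, tolerance=0.15):
    """True when total utilization clearly exceeds what one busy thread could produce."""
    return busy_core_equivalents(per_core_utilization) > 1.0 + tolerance

# Hypothetical average readings across four logical processors.
bouncing_single_thread = [0.30, 0.20, 0.25, 0.25]   # one thread hopping between cores
two_busy_threads = [0.90, 0.85, 0.05, 0.05]         # a second thread doing real work

for name, sample in [("bouncing single thread", bouncing_single_thread),
                     ("two busy threads", two_busy_threads)]:
    total = busy_core_equivalents(sample)
    print(f"{name}: {total:.2f} cores busy, multithreaded? {looks_multithreaded(sample)}")
```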

Nvidia’s video drivers are now multithreaded, so we should see some amount of multithreading action happening in any application that uses the GPU for 3D graphics, even if the game is only single-threaded.


With “r_usesmp 0”


With “r_usesmp 1”

Quake 4’s multiprocessor optimizations make some use of a second core, but that’s pretty much it. Our quad-core processor isn’t any faster here than its dual-core counterpart.

The Elder Scrolls IV: Oblivion
We tested Oblivion by manually playing through a specific point in the game five times for each CPU while recording frame rates using the FRAPS utility. Each gameplay sequence lasted 60 seconds. This method has the advantage of simulating real gameplay quite closely, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
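Here is a small sketch of that reduction, using invented numbers rather than our actual FRAPS logs. It averages the per-run average frame rates and takes the median of the per-run lows, so a single bad hitch in one run doesn't define the result.

```python
from statistics import mean, median

# Hypothetical results for one CPU: (average fps, lowest fps) for each of five runs.
runs = [(61.2, 38), (59.8, 41), (62.5, 22), (60.4, 39), (61.0, 40)]

average_fps = mean([avg for avg, _ in runs])
low_fps = median([low for _, low in runs])

print(f"reported average: {average_fps:.1f} fps")
print(f"reported low (median of five): {low_fps} fps")
```

In this example the third run's low of 22 fps is an outlier. A plain minimum would report it, while the median reports a figure closer to what the other four runs saw.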

We set Oblivion’s graphical quality settings to “Medium,” 800×600 resolution, with HDR lighting enabled. Our Oblivion test is a quick run around the Imperial City Arboretum.

There’s some variability built into our manual FRAPS-based testing here, but once again, the QX6700 is no faster than the E6700. Note that in both of these games, though, the QX6700 is faster than anything AMD has to offer. That’s largely because of the Core 2’s excellent single-threaded performance, a dynamic that will continue to matter as AMD introduces its own quad-core “4×4” solution in the coming weeks.

 

F.E.A.R.
We used F.E.A.R.’s built-in “test settings” benchmark to get these results. The game’s “Computer” and “Graphics” performance options were both set to “High.”

Battlefield 2
We used FRAPS to capture BF2 frame rates just as we did with Oblivion. Graphics quality options were set to BF2’s canned “High” quality profile. This game has a built-in cap at 100 frames per second, and we intentionally left that cap enabled so we could offer a faithful look at real-world performance.

Unreal Tournament 2004
We used a more traditional recorded timedemo for testing UT2004, but we tried out two versions of the game, the original 32-bit flavor and the 64-bit version.

The QX6700 suffers a little bit in F.E.A.R. compared to the Core 2 Duo, likely due to some quirk of thread allocation or bus overhead. However, it regains its footing in BF2 and UT2004, shadowing the E6700 in both games.

 

3DMark06
3DMark06 combines the results from its graphics and CPU tests in order to reach an overall score. Here’s how the processors did overall and in each of those tests.

The QX6700 lands the top spot in 3DMark06, for reasons that will become clear as we look at the rest of the results.

Our test systems are largely limited by the graphics card throughout 3DMark’s four main graphics tests. Obviously, this isn’t where the QX6700 distinguishes itself.

3DMark’s CPU tests, however, are widely multithreaded, and our quad-core processor excels in them. Here’s how FutureMark describes what’s going on in these tests:

Both CPU tests use our new game engine, and rely on AI, physics and game logic to generate a multi-threaded workload that can be distributed on multiple processors, cores or even on a single processor. Ageia PhysX library and D* Lite path finding AI algorithm are produce [sic] demanding CPU loads.

Because the results of the CPU tests help determine the overall 3DMark composite score, the QX6700 comes out on top there, as well.

 

WorldBench overall performance
WorldBench’s overall score is a pretty decent indication of general-use performance for desktop computers. This benchmark uses scripting to step through a series of tasks in common Windows applications and then produces an overall score for comparison. WorldBench also records individual results for its component application tests, allowing us to compare performance in each. We’ll look at the overall score, and then we’ll show individual application results alongside the results from some of our own application tests.

We’ve seen dual-core CPUs set new highs in WorldBench, but disappointingly, the QX6700 isn’t able to replicate that success with four cores. We’ll have to see where it fell short in the various WorldBench component tests.

Audio editing and encoding

LAME MP3 encoding
LAME MT is the multithreaded version of the LAME MP3 encoder that we discussed earlier. LAME MT was created as a demonstration of the benefits of multithreading specifically on a Hyper-Threaded CPU like the Pentium 4. (Of course, multithreading works even better on dual-core processors.) You can download a paper (in Word format) describing the programming effort.

Rather than run multiple parallel threads, LAME MT runs the MP3 encoder’s psycho-acoustic analysis function on a separate thread from the rest of the encoder using simple linear pipelining. That is, the psycho-acoustic analysis happens one frame ahead of everything else, and its results are buffered for later use by the second thread.
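For the curious, the general shape of linear pipelining looks something like the sketch below. This is not LAME MT's code. The analyze and encode functions are hypothetical stand-ins, and the one-slot queue plays the role of the buffer that lets the analysis stage run ahead of the encoder.

```python
import queue
import threading

def analyze(frame):
    """Stand-in for the psycho-acoustic analysis stage (hypothetical)."""
    return sum(frame) / len(frame)

def encode(frame, analysis):
    """Stand-in for the rest of the encoder (hypothetical)."""
    return [round(sample - analysis) for sample in frame]

def pipelined_encode(frames, lookahead=1):
    """Run analysis on a second thread, buffered ahead of the encoding loop."""
    buffer = queue.Queue(maxsize=lookahead)   # holds analysis results for later use
    done = object()                           # sentinel marking the end of the stream

    def analysis_stage():
        for frame in frames:
            buffer.put((frame, analyze(frame)))   # blocks while the buffer is full
        buffer.put(done)

    worker = threading.Thread(target=analysis_stage)
    worker.start()

    encoded = []
    while True:
        item = buffer.get()
        if item is done:
            break
        frame, analysis = item
        encoded.append(encode(frame, analysis))
    worker.join()
    return encoded

frames = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = pipelined_encode(frames)
print(result)
```

Note that the pipeline has exactly two stages, so a third or fourth core has nothing to do. Also, CPython's global interpreter lock keeps these two pure-Python stages from truly running at once, so treat this as a picture of the structure, not a speedup demo.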

We have results for two different 64-bit versions of LAME MT from different compilers, one from Microsoft and one from Intel, doing two different types of encoding, variable bit rate and constant bit rate. We are encoding a massive 10-minute, 6-second 101MB WAV file here, as we have done in many of our previous CPU reviews.

MusicMatch Jukebox

As I said earlier, LAME MT’s linear pipelined approach doesn’t gain any benefit from more than two cores, and I don’t believe MusicMatch Jukebox uses anything more than one thread. The QX6700 still looks pretty good, again outperforming the Athlon 64 FX-62 comfortably.

 

Video editing and encoding

Windows Media Encoder x64 Edition Advanced Profile
We asked Windows Media Encoder to convert a gorgeous 1080-line WMV HD video clip into a 320×240 streaming format using the Windows Media Video 8 Advanced Profile codec.

Windows Media Encoder

Adobe Premiere

VideoWave Movie Creator

Video encoding is one case where wider multithreading is quite possible and common. The QX6700 takes the top spot in our Windows Media Encoder Advanced Profile test, outdoing the higher-clocked Core 2 Extreme X6800 by a small margin. Unfortunately, the three video editing applications that are part of WorldBench simply don’t appear to use more than two threads.

To be fair, none of these tests is a prime example of the video encoding potential of a quad-core processor. I had hoped to include an additional video encoding test here, but time limits didn’t permit it. We’ll have to look into adding another codec, perhaps with H.264-class compression and HD source video, in a future review.

 

Image processing

Adobe Photoshop

ACDSee PowerPack

picCOLOR
picCOLOR was created by Dr. Reinert H. G. Müller of the FIBUS Institute. This isn’t Photoshop; picCOLOR’s image analysis capabilities can be used for scientific applications like particle flow analysis. Dr. Müller has supplied us with new revisions of his program for some time now, all the while optimizing picCOLOR for new advances in CPU technology, including MMX, SSE2, and Hyper-Threading. Naturally, he’s ported picCOLOR to 64 bits, so we can test performance with the x86-64 ISA. Eight of the 12 functions in the test are multithreaded.

Scores in picCOLOR, by the way, are indexed against a single-processor Pentium III 1 GHz system, so that a score of 4.14 works out to 4.14 times the performance of the reference machine.

WorldBench’s two image editing apps don’t do much with more than two cores, but the QX6700 turns in the fastest performance we’ve seen in picCOLOR on the strength of its performance in that app’s multithreaded rotate function. As ever, the Core 2 processors occupy the top ranks in all of these tests.

 

Multitasking and office applications

MS Office

Mozilla

Mozilla and Windows Media Encoder

Two of these three tests from WorldBench, the MS Office test and the Mozilla plus Windows Media Encoder one, attempt to simulate a real user multitasking session by doing different things at once with multiple applications. Regardless, the QX6700 doesn’t pull out ahead of the E6700 much, and it actually runs the Mozilla test quite a bit slower.

 

Other applications

Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

WinZip

Nero

All of these tests are single-threaded, and in them all, the QX6700 ties or slightly trails the E6700.

 

3D modeling and rendering

Cinebench 2003
Cinebench measures performance in Maxon’s Cinema 4D modeling and rendering app. This is the 64-bit version of Cinebench, primed and ready for these 64-bit processors.

Ahh, now here’s an application that scales well to more than two threads. The QX6700 polishes off the Cinebench rendering test in record time.

These remaining shading tests are all single-threaded, so the QX6700 ends up using only one of its cores to produce performance virtually identical to the Core 2 Duo E6700’s.

 

POV-Ray rendering
POV-Ray just recently made the move to 64-bit binaries, and thanks to the nifty SMPOV distributed rendering utility, we’ve been able to make it multithreaded, as well. SMPOV spins off any number of instances of the POV-Ray renderer, and it will divvy up the scene in several different ways. For this scene, the best choice was to divide the screen horizontally between the different threads, which provides a fairly even workload.
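As a rough picture of that kind of split, the sketch below carves an image's rows into contiguous bands, one per renderer instance. It reflects our reading of "dividing the screen" as bands of rows and isn't taken from SMPOV itself; the 480-row height is just an example.

```python
def split_rows(height, workers):
    """Split rows 0..height-1 into contiguous (start, end) bands, one per worker.

    The end index is exclusive. Leftover rows are handed to the first
    few workers so no band differs from another by more than one row.
    """
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for index in range(workers):
        size = base + (1 if index < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands

bands = split_rows(480, 4)   # hypothetical 480-row scene, four renderer instances
for worker, (start, end) in enumerate(bands):
    print(f"instance {worker}: rows {start} to {end - 1}")
```

Equal row counts only mean equal work if the scene's complexity is spread evenly from top to bottom, which is why the best way to divide a scene varies from one scene to the next.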

We tried using another new beta of POV-Ray with native support for SMP, but once again, it proved to be horribly slow, so we’re sticking with SMPOV.

If it’s rendering power you want, the QX6700 has it by the bucketload. With four threads, the QX6700 isn’t quite twice as fast as the E6700, but it’s pretty close. Wow.

Incidentally, the single-threaded instance of POV-Ray is slower here on the QX6700 than on the E6700 for the reasons I mentioned earlier. The rendering thread ping-ponged around among all four cores, and render times were higher as a result.

3ds max 8 rendering
For our 3ds max test, we used the “architecture” scene from SPECapc for 3ds max 7. This scene is very complex and should be a nice exercise for these CPUs. Using 3ds max’s default scanline renderer, we first rendered frames 0 to 10 of the scene at 500×300 resolution. The renderer’s “Use SSE” option was enabled.

The QX6700 again proves its prowess at rendering in convincing fashion here.

Next, we rendered just the first frame of the scene in 3ds max’s mental ray renderer. Notice that we’ve changed our time scale from seconds to minutes for this one.

These were definitely not the results we expected, but I’ve decided to go ahead and include them, so you can make of them what you will. We’ve seen problems in the past with mental ray refusing to run on four cores due to licensing restrictions, but in this case, all four cores appeared to be occupied as the test ran. I’m not sure what the snag was here.

 

SiSoft Sandra Mandelbrot
Next up is SiSoft’s Sandra system diagnosis program, which includes a number of different benchmarks. The one of interest to us is the “multimedia” benchmark, intended to show off the benefits of “multimedia” extensions like MMX and SSE/2. According to SiSoft’s FAQ, the benchmark actually does a fractal computation:

This benchmark generates a picture (640×480) of the well-known Mandelbrot fractal, using 255 iterations for each data pixel, in 32 colours. It is a real-life benchmark rather than a synthetic benchmark, designed to show the improvements MMX/Enhanced, 3DNow!/Enhanced, SSE(2) bring to such an algorithm.

The benchmark is multi-threaded for up to 64 CPUs maximum on SMP systems. This works by interlacing, i.e. each thread computes the next column not being worked on by other threads. Sandra creates as many threads as there are CPUs in the system and assigns [sic] each thread to a different CPU.

We’re using the 64-bit port of Sandra. The “Integer x16” version of this test uses integer numbers to simulate floating-point math. The floating-point version of the benchmark takes advantage of SSE2 to process up to eight Mandelbrot iterations at once.
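To illustrate the interlacing scheme SiSoft describes, here is a small sketch that hands columns of a Mandelbrot image to worker threads in round-robin fashion. It is a simplified, static version of the idea and has nothing to do with Sandra's actual code. The image is shrunk to 64×48 so it runs quickly, and the coordinate window is our own choice.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT, MAX_ITER = 64, 48, 255   # small image for a quick run; 255 iterations per pixel

def escape_count(cx, cy, max_iter=MAX_ITER):
    """Iterations of z = z*z + c completed before |z| exceeds 2, capped at max_iter."""
    z = 0j
    c = complex(cx, cy)
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter

def render_columns(worker, workers):
    """Compute every column whose index equals `worker` modulo `workers`."""
    columns = {}
    for x in range(worker, WIDTH, workers):
        cx = -2.0 + 3.0 * x / (WIDTH - 1)
        columns[x] = [escape_count(cx, -1.0 + 2.0 * y / (HEIGHT - 1))
                      for y in range(HEIGHT)]
    return columns

def render(workers):
    """Render the full image as a list of columns using `workers` threads."""
    image = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(render_columns, range(workers), [workers] * workers):
            image.update(part)
    return [image[x] for x in range(WIDTH)]

image = render(workers=4)
print(f"rendered {len(image)} columns of {len(image[0])} pixels")
```

Interleaving columns rather than handing each thread one big block spreads the expensive pixels near the set's interior across all of the workers. Every pixel is independent of its neighbors, which is what makes this workload scale so cleanly.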

Yikes. I hope we don’t get cited for violating Amdahl’s Law. In this highly parallel computing task, the QX6700 really gets to stretch its legs.
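For reference, Amdahl's Law says the speedup from extra cores is capped by the portion of the work that can't be parallelized. The sketch below runs the formula for a few parallel fractions; these fractions are hypothetical values, not measurements of Sandra or any other test here.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def gain_from_doubling(parallel_fraction: float, cores: int = 2) -> float:
    """How much faster 2*cores is than cores on the same workload."""
    return (amdahl_speedup(parallel_fraction, 2 * cores)
            / amdahl_speedup(parallel_fraction, cores))

for p in (0.50, 0.90, 0.99, 1.00):   # hypothetical parallel fractions
    print(f"parallel fraction {p:.2f}: quad-core is {gain_from_doubling(p):.2f}x a dual-core")
```

Only a perfectly parallel task gets the full doubling when going from two cores to four. Anything with a serial component lands somewhere below 2x.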

 

Folding@Home scientific computing
Next, we have a new addition to our benchmark suite: a slick little Folding@Home benchmark CD created by notfred, one of the members of Team TR, our excellent Folding team. For the unfamiliar, Folding@Home is a distributed computing project created by folks at Stanford University that investigates how proteins work in the human body, in an attempt to better understand diseases like Parkinson’s, Alzheimer’s, and cystic fibrosis. It’s a great way to use your PC’s spare CPU cycles to help advance medical research. I’d encourage you to visit our distributed computing forum and consider joining our team if you haven’t already joined one.

The Folding@Home project uses a number of highly optimized routines to process different types of work units from Stanford’s research projects. The Gromacs core, for instance, uses SSE on Intel processors, 3DNow! on AMD processors, and Altivec on PowerPCs. Overall, Folding@Home should be a great example of real-world scientific computing.

notfred’s Folding Benchmark CD tests the most common work unit types and estimates performance in terms of the points per day that a CPU could earn for a Folding team member. The CD itself is a bootable ISO. The CD boots into Linux, detects the system’s processors and Ethernet adapters, picks up an IP address, and downloads the latest versions of the Folding execution cores from Stanford. It then processes a sample work unit of each type.

On a system with two CPU cores, for instance, the CD spins off a Tinker WU on core 1 and an Amber WU on core 2. When either of those WUs is finished, the benchmark moves on to additional WU types, always keeping both cores occupied with some sort of calculation. Should the benchmark run out of new WUs to test, it simply processes another WU in order to prevent any of the cores from going idle as the others finish. Once all four of the WU types have been tested, the benchmark averages the points per day among them. That points-per-day average is then multiplied by the number of cores on the CPU in order to estimate the total number of points per day that CPU might achieve.
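The arithmetic behind that final estimate is simple enough to sketch in a few lines of Python. This is a reconstruction from the description above, not notfred's actual script. The function name, the "Gromacs A" and "Gromacs B" labels, and the points-per-day figures are all hypothetical placeholders.

```python
def estimated_total_ppd(ppd_by_wu_type: dict, cores: int) -> float:
    """Average the per-WU-type points per day, then scale by the number of cores."""
    if not ppd_by_wu_type:
        raise ValueError("need at least one WU type result")
    average_ppd = sum(ppd_by_wu_type.values()) / len(ppd_by_wu_type)
    return average_ppd * cores


if __name__ == "__main__":
    # Placeholder per-core scores, NOT results from this review.
    sample = {"Tinker": 100.0, "Amber": 120.0, "Gromacs A": 200.0, "Gromacs B": 180.0}
    print(estimated_total_ppd(sample, cores=4))
```

Note that the multiplier does a lot of work here: a CPU with modest per-WU scores can still post a large total if it presents many cores to the benchmark.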

This may be a somewhat quirky method of estimating overall performance, but my sense is that it generally ought to work. We’ve discussed some potential reservations about how it works here, for those who are interested. I have included results for each of the individual WU types below, so you can see how the different CPUs perform on each.

Because this is a new addition to our test suite and it takes a while for the benchmark to run, I was only able to run the benchmark once on each CPU, not three times each per our usual practice. Also, all processors tested on the D975XBX motherboard here used the 1334 BIOS and Corsair RAM.

The Athlon 64 processors are relatively strong in the Tinker and Amber WU types, an atypical result compared to almost all of our other benchmarks so far. With the two Gromacs WU types, though, the Core 2 processors are back on top. Once we average all four WU types together, it’s nearly a toss-up. The Core 2 Extreme X6800 has the highest average, while the Athlon 64 FX-62 is practically tied with the two 2.66GHz Core 2-based processors. Of course, the bottom line is that the QX6700 is far and away the most capable single-socket solution for Folding, as one would expect from a quad-core CPU.

The case of the Pentium Extreme Edition 965 is intriguing. It’s the only CPU here that has Hyper-Threading, so it can run four WUs simultaneously on its two cores. The benchmark loads up all four of this processor’s front ends at once, and the Extreme Edition 965 responds by turning in the lowest scores of the lot for each WU type, pretty much like one would expect. However, once we multiply the 965’s average points per day by its four front ends, this CPU comes out ahead of the Core 2 Extreme X6800. Is this a fair and accurate reflection of how the Extreme Edition 965 would perform in the real world while running four instances of the Folding@Home client? I’m not entirely sure. There are issues of cache sharing and locality created by running different WU types at once on a multi-core CPU, and those issues are multiplied by doing the same with Hyper-Threading.

At any rate, this is a nice first look at comparative Folding@Home performance, and it confirms that the Core 2 Extreme QX6700 is a total beast for Folding.

 

Power consumption
We took our power readings at the wall outlet using an Extech 380803 power meter. Only the PC was plugged into the watt meter; the system’s monitor and speakers, for instance, were not. The “idle” readings were taken at the Windows desktop, while the “load” readings were taken using SMPOV and the 64-bit version of the POV-Ray renderer to load up the CPUs. In all cases, we asked SMPOV to use the same number of threads as there were CPU front ends in Task Manager: four for the Extreme Edition 965 and the quad-core QX6700, two for the dual-core Core 2 and Athlon 64 X2 processors. The test rigs were all equipped with OCZ GameXStream 700W power supply units.

The graph below for idle power use has results with and without “power management.” By “power management,” we mean the dynamic clock speed and voltage throttling technologies from Intel and AMD, known as SpeedStep and Cool’n’Quiet, respectively. The Intel processors also have an enhanced halt state known as C1E. A processor’s halt state is invoked by the OS whenever the system is able to sit idle for a moment. The C1E halt state in the Intel processors ramps down the CPU clock speed and voltage in order to save power, so even without SpeedStep, the CPU’s idle power use is reduced. Keep that in mind when considering the “No power management” results for the Intel processors at idle.

Interestingly, we found that the Core 2’s C1E state doesn’t lower CPU voltage. The CPU multiplier drops to 6.0, bringing the clock speed down to 1.6GHz, but voltage appears to remain unchanged. Turning on SpeedStep, however, drops the CPU’s core voltage, allowing for even lower idle power use.

Another tricky part about power consumption testing is getting good numbers for our “simulated” CPU speed grades. In order to make it work, you have to set the proper CPU core voltage, not just the right clock speeds. I made an attempt at simulating the Athlon 64 X2 models 4800+, 4600+, and 4200+ and the Pentium D 950/960 by setting the CPU voltages manually, but I’ve put an asterisk next to those CPUs in our results as a reminder that they’re simulated. I didn’t even bother including some simulated CPU models because of the difficulty involved and a few questionable results.

For the Athlon 64 X2 4800+, I set the voltage at 1.35V. The X2 4600+ and 4200+ were set to 1.3V. The “power management” idle scores were simply taken from chips with the same cache size (the FX-62 and 5000+, respectively), because all of these processors share the same 1 GHz/1.1V idle with Cool’n’Quiet.

The Pentium D 950 and 960 were trickier, since each Pentium D’s voltage needs are programmed at the factory. In this case, I stuck with the default of 1.312V for both speed grades. On an 800MHz bus, the Pentium D 950 and 960 both clocked down to 2.4 GHz at idle via the C1E halt mechanism. The Extreme Edition 965 clocked down to 3.2 GHz at idle.
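The idle clocks quoted in this section all fall out of one relationship: core clock equals the CPU multiplier times the bus's base clock, and on Intel's quad-pumped front-side bus the base clock is one quarter of the rated bus speed. Here is a small Python sketch of that math. The function name is an assumption, the 12x idle multipliers for the Pentium-based chips are inferred from the clock speeds above rather than read from the hardware, and the 1066MHz bus for the Core 2 and the Extreme Edition 965 is assumed from their published specs.

```python
def core_clock_mhz(multiplier: float, bus_mts: float) -> float:
    """Core clock in MHz for a quad-pumped front-side bus rated at bus_mts."""
    base_clock_mhz = bus_mts / 4.0  # four transfers per base clock cycle
    return multiplier * base_clock_mhz


if __name__ == "__main__":
    # A "1066MHz" bus is nominally 1066.67 MT/s, so its base clock is about 266.67MHz.
    print(round(core_clock_mhz(6.0, 1066.67)))   # Core 2 in C1E, 6.0 multiplier from the text
    print(round(core_clock_mhz(12.0, 800.0)))    # Pentium D 950/960 at idle, 12x inferred
    print(round(core_clock_mhz(12.0, 1066.67)))  # Extreme Edition 965 at idle, 12x inferred
```

That also explains why the Extreme Edition 965 idles at a higher clock than its Pentium D siblings: the same floor multiplier lands on a faster bus.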

Our Core 2 Extreme QX6700 system uses about as much power, and thus generates about as much heat, as our Pentium Extreme Edition 965 and Athlon 64 FX-62 rigs. So it’s not the Toyota Prius of the processor world, but it is remarkable that Intel was able to shoehorn two copies of the Core 2 Duo into the same basic thermal envelope as an Extreme Edition 965. If you want to talk about performance per watt (that fake shorthand metric for energy efficiency), then the QX6700 looks pretty good, provided you are using all four of its cores effectively. That’s because of the performance side of that ratio: the QX6700’s peak performance is much higher than that of the Extreme Edition 965 or the FX-62.
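If you do want to put a number on performance per watt, the calculation is just a benchmark score divided by power draw. The Python sketch below shows why similar power draw plus higher throughput yields a better ratio. The function name and every figure in it are made-up placeholders, not readings from our power meter, and since we measure at the wall, any such ratio describes the whole system rather than the CPU alone.

```python
def perf_per_watt(score: float, system_watts: float) -> float:
    """Benchmark score per watt of full-system power draw under load."""
    if system_watts <= 0:
        raise ValueError("system_watts must be positive")
    return score / system_watts


if __name__ == "__main__":
    # Hypothetical systems drawing the same power under load, with different throughput.
    dual_core = perf_per_watt(score=100.0, system_watts=250.0)
    quad_core = perf_per_watt(score=180.0, system_watts=250.0)
    print(dual_core, quad_core, quad_core > dual_core)
```

The catch is the one already noted: the ratio only favors the quad-core part when the workload keeps all four cores busy.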

 

Overclocking
The QX6700’s unlocked multiplier makes overclocking dead easy. Just raise the multiplier, perhaps twiddle with the voltage, and reboot. If it works, great. If not, you know the problem is the CPU and not some other component of your system throwing a fit about running out of spec.

Overclocking the QX6700 was still an adventure, though, because of another factor. As the power consumption results showed, this CPU draws about as much power as any current PC processor when running at its stock speed. Getting it to run at higher speeds can present cooling challenges. This is true even though Intel sent along a cooler with it that has a B-52 propeller attached to the top. Seriously, check it out:

Running at full tilt, this thing sounds like a Metallica concert. I don’t believe I’ve ever heard a CPU cooler this loud. It’s effective, too, easily besting the results I got with my Zalman CNPS9500 LED. Still, even with this cooler, overclocking the QX6700 presented what was, for me, a new conundrum: if the CPU is totally stable at a given speed but generates heat beyond the bounds of your best cooler’s capacity, is it a successful overclock? The QX6700 was stable running four instances of Prime95 but edging up toward 73°C with the Metallica concert raging away, deep into Enter Sandman. That’s near thermal throttling territory, killing the potential benefits of overclocking. You’re going to want to use water—or something more potent—to cool the QX6700 well enough to really overclock it.

I think I can declare our overclock to 3.2GHz a success. The QX6700 was stable at that speed at its stock 1.35V. I got it to POST at 3.46GHz at 1.375V, but the system crashed while loading Windows. Here’s how the QX6700 performed at 3.2GHz.

Yep, this quad-core processor at 3.2GHz is fast. Uh huh. Yep.

 
Conclusions
Like any solution with four CPU cores, the Core 2 Extreme QX6700’s effectiveness depends on what you feed it. Give it a nicely parallelizable task with four or more threads, and it will utterly embarrass former top dogs like the Core 2 Extreme X6800 and the Athlon 64 FX-62. For applications like video encoding, 3D rendering, image processing, and scientific computing, the QX6700 trumps all other desktop processors—and, I suspect, a great many dual-socket Opteron workstations. 3DMark06’s multithreaded CPU test gives us a glimpse of how multithreaded gaming might look, and the QX6700 performs very well there, too.

Feed it a simple app with only one or two threads, though, and this quad-core monster begins to look an awful lot like a Core 2 Duo E6700 with higher power consumption and a much steeper price tag. Of course, even that isn’t a horrible place to be. In single- and dual-threaded applications, the QX6700 still wallops the Athlon 64 FX-62 nearly across the board, with similar power requirements and heat output. That fact simply underscores how good the Core 2 lineup truly is.

Still, this is very much an Extreme processor in every sense. As I’ve said in various ways over the years, I happen to think forking over a grand for a CPU is sheer insanity. If you do write that check, though, be prepared to write another one for a good water cooling system. Most air coolers that could keep this thing cool would simply be too loud for my taste, and you won’t want to attempt much overclocking with air cooling.

This quad-core CPU puts Intel in the same tricky position that the GPU guys have had to endure from time to time: the hardware is now well ahead of software development, particularly in mainstream consumer applications and games. Many owners of this beast may be stuck waiting for new applications to arrive that use it to its fullest ability. Like I said, though, I’m confident the applications will come, and when they do, the Core 2 Extreme QX6700 may well be the best option for running them.

Soon, the QX6700 should get some competition in the form of AMD’s so-called 4×4 platform. Can AMD unseat the QX6700 using dual-socket motherboards? Interesting question. I have my doubts, but I suppose we’ll soon find out. 

Scott Wasson Former Editor-in-Chief

Scott Wasson is a veteran in the tech industry and the former Editor-in-Chief at Tech Report. With a laser focus on tech product reviews, Wasson's expertise shines in evaluating CPUs and graphics cards, and much more.
