
Intel’s Pentium 4 600 series processors

Scott Wasson

A LITTLE OVER A YEAR ago, while media attention was affixed firmly on the Super Bowl, Intel discreetly let slip a brand-new, vastly rearchitected CPU core that, by all rights, should have been called the Pentium 5. The “Prescott” CPU core, as we now know, became somewhat infamous for its particular combination of tepid performance and gluttonous appetite for power (and corresponding prodigious heat production). This was the processor that was supposed to make it to 4GHz and never did, the CPU that convinced Intel that the future was in dual-core designs and “platformization.” It may not have been a resounding success or a complete failure, but it was certainly consequential, despite its quiet introduction.

Today, in the dead of early Sunday morning, Intel is meekly unveiling another new Pentium 4 processor core, and it may be just as consequential. The Pentium 4 600 series is a new tier of performance-oriented Pentium 4 processors that will be sold alongside the existing P4 500 series. Based on the Prescott design, the 600-series core adds key features intended to pep up Prescott’s performance and curb its power consumption. Not only that, but these are 64-bit CPUs. With the introduction of a 64-bit version of Windows approaching, Intel has finally turned on Prescott’s dormant support for the 64-bit extensions to the x86 instruction set pioneered by AMD.

Recent lottery winners will also be pleased to learn of the emergence of a new Pentium 4 Extreme Edition processor. Based on the same new CPU core as the 600 series, this puppy runs at 3.73GHz on a 1066MHz front-side bus, and it has 64-bit support, as well.

Can this new variation of the Prescott core help Intel recapture its supremacy in desktop processor performance? We’ve had Intel’s new CPUs on the test bench for over a week now, and we have some answers.

The Pentium 4 660

What’s new
Intel’s new CPU core packs a fistful of enhancements over the original Prescott core. I’m gonna bust out the bullet points in order to give you the highlights.

  • 2MB of L2 cache — In terms of performance, this is the number-one change. The 600 series and the new Extreme Edition both pack a robust 2MB of L2 cache now, twice as much as older P4s. The extra on-board cache memory will boost performance in situations where the CPU can avoid accessing slower main memory in order to complete a task. The benefits of extra cache RAM aren’t universal, though. Some programs cycle through quite a bit more data than 2MB, and won’t benefit from additional cache. Others already fit nicely into a smaller cache, and therefore aren’t helped by more of the same. We’ll explore this dynamic in our performance tests, of course.

    The addition of another meg of L2 cache raises the new core’s transistor count to roughly 169 million, well above the 125 million transistors in the original Prescott core. Thanks to Intel’s 90-nanometer manufacturing process, the chip isn’t incredibly large by today’s standards. Die size is up from 122mm² to 135mm². Larger chips generally tend to consume more power and generate more heat, all other things being equal. In this case, though, other things are not entirely equal.

  • Enhanced power management — The 600 series finally brings Intel’s Enhanced SpeedStep technology to the desktop. Previously used in Intel’s mobile processors, SpeedStep dynamically scales CPU clock speed and voltage in response to load. The new core also includes the enhanced halt state from the Pentium 4 500J-series processors we reviewed not long ago. I’ll explain more about how these new power management features interact shortly.
  • 64-bit extensions — Intel has dubbed its 64-bit extensions EM64T, for Extended Memory 64 Technology, but they are really just a functional clone of AMD’s AMD64 extensions, first implemented in the Opteron processor a couple of years ago. With these extensions and the right software, including a 64-bit operating system and applications compiled to use 64-bit extensions, the Pentium 4 gains the ability to address more than 4GB of RAM (without any workarounds). AMD64 and EM64T also include some additional registers, or local slots on the chip for storing data, that should provide a bit of a performance boost in 64-bit applications. The move to 64-bit computing won’t bring revolutionary new heights of CPU performance overnight, but it will prevent us all from bumping our heads on the 4GB memory address space limitation in the next few years.
  • Execute Disable Bit support — Like the 500J series processors, the new Intel core includes support for the Execute Disable Bit, also called the No Execute (NX) bit by AMD. Operating systems can use this “no execute” capability to help minimize the risks of certain types of security threats, such as buffer overflow exploits.
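
To put the 4GB figure from the 64-bit bullet in concrete terms, here is a small back-of-the-envelope sketch in Python. It only counts how many bytes a flat pointer of a given width can name; the function name is ours, and real processors generally wire up fewer address bits than the pointer width allows.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a flat pointer of this width can name."""
    return 2 ** pointer_bits

# A 32-bit pointer tops out at 4GB; a 64-bit pointer is vastly larger.
print(addressable_bytes(32) // GIB)  # 4
print(addressable_bytes(64) // GIB)  # 17179869184
```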

Notably missing from the features list of the 600 series is support for faster 1066MHz front-side bus speeds. Instead, the P4 600s will roll on an 800MHz bus, as did their predecessors. The 1066MHz bus is reserved for the Pentium 4 Extreme Edition processors.

Speaking of which, the Pentium 4 3.73GHz Extreme Edition is quite a change from the 3.46GHz model. This new Extreme Edition is based on the same Prescott-derived CPU core as the 600 series, while previous Extremes were based on the pre-Prescott “Gallatin” core. That means the new Extreme Edition now has a longer, 31-stage main pipeline and lower clock-for-clock performance. The old EE’s L3 cache is gone by the wayside, replaced by the beefy 2MB of L2 cache in this new core. The new EE can also do the 64-bit dance, but it doesn’t have the fancy power management or enhanced halt state that the 600 series does. The EE 3.73GHz ought to outperform the 600 series thanks to its 1066MHz bus and higher clock speed, but whether it can outperform the EE 3.46GHz is another question.


The original Prescott die (left) versus the new die with 2MB of L2 cache (right)
 

Model number mania!
Since making the move to CPU model numbers, AMD has been abusing the convention by playing mix ‘n’ match with clock speeds, cache sizes, and the number of memory channels on a processor. It’s possible to buy no fewer than four different varieties of Athlon 64 3200+, for example. This kind of marketing shell game may be confusing to the public, but it’s an important esteem-building exercise for the marketing majors inhabiting an engineering-driven company.

Don’t think Intel’s marketing types aren’t salivating to crank up the model number madness themselves. In fact, rather than replace it, the Pentium 4 600 series will coexist with the 500 series for the time being.

But I’m oversimplifying. You see, the 500 series is already being replaced by the 500J series, which includes Execute Disable Bit and C1E halt state support, but not EM64T or SpeedStep.

So the 600 series will coexist with the 500J series, you see.

What about pricing, you ask? Well, in order to make it plain as day, I’ve laid out the list prices for the various AMD and Intel desktop processors in a table. Here’s how it looks.

Pentium 4 model | Clock speed (GHz) | List price | Pentium 4 model | Clock speed (GHz) | List price | Athlon 64 model | List price
520J            | 2.8               | $163       | -               | -                 | -          | 3000+           | $149
530J            | 3.0               | $178       | -               | -                 | -          | 3200+           | $194
540J            | 3.2               | $217       | 630             | 3.0               | $224       | 3400+           | $223
550J            | 3.4               | $278       | 640             | 3.2               | $273       | 3500+           | $272
560J            | 3.6               | $417       | 650             | 3.4               | $401       | 3800+           | $424
570J            | 3.8               | $637       | 660             | 3.6               | $605       | 4000+           | $643

There you have it. Now, when a confused, non-techie friend asks you which Pentium 4 processor he should buy, you have the information to answer confidently, “I have no idea.” You may optionally lapse into extended explanations of clock speed versus cache size, EM64T versus 32 bits, C1E halt state versus SpeedStep, and all the rest, but Intel has already done the hard work for you by pricing the two product lines nearly on top of each other.

To be fair, I think the obvious choice for those in the know will be the 600 series, given its EM64T support. Intel has said that it plans to extend EM64T capability across its desktop line, from the top all the way down to the Celeron D, eventually. That’s apparently not going to happen until later this year, though. Other considerations in the 500-vs.-600-series debate, including performance and power management, are quite a bit more complicated than one might expect.

Sorting out the various types of power management
One of the potentially most compelling features of the new P4 600 series is its support for Enhanced SpeedStep power management. By ramping down the processor’s clock speed and voltage when there’s little work to be done, SpeedStep should help alleviate the Pentium 4’s power and heat problems. It should also allow for much quieter Pentium 4 systems, at least at idle. However, SpeedStep alone is only part of the picture. The newer Pentium 4 processors use the same transistors on the chip in order to implement three different power and heat management functions that are distinct, but fundamentally similar. All three functions dynamically adjust the processor’s clock speed and voltage. They are:

  • C1E enhanced halt state — Introduced in the Pentium 4 500J-series processors, the C1E halt state replaces the old C1 halt state used on the Pentium 4 and most other x86 CPUs. The C1 halt state is invoked when the operating system’s idle process issues a HLT command. (Windows does this constantly when not under a full load.) Entering halt state, which is a lower-power state, will cut a CPU’s power consumption and heat production. Intel’s new C1E halt state is also invoked by the HLT command, but it turns down the entire CPU’s clock frequency (via multiplier control) and voltage in order to work its mojo. This more robust halt state requires significantly less power than the old C1 implementation.

    C1E halt cranks the CPU bus multiplier down to its lowest possible level on the 600-series processors, which is 14X, so a P4 660 processor with the C1E halt state active actually runs at 2.8GHz. I believe that C1E halt is also a binary condition invoked by the HLT command; it’s either on or it’s off.

  • Enhanced SpeedStep — SpeedStep also modulates the CPU clock speed and voltage according to load, but it is invoked via another mechanism. The operating system must be aware of SpeedStep, as must the system BIOS, and then the OS can request frequency changes via ACPI. SpeedStep is more granular than C1E halt, because it offers multiple rungs up and down the ladder between the maximum and minimum CPU multiplier and voltage levels.

    Intel cites its mobile products when talking about SpeedStep, which is apt but not entirely helpful because it conjures up images of the Pentium M processor, a very different beast. The Pentium 4 doesn’t contain most of the heroic power-saving measures of the Pentium M.

  • TM2 thermal throttling — Since the beginning, all Pentium 4 processors have included a facility for throttling themselves back in the event that they should begin to overheat. This facility, called Thermal Monitoring 1 or TM1, essentially tells the Pentium 4 to take half its clock cycles off, cooling the CPU and reducing power by about 50%. Externally, the chip still runs at its rated frequency, but internally, it runs at half that. This throttling mechanism is effective, but it has several disadvantages. As Michael Schuette has noted, the rest of the system doesn’t know what to do with an internally throttling Pentium 4, and the memory subsystem can be thrown into a retry loop that causes it to heat up. For the same reason, TM1 throttling can really harm performance.

    TM2 throttling instead steps down CPU voltage and clock speed via the same mechanism as C1E and SpeedStep, effectively cooling the CPU by roughly 40% without affecting performance as severely as TM1 throttling does. TM2 shouldn’t throw the memory subsystem into a retry loop, either. Like TM1 throttling, TM2 becomes active when the CPU decides it’s getting too hot.
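
Why does turning down both clock speed and voltage pay off so well? A common first-order rule of thumb (our assumption here, not something Intel supplied for these chips) is that dynamic CMOS power scales with frequency times voltage squared. The Python sketch below applies that rule; the voltages are made-up illustrative values rather than specifications for any Pentium 4, and leakage power is ignored entirely.

```python
def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """First-order CMOS estimate: dynamic power scales with f * V^2."""
    return freq_ratio * volt_ratio ** 2

# TM1-style throttling: half the clock cycles are skipped, voltage unchanged.
tm1 = relative_dynamic_power(freq_ratio=0.5, volt_ratio=1.0)

# TM2-style throttling: the multiplier drops from 18x to 14x (3.6GHz to 2.8GHz)
# and the voltage drops as well. 1.40V -> 1.25V is a hypothetical example.
tm2 = relative_dynamic_power(freq_ratio=14 / 18, volt_ratio=1.25 / 1.40)

print(f"TM1: {tm1:.2f} of full dynamic power")
print(f"TM2: {tm2:.2f} of full dynamic power")
```

With these assumed numbers, the TM2-style step sheds a large share of dynamic power while giving up only about 22% of the clock speed, whereas the TM1-style step gives up half of the effective clock.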

Obviously, the practical difference between TM2 throttling and the other two functions is easily discernible. The practical difference between C1E halt and SpeedStep, however, is more difficult to pinpoint and is quite marginal. SpeedStep will adjust clock frequency and voltage more gradually during transitions between idle and busy times, but C1E would seem to accomplish more or less the same thing a little less gracefully. SpeedStep may opportunistically grab a little more power savings here and there on a partially loaded system, but I wouldn’t expect dramatic differences. C1E’s great advantage is that it’s transparent to the operating system and requires no special support other than the already widely used HLT command.
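
The contrast between the two mechanisms can be sketched in a few lines of Python. This is a toy model, not Intel’s or Windows’ actual policy: the linear mapping from load to multiplier is our own assumption. Only the 14X floor comes from the discussion above, and the 18X ceiling follows from the P4 660’s 3.6GHz clock on a 200MHz base bus clock.

```python
MIN_MULT = 14   # lowest multiplier available on this core
MAX_MULT = 18   # P4 660: 18 x 200MHz = 3.6GHz

def c1e_multiplier(cpu_halted: bool) -> int:
    """C1E is binary: floor multiplier while halted, full speed otherwise."""
    return MIN_MULT if cpu_halted else MAX_MULT

def speedstep_multiplier(load: float) -> int:
    """Toy SpeedStep policy: pick a rung on the ladder in proportion to load.

    `load` is CPU utilization from 0.0 to 1.0; the linear mapping is hypothetical.
    """
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range inputs
    return MIN_MULT + round(load * (MAX_MULT - MIN_MULT))

for load in (0.0, 0.25, 0.5, 1.0):
    print(load, speedstep_multiplier(load))
```

At the extremes the two functions agree, which is the point made above: the intermediate rungs only matter on a partially loaded system.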

Oddly, although these two functions use the same transistors, Intel has elected to endow the Pentium 4 500J series with C1E halt but not SpeedStep, while the 600 series gets both. This is apparently a product segmentation decision.

As I noted, this CPU core’s lowest possible CPU bus multiplier is 14X, which explains why the new Extreme Edition 3.73GHz lacks C1E halt, SpeedStep, and TM2 support. Thanks to its 1066MHz front-side bus speed, the new EE runs at a 14X multiplier by default, and thus has no ability to turn down its clock speed. I have to think that, were Intel fully committed to making the P4 Extreme Edition a compelling product, they would have added some lower multipliers to the chip. Doing so probably wasn’t worth the extra effort for such a low-volume product, though.
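
The multiplier arithmetic behind this is easy to check. Here is a minimal Python sketch, assuming the core clock is simply the multiplier times the base bus clock (the quad-pumped bus rating divided by four); the helper names are ours.

```python
def base_clock_mhz(fsb_rating_mhz: float) -> float:
    """Quad-pumped bus: the base clock is one quarter of the rated speed."""
    return fsb_rating_mhz / 4

def core_clock_ghz(multiplier: int, fsb_rating_mhz: float) -> float:
    """Core clock in GHz for a given multiplier and rated bus speed."""
    return multiplier * base_clock_mhz(fsb_rating_mhz) / 1000

# P4 660 on an 800MHz bus: 18x at full speed, 14x in C1E halt.
print(round(core_clock_ghz(18, 800), 2))   # 3.6
print(round(core_clock_ghz(14, 800), 2))   # 2.8

# The Extreme Edition on a 1066MHz bus already sits at the 14x floor.
print(round(core_clock_ghz(14, 1066), 2))  # 3.73
```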

 

Ok, here’s the plan
Now, let’s move on to our test results. We’ll start with 32-bit benchmarks to test the performance impact of the larger L2 cache, and then we’ll move to power consumption, where we can see the influence of the power management features in these new processors. Finally, in a follow-up article, we’ll take a look at 64-bit performance with Windows XP Professional x64.

In order to compare the 32-bit performance of the new Pentium 4 processors to the widest possible range of competitors, including the Pentium M, I’ve reused some test results from my previous articles. I hope you’ll forgive the use of an AGP motherboard on the Athlon 64 system, but there wasn’t time to retest everything. Generally, the move to PCI Express doesn’t change performance much in current applications. If you’d like to see how the Athlon 64 4000+ performs on a couple of new PCI Express chipsets, please have a look at this article. The results are largely comparable to the ones you’ll see here.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test systems were configured like so:

Processor:
    Athlon 64 3200+ 2.0GHz (S939)
    Athlon 64 3500+ 2.2GHz (90nm)
    Athlon 64 3800+ 2.4GHz
    Athlon 64 4000+ 2.4GHz
    Athlon 64 FX-55 2.6GHz
    Pentium M 755 2.0GHz
    Pentium 4 540 3.2GHz
    Pentium 4 550 3.4GHz
    Pentium 4 560 3.6GHz
    Pentium 4 Extreme Edition 3.4GHz
    Pentium 4 Extreme Edition 3.46GHz
    Pentium 4 640 3.2GHz
    Pentium 4 650 3.4GHz
    Pentium 4 660 3.6GHz
    Pentium M 755 at 2.4GHz
    Pentium 4 570J 3.8GHz
    Pentium 4 Extreme Edition 3.73GHz
System bus: 1GHz HyperTransport | 400MHz (100MHz quad-pumped) | 800MHz (200MHz quad-pumped) | 1066MHz (266MHz quad-pumped) | 800MHz (200MHz quad-pumped) | 533MHz (133MHz quad-pumped) | 1066MHz (266MHz quad-pumped)
Motherboard: Asus A8V Deluxe | DFI 855GME-MGF | Abit AA8 DuraMax | Intel D925XECV2
BIOS revision: 1008 beta 1 | 55GMDC06 | 1.4 | CV92510A.86A.0338 | CV92510A.86A.0394.EB | 1.7 | CV92510A.86A.0394.EB
North bridge: K8T800 Pro | 855GME | 925X MCH | 925XE MCH
South bridge: VT8237 | 6300ESB ICH | ICH6R | ICH6R
Chipset drivers: 4-in-1 v.1.11 beta (9/7/04) | INF Update 6.0.1.1002, IAA for RAID 4.5.0.6515 | INF Update 6.0.1.1002, IAA for RAID 4.5.0.6515 | INF Update 6.0.1.1002, IAA for RAID 4.5.0.6515
Memory size: 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs)
Memory type: OCZ PC3200 EL DDR SDRAM at 400MHz | OCZ PC3200 EL DDR SDRAM at 333MHz | OCZ PC2 5300 DDR2 SDRAM at 533MHz | OCZ PC2 5300 DDR2 SDRAM at 533MHz
CAS latency (CL): 2 | 2 | 3 | 3
RAS to CAS delay (tRCD): 2 | 2 | 3 | 3
RAS precharge (tRP): 2 | 2 | 3 | 3
Cycle time (tRAS): 5 | 5 | 10 | 10
Hard drive: Maxtor MaXLine III 250GB SATA 150
Audio: Integrated VT8237/ALC850 with 3.64 drivers | Integrated 6300ESB/ALC655 with 5.10.0.5750 drivers | Integrated ICH6R/ALC880 with 5.10.0.5022 drivers | Integrated ICH6R/ALC880 with 5.10.0.5032 drivers
Graphics: GeForce 6800 GT 256MB AGP with ForceWare 66.81 drivers | GeForce 6800 GT 256MB AGP with ForceWare 66.81 drivers | GeForce 6800 GT 256MB PCI-E with ForceWare 66.81 drivers | GeForce 6800 GT 256MB PCI-E with ForceWare 66.81 drivers
OS: Microsoft Windows XP Professional
OS updates: Service Pack 2, DirectX 9.0c

All tests on the Pentium 4 systems were run with Hyper-Threading enabled.

Thanks to OCZ for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, OCZ’s RAM is definitely worth considering.

Also, all of our test systems were powered by OCZ PowerStream power supply units. The PowerStream was one of our Editor’s Choice winners in our latest PSU round-up.

The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Memory performance
The first few tests are synthetic memory tests. They aren’t a good indication of real-world application performance, but they can illustrate the impact of larger L2 cache on the memory subsystem.

The 600 series lands about where we’d expect overall, though it is a little bit slow in Sandra, possibly because of the extra overhead imposed by the larger L2 cache. In cachemem, which is not so aggressive about doing buffering and the like to get the absolute maximum throughput, the new 2MB L2 chips shine. The P4 does its own speculative prefetching of data into its L2 cache, and with a larger cache, it’s able to achieve quite a bit more throughput.

Linpack’s calculations are performed almost entirely in the L2 cache of the new chips. Notice how the Pentium 4 560’s performance matches that of the 660 until we get to just under 1024K, and then it begins to drop off. The 660 just keeps on truckin’.
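
One rough way to reason about a result like this is to ask whether a program’s working set falls between the old and new cache sizes. The Python sketch below is a deliberate simplification that ignores associativity, prefetching, and the L1 cache; the function and its three categories are our own illustration, not a model Intel provides.

```python
def l2_upgrade_outlook(working_set_kib: int,
                       old_l2_kib: int = 1024,
                       new_l2_kib: int = 2048) -> str:
    """Classify how much a larger L2 cache is likely to help a workload."""
    if working_set_kib <= old_l2_kib:
        return "already fits in the old cache: little gain"
    if working_set_kib <= new_l2_kib:
        return "fits only in the new cache: likely gain"
    return "spills to main memory either way: limited gain"

for size in (512, 1536, 8192):
    print(f"{size}K -> {l2_upgrade_outlook(size)}")
```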

The larger cache doesn’t do much for memory access latency, unfortunately.

 
Memory performance (continued)
Here’s a slightly indulgent look at memory access latencies in more detail. If the following intimidates you, just skip to the next page with the gaming results. Remember, though, to flip back here if the boss is looking over your shoulder.

I’ve colored the data series below according to how they correspond to different parts of the memory subsystem. Yellow is L1 cache, light orange is L2 cache, and orange is main memory. The red series, if present, represents L3 cache. Of course, caches sometimes overlap, so the colors are just an interesting visual guide.

I’ve whittled down the entrants here to one or two representatives from each CPU type in order to keep things manageable. Compare the first two graphs to see the difference between 1MB and 2MB of L2 cache.

The new P4 chips certainly have lots of cache, but the Athlon 64 still gets out to main memory very quickly courtesy of its onboard memory controller.

 

Gaming performance

Doom 3
We’ll begin the real-world tests with Doom 3. We tested performance by playing back a custom-recorded demo that should be fairly representative of most of the single-player gameplay in Doom 3.

The extra cache raises performance by a few frames per second in Doom 3, but that’s it. The Pentium 4 needed more of a boost in order to become competitive.

Far Cry
Our Far Cry demo takes place on the Pier level, in one of those massive, open outdoor areas so common in this game. Vegetation is dense, and view distances can be very long.

Far Cry barely notices the extra cache at all.

 

Unreal Tournament 2004
Our UT2004 demo shows yours truly putting the smack down on some bots in an Onslaught game.

UT2004 doesn’t seem to care much for the additional cache, either, and the new Prescott-based Extreme Edition has trouble catching up to the old one here.

However, could the picture change during actual gameplay? Some folks from Intel suggested to us that we should consider testing gameplay performance with the FRAPS frame rate capture program instead of relying on an in-game benchmarking function. The suggestion makes some sense, because timedemo playback tools don’t always use every aspect of the game engine, such as physics, A.I., and user input routines.

I tried using FRAPS with a couple of games, including Doom 3 and Rome: Total War, but frame rate caps in those games prevented us from being able to show meaningful performance differences between different processors. UT2004, which is very much a CPU-bound game, was a different story. The results below are averaged from five different 150-second gaming sessions played on the same Onslaught map as in our timedemo above, ONS-Torlan. I was playing against computer-controlled bots, so UT2004’s A.I. was working overtime.
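
For anyone who wants to try a similar approach, the averaging itself is simple. The Python sketch below assumes a frame count is available for each timed session, however the capture tool reports it; the numbers are invented placeholders, not results from this article.

```python
from statistics import mean

SESSION_SECONDS = 150  # length of each timed gameplay session

def average_fps(frame_counts, seconds=SESSION_SECONDS):
    """Mean of per-session frame rates, each session being `seconds` long."""
    return mean(count / seconds for count in frame_counts)

# Hypothetical frame counts from five 150-second sessions.
sessions = [9000, 9300, 8850, 9150, 9075]
print(round(average_fps(sessions), 1))
```

Because every session has the same length, this is equivalent to dividing total frames by total time.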

Playing the game yields results similar to demo playback. The 600 series is no faster than the 500 series, and the new EE is slower than the old one. More notably, the Athlon 64 3500+ is quite a bit faster than any of the P4s.

Before we move on, we tried one more thing with UT2004. We tested CPU performance using its software renderer, just to see what would happen.

Not even the software renderer is able to use the extra cache to any advantage. That’s just cold.

 

3DMark05

Finally, we have a benchmark that seems to appreciate the additional cache! 3DMark puts it to good use, bolstering the P4 660 and EE 3.73GHz to the top ranks of its overall CPU index.

 

WorldBench overall performance
WorldBench uses scripting to step through a series of tasks in common Windows applications, then produces an overall score for comparison. More impressively, WorldBench also spits out individual results for its component application tests, allowing us to compare performance in each. We’ll look at the overall score, and then we’ll show individual application results alongside the results from some of our own application tests.

There’s not much daylight between the different processors in the overall WorldBench score, but the 600 series’ extra cache seems to be worth a point or two.

Audio editing and encoding

LAME MP3 encoding
We used LAME to encode a 101MB 16-bit, 44kHz audio file into a very high-quality MP3. The exact command-line options we used were:

lame --alt-preset extreme file.wav file.mp3
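
A timing harness for this sort of test could look like the Python sketch below. It assumes a `lame` binary on the PATH that accepts the `--alt-preset extreme` option; the helper names and file names are placeholders, and this is not the script used to produce the results in this article.

```python
import subprocess
import time

def build_lame_command(src, dst):
    """Command line matching the options listed above."""
    return ["lame", "--alt-preset", "extreme", src, dst]

def time_encode(src, dst):
    """Run the encoder once and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(build_lame_command(src, dst), check=True)
    return time.perf_counter() - start

# Show the command that would be run; the encode itself is not started here.
print(" ".join(build_lame_command("file.wav", "file.mp3")))
```

Calling `time_encode("file.wav", "file.mp3")` on a machine with LAME installed would then return the encode time in seconds.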

MusicMatch Jukebox

The music encoding apps don’t appear to appreciate more cache, but the P4 is still competitive in LAME encoding, regardless.

 

Video encoding and editing

XMPEG DivX video encoding
We used the default settings for the DivX codec to encode a 3000-frame sequence from a DVD-formatted MPEG2 source file.

Windows Media Encoder

Adobe Premiere

VideoWave Movie Creator

All of the video encoding applications benefit from 2MB of L2 cache, surprisingly enough, and the Pentium 4 600 series looks even stronger overall than the 500 series does.

 

Image processing

Adobe Photoshop

ACDSee PowerPack

picCOLOR
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his picCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. We’re using a new build of picCOLOR this time out; it removes the video tests, which are highly dependent on the chipset and video card, from the calculation of the overall score.

ACDSee couldn’t care less about the larger L2 cache, but both Photoshop and picCOLOR see marginal performance gains.

 

Multitasking and office applications

MS Office

WorldBench’s Office test involves switching between the various components of the Office suite, which are all running at once. This test is a nice showcase for the Pentium 4’s Hyper-Threading capabilities (although some of the tests we’ve already seen are also multithreaded). The 600 series is a little quicker than the 500 series, as well, in this test.

Mozilla

More cache is better, but Mozilla still prefers the Pentium M and Athlon 64.

Mozilla and Windows Media Encoder

This multitasking test allows the Pentium 4 600s to get a little revenge. When you combine web browsing with video encoding, the P4 looks relatively stronger.

 

Other applications

Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

We have a new champ in Sphinx, perhaps thanks to the new core’s large cache and aggressive prefetching.

WinZip

Oddly, the P4 600s end up slower, somehow, in WinZip.

Nero

The P4 systems roll in Nero, and the new core handles itself especially nicely here.

 

3D modeling and rendering

Cinebench 2003
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration. Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading, as you can see in the results.

The 2MB L2 cache helps slightly in the shading tests, but not at all in the Cinebench renderer, where the Pentium 4 was already quite fast.

 

3ds max
We have used 3ds max in the past for CPU testing, but most of those tests have consisted of rendering only. WorldBench’s 3ds max tests replicate an entire modeling and animation work session, stressing the graphics card as well as the CPU and the rest of the system.

What a difference an API can make. The larger L2 cache of the 600 series makes the P4 more competitive in OpenGL, but not in DirectX, where the cache shows little benefit.

POV-Ray
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least, because it’s designed to be a cross-platform application. POV-Ray also relies heavily on x87 FPU instructions to do its work, and it contains only minor SSE optimizations.

POV-Ray is all about the MHz and none about the cache size. That’s why the Athlon 64 3800+ matches the performance of the 4000+ (both run at the same speed), and it’s also why the P4 600 chips aren’t any faster than the 500s.

 

Power consumption
In order to test the power management features of the P4 600 series, we decided to throw out all of our old results and start over with fresh test systems. There were a number of reasons for this decision, including the fact that our prior Athlon 64 motherboard didn’t properly support Cool’n’Quiet, AMD’s power management technology. Below you’ll find the test configurations we used for power consumption testing.

Processor:
    Athlon 64 3200+ 2.0GHz (S939)*
    Athlon 64 3500+ 2.2GHz (90nm)
    Athlon 64 3800+ 2.4GHz
    Athlon 64 4000+ 2.4GHz
    Athlon 64 FX-55 2.6GHz
    Pentium M 755 2.0GHz
    Pentium 4 540 3.2GHz*
    Pentium 4 550 3.4GHz*
    Pentium 4 560 3.6GHz
    Pentium 4 540J 3.2GHz*
    Pentium 4 550J 3.4GHz*
    Pentium 4 560J 3.6GHz*
    Pentium 4 570J 3.8GHz
    Pentium 4 640 3.2GHz*
    Pentium 4 650 3.4GHz*
    Pentium 4 660 3.6GHz
    Pentium M 755 at 2.4GHz*
    Pentium 4 Extreme Edition 3.46GHz
    Pentium 4 Extreme Edition 3.73GHz
System bus: 1GHz HyperTransport | 400MHz (100MHz quad-pumped) | 800MHz (200MHz quad-pumped) | 533MHz (133MHz quad-pumped) | 1066MHz (266MHz quad-pumped)
Motherboard: DFI LANParty nF4 SLI-DR | DFI 855GME-MGF | Intel D925XECV2
BIOS revision: 2/9/2005 beta | 55GMDC06 | CV92510A.86A.0394.EB
North bridge: nForce4 SLI | 855GME | 925XE MCH
South bridge: - | 6300ESB ICH | ICH6R
Chipset drivers: SMBus driver 4.45 | INF Update 6.3.0.1007 | INF Update 6.3.0.1007
Memory size: 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs)
Memory type: OCZ PC3200 EL DDR SDRAM at 400MHz | OCZ PC3200 EL DDR SDRAM at 333MHz | OCZ PC2 5300 DDR2 SDRAM at 533MHz
CAS latency (CL): 2 | 2 | 3
RAS to CAS delay (tRCD): 2 | 2 | 3
RAS precharge (tRP): 2 | 2 | 3
Cycle time (tRAS): 5 | 5 | 10
Hard drive: Maxtor DiamondMax 10 250GB SATA 150
Audio: Integrated nForce/ALC850 with Realtek 5.10.0.5780 drivers | Integrated 6300ESB/ALC655 with Realtek 5.10.0.5780 drivers | Integrated ICH6R/ALC880 with Realtek 5.10.0.5034 drivers
Graphics: GeForce 6800 Ultra 256MB PCI-E with ForceWare 66.93 drivers | GeForce 6800 Ultra 256MB AGP with ForceWare 66.93 drivers | GeForce 6800 Ultra 256MB PCI-E with ForceWare 66.93 drivers
OS: Microsoft Windows XP Professional
OS updates: Service Pack 2, DirectX 9.0c

Please note that some of the CPUs in the spec table above and on the graphs below are marked with an asterisk. That’s because they are not true-blue versions of the chips in question. They are, instead, CPUs rated for higher speed grades that have been underclocked to match the model in question. For instance, Intel only supplied us with a Pentium 4 660 review sample, so we reduced its CPU multiplier to simulate a 650 and 640. I mention this fact because every chip is different, and generally, the ones that are chosen to run at higher speed grades exhibit better power and heat characteristics, as well.

That said, clock speed and voltage are still major determinants of overall power consumption, so we decided to include these “simulated” CPU models in our results. In all cases, we allowed the motherboard to determine the appropriate CPU voltage levels. That’s handled dynamically with SpeedStep and Cool’n’Quiet, anyway.

Speaking of simulation, because the DFI desktop motherboard for the Pentium M doesn’t support SpeedStep, we used the RMClock utility to simulate SpeedStep in our testing. The comparison won’t be exact, because RMClock uses its own algorithm to scale clock frequencies, but we did set the RMClock profile to match the Pentium M 755’s minimum clock speed of 600MHz and minimum voltage of 0.988V.

We measured the power consumption of our entire test systems, except for the monitor, at the wall outlet using a Watts Up PRO watt meter. The test rigs were all equipped with OCZ PowerStream 520W power supply units. The idle results were measured at the Windows desktop, and we used Cinebench 2003’s rendering test to load up the CPUs. For P4s, we used the multithreaded version of the test to take advantage of Hyper-Threading.

Finally, the graphs below have results for “power management” and “no power management.” That deserves some explanation. By “power management,” we mean SpeedStep or Cool’n’Quiet. In the case of the Pentium 4 600-series and 500J-series processors, the C1E halt state is always available, even in the “no power management” tests.

The first thing one notices about power consumption at idle is that there’s practically no difference between the Pentium 4 600-series chips with SpeedStep enabled or disabled. That’s because the C1E halt state accomplishes essentially the same thing. Either way, though, the 600-series and 500J-series CPUs both consume quite a bit less power at idle than the Prescott chips that don’t support C1E or SpeedStep, like the P4 560 or the new Extreme Edition 3.73GHz.

Because current Athlon 64 processors don’t have anything comparable to the C1E halt state, they pull more juice at idle than the newer Pentium 4 chips. With Cool’n’Quiet enabled, though, all of the Athlon 64 processors consume even less power than the C1E and SpeedStep-enabled P4s. There’s no doubt, though, that Intel has made great strides.

Under load, SpeedStep and Cool’n’Quiet don’t have any significant impact, but we do see that the P4 600 series manages to consume less power than the older Prescott-based processors we’re testing. I wouldn’t attribute that difference to the presence of the C1E halt state, in part because Cinebench is a very full CPU load, and in part because the P4 500 and 500J models consume roughly comparable amounts of power at the same clock speed. The 600-series processors, and even the new P4 3.73GHz Extreme Edition, are relatively more efficient under load—despite the fact that they’re packing another 1MB of L2 cache.

I asked Intel what to make of these results, but unfortunately, they weren’t able to give me an answer before this article went online. I hate to speculate about why the newer P4s with 2MB of L2 cache aren’t drawing as much power as the older models with 1MB of L2. Is it just that newer chips have better electrical properties, or has the new CPU core been otherwise tweaked? Perhaps we’ll get some answers from Intel before too long.

Whatever the reason, the new P4 core does require relatively less power under load than the older chips that we tested. These things do vary from chip to chip, so I don’t want to make too much of these results from just a few processors. Indications are certainly good, though. That said, the 90nm version of the Athlon 64 3500+ still pulls about 60W less under load than the P4 650 does, and the Pentium M is even more efficient.

 
Conclusions
All told, the Pentium 4 600 series represents good progress for Intel. Power consumption is down, performance is up a bit, and some of the new features are important developments, like 64-bit support and SpeedStep. Unfortunately, the jump from 1MB of L2 cache to 2MB doesn’t seem to offer big returns in most desktop applications, but it doesn’t hurt, either. I haven’t yet tested 64-bit performance, so I won’t comment on that, although it’s nice to know the capability is in the chip. (We’ll look at 64-bit performance shortly in a follow-up article.)

Having said that, I am a little bit conflicted about what to think. The truth is that all of the new capabilities in this processor—dynamic power management, NX bit protection, 64-bit extensions, and better performance than the Pentium 4 500 series—were available in the first Athlon 64 processors that debuted in September 2003. AMD vaulted so far out ahead of Intel in terms of technology and performance that it’s taken quite a while for Intel to catch up.

Fortunately for Intel, AMD hasn’t done much since the Athlon 64’s debut but monkey with cache sizes, add another memory channel, and ratchet up clock speeds a couple of notches. The 600 series is getting closer to the Athlon 64 in terms of overall attractiveness, and AMD needs to answer in order to retain the lead. In fact, if you don’t care about playing games on your PC, the Pentium 4 600 series is as good a choice as any. The overall WorldBench scores illustrate the general parity between Intel and AMD offerings at each price point. Given Intel’s dominance at the big PC makers, the 600 series is probably as good as it needs to be in order to become a sales success.

One thing that threatens that success is the funky model numbering and pricing mix that Intel has chosen to present to the hapless consumer.

Actually, I take that back. Hap or no hap, it’s confusing.

If you’ll recall from the beginning of this article, the Pentium 4 550J is priced just five dollars below the P4 640. Overall, the performance race between the two would have to go to the 550J, because 200MHz of clock speed is worth more than the move from 1MB of L2 cache to 2MB. The 550J lacks SpeedStep, but with the C1E halt state, that’s quite arguably a moot point. So the decision between the two comes down to this, I suppose: do you want slightly higher performance in 32-bit apps or, for five bucks more, 64-bit capabilities? Ask that of Joe Schmoe and he’ll deck you.

I suppose Intel will rely on PC makers to package up the 600 series and make it all work, but the strategy, on the face of it, is confusing. Surely the larger model number with the bigger cache and 64-bit extensions will sell best. Unless consumers buy primarily based on MHz. Is Intel experimenting a little here? Sure seems like it.

I should probably say a word or two about the P4 Extreme Edition 3.73GHz. At its customary price of $999, the Extreme Edition was never a bargain hunter’s dream chip. This new 3.73GHz version performs comparably to the previous 3.46GHz one, but no better. The move to a Prescott-based Extreme Edition processor was no doubt inevitable, and the move does bring 64-bit support, but it’s an even trade. Go buy an Athlon 64 3500+ if you want a gamer’s CPU. It’s faster than any Extreme Edition, and you can pocket the $727 you save (or better yet, buy an obscenely expensive graphics card). Personally, I’d rather have a Pentium 4 660 than an Extreme Edition 3.73GHz. Without SpeedStep or the C1E halt state, the Extreme Edition is less attractive than its 600-series siblings.

Now, there’s 64-bit performance testing to be done, so I’ll bring this one to a close. Stay tuned for the next chapter. 
