AMD’s new Athlon 64 3000+ arguably resides in the sweet spot now, priced at less than $220 for an honest-to-goodness 2GHz “Hammer” microprocessor with a built-in memory controller and true 64-bit computing capabilities. To see how the 3000+ measures up, we’ve benchmarked it against 11 of its closest competitors. Keep reading to see what we found.
Clawhammer declawed?
The Athlon 64 3000+ drops into a 754-pin socket, just like the Athlon 64 3200+ and 3400+ chips. That means it can support one channel of DDR400 memory, not two like the super-expensive Athlon 64 FX series. Have a look:
Beyond that, the Athlon 64 3000+ runs at 2GHz and is otherwise identical to the Athlon 64 3200+, save one thing: it has half the L2 cache (512K) of the other A64 chips. Hence the lower performance rating than the A64 3200+.
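For a rough sense of what one memory channel versus two means on paper, here is a minimal sketch of the theoretical peak bandwidth math. It assumes a standard 64-bit (8-byte) DDR channel, and the function name is made up for illustration. Real-world throughput will be lower than these peak numbers.

```python
# Theoretical peak DDR bandwidth: transfers per second x bus width x channels.
# Assumption: each DDR channel is 64 bits (8 bytes) wide.

def peak_bandwidth_gbs(mega_transfers_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

single = peak_bandwidth_gbs(400, channels=1)  # 754-pin Athlon 64, one channel of DDR400
dual = peak_bandwidth_gbs(400, channels=2)    # dual-channel parts such as the Athlon 64 FX

print(f"Single-channel DDR400: {single:.1f} GB/s")
print(f"Dual-channel DDR400:   {dual:.1f} GB/s")
```

The second channel doubles the paper figure, but whether an application ever notices is a separate question, and one the benchmarks are better placed to answer.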
If the array of AMD Hammer variants has you confused, you’re not alone. AMD is using its model number pricing mojo to segment its product line according to some unconventional measures, like cache size and number of memory channels, instead of just clock frequency. It’s bewildering, especially because memory bandwidth and cache size don’t always affect performance in a given task. The table below will bring you up to speed on AMD’s current lineup of Athlon 64 chips.
Processor | Clock speed | Memory channels | L2 cache | Price |
Athlon 64 3000+ | 2.0GHz | 1 | 512KB | $218 |
Athlon 64 3200+ | 2.0GHz | 1 | 1MB | $278 |
Athlon 64 3400+ | 2.2GHz | 1 | 1MB | $417 |
Opteron 146 | 2.0GHz | 2 | 1MB | $438 |
Athlon 64 FX-51 | 2.2GHz | 2 | 1MB | $733 |
Opteron 148 | 2.2GHz | 2 | 1MB | $733 |
One wonders how long AMD will be able to sustain this fine-grained model distribution strategy. The model number rating system has given the company additional flexibility, but AMD risks straining the credibility of its rating system by selling three different 2GHz Hammer models at different prices. For many tasks, they will perform almost identically.
The most important thing you need to know about all of this, of course, is that the A64 3000+ costs half as much as the most expensive 2GHz Hammer, the Opteron 146, and sixty bucks less than the A64 3200+. Hence the A64 3000+’s residency in the proverbial sweet spot.
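The price arithmetic is easy to check against the lineup table. This is just a sketch using the prices listed above; the helper name is hypothetical.

```python
# Prices in US dollars, copied from the lineup table above.
prices = {
    "Athlon 64 3000+": 218,
    "Athlon 64 3200+": 278,
    "Opteron 146": 438,
}

def compare(cheaper: str, pricier: str) -> tuple[int, float]:
    """Return (dollar difference, price ratio) between two listed chips."""
    low, high = prices[cheaper], prices[pricier]
    return high - low, low / high

for rival in ("Athlon 64 3200+", "Opteron 146"):
    diff, ratio = compare("Athlon 64 3000+", rival)
    print(f"vs {rival}: ${diff} cheaper, {ratio:.0%} of the price")
```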
A few test notes
Last time out, when we reviewed the Athlon 64 3400+, a compatibility problem with the MSI K8T Neo motherboard and the Corsair DIMMs we used for testing prevented us from running the 754-pin Athlon 64 chips with 1GB of memory. They were stuck with 768MB, while our comparative systems all had 1GB of RAM. Since then, Corsair and MSI have resolved the problem, so we have new results with 1GB memory for the Athlon 64 3200+ and 3400+, as well as the 3000+. (For the record, the fix was a new BIOS for the K8T Neo. The BIOS was possibly reading the aggressive SPD on the Corsair RAM and trying to boot at that speed, despite any manual BIOS settings.)
Not only that, but we’ve retested the Opteron 146 with CAS 2 memory, bringing it up to speed with the rest of the pack. Because we are comparing five different variants of the same chip at two different clock speeds, we figured the extra time retesting everything would be especially well spent. Given how close some of the results are, I think you’ll agree.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
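If you want to reproduce that run-and-average approach at home, a minimal sketch might look like the following. The function name and the sample frame rates are made up for illustration; they are not results from our testing.

```python
from statistics import mean

def average_runs(runs: list[float]) -> float:
    """Average repeated benchmark runs, insisting on at least two of them."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to average")
    return mean(runs)

# Hypothetical frame rates from two passes of the same timedemo.
sample_runs = [100.0, 102.0]
print(f"Reported score: {average_runs(sample_runs):.1f} fps")
```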
Our test systems were configured like so:
Processor | Athlon XP ‘Barton’ 3200+ 2.2GHz | Athlon XP ‘Barton’ 2500+ 1.83GHz, Athlon XP ‘Barton’ 2800+ 2.083GHz | AMD Athlon 64 3000+ 2.0GHz, AMD Athlon 64 3200+ 2.0GHz, AMD Athlon 64 3400+ 2.2GHz | AMD Opteron 146 2.0GHz, AMD Athlon 64 FX-51 2.2GHz | Pentium 4 ‘C’ 2.4GHz, Pentium 4 ‘C’ 2.8GHz, Pentium 4 3.2GHz, Pentium 4 3.2GHz Extreme Edition |
Front-side bus | 400MHz (200MHz DDR) | 333MHz (166MHz DDR) | HT 16-bit/800MHz downstream, HT 16-bit/800MHz upstream | HT 16-bit/800MHz downstream, HT 16-bit/800MHz upstream | 800MHz (200MHz quad-pumped) |
Motherboard | Asus A7N8X Deluxe v2.0 | Asus A7N8X Deluxe v2.0 | MSI K8T Neo | MSI 9130 | Abit IC7-G |
North bridge | nForce2 SPP | nForce2 SPP | K8T800 | K8T800 | 82875P MCH |
South bridge | nForce2 MCP-T | nForce2 MCP-T | VT8237 | VT8237 | 82801ER ICH5R |
Chipset drivers | nForce Unified 2.45 | nForce Unified 2.45 | 4-in-1 v.4.49, ATA 5.1.2600.10, Audio 5.10.0.5920 | 4-in-1 v.4.49, AGP 4.42, Audio 6.14.1.3870 | INF Update 5.0.1015, ATA 5.0.1007.0, Audio 5.10.0.5250 |
Memory size | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs) | 1GB (2 DIMMs) |
Memory type | Corsair TwinX XMS4000 DDR SDRAM at 400MHz | Corsair TwinX XMS4000 DDR SDRAM at 333MHz | Corsair TwinX XMS4000 DDR SDRAM at 400MHz | Corsair CMX512RE-3200LL PC3200 registered DDR SDRAM at 400MHz | Corsair TwinX XMS4000 DDR SDRAM at 400MHz |
Hard drive | Seagate Barracuda V 120GB ATA/100 | Seagate Barracuda V 120GB ATA/100 | Seagate Barracuda V 120GB SATA 150 | Seagate Barracuda V 120GB SATA 150 | Seagate Barracuda V 120GB SATA 150 |
Audio | nForce2 MCP/ALC650 | nForce2 MCP/ALC650 | VT8237/ALC650 | VT8237/ALC201A | ICH5/ALC650 |
Graphics | NVIDIA GeForce FX 5900 Ultra |
OS | Microsoft Windows XP Professional |
OS updates | Service Pack 1, DirectX 9.0b |
All tests on the Pentium 4 systems were run with Hyper-Threading enabled.
Thanks to Corsair for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering.
The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- Cachemem 2.65MMX
- SiSoft Sandra MAX3! (2003.7.9.73)
- Compiled binary of C Linpack port from Ace’s Hardware
- Discreet 3ds max 5.1 SP1
- NewTek Lightwave 7.5
- Cinebench 2003
- POV-Ray for Windows v3.5
- PICCOLOR v4.0 build 451
- SPECviewperf 7.1
- ScienceMark 2.0 beta (06SEP03-A build)
- Sphinx 3.3
- LAME 3.93.1 (build from mitiok.cjb.net)
- Xmpeg 5.0.1 with DivX Video 5.05
- FutureMark 3DMark03 build 330
- Comanche 4 demo
- Quake III Arena v1.31
- Serious Sam SE v1.07
- Unreal Tournament 2003 demo v.2206
- Wolfenstein: Enemy Territory v2.55
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
As always, we’ll start off with memory performance, where we can see the effects of the 3000+’s built-in, single-channel memory controller and 512K L2 cache.
Bandwidth-wise, the A64 3000+ is nearly identical to the other 754-pin Athlon 64 chips. Linpack, however, will show us the cache size difference visually.
The A64 3000+’s performance line swoops down starting at matrix sizes of about 500K, as does the Athlon XP 3200+. Both chips have 512K of L2 cache. Notice, however, that the A64 3000+’s performance is quite a bit higher at larger matrix sizes than the Athlon XP, probably thanks to its integrated memory controller.
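To get a feel for why performance falls off once the working set outgrows the L2 cache, here is a back-of-the-envelope sketch. It assumes the matrix is stored as 8-byte double-precision values and ignores everything else competing for cache space, including the L1 cache's contribution, so treat the numbers as ballpark figures only. The function name is hypothetical.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: the benchmark operates on 64-bit floats

def largest_square_matrix(cache_bytes: int) -> int:
    """Largest n such that an n x n matrix of doubles fits in cache_bytes."""
    return math.isqrt(cache_bytes // BYTES_PER_DOUBLE)

for label, size_kb in [("512K L2", 512), ("1MB L2", 1024)]:
    n = largest_square_matrix(size_kb * 1024)
    print(f"{label}: roughly a {n} x {n} matrix of doubles")
```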
The cachemem latency numbers really show off the Athlon 64’s quickness to memory. The Opteron 146 and Athlon 64 FX require registered DIMMs, slowing them down a bit.
Our funkified 3D graphs will show off cachemem latency numbers in more detail. As a guide, I’ve color coded the various data rows. The yellow rows are primarily accesses to the processor’s L1 cache, while the amber rows are L2 cache. The darker orange rows represent accesses primarily to main memory. Oh, and the bright red bars on the Pentium 4 Extreme Edition graph represent L3 cache accesses. Also, the graphs below are sorted in rough order of overall latency, but don’t hold me to that.
You can see that the A64 3000+ gets out to main memory very quickly, although it has one less amber row than the other Athlon 64 chips, because of its smaller L2 cache. The 3000+’s relatively low memory access latencies should help ease the loss of the additional on-chip cache. In fact, let’s put that theory to the test…
Unreal Tournament 2003
The A64 3000+ is the slowest of the AMD Hammer processors, but it still manages to outrun the Pentium 4 3.2GHz in Unreal Tournament.
Quake III Arena
Quake III definitely likes lots of on-chip cache—witness the stunning performance of the Pentium 4 Extreme Edition with 2MB L3 cache. Still, the A64 3000+ lives up to its model number, coming out just a step behind the Pentium 4 3.2GHz in Q3A.
Wolfenstein: Enemy Territory
You’ve got to ask yourself one question: Is an additional 3.6 frames per second worth another 60 bucks for the A64 3200+? Well is it, punk?
Comanche 4
Serious Sam SE
3DMark03
The gaming picture for the A64 3000+ is pretty clear. Against the Pentium 4, this CPU more than lives up to its model number, putting the hurt on the Pentium 4 3.2GHz in several games, and never blinking when it doesn’t. Versus the A64 3200+, the 3000+ is consistently just a few frames per second slower, but rarely more than a few percent.
Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.
There are two goals with Sphinx. The first is to run it faster than real time, so real-time speech recognition is possible. The second, more ambitious goal is to run it at about 0.8 times real time, where additional CPU overhead is available for other sorts of processing, enabling Sphinx-driven real-time applications.
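Those two goals can be expressed as a simple ratio. The sketch below assumes the score is processing time divided by the length of the audio clip, so lower is better and 1.0 means exactly real time. The function names and sample timings are hypothetical, not taken from the benchmark itself.

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Processing time as a multiple of audio length (1.0 = exactly real time)."""
    return processing_s / audio_s

def classify(rtf: float) -> str:
    """Map a real-time factor onto the two goals described above."""
    if rtf <= 0.8:
        return "headroom for other processing"
    if rtf < 1.0:
        return "faster than real time"
    return "slower than real time"

# Hypothetical timings for a 60-second audio clip.
for seconds in (40.0, 54.0, 72.0):
    rtf = real_time_factor(seconds, 60.0)
    print(f"{seconds:.0f}s of processing -> {rtf:.2f}x real time: {classify(rtf)}")
```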
The difference between 512K and 1MB of L2 cache in Sphinx is essentially nil. The A64 3000+ and 3200+ are dead even.
LAME MP3 encoding
We used LAME 3.92 to encode a 101MB 16-bit, 44KHz audio file into a very high-quality MP3. The exact command-line options we used were:
lame --alt-preset extreme file.wav file.mp3
MP3 encoding also isn’t too sensitive to cache sizes. The 2GHz Hammer chips are all bunched together.
DivX video encoding
Xmpeg is partially self-tuning, and it chose to use the SSE2 Optimized iDCT on the Hammer processors.
As expected, DivX encoding isn’t hindered much by the 3000+’s smaller L2 cache, either. With this kind of media encoding performance, the A64 3000+ could make a very promising CPU for home-theater PCs.
3ds max rendering
We begin our 3D rendering tests with Discreet’s 3ds max, one of the best known 3D animation tools around. 3ds max is both multithreaded and optimized for SSE2. We rendered a couple of different scenes at 1024×465 resolution, including the Island scene shown below. Our testing techniques were very similar to those described in this article by Greg Hess. In all cases, the “Enable SSE” box was checked in the application’s render dialog.
Again, the L2 cache delta has little effect on performance, and neither does memory bandwidth. The Hammer 2GHz chips all perform about the same.
Lightwave rendering
NewTek’s Lightwave is another popular 3D animation package that includes support for multiple processors and is highly optimized for SSE2. Lightwave can render very complex scenes with realism, as you can see from the sample scene, “A5 Concept,” below.
Lightwave uses SSE2 well enough that more threads don’t really help, or so it seems. All the results below are single-threaded.
I seem to recall saying something about L2 cache size and memory bandwidth not always affecting performance in certain tasks. But that was just crazy talk, right?
POV-Ray rendering
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least. Don’t ask me why—seems crazy to me. POV-Ray also relies more heavily on x87 FPU instructions to do its work, because it contains only minor SIMD optimizations.
Like the other rendering apps, POV-Ray doesn’t benefit much from additional on-chip cache.
Cinebench 2003 rendering and shading
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration.
Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading. For the AMD-based systems, I’ve reported the single-processor results. For the P4 systems, I’ve reported the multi-threaded results, which in all cases were notably faster.
The 3000+ escapes Cinebench largely unscathed. None of the AMD processors can keep up with the competing P4s in the rendering test, in part because Cinebench uses Intel’s SIMD extensions and Hyper-Threading very effectively.
SPECviewperf workstation graphics
SPECviewperf simulates the graphics loads generated by various professional design, modeling, and engineering applications.
The A64 3000+ is, again, essentially identical to the 3200+ throughout the viewperf suite.
ScienceMark
I’d like to thank Alex Goodrich for his help working through a few bugs in the 2.0 beta version of ScienceMark. Thanks to his diligent work, I was able to complete testing with this impressive new benchmark, which is optimized for SSE, SSE2, and 3DNow!, and is multithreaded as well.
In the interest of full disclosure, I should mention that Tim Wilkens, one of the originators of ScienceMark, now works at AMD. However, Tim has sought to keep ScienceMark independent by diversifying the development team and by publishing much of the source code for the benchmarks at the ScienceMark website. We are sufficiently satisfied with his efforts, and impressed with the enhancements to the 2.0 beta revision of the application, to continue using ScienceMark in our testing.
The molecular dynamics simulation models “the thermodynamic behaviour of materials using their forces, velocities, and positions”, according to the ScienceMark documentation.
Many of the ScienceMark tests seem to be limited by the CPU’s computational prowess rather than by the cache and memory subsystem. Once more, we see very similar (and very respectable) performance out of all the 2GHz Hammer CPUs.
picCOLOR image analysis
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his picCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. The overall score is indexed to a Pentium III 1GHz system based on a VIA Apollo Pro 133. In other words, the reference system would score a 1.0 overall.
The A64 3000+ just edges out the P4 3.2GHz overall.
The Athlon 64 3000+ won’t be getting its own wing in the Halls of Overclocking Greatness, ensconced between the Hall of the Celeron 300A and the Pentium 4 2.4C exhibit. Still, I was able to pump the 3000+ up to the speed of the top AMD Hammer chips, 2.2GHz, without too much drama.
Doing so required turning up the base system clock from 200MHz to 220MHz, which in turn jacked up RAM speeds from 400MHz to 440MHz. At that speed, even with extremely lax memory timings and downright abusive memory voltage settings, the system just wasn’t stable. The remedy: turn down the basic memory clock to DDR333 speeds. After overclocking the main system clock to 220MHz, the RAM was then running at 366MHz, or 183.3MHz before DDR did its clock-doubling dance:
I set some fairly aggressive memory timings, and the system outperformed the stock setup by a fair margin.
As always with overclocking, your mileage will likely vary.
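The clock arithmetic behind that overclock can be sketched as follows. This is a simplified model that treats the memory clock as a fixed ratio of the base system clock. The 10x CPU multiplier is inferred from the stock 2GHz and 200MHz figures, and the names here are made up for illustration; the actual hardware may derive its memory clock differently.

```python
from fractions import Fraction

CPU_MULTIPLIER = 10  # inferred: 2.0GHz stock speed / 200MHz base clock

# Memory settings modeled as a ratio of the base clock (a simplifying assumption).
DDR400 = Fraction(1, 1)  # memory clock equals the base clock
DDR333 = Fraction(5, 6)  # 166.67MHz memory clock at a 200MHz base clock

def clocks(base_mhz: int, mem_ratio: Fraction) -> tuple[int, float, float]:
    """Return (CPU MHz, memory clock MHz, effective DDR rate MHz)."""
    cpu = base_mhz * CPU_MULTIPLIER
    mem = float(base_mhz * mem_ratio)
    return cpu, mem, 2 * mem  # DDR transfers data twice per clock

for label, ratio in [("DDR400 setting", DDR400), ("DDR333 setting", DDR333)]:
    cpu, mem, ddr = clocks(220, ratio)
    print(f"{label}: CPU {cpu}MHz, memory {mem:.1f}MHz ({ddr:.1f}MHz effective)")
```

At a 220MHz base clock, the DDR333 setting works out to roughly 183.3MHz, or about 366MHz effective, which matches the speeds described above.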
It sounds trite to say, but the Athlon 64 3000+ performs about like one might expect a 2GHz AMD Hammer processor with 512K of L2 cache to perform. That is, it’s an exceptional performer for 3D gaming, and it has few weaknesses overall. Versus the Pentium 4 3GHz, the A64 3000+ is a very good value. The difference between the Athlon 64 3000+ and 3200+ hinges entirely on the usefulness of the larger 1MB L2 cache on the 3200+ model. Right now, that larger L2 cache will cost you about $60 American money, and since you’ve seen the benchmark scores, you can decide for yourself whether it’s worth paying the extra cash.
At about $220, the Athlon 64 3000+ isn’t a cheap processor, but its performance makes older Athlon XP chips look a little antiquated, especially in applications where fast memory performance and SSE2 instructions are important. This CPU has all the next-generation Hammer advantages, including the built-in memory controller and 64-bit extensions, to give it some longevity, too. That’s why I think it’s sitting right in the sweet spot of AMD’s processor lineup. If you’re looking to build a new gaming rig or do-it-all workstation PC, the A64 3000+ should be prominent on your radar screen.