Some speculation about that Larrabee die shot
We don't yet know as much as we'd like to about Intel's upcoming Larrabee GPU-CPU hybrid, but enough useful information has leaked out over the past little while to give us the ability to speculate a bit. Intel has disclosed many of the architecture fundamentals, but one of the big missing pieces of the puzzle has been the specific number of cores and other types of hardware that the first implementations will have. The release of a fuzzy die shot yesterday, therefore, caused a bit of a stir around here, with the TR editors sitting around peering at their monitors and exchanging puzzled IMs about what's what.
I started forming some theories eventually, and after poking around online, I was pleased to see that some folks in the B3D discussion thread had some similar ideas. We don't really know much about the particular chip shown in the die shot, but given what we know about the architecture from Larry Seiler's Siggraph paper and Michael Abrash's overview of the instruction set, some possibilities become apparent.
If you look closely at this high-res version of the die shot, you'll see that the chip is laid out in three rows. The design of the chip looks to be fairly modular, with repeating areas of uniform structures of several types. The most common unit of the chip is most likely the x86-compatible Larrabee shader core, and the dark areas at the ends of its long, rectangular shape are probably cache of some sort, either L1, L2, or both. We know that each core has L1 data and instruction caches, plus 256KB of L2 cache. By my count, there are a total of 32 cores on the chip—10 on the top row, 12 in the middle, and 10 in the bottom row.
Along with the cores are two other types of regular blocks on the chip. The larger of these two is a little narrower than a core and has a lot of dark area, which suggests cache or other storage. I count eight of those. There's also one other block type, a narrow column, of which there are four total, two in the top row and two in the bottom. (After I had sorted all of this out myself, I saw this B3D post with an excellent visual aid. Worth a look if you can't identify what's what.)
My best guess is that the eight larger, dark-and-light blocks are texture sampling and filtering hardware. Larrabee doesn't have as much dedicated hardware as most GPUs, but it does have that.
After spending some quality time with the color-coded RV770 die shot at the bottom of this page and noodling it around with David Kanter, who bears no responsibility for any of this mess, I'm betting the logic bits running along the upper and lower edges of the die, outside of the cores and such, are the memory pads. I see four repeating patterns there. Kanter notes that the four narrow columns on the interior of the chip are perpendicular to the memory pads. They are relatively evenly spaced, protrude from the edge of the chip into the center, and thus could be memory interfaces and other assorted logic that participates on the bus and talks to the I/O pads. So the magic number for memory interfaces would appear to be four.
David also suggests it might be fun to play "Where's Waldo?" with the fuses, analog nest, and any other logic we'd expect to find in a GPU. We're guessing the PCIe interface logic is along the right edge of the chip. Some other unidentified, non-repeating bits are on that side of the die.
I wasted some time trying to figure out the relationships between the cores and these other bits of hardware, but there don't appear to be any clear groupings of blocks or physical alignments between cores and texture units. More than likely, each of these resources is just a client on Larrabee's ring bus.
Happily, with no more information than that, we can tentatively pretend to start handicapping this chip's possible graphics power. We know Larrabee cores have 16-wide vector processing units, so 32 of them would yield a total of 512 operations per clock. The RV770/790 has 160 five-wide execution units for 800 ops per clock, and the GT200/b has 240 scalar units, for 240 ops/clock. Of course, that's not the whole story. The GT200/b is designed to run at higher clock frequencies than the RV770/790, and its scalar execution units should be more fully utilized, to name two of several considerations. Also, Larrabee cores are dual-issue capable, with a separate scalar execution unit.
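For anyone who wants to check the per-clock math, here's a quick sketch using only the unit counts quoted above. The 32-core Larrabee figure is my guess from the die shot, and the dictionary and function names are just for illustration.

```python
# Peak vector operations per clock, from the unit counts quoted in the text.
# Each entry is (number of execution units, lanes per unit).
chips = {
    "Larrabee (32 cores, guessed from the die shot)": (32, 16),
    "RV770/790": (160, 5),
    "GT200/b": (240, 1),
}

def ops_per_clock(units, lanes):
    """Peak operations issued per clock across all units."""
    return units * lanes

for name, (units, lanes) in chips.items():
    print(f"{name}: {ops_per_clock(units, lanes)} ops/clock")
```

This ignores clock speed, utilization, and Larrabee's separate scalar unit, so it's only the first term in the comparison.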
If I'm right about the identity of the texture and memory blocks, and if they are done in the usual way for today's GPUs (quite an assumption, I admit), then this chip should have eight texture units capable of filtering four texels per clock, for a total of 32 tpc, along with four 64-bit memory interfaces. I'd assume we're looking at GDDR5 memory, which would mean four transfers per clock over that 256-bit (aggregate) memory interface.
All of which brings us closer to some additional guessing about likely clock speeds. Today's GPUs range from around 700 to 1500MHz, if you count GT200/b shader clocks. G92 shader clocks range up to nearly 1.9GHz. But Larrabee is expected to be produced on Intel's 45nm fab process, which offers higher switching speeds than the usual 55/65nm TSMC process used by Nvidia and AMD. Penryn and Nehalem chips have made it to ~3.4GHz on Intel's 45nm tech. At the other end of the spectrum, the low-power Atom tends to run comfortably at 1.6GHz. I'd expect Larrabee to fall somewhere in between.
Where, exactly? Tough to say. I've got to think we're looking at somewhere between 1.5 and 2.5GHz. Assuming we were somehow magically right about everything, and counting on a MADD instruction to enable a peak of two FLOPS per vector lane per clock, that would mean the Larrabee chip in this die shot could line up something like this:
| | Peak pixel fill rate (Gpixels/s) | Peak bilinear texel filtering rate (Gtexels/s) | Peak bilinear FP16 texel filtering rate (Gtexels/s) | Peak memory bandwidth (GB/s) | Peak shader arithmetic, single-issue (GFLOPS) | Peak shader arithmetic, dual-issue (GFLOPS) |
|---|---|---|---|---|---|---|
| GeForce GTX 285 | 21.4 | 53.6 | 26.8 | 166.4 | 744 | 1116 |
| Radeon HD 4890 | 13.6 | 34.0 | 17.0 | 124.8 | 1360 | - |
| LRB die 1.5GHz | - | 48.0 | 24.0 | 128.0 | 1536 | 1620 |
| LRB die 2.0GHz | - | 64.0 | 32.0 | 128.0 | 2048 | 2160 |
| LRB die 2.5GHz | - | 80.0 | 40.0 | 128.0 | 2560 | 2700 |
In the numbers above, I'm betting that GDDR5 memory will make it up to 1GHz by the time this GPU is released, and I'm counting on Intel's texture filtering logic to work at half the rate on FP16 texture formats. We can't determine the pixel fill rate because Larrabee will use its x86 cores to do rasterization in software rather than dedicated hardware. I'm just working my way through Michael Abrash's write-up of the default Larrabee rasterizer now, but I don't think we can assume a certain rate per clock given how it all works.
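The texel, bandwidth, and single-issue GFLOPS figures in the Larrabee rows fall straight out of the guesses above. Here's a minimal sketch of the arithmetic; the function and parameter names are made up for illustration, and every default is one of this post's assumptions (32 cores, eight texture units at four texels per clock, four 64-bit interfaces, 1GHz GDDR5 at four transfers per clock). It doesn't attempt the dual-issue column.

```python
def larrabee_peaks(core_ghz, cores=32, vector_lanes=16, flops_per_lane=2,
                   texture_units=8, texels_per_unit=4,
                   mem_interfaces=4, bits_per_interface=64,
                   mem_ghz=1.0, transfers_per_clock=4):
    """Peak rates for the speculated chip; all defaults are guesses from the post."""
    texels = texture_units * texels_per_unit * core_ghz      # Gtexels/s
    fp16_texels = texels / 2                                 # assumed half rate
    bus_bytes = mem_interfaces * bits_per_interface / 8      # bytes per transfer
    bandwidth = bus_bytes * transfers_per_clock * mem_ghz    # GB/s
    gflops = cores * vector_lanes * flops_per_lane * core_ghz
    return {
        "texels": texels,
        "fp16_texels": fp16_texels,
        "bandwidth": bandwidth,
        "gflops_single_issue": gflops,
    }

for ghz in (1.5, 2.0, 2.5):
    print(ghz, larrabee_peaks(ghz))
```

Change any one default, such as the core count or the memory clock, and the whole row shifts, which is a fair picture of how shaky these estimates are.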
Obviously, clock speed makes a tremendous difference in this whole picture. Nonetheless, we're looking at a potentially rather powerful graphics chip, at least in terms of raw, peak arithmetic. If the tile-based approach to rasterization is as fast and efficient as purported, then the relatively pedestrian memory bandwidth quoted above might not be as much of an obstacle as it would be for a conventional GPU, either.
That's my first crack at this, anyhow. Would be cool if I turned out to be more right than wrong, but it's all guesswork for now. At the very least, one can begin to see the potential for Larrabee to compete with today's best DX10 GPUs. Whether or not it will be effective enough to contend with tomorrow's DX11 parts, well, that's another story.
No, really. I went and made a TR page on Facebook tonight. Perhaps not the best-advised of actions, but it seemed like the thing to do. Check it out, if you're into that sort of thing.
My last post on mouse and control issues in Dead Space brought in lots of advice, including encouragement to try things (like disabling vsync) that had helped some folks improve the mouse control on their systems. I did, in fact, try most of the common remedies discussed online early on in troubleshooting the problem, before coming to the conclusion that the game's mouse-and-keyboard control scheme is fundamentally broken. After spending some more time with the problem, I still believe the game's controls are broken, but I do have some insights about how they're broken—and some possible workarounds that might help.
We should begin by classifying the several Dead Space control pathologies, since they are related but somewhat separate.
- In the game's menus, mouse pointer movement is strangely imprecise and feels "floaty," as many folks have described it. Just setting up options and starting the game can be a chore.
- In the game itself, mouse control is both imprecise and also slow. Cranking up the game's mouse sensitivity slider to the highest setting may help somewhat, and using a high-DPI mouse may help, too. But you may still find yourself "rowing" the mouse across the desk multiple times in order to turn around.
- Worse yet, mouse movement becomes even slower if you are pressing a W/A/S/D movement key or if you have your character in "aim mode" in the game. The gap between regular sensitivity and movement/aim-mode sensitivity seems to vary inconsistently while playing and makes predictable movement almost impossible.
For me, the final problem was really the last straw, since I realized that none of the various troubleshooting measures recommended online were any help for this particular problem. I tried all sorts of things—disabling vsync, forcing on triple-buffering, setting the Nvidia graphics drivers to allow the GPU to render more than three frames ahead, dropping to a lower screen resolution, and using a very fast mouse—in various combinations, but none of them resolved this disconnect. In fact, they might have made it worse.
I did listen to those of you who said the game worked well for you, though, including my friend Andy, who had played through the whole game on his own PC without issue after disabling vsync and forcing on triple-buffering. Curious to see the game working well, I tried it out on his system, and sure enough, it worked much better than it did on my GPU test rig. His system is pretty nice, with a Radeon HD 4870 X2 and a 65nm Core 2 processor, but it's not quite as fast as my Core i7-965 Extreme-based GPU test rig with dual GeForce GTX 260s. Yet his PC ran the game pretty well, with very little difference between aim-mode and regular mouse sensitivity.

This discovery launched another round of troubleshooting. I came home and fired up my second test rig, with a Radeon HD 4870 X2 and a Core i7-965 Extreme, to see if using a different video card was somehow the key to the control problems. At first, the Radeon-based system seemed to be a night-and-day improvement. Controls were fluid and responsive with only vsync disabled, even without forcing on triple-buffering. However, as I played through the game, I realized that the responsiveness of the controls and the disconnect between aim-mode and regular controls both seemed to vary depending on the situation. Enter a large room and scan for baddies, and control seems great. Walk into a small bathroom on the ship or turn and face a corner, and the controls slow considerably.
Slowly but surely, a little light bulb icon pulsed to life above my head. Could it be that the problem was worse when the video card was doing less work, when it was rendering the game too quickly?
I fired up FRAPS to get a frame-rate counter and tested my theory. Right away, I noticed that frame rates in the problem areas were approaching 180 frames per second, while open areas with smoother controls ran closer to 120 FPS or less. This was at 1920x1200 resolution on the Radeon test rig. Switching up to 2560x1600 slowed frame rates a little, and control improved as a result.
To further confirm the problem, I ran FRAPS on the GeForce-based test system and gave it a shot. Sure enough, at 1680x1050, frame rates were ranging well beyond 200 FPS, and again, the controls were most pathological where the FPS counter was highest. Solving the problem with the GTX 260s was tough, however. I tried going to 2560x1600 resolution and forcing on 16X anisotropic filtering and 16X AA in the Nvidia control panel, but the AA settings didn't seem to work. Even aniso and high res didn't prevent FPS spikes into the 180-200 FPS range. Finally, I disabled SLI, effectively dropping back to a single GeForce GTX 260, and BAM: fixed. Frame rates settled into 60-90 FPS territory, and the game's controls worked effectively.
Let me explain what I mean by that. Even when using my older, lower-DPI mouse, the sensitivity and responsiveness were excellent. I could turn my character around with the flick of a wrist when needed. On top of that, the controls didn't become any slower when I pressed a movement key. The mouse sensitivity in aim mode may have been a little bit slower, but not by much. The difference wasn't so great I couldn't adjust easily, and heck, it might be as the game designers intended to enhance precision. Suddenly, the game wasn't just playable but earnestly enjoyable, and I found myself playing through level one in a single sitting.
So the worst problems, it seems, happen when your PC is actually too fast. Crazy.
Of course, the game's developers likely didn't run into this problem on any of the consoles. And yes, I was more likely than most to encounter this problem in a really nasty way, given the sort of hardware I was packing. But none of this changes the fact that the game's controls are basically broken on the PC platform. The easiest workaround is to make sure you use sufficiently demanding graphical settings to keep frame rates from getting too high. However, this fix is only partial, because frame rates will vary as you play the game. Every once in a while, you'll enter an elevator or other tight space, the FPS counter will spike, and the mouse response will slow down.
Still, this is the most effective fix I've found, and it has rescued an almost terminally ill PC port for me.
Another avenue for addressing this problem, strangely enough, is to enable vsync in the game itself. Doing this will lock frame rates at 30 FPS, according to FRAPS, and it will unfortunately make mouse control in the game's initial menus incredibly over-sensitive and "floaty." However, this fix makes the in-game mouse control work well, with enough sensitivity that you may want to turn down the slider some, even with an older mouse. And controls are very consistent with this vsync frame-rate cap in place. The obvious drawback here is the locked 30 FPS frame rate, which looks slow and not especially smooth to my eye. Still, this may be the only consistently effective fix, especially for those folks with a faster graphics card and a relatively low monitor resolution like 1680x1050.
Seems to me like EA could issue a patch that caps Dead Space's frame rate at something like 60-70 FPS and save us all a lot of grief. Hasn't happened yet, though. If you're having problems with the controls in the game, you might try one of my remedies for slowing things down. Your PC might just be too fast for its own good.
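To be clear about how simple a frame-rate cap can be, here's a rough sketch of the idea: time each frame and sleep off whatever is left of the budget. This is purely illustrative. It isn't how Dead Space or any particular engine works, and `run_capped`, `render_frame`, and the other names are hypothetical.

```python
import time

def run_capped(render_frame, max_fps=60.0, frames=5,
               clock=time.perf_counter, sleep=time.sleep):
    """Call render_frame() `frames` times, never faster than max_fps."""
    frame_budget = 1.0 / max_fps          # seconds allowed per frame
    for _ in range(frames):
        start = clock()
        render_frame()                    # stand-in for the game's real work
        remaining = frame_budget - (clock() - start)
        if remaining > 0:
            sleep(remaining)              # frame finished early: wait it out
```

Something like `run_capped(draw, max_fps=60.0)` would hold a fast machine to roughly 60 FPS, while a frame that runs over budget simply isn't delayed further. The `clock` and `sleep` parameters are only there so the loop can be tested without real waiting.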
Update: Voila! As a couple of commenters have noted, enabling vsync in the Nvidia drivers appears to lock frame rates at 60 FPS, which is pretty much a best-of-all-worlds fix. The menus and the game controls both work nicely. With vsync forced on through the Nvidia drivers, it doesn't appear to matter what the vsync setting in the game is; I'm seeing 60 FPS either way. Even SLI works perfectly.
Apparently, you end up with that 30 FPS cap, sluggishness, and imprecise mouse control in menus if you have vsync set to follow the application setting in the drivers (or perhaps forced off) and then have vsync enabled in the game.
Frustratingly, forcing on vsync in the AMD drivers appears to have no effect, so this fix won't work for half of us.
The mouse controls in Dead Space are absolutely horrible. At first, I thought it was just me, but apparently not. In the game's menus, the mouse is way too sensitive and jumpy. In the game itself, the camera control is exceptionally slow and imprecise. You've got to move the mouse halfway across your desk to get the camera to turn 90 degrees.
I tried cranking up the mouse sensitivity to the max, but it was still too slow. So I plugged in a high-DPI mouse, using hardware to compensate for an obvious software problem. The faster mouse allowed me to control the camera properly when my character is standing still, but get this: the mouse tracking rate changes when you're in "aim mode" or whenever you press a movement key on the keyboard. In both cases, the mouse slows back down again to nearly unusable speeds. Worse, perhaps, is just the disconnect between when you're moving and when you're not. One tends to make that transition constantly when playing a game like this, sweeping the camera with the mouse while maneuvering one's character in the world. Letting off of W, A, S, or D in mid-camera-sweep here, though, means a sudden transition to a much different mouse tracking rate.
Which is a complicated way of explaining that I can't frickin' move right.
Obviously, this was a game designed for game consoles and dual-stick gamepads, and everything about it just screams "bad PC port." You'd think that, when moving from a gamepad to a vastly more sensitive and precise form of analog input, a game developer would want to take full advantage of the improved control of the new platform, especially in a shooter like this one. But that doesn't seem to be the case here. Also, the game's entire interface, from the initial setup menus to the in-game controls over HUD items and weapons, is obviously designed for a gamepad with a small number of buttons. This game has so many nested menus where a single keypress would suffice, as it does in almost every other PC shooter. Ugh.
Having read through various forum threads on the matter and experimented with it myself, I'm pretty well convinced Dead Space's controls are just broken. Our best hope now seems to be a patch, but given the amount of care given to the PC port, I wouldn't hold my breath.
And don't tell me to use a gamepad. I'd rather drive a railroad spike into my temple.
In fact, I'm moving on to Fallout 3, and I may just leave Dead Space out of our next round of video card tests. PC gamers deserve better.
Update: Much love to Games Radar, whose review says:
This would have been a much more compelling horror game if not for the bizarrely sluggish mouse movement, which feels strangely slow and floaty. This is not a sensitivity issue, and occurs even in the main menu. You get used to it eventually, but it makes pinpoint-accuracy abnormally and unnecessarily difficult in a game that demands it more than the average shooter.
Amen.
I have a pair of 20" 1600x1200 LCDs on my desk, and I really like this arrangement. Both are Dell displays, and the older one of the two (a 2001FP) has begun to fail. Most of the time, it has a two-inch-wide strip of flickering, garbled junk running down the middle of the screen. The easiest path for me would be to order up another 20" LCD to replace it, but here's my worry: both current displays are true 8-bit panels with good color reproduction, a must for web work. How do I replace this bum monitor with a new 8-bit display of the same size and resolution? Has everything reasonably priced shifted to 6-bit panels?
A related question would be: can I find a wide-aspect monitor to plug in next to my other 20" display that has the same height and pixel pitch? Looks like the closest match would be a 24" 1920x1200 monitor, but that would have a slightly larger pixel pitch (0.27 mm) than my 20" 1600x1200 display (0.255 mm). Perhaps that would be a better place to hunt for an 8-bit panel, even if it's not a perfect match? Hmm.
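The pixel pitch comparison is easy enough to check from the diagonal size and resolution. A minimal sketch, assuming square pixels and assuming the "20-inch" panel really measures 20.1 inches diagonally (that second figure is an assumption, not something stated above); the function name is just for illustration.

```python
import math

MM_PER_INCH = 25.4

def pixel_pitch_mm(diagonal_in, width_px, height_px):
    """Pixel pitch in mm for a panel with square pixels."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_in * MM_PER_INCH / diagonal_px

pitch_20 = pixel_pitch_mm(20.1, 1600, 1200)   # assumed 20.1" diagonal
pitch_24 = pixel_pitch_mm(24.0, 1920, 1200)

print(f'20" 1600x1200: {pitch_20:.3f} mm pitch, {pitch_20 * 1200:.0f} mm tall')
print(f'24" 1920x1200: {pitch_24:.3f} mm pitch, {pitch_24 * 1200:.0f} mm tall')
```

Both panels are 1200 pixels tall, so the larger pitch means the 24" screen's visible area stands a bit taller than the 20" one, and text will look slightly larger on it.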
In the wake of my post about data recovery services, an old buddy of mine from college, Dave Kirby, wrote me with a question about how to avoid needing to use such a service. I think it's a timely question, since backup strategies for home users have become more difficult than ever. He frames the question like so:
Problem: Joe six-pack has accumulated several computers and a number of large external drives, all of them are nearing capacity and all combined total ~1.5TB-2TB. After a co-worker or friend suffers a catastrophic drive failure of their personal data, Joe six-pack realizes his data is not backed up (except maybe a few critical things like financial data). He starts to examine the situation and discovers:
1) A good portion of his data COULD be recreated (i.e. MP3s and DVD rips of his physical collections), but he gets a class A migraine just thinking about the time involved.
2) Some of his data is not backed up at all, is completely irreplaceable and has high emotional attachment (i.e. personal digital photos over the last 5+ years).
3) Because the vast majority of his data is media centric, it is not generally compressible.
4) His data is heavily "fragmented" across multiple drives and not terribly well organized.
5) He doesn't have the knowledge or resources to purchase/build and (more importantly) maintain anything like a RAID or SAMBA server.
What is a good backup strategy for this type of consumer?
So how does the average guy protect his data? More relevantly, how do our decidedly above-average and unusually attractive readers handle this challenge? Me, I lean heavily on a combination of prayer, animal sacrifice, and RAID 1, supported by occasional backups of critical data to DVDs. I suspect there are better options.
I was out at the store yesterday and saw this puppy. Bought one on the spot.

It's a Logitech V450 Nano cordless mouse for laptops. Check out that teensy little USB receiver! Here's how it looks in my laptop:
The "nano receiver" is small enough to stick into the port and just leave there, and it definitely doesn't get in the way of mousing around right next to the machine, either. If you do decide to remove the receiver for travel, it slots into a storage compartment next to the batteries:
Putting the receiver into that slot automatically turns the mouse off to save battery life.
The mouse itself is great, too—a near-perfect copy of the hump-backed shape of the seminal Microsoft Optical Notebook Mouse, the first portable mouse whose shape didn't cramp my hands. I've worn out my MS mouse through years of use, and this is a perfect replacement that practically erases my concern over the fact my laptop doesn't have Bluetooth. Smart.
