The Tera-scale processor is an array of 80 “tiles,” each containing a processing engine made up of a five-port router, two independent, fully pipelined single-precision floating-point multiply-accumulator (FPMAC) units, 3KB of single-cycle instruction memory, and 2KB of data memory. The two FPMACs are based on a Very Long Instruction Word-type design, much like Intel’s Itanium. They have nine-stage pipelines and can provide an aggregate 16 gigaFLOPS of performance. Thanks to the five-port router, each tile can communicate with other tiles at up to 80GB/s.
The resulting chip contains 100 million transistors, which Intel has packed into a die area of 275mm². (For reference, Intel’s Core 2 Duo has 291 million transistors and a 143mm² die size.) At 3.13GHz with a core voltage of 1V, Intel says the tera-scale chip delivers 1.0 teraFLOPS with typical power consumption of just 98W. The design can even scale down to 310 gigaFLOPS of total performance at 11W of power draw. This low power consumption is aided by a power management system that not only allows per-tile power management based on workload, but also splits each 3mm² tile into 21 dynamically controlled sleep regions.
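To see how the headline numbers fit together, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, plus one assumption that Intel's numbers do not spell out here: that each fully pipelined FPMAC retires one multiply-accumulate per clock cycle, counted as two floating-point operations. The function and variable names are ours, purely for illustration.

```python
# Figures quoted in the article.
TILES = 80
FPMACS_PER_TILE = 2
CLOCK_HZ = 3.13e9

# Assumption (not stated in the article): each fully pipelined FPMAC retires
# one multiply-accumulate per cycle, counted as 2 floating-point operations.
FLOPS_PER_FPMAC_PER_CYCLE = 2


def peak_flops(tiles, fpmacs_per_tile, flops_per_cycle, clock_hz):
    """Theoretical peak throughput in FLOPS for the whole tile array."""
    return tiles * fpmacs_per_tile * flops_per_cycle * clock_hz


def gflops_per_watt(flops, watts):
    """Energy efficiency in gigaFLOPS per watt."""
    return flops / 1e9 / watts


peak = peak_flops(TILES, FPMACS_PER_TILE, FLOPS_PER_FPMAC_PER_CYCLE, CLOCK_HZ)

# Efficiency at the two operating points Intel quotes.
full_speed = gflops_per_watt(1.0e12, 98)  # 1.0 teraFLOPS at 98W
low_power = gflops_per_watt(310e9, 11)    # 310 gigaFLOPS at 11W

print(f"Peak at 3.13GHz: {peak / 1e12:.2f} teraFLOPS")
print(f"Full speed:      {full_speed:.1f} gigaFLOPS per watt")
print(f"Low power:       {low_power:.1f} gigaFLOPS per watt")
```

Under that assumption, 80 tiles at 3.13GHz work out to roughly 1.0 teraFLOPS, which matches Intel's claim. The efficiency figures also show why the low-power mode is interesting: about 10.2 gigaFLOPS per watt at full speed versus about 28.2 gigaFLOPS per watt at the 11W operating point.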
Don’t expect this chip to ever hit stores, though. Last year, Intel CEO Paul Otellini said he believed the tera-scale chip would become available as a production product in the future. Intel has now changed its tune and says it “has no plans to bring this exact chip designed with floating point cores to market.” However, the company thinks its work with the chip is instrumental in investigating new types of processor or core functions, new interconnects, and how best to optimize software for multi-core designs. In its upcoming research, Intel says it will attempt to stack 3D memory onto the chip and even develop fancier prototypes with x86 cores.