Niagara
Purpose and Target Market
Throughput Computing
Keywords: thread-level parallelism (TLP)
Target: commercial server applications
Architecture
Homogeneous? Heterogeneous?
Basic Processing Element(s)
Styles of PE in the case of a heterogeneous architecture? How many of each style?
Interconnect and Topologies
Buses? Point to point crossbar? Network on chip?
The crossbar interconnect provides the communication link between the SPARC pipelines, the L2 cache banks, and other shared resources on the CPU.
It provides more than 200GB/s of bandwidth.
Memory Structure and Hierarchy
Shared memory? Distributed memory? Caches? Scratch Spaces?
L1 Caches
•Level 1 caches are shared among 4 threads
•I Cache: 16kB, 4-way set associative, 32B line size
•D Cache: 8kB, 4-way set associative, 16B line size
•Store buffers: 8 entry buffer per thread
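The L1 figures above fix the number of sets in each cache, since sets = capacity / (ways × line size). The following is a minimal sketch of that arithmetic, assuming 1kB = 1024 bytes; the function and variable names are illustrative and do not come from any Niagara documentation.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Return the number of sets: capacity / (associativity * line size)."""
    return size_bytes // (ways * line_bytes)


# L1 parameters as listed above (assumption: 1kB = 1024 bytes).
icache_sets = cache_sets(16 * 1024, ways=4, line_bytes=32)
dcache_sets = cache_sets(8 * 1024, ways=4, line_bytes=16)

print("I Cache sets:", icache_sets)
print("D Cache sets:", dcache_sets)
```

Under these assumptions both caches come out with the same number of sets, because the D Cache halves both the capacity and the line size relative to the I Cache.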
L2 Cache
•Designed for high bandwidth and low power
•4-way banked 3MB L2 cache
•12-way set associative to handle 32 threads
•Data is 64B interleaved across banks
•64B block size
•Multiple outstanding miss handling
•Writeback protocol
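One way to read "64B interleaved across banks" is that consecutive 64-byte lines map to consecutive banks. The sketch below illustrates that reading and derives the sets per bank from the listed figures. It assumes the bank index comes from the address bits immediately above the 64-byte line offset, and that the 3MB capacity is split evenly across the four banks; neither detail is stated in the text above, and all names are hypothetical.

```python
LINE_BYTES = 64               # L2 block size and interleave granularity
NUM_BANKS = 4                 # 4-way banked
WAYS = 12                     # 12-way set associative
L2_BYTES = 3 * 1024 * 1024    # 3MB total (assumption: 1MB = 2**20 bytes)


def l2_bank(addr: int) -> int:
    """Bank holding the line that contains addr (assumed simple modulo interleave)."""
    return (addr // LINE_BYTES) % NUM_BANKS


def sets_per_bank() -> int:
    """Sets in one bank, assuming capacity is divided evenly across banks."""
    bank_bytes = L2_BYTES // NUM_BANKS
    return bank_bytes // (WAYS * LINE_BYTES)


# Four consecutive lines land in four different banks, then wrap around.
for addr in (0, 64, 128, 192, 256):
    print(hex(addr), "-> bank", l2_bank(addr))
print("sets per bank:", sets_per_bank())
```

With this mapping, a sequential access stream spreads across all four banks, which is consistent with the stated high-bandwidth design goal.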
DRAM Access
•Multiple DDR2 DRAM channels
•Supports memory sizes of up to 128GB
•Peak bandwidth in excess of 20GB/s
Special Purpose Hardware Units
Vector units? Crypto units?
I/O and Peripherals
Memory controller? DMA engines? Ethernet Controller? Hypertransport?
Physical Properties
Process Technology
Whose fab? Which year? Layers of metal? Fast or slow process? Multi-Vt process?
Die Size
Total die size? relative size of logic vs arrays? relative sizes of cores and others?
Power Dissipation
Average or peak Watts? Joules per SPEC?
Usage Model
Program Model
Assembly? C/C++? Domain specific design language?
Software Development Environment
System design tool stack? Availability of layers in this tool stack?
Recent News and Publications
Product announcements? Roadmaps? Industry comments? Any other references?
