Niagara

Purpose and Target Market

Throughput Computing

Keywords: Thread-Level Parallelism (TLP). Target: commercial server applications.

Architecture

Homogeneous? Heterogeneous?

Basic Processing Element(s)

Styles of PE in case of heterogeneous architecture? How many of each style?

Interconnect and Topologies

Buses? Point to point crossbar? Network on chip?

The crossbar interconnect provides the communication link between the SPARC pipelines, the L2 cache banks, and other shared resources on the CPU.

It provides more than 200 GB/s of bandwidth.

Memory Structure and Hierarchy

Shared memory? Distributed memory? Caches? Scratch Spaces?

L1 Caches

•Each core's Level 1 caches are shared among its 4 threads

•I-cache: 16 KB, 4-way set associative, 32 B line size

•D-cache: 8 KB, 4-way set associative, 16 B line size

•Store buffers: one 8-entry buffer per thread
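
The L1 parameters above fix the number of sets in each cache, since sets = capacity / (associativity × line size). The following is a minimal sketch of that arithmetic; the function name `cache_sets` is a hypothetical helper introduced for illustration, and 1 KB is assumed to mean 1024 bytes.

```python
def cache_sets(size_bytes, ways, line_bytes):
    """Return the number of sets in a set-associative cache.

    sets = capacity / (associativity * line size)
    """
    bytes_per_set = ways * line_bytes
    if size_bytes % bytes_per_set != 0:
        raise ValueError("capacity is not a whole number of sets")
    return size_bytes // bytes_per_set


# Figures taken from the list above; 1 KB = 1024 bytes is an assumption.
icache_sets = cache_sets(16 * 1024, ways=4, line_bytes=32)
dcache_sets = cache_sets(8 * 1024, ways=4, line_bytes=16)

print("I-cache sets:", icache_sets)
print("D-cache sets:", dcache_sets)
```

Under these assumptions both caches come out at 128 sets: the D-cache has half the capacity of the I-cache but also half the line size.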

L2 Cache

•Designed for high bandwidth and low power

•4-way banked 3 MB L2 cache

•12-way set associative to handle 32 threads

•Data is interleaved across banks at 64 B granularity

•64 B block size

•Handles multiple outstanding misses

•Write-back protocol
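
To illustrate what 64 B interleaving across 4 banks means, the sketch below splits a physical address into offset, bank, set, and tag fields. The field order (bank bits directly above the line offset, set index above the bank bits) is an assumption made for illustration, not a documented description of the hardware's indexing; `l2_decompose` and the constant names are hypothetical, and 1 MB is assumed to mean 2^20 bytes.

```python
LINE_BYTES = 64                 # block size from the list above
NUM_BANKS = 4                   # 4-way banked
WAYS = 12                       # 12-way set associative
L2_BYTES = 3 * 1024 * 1024      # 3 MB total, assuming 1 MB = 2**20 bytes

# Each bank holds a quarter of the cache; sets = capacity / (ways * line size).
SETS_PER_BANK = (L2_BYTES // NUM_BANKS) // (WAYS * LINE_BYTES)


def l2_decompose(addr):
    """Split an address into L2 fields (assumed layout, see lead-in)."""
    offset = addr % LINE_BYTES              # byte within the 64 B line
    line = addr // LINE_BYTES               # global line number
    bank = line % NUM_BANKS                 # consecutive lines rotate banks
    set_index = (line // NUM_BANKS) % SETS_PER_BANK
    tag = line // (NUM_BANKS * SETS_PER_BANK)
    return {"tag": tag, "set": set_index, "bank": bank, "offset": offset}


# Consecutive 64 B lines land in consecutive banks.
for addr in (0x000, 0x040, 0x080, 0x0C0, 0x100):
    print(hex(addr), l2_decompose(addr))
```

With this layout, a sequential stream of 64 B lines is spread evenly over the four banks, which is the property that lets the banks serve requests from many threads in parallel.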

DRAM Access


•Multiple DDR2 DRAM channels

•Supports memory sizes of up to 128 GB

•Peak bandwidth in excess of 20 GB/s

Special Purpose Hardware Units

Vector units? Crypto units?

I/O and Peripherals

Memory controller? DMA engines? Ethernet Controller? Hypertransport?


Physical Properties

Process Technology

Whose fab? Which year? Layers of metal? Fast or slow process? Multi-Vt process?

Die Size

Total die size? Relative size of logic vs. arrays? Relative sizes of cores and others?

Power Dissipation

Average or peak Watts? Joules per SPEC?


Usage Model

Program Model

Assembly? C/C++? Domain specific design language?

Software Development Environment

System design tool stack? Availability of layers in this tool stack?


Recent News and Publications

Product announcements? Roadmaps? Industry comments? Any other references?
