Multiprocessor computing platforms have proliferated in the scientific and embedded computing domains, as well as in general-purpose computing. This page surveys a set of recent multiprocessor platforms and also provides some tools for analyzing a multiprocessor platform.


Recent Chip Multi Processors

| Processor Name | Company | Target Market | Year Released | Cores | Thread Groups | Threads | PE Interconnect | Memory | Speed: FLOPs, freq & op/cyc | Process | Die Size | Power | Programming Model | Estimated Price (unit/qty) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Niagara | Sun | GP (Web, TP) | 2005 | 8x UltraSPARC | 8 | 32 | Full Crossbar to L2$ | 3MB shared L2 | | 90nm 9 Layer Cu | 379 mm^2 | 72W | Shared Memory Multi Threading | |
| Niagara2 | Sun | Servers | Second half 2007 | 8x UltraSPARC | 16 | 64 | Full Crossbar to L2$ | 4MB shared L2 | | 65nm | 342 mm^2 | 84W | Shared Memory Multi Threading | |
| Rock | Sun | High-end Servers | 2009 | 4x UltraSPARC cluster | 16 | 32 | | 2MB discrete L2s | | 65nm | | | Shared Memory Multi Threading + Transactional Memory | |
| IBM Power7 | IBM | Servers | 2010 | 4-8 Power7 | 4-8 | 16-32 | | | | | | | Shared Memory Multi Threading | |
| Barcelona | AMD | Servers, Desktop | Q3 2007 | 4x NG-Opteron | 4 | 4 | Full Crossbar on-chip | 512KB private L2$, 2MB shared "victim L3" | | 65nm | 291 mm^2 | 95W, 65W | Traditional SMP | 2XXX: $209-389, 8XXX: $698-1019 |
| Nehalem | Intel | Servers, Desktop | Q4 2008 | x86 | 2,4,8 | 4,8,16 | | 4core: 8MB shared L3, 8core: 24MB shared L3 | 2.133GHz | 45nm | 270 mm^2 (quad core) | 60-130W | Shared Memory Multi Threading | |
| ClearSpeed CSX600 | ClearSpeed Technology | HPC | 2006 | mono + 96 poly | 1 + 96 | 8 + 96 | Linear swazzle path | 6KB LS (poly) | 33 GFLOPS | 130nm | | 10W | SIMD array | |
| SiCortex | SiCortex, Inc. | HPC | 2006 | 6x MIPS64 @500MHz | 6 | 6 | L2? | 1536KB Shared L2 | | | | 10W | Traditional SMP | |
| Xenon | IBM / Microsoft | Xbox 360 | 2005 | 3x PowerPC w/VMX128 | 3 | 6 | | 1MB L2 | | 90nm | 168 mm^2 | | Traditional SMP | |
| Cell | Sony / Toshiba / IBM | Game Consoles, HPC | 2006 | PowerPC + 8xSPE(SIMD) | 1+8 | 2+8 | 4 Rings | 512KB L2, 8x256KB LS | GFLOP: dp:1.8, sp:25 | 90nm | 220 mm^2 | ~100W | Shared DRAM, private SRAM | |
| CRS-1 Metro | Tensilica Cores (IBM/Cisco design) | Networking | 2006 | 188 + 4 spares | | | | | | | | | | |
| Vega 2 | Azul | Business; Java Acceleration | not released; 2007 | 48 | 48 | | | | | 90nm | | | | |
| Rapport KC256 | Rapport, Inc. | Embedded Systems | 2006 | 256 | 256 | 256 | | | | | | | | |
| Rapport KC1025 | IBM / Rapport, Inc. | Embedded Systems | 2008 | PowerPC + 1024x8-bit | 1 + 1024 | 1 + 1024 | | | | | | | | |
| picoChip PC205 | picoChip | 3G/WiMAX base stations | 2006 | ARM + 248x 16-bit 3-way LIW | 1 + 248 | 1 + 248 | TDM grid | 128KB + 128KB + 700KB | | 90nm | | 5W ? | VHDL for connectivity and C for computation | |
| Cavium CN38XX / CN58XX | Cavium | Networking, General purpose | 2006 | 16x MIPS | 16 | 16 | | L2 cache | | | | 14W - 40W | Traditional SMP | |
| TILE64 | Tilera | Networking, Digital Video | 2007 | 64x 3-way MIPS-derived VLIW | 64 | 64 | 2D-Grid | L2s collectively act like a 4MB L3 | | 90nm | | 19W | Traditional SMP | $435 |
| Quadro FX 5600 | NVIDIA | GPU | March 2007 | 128 streaming processors | | | | | | 90nm | | 171W | | |
| XMOS XS1-G4 | XMOS Ltd. | | | 4 | | 32 threads | | 256 KByte | | | | | | $12 [1] |
| gCORE | Boston Circuits, Inc. | | | 16 | | | "Grid on Chip architecture" [2] | | | | | | | |
| Tesla TC1060 | NVIDIA | GPGPU | Fall 2008 | 240 streaming processors | | | | | 1.3 GHz | | | 160W | CUDA | |
| SEAforth 40C18 | IntellaSys | Embedded and Wireless | 2008 | 40 stack processors | | | | | 700 MHz | | | less than 360 mW | Forth | |
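
One way to compare entries in the table above is to derive simple ratios from its columns. The sketch below is illustrative only: it takes the Threads, Die Size, and Power values for four rows where all three are listed and computes power density (W/mm^2) and hardware threads per watt. The names `Chip`, `power_density`, `threads_per_watt`, and `summarize` are hypothetical helpers introduced here, not part of any tool listed on this page. It also assumes the 95W figure for Barcelona (the table lists "95W, 65W") and treats Cell's "~100W" as exactly 100W.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    threads: int     # hardware threads, from the "Threads" column
    die_mm2: float   # die size in mm^2, from the "Die Size" column
    power_w: float   # power in watts, from the "Power" column


# Values copied from the table; only rows listing all three fields are used.
CHIPS = [
    Chip("Niagara", 32, 379.0, 72.0),
    Chip("Niagara2", 64, 342.0, 84.0),
    Chip("Barcelona", 4, 291.0, 95.0),  # assumption: the 95W part of "95W, 65W"
    Chip("Cell", 10, 220.0, 100.0),     # "2+8" threads; "~100W" taken as 100
]


def power_density(chip: Chip) -> float:
    """Watts dissipated per square millimetre of die."""
    return chip.power_w / chip.die_mm2


def threads_per_watt(chip: Chip) -> float:
    """Hardware threads supported per watt of listed power."""
    return chip.threads / chip.power_w


def summarize(chips):
    """Return (name, W/mm^2, threads/W) tuples, best threads/W first."""
    rows = [(c.name, power_density(c), threads_per_watt(c)) for c in chips]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, density, tpw in summarize(CHIPS):
        print(f"{name:<10} {density:6.3f} W/mm^2  {tpw:6.3f} threads/W")
```

Under these assumptions the heavily multithreaded Niagara2 ranks first on threads per watt. The ratios say nothing about per-thread performance, so they are best read as a rough starting point for comparison rather than a benchmark.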


Domain Specific Multiprocessors

  • IBM/Microsoft Xenon, 3 PEs. An IBM-designed processor customized for Microsoft, Xenon is the CPU of the Xbox 360, which shipped to consumers in November 2005.
  • Sony/Toshiba/IBM Cell Processor, 9 PEs. A joint project of Sony, Toshiba, and IBM, Cell is envisioned primarily as a computing engine for media applications. It is the centerpiece of Sony's PlayStation 3, which shipped in 2006.
  • ClearSpeed CSX600, 64-96 PEs. ClearSpeed has developed a highly parallel architecture for high-performance computing workloads.
  • Cisco CRS-1 Metro, 192 PEs. A massively many-core custom network processor for Cisco's highest-end routers.
  • Azul, 24 PEs.
  • Rapport KC256, 256 PEs. Parallel processing using 256 Kilocore (TM) processing elements.

Network Processors

General Purpose Multiprocessors

Analysis Tools

  • Multicore Performance Estimator
  • Cache Calculator
  • BACPAC, Berkeley Advanced Chip Performance Calculator
  • RAMP, Research Accelerator for Multiple Processors


