List of ARM microarchitectures

From Infogalactic: the planetary knowledge core
(Redirected from X-Gene (microarchitecture))
Jump to: navigation, search

<templatestyles src="Module:Hatnote/styles.css"></templatestyles>

This is a list of microarchitectures based on the ARM family of instruction sets designed by ARM Holdings and 3rd parties, sorted by version of the ARM instruction set, release and name. ARM provides a summary of the numerous vendors who implement ARM cores in their design.[1] Keil also provides a somewhat newer summary of vendors of ARM based processors.[2] ARM further provides a chart[3] displaying an overview of the ARM processor lineup with performance and functionality versus capabilities for the more recent ARM core families.

ARM cores

Designed by ARM

ARM family ARM architecture ARM core Feature Cache (I / D), MMU Typical MIPS @ MHz
ARM1 ARMv1 ARM1 First implementation None
ARM2 ARMv2 ARM2 ARMv2 added the MUL (multiply) instruction None 4 MIPS @ 8 MHz
0.33 DMIPS/MHz
ARMv2a ARM250 Integrated MEMC (MMU), graphics and I/O processor. ARMv2a added the SWP and SWPB (swap) instructions None, MEMC1a 7 MIPS @ 12 MHz
ARM3 ARMv2a ARM3 First integrated memory cache KB unified 12 MIPS @ 25 MHz
0.50 DMIPS/MHz
ARM6 ARMv3 ARM60 ARMv3 first to support 32-bit memory address space (previously 26-bit) None 10 MIPS @ 12 MHz
ARM600 As ARM60, cache and coprocessor bus (for FPA10 floating-point unit) 4 KB unified 28 MIPS @ 33 MHz
ARM610 As ARM60, cache, no coprocessor bus 4 KB unified 17 MIPS @ 20 MHz
0.65 DMIPS/MHz
ARM7 ARMv3 ARM700 8 KB unified 40 MHz
ARM710 As ARM700, no coprocessor bus 8 KB unified 40 MHz
ARM710a As ARM710 8 KB unified 40 MHz
0.68 DMIPS/MHz
ARM7T ARMv4T ARM7TDMI(-S) 3-stage pipeline, Thumb, ARMv4 first to drop legacy ARM 26-bit addressing None 15 MIPS @ 16.8 MHz
63 DMIPS @ 70 MHz
ARM710T As ARM7TDMI, cache 8 KB unified, MMU 36 MIPS @ 40 MHz
ARM720T As ARM7TDMI, cache 8 KB unified, MMU with FCSE (Fast Context Switch Extension) 60 MIPS @ 59.8 MHz
ARM740T As ARM7TDMI, cache MPU
ARM7EJ ARMv5TEJ ARM7EJ-S 5-stage pipeline, Thumb, Jazelle DBX, Enhanced DSP instructions None
ARM8 ARMv4 ARM810[4][5] 5-stage pipeline, static branch prediction, double-bandwidth memory 8 KB unified, MMU 84 MIPS @ 72 MHz
1.16 DMIPS/MHz
ARM9T ARMv4T ARM9TDMI 5-stage pipeline, Thumb None
ARM920T As ARM9TDMI, cache 16 KB / 16 KB, MMU with FCSE (Fast Context Switch Extension)[6] 200 MIPS @ 180 MHz
ARM922T As ARM9TDMI, caches 8 KB / 8 KB, MMU
ARM940T As ARM9TDMI, caches 4 KB / 4 KB, MPU
ARM9E ARMv5TE ARM946E-S Thumb, Enhanced DSP instructions, caches Variable, tightly coupled memories, MPU
ARM966E-S Thumb, Enhanced DSP instructions No cache, TCMs
ARM968E-S As ARM966E-S No cache, TCMs
ARMv5TEJ ARM926EJ-S Thumb, Jazelle DBX, Enhanced DSP instructions Variable, TCMs, MMU 220 MIPS @ 200 MHz
ARMv5TE ARM996HS Clockless processor, as ARM966E-S No caches, TCMs, MPU
ARM10E ARMv5TE ARM1020E 6-stage pipeline, Thumb, Enhanced DSP instructions, (VFP) 32 KB / 32 KB, MMU
ARM1022E As ARM1020E 16 KB / 16 KB, MMU
ARMv5TEJ ARM1026EJ-S Thumb, Jazelle DBX, Enhanced DSP instructions, (VFP) Variable, MMU or MPU
ARM11 ARMv6 ARM1136J(F)-S[7] 8-stage pipeline, SIMD, Thumb, Jazelle DBX, (VFP), Enhanced DSP instructions Variable, MMU 740 @ 532–665 MHz (i.MX31 SoC), 400–528 MHz
ARMv6T2 ARM1156T2(F)-S 9-stage pipeline,[8] SIMD, Thumb-2, (VFP), Enhanced DSP instructions Variable, MPU
ARMv6Z ARM1176JZ(F)-S As ARM1136EJ(F)-S Variable, MMU + TrustZone 965 DMIPS @ 772 MHz, up to 2,600 DMIPS with four processors[9]
ARMv6K ARM11MPCore As ARM1136EJ(F)-S, 1–4 core SMP Variable, MMU
SecurCore ARMv6-M SC000 0.9 DMIPS/MHz
ARMv4T SC100
ARMv7-M SC300 1.25 DMIPS/MHz
Cortex-M ARMv6-M Cortex-M0[10] Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory Optional cache, no TCM, no MPU 0.84 DMIPS/MHz
Cortex-M0+[12] Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), optional system timer, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 0.93 DMIPS/MHz
Cortex-M1[13] Microcontroller profile, most Thumb + some Thumb-2,[11] hardware multiply instruction (optional small), OS option adds SVC / banked stack pointer, optional system timer, no bit-banding memory Optional cache, 0-1024 KB I-TCM, 0-1024 KB D-TCM, no MPU 136 DMIPS @ 170 MHz,[14] (0.8 DMIPS/MHz FPGA-dependent)[15]
ARMv7-M Cortex-M3[16] Microcontroller profile, Thumb / Thumb-2, hardware multiply and divide instructions, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 1.25 DMIPS/MHz
ARMv7E-M Cortex-M4[17] Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv4-SP single-precision FPU, hardware multiply and divide instructions, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 1.25 DMIPS/MHz (1.27 w/FPU)
ARMv7E-M Cortex-M7[18] Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv5 single and double precision FPU, hardware multiply and divide instructions 0-64 KB I-cache, 0-64 KB D-cache, 0-16 MB I-TCM, 0-16 MB D-TCM (all these w/optional ECC), optional MPU with 8 or 16 regions 2.14 DMIPS/MHz
Cortex-R ARMv7-R Cortex-R4[19] Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lockstep with fault logic 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt MPU with 8/12 regions
Cortex-R5[20] Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lock-step with fault logic / optional as 2 independent cores, low-latency peripheral port (LLPP), accelerator coherency port (ACP)[21] 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt MPU with 12/16 regions
Cortex-R7[22] Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 11-stage pipeline dual-core running lock-step with fault logic / out-of-order execution / dynamic register renaming / optional as 2 independent cores, low-latency peripheral port (LLPP), ACP[21] 0–64 KB / 0–64 KB, ? of 0–128 KB TCM, opt MPU with 16 regions
Cortex-R8 TBD TBD
Cortex-A
(32-bit)
ARMv7-A Cortex-A5[23] Application profile, ARM / Thumb / Thumb-2 / DSP / SIMD / Optional VFPv4-D16 FPU / Optional NEON / Jazelle RCT and DBX, 1–4 cores / optional MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 4-64 KB / 4-64 KB L1, MMU + TrustZone 1.57 DMIPS/MHz per core
Cortex-A7[24] Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4-D16 FPU / NEON / Jazelle RCT and DBX / Hardware virtualization, in-order execution, superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP, architecture and feature set are identical to A15, 8-10 stage pipeline, low-power design[25] 8-64 KB / 8-64 KB L1, 0–1 MB L2, MMU + TrustZone 1.9 DMIPS/MHz per core
Cortex-A8[26] Application profile, ARM / Thumb / Thumb-2 / VFPv3 FPU / NEON / Jazelle RCT and DAC, 13-stage superscalar pipeline 16-32 KB / 16–32 KB L1, 0–1 MB L2 opt ECC, MMU + TrustZone Up to 2000 (2.0 DMIPS/MHz in speed from 600 MHz to greater than 1 GHz)
Cortex-A9[27] Application profile, ARM / Thumb / Thumb-2 / DSP / Optional VFPv3 FPU / Optional NEON / Jazelle RCT and DBX, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 16–64 KB / 16–64 KB L1, 0–8 MB L2 opt parity, MMU + TrustZone 2.5 DMIPS/MHz per core, 10,000 DMIPS @ 2 GHz on Performance Optimized TSMC 40G (dual-core)
Cortex-A12[28] Application profile, ARM / Thumb-2 / DSP / VFPv4 FPU / NEON / Hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 32-64 KB / 32 KB L1, 256 KB-8 MB L2 3.0 DMIPS/MHz per core
Cortex-A15[29] Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP, 15-24 stage pipeline[25] 32 KB w/parity / 32 KB w/ECC L1, 0–4 MB L2, L2 has ECC, MMU + TrustZone At least 3.5 DMIPS/MHz per core (up to 4.01 DMIPS/MHz depending on implementation)[30]
Cortex-A17 Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP MMU + TrustZone ?
ARMv8-A Cortex-A32[31] TBD / see ARM's web page with product information TBD
Cortex-A
(64-bit)
ARMv8-A Cortex-A35[32] Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, dual issue, in-order pipeline 8-64 KB w/parity / 8-64 KB w/ECC L1 per core, 128 KB-1 MB L2 shared, 40-bit physical addresses
Cortex-A53[33] Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, dual issue, in-order pipeline 8-64 KB w/parity / 8-64 KB w/ECC L1 per core, 128 KB-2 MB L2 shared, 40-bit physical addresses 2.3 DMIPS/MHz
Cortex-A57[34] Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, multi-issue, deeply out-of-order pipeline 48 KB w/DED parity / 32 KB w/ECC L1 per core, 512 KB-2 MB L2 shared, 44-bit physical addresses At least 4.1 DMIPS/MHz per core (up to 4.76 DMIPS/MHz depending on implementation)
Cortex-A72[35] Application profile, AArch32 and AArch64, 1-4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, multi-issue, deeply out-of-order pipeline 48 KB w/DED parity / 32 KB w/ECC L1 per core, 512 KB-4 MB L2 shared, 44-bit physical addresses At least 6.3 DMIPS/MHz per core (up to 7.35 DMIPS/MHz depending on implementation)
Cortex-A73 Application profile, AArch32 and AArch64 At least 8.0 DMIPS/MHz per core
ARM family ARM architecture ARM core Feature Cache (I / D), MMU Typical MIPS @ MHz

Designed by third parties

These cores implement the ARM instruction set, and were developed independently by companies with an architectural license from ARM.

Family Instruction set Microarchitecture Feature Cache (I / D), MMU Typical MIPS @ MHz
StrongARM
(Digital)
ARMv4 SA-110 5-stage pipeline 16 KB / 16 KB, MMU 100–206 MHz
1.0 DMIPS/MHz
SA-1100 derivative of the SA-110 16 KB / 8 KB, MMU
Faraday[36]
(Faraday Technology)
ARMv4 FA510 6-stage pipeline Up to 32 KB / 32 KB cache, MPU 1.26 DMIPS/MHz
100–200 MHz
FA526 Up to 32 KB / 32 KB cache, MMU 1.26 MIPS/MHz
166-300 MHz
FA626 8-stage pipeline 32 KB / 32 KB cache, MMU 1.35 DMIPS/MHz
500 MHz
ARMv5TE FA606TE 5-stage pipeline No cache, no MMU 1.22 DMIPS/MHz
200 MHz
FA626TE 8-stage pipeline 32 KB / 32 KB cache, MMU 1.43 MIPS/MHz
800 MHz
FMP626TE 8-stage pipeline, SMP 1.43 MIPS/MHz
500 MHz
FA726TE 13 stage pipeline, dual issue 2.4 DMIPS/MHz
1000 MHz
XScale
(Intel / Marvell)
ARMv5TE XScale 7-stage pipeline, Thumb, Enhanced DSP instructions 32 KB / 32 KB, MMU 133–400 MHz
Bulverde Wireless MMX, Wireless SpeedStep added 32 KB / 32 KB, MMU 312–624 MHz
Monahans[37] Wireless MMX2 added 32 KB / 32 KB (L1), optional L2 cache up to 512 KB, MMU Up to 1.25 GHz
Sheeva
(Marvell)
ARMv5 Feroceon 5-8 stage pipeline, single-issue 16 KB / 16 KB, MMU 600–2000 MHz
Jolteon 5-8 stage pipeline, dual-issue 32 KB / 32 KB, MMU
PJ1 (Mohawk) 5-8 stage pipeline, single-issue, Wireless MMX2 32 KB / 32 KB, MMU 1.46 DMIPS/MHz
1.06 GHz
ARMv6 / ARMv7-A PJ4 6-9 stage pipeline, dual-issue, Wireless MMX2, SMP 32 KB / 32 KB, MMU 2.41 DMIPS/MHz
1.6 GHz
Snapdragon
(Qualcomm)
ARMv7-A Scorpion[38] 1 or 2 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv3 FPU / NEON (128-bit wide) 256 KB L2 per core 2.1 DMIPS/MHz per core
Krait[38] 1, 2, or 4 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv4 FPU / NEON (128-bit wide) 4 KB / 4 KB L0, 16 KB / 16 KB L1, 512 KB L2 per core 3.3 DMIPS/MHz per core
ARMv8-A Kryo[39] 4 cores.  ? Up to 2.2 GHz

(6.3 DMIPS/MHz)

Ax
(Apple)
ARMv7-A Swift[40] 2 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv4 FPU / NEON L1: 32 KB / 32 KB, L2: 1 MB 3.5 DMIPS/MHz per core
ARMv8-A Cyclone[41] 2 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv4 FPU / NEON / TrustZone / AArch64 L1: 64 KB / 64 KB, L2: 1 MB, L3: 4 MB 1.3 - 1.4 GHz
ARMv8-A Typhoon[41][42] 2 or 3 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv4 FPU / NEON / TrustZone / AArch64 L1: 64 KB / 64 KB, L2: 1 or 2 MB, L3: 4 MB 1.4 - 1.5 GHz
ARMv8-A Twister[43] 2 cores. ARM / Thumb / Thumb-2 / DSP / SIMD / VFPv4 FPU / NEON / TrustZone / AArch64 L1: 64 KB / 64 KB, L2: 2 MB, L3: 4 MB or 0 MB 1.85 or 2.26 GHz
X-Gene
(Applied Micro)
ARMv8-A X-Gene 64-bit, quad issue, SMP, 64 cores[44] Cache, MMU, virtualization 3 GHz (4.2 DMIPS/MHz per core)
Denver
(Nvidia)
ARMv8-A Denver[45][46] 2 cores. AArch64, 7-wide superscalar, in-order, dynamic code optimization, 128 MB optimization cache 128 KB I / 64 KB D Up to 2.5 GHz
ThunderX
(Cavium)
ARMv8-A ThunderX 64-bit, with two models with 8-16 or 24-48 cores (×2 w/two chips)  ? Up to 2.5 GHz
K12
(AMD)
ARMv8-A K12[47]  ?  ?  ?
Exynos
(Samsung)
ARMv8-A M1 ("Mongoose")[48] 64-bit  ? 5.1 DMIPS/MHz

(2.3 GHz)

ARM core timeline

The following table lists each core by the year it was announced.[49][50]

Year Classic cores Cortex cores
ARM7 ARM8 ARM9 ARM10 ARM11 Microcontroller Real-time Application
(32-bit)
Application
(64-bit)
1994 ARM7DI
1995 ARM710a
1996 ARM810
1997 ARM720T
ARM740T
1998 ARM7TDMI
ARM710T
ARM9TDMI
ARM940T
1999 ARM9E-S
ARM966E-S
2000 ARM920T
ARM922T
ARM946E-S
ARM1020T
2001 ARM7TDMI-S
ARM7EJ-S
ARM9EJ-S
ARM926EJ-S
ARM1020E
ARM1022E
2002 ARM1026EJ-S ARM1136J(F)-S
2003 ARM968E-S ARM1156T2(F)-S
ARM1176JZ(F)-S
2004 Cortex-M3
2005 ARM11MPCore Cortex-A8
2006 ARM996HS
2007 Cortex-M1 Cortex-A9
2008
2009 Cortex-M0 Cortex-A5
2010 Cortex-M4 Cortex-A15
2011 Cortex-R4
Cortex-R5
Cortex-R7
Cortex-A7
2012 Cortex-M0+ Cortex-A53
Cortex-A57
2013 Cortex-A12
2014 Cortex-M7 Cortex-A17
2015 Cortex-A35
Cortex-A72
2016 Cortex-R8 Cortex-A32

See also

References

  1. Lua error in package.lua at line 80: module 'strict' not found.
  2. Lua error in package.lua at line 80: module 'strict' not found.
  3. Lua error in package.lua at line 80: module 'strict' not found.
  4. Lua error in package.lua at line 80: module 'strict' not found.
  5. Lua error in package.lua at line 80: module 'strict' not found.
  6. Register 13, FCSE PID register ARM920T Technical Reference Manual
  7. Lua error in package.lua at line 80: module 'strict' not found.
  8. https://www.arm.com/products/processors/classic/arm11/arm1156.php
  9. Lua error in package.lua at line 80: module 'strict' not found.
  10. Cortex-M0 Specification Summary; ARM Holdings.
  11. 11.0 11.1 11.2 Cortex-M0/M0+/M1 Instruction set; ARM Holding.
  12. Cortex-M0+ Specification Summary; ARM Holdings.
  13. Cortex-M1 Specification Summary; ARM Holdings.
  14. Lua error in package.lua at line 80: module 'strict' not found.
  15. Lua error in package.lua at line 80: module 'strict' not found.
  16. Cortex-M3 Specification Summary; ARM Holdings.
  17. Cortex-M4 Specification Summary; ARM Holdings.
  18. Cortex-M7 Specification Summary; ARM Holdings.
  19. Cortex-R4 Specification Summary; ARM Holdings.
  20. Cortex-R5 Specification Summary; ARM Holdings.
  21. 21.0 21.1 Cortex-R5 & Cortex-R7 Press Release; ARM Holdings; 31 January 2011.
  22. Cortex-R7 Specification Summary; ARM Holdings.
  23. Cortex-A5 Specification Summary; ARM Holdings.
  24. Cortex-A7 Specification Summary; ARM Holdings.
  25. 25.0 25.1 Lua error in package.lua at line 80: module 'strict' not found.
  26. Cortex-A8 Specification Summary; ARM Holdings.
  27. Cortex-A9 Specification Summary; ARM Holdings.
  28. Cortex-A12 Summary; ARM Holdings.
  29. Cortex-A15 Specification Summary; ARM Holdings.
  30. Exclusive : ARM Cortex-A15 "40 Per Cent" Faster Than Cortex-A9 | ITProPortal.com
  31. Lua error in package.lua at line 80: module 'strict' not found.
  32. Lua error in package.lua at line 80: module 'strict' not found.
  33. Lua error in package.lua at line 80: module 'strict' not found.
  34. Lua error in package.lua at line 80: module 'strict' not found.
  35. Lua error in package.lua at line 80: module 'strict' not found.
  36. Lua error in package.lua at line 80: module 'strict' not found.
  37. Lua error in package.lua at line 80: module 'strict' not found.
  38. 38.0 38.1 Qualcomm's New Snapdragon S4: MSM8960 & Krait Architecture Explored; Anandtech.
  39. Lua error in package.lua at line 80: module 'strict' not found.
  40. Lua error in package.lua at line 80: module 'strict' not found.
  41. 41.0 41.1 Lua error in package.lua at line 80: module 'strict' not found.
  42. Lua error in package.lua at line 80: module 'strict' not found.
  43. Lua error in package.lua at line 80: module 'strict' not found.
  44. http://www.pcworld.com/article/2464600/appliedmicros-64core-chip-could-spark-off-arm-core-war.html
  45. http://www.anandtech.com/Gallery/Album/3847
  46. http://blogs.nvidia.com/blog/2014/08/11/tegra-k1-denver-64-bit-for-android/
  47. http://www.anandtech.com/show/7990/amd-announces-k12-core-custom-64bit-arm-design-in-2016
  48. Lua error in package.lua at line 80: module 'strict' not found.
  49. ARM Company Milestones.
  50. ARM Press Releases.

Further reading

<templatestyles src="Module:Hatnote/styles.css"></templatestyles>