Product AMD Instinct™ MI100 Accelerator - 32GB HBM2 - PCIe 4.0 x16 - Passive Cooling; NVIDIA® A10 GPU Computing Accelerator - 24GB GDDR6 - PCIe 4.0 x16 - Passive Cooler (w/o CEC); NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler; NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling (w/o CEC); NVIDIA® T400 4GB GDDR6 - PCIe 3.0 x16 - Active Cooling (3x mDP); NVIDIA® T1000 8GB GDDR6 - PCIe 3.0 x16 - Active Cooling (4x mDP); NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4x DP)
Main Specifications
Product Series AMD Instinct; NVIDIA A10; NVIDIA A30; NVIDIA A40
Core Type NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR
Core Clock Speed 1502 MHz; 885 MHz (1695 MHz Boost Clock)
Host Interface PCI Express 4.0 x16; PCI Express 4.0 x16 64 GB/s; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 3.0 x16; PCI Express 3.0 x16; PCI Express 4.0 x16
GPU Architecture CDNA; Ampere; Ampere; Ampere
Product Type Workstation; Workstation; Workstation
Product Line NVIDIA Professional Graphics
Memory Technology GDDR6; GDDR6; GDDR6
Memory Capacity 4 GB; 8 GB; 48 GB
Max Displays 4 Displays; 4 Displays
Detailed Specifications
Streaming Processor Cores 7,680; 10752 CUDA Cores; 384 CUDA Parallel-Processing Cores; 896 CUDA Parallel-Processing Cores; 10752 Shading Units
Compute Units 120
NVIDIA Tensor Cores 336 Tensor Cores; 336
NVIDIA RT Cores 72 RT Cores; 84 RT Cores; 84
Memory Clock Speed 1.2 GHz; 1563 MHz; 2000 MHz (16 Gbps effective)
Memory Interface 4096-bit; 384-bit; 64-bit; 128-bit; 384-bit
Memory Speeds (GT/s) 14.5 Gbps GDDR6
Max Memory Size 32 GB HBM2; 24 GB GDDR6; 24 GB HBM2; 48 GB GDDR6 with error-correcting code (ECC)
Max Memory Bandwidth Up to 1228.8 GB/s; 600 GB/s; 933 GB/s; 696 GB/s
Infinity Fabric™ Links 3
Peak Infinity Fabric™ Link Bandwidth 92 GB/s
Peak FP64 5.2 teraFLOPS
Peak FP64 Tensor Core 10.3 teraFLOPS
INT8 Tensor Core 250 TOPS | 500 TOPS; 330 TOPS | 661 TOPS
TF32 Tensor Core 62.5 teraFLOPS | 125 teraFLOPS; 82 teraFLOPS | 165 teraFLOPS
FP32 31.2 teraFLOPS; 10.3 teraFLOPS
Peak BFLOAT16 Tensor Core 125 teraFLOPS | 250 teraFLOPS; 165 teraFLOPS | 330 teraFLOPS
Peak FP16 Tensor Core 125 teraFLOPS | 250 teraFLOPS; 165 teraFLOPS | 330 teraFLOPS
Peak INT4 Tensor Core 500 TOPS | 1,000 TOPS; 661 TOPS | 1321 TOPS
Total NVLink Bandwidth Third-gen NVLink: 200 GB/s; NVIDIA NVLink 112.5 GB/s (bidirectional), PCIe Gen4 16 GB/s
NVIDIA CUDA™ Technology Yes
Peak Half Precision (FP16) Performance 184.6 TFLOPs
Peak Single Precision Matrix (FP32) Performance 46.1 TFLOPs
Peak Single Precision (FP32) Performance 23.1 TFLOPs
Peak Double Precision (FP64) Performance 11.5 TFLOPs
Peak INT4 Performance 184.6 TOPs
Peak INT8 Performance 184.6 TOPs
Peak bfloat16 92.3 TFLOPs
Transistor Count 28.3 Billion
DisplayPort Connectors 3x DisplayPort 1.4
A40 is configured for virtualization by default with physical display connectors disabled. The display outputs can be enabled via management software tools.
OS Support Linux x86_64
Cooling Passive; Passive; Passive
Dual Slot yes; Single-slot; Dual-slot; 2-slot Low-profile
Dimensions 10.5" (267 mm) Board Length; FHFL; 4.4" (H) x 10.5" (L); 2.713" H x 6.137" L; 4.4" (H) x 10.5" (L)
Lithography TSMC 7 nm FinFET; 8 nm; Samsung 8 nm; Samsung 8 nm
Supplementary Power Connectors 2x PCIe 8-pin connectors; None; 1x 8-pin CPU (EPS12V); 1x 8-pin CPU (EPS12V); 1x 8-pin EPS
Max Graphics Card Power (W) 300W; 150W; 165W; 300W; 30W; 50W; 300W
Processor NVIDIA Turing; NVIDIA Turing; Ampere (GA102)
Memory Bandwidth 80 GB/s; 160 GB/s; 768 GB/s
Core Clock Speed 1455 MHz Base Clock, 1860 MHz Boost Clock
L2 Cache Size 6 MB
API Support CUDA C, CUDA C++, DirectCompute 5.0, OpenCL, Java, Python, Fortran, Shader Model 5.1 (OpenGL 4.5 and DirectX 12); CUDA C, CUDA C++, DirectCompute 5.0, OpenCL, Java, Python, Fortran, Shader Model 5.1 (OpenGL 4.5 and DirectX 12); CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2
Texture Fill Rate 625 GTexel/s
Graphics Resolution Max Digital Resolution: 7680 x 4320 at 60 Hz; Max Digital Resolution: 7680 x 4320 at 60 Hz; 7680 x 4320 x 36 bpp at 60 Hz
Peak Double Precision FP64 Performance 1,250 GFLOPS (1:32)
Peak Single Precision FP32 Performance 1.094 TFLOPS; 2.50 TFLOPS; 38.7 TFLOPS
Peak Half Precision FP16 Performance 40.00 TFLOPS (1:1)
Multi-GPU Scalability NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000
NVLink Interconnect 112.5 GB/s (bidirectional)
VR Ready Yes
Vulkan API 1.2
DisplayPort Output 4x DisplayPort 1.4a
Mini DisplayPort Output x3; x4
Minimum Recommended Power, Single Card (W) 700W
Minimum Recommended Power, 2-Way (W) 850
Minimum Recommended Power, 3-Way (W) 1000
Minimum Recommended Power, 4-Way (W) 1200
Thermal Solution Ultra-quiet active fansink; Ultra-quiet active fansink; Active Heatsink
Slot Height Low-Profile Single Slot; Low-Profile Single Slot; 2-Slot