Product: AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling; NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler; NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler; NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler; NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling; NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling
Main Specifications
Product Series: AMD Instinct; NVIDIA A2; NVIDIA A16; NVIDIA A30; NVIDIA A40; NVIDIA L40S
Core Type: NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR
Core Clock Speed: 1700 MHz; 1440 MHz (1770 MHz Boost Clock)
Host Interface: PCI Express 4.0 x16; PCI Express 4.0 x8; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16
GPU Architecture: CDNA2; Ampere; Ampere; Ampere; Ampere; Ada Lovelace
Detailed Specifications
Streaming Processor Cores: 6656; 1280 CUDA Cores; 10752 CUDA Cores; 18,176
Compute Units: 104
NVIDIA Tensor Cores: 40 | Gen 3; 336 Tensor Cores; 568 | Gen 4
NVIDIA RT Cores: 10 | Gen 2; 84 RT Cores; 142 | Gen 3
Memory Clock Speed: 1.6 GHz; 6251 MHz
Memory Interface: 4096-bit; 128-bit; 384-bit
Memory Speeds (GT/s): 14.5 Gbps GDDR6
Max Memory Size: 64 GB HBM2e; 16 GB GDDR6 ECC; 4x 16 GB GDDR6 with error-correcting code (ECC); 24 GB HBM2; 48 GB GDDR6 with error-correcting code (ECC); 48 GB GDDR6 with ECC
Max Memory Bandwidth: Up to 1638.4 GB/s; 200 GB/s; 4x 232 GB/s; 933 GB/s; 696 GB/s; 864 GB/s
Infinity Fabric™ Links: 3
Peak Infinity Fabric™ Link Bandwidth: 100 GB/s
Peak FP64: 5.2 TFLOPS
Peak FP64 Tensor Core: 10.3 TFLOPS
INT8 Tensor Core: 330 TOPS | 661 TOPS; 733 TOPS
TF32 Tensor Core: 9 TFLOPS | 18 TFLOPS Sparsity; 82 TFLOPS | 165 TFLOPS; 183 TFLOPS
FP32: 22.6 TFLOPS; 4.5 TFLOPS; 10.3 TFLOPS; 91.6 TFLOPS
Peak BFLOAT16 Tensor Core: 165 TFLOPS | 330 TFLOPS; 362.05 TFLOPS
Peak FP16 Tensor Core: 18 TFLOPS | 36 TFLOPS Sparsity; 165 TFLOPS | 330 TFLOPS; 362.05 TFLOPS
Peak FP8 Tensor Core: 733 TFLOPS
Peak INT4 Tensor Core: 661 TOPS | 1321 TOPS; 733 TOPS
Total NVLink Bandwidth: Third-gen NVLink: 200 GB/s; NVIDIA NVLink 112.5 GB/s (bidirectional), PCIe Gen4 16 GB/s; Not supported
Multi-Instance GPUs: No
NVIDIA CUDA™ Technology: 11.1 or later
NVENC | NVDEC: 3x | 3x (includes AV1 encode and decode)
Secure Boot with Root of Trust: Yes
NEBS Ready: Level 3
Peak Single Precision Matrix (FP32) Performance: 45.3 TFLOPS
Peak Double Precision Matrix (FP64) Performance: 45.3 TFLOPS
Peak Double Precision (FP64) Performance: 22.6 TFLOPS
Peak INT4 Performance: 181 TOPS; 72 TOPS | 144 TOPS Sparsity
Peak INT8 Performance: 36 TOPS | 72 TOPS Sparsity
Peak bfloat16: 181 TFLOPS
ECC Protection: Yes (Full-Chip); On by Default
DisplayPort Connectors: 3x DisplayPort 1.4 (A40 is configured for virtualization by default with physical display connectors disabled. The display outputs can be enabled via management software tools.); 4x DisplayPort 1.4a
OS Support: Linux x86_64
Cooling: Passive; Passive; Passive; Passive; Passive
Dual Slot: Yes; Single-slot; Dual-slot; Dual-slot; 2-slot Low-profile
Dimensions: 10.5" (267 mm) Board Length; 6.61" L x 2.71" H; 4.4" (H) x 10.5" (L); 4.4" (H) x 10.5" (L)
Form Factor: Full Height; Low-Profile PCIe
Lithography: Samsung 8nm
Supplementary Power Connectors: 1x 8-pin 12V EPS; 8-pin CPU; 1x 8-pin CPU (EPS12V); 1x 8-pin CPU (EPS12V); 1x 16-pin
Max Graphics Card Power (W): 300 W Peak; 40-60 W | Configurable; 250 W; 165 W; 300 W; 350 W