Product | AMD Instinct™ MI100 Accelerator - 32GB HBM2 - PCIe 4.0 x16 - Passive Cooling | AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling | NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler | NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® H100 NVL GPU Computing Accelerator - 94GB HBM3 - PCIe 5.0 x16 - Passive Cooling | NVIDIA® RTX 5000 Ada Generation - 32GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4xDP) | NVIDIA® RTX 6000 Ada Generation - 48GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4xDP) |
Main Specifications | ||||||||||||
Product Series | AMD Instinct | AMD Instinct | Nvidia A2 | Nvidia A16 | Nvidia A30 | Nvidia A40 | Nvidia L4 | Nvidia L40 | Nvidia L40S | Nvidia H100 NVL | ||
Core Type | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | ||||
Core Clock Speed | 1502 MHz | 1700 MHz | 1440 MHz (1770 MHz Boost Clock) | 795 MHz Base / 2040 MHz Boost | ||||||||
Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x8 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 5.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
GPU Architecture | CDNA | CDNA2 | Ampere | Ampere | Ampere | Ampere | Ada Lovelace | Ada Lovelace | Ada Lovelace | Hopper | ||
Product Type | Workstation | Workstation | ||||||||||
Product Line | NVIDIA Professional Graphics | NVIDIA Professional Graphics | ||||||||||
Memory Technology | GDDR6 | GDDR6 | ||||||||||
Memory Capacity | 32 GB GDDR6 ECC | 48 GB with ECC | ||||||||||
Max Displays | 4 Displays | 4 Displays | ||||||||||
Detailed Specifications | ||||||||||||
Streaming Processor Cores | 7,680 | 6,656 | 1,280 CUDA Cores | 10,752 CUDA Cores | 18,176 | 12,800 CUDA Parallel Processing Cores | 18,176 | |||||
Compute Units | 120 | 104 | ||||||||||
NVIDIA Tensor Cores | 40 (Gen 3) | 336 Tensor Cores | 568 (Gen 4) | 400 | 568 | |||||||
NVIDIA RT Cores | 10 (Gen 2) | 84 RT Cores | 142 (Gen 3) | 100 | 142 | |||||||
PCIe x16 Interconnect Bandwidth | PCIe Gen5: 128GB/s | |||||||||||
Memory Clock Speed | 1.2 GHz | 1.6 GHz | 6251 MHz | 6251 MHz | ||||||||
Memory Interface | 4096-bit | 4096-bit | 128-bit | 384-bit | 192-bit | 256-bit | 384-bit | |||||
Memory Speeds (GT/s) | 14.5Gbps GDDR6 | |||||||||||
Max Memory Size | 32 GB HBM2 | 64 GB HBM2e | 16 GB GDDR6 ECC | 4x 16GB GDDR6 with error-correcting code (ECC) | 24 GB HBM2 | 48 GB GDDR6 with error-correcting code (ECC) | 24 GB | 48 GB GDDR6 with ECC | 48GB GDDR6 with ECC | 94 GB | ||
Max Memory Bandwidth | Up to 1228.8 GB/s | Up to 1638.4 GB/s | 200 GB/s | 4x 232GB/s | 933 GB/s | 696 GB/s | 300 GB/s | 864 GB/s | 7.8TB/s | |||
Infinity Fabric™ Links | 3 | 3 | ||||||||||
Peak Infinity Fabric™ Link Bandwidth | 92 GB/s | 100 GB/s | ||||||||||
Peak FP64 | 5.2 teraFLOPS | 68 teraFLOPs | ||||||||||
Peak FP64 Tensor Core | 10.3 teraFLOPS | 134 teraFLOPs | ||||||||||
INT8 Tensor Core | 330 TOPS / 661 TOPS | 485 TOPS (Sparsity) | 733 TOPS | 7,916 TOPS | ||||||||
TF32 Tensor Core | 9 TFLOPS / 18 TFLOPS Sparsity | 82 teraFLOPS / 165 teraFLOPS | 120 TFLOPS (Sparsity) | 183 teraFLOPS | 1,979 teraFLOPs | |||||||
FP32 | 22.6 TFLOPs | 4.5 TFLOPS | 10.3 teraFLOPS | 30.3 TFLOPS | 91.6 teraFLOPS | 134 teraFLOPs | ||||||
Peak BFLOAT16 Tensor Core | 165 teraFLOPS / 330 teraFLOPS | 242 TFLOPS (Sparsity) | 362.05 teraFLOPS | 3,958 teraFLOPs | ||||||||
Peak FP16 Tensor Core | 18 TFLOPS / 36 TFLOPS Sparsity | 165 teraFLOPS / 330 teraFLOPS | 242 TFLOPS (Sparsity) | 362.05 teraFLOPS | 3,958 teraFLOPs | |||||||
Peak FP8 Tensor Core | 485 TFLOPS (Sparsity) | 733 teraFLOPS | 7,916 teraFLOPs | |||||||||
Peak INT4 Tensor Core | 661 TOPS / 1321 TOPS | 733 TOPS | ||||||||||
Total NVLink Bandwidth | Third-gen NVLINK: 200GB/s | NVIDIA NVLink 112.5 GB/s (bidirectional) PCIe Gen4 16 GB/s | Not supported | 600GB/s | ||||||||
Multi-Instance GPUs | No | |||||||||||
Tensor Performance | 1044.4 TFLOPS | 1457.0 TFLOPS | ||||||||||
NVIDIA CUDA™ Technology | 11.1 or later | |||||||||||
vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | |||||||||||
NVENC / NVDEC | 2 / 4 (plus 4 JPEG Decoders, AV1 Encode and Decode) | 3x / 3x (Includes AV1 Encode & Decode) | 3x / 3x (includes AV1 encode and decode) | |||||||||
Secure Boot with Root of Trust | Yes | Yes | Yes | |||||||||
NEBS Ready | Yes | Level 3 | Yes / Level 3 | Level 3 | |||||||||
Peak Half Precision (FP16) Performance | 184.6 TFLOPs | |||||||||||
Peak Single Precision Matrix (FP32) Performance | 46.1 TFLOPs | 45.3 TFLOPs | ||||||||||
Peak Double Precision Matrix (FP64) Performance | 45.3 TFLOPs | |||||||||||
Peak Single Precision (FP32) Performance | 23.1 TFLOPs | |||||||||||
Peak Double Precision (FP64) Performance | 11.5 TFLOPs | 22.6 TFLOPs | ||||||||||
Peak INT4 Performance | 184.6 TOPs | 181 TOPs | 72 TOPS / 144 TOPS Sparsity | |||||||||
Peak INT8 Performance | 184.6 TOPs | 36 TOPS / 72 TOPS Sparsity | ||||||||||
Peak bfloat16 | 92.3 TFLOPs | 181 TFLOPs | ||||||||||
ECC Protection | Yes (Full-Chip) | On by Default | On by Default | |||||||||
Transistor Count | 76.3 billion | 76.3 billion | ||||||||||
DisplayPort Connectors | 3x DisplayPort 1.4 (the A40 is configured for virtualization by default with physical display connectors disabled; the display outputs can be enabled via management software tools) | None | vGPU Only | 4x DP 1.4a | 4x DisplayPort 1.4a | ||||||||
OS Support | Linux x86_64 | Linux x86_64 | ||||||||||
Cooling | Passive | Passive | Passive | Passive | Passive | Passive | Passive | Passive | Passive | |||
Dual Slot | Yes | Yes | Single-slot | Dual-slot | Dual-slot | 2-slot Low-profile | No | Yes | Yes | |||
Dimensions | 10.5" (267 mm) Board Length | 10.5" (267 mm) Board Length | 6.61” L x 2.71” H | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" H x 10.5" L | 4.4" H x 10.5" L | ||||
Form Factor | Full Height | Low-Profile PCIe | 6.61” L x 2.71” H (Low-profile) | PCIe | PCIe | |||||||
Lithography | TSMC 7nm FinFET | Samsung 8nm | 4 nm NVIDIA Custom Process | 4 nm NVIDIA Custom Process | ||||||||
Supplementary Power Connectors | 2x PCIe 8-pin connectors | 1x8 pin 12V EPS | 8-pin CPU | 1x 8-pin CPU (EPS12V) | 1x 8-pin CPU (EPS12V) | 1x 16-pin PCIe CEM5 | 1x 16-pin | 1x 16-pin CEM5 PCIe | 1x PCIe CEM5 16-pin | |||
Max Graphics Card Power (W) | 300W | 300W Peak | 40-60 W (Configurable) | 250W | 165W | 300W | 72W | 300W | 350W | 400W | 250W | 300W |
Processor | NVIDIA Ada Lovelace | NVIDIA Ada Lovelace | ||||||||||
Memory Bandwidth | 576 GB/s | 960 GB/s | ||||||||||
Peak Single-Precision Performance | 65.3 TFLOPS | |||||||||||
Peak Single Precision FP32 Performance | 91.1 TFLOPS | |||||||||||
NVLink Interconnect | Not Supported | |||||||||||
RT Core Performance | 151.0 TFLOPS | 210.6 TFLOPS | ||||||||||
DisplayPort Output | 4x DP 1.4a | |||||||||||
Mini DisplayPort Output | 4x DP 1.4a | |||||||||||
Minimum Recommended Power, Single Card (W) | 600 | |||||||||||
Minimum Recommended Power, 2-Way (W) | 750 | |||||||||||
Minimum Recommended Power, 3-Way (W) | 850 | |||||||||||
Minimum Recommended Power, 4-Way (W) | 1000 | |||||||||||
Thermal Solution | Blower Active Fan | Blower Active Fan | ||||||||||
Slot Height | 2-Slot | 2-Slot | ||||||||||
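
As a sanity check on the memory rows, the two AMD Instinct bandwidth figures can be reproduced from the listed memory clock and interface width. The sketch below assumes that the first two values in the Memory Clock Speed, Memory Interface, and Max Memory Bandwidth rows belong to the MI100 and MI210, and that HBM2/HBM2e moves data twice per clock cycle. The function name `peak_bandwidth_gb_s` is illustrative and not part of any vendor tool.

```python
def peak_bandwidth_gb_s(bus_width_bits: int,
                        memory_clock_ghz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bandwidth = (bus width in bytes) * (clock in GHz) * (transfers per clock)
    transfers_per_clock=2 is an assumption (double data rate, as in HBM2/HBM2e).
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * memory_clock_ghz * transfers_per_clock


# Values taken from the table: 4096-bit interface, 1.2 GHz and 1.6 GHz clocks.
mi100 = peak_bandwidth_gb_s(4096, 1.2)
mi210 = peak_bandwidth_gb_s(4096, 1.6)

print(f"MI100: {mi100:.1f} GB/s")
print(f"MI210: {mi210:.1f} GB/s")
```

Under these assumptions the results agree with the "Up to 1228.8 GB/s" and "Up to 1638.4 GB/s" entries in the Max Memory Bandwidth row. The same formula does not apply directly to the GDDR6 rows, where the table lists an effective data rate rather than a base clock.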