| Specification | AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling | NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler (w/o CEC) | NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® RTX 6000 Ada Generation - 48GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4xDP) |
|---|---|---|---|---|---|---|---|
| Main Specifications | | | | | | | |
| Product Series | AMD Instinct | NVIDIA A2 | NVIDIA A30 | NVIDIA L4 | NVIDIA L40 | NVIDIA L40S | |
| Core Type | | NVIDIA Tensor | NVIDIA Tensor | NVIDIA Tensor | NVIDIA Tensor | NVIDIA Tensor | |
| Core Clock Speed | 1700 MHz | 1440 MHz (1770 MHz Boost Clock) | | 795 MHz Base / 2040 MHz Boost | | | |
| Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x8 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
| GPU Architecture | CDNA2 | Ampere | Ampere | Ada Lovelace | Ada Lovelace | Ada Lovelace | |
| Product Type | | | | | | | Workstation |
| Product Line | | | | | | | NVIDIA Professional Graphics |
| Memory Technology | | | | | | | GDDR6 |
| Memory Capacity | | | | | | | 48 GB with ECC |
| Max Displays | | | | | | | 4 Displays |
| Detailed Specifications | | | | | | | |
| Streaming Processor Cores | 6656 | 1280 CUDA Cores | | | | 18,176 | 18,176 |
| Compute Units | 104 | | | | | | |
| NVIDIA Tensor Cores | | 40 (Gen 3) | | | | 568 (Gen 4) | 568 |
| NVIDIA RT Cores | | 10 (Gen 2) | | | | 142 (Gen 3) | 142 |
| Memory Clock Speed | 1.6 GHz | 6251 MHz | | 6251 MHz | | | |
| Memory Interface | 4096-bit | 128-bit | | 192-bit | | | 384-bit |
| Max Memory Size | 64 GB HBM2e | 16 GB GDDR6 ECC | 24 GB HBM2 | 24 GB | 48 GB GDDR6 with ECC | 48 GB GDDR6 with ECC | |
| Max Memory Bandwidth | Up to 1638.4 GB/s | 200 GB/s | 933 GB/s | 300 GB/s | | 864 GB/s | |
| ECC Protection | Yes (Full-Chip) | On by Default | | On by Default | | | |
| Infinity Fabric™ Links | 3 | | | | | | |
| Peak Infinity Fabric™ Link Bandwidth | 100 GB/s | | | | | | |
| Peak FP64 | | | 5.2 TFLOPS | | | | |
| Peak FP64 Tensor Core | | | 10.3 TFLOPS | | | | |
| INT8 Tensor Core | | | 330 TOPS / 661 TOPS | 485 TOPS (with sparsity) | | 733 TOPS | |
| TF32 Tensor Core | | 9 TFLOPS / 18 TFLOPS with sparsity | 82 TFLOPS / 165 TFLOPS | 120 TFLOPS (with sparsity) | | 183 TFLOPS | |
| FP32 | 22.6 TFLOPS | 4.5 TFLOPS | 10.3 TFLOPS | 30.3 TFLOPS | | 91.6 TFLOPS | |
| Peak BFLOAT16 Tensor Core | | | 165 TFLOPS / 330 TFLOPS | 242 TFLOPS (with sparsity) | | 362.05 TFLOPS | |
| Peak FP16 Tensor Core | | 18 TFLOPS / 36 TFLOPS with sparsity | 165 TFLOPS / 330 TFLOPS | 242 TFLOPS (with sparsity) | | 362.05 TFLOPS | |
| Peak FP8 Tensor Core | | | | 485 TFLOPS (with sparsity) | | 733 TFLOPS | |
| Peak INT4 Tensor Core | | | 661 TOPS / 1321 TOPS | | | 733 TOPS | |
| Total NVLink Bandwidth | | | Third-gen NVLink: 200 GB/s | | | Not supported | |
| Multi-Instance GPUs | | | | | | No | |
| Tensor Performance | | | | | | | 1457.0 TFLOPS |
| NVIDIA CUDA™ Technology | | 11.1 or later | | | | | |
| vGPU Software Support | | | | | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | | |
| NVENC / NVDEC | | | | 2 / 4 / 4 JPEG Decoders, AV1 Encode and Decode | 3x / 3x (includes AV1 encode and decode) | 3x / 3x (includes AV1 encode and decode) | |
| Secure Boot with Root of Trust | | | | Yes | Yes | Yes | |
| NEBS Ready | | | | Yes / Level 3 | Yes / Level 3 | Level 3 | |
| Peak Single Precision Matrix (FP32) Performance | 45.3 TFLOPS | | | | | | |
| Peak Double Precision Matrix (FP64) Performance | 45.3 TFLOPS | | | | | | |
| Peak Double Precision (FP64) Performance | 22.6 TFLOPS | | | | | | |
| Peak INT4 Performance | 181 TOPS | 72 TOPS / 144 TOPS with sparsity | | | | | |
| Peak INT8 Performance | | 36 TOPS / 72 TOPS with sparsity | | | | | |
| Peak bfloat16 | 181 TFLOPS | | | | | | |
| Transistor Count | | | | | | | 76.3 billion |
| DisplayPort Connectors | | | | None (vGPU only) | 4x DP 1.4a | 4x DisplayPort 1.4a | |
| OS Support | Linux x86_64 | | | | | | |
| Cooling | Passive | Passive | | Passive | Passive | Passive | |
| Dual Slot | Yes | Single-slot | Dual-slot | No | Yes | | |
| Dimensions | 10.5" (267 mm) Board Length | 6.61" L x 2.71" H | | | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" H x 10.5" L |
| Form Factor | Full Height | Low-Profile PCIe | | 6.61" L x 2.71" H (Low-profile) | PCIe | | |
| Lithography | | | | | | | 4 nm NVIDIA Custom Process |
| Supplementary Power Connectors | 1x 8-pin 12V EPS | | 1x 8-pin CPU (EPS12V) | | 1x 16-pin PCIe CEM5 | 1x 16-pin | 1x PCIe CEM5 16-pin |
| Max Graphics Card Power (W) | 300W Peak | 40-60 W (Configurable) | 165W | 72W | 300W | 350W | 300W |
| Processor | | | | | | | NVIDIA Ada Lovelace |
| Memory Bandwidth | | | | | | | 960 GB/s |
| Peak Single Precision FP32 Performance | | | | | | | 91.1 TFLOPS |
| RT Core Performance | | | | | | | 210.6 TFLOPS |
| DisplayPort Output | | | | | | | 4x DP 1.4a |
| Minimum Recommended Power, Single Card (W) | | | | | | | 600 |
| Minimum Recommended Power, 2-Way (W) | | | | | | | 750 |
| Minimum Recommended Power, 3-Way (W) | | | | | | | 850 |
| Minimum Recommended Power, 4-Way (W) | | | | | | | 1000 |
| Thermal Solution | | | | | | | Blower Active Fan |
| Slot Height | | | | | | | 2-Slot |