Product
NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler (w/o CEC)
NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler
NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler
NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling
NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling
NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling
NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling
NVIDIA® H100 NVL GPU Computing Accelerator - 94GB HBM3 - PCIe 5.0 x16 - Passive Cooling
NVIDIA® RTX A4000 - 16GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP)
NVIDIA® RTX 6000 Ada Generation - 48GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4xDP)
Main Specifications
Product Series Nvidia A2; Nvidia A16; Nvidia A30; Nvidia A40; Nvidia L4; Nvidia L40; Nvidia L40S; Nvidia H100 NVL
Core Type NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR; NVIDIA TENSOR
Core Clock Speed 1440 MHz (1770 MHz Boost Clock); 795 MHz Base | 2040 MHz Boost
Host Interface PCI Express 4.0 x8; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16; PCI Express 5.0 x16; PCI Express 4.0 x16; PCI Express 4.0 x16
GPU Architecture Ampere; Ampere; Ampere; Ampere; Ada Lovelace; Ada Lovelace; Ada Lovelace; Hopper
Product Type Workstation; Workstation
Product Line NVIDIA Professional Graphics; NVIDIA Professional Graphics
Memory Technology GDDR6 with ECC; GDDR6
Memory Capacity 16 GB GDDR6 with ECC; 48 GB with ECC
Max Displays 4 Displays; 4 Displays
Detailed Specifications
Streaming Processor Cores 1280 CUDA Cores; 10752 CUDA Cores; 18,176; 6144 CUDA Cores; 18,176
NVIDIA Tensor Cores 40 | Gen 3; 336 Tensor Cores; 568 | Gen 4; 192; 568
NVIDIA RT Cores 10 | Gen 2; 84 RT Cores; 142 | Gen 3; 48; 142
PCIe x16 Interconnect Bandwidth PCIe Gen5: 128GB/s
Memory Clock Speed 6251 MHz; 6251 MHz
Memory Interface 128-bit; 384-bit; 192-bit; 256-bit; 384-bit
Memory Speeds (GT/s) 14.5Gbps GDDR6
Max Memory Size 16 GB GDDR6 ECC; 4x 16GB GDDR6 with error-correcting code (ECC); 24 GB HBM2; 48 GB GDDR6 with error-correcting code (ECC); 24 GB; 48 GB GDDR6 with ECC; 48GB GDDR6 with ECC; 94 GB
Max Memory Bandwidth 200 GB/s; 4x 232GB/s; 933 GB/s; 696 GB/s; 300 GB/s; 864 GB/s; 7.8TB/s
Peak FP64 5.2 teraFLOPS; 68 teraFLOPs
Peak FP64 Tensor Core 10.3 teraFLOPS; 134 teraFLOPs
Peak FP32 4.5 TFLOPS; 10.3 teraFLOPS; 30.3 TFLOPS; 91.6 teraFLOPS; 134 teraFLOPs
Peak TF32 Tensor Core 9 TFLOPS | 18 TFLOPS Sparsity; 82 teraFLOPS | 165 teraFLOPS; 120 TFLOPS | Sparsity; 183 teraFLOPS; 1,979 teraFLOPs
Peak BFLOAT16 Tensor Core 165 teraFLOPS | 330 teraFLOPS; 242 TFLOPS | Sparsity; 362.05 teraFLOPS; 3,958 teraFLOPs
Peak FP16 Tensor Core 18 TFLOPS | 36 TFLOPS Sparsity; 165 teraFLOPS | 330 teraFLOPS; 242 TFLOPS | Sparsity; 362.05 teraFLOPS; 3,958 teraFLOPs
Peak FP8 Tensor Core 485 TFLOPS | Sparsity; 733 teraFLOPS; 7,916 teraFLOPs
Peak INT8 Tensor Core 330 TOPS | 661 TOPS; 485 TOPS | Sparsity; 733 teraFLOPS; 7,916 TOPS
Peak INT4 Tensor Core 661 TOPS | 1321 TOPS; 733 teraFLOPS
NVIDIA NVLink™ Interconnect Bandwidth Third-gen NVLINK: 200GB/s; NVIDIA NVLink 112.5 GB/s (bidirectional) PCIe Gen4 16 GB/s; Not supported; 600GB/s
Multi-Instance GPUs No
Tensor Performance 1457.0 TFLOPS
NVIDIA CUDA™ Technology 11.1 or later; Yes
vGPU Software Support NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS)
NVENC | NVDEC 2 | 4 | 4 | JPEG Decoders | AV1 Encode and Decode; 3x | 3x (Includes AV1 Encode & Decode); 3x | 3x (includes AV1 encode and decode)
Secure Boot with Root of Trust Yes; Yes; Yes
NEBS Ready Yes | Level 3; Yes / Level 3; Level 3
Peak INT4 Performance 72 TOPS | 144 TOPS Sparsity
Peak INT8 Performance 36 TOPS | 72 TOPS Sparsity
ECC Protection On by Default; On by Default
Transistor Count 17.4 Billion; 76.3 billion
DisplayPort Connectors 3x DisplayPort 1.4 (A40 is configured for virtualization by default with physical display connectors disabled. The display outputs can be enabled via management software tools.); None | vGPU Only; 4x DP 1.4a; 4x DisplayPort 1.4a
Cooling Passive; Passive; Passive; Passive; Passive; Passive; Passive
Dual Slot Single-slot; Dual-slot; Dual-slot; 2-slot Low-profile; No; Yes; Yes
Dimensions 6.61” L x 2.71” H; 4.4" (H) x 10.5" (L); 4.4" (H) x 10.5" (L); 4.4" (H) x 10.5" (L); 4.4” H x 9.5” L; 4.4" H x 10.5" L
Form Factor Low-Profile PCIe; 6.61” L x 2.71” H (Low-profile); PCIe; PCIe
Lithography Samsung 8nm; 8nm; 4 nm NVIDIA Custom Process
Supplementary Power Connectors 8-pin CPU; 1x 8-pin CPU (EPS12V); 1x 8-pin CPU (EPS12V); 1x 16-pin PCIe CEM5; 1x 16-pin; 1x 6-pin PCIe; 1x PCIe CEM5 16-pin
Max Graphics Card Power (W) 40-60 W | Configurable; 250W; 165W; 300W; 72W; 300W; 350W; 400W; 140W; 300W
Processor Ampere (GA104); NVIDIA Ada Lovelace
Memory Bandwidth 448 GB/sec; 960 GB/s
Graphics Resolution Max Digital Resolution: 7680 x 4320 x 36 bpp at 60 Hz
Peak Single Precision FP32 Performance 91.1 TFLOPS
Deep Learning TFLOPS 153.4 TFLOPS
RT Core Performance 210.6 TFLOPS
DisplayPort Output 4x DisplayPort 1.4a; 4x DP 1.4a
Minimum Recommended Power, Single Card (W) 300W; 600
Minimum Recommended Power, 2-Way (W) 500; 750
Minimum Recommended Power, 3-Way (W) 850; 850
Minimum Recommended Power, 4-Way (W) 1000; 1000
Thermal Solution Active Heatsink; Blower Active Fan
Slot Height Single Slot; 2-Slot