Product | NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler | NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling (w/o CEC) | NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® T1000 8GB GDDR6 - PCIe 3.0 x16 - Active Cooling (4x mDP) | NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP) |
Main Specifications | |||||||
Product Series | NVIDIA A16 | NVIDIA A30 | NVIDIA A40 | NVIDIA L4 | NVIDIA L40 | ||
Core Type | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | ||
Core Clock Speed | 795 MHz Base / 2040 MHz Boost | ||||||
Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 3.0 x16 | PCI Express 4.0 x16 |
GPU Architecture | Ampere | Ampere | Ampere | Ada Lovelace | Ada Lovelace | ||
Product Type | Workstation | Workstation | |||||
Product Line | NVIDIA Professional Graphics | ||||||
Memory Technology | GDDR6 | GDDR6 | |||||
Memory Capacity | 8 GB | 48 GB | |||||
Max Displays | 4 Displays | ||||||
Detailed Specifications | |||||||
Streaming Processor Cores | 10752 CUDA Cores | 896 CUDA Parallel-Processing Cores | 10752 Shading Units | ||||
NVIDIA Tensor Cores | 336 Tensor Cores | 336 | |||||
NVIDIA RT Cores | 84 RT Cores | 84 | |||||
Memory Clock Speed | 6251 MHz | 2000 MHz (16 Gbps effective) | |||||
Memory Interface | 384-bit | 192-bit | 128-bit | 384-bit | |||
Memory Speeds (GT/s) | 14.5Gbps GDDR6 | ||||||
Max Memory Size | 4x 16GB GDDR6 with error-correcting code (ECC) | 24 GB HBM2 | 48 GB GDDR6 with error-correcting code (ECC) | 24 GB | 48 GB GDDR6 with ECC | ||
Max Memory Bandwidth | 4x 232GB/s | 933 GB/s | 696 GB/s | 300 GB/s | |||
Peak FP64 | 5.2 teraFLOPS | ||||||
Peak FP64 Tensor Core | 10.3 teraFLOPS | ||||||
INT8 Tensor Core | 330 TOPS / 661 TOPS with sparsity | 485 TOPS with sparsity | |||||
TF32 Tensor Core | 82 teraFLOPS / 165 teraFLOPS with sparsity | 120 TFLOPS with sparsity | |||||
FP32 | 10.3 teraFLOPS | 30.3 TFLOPS | |||||
Peak BFLOAT16 Tensor Core | 165 teraFLOPS / 330 teraFLOPS with sparsity | 242 TFLOPS with sparsity | |||||
Peak FP16 Tensor Core | 165 teraFLOPS / 330 teraFLOPS with sparsity | 242 TFLOPS with sparsity | |||||
Peak FP8 Tensor Core | 485 TFLOPS with sparsity | ||||||
Peak INT4 Tensor Core | 661 TOPS / 1321 TOPS with sparsity | ||||||
Total NVLink Bandwidth | Third-gen NVLink: 200 GB/s | NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s | |||||
NVIDIA CUDA™ Technology | Yes | ||||||
vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | ||||||
NVENC / NVDEC | 2 / 4 / 4 JPEG Decoders, AV1 Encode and Decode | 3x / 3x (Includes AV1 Encode & Decode) | |||||
Secure Boot with Root of Trust | Yes | Yes | |||||
NEBS Ready | Yes | Level 3 | Yes / Level 3 | |||||
ECC Protection | On by Default | ||||||
Transistor Count | 28.3 Billion | ||||||
DisplayPort Connectors | 3x DisplayPort 1.4 (the A40 is configured for virtualization by default, with physical display connectors disabled; the display outputs can be enabled via management software tools) | None | vGPU Only | 4x DP 1.4a | ||||
Cooling | Passive | Passive | Passive | Passive | |||
Dual Slot | Dual-slot | Dual-slot | 2-slot Low-profile | No | Yes | ||
Dimensions | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 2.713" (H) x 6.137" (L) | 4.4" (H) x 10.5" (L) | |||
Form Factor | 6.61" (L) x 2.71" (H) (Low-profile) | PCIe | |||||
Lithography | Samsung 8nm | Samsung 8nm | |||||
Supplementary Power Connectors | 8-pin CPU | 1x 8-pin CPU (EPS12V) | 1x 8-pin CPU (EPS12V) | 1x 16-pin PCIe CEM5 | 1x 8-pin EPS | ||
Max Graphics Card Power (W) | 250W | 165W | 300W | 72W | 300W | 50W | 300W |
Processor | NVIDIA Turing | Ampere (GA102) | |||||
Memory Bandwidth | 160 GB/s | 768 GB/s | |||||
Core Clock Speed | 1455 MHz Base Clock / 1860 MHz Boost Clock | ||||||
L2 Cache Size | 6 MB | ||||||
API Support | CUDA C, CUDA C++, DirectCompute 5.0, OpenCL, Java, Python, and Fortran; Shader Model 5.1 (OpenGL 4.5 and DirectX 12) | CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2 | |||||
Texture Fill Rate | 625 GTexel/s | ||||||
Graphics Resolution | Max Digital Resolution: 7680 x 4320 at 60 Hz | 7680 x 4320 x 36 bpp at 60 Hz | |||||
Peak Double Precision FP64 Performance | 1,250 GFLOPS (1:32) | ||||||
Peak Single Precision FP32 Performance | 2.50 TFLOPS | 38.7 TFLOPS | |||||
Peak Half Precision FP16 Performance | 40.00 TFLOPS (1:1) | ||||||
Multi-GPU Scalability | NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000 | ||||||
NVLink Interconnect | 112.5 GB/s (bidirectional) | ||||||
VR Ready | Yes | ||||||
Vulkan API | 1.2 | ||||||
DisplayPort Output | 4x DisplayPort 1.4a | ||||||
Mini DisplayPort Output | x4 | ||||||
Minimum Recommended Power, Single Card (W) | 700W | ||||||
Minimum Recommended Power, 2-Way (W) | 850W | ||||||
Minimum Recommended Power, 3-Way (W) | 1000W | ||||||
Minimum Recommended Power, 4-Way (W) | 1200W | ||||||
Thermal Solution | Ultra-quiet active fansink | Active Heatsink | |||||
Slot Height | Low-Profile Single Slot | 2-Slot | |||||