Product | NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling (w/o CEC) | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® RTX A4000 - 16GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP) | NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP) | NVIDIA® RTX 6000 Ada Generation - 48GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4xDP) |
Main Specifications | ||||||
Product Series | Nvidia A40 | Nvidia L40 | Nvidia L40S | |||
Core Type | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | |||
Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
GPU Architecture | Ampere | Ada Lovelace | Ada Lovelace | |||
Product Type | Workstation | Workstation | Workstation | |||
Product Line | NVIDIA Professional Graphics | NVIDIA Professional Graphics | NVIDIA Professional Graphics | |||
Memory Technology | GDDR6 with ECC | GDDR6 | GDDR6 | |||
Memory Capacity | 16 GB GDDR6 with ECC | 48 GB | 48 GB with ECC | |||
Max Displays | 4 Displays | 4 Displays | ||||
Detailed Specifications | ||||||
Streaming Processor Cores | 10752 CUDA Cores | 18,176 | 6144 CUDA Cores | 10752 Shading Units | 18,176 | |
NVIDIA Tensor Cores | 336 Tensor Cores | 568 | Gen 4 | 192 | 336 | 568 | |
NVIDIA RT Cores | 84 RT Cores | 142 | Gen 3 | 48 | 84 | 142 | |
Memory Clock Speed | 2000 MHz 16 Gbps effective | |||||
Memory Interface | 384-bit | 256-bit | 384-bit | 384-bit | ||
Memory Speeds (GT/s) | 14.5Gbps GDDR6 | |||||
Max Memory Size | 48 GB GDDR6 with error-correcting code (ECC) | 48 GB GDDR6 with ECC | 48GB GDDR6 with ECC | |||
Max Memory Bandwidth | 696 GB/s | 864 GB/s | ||||
INT8 Tensor Core | 733 TOPS | |||||
TF32 Tensor Core | 183 teraFLOPS | |||||
FP32 | 91.6 teraFLOPS | |||||
Peak BFLOAT16 Tensor Core | 362.05 teraFLOPS | |||||
Peak FP16 Tensor Core | 362.05 teraFLOPS | |||||
Peak FP8 Tensor Core | 733 teraFLOPS | |||||
Peak INT4 Tensor Core | 733 TOPS | |||||
Total NVLink Bandwidth | NVIDIA NVLink: 112.5 GB/s (bidirectional); PCIe Gen4: 16 GB/s | Not supported | ||||
Multi-Instance GPUs | No | |||||
Tensor Performance | 1457.0 TFLOPS | |||||
NVIDIA CUDA™ Technology | Yes | Yes | ||||
vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | |||||
NVENC / NVDEC | 3x / 3x (includes AV1 encode and decode) | 3x / 3x (includes AV1 encode and decode) | ||||
Secure Boot with Root of Trust | Yes | Yes | ||||
NEBS Ready | Yes / Level 3 | Level 3 | ||||
Transistor Count | 17.4 billion | 28.3 billion | 76.3 billion | |||
DisplayPort Connectors | 3x DisplayPort 1.4. The A40 is configured for virtualization by default, with physical display connectors disabled; the display outputs can be enabled via management software tools. | 4x DP 1.4a | 4x DisplayPort 1.4a | |||
Cooling | Passive | Passive | Passive | |||
Dual Slot | 2-slot Low-profile | Yes | ||||
Dimensions | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 9.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) |
Form Factor | PCIe | |||||
Lithography | Samsung 8nm | 8nm | Samsung 8nm | 4 nm NVIDIA Custom Process | ||
Supplementary Power Connectors | 1x 8-pin CPU (EPS12V) | 1x 16-pin PCIe CEM5 | 1x 16-pin | 1x 6-pin PCIe | 1x 8-pin EPS | 1x PCIe CEM5 16-pin |
Max Graphics Card Power (W) | 300W | 300W | 350W | 140W | 300W | 300W |
Processor | Ampere (GA104) | Ampere (GA102) | NVIDIA Ada Lovelace | |||
Memory Bandwidth | 448 GB/sec | 768 GB/s | 960 GB/s | |||
Core Clock Speed | 1455 MHz Base Clock 1860 MHz Boost Clock | |||||
L2 Cache Size | 6 MB | |||||
API Support | CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2 | |||||
Texture Fill Rate | 625 GTexel/s | |||||
Graphics Resolution | Max Digital Resolution: 7680 x 4320 x 36 bpp at 60 Hz | 7680 x 4320 x 36 bpp at 60 Hz | ||||
Peak Double Precision FP64 Performance | 1,250 GFLOPS (1:32) | |||||
Peak Single Precision FP32 Performance | 38.7 TFLOPS | 91.1 TFLOPS | ||||
Peak Half Precision FP16 Performance | 40.00 TFLOPS (1:1) | |||||
Deep Learning TFLOPS | 153.4 TFLOPS | |||||
Multi-GPU Scalability | NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000 | |||||
NVLink Interconnect | 112.5 GB/s (bidirectional) | |||||
RT Core Performance | 210.6 TFLOPS | |||||
VR Ready | Yes | |||||
Vulkan API | 1.2 | |||||
DisplayPort Output | 4x DisplayPort 1.4a | 4x DisplayPort 1.4a | 4x DP 1.4a | |||
Minimum Recommended Power, Single Card (W) | 300 | 700 | 600 | |||
Minimum Recommended Power, 2-Way (W) | 500 | 850 | 750 | |||
Minimum Recommended Power, 3-Way (W) | 850 | 1000 | 850 | |||
Minimum Recommended Power, 4-Way (W) | 1000 | 1200 | 1000 | |||
Thermal Solution | Active Heatsink | Active Heatsink | Blower Active Fan | |||
Slot Height | Single Slot | 2-Slot | 2-Slot | |||
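
The memory bandwidth figures in the table follow from the memory interface width and the per-pin data rate. The sketch below is a minimal illustration using the standard relation (bus width in bits x data rate in Gbps / 8). The function names are hypothetical, and pairing a given bandwidth with a given bus width is an assumption, since the table does not always show which column a value belongs to.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: memory interface width (e.g. 384 for a 384-bit bus)
    data_rate_gbps: effective per-pin data rate in Gbps
    """
    return bus_width_bits * data_rate_gbps / 8


def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbps) implied by a quoted bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


# Data rates quoted in the table, applied to a 384-bit interface.
print(peak_bandwidth_gb_s(384, 14.5))  # 696.0, matches the 696 GB/s entry
print(peak_bandwidth_gb_s(384, 16))    # 768.0, matches the 768 GB/s entry

# Assumption: the 864 GB/s and 960 GB/s entries belong to 384-bit cards.
print(implied_data_rate_gbps(864, 384))  # 18.0
print(implied_data_rate_gbps(960, 384))  # 20.0
```

These are theoretical peaks; sustained bandwidth in real workloads is lower.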