| Product | NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler (w/o CEC) | NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® T1000 8GB GDDR6 - PCIe 3.0 x16 - Active Cooling (4x mDP) | NVIDIA® RTX 4000 SFF Ada Generation - 20GB GDDR6 ECC - PCIe 4.0 x16 - Active Cooling (4x mDP) |
| Main Specifications | ||||||
| Product Series | Nvidia A2 | Nvidia L4 | Nvidia L40 | Nvidia L40S | ||
| Core Type | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | ||
| Core Clock Speed | 1440 MHz (1770 MHz Boost Clock) | 795 MHz Base, 2040 MHz Boost | ||||
| Host Interface | PCI Express 4.0 x8 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 3.0 x16 | PCI Express 4.0 x16 |
| GPU Architecture | Ampere | Ada Lovelace | Ada Lovelace | Ada Lovelace | ||
| Product Type | | | | | Workstation | Workstation |
| Product Line | NVIDIA Professional Graphics | |||||
| Memory Technology | | | | | GDDR6 | GDDR6 |
| Memory Capacity | | | | | 8 GB | 20 GB GDDR6 ECC |
| Max Displays | 4 Displays | |||||
| Detailed Specifications | ||||||
| Streaming Processor Cores | 1280 CUDA Cores | 18,176 | 896 CUDA Parallel-Processing Cores | 6144 CUDA Cores | ||
| NVIDIA Tensor Cores | 40 (Gen 3) | 568 (Gen 4) | 192 (Gen 4) | |||
| NVIDIA RT Cores | 10 (Gen 2) | 142 (Gen 3) | 48 (Gen 3) | |||
| Memory Clock Speed | 6251 MHz | 6251 MHz | ||||
| Memory Interface | 128-bit | 192-bit | | | 128-bit | 160-bit |
| Max Memory Size | 16 GB GDDR6 ECC | 24 GB | 48 GB GDDR6 with ECC | 48GB GDDR6 with ECC | ||
| Max Memory Bandwidth | 200 GB/s | 300 GB/s | 864 GB/s | |||
| ECC Protection | On by Default | On by Default | ||||
| INT8 Tensor Core | | 485 TOPS (with sparsity) | | 733 TOPS | | |
| TF32 Tensor Core | 9 TFLOPS (18 TFLOPS with sparsity) | 120 TFLOPS (with sparsity) | | 183 TFLOPS | | |
| FP32 | 4.5 TFLOPS | 30.3 TFLOPS | | 91.6 TFLOPS | | |
| Peak BFLOAT16 Tensor Core | | 242 TFLOPS (with sparsity) | | 362.05 TFLOPS | | |
| Peak FP16 Tensor Core | 18 TFLOPS (36 TFLOPS with sparsity) | 242 TFLOPS (with sparsity) | | 362.05 TFLOPS | | |
| Peak FP8 Tensor Core | | 485 TFLOPS (with sparsity) | | 733 TFLOPS | | |
| Peak INT4 Tensor Core | | | | 733 TOPS | | |
| Total NVLink Bandwidth | Not supported | |||||
| Multi-Instance GPUs | No | |||||
| Tensor Performance | | | | | | 306.8 TFLOPS |
| NVIDIA CUDA™ Technology | 11.1 or later | Yes | ||||
| vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | |||||
| NVENC / NVDEC | | 2 / 4 (4 JPEG decoders; AV1 encode and decode) | 3 / 3 (includes AV1 encode and decode) | 3 / 3 (includes AV1 encode and decode) | | |
| Secure Boot with Root of Trust | Yes | Yes | Yes | |||
| NEBS Ready | Yes / Level 3 | Yes / Level 3 | Level 3 | |||
| Peak INT4 Performance | 72 TOPS (144 TOPS with sparsity) | |||||
| Peak INT8 Performance | 36 TOPS (72 TOPS with sparsity) | |||||
| Transistor Count | 35.8 Billion | |||||
| DisplayPort Connectors | None, vGPU only | 4x DP 1.4a | 4x DisplayPort 1.4a | |||
| Cooling | Passive | Passive | Passive | Passive | ||
| Dual Slot | Single-slot | No | Yes | |||
| Dimensions | 6.61" L x 2.71" H | 4.4" H x 10.5" L | 4.4" H x 10.5" L | 2.713" H x 6.137" L | 2.7" H x 6.6" L | |
| Form Factor | Low-Profile PCIe | 6.61” L x 2.71” H (Low-profile) | PCIe | |||
| Supplementary Power Connectors | 1x 16-pin PCIe CEM5 | 1x 16-pin | No Auxiliary Power Required | |||
| Max Graphics Card Power (W) | 40-60 W (configurable) | 72 W | 300 W | 350 W | 50 W | 70 W |
| Processor | | | | | NVIDIA Turing | NVIDIA Ada Lovelace |
| Memory Bandwidth | | | | | 160 GB/s | 320 GB/s |
| API Support | CUDA C, CUDA C++, DirectCompute 5.0, OpenCL, Java, Python, and Fortran; Shader Model 5.1 (OpenGL 4.5 and DirectX 12) | |||||
| Graphics Resolution | Max Digital Resolution: 7680 x 4320 at 60 Hz | |||||
| Peak Single Precision FP32 Performance | | | | | 2.50 TFLOPS | 19.2 TFLOPS |
| RT Core Performance | | | | | | 44.3 TFLOPS |
| VR Ready | Yes | |||||
| Mini DisplayPort Output | | | | | 4x | 4x mDP 1.4a |
| Thermal Solution | | | | | Ultra-quiet active fansink | Active Heatsink |
| Slot Height | | | | | Low-Profile Single Slot | Low-Profile Dual Slot |
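
The Memory Interface, Memory Clock Speed, and Max Memory Bandwidth rows can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only and is not vendor tooling: the function name `peak_bandwidth_gb_s` is hypothetical, and it assumes that the listed 6251 MHz memory clock moves two bits per pin per cycle (roughly 12.5 Gbps effective) and that the two 6251 MHz entries belong to the cards with the 128-bit and 192-bit interfaces. Under those assumptions the estimates land close to the listed 200 GB/s and 300 GB/s figures.

```python
def peak_bandwidth_gb_s(bus_width_bits: int,
                        memory_clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Estimate peak memory bandwidth in GB/s from bus width and memory clock."""
    # ASSUMPTION: the listed memory clock is doubled to get the effective
    # per-pin data rate; this factor is not stated in the table.
    data_rate_gbps = memory_clock_mhz * transfers_per_clock / 1000.0
    # Each transfer moves bus_width_bits across the bus; convert bits to bytes.
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_gbps


if __name__ == "__main__":
    # Bus widths taken from the Memory Interface row; clock from Memory Clock Speed.
    for bus_width in (128, 192):
        estimate = peak_bandwidth_gb_s(bus_width, 6251)
        print(f"{bus_width}-bit @ 6251 MHz: about {estimate:.1f} GB/s")
```

The same function can be used as a quick sanity check on other rows, provided the memory clock for that card is known; the table does not list one for every product.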