Product	NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler	NVIDIA® A10 GPU Computing Accelerator - 24GB GDDR6 - PCIe 4.0 x16 - Passive Cooler (w/o CEC)	NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler	NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler	NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP)
Main Specifications
Product Series	NVIDIA A2	NVIDIA A10	NVIDIA A16	NVIDIA A30	NVIDIA A40	NVIDIA L40S
Core Type	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR
Core Clock Speed	1440 MHz (1770 MHz Boost Clock)	885 MHz (1695 MHz Boost Clock)
Host Interface	PCI Express 4.0 x8	PCI Express 4.0 x16 64GB/s	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16
GPU Architecture	Ampere	Ampere	Ampere	Ampere	Ampere	Ada Lovelace
Product Type							Workstation
Product Line							NVIDIA Professional Graphics
Memory Technology							GDDR6
Memory Capacity							48 GB
Detailed Specifications
Streaming Processor Cores	1280 CUDA Cores				10752 CUDA Cores	18,176 CUDA Cores	10752 CUDA Cores
NVIDIA Tensor Cores	40 \| Gen 3				336 Tensor Cores	568 \| Gen 4	336
NVIDIA RT Cores	10 \| Gen 2	72 RT Cores			84 RT Cores	142 \| Gen 3	84
Memory Clock Speed	6251 MHz	1563 MHz					2000 MHz (16 Gbps effective)
Memory Interface	128-bit				384-bit		384-bit
Memory Speeds (GT/s)					14.5 Gbps GDDR6
Max Memory Size	16 GB GDDR6 ECC	24 GB GDDR6	4x 16GB GDDR6 with error-correcting code (ECC)	24 GB HBM2	48 GB GDDR6 with error-correcting code (ECC)	48GB GDDR6 with ECC
Max Memory Bandwidth	200 GB/s	600 GB/s	4x 232GB/s	933 GB/s	696 GB/s	864 GB/s
Peak FP64				5.2 teraFLOPS
Peak FP64 Tensor Core				10.3 teraFLOPS
INT8 Tensor Core		250 TOPS \| 500 TOPS		330 TOPS \| 661 TOPS		733 TOPS
TF32 Tensor Core	9 TFLOPS \| 18 TFLOPS Sparsity	62.5 teraFLOPS \| 125 teraFLOPS		82 teraFLOPS \| 165 teraFLOPS		183 teraFLOPS
FP32	4.5 TFLOPS	31.2 teraFLOPS		10.3 teraFLOPS		91.6 teraFLOPS
Peak BFLOAT16 Tensor Core		125 teraFLOPS \| 250 teraFLOPS		165 teraFLOPS \| 330 teraFLOPS		362.05 teraFLOPS
Peak FP16 Tensor Core	18 TFLOPS \| 36 TFLOPS Sparsity	125 teraFLOPS \| 250 teraFLOPS		165 teraFLOPS \| 330 teraFLOPS		362.05 teraFLOPS
Peak FP8 Tensor Core						733 teraFLOPS
Peak INT4 Tensor Core		500 TOPS \| 1,000 TOPS		661 TOPS \| 1,321 TOPS		733 TOPS
Total NVLink Bandwidth				Third-gen NVLink: 200 GB/s	NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s	Not supported
Multi-Instance GPUs						No
NVIDIA CUDA™ Technology	11.1 or later						Yes
NVENC \| NVDEC						3x \| 3x (includes AV1 encode and decode)
Secure Boot with Root of Trust						Yes
NEBS Ready						Level 3
Peak INT4 Performance	72 TOPS \| 144 TOPS Sparsity
Peak INT8 Performance	36 TOPS \| 72 TOPS Sparsity
ECC Protection	On by Default
Transistor Count							28.3 Billion
DisplayPort Connectors					3x DisplayPort 1.4. The A40 is configured for virtualization by default, with physical display connectors disabled; the display outputs can be enabled via management software tools.	4x DisplayPort 1.4a
Cooling	Passive	Passive	Passive		Passive	Passive
Slot Width	Single-slot	Single-slot	Dual-slot	Dual-slot	2-slot Low-profile
Dimensions	6.61" L x 2.71" H	FHFL			4.4" (H) x 10.5" (L)	4.4" (H) x 10.5" (L)	4.4" (H) x 10.5" (L)
Form Factor	Low-Profile PCIe
Lithography		8 nm			Samsung 8nm		Samsung 8nm
Supplementary Power Connectors		None	8-pin CPU	1x 8-pin CPU (EPS12V)	1x 8-pin CPU (EPS12V)	1x 16-pin	1x 8-pin EPS
Max Graphics Card Power (W)	40-60 W \| Configurable	150W	250W	165W	300W	350W	300W
Processor							Ampere (GA102)
Memory Bandwidth							768 GB/s
Core Clock Speed							1455 MHz Base Clock 1860 MHz Boost Clock
L2 Cache Size							6 MB
API Support							CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2
Texture Fill Rate							625 GTexel/s
Graphics Resolution							7680 x 4320 x 36 bpp at 60 Hz
Peak Double Precision FP64 Performance							1,250 GFLOPS (1:32)
Peak Single Precision FP32 Performance							38.7 TFLOPS
Peak Half Precision FP16 Performance							40.00 TFLOPS (1:1)
Multi-GPU Scalability							NVLink 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000
NVLink Interconnect							112.5 GB/s (bidirectional)
VR Ready							Yes
Vulkan API							1.2
DisplayPort Output							4x DisplayPort 1.4a
Minimum Recommended Power, Single Card (W)							700
Minimum Recommended Power, 2-Way (W)							850
Minimum Recommended Power, 3-Way (W)							1000
Minimum Recommended Power, 4-Way (W)							1200
Thermal Solution							Active Heatsink
Slot Height							2-Slot
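
The memory bandwidth figures in the table can be cross-checked against the listed bus width and per-pin data rate. The sketch below is illustrative only: it assumes the usual rule of thumb that theoretical peak bandwidth equals bus width (bits) times effective data rate (Gbps per pin) divided by 8, and the function name `peak_bandwidth_gb_s` is a hypothetical helper, not part of any vendor tool. The inputs are the A40 and RTX A6000 values from the rows above.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumes bandwidth = bus width (bits) * per-pin data rate (Gbps) / 8 bits per byte.
    """
    return bus_width_bits * data_rate_gbps / 8


# (memory interface width in bits, effective data rate in Gbps), taken from the table
cards = {
    "A40": (384, 14.5),
    "RTX A6000": (384, 16.0),
}

for name, (bits, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(bits, rate):.0f} GB/s")
# A40: 696 GB/s
# RTX A6000: 768 GB/s
```

Both results agree with the table's listed values (696 GB/s for the A40 and 768 GB/s for the RTX A6000). The other cards cannot be checked this way from the table alone, because their bus width or data rate is not listed.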