Product	AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling	NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler	NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® L4 ADA GPU Computing Accelerator - 24GB GDDR6X - PCIe 4.0 x16 - Passive Cooling	NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP)
Main Specifications
Product Series	AMD Instinct	NVIDIA A30	NVIDIA A40	NVIDIA L4	NVIDIA L40S
Core Type		NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR
Core Clock Speed	1700 MHz			795 MHz Base \| 2040 MHz Boost
Host Interface	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16
GPU Architecture	CDNA2	Ampere	Ampere	Ada Lovelace	Ada Lovelace
Product Type						Workstation
Product Line						NVIDIA Professional Graphics
Memory Technology						GDDR6
Memory Capacity						48 GB
Detailed Specifications
Streaming Processor Cores	6656		10752 CUDA Cores		18,176	10752 Shading Units
Compute Units	104
NVIDIA Tensor Cores			336 Tensor Cores		568 \| Gen 4	336
NVIDIA RT Cores			84 RT Cores		142 \| Gen 3	84
Memory Clock Speed	1.6 GHz			6251 MHz		2000 MHz 16 Gbps effective
Memory Interface	4096-bit		384-bit	192-bit		384-bit
Memory Speeds (GT/s)			14.5Gbps GDDR6
Max Memory Size	64 GB HBM2e	24 GB HBM2	48 GB GDDR6 with error-correcting code (ECC)	24 GB	48GB GDDR6 with ECC
Max Memory Bandwidth	Up to 1638.4 GB/s	933 GB/s	696 GB/s	300 GB/s	864 GB/s
Infinity Fabric™ Links	3
Peak Infinity Fabric™ Link Bandwidth	100 GB/s
Peak FP64		5.2 teraFLOPS
Peak FP64 Tensor Core		10.3 teraFLOPS
INT8 Tensor Core		330 TOPS \| 661 TOPS		485 TOPS \| Sparsity	733 TOPS
TF32 Tensor Core		82 teraFLOPS \| 165 teraFLOPS		120 TFLOPS \| Sparsity	183 teraFLOPS
FP32	22.6 TFLOPs	10.3 teraFLOPS		30.3 TFLOPS	91.6 teraFLOPS
Peak BFLOAT16 Tensor Core		165 teraFLOPS \| 330 teraFLOPS		242 TFLOPS \| Sparsity	362.05 teraFLOPS
Peak FP16 Tensor Core		165 teraFLOPS \| 330 teraFLOPS		242 TFLOPS \| Sparsity	362.05 teraFLOPS
Peak FP8 Tensor Core				485 TFLOPS \| Sparsity	733 teraFLOPS
Peak INT4 Tensor Core		661 TOPS \| 1321 TOPS			733 TOPS
Total NVLink Bandwidth		Third-gen NVLINK: 200GB/s	NVIDIA NVLink 112.5 GB/s (bidirectional) PCIe Gen4 16 GB/s		Not supported
Multi-Instance GPUs					No
NVIDIA CUDA™ Technology						Yes
NVENC \| NVDEC				2 \| 4 \| 4 \| JPEG Decoders \| AV1 Encode and Decode	3x \| 3x (includes AV1 encode and decode)
Secure Boot with Root of Trust				Yes	Yes
NEBS Ready				Yes \| Level 3	Level 3
Peak Single Precision Matrix (FP32) Performance	45.3 TFLOPs
Peak Double Precision Matrix (FP64) Performance	45.3 TFLOPs
Peak Double Precision (FP64) Performance	22.6 TFLOPs
Peak INT4 Performance	181 TOPs
Peak bfloat16	181 TFLOPs
ECC Protection	Yes (Full-Chip)			On by Default
Transistor Count						28.3 Billion
DisplayPort Connectors			3x DisplayPort 1.4 A40 is configured for virtualization by default with physical display connectors disabled. The display outputs can be enabled via management software tools.	None \| vGPU Only	4x DisplayPort 1.4a
OS Support	Linux x86_64
Cooling	Passive		Passive	Passive	Passive
Dual Slot	yes	Dual-slot	2-slot Low-profile	No
Dimensions	10.5" (267 mm) Board Length		4.4" (H) x 10.5" (L)		4.4" (H) x 10.5" (L)	4.4" (H) x 10.5" (L)
Form Factor	Full Height			6.61" L x 2.71" H (Low-profile)
Lithography			Samsung 8nm			Samsung 8nm
Supplementary Power Connectors	1x8 pin 12V EPS	1x 8-pin CPU (EPS12V)	1x 8-pin CPU (EPS12V)		1x 16-pin	1x 8-pin EPS
Max Graphics Card Power (W)	300W Peak	165W	300W	72W	350W	300W
Processor						Ampere (GA102)
Memory Bandwidth						768 GB/s
Core Clock Speed						1455 MHz Base Clock 1860 MHz Boost Clock
L2 Cache Size						6 MB
API Support						CUDA 8.5, OpenCL 2.0 Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2
Texture Fill Rate						625 GTexel/s
Graphics Resolution						7680 x 4320 x36 bpp at 60 Hz
Peak Double Precision FP64 Performance						1,250 GFLOPS (1:32)
Peak Single Precision FP32 Performance						38.7 TFLOPS
Peak Half Precision FP16 Performance						40.00 TFLOPS (1:1)
Multi-GPU Scalability						NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000
NVLink Interconnect						112.5 GB/s (bidirectional)
VR Ready						Yes
Vulkan API						1.2
DisplayPort Output						4x DisplayPort 1.4a
Minimum Recommended Power, Single Card (W)						700W
Minimum Recommended Power, 2-Way (W)						850W
Minimum Recommended Power, 3-Way (W)						1000W
Minimum Recommended Power, 4-Way (W)						1200W
Thermal Solution						Active Heatsink
Slot Height						2-Slot
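
The listed memory bandwidth figures can be cross-checked against the memory interface width and per-pin data rate given in the table. The sketch below is illustrative only: the function name `peak_bandwidth_gb_s` is hypothetical, and the MI210 data rate of 3.2 Gbps is an assumption (the listed 1.6 GHz memory clock at double data rate). The A40 and RTX A6000 rates are taken from the "Memory Speeds" and "Memory Clock Speed" rows.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# name: (bus width in bits, per-pin data rate in Gbps, bandwidth listed in the table in GB/s)
cards = {
    "MI210": (4096, 3.2, 1638.4),    # assumption: 1.6 GHz clock, double data rate
    "A40": (384, 14.5, 696.0),       # 14.5 Gbps GDDR6 from the table
    "RTX A6000": (384, 16.0, 768.0), # 16 Gbps effective from the table
}

for name, (bits, rate, listed) in cards.items():
    computed = peak_bandwidth_gb_s(bits, rate)
    print(f"{name}: computed {computed:.1f} GB/s, listed {listed:.1f} GB/s")
```

Under these assumptions the computed values agree with the "Max Memory Bandwidth" and "Memory Bandwidth" rows for all three cards.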