Product | AMD Instinct™ MI100 Accelerator - 32GB HBM2 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® A10 GPU Computing Accelerator - 24GB GDDR6 - PCIe 4.0 x16 - Passive Cooler (w/o CEC) | NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40 ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling | NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling |
Main Specifications | |||||
Product Series | AMD Instinct | Nvidia A10 | Nvidia A40 | Nvidia L40 | Nvidia L40S |
Core Type | | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR | NVIDIA TENSOR |
Core Clock Speed | 1502 MHz | 885 MHz (1695 MHz Boost Clock) | |||
Host Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 (64 GB/s) | PCI Express 4.0 x16 | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
GPU Architecture | CDNA | Ampere | Ampere | Ada Lovelace | Ada Lovelace |
Detailed Specifications | |||||
Streaming Processor Cores | 7,680 | | 10,752 CUDA Cores | 18,176 | |
Compute Units | 120 | ||||
NVIDIA Tensor Cores | | | 336 Tensor Cores | 568 (Gen 4) | |
NVIDIA RT Cores | | 72 RT Cores | 84 RT Cores | 142 (Gen 3) | |
Memory Clock Speed | 1.2 GHz | 1563 MHz | |||
Memory Interface | 4096-bit | 384-bit | |||
Memory Speeds (GT/s) | | | 14.5 Gbps GDDR6 | | |
Max Memory Size | 32 GB HBM2 | 24 GB GDDR6 | 48 GB GDDR6 with error-correcting code (ECC) | 48 GB GDDR6 with ECC | 48GB GDDR6 with ECC |
Max Memory Bandwidth | Up to 1228.8 GB/s | 600 GB/s | 696 GB/s | 864 GB/s | |
Infinity Fabric™ Links | 3 | ||||
Peak Infinity Fabric™ Link Bandwidth | 92 GB/s | ||||
INT8 Tensor Core | | 250 TOPS (500 TOPS with sparsity) | | | 733 TOPS |
TF32 Tensor Core | | 62.5 teraFLOPS (125 teraFLOPS with sparsity) | | | 183 teraFLOPS |
FP32 | | 31.2 teraFLOPS | | | 91.6 teraFLOPS |
Peak BFLOAT16 Tensor Core | | 125 teraFLOPS (250 teraFLOPS with sparsity) | | | 362.05 teraFLOPS |
Peak FP16 Tensor Core | | 125 teraFLOPS (250 teraFLOPS with sparsity) | | | 362.05 teraFLOPS |
Peak FP8 Tensor Core | | | | | 733 teraFLOPS |
Peak INT4 Tensor Core | | 500 TOPS (1,000 TOPS with sparsity) | | | 733 TOPS |
Total NVLink Bandwidth | | | NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s | | Not supported |
Multi-Instance GPUs | No | ||||
vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) | ||||
NVENC / NVDEC | | | | 3x / 3x (includes AV1 encode & decode) | 3x / 3x (includes AV1 encode & decode) |
Secure Boot with Root of Trust | | | | Yes | Yes |
NEBS Ready | | | | Yes / Level 3 | Level 3 |
Peak Half Precision (FP16) Performance | 184.6 TFLOPs | ||||
Peak Single Precision Matrix (FP32) Performance | 46.1 TFLOPs | ||||
Peak Single Precision (FP32) Performance | 23.1 TFLOPs | ||||
Peak Double Precision (FP64) Performance | 11.5 TFLOPs | ||||
Peak INT4 Performance | 184.6 TOPs | ||||
Peak INT8 Performance | 184.6 TOPs | ||||
Peak bfloat16 | 92.3 TFLOPs | ||||
DisplayPort Connectors | | | 3x DisplayPort 1.4 (configured for virtualization by default with physical display connectors disabled; display outputs can be enabled via management software tools) | 4x DP 1.4a | 4x DisplayPort 1.4a |
OS Support | Linux x86_64 | ||||
Cooling | Passive | Passive | Passive | Passive | Passive |
Dual Slot | Yes | Single-slot | 2-slot | Yes | |
Dimensions | 10.5" (267 mm) Board Length | FHFL (full height, full length) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) | 4.4" (H) x 10.5" (L) |
Form Factor | PCIe | ||||
Lithography | TSMC 7nm FinFET | 8 nm | Samsung 8nm | ||
Supplementary Power Connectors | 2x PCIe 8-pin connectors | 1x 8-pin CPU | 1x 8-pin CPU (EPS12V) | 1x 16-pin PCIe CEM5 | 1x 16-pin |
Max Graphics Card Power (W) | 300W | 150W | 300W | 300W | 350W |
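Note: the peak memory bandwidth figures above follow directly from the memory interface width and the per-pin data rate. A minimal sanity check, assuming 2.4 GT/s per pin for the MI100's HBM2 (1.2 GHz, double data rate) and roughly 12.5 Gb/s per pin for the A10's GDDR6 (1563 MHz memory clock):

\[
\mathrm{BW}_{\mathrm{MI100}} = \frac{4096\,\mathrm{bit} \times 2.4\,\mathrm{GT/s}}{8\,\mathrm{bit/byte}} = 1228.8\ \mathrm{GB/s},
\qquad
\mathrm{BW}_{\mathrm{A10}} = \frac{384\,\mathrm{bit} \times 12.5\,\mathrm{Gb/s}}{8\,\mathrm{bit/byte}} = 600\ \mathrm{GB/s}
\]

Both results match the Max Memory Bandwidth row above.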