Product	AMD Instinct™ MI100 Accelerator - 32GB HBM2 - PCIe 4.0 x16 - Passive Cooling	AMD Instinct™ MI210 Accelerator - 64GB HBM2e - PCIe 4.0 x16 - Passive Cooling	NVIDIA® A2 GPU Computing Accelerator - 16GB GDDR6 - PCIe 4.0 x8 - Passive Cooler (w/o CEC)	NVIDIA® A10 GPU Computing Accelerator - 24GB GDDR6 - PCIe 4.0 x16 - Passive Cooler (w/o CEC)	NVIDIA® A16 GPU Computing Accelerator - 64GB (4x 16GB) GDDR6 - PCIe 4.0 x16 - Passive Cooler	NVIDIA® A30 GPU Computing Accelerator - 24GB HBM2 - PCIe 4.0 x16 - Passive Cooler	NVIDIA® A40 GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® L40S ADA GPU Computing Accelerator - 48GB GDDR6 - PCIe 4.0 x16 - Passive Cooling	NVIDIA® RTX A6000 - 48GB GDDR6 - PCIe 4.0 x16 - Active Cooling (4xDP)
Main Specifications
Product Series	AMD Instinct	AMD Instinct	Nvidia A2	Nvidia A10	Nvidia A16	Nvidia A30	Nvidia A40	Nvidia L40S
Core Type			NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR	NVIDIA TENSOR
Core Clock Speed	1502 MHz	1700 MHz	1440 MHz (1770 MHz Boost Clock)	885 MHz (1695 MHz Boost Clock)
Host Interface	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x8	PCI Express 4.0 x16 64GB/s	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16	PCI Express 4.0 x16
GPU Architecture	CDNA	CDNA2	Ampere	Ampere	Ampere	Ampere	Ampere	Ada Lovelace
Product Type									Workstation
Product Line									NVIDIA Professional Graphics
Memory Technology									GDDR6
Memory Capacity									48 GB
Detailed Specifications
Streaming Processor Cores	7,680	6656	1280 CUDA Cores				10752 CUDA Cores	18,176	10752 Shading Units
Compute Units	120	104
NVIDIA Tensor Cores			40 \| Gen 3				336 Tensor Cores	568 \| Gen 4	336
NVIDIA RT Cores			10 \| Gen 2	72 RT Cores			84 RT Cores	142 \| Gen 3	84
Memory Clock Speed	1.2 GHz	1.6 GHz	6251 MHz	1563 MHz					2000 MHz (16 Gbps effective)
Memory Interface	4096-bit	4096-bit	128-bit				384-bit		384-bit
Memory Speeds (GT/s)							14.5Gbps GDDR6
Max Memory Size	32 GB HBM2	64 GB HBM2e	16 GB GDDR6 ECC	24 GB GDDR6	4x 16GB GDDR6 with error-correcting code (ECC)	24 GB HBM2	48 GB GDDR6 with error-correcting code (ECC)	48GB GDDR6 with ECC
Max Memory Bandwidth	Up to 1228.8 GB/s	Up to 1638.4 GB/s	200 GB/s	600 GB/s	4x 232GB/s	933 GB/s	696 GB/s	864 GB/s
Infinity Fabric™ Links	3	3
Peak Infinity Fabric™ Link Bandwidth	92 GB/s	100 GB/s
Peak FP64						5.2 teraFLOPS
Peak FP64 Tensor Core						10.3 teraFLOPS
INT8 Tensor Core				250 TOPS \| 500 TOPS		330 TOPS \| 661 TOPS		733 TOPS
TF32 Tensor Core			9 TFLOPS \| 18 TFLOPS Sparsity	62.5 teraFLOPS \| 125 teraFLOPS		82 teraFLOPS \| 165 teraFLOPS		183 teraFLOPS
FP32		22.6 TFLOPs	4.5 TFLOPS	31.2 teraFLOPS		10.3 teraFLOPS		91.6 teraFLOPS
Peak BFLOAT16 Tensor Core				125 teraFLOPS \| 250 teraFLOPS		165 teraFLOPS \| 330 teraFLOPS		362.05 teraFLOPS
Peak FP16 Tensor Core			18 TFLOPS \| 36 TFLOPS Sparsity	125 teraFLOPS \| 250 teraFLOPS		165 teraFLOPS \| 330 teraFLOPS		362.05 teraFLOPS
Peak FP8 Tensor Core								733 teraFLOPS
Peak INT4 Tensor Core				500 TOPS \| 1,000 TOPS		661 TOPS \| 1,321 TOPS		733 TOPS
Total NVLink Bandwidth						Third-gen NVLINK: 200GB/s	NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s	Not supported
Multi-Instance GPUs								No
NVIDIA CUDA™ Technology			11.1 or later						Yes
NVENC \| NVDEC								3x \| 3x (includes AV1 encode and decode)
Secure Boot with Root of Trust								Yes
NEBS Ready								Level 3
Peak Half Precision (FP16) Performance	184.6 TFLOPs
Peak Single Precision Matrix (FP32) Performance	46.1 TFLOPs	45.3 TFLOPs
Peak Double Precision Matrix (FP64) Performance		45.3 TFLOPs
Peak Single Precision (FP32) Performance	23.1 TFLOPs
Peak Double Precision (FP64) Performance	11.5 TFLOPs	22.6 TFLOPs
Peak INT4 Performance	184.6 TOPs	181 TOPs	72 TOPS \| 144 TOPS Sparsity
Peak INT8 Performance	184.6 TOPs		36 TOPS \| 72 TOPS Sparsity
Peak bfloat16	92.3 TFLOPs	181 TFLOPs
ECC Protection		Yes (Full-Chip)	On by Default
Transistor Count									28.3 Billion
DisplayPort Connectors							3x DisplayPort 1.4. The A40 is configured for virtualization by default, with physical display connectors disabled; the display outputs can be enabled via management software tools.	4x DisplayPort 1.4a
OS Support	Linux x86_64	Linux x86_64
Cooling	Passive	Passive	Passive	Passive	Passive		Passive	Passive
Dual Slot	yes	yes	Single-slot	Single-slot	Dual-slot	Dual-slot	2-slot Low-profile
Dimensions	10.5" (267 mm) Board Length	10.5" (267 mm) Board Length	6.61” L x 2.71” H	FHFL			4.4" (H) x 10.5" (L)	4.4" (H) x 10.5" (L)	4.4" (H) x 10.5" (L)
Form Factor		Full Height	Low-Profile PCIe
Lithography	TSMC 7nm FinFET			8 nm			Samsung 8nm		Samsung 8nm
Supplementary Power Connectors	2x PCIe 8-pin connectors	1x8 pin 12V EPS		None	8-pin CPU	1x 8-pin CPU (EPS12V)	1x 8-pin CPU (EPS12V)	1x 16-pin	1x 8-pin EPS
Max Graphics Card Power (W)	300W	300W Peak	40-60 W \| Configurable	150W	250W	165W	300W	350W	300W
Processor									Ampere (GA102)
Memory Bandwidth									768 GB/s
Core Clock Speed									1455 MHz Base Clock 1860 MHz Boost Clock
L2 Cache Size									6 MB
API Support									CUDA 8.5, OpenCL 2.0, Shader Model 6.5, OpenGL 4.6, DirectX 12 Ultimate (12_2), Vulkan 1.2
Texture Fill Rate									625 GTexel/s
Graphics Resolution									7680 x 4320 x 36 bpp at 60 Hz
Peak Double Precision FP64 Performance									1,250 GFLOPS (1:32)
Peak Single Precision FP32 Performance									38.7 TFLOPS
Peak Half Precision FP16 Performance									40.00 TFLOPS (1:1)
Multi-GPU Scalability									NVLINK 2-way low profile (2-slot and 3-slot bridges) connects 2x NVIDIA RTX A6000
NVLink Interconnect									112.5 GB/s (bidirectional)
VR Ready									Yes
Vulkan API									1.2
DisplayPort Output									4x DisplayPort 1.4a
Minimum Recommended Power, Single Card (W)									700
Minimum Recommended Power, 2-Way (W)									850
Minimum Recommended Power, 3-Way (W)									1000
Minimum Recommended Power, 4-Way (W)									1200
Thermal Solution									Active Heatsink
Slot Height									2-Slot
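The memory figures in the table can be cross-checked against each other: peak bandwidth is roughly the memory interface width multiplied by the per-pin data rate, divided by 8 bits per byte. The sketch below applies that rule to the four cards for which the table lists both an interface width and a clock or data rate. The helper name `bandwidth_gb_s` is hypothetical, and treating the AMD HBM clocks (1.2 GHz and 1.6 GHz) as double data rate is an assumption, not something the table states.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s = bus width (bits) x per-pin data rate (Gbit/s) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# name: (interface width in bits, per-pin data rate in Gbit/s, bandwidth listed in the table)
cards = {
    # Assumption: HBM transfers twice per clock, so the data rate is 2 x memory clock.
    "MI100": (4096, 2 * 1.2, 1228.8),
    "MI210": (4096, 2 * 1.6, 1638.4),
    # GDDR6 rows already give an effective per-pin rate (14.5 and 16 Gbps).
    "A40": (384, 14.5, 696.0),
    "RTX A6000": (384, 16.0, 768.0),
}

for name, (width, rate, listed) in cards.items():
    computed = bandwidth_gb_s(width, rate)
    print(f"{name}: computed {computed:.1f} GB/s, listed {listed:.1f} GB/s")
```

Under these assumptions the computed values match the listed Max Memory Bandwidth and Memory Bandwidth rows for all four cards, which suggests those rows are internally consistent.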