NVIDIA H200 SXM 141 GB vs NVIDIA RTX 5000 Max-Q Ada Generation

Contents:

Memory, ML Performance, Compute Power, Architecture & Compatibility, ML Software Support, Clocks & Performance, Power Consumption, Rendering, Benchmarks, Additional

Memory

Memory Size

+781% 141 GB

16 GB
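The percentage badges in this comparison appear to be the relative difference between the two cards' values, rounded to a whole percent. A minimal sketch of that arithmetic, assuming this is how the page derives them (the function name `percent_advantage` is hypothetical):

```python
def percent_advantage(larger: float, smaller: float) -> int:
    """Return how much `larger` exceeds `smaller`, as a rounded percentage."""
    return round((larger / smaller - 1) * 100)


# Memory sizes in GB, taken from the Memory Size rows above.
h200_gb = 141
rtx5000_gb = 16

print(f"+{percent_advantage(h200_gb, rtx5000_gb)}%")  # prints "+781%"
```

The same formula reproduces the other badges, for example 16,896 vs 9,728 CUDA cores gives +74%.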

Memory Type

HBM3e GDDR6

Memory Bandwidth

4.89 TB/s

576.0 GB/s

Memory Bus Width

6,144 bit 256 bit
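Bandwidth and bus width together imply a data rate per bus pin, which helps show where the H200's bandwidth lead comes from. The sketch below is only arithmetic on the listed figures; it assumes 1 TB/s = 1,000 GB/s, and the function name `per_pin_rate_gbps` is hypothetical.

```python
def per_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Data rate per bus pin in Gbit/s implied by total bandwidth in GB/s."""
    # GB/s -> Gbit/s (x8), then spread evenly across the bus pins.
    return bandwidth_gb_s * 8 / bus_width_bits


# Figures from the Memory Bandwidth and Memory Bus Width rows above.
h200 = per_pin_rate_gbps(4890.0, 6144)  # 4.89 TB/s taken as 4,890 GB/s (assumption)
rtx = per_pin_rate_gbps(576.0, 256)

print(f"H200: {h200:.2f} Gbit/s per pin")
print(f"RTX 5000 Max-Q Ada: {rtx:.2f} Gbit/s per pin")
```

By this estimate the GDDR6 card runs each pin faster (about 18 Gbit/s vs about 6.4 Gbit/s), so the H200's advantage comes from its much wider HBM3e bus rather than per-pin speed.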

ML Performance

FP16 (Half Precision)

+719% 267.6 TFLOPS

32.69 TFLOPS

BF16 (Brain Float)

Yes Yes

TF32 (TensorFloat)

Yes Yes

Compute Power

FP32 (Single Precision)

+105% 66.91 TFLOPS

32.69 TFLOPS

FP64 (Double Precision)

+6,450% 33.45 TFLOPS

0.5107 TFLOPS

CUDA Cores

+74% 16,896

9,728

RT Cores

Architecture & Compatibility

GPU Architecture

Hopper Ada Lovelace

SM (Streaming Multiprocessor)

+74% 132

PCIe Version

PCIe 5.0 x16 PCIe 4.0 x16

ML Software Support

CUDA Compute Capability

9.0

8.9

CUDA Toolkit (first supported)

v12 v11

CUDA Toolkit status

Supported Supported

Clocks & Performance

Base Clock

+61% 1,500 MHz

930 MHz

Boost Clock

+18% 1,980 MHz

1,680 MHz
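The FP32 figures under Compute Power are consistent with the usual peak-throughput estimate of two floating-point operations (one fused multiply-add) per CUDA core per cycle at boost clock. A minimal sketch, assuming the clocks listed here are in MHz; the function name `peak_fp32_tflops` is hypothetical.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes each CUDA core retires one fused multiply-add (2 FLOPs)
    per cycle at the boost clock.
    """
    flops = 2 * cuda_cores * boost_mhz * 1e6
    return flops / 1e12


# Core counts and boost clocks from this comparison.
print(f"H200 SXM: {peak_fp32_tflops(16896, 1980):.2f} TFLOPS")            # 66.91
print(f"RTX 5000 Max-Q Ada: {peak_fp32_tflops(9728, 1680):.2f} TFLOPS")   # 32.69
```

These are theoretical peaks; sustained throughput depends on the workload, on memory bandwidth, and on the clocks the card actually holds within its power limit.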

Memory Clock

1,593 MHz

+41% 2,250 MHz

Power Consumption

Recommended PSU

1100 W No

Power Connector

8-pin EPS None

TDP/TGP

700 W

-83% 120 W
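Raw throughput and power draw can be combined into a rough efficiency figure. The sketch below divides the listed peak throughput by the listed TDP/TGP; it assumes both cards reach their peak figures at exactly their rated power, which is a simplification, and the function name `gflops_per_watt` is hypothetical.

```python
def gflops_per_watt(tflops: float, watts: float) -> float:
    """Peak throughput per watt, in GFLOPS/W."""
    return tflops * 1000 / watts


# (FP32 TFLOPS, FP64 TFLOPS, TDP/TGP in W) from this comparison.
cards = {
    "H200 SXM": (66.91, 33.45, 700),
    "RTX 5000 Max-Q Ada": (32.69, 0.5107, 120),
}

for name, (fp32, fp64, tdp) in cards.items():
    print(
        f"{name}: FP32 {gflops_per_watt(fp32, tdp):.1f} GFLOPS/W, "
        f"FP64 {gflops_per_watt(fp64, tdp):.1f} GFLOPS/W"
    )
```

On these numbers the mobile card is more efficient for FP32 per watt, while the H200 is far ahead for FP64, so the comparison depends on which precision the workload needs.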

Rendering

Texture Units (TMU)

+74% 528

304

ROP

L2 Cache

50 MB

+28% 64 MB

Benchmarks

MLPerf, llama2-70b-99.9 (UNSET)

3,534 tokens/s —

MLPerf, llama2-70b-99.9 (fp16)

3,553 tokens/s —

MLPerf, llama2-70b-99.9 (fp8)

2,444 tokens/s —

MLPerf, llama3.1-405b (fp16)

40.8 tokens/s —

MLPerf, llama3.1-405b (fp8)

25.3 tokens/s —

MLPerf, llama3.1-8b (fp8)

5,161 tokens/s —

MLPerf, deepseek-r1 (fp8)

1,113 tokens/s —

MLPerf, mixtral-8x7b (fp8)

7,132 tokens/s —

Additional

Slots

SXM Module

IGP

Release Date

Nov. 18, 2024 March 21, 2023

Display Outputs

No outputs

Portable Device Dependent

Comparison of the NVIDIA H200 SXM 141 GB (141 GB HBM3e, 16,896 CUDA cores) and the NVIDIA RTX 5000 Max-Q Ada Generation (16 GB GDDR6, 9,728 CUDA cores).
