Generative text inference: tokens produced per second per GPU · llama2-70b · Offline + Server scenarios · MLPerf v4.1/v5.0 · on-demand pricing

| GPU | Scenario | Throughput (tok/s per GPU) |
|---|---|---|
| NVIDIA B200-SXM-180GB | Offline | 12,357 |
| NVIDIA B200-SXM-180GB | Server | 12,305 |
| NVIDIA H200-SXM-141GB | Offline | 4,432 |
| NVIDIA H200-SXM-141GB | Server | 4,134 |
| NVIDIA H100-SXM-80GB | Offline | 3,913 |
| NVIDIA H200-NVL-141GB | Offline | 3,894 |
| NVIDIA H100-SXM-80GB | Server | 3,888 |
| NVIDIA H200-NVL-141GB | Server | 3,606 |

Price-performance (tokens per dollar) at on-demand prices, highest first:

| Provider | GPU | Scenario | Price per GPU·hr | Price updated | tok/$ |
|---|---|---|---|---|---|
| RunPod | NVIDIA B200-SXM-180GB | Offline | $5.49 | 1d ago | 8,103,115 |
| RunPod | NVIDIA B200-SXM-180GB | Server | $5.49 | 1d ago | 8,069,123 |
| Lambda Labs | NVIDIA B200-SXM-180GB | Offline | $6.69 | 1d ago | 6,649,641 |
| Lambda Labs | NVIDIA B200-SXM-180GB | Server | $6.69 | 1d ago | 6,621,747 |
| GCP (us-central1) | NVIDIA B200-SXM-180GB | Offline | $8.05 | today | 5,522,793 |
| GCP (us-central1) | NVIDIA B200-SXM-180GB | Server | $8.05 | today | 5,499,626 |
| CoreWeave | NVIDIA B200-SXM-180GB | Offline | $8.60 | 1d ago | 5,172,802 |
| CoreWeave | NVIDIA B200-SXM-180GB | Server | $8.60 | 1d ago | 5,151,103 |
| RunPod | NVIDIA H100-SXM-80GB | Offline | $2.99 | 1d ago | 4,711,726 |
| RunPod | NVIDIA H100-SXM-80GB | Server | $2.99 | 1d ago | 4,681,547 |
| Crusoe | NVIDIA H200-SXM-141GB | Offline | $4.29 | 1d ago | 3,718,846 |
| RunPod | NVIDIA H200-SXM-141GB | Offline | $4.31 | 1d ago | 3,701,589 |
| Crusoe | NVIDIA H100-SXM-80GB | Offline | $3.90 | 1d ago | 3,612,323 |
| Crusoe | NVIDIA H100-SXM-80GB | Server | $3.90 | 1d ago | 3,589,186 |
| Lambda Labs | NVIDIA H100-SXM-80GB | Offline | $3.99 | 1d ago | 3,530,842 |
| Lambda Labs | NVIDIA H100-SXM-80GB | Server | $3.99 | 1d ago | 3,508,227 |
| Crusoe | NVIDIA H200-SXM-141GB | Server | $4.29 | 1d ago | 3,469,034 |
| RunPod | NVIDIA H200-SXM-141GB | Server | $4.31 | 1d ago | 3,452,937 |
| OCI | NVIDIA B200-SXM-180GB | Offline | $14.00 | today | 3,177,579 |
| OCI | NVIDIA B200-SXM-180GB | Server | $14.00 | today | 3,164,249 |
| CoreWeave | NVIDIA H200-SXM-141GB | Offline | $6.31 | 1d ago | 2,528,344 |
| CoreWeave | NVIDIA H200-SXM-141GB | Server | $6.31 | 1d ago | 2,358,503 |
| CoreWeave | NVIDIA H100-SXM-80GB | Offline | $6.16 | 1d ago | 2,287,023 |
| CoreWeave | NVIDIA H100-SXM-80GB | Server | $6.16 | 1d ago | 2,272,374 |
| OCI | NVIDIA H200-SXM-141GB | Offline | $10.00 | today | 1,595,385 |
| Azure (eastus2) | NVIDIA H200-SXM-141GB | Offline | $10.60 | today | 1,505,080 |
| GCP (us-central1) | NVIDIA H200-SXM-141GB | Offline | $10.60 | today | 1,504,958 |
| OCI | NVIDIA H200-SXM-141GB | Server | $10.00 | today | 1,488,216 |
| Azure (eastus2) | NVIDIA H200-SXM-141GB | Server | $10.60 | today | 1,403,977 |
| GCP (us-central1) | NVIDIA H200-SXM-141GB | Server | $10.60 | today | 1,403,863 |
| AWS (us-east-1) | NVIDIA H200-NVL-141GB | Offline | $10.60 | today | 1,322,401 |
| OCI | NVIDIA H100-SXM-80GB | Offline | $10.75 | today | 1,310,517 |
| OCI | NVIDIA H100-SXM-80GB | Server | $10.75 | today | 1,302,123 |
| GCP (us-central1) | NVIDIA H100-SXM-80GB | Offline | $11.06 | today | 1,273,641 |
| GCP (us-central1) | NVIDIA H100-SXM-80GB | Server | $11.06 | today | 1,265,483 |
| AWS (us-east-1) | NVIDIA H200-NVL-141GB | Server | $10.60 | today | 1,224,696 |
| AWS (us-east-1) | NVIDIA H100-SXM-80GB | Offline | $12.29 | today | 1,146,303 |
| Azure (eastus2) | NVIDIA H100-SXM-80GB | Offline | $12.29 | today | 1,146,303 |
| Azure (eastus) | NVIDIA H100-SXM-80GB | Offline | $12.29 | today | 1,146,303 |
| AWS (us-east-1) | NVIDIA H100-SXM-80GB | Server | $12.29 | today | 1,138,961 |
| Azure (eastus2) | NVIDIA H100-SXM-80GB | Server | $12.29 | today | 1,138,961 |
| Azure (eastus) | NVIDIA H100-SXM-80GB | Server | $12.29 | today | 1,138,961 |
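
The page does not state how tok/$ is derived. The listed values appear consistent with per-GPU throughput multiplied by 3,600 seconds and divided by the hourly price, so the sketch below assumes that formula. The function name `tokens_per_dollar` and the sample rows are illustrative. Small differences from the published figures (well under 0.1%) are expected, presumably because the page works from unrounded throughput and price values.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_dollar(tok_per_s: float, price_per_gpu_hour: float) -> float:
    """Tokens generated per dollar on one GPU (assumed formula, see lead-in)."""
    if price_per_gpu_hour <= 0:
        raise ValueError("price_per_gpu_hour must be positive")
    return tok_per_s * SECONDS_PER_HOUR / price_per_gpu_hour


# Sample rows copied from the tables above:
# (provider, GPU, scenario, tok/s per GPU, $/GPU·hr, published tok/$)
SAMPLE_ROWS = [
    ("RunPod", "B200-SXM-180GB", "Offline", 12357, 5.49, 8_103_115),
    ("RunPod", "H100-SXM-80GB", "Offline", 3913, 2.99, 4_711_726),
    ("Crusoe", "H200-SXM-141GB", "Server", 4134, 4.29, 3_469_034),
]

if __name__ == "__main__":
    for provider, gpu, scenario, tok_s, price, published in SAMPLE_ROWS:
        estimate = tokens_per_dollar(tok_s, price)
        # Relative gap between the recomputed and the published value.
        gap = abs(estimate - published) / published
        print(f"{provider:8} {gpu:16} {scenario:8} "
              f"estimate={estimate:,.0f} published={published:,} gap={gap:.4%}")
```

The same function can be used to rank a provider that is not in the table: supply the MLPerf throughput for the GPU and the provider's quoted hourly price.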