KataGo speeds on different hardware

Here are some rounded KataGo speeds in visits/s on my RTX 4070 with different libraries, compared with other people’s older hardware (some of those figures are in playouts/s rather than visits/s). My setup:

  • RTX 4070 (Asus TUF 12G, Quiet mode, 200W TDP, 100% power target)
  • Ryzen 7700 (8C, 16T)
  • 64 GB DDR5-RAM JEDEC
  • Nvidia Studio driver 531.61
  • KataGo 1.13.0 (OpenCL, CUDA) or 1.13.1 (TensorRT)
  • 18-Block-Model = kata1-b18c384nbt-s6386600960-d3368371862
  • default values for RAM cache = 3GB and time = 5s
  • Nvidia libraries CUDA 11.6.2 + cuDNN 8.9.1.23 + TensorRT 8.5.2.2.
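
For anyone who wants to measure their own card in a comparable way, here is a minimal sketch that only builds a `katago benchmark` command line. It assumes the `-model`, `-config`, `-v` (visits) and `-t` (comma-separated thread counts) options as I recall them from KataGo's benchmark help, so check `katago benchmark -help` on your version first. The `.bin.gz` file name, the config name `default_gtp.cfg` and the helper `katago_benchmark_cmd` are hypothetical placeholders, not something from my setup.

```python
import shlex


def katago_benchmark_cmd(model, config, visits, threads, exe="katago"):
    """Build the argument list for one `katago benchmark` run.

    The flag names are assumptions taken from KataGo's benchmark help text;
    verify them against your installed version before relying on them.
    """
    return [
        exe, "benchmark",
        "-model", model,
        "-config", config,
        "-v", str(visits),                        # visits per benchmark position
        "-t", ",".join(str(t) for t in threads),  # thread counts to try
    ]


# Hypothetical file names; replace with your own model and config paths.
cmd = katago_benchmark_cmd(
    model="kata1-b18c384nbt-s6386600960-d3368371862.bin.gz",
    config="default_gtp.cfg",
    visits=100_000,
    threads=[40, 64, 80],
)

# Print a copy-pasteable shell command instead of running it.
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without KataGo installed.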

Note that the combination of Nvidia library versions matters: a good combination can be 5.3x faster than a bad one.

Speed   Hardware

6500    RTX 4070 TensorRT, 140,000 visits, 80 threads (recommended)

4000    RTX 4070 CUDA, 100,000 visits, 64 threads (recommended)

3000    2x RTX 2080 Ti, CUDA, b40, 64GB, 100,000 visits, 1s: 40 threads (recommended) ~= 2800 visits/s, 80 threads ~= 3000 visits/s

2200    RTX 4070 OpenCL, 100,000 visits, 40 threads (recommended)

0580    5700XT, b40, 12GB, 16 threads

0300    iPad_Pro/M1

0200    iPhone 13 pro, b40

0170    iPad/A12X
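
To make the gap between the libraries easier to see, here is a small sketch that turns the three RTX 4070 rows above into speedup factors relative to OpenCL. It uses only the rounded numbers from the table. The other rows are left out on purpose, because they use a different model (b40) or may be in playouts/s, so they are not like-for-like. The function name `relative_speeds` is just an illustrative choice.

```python
# Rounded visits/s from the table above (RTX 4070, 18-block model only).
speeds = {
    "RTX 4070 TensorRT": 6500,
    "RTX 4070 CUDA": 4000,
    "RTX 4070 OpenCL": 2200,
}


def relative_speeds(speeds, baseline):
    """Return each entry's speed as a multiple of the baseline entry."""
    base = speeds[baseline]
    return {name: value / base for name, value in speeds.items()}


rel = relative_speeds(speeds, baseline="RTX 4070 OpenCL")

# Print fastest first, e.g. "2.95x  RTX 4070 TensorRT".
for name, factor in sorted(rel.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{factor:.2f}x  {name}")
```

On these rounded figures TensorRT comes out at roughly 3x OpenCL and CUDA at roughly 1.8x.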

Where do your GTX 1000, RTX 3000, RTX 4000, RTX laptop cards and Macs fit?
