Here are some rounded KataGo speeds in visits/s on my RTX 4070 with different backends (OpenCL, CUDA, TensorRT), compared with other people's older hardware (some of those figures may be playouts/s rather than visits/s). My setup:
- RTX 4070 (Asus TUF 12G, Quiet mode, 200W TDP, 100% power target)
- Ryzen 7700 (8C, 16T)
- 64 GB DDR5-RAM JEDEC
- Nvidia Studio driver 531.61
- KataGo 1_13_0 (OpenCL, CUDA) or 1_13_1 (TensorRT)
- 18-Block-Model = kata1-b18c384nbt-s6386600960-d3368371862
- default values for RAM cache = 3GB and time = 5s
- Nvidia libraries CUDA_11_6_2 + CUDNN_8_9_1_23 + TensorRT_8_5_2_2.
Note that the choice of Nvidia library versions matters: a good versus a bad combination can make a 5.3x difference in speed.
Speed (visits/s) | Hardware and settings
6500 RTX 4070 TensorRT, 140,000 visits, 80 threads (recommended)
4000 RTX 4070 CUDA, 100,000 visits, 64 threads (recommended)
3000 2 * RTX 2080 Ti, CUDA, b40, 64GB, 100,000 visits, 1s: 40 threads (recommended) ~= 2800 visits/s, 80 threads ~= 3000 visits/s
2200 RTX 4070 OpenCL, 100,000 visits, 40 threads (recommended)
0580 5700XT, b40, 12GB, 16 threads
0300 iPad Pro/M1
0200 iPhone 13 Pro, b40
0170 iPad/A12X
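
To make the gaps easier to read, here is a small Python sketch that turns the rounded speeds above into ratios against a chosen baseline. The dictionary keys and the `relative_speeds` helper are just names I made up for illustration. Keep in mind that the rows are not like-for-like (b18 vs b40 models, different visit counts and thread settings), so the ratios are only a rough guide.

```python
# Rounded speeds in visits/s, copied from the table above.
# Rows use different models and settings, so ratios are only indicative.
speeds = {
    "RTX 4070 TensorRT": 6500,
    "RTX 4070 CUDA": 4000,
    "2 x RTX 2080 Ti CUDA (b40)": 3000,
    "RTX 4070 OpenCL": 2200,
    "5700XT (b40)": 580,
    "iPad Pro/M1": 300,
    "iPhone 13 Pro (b40)": 200,
    "iPad/A12X": 170,
}


def relative_speeds(table, baseline):
    """Return each entry's speed as a multiple of the baseline entry."""
    base = table[baseline]
    return {name: round(value / base, 2) for name, value in table.items()}


ratios = relative_speeds(speeds, "RTX 4070 OpenCL")

# Print fastest first.
for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{ratio:5.2f}x  {name}")
```

With OpenCL on the 4070 as the baseline, TensorRT comes out at roughly 3x and CUDA at roughly 1.8x on the same card. Swap the baseline name to compare against your own hardware.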
Where do your GTX 1000, RTX 3000, RTX 4000, RTX laptop cards and Macs fit?