CUTLASS requires a C++11 host compiler and performs best when built with the CUDA 11.8 Toolkit.
It is also compatible with CUDA 11.x.
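
The requirements above can be checked at compile time. The following is a minimal sketch, not part of CUTLASS: the macro `HOST_CXX_STANDARD` and the helpers `is_compatible_toolkit` and `is_preferred_toolkit` are hypothetical names introduced here for illustration, and the compatibility rule simply takes the "CUDA 11.x" statement at face value. The `__CUDACC_VER_MAJOR__` and `__CUDACC_VER_MINOR__` macros are defined by nvcc, so that part of the check is only active when the file is compiled as CUDA.

```cpp
// Pick the macro that reports the active C++ standard. MSVC keeps
// __cplusplus at 199711L unless /Zc:__cplusplus is passed, so _MSVC_LANG
// is preferred when the compiler defines it (Visual Studio 2015 Update 3
// and later; older MSVC releases would need a different check).
#if defined(_MSVC_LANG)
#define HOST_CXX_STANDARD _MSVC_LANG
#else
#define HOST_CXX_STANDARD __cplusplus
#endif

static_assert(HOST_CXX_STANDARD >= 201103L,
              "a C++11 (or newer) host compiler is required");

// Hypothetical helper: true for any CUDA 11.x toolkit (assumption based on
// the compatibility statement in this section).
constexpr bool is_compatible_toolkit(int major, int /*minor*/) {
    return major == 11;
}

// Hypothetical helper: true only for the CUDA 11.8 Toolkit, which this
// section names as the best-performing choice.
constexpr bool is_preferred_toolkit(int major, int minor) {
    return major == 11 && minor == 8;
}

// Active only under nvcc, which defines these version macros.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
static_assert(is_compatible_toolkit(__CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__),
              "this sketch assumes a CUDA 11.x toolkit");
#endif
```

Because every check is a `static_assert`, an unsupported configuration fails at build time rather than at run time.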
Operating Systems
We have tested the following environments.
| Operating System | Compiler |
|------------------|----------|
| Windows 10 | Microsoft Visual Studio 2015 |
| | Microsoft Visual Studio 2017 |
| | Microsoft Visual Studio 2019 |
| Ubuntu 18.04 | GCC 7.5.0 |
| Ubuntu 20.04 | GCC 10.3.0 |
| Ubuntu 22.04 | GCC 11.2.0 |
Additionally, CUTLASS may be built with clang.
See these instructions for more details.
CUTLASS runs successfully on the following NVIDIA GPUs, and it is expected to be efficient on
any NVIDIA GPU based on the Volta, Turing, or NVIDIA Ampere architecture.
| GPU | CUDA Compute Capability | Minimum CUDA Toolkit | Minimum CUDA Toolkit Enabling Native Tensor Cores |
|-----|-------------------------|----------------------|---------------------------------------------------|
| NVIDIA Tesla V100 | 7.0 | 9.2 | 10.1 |
| NVIDIA TitanV | 7.0 | 9.2 | 10.1 |
| NVIDIA GeForce RTX 2080 TI, 2080, 2070 | 7.5 | 10.0 | 10.2 |
| NVIDIA Tesla T4 | 7.5 | 10.0 | 10.2 |
| NVIDIA A100 | 8.0 | 11.0 | 11.0 |
| NVIDIA A10 | 8.6 | 11.1 | 11.1 |
| NVIDIA GeForce 3090 | 8.6 | 11.1 | 11.1 |
| NVIDIA H100 PCIe | 9.0 | 11.8 | Double-precision: 11.8 |
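
The table above can be encoded as a small lookup keyed by compute capability. The following is a minimal sketch under stated assumptions, not a CUTLASS API: `ToolkitRequirement`, `kRequirements`, `min_toolkit_for`, and `toolkit_supports` are hypothetical names, and versions are encoded as `major * 10 + minor` purely for convenience. It is written with C++11-style `constexpr` functions so it fits the host compiler requirement stated earlier.

```cpp
#include <cstddef>

// One row per compute capability listed in the table above.
// Versions are encoded as major * 10 + minor (7.5 -> 75, 10.2 -> 102).
struct ToolkitRequirement {
    int compute_capability;        // e.g. 80 for 8.0
    int min_toolkit;               // minimum CUDA Toolkit
    int min_toolkit_tensor_cores;  // minimum toolkit enabling native Tensor Cores
};

constexpr ToolkitRequirement kRequirements[] = {
    {70,  92, 101},  // Tesla V100, TitanV
    {75, 100, 102},  // GeForce RTX 2080 TI / 2080 / 2070, Tesla T4
    {80, 110, 110},  // A100
    {86, 111, 111},  // A10, GeForce 3090
    {90, 118, 118},  // H100 PCIe; the table lists "Double-precision: 11.8"
};

constexpr std::size_t kRequirementCount =
    sizeof(kRequirements) / sizeof(kRequirements[0]);

// Returns the minimum toolkit for a compute capability, or -1 if the
// capability does not appear in the table. Recursion keeps this valid
// as a C++11 constexpr function.
constexpr int min_toolkit_for(int cc, std::size_t i = 0) {
    return i == kRequirementCount ? -1
         : kRequirements[i].compute_capability == cc ? kRequirements[i].min_toolkit
         : min_toolkit_for(cc, i + 1);
}

// True when the capability is listed and the toolkit meets its minimum.
constexpr bool toolkit_supports(int cc, int toolkit) {
    return min_toolkit_for(cc) != -1 && toolkit >= min_toolkit_for(cc);
}
```

For example, `toolkit_supports(75, 118)` evaluates to `true`, while `toolkit_supports(90, 110)` evaluates to `false`. On a real system, the compute capability could be read from the `major` and `minor` fields of `cudaDeviceProp` as filled in by `cudaGetDeviceProperties`.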