mirror of
https://github.com/NVIDIA/cutlass.git
synced 2026-03-21 20:34:34 +01:00
Resources
Matthew Nicely edited this page 2022-12-11 08:20:57 -05:00
We described the structure of an efficient GEMM in our talk at the GPU Technology Conference 2018. That talk and other CUTLASS presentations are listed below:
- CUTLASS: Software Primitives for Dense Linear Algebra at All Levels and Scales within CUDA
- Developing CUDA Kernels to Push Tensor Cores to the Absolute Limit on NVIDIA A100
- Accelerating Convolution with Tensor Cores in CUTLASS
- Accelerating Backward Data Gradient by Increasing Tensor Core Utilization in CUTLASS
- CUTLASS: Python API, Enhancements, and NVIDIA Hopper