Mirror of https://github.com/NVIDIA/cutlass.git (synced 2026-05-06 11:51:00 +02:00)
Resources
Matthew Nicely edited this page 2022-12-11 08:20:57 -05:00
We described the structure of an efficient GEMM in our talk at the GPU Technology Conference 2018. That talk and related CUTLASS presentations:
- CUTLASS: Software Primitives for Dense Linear Algebra at All Levels and Scales within CUDA
- Developing CUDA Kernels to Push Tensor Cores to the Absolute Limit on NVIDIA A100
- Accelerating Convolution with Tensor Cores in CUTLASS
- Accelerating Backward Data Gradient by Increasing Tensor Core Utilization in CUTLASS
- CUTLASS: Python API, Enhancements, and NVIDIA Hopper