├── LICENSE
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 ZincCat

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Awesome-Triton-Kernels
A collection of kernels written in the Triton language (there still don't seem to be many). Contributions are welcome! For a taste of the language, see the minimal kernel sketch after the resource links below.

[Main Repo by OpenAI](https://github.com/openai/triton)

[Official Tutorials](https://triton-lang.org/main/getting-started/tutorials/index.html)

[Awesome resources from cuda-mode](https://github.com/cuda-mode/resource-stream), and their [guide](https://www.youtube.com/watch?v=DdTsX6DQk24&ab_channel=CUDAMODE) to Triton

[Triton Kernel collection by cuda-mode](https://github.com/cuda-mode/triton-index)

[Puzzles by Sasha Rush](https://github.com/srush/Triton-Puzzles)
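To give a first taste of the language, here is a minimal vector-addition kernel in the style of the official Triton tutorials (a sketch for orientation, not a tuned implementation):

```python
import torch
import triton
import triton.language as tl


@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)


def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x and y must be CUDA tensors of the same shape.
    out = torch.empty_like(x)
    n = out.numel()
    # Launch one program per BLOCK_SIZE-element chunk.
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Most of the repositories below build on these same primitives (masked `tl.load`/`tl.store`, `tl.program_id`, tuned block sizes), just at far larger scale.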
## General Operators
[attorch](https://github.com/BobMcDear/attorch), a subset of PyTorch's `nn` module

[FlagGems](https://github.com/FlagOpen/FlagGems)

[Kernels by PyTorch Labs](https://github.com/pytorch-labs/applied-ai)

[scattermoe: Sparse Mixture-of-Experts](https://github.com/shawntan/scattermoe)

[Fused MoE from vLLM](https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/fused_moe/fused_moe.py)

## Transformer
[Liger Kernel: Efficient Triton Kernels for LLM Training](https://github.com/linkedin/Liger-Kernel)

[Flash Linear Attention](https://github.com/sustcsonglin/flash-linear-attention)

[FLASHNN for LLM Serving](https://github.com/AlibabaPAI/FLASHNN)

[Kernels by Kernl](https://github.com/ELS-RD/kernl)

[Kernels by Unsloth](https://github.com/unslothai/unsloth)

[GPTQ by fpgaminer](https://github.com/fpgaminer/GPTQ-triton)

[GPTQ on the PyTorch blog](https://pytorch.org/blog/accelerating-triton/)

[FlagAttention, memory-efficient attention kernels](https://github.com/FlagOpen/FlagAttention)

## Activations
[Activation functions by dogukantai](https://github.com/dogukantai/triton-activations)

## Matrix Operations
[Sparse Toolkit: Block-sparse matrix multiplication](https://github.com/stanford-futuredata/stk) ([paper](https://openreview.net/forum?id=doa11nN5vG))

[GemLite: Fused low-bit matrix multiplication](https://github.com/mobiusml/gemlite)

## Communication
[symm-mem-recipes](https://github.com/yifuwang/symm-mem-recipes)

[tccl](https://github.com/cchan/tccl)

## Quantization
[Quantization kernels by bitsandbytes](https://github.com/bitsandbytes-foundation/bitsandbytes/tree/main/bitsandbytes/triton)

## Special Operations
[EquiTriton for equivariant NN by IntelLabs](https://github.com/IntelLabs/EquiTriton)

## Benchmark
[TritonBench by PyTorch](https://github.com/pytorch-labs/tritonbench)

## Integrations
[JAX-Triton](https://github.com/jax-ml/jax-triton)

## Others
[Triton-distributed](https://github.com/ByteDance-Seed/Triton-distributed)
--------------------------------------------------------------------------------