Flash Attention (forward pass only) in 200 lines of CUDA.
Updated Feb 23, 2025 · CUDA
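To make the entry above concrete, here is a minimal sketch of what a forward-only Flash Attention kernel in CUDA can look like: a tiled loop over K/V blocks with an online softmax, so the full attention matrix is never materialized. This is not the repository's code; the kernel name, the constants, and the simplifying assumptions (fp32, single head, fixed head dimension, sizes divisible by the tile width and block size) are all illustrative.

```cuda
// Minimal sketch of a FlashAttention-style forward pass in plain CUDA.
// Assumptions (not taken from the repository above): fp32 tensors, a single
// head, HEAD_DIM fixed at 64, seq_len divisible by BLOCK_N and by blockDim.x.
// One thread owns one query row; K/V tiles are staged in shared memory and
// the softmax is computed online (running max m, running denominator l).
#include <cuda_runtime.h>
#include <math.h>

#define BLOCK_N  64   // keys/values processed per tile
#define HEAD_DIM 64   // fixed head dimension for this sketch

__global__ void flash_attn_fwd(const float* Q, const float* K, const float* V,
                               float* O, int seq_len, float scale) {
    int q = blockIdx.x * blockDim.x + threadIdx.x;   // global query row index

    __shared__ float Ks[BLOCK_N][HEAD_DIM];
    __shared__ float Vs[BLOCK_N][HEAD_DIM];

    float q_reg[HEAD_DIM];
    for (int d = 0; d < HEAD_DIM; ++d) q_reg[d] = Q[q * HEAD_DIM + d];

    float m = -INFINITY;            // running row maximum of the scores
    float l = 0.f;                  // running softmax denominator
    float acc[HEAD_DIM] = {0.f};    // running weighted sum of V rows

    for (int start = 0; start < seq_len; start += BLOCK_N) {
        // Cooperatively stage the current K/V tile in shared memory.
        for (int i = threadIdx.x; i < BLOCK_N * HEAD_DIM; i += blockDim.x) {
            int r = i / HEAD_DIM, c = i % HEAD_DIM;
            Ks[r][c] = K[(start + r) * HEAD_DIM + c];
            Vs[r][c] = V[(start + r) * HEAD_DIM + c];
        }
        __syncthreads();

        // Online-softmax update over the tile: rescale old state, add new terms.
        for (int j = 0; j < BLOCK_N; ++j) {
            float s = 0.f;
            for (int d = 0; d < HEAD_DIM; ++d) s += q_reg[d] * Ks[j][d];
            s *= scale;
            float m_new = fmaxf(m, s);
            float corr  = expf(m - m_new);   // rescales previously accumulated terms
            float p     = expf(s - m_new);
            l = l * corr + p;
            for (int d = 0; d < HEAD_DIM; ++d)
                acc[d] = acc[d] * corr + p * Vs[j][d];
            m = m_new;
        }
        __syncthreads();
    }

    // Final normalization; no intermediate seq_len x seq_len score matrix
    // was ever materialized.
    for (int d = 0; d < HEAD_DIM; ++d) O[q * HEAD_DIM + d] = acc[d] / l;
}

// Example launch (hypothetical):
// flash_attn_fwd<<<seq_len / 128, 128>>>(dQ, dK, dV, dO, seq_len,
//                                        1.f / sqrtf((float)HEAD_DIM));
```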
FlashAttention-2 in Triton for sliding window attention (fwd + bwd pass)
This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques such as FP8 quantization and prefetching.
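For the sliding-window case mentioned above, the main change to such a kernel is which keys each query is allowed to see. Below is a small sketch (kept in CUDA for consistency with the kernel above, rather than Triton) of the index arithmetic a causal sliding window introduces; the window size `W`, the tile size, and the helper names are illustrative assumptions, not the repository's API.

```cuda
// Sketch of the index arithmetic behind causal sliding-window attention,
// assuming query q may attend only to keys k with q - W + 1 <= k <= q.
// Restricting the tile loop to [first_key_tile, last_key_tile] is what turns
// the quadratic key loop into one proportional to the window size W.
#include <cuda_runtime.h>

__host__ __device__ inline int window_lo(int q, int W) {
    int lo = q - W + 1;
    return lo > 0 ? lo : 0;                         // first visible key
}
__host__ __device__ inline int first_key_tile(int q, int W, int block_n) {
    return (window_lo(q, W) / block_n) * block_n;   // tile containing window_lo
}
__host__ __device__ inline int last_key_tile(int q, int block_n) {
    return (q / block_n) * block_n;                 // causal: no tiles after q
}
// Inside the tile loop, keys at the window edges still need a per-element
// mask: a score for key k contributes only if window_lo(q, W) <= k && k <= q;
// otherwise it is treated as -inf, i.e. skipped before the softmax update.
```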