STACKQUADRANT

xlite-dev/Awesome-LLM-Inference

Inference Engines

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Overall Score: 6.6
GitHub Metrics
Stars: 5.1k
Forks: 361
Open Issues: 2
Watchers: 132
Contributors: 36
Weekly Commits: 0
Language: Python
License: GPL-3.0
Last Commit: Apr 9, 2026
Created: Aug 27, 2023
Latest Release: v2.6.20
Release Date: Jun 17, 2025
Synced: Apr 16, 2026
Quality Scores

Documentation Quality (weight 20%): 5.5
No dedicated docs site. Description: 127 chars. Stars signal: 5,144. Contributors: 36. Score: 5.5/10

Community Health (weight 20%): 7.2
Stars: 5,144. Contributors: 36. Watchers: 132. Forks: 361. Issue ratio: 0.0%. Score: 7.2/10

Maintenance Velocity (weight 15%): 5.7
Last commit: 7d ago. Weekly commits: 0. Latest release: v2.6.20. Maturity bonus: 2.6y old. Score: 5.7/10

API Design & DX (weight 20%): 7.0
Stars/issues ratio: 2572. Dynamic language: Python. No dedicated API docs. License: GPL-3.0. Popularity signal: 5,144 stars. Score: 7.0/10

Production Readiness (weight 15%): 7.2
Battle-tested: 5,144 stars. Peer review: 36 contributors. Versioned: v2.6.20. Licensed: GPL-3.0. Age: 2.6 years. Maintenance: last commit 7d ago. Score: 7.2/10

Ecosystem Integration (weight 10%): 7.2
Fork interest: 361. Major ecosystem: Python. License: GPL-3.0. Adoption: 5,144 stars. Score: 7.2/10
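The six category weights above sum to 100%, and the overall 6.6 is consistent with a simple weighted average of the category scores. A minimal sketch of that calculation, using the weights and scores from this page (the site's exact aggregation formula is an assumption):

```python
# Category scores and weights as listed on this page.
# Assumption: the overall score is the plain weighted average of these six.
scores = {
    "Documentation Quality": (5.5, 0.20),
    "Community Health": (7.2, 0.20),
    "Maintenance Velocity": (5.7, 0.15),
    "API Design & DX": (7.0, 0.20),
    "Production Readiness": (7.2, 0.15),
    "Ecosystem Integration": (7.2, 0.10),
}

# Weighted sum: 1.10 + 1.44 + 0.855 + 1.40 + 1.08 + 0.72 = 6.595
overall = sum(score * weight for score, weight in scores.values())
print(round(overall, 1))  # 6.6
```

Rounded to one decimal, this reproduces the 6.6 overall score shown at the top of the card.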

Tags
awesome-llm, deepseek, deepseek-r1, deepseek-v3, flash-attention, flash-attention-3, flash-mla, llm-inference, minimax-01, mla
Radar
Radar chart of the six quality scores: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration.