STACKQUADRANT

NVIDIA-NeMo/Curator

Fine-tuning Tools

Scalable data pre-processing and curation toolkit for LLMs

Overall score: 6.2
GitHub Metrics
Stars: 1.6k
Forks: 292
Open Issues: 229
Watchers: 18
Contributors: 69
Weekly Commits: 0
Language: Python
License: Apache-2.0
Last Commit: Jun 27, 2026
Created: Mar 14, 2024
Latest Release: v1.2.0
Release Date: May 14, 2026
Synced: Jun 29, 2026
Quality Scores
Documentation Quality (weight: 20%)
5.0

No dedicated docs site. Description: 58 chars. Stars signal: 1,638. Contributors: 69. Score: 5/10

Community Health (weight: 20%)
5.8

Stars: 1,638. Contributors: 69. Watchers: 18. Forks: 292. Issue ratio: 14.0%. Score: 5.8/10

Maintenance Velocity (weight: 15%)
7.5

Last commit: 2d ago. Weekly commits: 0. Latest release: v1.2.0. Maturity bonus: 2.3y old. Score: 7.5/10
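
The "2d ago" and "2.3y old" figures can be reproduced from the dates in the GitHub Metrics card. The sketch below is only an illustration: it assumes both values are measured from the sync date, and that age uses a 365.25-day year rounded to one decimal. The page does not state either rule.

```python
from datetime import date

# Dates copied from the GitHub Metrics card.
created = date(2024, 3, 14)
last_commit = date(2026, 6, 27)
synced = date(2026, 6, 29)

# Assumption: "Nd ago" is counted from the sync date.
days_since_commit = (synced - last_commit).days

# Assumption: age is expressed in 365.25-day years, one decimal place.
age_years = round((synced - created).days / 365.25, 1)

print(f"Last commit: {days_since_commit}d ago")
print(f"Maturity: {age_years}y old")
```

Under these assumptions the result agrees with the values quoted in the score line.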

API Design & DX (weight: 20%)
5.6

Stars/issues ratio: 7. Dynamic language: Python. No dedicated API docs. Permissive license: Apache-2.0. Popularity signal: 1,638 stars. Score: 5.6/10
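
The two ratios quoted on this page (the 14.0% issue ratio under Community Health and the stars/issues ratio of 7 here) are consistent with simple divisions of the star and open-issue counts. The definitions below are assumptions inferred from the numbers; the page does not spell them out.

```python
# Counts taken from the page: 1,638 stars and 229 open issues.
stars = 1638
open_issues = 229

# Assumption: issue ratio = open issues as a percentage of stars.
issue_ratio_pct = round(open_issues / stars * 100, 1)

# Assumption: stars/issues ratio = stars per open issue, rounded to an integer.
stars_per_issue = round(stars / open_issues)

print(f"Issue ratio: {issue_ratio_pct}%")
print(f"Stars/issues ratio: {stars_per_issue}")
```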

Production Readiness (weight: 15%)
7.1

Battle-tested: 1,638 stars. Peer review: 69 contributors. Versioned: v1.2.0. Licensed: Apache-2.0. Age: 2.3 years. Maintenance: last commit 2d ago. Score: 7.1/10

Ecosystem Integration (weight: 10%)
7.6

Fork interest: 292. Major ecosystem: Python. Integration-friendly: Apache-2.0. Adoption: 1,638 stars. Score: 7.6/10
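
The six category weights sum to 100%, so the headline 6.2 near the top of the page is plausibly a weighted average of the category scores. A minimal sketch of that calculation follows, assuming a plain weighted mean rounded to one decimal; the page does not document the actual formula.

```python
# (score, weight) pairs copied from the Quality Scores section.
scores = {
    "Documentation Quality": (5.0, 0.20),
    "Community Health": (5.8, 0.20),
    "Maintenance Velocity": (7.5, 0.15),
    "API Design & DX": (5.6, 0.20),
    "Production Readiness": (7.1, 0.15),
    "Ecosystem Integration": (7.6, 0.10),
}

# Normalise by the weight total so the result stays valid if weights change.
total_weight = sum(weight for _, weight in scores.values())
weighted_sum = sum(score * weight for score, weight in scores.values())

# Assumption: the headline score is this mean rounded to one decimal.
overall = round(weighted_sum / total_weight, 1)

print(f"Overall score: {overall}")
```

The unrounded mean is about 6.23, which rounds to the published 6.2.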

Tags
data, data-curation, data-prep, data-preparation, data-processing, data-processing-pipelines, data-quality, datacuration, datarecipes, deduplication
Radar chart (axes: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration)