STACKQUADRANT

NVIDIA-NeMo/Curator

Fine-tuning Tools

Scalable data pre-processing and curation toolkit for LLMs

Overall score: 6.2/10
GitHub Metrics
Stars: 1.5k
Forks: 256
Open Issues: 219
Watchers: 20
Contributors: 61
Weekly Commits: 0
Language: Python
License: Apache-2.0
Last Commit: Apr 16, 2026
Created: Mar 14, 2024
Latest Release: v1.1.0
Release Date: Feb 23, 2026
Synced: Apr 16, 2026
Quality Scores
Documentation Quality (weight: 20%)
4.9

No dedicated docs site. Description: 58 chars. Stars signal: 1,530. Contributors: 61. Score: 4.9/10

Community Health (weight: 20%)
5.7

Stars: 1,530. Contributors: 61. Watchers: 20. Forks: 256. Issue ratio: 14.3%. Score: 5.7/10

Maintenance Velocity (weight: 15%)
7.5

Last commit: 0d ago. Weekly commits: 0. Latest release: v1.1.0. Maturity bonus: 2.1y old. Score: 7.5/10

API Design & DX (weight: 20%)
5.6

Stars/issues ratio: 7. Dynamic language: Python. No dedicated API docs. Permissive license: Apache-2.0. Popularity signal: 1,530 stars. Score: 5.6/10
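The issue ratio and stars/issues ratio quoted in the score notes appear to be derived from the star and open-issue counts listed under GitHub Metrics. A minimal sketch, assuming the issue ratio is open issues divided by stars and the stars/issues ratio is its inverse. The page does not state either formula, and the function names are illustrative:

```python
def issue_ratio_percent(open_issues: int, stars: int) -> float:
    """Open issues as a percentage of stars (assumed definition)."""
    return round(open_issues / stars * 100, 1)


def stars_per_issue(stars: int, open_issues: int) -> int:
    """Stars per open issue, rounded to a whole number (assumed definition)."""
    return round(stars / open_issues)


# Counts taken from the page: 1,530 stars and 219 open issues.
STARS = 1530
OPEN_ISSUES = 219

issue_ratio = issue_ratio_percent(OPEN_ISSUES, STARS)
star_issue_ratio = stars_per_issue(STARS, OPEN_ISSUES)

print(f"Issue ratio: {issue_ratio}%")
print(f"Stars/issues ratio: {star_issue_ratio}")
```

Under these assumptions the results agree with the 14.3% and 7 shown in the score notes.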

Production Readiness (weight: 15%)
7.1

Battle-tested: 1,530 stars. Peer review: 61 contributors. Versioned: v1.1.0. Licensed: Apache-2.0. Age: 2.1 years. Maintenance: last commit 0d ago. Score: 7.1/10

Ecosystem Integration (weight: 10%)
7.5

Fork interest: 256. Major ecosystem: Python. Integration-friendly: Apache-2.0. Adoption: 1,530 stars. Score: 7.5/10
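The headline 6.2 is consistent with a weighted average of the six category scores using the weights listed beside each category. A minimal sketch, assuming the overall score is computed this way. The page does not state the formula, and the names below are illustrative:

```python
# (score, weight in percent) for each category, as listed on the page.
CATEGORIES = {
    "Documentation Quality": (4.9, 20),
    "Community Health": (5.7, 20),
    "Maintenance Velocity": (7.5, 15),
    "API Design & DX": (5.6, 20),
    "Production Readiness": (7.1, 15),
    "Ecosystem Integration": (7.5, 10),
}


def weighted_overall(categories: dict[str, tuple[float, int]]) -> float:
    """Weighted average of category scores (assumed scoring formula)."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


overall = weighted_overall(CATEGORIES)
print(f"Overall: {overall:.1f}/10")
```

The weights sum to 100%, so dividing by the total weight changes nothing here; it only keeps the function correct if a category is added or removed.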

Tags
data, data-curation, data-prep, data-preparation, data-processing, data-processing-pipelines, data-quality, datacuration, datarecipes, deduplication
Radar chart axes: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration