deepeval
confident-ai/deepeval
7.8
Evaluation & Testing
★ 14.8k · ◇ 1.4k · Python · Apache-2.0 · today
Ragas
explodinggradients/ragas
7.7
Evaluation & Testing
★ 13.4k · ◇ 1.4k · Python · Apache-2.0 · 1mo ago
garak
NVIDIA/garak
7.3
Evaluation & Testing
★ 7.5k · ◇ 884 · HTML · Apache-2.0 · today
chinese-llm-benchmark
jeinlee1991/chinese-llm-benchmark
6.3
Evaluation & Testing
★ 5.9k · ◇ 237 · 1d ago
LLM-Engineers-Handbook
PacktPublishing/LLM-Engineers-Handbook
6.7
Evaluation & Testing
★ 4.9k · ◇ 1.2k · Python · MIT · 1mo ago
agenta
Agenta-AI/agenta
7.8
Evaluation & Testing
★ 4.0k · ◇ 510 · TypeScript · NOASSERTION · today
lmms-eval
EvolvingLMMs-Lab/lmms-eval
7.5
Evaluation & Testing
★ 4.0k · ◇ 561 · Python · NOASSERTION · 1d ago
AI-Infra-Guard
Tencent/AI-Infra-Guard
7.3
Evaluation & Testing
★ 3.5k · ◇ 346 · Python · Apache-2.0 · today
trulens
truera/trulens
7.3
Evaluation & Testing
★ 3.2k · ◇ 263 · Python · MIT · today
lmnr
lmnr-ai/lmnr
6.9
Evaluation & Testing
★ 2.8k · ◇ 191 · TypeScript · Apache-2.0 · today
aisheets
huggingface/aisheets
6.2
Evaluation & Testing
★ 1.6k · ◇ 138 · TypeScript · Apache-2.0 · 7d ago
FuzzyAI
cyberark/FuzzyAI
5.6
Evaluation & Testing
★ 1.3k · ◇ 188 · Jupyter Notebook · Apache-2.0 · 2mo ago
prompty
microsoft/prompty
6.8
Evaluation & Testing
★ 1.2k · ◇ 114 · TypeScript · MIT · today
uqlm
cvs-health/uqlm
6.6
Evaluation & Testing
★ 1.1k · ◇ 119 · Python · Apache-2.0 · today
judgeval
JudgmentLabs/judgeval
6.7
Evaluation & Testing
★ 1.0k · ◇ 90 · Python · Apache-2.0 · 3d ago
scenario
langwatch/scenario
5.9
Evaluation & Testing
★ 835 · ◇ 58 · TypeScript · MIT · today
Awesome-LLM-Eval
onejune2018/Awesome-LLM-Eval
5.0
Evaluation & Testing
★ 631 · ◇ 58 · MIT · 4mo ago
Awesome-LLM-in-Social-Science
ValueByte-AI/Awesome-LLM-in-Social-Science
5.0
Evaluation & Testing
★ 610 · ◇ 46 · MIT · 1mo ago
langtest
PacificAI/langtest
6.1
Evaluation & Testing
★ 555 · ◇ 49 · Python · Apache-2.0 · 1d ago
langtest
Pacific-AI-Corp/langtest
6.1
Evaluation & Testing
★ 555 · ◇ 49 · Python · Apache-2.0 · 1d ago
continuous-eval
relari-ai/continuous-eval
4.7
Evaluation & Testing
★ 516 · ◇ 38 · Python · Apache-2.0 · 1y ago
aimock
CopilotKit/aimock
5.8
Evaluation & Testing
★ 456 · ◇ 24 · TypeScript · MIT · today
llm-leaderboard
JonathanChavezTamales/llm-leaderboard
4.8
Evaluation & Testing
★ 361 · ◇ 40 · JavaScript · NOASSERTION · 5mo ago
palico-ai
palico-ai/palico-ai
4.5
Evaluation & Testing
★ 342 · ◇ 28 · TypeScript · MIT · 1y ago
rhesis
rhesis-ai/rhesis
5.4
Evaluation & Testing
★ 312 · ◇ 24 · Python · NOASSERTION · today
llms-tools
PetroIvaniuk/llms-tools
4.7
Evaluation & Testing
★ 306 · ◇ 40 · Apache-2.0 · 1mo ago
athina-evals
athina-ai/athina-evals
4.1
Evaluation & Testing
★ 299 · ◇ 21 · Python · 10mo ago
flutter-skill
ai-dashboad/flutter-skill
5.1
Evaluation & Testing
★ 195 · ◇ 24 · Dart · MIT · 1d ago
qaskills
PramodDutta/qaskills
4.0
Evaluation & Testing
★ 104 · ◇ 4 · TypeScript · 2d ago