AI/LLM Ecosystem Directory — Repository Ratings

Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for the evaluation of LLMs. It focuses on evaluating foundation models and aims to explore the technical boundaries of generative AI.

★ 647 · ◇ 76 · MIT · 7mo ago

aimock

CopilotKit/aimock

6.3

Mock everything your AI app talks to — LLM APIs, MCP, A2A, vector DBs, search. One package, one port, zero dependencies.

★ 637 · ◇ 44 · TypeScript · MIT · 1d ago

Awesome-LLM-in-Social-Science

ValueByte-AI/Awesome-LLM-in-Social-Science

5.1

Awesome papers involving LLMs in Social Science.

★ 633 · ◇ 49 · MIT · 20d ago

LLMTornado

lofcz/LLMTornado

6.5

The .NET library to build AI agents with 30+ built-in connectors.

★ 621 · ◇ 106 · C# · MIT · 1d ago

daydreams

daydreamsai/daydreams

5.9

Daydreams is a set of tools for building agents for commerce

★ 608 · ◇ 133 · TypeScript · MIT · 3mo ago

fastapi-ml-skeleton

eightBEC/fastapi-ml-skeleton

4.5

Production-ready FastAPI skeleton app for serving machine learning models.

★ 604 · ◇ 91 · Python · Apache-2.0 · 5mo ago

agent-skills-eval

darkrishabh/agent-skills-eval

5.3

A test runner for agentskills.io-style AI agent skills

★ 603 · ◇ 30 · TypeScript · MIT · 4d ago

yalm

andrewkchan/yalm

3.7

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

★ 590 · ◇ 64 · C++ · 9mo ago

ICLR2025-Papers-with-Code

yinizhilian/ICLR2025-Papers-with-Code

3.3

A collection of ICLR papers and open-source projects from past years, covering ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.

★ 587 · ◇ 33 · 1y ago

LLM-FineTuning-Large-Language-Models

rohan-paul/LLM-FineTuning-Large-Language-Models

3.6

LLM (Large Language Model) FineTuning

★ 576 · ◇ 140 · Jupyter Notebook · 1y ago

awesome-evals

benchflow-ai/awesome-evals

A curated, non-BS library of the best resources for building and evaluating AI agents — papers, blogs, talks, tools, benchmarks. Maintained by BenchFlow.

★ 576 · ◇ 42 · NOASSERTION · 1d ago

gitagent

open-gitagent/gitagent

5.8

A framework-agnostic, git-native standard for defining AI agents

★ 573 · ◇ 113 · TypeScript · MIT · today

iFixAi

ifixai-ai/iFixAi

6.2

The open-source diagnostic for AI misalignment. 32 tests across fabrication, manipulation, deception, unpredictability, and opacity. Provider-agnostic. Runs against OpenAI, Anthropic, Bedrock, Azure, Gemini, and more. Letter grade in under 5 minutes, content-addressed manifest for bit-identical replay. Built by iMe.

★ 572 · ◇ 114 · Python · Apache-2.0 · today

langtest

Pacific-AI-Corp/langtest

5.8

Deliver safe & effective language models

★ 562 · ◇ 50 · Python · Apache-2.0 · 2mo ago

langtest

PacificAI/langtest

5.8

Deliver safe & effective language models

★ 562 · ◇ 50 · Python · Apache-2.0 · 2mo ago

KuiperLLama

zjhellofss/KuiperLLama

4.0

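A good project for campus recruiting (autumn and spring hiring rounds) and internships: build from scratch, hands-on, an LLM inference framework that supports LLama2/3 and Qwen2.5.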

★ 547 · ◇ 143 · C++ · 8mo ago

pinferencia

underneathall/pinferencia

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

★ 543 · ◇ 83 · Python · Apache-2.0 · 3y ago

continuous-eval

relari-ai/continuous-eval

Data-Driven Evaluation for LLM-Powered Applications

★ 517 · ◇ 38 · Python · Apache-2.0 · 1y ago

Athena-Public

winstonkoh87/Athena-Public

5.9

The Linux OS for AI Agents — Persistent memory, autonomy, and time-awareness for any LLM. Own the state. Rent the intelligence.

LLM Frameworks

★ 512 · ◇ 69 · Python · MIT · 8d ago

LLM-VM

anarchy-ai/LLM-VM

irresponsible innovation. Try now at https://chat.dev/

★ 491 · ◇ 137 · Python · MIT · 2y ago

openinfer

openinfer-project/openinfer

6.0

Pure Rust + CUDA LLM inference engine — no PyTorch, OpenAI-compatible, serves Qwen3 to Kimi-K2

★ 488 · ◇ 70 · Rust · Apache-2.0 · today

agency

operand/agency

5.0

A fast and minimal framework for building agentic systems

★ 486 · ◇ 28 · Python · MIT · 18d ago

ome

ome-projects/ome

6.1

Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton

★ 472 · ◇ 83 · Go · Apache-2.0 · today

Finetune_LLMs

mallorbc/Finetune_LLMs

3.8

Repo for fine-tuning Causal LLMs

★ 465 · ◇ 86 · Python · AGPL-3.0 · 2y ago

fakecloud

faiscadev/fakecloud

5.7

Free, open-source AWS emulator. LocalStack alternative: 26 services, 1,924 operations, 100% conformance. No account, no auth token, no paid tier.

★ 456 · ◇ 31 · Rust · AGPL-3.0 · today

awsome-distributed-training

awslabs/awsome-distributed-training

5.6

Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.

★ 451 · ◇ 196 · Shell · MIT-0 · today

agentsilex

howl-anderson/agentsilex

A transparent, minimal, and hackable agent framework. ~300 lines of readable code. Full control, no magic.

★ 451 · ◇ 45 · Python · MIT · 5mo ago

JetStream

AI-Hypercomputer/JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

★ 448 · ◇ 66 · Python · Apache-2.0 · 5mo ago

Aquila2

FlagAI-Open/Aquila2

3.6

The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.

★ 446 · ◇ 32 · Python · 1y ago

xFasterTransformer

intel/xFasterTransformer

xFasterTransformer — open-source AI/LLM project.

★ 436 · ◇ 75 · C++ · Apache-2.0 · 9mo ago

awesome-on-policy-distillation

chrisliu298/awesome-on-policy-distillation

4.2

A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models

★ 425 · ◇ 12 · CC0-1.0 · 1d ago

gpu-rest-engine

NVIDIA/gpu-rest-engine

3.7

A REST API for Caffe using Docker and Go

★ 422 · ◇ 93 · C++ · BSD-3-Clause · 7y ago

InternEvo

InternLM/InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

★ 420 · ◇ 67 · Python · Apache-2.0 · 10mo ago

Awesome-LLM-Prompt-Optimization

jxzhangjhu/Awesome-LLM-Prompt-Optimization

Awesome-LLM-Prompt-Optimization: a curated list of advanced prompt optimization and tuning methods in Large Language Models

Prompt Engineering

★ 412 · ◇ 23 · 3d ago

tiger

tigerlab-ai/tiger

Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)

★ 403 · ◇ 27 · Jupyter Notebook · Apache-2.0 · 2y ago

awesome-azure-openai-llm

kimtth/awesome-azure-openai-llm

4.6

A curated collection of resources for 🌌 Azure OpenAI, 🦙 LLMs (RAG, Agents).

★ 402 · ◇ 58 · Python · 25d ago

tessera

zengxiao-he/tessera

From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

★ 394 · ◇ 4 · Python · NOASSERTION · 24d ago

stable-diffusion-deploy

Lightning-Universe/stable-diffusion-deploy

4.6

Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services working together via the Lightning Apps framework.

★ 391 · ◇ 39 · Python · Apache-2.0 · 2y ago

LightRFT

opendilab/LightRFT

5.1

LightRFT: Light, Efficient, Omni-modal & Reward-model Driven Reinforcement Fine-Tuning Framework

★ 388 · ◇ 11 · Python · Apache-2.0 · 2mo ago

rhesis

rhesis-ai/rhesis

5.5

The testing platform for AI teams. Bring engineers, PMs, and domain experts together to generate tests, simulate (adversarial) conversations, and trace every failure to its root cause.

★ 373 · ◇ 26 · Python · NOASSERTION · today

APOLLO

zhuhanqing/APOLLO

4.2

APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention

★ 364 · ◇ 19 · Python · NOASSERTION · 7mo ago

Dulus

KevRojo/Dulus

5.3

Open-source autonomous AI agent: runs on Claude-web, Gemini-web, Kimi-web, Deepseek-web and more for free, and on every paid model via liteLLM. No API key required.

★ 360 · ◇ 27 · Python · GPL-3.0 · 5d ago

llm-leaderboard

JonathanChavezTamales/llm-leaderboard

A comprehensive set of LLM benchmark scores and provider prices. (deprecated, read more in README)