AI Wisdom

Data & Fine-tuning

Dataset management, annotation tools, fine-tuning frameworks, and experiment tracking.

4 Graduated · 5 Incubating · 1 Sandbox · 10 total

Hugging Face Hub

Graduated
5/5

The GitHub of AI: models, datasets, and Spaces in one platform

Essential infrastructure for the AI ecosystem. 500K+ models, 100K+ datasets, and Spaces for demos. Git-based versioning, model cards, and community features. The first place to look for any model.

Weights & Biases

Graduated
5/5

ML experiment tracking, model registry, and dataset versioning

Industry standard for experiment tracking. Log metrics, visualise runs, compare experiments, and manage model lifecycle. Integrates with every major ML framework. Essential for any ML team.

Unsloth

Incubating
5/5

2× faster LLM fine-tuning with 70% less memory

Game-changer for fine-tuning. Custom Triton kernels make QLoRA 2× faster while using 70% less VRAM. Fine-tune Llama 70B on a single 48GB GPU. Best tool for democratising fine-tuning.

Open Source
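The single-48GB-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. A sketch, stdlib only; the 0.5B adapter-parameter figure is an illustrative assumption, and real training also needs activations, optimiser state, and quantisation-block overhead, which Unsloth's kernels keep small:

```python
# Rough QLoRA memory estimate: quantised base weights + fp16 LoRA adapters.
# Illustrative only; excludes activations and optimiser state.

def qlora_weight_memory_gb(n_params: float, bits_per_weight: float = 4) -> float:
    """Approximate GB needed just to hold the model weights."""
    return n_params * bits_per_weight / 8 / 1e9

base = qlora_weight_memory_gb(70e9)                            # Llama 70B at 4-bit
adapters = qlora_weight_memory_gb(0.5e9, bits_per_weight=16)   # assumed ~0.5B LoRA params in fp16
print(f"base weights ~{base:.0f} GB + adapters ~{adapters:.1f} GB")  # ~36 GB, under 48 GB
```

At 4 bits per weight the 70B base model alone is roughly 35 GB, which is why it fits on one 48GB card with room left for adapters and activations.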

Axolotl

Incubating
4/5

Easy-to-use fine-tuning toolkit supporting YAML-based configuration

Most popular fine-tuning toolkit. YAML config makes it easy to start. Supports LoRA, QLoRA, full fine-tuning, DPO, and RLHF. Good for teams without deep ML engineering expertise.

Open Source
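The YAML-first approach means a whole QLoRA run is one file. A minimal sketch following Axolotl's documented config schema; the model, dataset path, and hyperparameters are illustrative, and exact key names can vary between versions:

```yaml
base_model: NousResearch/Llama-2-7b-hf
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05

datasets:
  - path: teknium/GPT4-LLM-Cleaned   # any Hub dataset in a supported format
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
num_epochs: 3
learning_rate: 0.0002
output_dir: ./outputs/qlora-llama2
```

A config like this is typically launched with `accelerate launch -m axolotl.cli.train config.yml`, with no training code to write.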

Scale AI

Graduated
4/5

Enterprise data labelling platform with human-in-the-loop

Leading enterprise data annotation platform. Managed labelling workforce, quality control, and RLHF services. Used by major AI labs. Best for teams needing high-volume, high-quality labels.

Proprietary

Label Studio

Incubating
4/5

Open-source data labelling for text, images, audio, and video

Most flexible open-source annotation tool. Supports every modality: text, image, audio, video, HTML, and time series. Customisable labelling interfaces. Self-hostable with ML-assisted pre-labelling.

Open Source
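The customisable interfaces are built from Label Studio's XML-like tagging language. A minimal sketch for a text span-labelling (NER-style) project; the label names are illustrative:

```xml
<View>
  <!-- $text is filled from the "text" field of each imported task -->
  <Text name="text" value="$text"/>
  <Labels name="label" toName="text">
    <Label value="Person" background="green"/>
    <Label value="Organisation" background="blue"/>
  </Labels>
</View>
```

Swapping `<Text>` for `<Image>` or `<Audio>` tags is how the same tool covers other modalities.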

Argilla

Incubating
4/5

Open-source platform for human feedback and RLHF data curation

Best open-source tool for collecting human feedback. Purpose-built for RLHF/DPO workflows. Tight Hugging Face Hub integration for publishing datasets. Essential for alignment data.

Open Source

PEFT / LoRA

Graduated
4/5

Hugging Face's parameter-efficient fine-tuning library

Essential library for parameter-efficient fine-tuning. LoRA, QLoRA, IA³, and adapters. Reduces trainable params by 99% while maintaining quality. Core dependency of every fine-tuning toolkit.

Open Source
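The 99% reduction falls straight out of LoRA's low-rank factorisation: instead of updating a full d_out × d_in matrix W, it trains B (d_out × r) and A (r × d_in) and adds (α/r)·BA to W. A stdlib-only sketch of the arithmetic, with dimensions illustrative of a 7B-class attention projection:

```python
# Trainable parameters: full fine-tuning updates d_in * d_out weights;
# LoRA trains only the two low-rank factors, r * (d_in + d_out).

def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one LoRA-adapted weight matrix."""
    return r * (d_in + d_out)

d_in = d_out = 4096              # typical projection size in a 7B model
full = d_in * d_out              # 16,777,216
lora = lora_trainable_params(d_in, d_out, r=8)   # 65,536
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")  # ratio: 0.39%
```

At rank 8 that is about 0.4% of the original parameters per matrix, hence the >99% savings.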

DVC

Incubating
3/5

Git-based data and model version control for ML projects

Git for data. Track large datasets and models alongside code. Pipeline DAGs for reproducible experiments. Works with any storage backend. Essential for MLOps teams needing data lineage.

Open Source
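The pipeline DAGs live in a `dvc.yaml` file that declares each stage's command, dependencies, and outputs. A minimal sketch; the scripts and paths are hypothetical:

```yaml
stages:
  prepare:
    cmd: python prepare.py data/raw data/clean
    deps:
      - prepare.py
      - data/raw
    outs:
      - data/clean
  train:
    cmd: python train.py data/clean models/model.pkl
    deps:
      - train.py
      - data/clean
    params:
      - train.learning_rate   # read from params.yaml
    outs:
      - models/model.pkl
```

`dvc repro` then re-runs only the stages whose dependencies changed, which is what makes experiments reproducible and cacheable.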

LitGPT

Sandbox
3/5

Lightning AI toolkit for pretraining, fine-tuning, and deploying LLMs

Clean, hackable LLM training code from Lightning AI. Supports 20+ model architectures. Good for researchers wanting to understand and modify training pipelines. Less abstraction than Axolotl.

Open Source