AI Wisdom

Data & Fine-tuning

Dataset management, annotation tools, fine-tuning frameworks, and experiment tracking.

4 Graduated · 5 Incubating · 1 Sandbox · 10 total

Hugging Face Hub

Graduated
5/5

The GitHub of AI: models, datasets, and Spaces in one platform

Essential infrastructure for the AI ecosystem. 500K+ models, 100K+ datasets, and Spaces for demos. Git-based versioning, model cards, and community features. The first place to look for any model.

Weights & Biases

Graduated
5/5

ML experiment tracking, model registry, and dataset versioning

Industry standard for experiment tracking. Log metrics, visualise runs, compare experiments, and manage model lifecycle. Integrates with every major ML framework. Essential for any ML team.

Unsloth

Incubating
5/5

2× faster LLM fine-tuning with 70% less memory

Game-changer for fine-tuning. Custom Triton kernels make QLoRA 2× faster while using 70% less VRAM. Fine-tune Llama 70B on a single 48GB GPU. Best tool for democratising fine-tuning.

Open Source
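The single-48GB-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. A sketch, stdlib only; the 0.5B adapter-parameter figure is an illustrative assumption, and real training also needs activations, optimiser state, and quantisation-block overhead, which Unsloth's kernels keep small:

```python
# Rough QLoRA memory estimate: quantised base weights + fp16 LoRA adapters.
# Illustrative only; excludes activations and optimiser state.

def qlora_weight_memory_gb(n_params: float, bits_per_weight: float = 4) -> float:
    """Approximate GB needed just to hold the model weights."""
    return n_params * bits_per_weight / 8 / 1e9

base = qlora_weight_memory_gb(70e9)                            # Llama 70B at 4-bit
adapters = qlora_weight_memory_gb(0.5e9, bits_per_weight=16)   # assumed ~0.5B LoRA params in fp16
print(f"base weights ~{base:.0f} GB + adapters ~{adapters:.1f} GB")  # ~36 GB, under 48 GB
```

At 4 bits per weight the 70B base model alone is roughly 35 GB, which is why it fits on one 48GB card with room left for adapters and activations.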

Axolotl

Incubating
4/5

Easy-to-use fine-tuning toolkit supporting YAML-based configuration

Most popular fine-tuning toolkit. YAML config makes it easy to start. Supports LoRA, QLoRA, full fine-tuning, DPO, and RLHF. Good for teams without deep ML engineering expertise.

Open Source
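The YAML-first approach means a whole QLoRA run is one file. A minimal sketch following Axolotl's documented config schema; the model, dataset path, and hyperparameters are illustrative, and exact key names can vary between versions:

```yaml
base_model: NousResearch/Llama-2-7b-hf
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05

datasets:
  - path: teknium/GPT4-LLM-Cleaned   # any Hub dataset in a supported format
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
num_epochs: 3
learning_rate: 0.0002
output_dir: ./outputs/qlora-llama2
```

A config like this is typically launched with `accelerate launch -m axolotl.cli.train config.yml`, with no training code to write.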

Scale AI

Graduated
4/5

Enterprise data labelling platform with human-in-the-loop

Leading enterprise data annotation platform. Managed labelling workforce, quality control, and RLHF services. Used by major AI labs. Best for teams needing high-volume, high-quality labels.

Proprietary

Label Studio

Incubating
4/5

Open-source data labelling for text, images, audio, and video

Most flexible open-source annotation tool. Supports every modality: text, image, audio, video, HTML, and time series. Customisable labelling interfaces. Self-hostable with ML-assisted pre-labelling.

Open Source
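The customisable interfaces are built from Label Studio's XML-like tagging language. A minimal sketch for a text span-labelling (NER-style) project; the label names are illustrative:

```xml
<View>
  <!-- $text is filled from the "text" field of each imported task -->
  <Text name="text" value="$text"/>
  <Labels name="label" toName="text">
    <Label value="Person" background="green"/>
    <Label value="Organisation" background="blue"/>
  </Labels>
</View>
```

Swapping `<Text>` for `<Image>` or `<Audio>` tags is how the same tool covers other modalities.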

Argilla

Incubating
4/5

Open-source platform for human feedback and RLHF data curation

Best open-source tool for collecting human feedback. Purpose-built for RLHF/DPO workflows. Tight Hugging Face Hub integration for publishing datasets. Essential for alignment data.

Open Source

PEFT / LoRA

Graduated
4/5

Hugging Face's parameter-efficient fine-tuning library

Essential library for parameter-efficient fine-tuning. LoRA, QLoRA, IA³, and adapters. Reduces trainable params by 99% while maintaining quality. Core dependency of every fine-tuning toolkit.

Open Source
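The 99% reduction falls straight out of LoRA's low-rank factorisation: instead of updating a full d_out × d_in matrix W, it trains B (d_out × r) and A (r × d_in) and adds (α/r)·BA to W. A stdlib-only sketch of the arithmetic, with dimensions illustrative of a 7B-class attention projection:

```python
# Trainable parameters: full fine-tuning updates d_in * d_out weights;
# LoRA trains only the two low-rank factors, r * (d_in + d_out).

def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one LoRA-adapted weight matrix."""
    return r * (d_in + d_out)

d_in = d_out = 4096              # typical projection size in a 7B model
full = d_in * d_out              # 16,777,216
lora = lora_trainable_params(d_in, d_out, r=8)   # 65,536
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")  # ratio: 0.39%
```

At rank 8 that is about 0.4% of the original parameters per matrix, hence the >99% savings.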

DVC

Incubating
3/5

Git-based data and model version control for ML projects

Git for data. Track large datasets and models alongside code. Pipeline DAGs for reproducible experiments. Works with any storage backend. Essential for MLOps teams needing data lineage.

Open Source
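The pipeline DAGs live in a `dvc.yaml` file that declares each stage's command, dependencies, and outputs. A minimal sketch; the scripts and paths are hypothetical:

```yaml
stages:
  prepare:
    cmd: python prepare.py data/raw data/clean
    deps:
      - prepare.py
      - data/raw
    outs:
      - data/clean
  train:
    cmd: python train.py data/clean models/model.pkl
    deps:
      - train.py
      - data/clean
    params:
      - train.learning_rate   # read from params.yaml
    outs:
      - models/model.pkl
```

`dvc repro` then re-runs only the stages whose dependencies changed, which is what makes experiments reproducible and cacheable.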

LitGPT

Sandbox
3/5

Lightning AI toolkit for pretraining, fine-tuning, and deploying LLMs

Clean, hackable LLM training code from Lightning AI. Supports 20+ model architectures. Good for researchers wanting to understand and modify training pipelines. Less abstraction than Axolotl.

Open Source