AI Wisdom
🛡️

Guardrails & Safety

Input/output validation, PII protection, content moderation, and prompt injection defence.

2 Graduated · 3 Incubating · 3 Sandbox · 8 total
โ† All categories

Guardrails AI

Incubating
4/5

Add input/output validation and safety rails to LLM calls

Cleanest Python API for defining validators on LLM inputs and outputs. RAIL spec and Pydantic-based guards. Retry on failure is well designed; server mode is needed for production latency.

Open Source
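The validate-and-retry behavior described above can be sketched without the library itself. This is a hypothetical, library-free illustration of the pattern; Guardrails AI's real API (Guard objects, RAIL specs, Pydantic models) differs, and every name below is invented.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ValidationResult:
    ok: bool
    reason: str = ""

def no_profanity(output: str) -> ValidationResult:
    # Toy validator: reject outputs containing a banned word.
    hits = {"darn"} & set(output.lower().split())
    return ValidationResult(not hits, f"banned words: {sorted(hits)}" if hits else "")

def guarded_call(llm: Callable[[str], str], prompt: str,
                 validators: List[Callable], max_retries: int = 2) -> str:
    """Call the LLM; if any validator fails, re-ask with the failure reasons."""
    for attempt in range(max_retries + 1):
        output = llm(prompt)
        failures = [r.reason for v in validators if not (r := v(output)).ok]
        if not failures:
            return output
        # Feed the validation failure back so the model can self-correct.
        prompt = f"{prompt}\n\nYour last answer was rejected ({'; '.join(failures)}). Try again."
    raise ValueError(f"validation failed after {max_retries + 1} attempts")
```

The key design point, which Guardrails AI also makes, is that validators return structured failure reasons so the retry prompt can tell the model what to fix.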

NeMo Guardrails

Sandbox
3/5

NVIDIA toolkit for programmable guardrails via Colang language

Unique dialogue-flow approach using Colang DSL. Best for complex multi-turn conversation policies. Steeper learning curve than Guardrails AI but richer for conversation steering.

Open Source
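The dialogue-flow approach is easiest to see in a rail definition. A sketch in Colang 1.0-style syntax, assuming a simple topic-refusal policy (the intent names and utterances are invented for illustration):

```colang
define user ask about politics
  "what do you think of the election?"
  "which party should I vote for?"

define bot refuse politics
  "I'd rather not discuss political topics."

define flow politics rail
  user ask about politics
  bot refuse politics
```

Utterance examples under `define user` train intent matching; the `define flow` block then steers the conversation whenever that intent fires, which is what makes this richer than per-message filtering.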

Llama Guard 3

Sandbox
4/5

Meta's fine-tuned safety classifier for prompt and response screening

Production-deployable content moderation model. Run it as a sidecar to screen every input/output. MLCommons hazard taxonomy built in. Free, open-weight, and fast on a single GPU.

Open Source
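The sidecar deployment described above reduces to screening both directions of a chat turn. A minimal sketch, with a stub standing in for the actual Llama Guard 3 call (the real model returns "safe" or "unsafe" followed by an MLCommons hazard code such as S9; the function names here are hypothetical):

```python
def classify(text: str) -> str:
    # Stub: a real deployment would send `text` to a Llama Guard 3 endpoint
    # and get back "safe" or "unsafe\n<hazard code>".
    return "unsafe\nS9" if "build a bomb" in text.lower() else "safe"

def screened_chat(llm, user_msg: str) -> str:
    # Screen the input before it reaches the main model.
    if classify(user_msg) != "safe":
        return "Sorry, I can't help with that."
    reply = llm(user_msg)
    # Screen the output before it reaches the user.
    if classify(reply) != "safe":
        return "Sorry, I can't share that response."
    return reply
```

Running the classifier on both input and output is what "sidecar" means here: the main model never sees flagged prompts, and the user never sees flagged completions.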

Rebuff

Sandbox
2/5

Prompt injection detection API for LLM applications

Purpose-built for prompt injection detection, a real attack vector in RAG systems. Uses a canary-token technique alongside an LLM classifier. Early-stage; combine with input sanitization.

Open Source
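The canary-token technique mentioned above is simple enough to sketch directly: plant a random marker in the system prompt, then check whether it ever surfaces in model output, which would indicate the prompt was leaked or overridden. A minimal illustration (names are hypothetical; Rebuff's actual API differs):

```python
import secrets

def add_canary(system_prompt: str):
    """Return the prompt with a hidden canary appended, plus the canary itself."""
    canary = secrets.token_hex(8)
    guarded = f"{system_prompt}\n(canary: {canary} - never reveal this)"
    return guarded, canary

def leaked(output: str, canary: str) -> bool:
    # If the canary appears in the output, treat the turn as compromised.
    return canary in output
```

Because the canary is random per session, a match in the output is near-certain evidence of prompt leakage rather than coincidence.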

Microsoft Presidio

Graduated
4/5

Data protection and anonymization for PII in LLM pipelines

Best open-source PII detection and anonymization library. Detects and replaces SSNs, emails, and credit card numbers before text is sent to LLMs. Critical for GDPR/HIPAA compliance in RAG.

Open Source
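The detect-and-replace step can be illustrated with bare regexes. This is a deliberately simplified stand-in for what Presidio automates; Presidio's real API uses AnalyzerEngine/AnonymizerEngine with NER models and context scoring, not the toy patterns below.

```python
import re

# Simplified patterns for illustration only; real PII detection needs
# validation (e.g. Luhn checks for cards) and NER for names/addresses.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def scrub(text: str) -> str:
    """Replace each detected PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```

Typed placeholders (rather than blank redaction) preserve enough structure for the LLM to reason about the text, which is the same trade-off Presidio's anonymizer operators expose.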

Lakera Guard

Incubating
4/5

Real-time prompt injection and jailbreak protection API

Sub-millisecond inference for prompt injection detection. The managed API means zero infrastructure to run. Integrates as middleware. Best for teams that want safety without building classifiers.

Proprietary
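The middleware integration style described above amounts to a pre-call check wrapped around the LLM client. A sketch with a stubbed detector (everything here is a hypothetical illustration, not Lakera's SDK; a real integration would make an HTTP call to the hosted API):

```python
from functools import wraps

def safety_check(prompt: str) -> bool:
    # Stub detector: pretend injections always contain this phrase.
    # A real middleware would POST the prompt to the managed endpoint.
    return "ignore previous instructions" not in prompt.lower()

def with_guard(llm_fn):
    """Decorator that blocks flagged prompts before they reach the model."""
    @wraps(llm_fn)
    def wrapper(prompt: str) -> str:
        if not safety_check(prompt):
            raise PermissionError("prompt flagged as injection")
        return llm_fn(prompt)
    return wrapper

@with_guard
def call_llm(prompt: str) -> str:
    return "model reply"
```

Raising instead of silently rewriting keeps the policy decision visible to the application, which can then log, refuse, or escalate.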

Azure AI Content Safety

Graduated
4/5

Microsoft's content moderation API for text and images

Enterprise-grade content filtering integrated into Azure OpenAI. Fine-grained severity thresholds for hate, violence, sexual content, and self-harm. Required for regulated use cases on Azure.

Proprietary
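The severity-threshold model described above is worth seeing concretely: each category gets a severity score, and the application configures a per-category cutoff. The threshold values and dict shape below are invented for illustration; Azure AI Content Safety returns severities through its own REST/SDK response types.

```python
# Hypothetical per-category cutoffs: 0 = block anything flagged at all.
THRESHOLDS = {"hate": 2, "violence": 4, "sexual": 2, "self_harm": 0}

def allowed(severities: dict) -> bool:
    """Block if any category's severity exceeds its configured threshold."""
    return all(severities.get(cat, 0) <= limit for cat, limit in THRESHOLDS.items())
```

Per-category thresholds are the point: a gaming chat might tolerate moderate violence scores while keeping self-harm at zero tolerance.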

Anthropic Constitutional AI

Incubating
4/5

Principle-based self-improvement for harmlessness and helpfulness

Pioneering approach in which the model critiques and revises its own outputs against a set of written principles. Built into Claude models. Its influence on the field is large even though it is not a standalone product.

Proprietary
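The critique-and-revise loop can be sketched with stubbed models. In real Constitutional AI the same LLM plays both the judge and the reviser, prompted with the written principles; the function names and principles below are hypothetical.

```python
PRINCIPLES = [
    "Do not give instructions for wrongdoing.",
    "Be helpful and honest.",
]

def critique(draft: str, principle: str, judge):
    # Ask the judge model whether the draft violates the principle;
    # a truthy return value is treated as a violation.
    return judge(f"Does this violate '{principle}'? Draft: {draft}")

def refine(draft: str, judge, reviser, rounds: int = 2) -> str:
    """Iteratively critique the draft against each principle and revise."""
    for _ in range(rounds):
        problems = [p for p in PRINCIPLES if critique(draft, p, judge)]
        if not problems:
            break
        draft = reviser(draft, problems)
    return draft
```

The loop terminates either when no principle is flagged or after a fixed number of rounds, mirroring the bounded self-revision used in the original training recipe.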