Tag: production
7 articles tagged with “production”
How to Build a Production-Ready AI System (Azure OpenAI + AI Search — Real Architecture)
Azure OpenAI + AI Search + embeddings — real-world architecture for production AI systems, including legacy data, orchestration, hybrid retrieval, cost control, and failure modes.
Multi-Agent Architecture Patterns in Production
Orchestrator, supervisor, and swarm patterns for multi-agent systems with real trade-offs and failure modes.
Building Reliable AI Agents with Semantic Kernel
Plugin architecture, memory, planners, and error handling patterns for building production AI agents in .NET with Semantic Kernel.
Small Language Models in Production
When and how to use small language models like Phi, Gemma, and Mistral in production — quantization, deployment patterns, and latency-cost trade-offs.
The AI Gateway Pattern: Why Every Production LLM Needs One
API Management, rate limiting, semantic caching, and cost control with Azure APIM as an AI Gateway.
Integrating Azure OpenAI with ASP.NET Core: A Production Guide
SDK setup, retry policies, streaming responses, and structured outputs for Azure OpenAI in .NET production applications.
Designing RAG Systems That Actually Scale
Chunking strategies, embedding pipelines, retrieval patterns, and when RAG breaks down in production systems.
