Tag: caching
2 articles tagged with “caching”
models
Token Economics: Understanding and Optimizing LLM Costs
A practical guide to understanding token pricing, measuring real costs, and implementing optimization strategies: caching, prompt compression, and model routing.
10 min
ai architecture
The AI Gateway Pattern: Why Every Production LLM Needs One
API Management, rate limiting, semantic caching, and cost control with Azure APIM as an AI Gateway.
10 min