The Reliable AI: Bridging System Engineering and Large Language Models
Large Language Models are powerful but fundamentally unpredictable. For engineers building AI-powered systems, that unpredictability introduces real risks: hallucinations, runaway inference costs, outages, privacy breaches, security vulnerabilities, and legal liability.
This talk shows how to treat LLMs as unreliable components inside reliable systems. By applying principles from system reliability engineering, you can design architectures that contain the uncertainty of AI instead of letting it leak into production.
Key takeaways
- How to wrap unreliable AI models in reliable system architecture
- Practical patterns for managing cost, reliability, and failure modes in AI systems
- How to detect hallucinations and model drift with semantic monitoring
- Security techniques to prevent prompt injection and data leakage
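As a taste of the cost-control patterns covered: a sliding-window token budget can reject or defer model calls once spend in the current window hits a cap. This is a minimal sketch with hypothetical names (`TokenBudget`, `try_consume`); a production version would meter actual provider usage and billing, not caller-reported counts.

```python
import time

class TokenBudget:
    """Sliding-window token budget: refuse calls once the cap is reached.
    Simplified sketch; real systems meter actual provider usage."""

    def __init__(self, max_tokens_per_window, window_seconds=60):
        self.max_tokens = max_tokens_per_window
        self.window = window_seconds
        self.events = []  # list of (timestamp, tokens) pairs

    def try_consume(self, tokens, now=None):
        now = time.monotonic() if now is None else now
        # Drop events that have aged out of the window.
        self.events = [(t, n) for (t, n) in self.events if now - t < self.window]
        used = sum(n for _, n in self.events)
        if used + tokens > self.max_tokens:
            return False  # caller should queue, degrade, or reject the request
        self.events.append((now, tokens))
        return True
```

A caller that gets `False` back can fall through to a cheaper model, a cache, or an explicit "try again later" response instead of silently burning budget.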
Who Is This For?
- Software engineers building AI-powered products
- SREs and platform engineers responsible for reliability
- Architects integrating AI services into production systems
Level
Practitioner to advanced
What This Session Covers
- Categorizing AI risks
- Architectural patterns to mitigate those risks (sandboxing, gateways, fallbacks, rate limiting, compartmentalization, AI firewalls, circuit breakers, ...)
- Cost control patterns for AI workloads
- Semantic monitoring and evaluation pipelines
- Security patterns for prompt injection and data leakage
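To illustrate the flavor of these patterns, here is a minimal circuit breaker with a deterministic fallback, wrapping a hypothetical model call. It is a deliberately simplified sketch (no half-open state or reset timer, names invented for this example), not the implementation presented in the session.

```python
class CircuitBreaker:
    """Trips after `threshold` consecutive failures; while open, callers go
    straight to the fallback instead of hitting the unreliable component."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def call(self, primary, fallback):
        if self.failures >= self.threshold:
            return fallback()  # circuit open: skip the model entirely
        try:
            result = primary()
            self.failures = 0  # a success closes the circuit again
            return result
        except Exception:
            self.failures += 1
            return fallback()
```

The fallback might be a cached answer, a smaller local model, or a plain "service degraded" response; the point is that the system's behavior stays bounded even when the model misbehaves.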
What It’s Not
- Not a prompt engineering talk
- Not about building LLM models from scratch
- Not a theoretical overview of AI
Full description
Large Language Models are incredibly powerful, but fundamentally unpredictable. For engineers tasked with integrating AI into production, this stochastic nature introduces significant risks: spiraling inference costs, hallucinations, security vulnerabilities in AI-generated code, privacy breaches, and vendor outages.
Drawing on principles from system reliability engineering, this talk explores how to bridge conventional infrastructure practices with modern AI systems. The key idea is not better prompting, but better architecture. By wrapping unpredictable AI components in deterministic system patterns, we can build systems that remain reliable even when the models themselves are not.
This session demonstrates practical engineering patterns for measuring, monitoring, and enforcing reliability in AI-powered applications — without sacrificing the capabilities of the underlying models.
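One such measuring hook, sketched here with invented names: a crude groundedness score that flags answers whose content barely overlaps with the retrieved context. Real semantic monitoring pipelines use embeddings or NLI models rather than lexical overlap; this stand-in only shows where the check sits.

```python
def grounding_score(answer: str, context: str) -> float:
    """Share of answer words that also appear in the retrieved context.
    Lexical stand-in for an embedding- or NLI-based groundedness check."""
    answer_words = {w.lower().strip(".,!?") for w in answer.split()}
    context_words = {w.lower().strip(".,!?") for w in context.split()}
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

def flag_possible_hallucination(answer: str, context: str, threshold=0.5) -> bool:
    # Low overlap suggests the answer may not be grounded in the context.
    return grounding_score(answer, context) < threshold
```

Scores like this, logged per request, are what make "hallucination rate" a monitorable metric rather than an anecdote.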