The Reliable AI: Bridging System Engineering and Large Language Models

Large Language Models are powerful but unpredictable. For engineers building AI-powered systems, that unpredictability introduces real risks: hallucinations, runaway inference costs, outages, privacy breaches, security vulnerabilities, and legal liability.

This talk shows how to treat LLMs as unreliable components inside reliable systems. By applying principles from system reliability engineering, you can design architectures that contain the uncertainty of AI instead of letting it leak into production.

Key Takeaways

  • How to wrap unreliable AI models in reliable system architecture
  • Practical patterns for managing cost, reliability, and failure modes in AI systems
  • How to detect hallucinations and model drift with semantic monitoring
  • Security techniques to prevent prompt injection and data leakage

Who Is This For?

  • Software engineers building AI-powered products
  • SREs and platform engineers responsible for reliability
  • Architects integrating AI services into production systems

Level

Practitioner to advanced

What This Session Covers

  • Categorizing AI risks
  • Architectural patterns to mitigate those risks (sandboxes, gateways, fallbacks, rate limiters, compartmentalization, AI firewalls, circuit breakers, ...)
  • Cost control patterns for AI workloads
  • Semantic monitoring and evaluation pipelines
  • Security patterns for prompt injection and data leakage
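As a taste of the patterns above, here is a minimal circuit-breaker sketch around a model call. All names are illustrative, not part of any specific library: after a few consecutive failures the breaker fails fast instead of hammering a degraded endpoint, then allows a trial call after a cool-down.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: stop calling a failing model endpoint
    and fail fast until a cool-down period has passed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # Circuit is open: fail fast, don't touch the endpoint.
                raise RuntimeError("circuit open: model endpoint unavailable")
            # Cool-down elapsed: half-open, allow one trial call.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # any success resets the count
        return result
```

A production version would add per-endpoint state, jittered timeouts, and metrics, but the state machine (closed / open / half-open) is the whole idea.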

What It’s Not

  • Not a prompt engineering talk
  • Not about building LLM models from scratch
  • Not a theoretical overview of AI

Full Description

Large Language Models are incredibly powerful but fundamentally unpredictable. For engineers tasked with integrating AI into production, this stochastic nature introduces significant risks: spiraling inference costs, hallucinations, security vulnerabilities in AI-generated code, privacy breaches, and unpredictable vendor outages.

Drawing on principles from System Reliability Engineering, this talk explores how to bridge conventional infrastructure practices with modern AI systems. The key idea is not better prompting, but better architecture. By wrapping unpredictable AI components in deterministic system patterns, we can build systems that remain reliable even when the models themselves are not.
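One sketch of what "wrapping" means in practice, with hypothetical function names (`call_llm`, `canned_reply` stand in for any model client and any deterministic backstop): the surrounding system decides what happens on failure, so a vendor outage degrades gracefully instead of propagating.

```python
def with_fallback(primary, fallback):
    """Wrap an unreliable call in a deterministic fallback: if the
    primary (e.g. an LLM call) raises or returns an empty answer,
    answer from the fallback instead."""
    def wrapped(prompt):
        try:
            answer = primary(prompt)
            if answer:  # treat empty output as a failure too
                return answer
        except Exception:
            pass  # a real system would also log and count the failure
        return fallback(prompt)
    return wrapped


# Hypothetical usage: the model client is down, the wrapper is not.
def call_llm(prompt):
    raise TimeoutError("vendor outage")

def canned_reply(prompt):
    return "Sorry, I can't answer that right now."

ask = with_fallback(call_llm, canned_reply)
```

The fallback need not be a canned string; it could be a cheaper model, a cached answer, or a rules-based path. The point is that the failure policy lives in deterministic code, not in the model.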

This session demonstrates practical engineering patterns for measuring, monitoring, and enforcing reliability in AI-powered applications — without sacrificing the capabilities of the underlying models.
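To make "measuring and monitoring" concrete, here is a deliberately crude grounding check, a sketch only: it flags answer words that never appear in the source document, a cheap proxy for ungrounded (possibly hallucinated) content. Real semantic monitors use embeddings or an evaluator model rather than word overlap; the function name and thresholds are invented for illustration.

```python
def ungrounded_terms(answer, source, min_len=5):
    """Return answer words (length >= min_len) that do not appear in
    the source text. A high count suggests the answer may contain
    content not grounded in its input."""
    def norm(word):
        return word.lower().strip(".,;:!?")

    source_words = {norm(w) for w in source.split()}
    return [w for w in answer.split()
            if len(norm(w)) >= min_len and norm(w) not in source_words]
```

Wired into an evaluation pipeline, a metric like this is sampled on live traffic and alerted on like any other SLI, which is exactly how drift and hallucination regressions get caught before users do.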