AI Engineer

Eugene Yan - Building Blocks for LLM Systems & Products

What are some building blocks for integrating LLMs into production systems and customer-facing products? In this talk, we'll discuss evals, RAG, guardrails, and collecting feedback.

“There is a large class of problems that are easy to imagine and build demos for, but extremely hard to make products out of. For example, self-driving: It’s easy to demo a car self-driving around a block, but making it into a product takes a decade.” - Karpathy

This talk is about practical patterns for integrating large language models (LLMs) into systems and products. We'll draw from academic research, industry resources, and practitioner know-how, and distill them into key ideas and practices. There are seven key patterns. I've organized them along two axes: improving performance vs. reducing cost/risk, and closer to the data vs. closer to the user.

  • Evals: To measure performance
  • RAG: To add recent, external knowledge
  • Fine-tuning: To get better at specific tasks
  • Caching: To reduce latency & cost (see the sketch after this list)
  • Guardrails: To ensure output quality
  • Defensive UX: To anticipate & manage errors gracefully
  • Collect user feedback: To build our data flywheel
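
As a minimal illustration of the caching pattern above, here is a sketch of an exact-match cache keyed on a hash of the prompt. This is an assumption-laden toy, not the talk's implementation: `call_llm` is a hypothetical placeholder for whatever model client you actually use, and real systems often use semantic (embedding-based) caching rather than exact string matching.

```python
import hashlib

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for a real model call; swap in your provider's client.
    return f"(model response to: {prompt})"

# Exact-match cache: identical prompts are served from memory instead of the model.
_cache: dict[str, str] = {}

def cached_llm(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)  # cache miss: pay latency & cost once
    return _cache[key]                  # cache hit: no API call, near-zero latency

if __name__ == "__main__":
    print(cached_llm("What is RAG?"))  # miss -> calls the model
    print(cached_llm("What is RAG?"))  # hit  -> returned from cache
```

The same wrapper shape generalizes: the key could be an embedding of the prompt (semantic cache), and the stored value could include metadata such as timestamps for expiry.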

Eugene Yan

Eugene Yan designs, builds, and operates machine learning systems that serve customers at scale. He's currently a Senior Applied Scientist at Amazon. Previously, he led machine learning at Lazada (acquired by Alibaba) and at a healthtech Series A startup. He writes & speaks about ML systems, engineering, and careers at eugeneyan.com and ApplyingML.com.
