# AI Engineer World's Fair 2025

Main website: https://ai.engineer/
Basic llms info: https://ai.engineer/llms.txt
LLms-full.txt including speakers: https://ai.engineer/llms-full.txt
Calendar view schedule: https://ai.engineer/schedule
Get all sessions in JSON for your vibecoded frontend: https://ai.engineer/sessions-speakers-details.json

## Overview

**June 3–5, 2025 • San Francisco**

The AI Engineer World's Fair is the largest technical conference for engineers working in AI today. Returning for its third year, this event is where the leading AI labs, founders, VPs of AI, and engineers gather to share what they're building and what's next.

- ~3,000 attendees: Founders, VPs of AI, AI Engineers
- ~150 launches and talks from top speakers
- ~100 practical workshops and expo sessions
- ~50 top DevTools and employers represented in the Expo

Organized by the team behind the AI Engineer Summit.

**[Buy Tickets](https://ti.to/software-3/ai-engineer-worlds-fair-2025?source={{UTM_SOURCE}}) | [Watch 2023/2024/2025 Talks](https://youtube.com/@aidotengineer) | [Subscribe to Newsletter](https://ai.engineer/newsletter)**

## Schedule

June 3: Workshops + exclusive Speaker Dinner
June 4: MCP, Tiny Teams, LLM RecSys, GraphRAG, Agent Reliability, Infrastructure, AI PM, Voice, AI in Fortune 500, AI Architects
June 5: Reasoning + RL, SWE-Agents, Evals, Retrieval + Search, Security, Generative Media, AI Design, Robotics/Autonomy, AI in Fortune 500 (day 2), AI Architects (Day 2)

### Tuesday, June 3 – Workshop Day + Evening Expo & Reception

- Exclusive hands-on workshops across 5 tracks, instructed by industry-leading companies, founders, and engineers.
- Topics span all levels of experience and specialties in AI Engineering.
- **Evening Welcome Reception** (4:00–7:00pm): Held in the Grand Assembly & Expo Hall. Open to all ticketholders.

### Wednesday & Thursday, June 4–5 – Conference Days

- 18 tracks of talks, panels, and demos.
- Keynotes from the biggest and most consequential labs and companies.
- High-value hallway track and facilitated networking.
- Workshops and exclusive access for "Conference + Workshop Pass" holders.

## Meals and Events

### June 3

- Continental Breakfast. 7:15am - 9:45am. Grand Assembly.
- Registration. 8:00am - 7:00pm. Atrium: Event Hub.
- Rehearsals/Tech Check. 10:00am - 5:00pm. Keynote/General Session (Yerba Buena 7&8)
- Morning Break. 10:15am - 10:40am. Grand Assembly.
- Lunch. 12:00pm - 1:00pm. Grand Assembly.
- Afternoon Break. 3:00pm - 3:30pm. Grand Assembly.
- Welcome Reception. 4:00pm - 7:00pm. Grand Assembly.
- Community Meetups. 5:30pm - 9:00pm. Atrium: Event Hub

### June 4

- Continental Breakfast. 7:15am - 9:55am. Grand Assembly.
- Registration. 7:00am - 7:00pm. Atrium: Event Hub.
- Rehearsals/Tech Check. 7:45am - 8:45am. Keynote/General Session (Yerba Buena 7&8)
- Morning Break. 10:15am - 11:00am. Salons 9-15: Expo Hall.
- Lunch. 12:00pm - 2:00pm. Salons 9-15: Expo Hall.
- Afternoon Break. 3:00pm - 3:45pm. Salons 9-15: Expo Hall.
- The Toolbit Afterparty. 5:15pm - 7:00pm. Salons 9-15: Expo Hall.
- Community Meetups. 7:00pm - 10:40pm. Atrium: Event Hub

### June 5

- Continental Breakfast. 7:15am - 9:55am. Grand Assembly.
- Registration. 7:00am - 3:00pm. Atrium: Event Hub.
- Rehearsals/Tech Check. 7:45am - 8:45am. Keynote/General Session (Yerba Buena 7&8)
- Morning Break. 10:30am - 11:15am. Salons 9-15: Expo Hall.
- Lunch. 12:00pm - 2:00pm. Salons 9-15: Expo Hall.
- Afternoon Break. 3:00pm - 3:45pm. Salons 9-15: Expo Hall.
- Community Meetups. 5:30pm - 9:00pm.
Atrium: Event Hub

## Tracks

======================================================================
--- Track: AI ARCHITECTS (June 4-5) ---
======================================================================

Session ID: 941249
Track: AI Architects
Speaker: Clay Bavor (Cofounder, Sierra)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 11:15 AM
Session Title: Rise of the AI Architect
Description: As the number of consumer-facing AI products grows, the most forward-leaning enterprises have created a new role: the AI Architect. These leaders are responsible for helping define, manage, and evolve their company's AI agent experiences over time. In this session, Clay Bavor (Cofounder of Sierra) will join Alessio Fanelli (co-host of Latent Space) in a fireside chat to share what it means to be an AI Architect, success stories from the market, and the future of the role.

------------------------------------

Session ID: 913965
Track: AI Architects
Speaker: Dani Grant (CEO)
Format: Talk
Room: Salons 2-6: Workshops
Time: 3 Jun 2025 09:55 AM
Session Title: The AI Engineer’s Guide to Raising VC
Description: A no-fluff, all-tactics discussion. More AI engineers should build startups; the world needs more software. But there’s a way to raise VC and it’s hard to do it if you’ve never seen it done. We are going to walk through the exact playbook to raise your first round of funding. We will show you real pitch decks, real cold emails and real term sheets so when you go out to raise your first round of funding, you are set up to do it. Every AI Engineer should be equipped to start their own company and this session makes sure raising $$$ is not going to be the blocker.

------------------------------------

Session ID: 913965
Track: AI Architects
Speaker: Chelcie Taylor (Investor)
Format: Talk
Room: Salons 2-6: Workshops
Time: 3 Jun 2025 09:55 AM
Session Title: The AI Engineer’s Guide to Raising VC
Description: A no-fluff, all-tactics discussion.
More AI engineers should build startups; the world needs more software. But there’s a way to raise VC and it’s hard to do it if you’ve never seen it done. We are going to walk through the exact playbook to raise your first round of funding. We will show you real pitch decks, real cold emails and real term sheets so when you go out to raise your first round of funding, you are set up to do it. Every AI Engineer should be equipped to start their own company and this session makes sure raising $$$ is not going to be the blocker.

------------------------------------

Session ID: 915067
Track: AI Architects
Speaker: Anoop Kotha (Applied AI)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 02:40 PM
Session Title: Building Effective Voice Agents
Description: How to build production voice applications and learnings from working with customers along the way

------------------------------------

Session ID: 914401
Track: AI Architects
Speaker: Rossella Blatt Vital (VP of Engineering - AI)
Format: Talk
Room: Golden Gate Ballroom C: AI in the Fortune 500
Time: 5 Jun 2025 11:15 AM
Session Title: From Hype to Habit: How We’re Building an AI-First SaaS Company—While Still Shipping the Roadmap
Description: What does it really take to move a modern SaaS company from AI experimentation to becoming truly AI-first? At Sprout Social, we’re in the midst of that transformation—rearchitecting strategy, systems, teams, and incentives to put AI at the heart of how we think, build, and deliver value. This is a story in motion: a behind-the-scenes look at how we’re evolving from isolated AI feature experiments to an AI-native operating model. I’ll share what we’re learning as we navigate the innovation dilemma—integrating disruptive AI capabilities without breaking what already works or our roadmap. That includes rethinking how we define success, how we hire, reward, grow talent, and how we handle legal and ethical complexity without slowing down.
We’ll explore the real-world tensions between rapid innovation, value delivery, making progress on Responsible AI, all while elevating internal AI fluency, and engaging with the broader AI ecosystem to stay at the edge. This isn’t a playbook from the finish line—it’s a candid reflection from deep inside the journey. My goal is to help other leaders chart their own AI path with greater clarity, confidence, and care.

------------------------------------

Session ID: 907834
Track: AI Architects
Speaker: Michael Albada (Principal Applied Scientist)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 11:35 AM
Session Title: Building Applications with AI Agents
Description: Generative AI has dramatically shortened the distance between ideas and implementation, enabling faster prototyping and deployment than ever before. But while language models can streamline individual tasks, true transformation comes from combining these capabilities into intelligent, autonomous systems—AI agents. This talk explores how to build and deploy foundation model-enabled agent systems that go beyond simple prompt chaining or chatbots. Drawing from real-world implementations and the latest research, it offers a clear and practical path to designing both single-agent and multi-agent systems capable of handling complex workflows with minimal oversight. Attendees will gain a deeper understanding of the core design principles behind agentic systems, the architectural trade-offs involved in orchestrating multiple agents, and the strategies required to develop tailored solutions that enhance efficiency and innovation. Whether just beginning or scaling up, participants will leave with actionable insights to navigate the rapidly evolving world of AI autonomy.

------------------------------------

Session ID: 933641
Track: AI Architects
Speaker: Kshitij Grover (Co-Founder & CTO, Orb Inc.)
Room: Juniper: Expo Sessions
Time: 4 Jun 2025 01:15 PM
Session Title: Revenue Engineering: How to Price (and Reprice) Your AI Product
Description: You’ve trained the model—now it’s time to train the business. This talk dives into the engineering behind pricing systems that can evolve as fast as your AI stack. Orb CTO Kshitij Grover will walk through how leading AI companies design infrastructure to support experimentation, scale, and real-world monetization constraints. Topics include:
- How to meter usage and map it to pricing with accuracy and auditability
- Factoring in margins and underlying costs when designing pricing strategy
- Handling complexity across motions: self-serve vs. enterprise, pay-as-you-go vs. committed contracts
- How to test pricing changes safely (and roll them back when needed)
Whether you’re bootstrapping a pricing system from scratch or replacing a brittle V1, you’ll leave with architectural patterns and mental models to make pricing a first-class engineering concern.

------------------------------------

Session ID: 914814
Track: AI Architects
Speaker: Ivan Burazin (CEO)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 12:15 PM
Session Title: AX is the only Experience that Matters
Description: If you’re building devtools for humans, you’re building for the past. Already a quarter of Y Combinator’s latest batch used AI to write 95% or more of their code. AI agents are scaling at an exponential rate and soon, they’ll outnumber human developers by orders of magnitude. The real bottleneck isn’t intelligence. It’s tooling. Terminals, local machines, and dashboards weren’t built for agents. They make do… until they can’t. In this talk, I’ll share how we killed the CLI at Daytona, rebuilt our infrastructure from first principles, and what it takes to build devtools that agents can actually use. Because in an agent-native future, if agents can’t use your tool, no one will.
------------------------------------

Session ID: 914401
Track: AI Architects
Speaker: Deepsha Menghani (Director of Engineering – AI)
Format: Talk
Room: Golden Gate Ballroom C: AI in the Fortune 500
Time: 5 Jun 2025 11:15 AM
Session Title: From Hype to Habit: How We’re Building an AI-First SaaS Company—While Still Shipping the Roadmap
Description: What does it really take to move a modern SaaS company from AI experimentation to becoming truly AI-first? At Sprout Social, we’re in the midst of that transformation—rearchitecting strategy, systems, teams, and incentives to put AI at the heart of how we think, build, and deliver value. This is a story in motion: a behind-the-scenes look at how we’re evolving from isolated AI feature experiments to an AI-native operating model. I’ll share what we’re learning as we navigate the innovation dilemma—integrating disruptive AI capabilities without breaking what already works or our roadmap. That includes rethinking how we define success, how we hire, reward, grow talent, and how we handle legal and ethical complexity without slowing down. We’ll explore the real-world tensions between rapid innovation, value delivery, making progress on Responsible AI, all while elevating internal AI fluency, and engaging with the broader AI ecosystem to stay at the edge. This isn’t a playbook from the finish line—it’s a candid reflection from deep inside the journey. My goal is to help other leaders chart their own AI path with greater clarity, confidence, and care.

------------------------------------

Session ID: 933612
Track: AI Architects
Speaker: Antje Barth (Principal Developer Advocate)
Format: Keynote
Room: Keynote/General Session (Yerba Buena 7&8)
Time: 4 Jun 2025 04:05 PM
Session Title: Building Agents at Cloud-Scale
Description: Let's explore practical strategies for building and scaling agents in production.
Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services!

------------------------------------

Session ID: 914798
Track: AI Architects
Speaker: Mark Bissell (Applied Interpretability Research)
Format: Talk
Room: Foothill F: Generative Media
Time: 5 Jun 2025 02:40 PM
Session Title: Why you should care about AI interpretability
Description: The goal of mechanistic interpretability is to reverse engineer neural networks. Having direct, programmable access to the internal neurons of models unlocks new ways for developers and users to interact with AI — from more precise steering to guardrails to novel user interfaces. While interpretability has long been an interesting research topic, it is now finding real-world use cases, making it an important tool for AI engineers.

------------------------------------

Session ID: 915921
Track: AI Architects
Speaker: Ben Hylak (Co-Founder)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 02:20 PM
Session Title: Building AI Products That Actually Work
Description: You've made the demo. How do you make the product? A lot of AI products don't actually work. Even worse, a lot of the techniques being advertised for making AI products better don't work either. We'll cover the challenges + techniques we've seen actually work in the real world.

------------------------------------

Session ID: 936933
Track: AI Architects
Speaker: Nina Lopatina (Lead Developer Advocate)
Format: Workshop
Room: Golden Gate Ballroom B: Workshops
Time: 3 Jun 2025 09:00 AM
Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes
Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure?
Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state-of-the-art accuracy, and generates responses with a multi-layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents.

------------------------------------

Session ID: 933610
Track: AI Architects
Speaker: Mike Chambers (AI/ML Specialist DA AWS)
Format: Talk
Room: Golden Gate Ballroom C: AI in the Fortune 500
Time: 5 Jun 2025 12:15 PM
Session Title: Ship it! Building Production-Ready Agents
Description: Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies for building and orchestrating agent systems at scale. We'll cover critical infrastructure decisions, scalability frameworks, and best practices for creating robust, production-ready agent architectures.

------------------------------------

Session ID: 915921
Track: AI Architects
Speaker: Sid Bendre (Co-Founder)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 02:20 PM
Session Title: Building AI Products That Actually Work
Description: You've made the demo. How do you make the product? A lot of AI products don't actually work. Even worse, a lot of the techniques being advertised for making AI products better don't work either. We'll cover the challenges + techniques we've seen actually work in the real world.
------------------------------------

Session ID: 933621
Track: AI Architects
Speaker: Laurie Voss (VP Developer Relations, LlamaIndex)
Room: Juniper: Expo Sessions
Time: 4 Jun 2025 01:00 PM
Session Title: Effective agent design patterns in production
Description: At LlamaIndex we see a lot of agents built every day, and we've got a sense of what works and what doesn't. We've distilled those learnings down into a series of patterns and best practices for building real-world, production agents, and we're here to share them. You'll learn patterns for applying structure and guidance to famously nondeterministic LLMs and get concrete instruction on how to implement them.

------------------------------------

Session ID: 912033
Track: AI Architects
Speaker: Alvaro Morales (CEO)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 11:15 AM
Session Title: Monetizing AI: From Zero to Profit
Description: As AI continues to transform industries, companies are faced with the critical challenge of effectively monetizing AI-driven products in a way that captures value, ensures customer adoption, and scales revenue sustainably. Unlike traditional SaaS models, AI-powered products have unique complexities - such as fluctuating usage patterns, variable compute costs, and evolving customer demands, making conventional pricing strategies unhelpful to the growth of an AI product-led startup. In this session, Alvaro Morales, CEO and co-founder of Orb, will explore why the often overlooked monetization aspect of AI is critical for businesses. He’ll share real-world examples and data to demonstrate how adaptive pricing models can drive cost savings, enhance customer experience, and reduce operational bottlenecks. Alvaro will lead a live demo, showcasing how engineers can simulate AI pricing strategies and subsequently integrate them with a simple plug-and-play solution.
He’ll also share how real-world revenue simulations enable companies to test and refine pricing before implementing — reducing risk, boosting adoption, and unlocking new revenue streams. As a quick example, cloud software development platform Replit was looking to adopt a usage-based pricing model for a new product, but their existing billing system couldn't support the new model, and building a new billing system would delay the launch timeline. In order to get things done, they turned to Orb, which enabled them to make pricing changes up to the last minute. After the launch, Orb became the single source of truth for both Replit and its customers - providing usage alerts to notify Replit when users hit cost thresholds and provide insights into user spend and payment methods. Key takeaways:
- The challenge of AI monetization – Why traditional subscription-based SaaS pricing models don’t work for AI-powered products.
- Precision pricing – Exploring how usage-based, tiered, and hybrid pricing models can maximize revenue potential.
- Revenue simulation for AI pricing – Leveraging real-time data to test, adjust and optimize pricing strategies.
- Avoiding common pricing pitfalls – Identifying mistakes that can lead to revenue leakage and customer churn.
This session is designed for AI executives, product leaders, and engineering teams looking for actionable strategies to build adaptive, scalable pricing models that drive long-term growth and profitability.

------------------------------------

Session ID: 941249
Track: AI Architects
Speaker: Alessio Fanelli (Co-Host, Latent Space)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 11:15 AM
Session Title: Rise of the AI Architect
Description: As the number of consumer-facing AI products grows, the most forward-leaning enterprises have created a new role: the AI Architect. These leaders are responsible for helping define, manage, and evolve their company's AI agent experiences over time.
In this session, Clay Bavor (Cofounder of Sierra) will join Alessio Fanelli (co-host of Latent Space) in a fireside chat to share what it means to be an AI Architect, success stories from the market, and the future of the role.

------------------------------------

Session ID: 930540
Track: AI Architects
Speaker: Ilan Bigio (Developer Experience)
Format: Workshop
Room: Golden Gate Ballroom B: Workshops
Time: 3 Jun 2025 01:00 PM
Session Title: Model-Maxxing: RFT, DPO, SFT (Fine-tuning with OpenAI)
Description: Covering all forms of fine-tuning and prompt engineering, like SFT, DPO, RFT, prompt engineering / optimization, and agent scaffolding.

------------------------------------

Session ID: 904822
Track: AI Architects
Speaker: Denys Linkov (Head of ML)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 02:20 PM
Session Title: Structuring a modern AI team
Description: You've been given an AI mandate but don't have additional headcount, what next? Re-skilling, up-skilling and team augmentation become essential to delivering on a new mandate. In this talk we'll cover strategies to structure cross functional AI teams with domain experts, software engineers and ML engineers. We'll cover key skills and milestones that each traditional role can contribute to in unique ways.

------------------------------------

Session ID: 933605
Track: AI Architects
Speaker: Mani Khanuja (Principal ML Services SA)
Room: Juniper: Expo Sessions
Time: 4 Jun 2025 03:30 PM
Session Title: Data is Your Differentiator: Building Secure and Tailored AI Systems
Description: As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using Amazon Bedrock Knowledge Bases and Amazon Bedrock Data Automation, organizations can create AI solutions that truly understand their unique business context, terminology, and requirements.
Combined with Amazon Bedrock Guardrails, these capabilities enhance the accuracy and relevance of AI-generated responses, while ensuring that sensitive information remains protected within the organization's control - enabling businesses to build secure and compliant enterprise-grade generative AI solutions that accelerate time to value.

------------------------------------

Session ID: 916157
Track: AI Architects
Speaker: Nathan Wan (Head of AI)
Format: Talk
Room: SOMA: AI Architects
Time: 4 Jun 2025 02:00 PM
Session Title: AI That Pays: Lessons from Revenue Cycle
Description: While much of the AI innovation in healthcare has centered on clinical and patient-facing applications, Revenue Cycle Management (RCM) remains an underexplored yet critical domain. Given the growing financial pressures facing providers, rethinking how healthcare gets paid is essential to ensuring access and sustainability. Together, these factors make RCM an opportune area for AI disruption. This session explores how the combination of vast structured and unstructured data, often rule-based workflows, and direct financial opportunity can drive meaningful outcomes. We’ll also share practical lessons from our journey evolving a traditional machine learning mindset to incorporate the latest advances in Generative AI, and how that shift is reshaping what's possible in healthcare operations.

------------------------------------

Session ID: 914371
Track: AI Architects
Speaker: Henry Weller (Senior Product Manager, Vector Search @ MongoDB)
Format: Talk
Room: Salons 9-15: Expo Hall
Time: 5 Jun 2025 03:00 PM
Session Title: Building Vector Search Experiences with MongoDB: Access patterns, data models, and scaling considera
Description: This talk will explore typical and forward-looking use cases for Atlas Vector Search, as well as how different types of data models and query patterns can be implemented and effectively scaled to meet the needs of those use cases.
There will be a focus on the "Iron Triangle of Search" balancing accuracy, speed, and cost and talking about practical considerations that emerge within those use cases. This will be a technical talk focused on the "how" of Atlas Vector Search and considerations when building information retrieval systems given by a technical PM, not a sales pitch explaining how basic vector retrieval "solves" hallucinations.

------------------------------------

Session ID: 941906
Track: AI Architects
Speaker: Alex Atallah (CEO OpenRouter, co-founder of OpenSea)
Format: Keynote
Room: Keynote/General Session (Yerba Buena 7&8)
Time: 5 Jun 2025 04:35 PM
Session Title: fun stories from building OpenRouter and where all this is going
Description: How the first LLM aggregator got started, some of the weird moments in its early growth, architecture challenges, and where we'll be taking it down the road

------------------------------------

Session ID: 914912
Track: AI Architects
Speaker: Yegor Denisov-Blanch (Developer Productivity Researcher at Stanford University)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 11:55 AM
Session Title: Does AI Actually Boost Developer Productivity? (Stanford / 100k Devs Study)
Description: Forget vendor hype: Is AI actually boosting developer productivity, or just shifting bottlenecks? Stop guessing. Our study at Stanford cuts through the noise, analyzing real-world productivity data from nearly 100,000 developers across hundreds of companies. We reveal the hard numbers: while the average productivity boost is significant (~20%), the reality is complex – some teams even see productivity decrease with AI adoption. The crucial insights lie in why this variance occurs. Discover which company types, industries, and tech stacks achieve dramatic gains versus minimal impact (or worse). Leave with the objective, data-driven evidence needed to build a winning AI strategy tailored to your context, not just follow the trend.
------------------------------------

Session ID: 915974
Track: AI Architects
Speaker: Alex Duffy (Head of AI)
Format: Talk
Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams
Time: 4 Jun 2025 02:40 PM
Session Title: Benchmarks Are Memes: How What We Measure Shapes AI—and Us
Description: Benchmarks shape more than just AI models—they shape our future. The things we choose to measure become self-fulfilling prophecies, guiding AI toward specific abilities and, ultimately, defining humanity’s evolving role in the AI era. Today’s benchmarks have propelled incredible progress, but now we have an exciting opportunity: thoughtfully designing benchmarks around what genuinely matters to us—cooperation, creativity, education, and meaningful human experiences. In this talk, we’ll explore how benchmarks function as powerful cultural memes, influencing not only technical outcomes but societal direction. Drawing on practical examples we have seen at Every consulting in industries like finance, journalism, and education, and even from personally making AI play diplomacy, we’ll uncover what makes a benchmark impactful, approachable, and inspiring. You’ll see our engaging new AI Diplomacy benchmark demo, illustrating vividly how thoughtful evaluation design can excite both engineers and the wider community. You’ll hopefully walk away inspired and equipped to define benchmarks intentionally, helping steer AI toward outcomes that truly matter.

------------------------------------

Session ID: 936933
Track: AI Architects
Speaker: Rajiv Shah (Chief Evangelist)
Format: Workshop
Room: Golden Gate Ballroom B: Workshops
Time: 3 Jun 2025 09:00 AM
Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes
Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution.
We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state-of-the-art accuracy, and generates responses with a multi-layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents.

------------------------------------

Session ID: 914551
Track: AI Architects
Speaker: Patrick Debois (AI Product Engineer - AI Native Dev Advisor)
Format: Online Talk
Session Title: The 4 Patterns of AI Native Development
Description: AI is fundamentally reshaping software development roles and activities. While the change is obvious, understanding the actual shifts taking place for the individual developer remains challenging. In this talk, we introduce the four AI Native Dev patterns that are currently emerging:
- From producer to manager: we say what AI needs to do
- From implementation to intent: we care less about the how and focus on the why
- From delivery to discovery: we experiment and learn
- From content creation to knowledge: capture know-how to get better
We back up these patterns by showcasing features in tools that support these shifts. The aim of the patterns is to help you grasp how to position your own and your team members' careers effectively in this changing landscape.
======================================================================
--- Track: AI PRODUCT MANAGEMENT (TBA) ---
======================================================================

Session ID: 925337
Track: AI Product Management
Speaker: Raiza Martin (CEO & Co-Founder Huxe || Previously NotebookLM)
Format: Talk
Room: Foothill G 1&2: Product Management
Time: 4 Jun 2025 11:15 AM
Session Title: [PM Keynote] Everything is ugly so go build something that isn't
Description: We're in an awkward adolescent phase of AI product (design). But what if this chaotic moment is actually our greatest opportunity? Enter the rebuilding revolution. In this talk, we'll explore how the current state of AI interfaces offers a once-in-a-career chance to rethink fundamental UX patterns, with practical guidance on avoiding common pitfalls that plague first-generation AI products. Learn how to balance technical constraints with user needs, identify which conventional wisdom to keep versus discard, and ship AI experiences that actually delight users rather than frustrate them.

------------------------------------

Session ID: 940848
Track: AI Product Management
Speaker: Tom Moor (Head of Engineering)
Format: Talk
Room: Foothill G 1&2: Product Management
Time: 4 Jun 2025 02:40 PM
Session Title: Building the platform for agent coordination
Description: Learn how we're evolving Linear into an operating system for engineering teams to ship product with agents as a first class citizen.

------------------------------------

Session ID: 933686
Track: AI Product Management
Speaker: Michael Grinich (Founder & CEO, WorkOS)
Format: Talk
Room: SOMA: AI Architects
Time: 5 Jun 2025 02:00 PM
Session Title: CIAM for AI: Who Are Your Agents and What Can They Do?
Description: AI agents are changing the way modern SaaS products operate. Whether automating workflows, integrating with APIs, or acting on behalf of users, AI-driven assistants and autonomous systems are becoming core product features.
But securing these agents presents a fundamental challenge: How do you authenticate AI agents? How do you control what they can access? How do you ensure they act within the right permissions? This talk will explore these concepts and more while highlighting current research and best practices. ------------------------------------ Session ID: 933474 Track: AI Product Management Speaker: Kenneth DuMez (DevRel Lead, Graphite) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: Cattle, not genies: building AI agents from first principles Description: As magical as they may seem, AI agents should be treated like any other software system. This talk will cover the best practices in designing and building AI systems including observability, security hardening, and proper UX. ------------------------------------ Session ID: 915648 Track: AI Product Management Speaker: Ben Stein (CEO) Format: Talk Room: Foothill G 1&2: Product Management Time: 4 Jun 2025 02:00 PM Session Title: Shipping Products When You Don’t Know What they Can Do Description: A customer recently asked me: “Hey, can I tag your AI agent in a Google Doc comment?” The honest answer: I have no idea! We never designed our agents to handle Google Doc comments, but we tried it anyway… and it worked! The agent performed beautifully, the customer was thrilled, and I was left bewildered. Welcome to Product Management for AI agents, where roadmaps are fuzzy and we only learn the boundaries of our products after they’re released. When a product doesn’t follow predefined requirements but instead learns and improvises at runtime, PMs must give up control and lean into uncertainty, curiosity, experimentation, and fast feedback loops. This talk is a field guide for Product/Engineering teams navigating this new reality. 
We’ll cover how to write specs for affordances instead of features, how to use AI evals as a product development tool, and how to perform User Acceptance Testing on undocumented emergent behavior. Most importantly, we’ll explore how to build trust with customers even when the answer is, truthfully, “I don’t know.” If you’re managing AI-native products in 2025 the same way you managed web apps in 2020, you might find yourself A/B testing an agent that decided to go off and do C, D, and E all by itself! ------------------------------------ Session ID: 933622 Track: AI Product Management Speaker: Chris Hernandez (Manager of AI Speech Analytics at Chime) Room: Willow: Expo Sessions Time: 5 Jun 2025 11:00 AM Session Title: The Build-Operate Divide: Bridging Product Vision and AI Operational Reality Description: Product leaders see AI possibilities. Operations teams see implementation chaos. That disconnect can kill promising AI features before they ever reach users. In this session, Chris Hernandez and Jeremy Silva share an integrated framework that bridges product strategy and operational reality. You'll learn how they transformed fragmented AI workflows into a unified approach—from prototyping and prompt testing to human review loops and model benchmarking. We’ll explore how to build evaluation systems that satisfy both technical and business stakeholders, create effective HITL processes from day one, and use QA as a strategic enabler of generative AI quality. Most importantly, we’ll show how product and operations can move beyond friction—working together to deliver AI features that scale responsibly and ship faster, with confidence. 
------------------------------------ Session ID: 914842 Track: AI Product Management Speaker: James Lowe (Head of AI Engineering) Format: Talk Room: Foothill G 1&2: Product Management Time: 4 Jun 2025 11:35 AM Session Title: Why your product needs an AI product manager, and why it should be you Description: So you've built another cool demo. Now what? You have hype, but not impact. You have kudos but no users. Ultimately you have a demo, but not a product. The unique uncertainty of AI technology demands a new approach – beyond traditional product management. You need an AI Product Manager. This talk explains why this role is essential for building real AI products, using real case studies from the Incubator for Artificial Intelligence in the UK Government. More importantly, it reveals why your technical depth makes you uniquely suited to step into this critical leadership gap. Discover why you could be the ideal candidate to be the AI Product Manager your product needs, and how to step into that role. ------------------------------------ Session ID: 915770 Track: AI Product Management Speaker: Jeremy Silva (Product Lead) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:40 PM Session Title: Build Dynamic Products, and Stop the AI Sideshow Description: AI across product, GTM, and strategy was a great approach in 2023, but by now, we all already know that AI is disrupting the global landscape and how business gets done. Now is the time to stop chasing your competitors, and letting the technology lead your product strategy. There’s a better way to build that will allow you to differentiate and keep pace. Join AI product managers Eliza Cabrera and Jeremy Silva to learn how to crawl, walk, and run your way towards building dynamic products. 
------------------------------------ Session ID: 915770 Track: AI Product Management Speaker: Eliza Cabrera (Principal AI Product Manager @ Workday) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:40 PM Session Title: Build Dynamic Products, and Stop the AI Sideshow Description: AI across product, GTM, and strategy was a great approach in 2023, but by now, we all already know that AI is disrupting the global landscape and how business gets done. Now is the time to stop chasing your competitors, and letting the technology lead your product strategy. There’s a better way to build that will allow you to differentiate and keep pace. Join AI product managers Eliza Cabrera and Jeremy Silva to learn how to crawl, walk, and run your way towards building dynamic products. ------------------------------------ Session ID: 914975 Track: AI Product Management Speaker: Brian Balfour (Founder & CEO, Reforge) Format: Keynote Room: Foothill G 1&2: Product Management Time: 4 Jun 2025 12:15 PM Session Title: Survive the AI Knife-Fight: Building Products That Win Description: If you’ve ever been blocked by vague specs, shifting goals, or chasing “vibes,” things have only gotten messier in the age of AI. Everyone is obsessing over engineers doing PM work and PMs cranking out prototypes—but that skips the hardest question: What should we build, and why will it win? Today’s competitive landscape is a knife-fight. When it’s trivial to ship “something,” true differentiation becomes brutally difficult. 
------------------------------------ Session ID: 945392 Track: AI Product Management Speaker: Kenneth Auchenberg (Partner at AlleyCorp l, ex Product Lead at Stripe, Microsoft (VS Code)) Format: Talk Room: Foothill G 1&2: Product Management Time: 4 Jun 2025 11:55 AM Session Title: Shipping something to someone always wins Description: Learnings from building products at Stripe and applying them in an AI native world ====================================================================== --- Track: AI IN ACTION (TBA) --- ====================================================================== Session ID: 927324 Track: AI in Action Speaker: Sarah Guo (Founder) Format: Keynote Session Title: The 2025 AI Landscape (tba) Description: Sarah Guo shares her insights on the evolving landscape of AI investment and innovation. ------------------------------------ Session ID: 935987 Track: AI in Action Speaker: Simon Willison (AI Engineer) Format: Keynote Session Title: Frontier LLMs: What's Changed, What Won't Description: What's changed in the world of LLMs since the AI World's Fair last year? A lot! I'll be taking full advantage of my role as a fiercely independent researcher to review the past 12 months of advances in the field and catch everyone up on the latest models, free from any influence of vendors or employers. ------------------------------------ Session ID: 936800 Track: AI in Action Speaker: Asha Sharma (CVP, Head of Product, Microsoft AI Platform) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 09:20 AM Session Title: Spark to System: Building the Open Agentic Web Description: AI builders no longer ask whether to use agents—but how many and how fast. In this kickoff keynote, Microsoft’s Asha Sharma shows what happens when natural language creation meets an industrial grade backbone. Watch live demos to see agents move from idea to production in real time. 
Walk out with the commands, repos, and open protocols to build your piece of the agentic web. ------------------------------------ Session ID: 910732 Track: AI in Action Speaker: Philipp Schmid (AI Developer Experience) Format: Workshop Room: SOMA: Workshops Time: 3 Jun 2025 01:00 PM Session Title: AI Engineering with the Google Gemini 2.5 Model Family Description: Hands-on workshop on learning to use Gemini 2.5 Pro in combination with agentic tooling and MCP servers. ------------------------------------ Session ID: 935461 Track: AI in Action Speaker: Logan Kilpatrick (Product, Google DeepMind) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 09:05 AM Session Title: A year of Gemini progress + what comes next Description: Over the last year, Google and Gemini models have shown rapid progress across all dimensions (model, product, etc). Let's highlight all the work that has happened, how we got the world's best models, and where we are going next (across both the model landscape and our AI products). ------------------------------------ Session ID: 925974 Track: AI in Action Speaker: Sean Grove (Member of Technical Staff) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:55 PM Session Title: Prompt Engineering is Dead - Everything is a Spec Description: [!!Subject to change!!] Large models are trained through mountains of data and learned reward functions, yet - quis custodiet ipsos custodes? - what exactly are those amorphous blobs of data and rewards trying to specify? Building LLMs in any domain demands both clarity of thought and the skill to communicate those thoughts precisely - not only to other humans but to the models themselves. Without either, we risk unpleasant surprises. 
This talk dives into: • Why prompt spaghetti and data gumbo inevitably collapse at scale, unleashing behaviors we never intended - while a rigorously versioned spec keeps safety, personality, and UX firmly aligned, and makes incidents easier and faster to diagnose and fix. • How OpenAI’s public Model Spec provides a clear template, complemented by emerging “dev tools” that turn hazy human intent into precise, human-and-machine-readable policy. • How deliberative alignment training teaches models to first read and reason about the spec, boosting robustness without inflating context windows. • Practical tactics for catching ambiguity, untangling contradictions, and preserving global consistency. Plus, techniques for verifying that deployed models truly follow the contract we crafted. Resources: Model Spec (2025‑04‑11) and Deliberative Alignment, Guan et al., 2024. ------------------------------------ Session ID: 936937 Track: AI in Action Speaker: Corey Cooper (Developer Relations Manager @ Circle) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Automating Escrow with USDC and AI Description: This workshop explores how USDC, AI, and smart contracts can streamline escrow by automating fund release based on task or process verification. By using AI to interpret off-chain signals such as document validation, delivery confirmations, or milestone completion, we can trigger secure, programmable USDC payouts without manual intervention. The result is a faster, trust-minimized escrow system ideal for services, trade, and gig economy use cases. ------------------------------------ Session ID: 933575 Track: AI in Action Speaker: Beyang Liu (Co-founder and CTO, Sourcegraph) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The emerging skillset of wielding coding agents Description: It's raining coding agents. 
But while many are saying they're feeling the AGI, others say they're not that useful for serious programming. How much is hype and how much is a skill issue? We'll share empirical observations that help explain the divergence of developer opinion. And we'll cover emergent strategies uncovered by users of Amp, a new coding agent in research preview, that can help you employ agents to complete more complex tasks in production codebases. ------------------------------------ Session ID: 933719 Track: AI in Action Speaker: Charles Frye (Developer Advocate, Modal Labs) Format: Workshop Room: Foothill F: Infrastructure Time: 4 Jun 2025 11:15 AM Session Title: What every AI engineer needs to know about GPUs Description: Every programmer needs to know a few things about hardware, like processors, memory, and disks. Due to AI systems' extreme demand for mathematical processing power, AI engineers need to know a few things about GPUs -- the world's most popular high-throughput mathematical co-processor. In this talk, I will explain the fundamental engineering constraints and design decisions that shape GPUs and trace those up to some counter-intuitive facts about the performance characteristics of AI systems, with actionable insights for their deployers and consumers. ------------------------------------ Session ID: 933707 Track: AI in Action Speaker: Apoorva Joshi (Senior AI Developer Advocate, MongoDB) Format: Workshop Room: SOMA: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Building Multimodal AI Agents (From Scratch) Description: In this hands-on workshop, you will build a multimodal AI agent capable of processing mixed-media content—from analyzing charts and diagrams to extracting insights from documents with embedded visuals. 
Using MongoDB as a vector database and memory store, and Google's Gemini for multimodal reasoning, you will gain hands-on experience with multimodal data processing pipelines and agent orchestration patterns by implementing core components directly, using good ol' Python. ------------------------------------ Session ID: 933607 Track: AI in Action Speaker: Duan Lightfoot (AWS, Sr. Cloud Networking Developer Advocate) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Building Agents with Amazon Nova Act and MCP Description: In this 2-hour workshop, participants will gain practical hands-on experience building sophisticated AI agents using Amazon's agent technologies. You'll learn to build agents that can navigate the web like humans, perform complex multi-step tasks, and leverage specialized tools through natural language commands. You’ll explore Amazon Nova Act for reliable web navigation, Model Context Protocol (MCP) for connecting agents to external data sources and APIs, and Amazon Bedrock Agents for orchestrating complex workflows. Through guided exercises, you'll create agents capable of retrieving information and taking action across web applications, all through natural language interactions. By the end of this workshop, you'll have the practical skills to build AI agents that can browse websites, interact with web interfaces, and solve multi-step problems by combining these powerful Amazon technologies. ------------------------------------ Session ID: 933462 Track: AI in Action Speaker: Kenneth DuMez (DevRel Lead, Graphite) Format: Workshop Room: Willow: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: The fastest software dev workflow in the world: AI meets stacked diffs Description: Learn the secrets behind the workflows that engineers at the fastest moving companies in the world are using to build software for billions of users worldwide. 
This workshop will cover a comprehensive overview of how to leverage generative AI to write code, how to stack and submit these pull requests, and finally how to use AI to review them. ------------------------------------ Session ID: 933685 Track: AI in Action Speaker: Tejashwa Tiwari (Automation Engineer @ Windsurf) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn about why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development. ------------------------------------ Session ID: 933612 Track: AI in Action Speaker: Antje Barth (Principal Developer Advocate) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:05 PM Session Title: Building Agents at Cloud-Scale Description: Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! ------------------------------------ Session ID: 933636 Track: AI in Action Speaker: Ado Kukic (Director, Developer Relations) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:00 PM Session Title: Everything is changing Description: We believe programming with AI is going through massive changes — again. Turns out the models yearn for the tools and tokens. We hold them back if we make them ask before they can change a file. Give them tools & tokens and everything changes: what we use them for, how we use them, how many we run at the same time, how they talk to each other, how they talk to you, what they even are... It's all going to change. And with Amp, we're embracing it. If you want to find out where this is all going — come with us. 
------------------------------------ Session ID: 936933 Track: AI in Action Speaker: Nina Lopatina (Lead Developer Advocate) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. ------------------------------------ Session ID: 907544 Track: AI in Action Speaker: Dan Mason (Principal, Head of AI) Format: Workshop Room: Foothill C: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Case Study + Deep Dive: Telemedicine Support Agents with LangGraph/MCP Description: We've all seen website chat bots which can look up an order or answer a basic question -- but what does it take to build autonomous agents which manage long, delicate processes like multi-day medical treatments? In this workshop, we'll explore a workflow Stride built in partnership with Avila (https://avilascience.com/) that helps patients self-administer medication regimens at home. The stack includes LangGraph/LangSmith, Claude, MCP, Node.js, React, MongoDB, and Twilio, and rests on a foundation of treatment "blueprints" which LLM-powered agents use to guide patients to good outcomes. 
You'll learn how to: - Build a hybrid system of code and prompts that leverages LLM decisioning to drive a web application, message queue and database - Design and maintain flexible agentic workflow blueprints, with no special tools (just Google Docs!) - Create an agent evaluation system, which uses LLM-as-a-judge to evaluate the complexity of each interaction and escalate to human support when needed We'll also talk about the prompt-engineered guidelines and guardrails which help agents adhere to protocol as much as possible, while gracefully handling curveballs from the patient. Please bring questions -- we look forward to sharing our learnings on how to make agentic systems like this work in the real world! ------------------------------------ Session ID: 948075 Track: AI in Action Speaker: Ethan Sutin (CTO - Bee) Format: Talk Room: Grand Assembly Time: 5 Jun 2025 03:00 PM Session Title: The Buzz About Ambient Personal AI: What Really Works Description: Your smartphone knows your location, your smartwatch tracks your heartbeat, but what if AI could understand your entire life context? The idea is obvious (just ask Sam and Jony), but the reality of ambient intelligence brings technical challenges no one talks about, from processing human context at scale to making it actually useful. But what if we could crack the code on truly personal AI that lives with you, not just on your phone? Enter the era of ambient personal intelligence. We'll dive into hard-won lessons from processing over 150 billion tokens of personal context. 
We will discuss privacy-first systems from edge computing to Secure Enclaves, discover why ambient understanding is both harder and more powerful than you think, and explore the frontier where personal AI agents continuously reason about your needs and take actions proactively ------------------------------------ Session ID: 933632 Track: AI in Action Speaker: Numair Baseer (Deployed Engineer, Windsurf) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Agentic Coding with Windsurf Description: Agentic coding marks a new era in software development, where AI agents take on autonomous roles in coding tasks. The Windsurf IDE embodies this shift by integrating intelligent agents like Cascade, which maintain full codebase context to perform multi-file edits, run terminal commands, and suggest changes through tools like Supercomplete and Flows. In this session, we will explore features that allow developers to guide strategy while the AI handles execution, enhancing productivity and enabling more creative, high-level work. ------------------------------------ Session ID: 943904 Track: AI in Action Speaker: Barr Yaron (Partner at Amplify Partners) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:25 PM Session Title: State of AI Engineering 2025 Description: Come hear the results of the 2025 State of AI Engineering. ------------------------------------ Session ID: 939091 Track: AI in Action Speaker: Junyang Lin (Alibaba Qwen) Format: Online Talk Session Title: Qwen: Towards a Generalist Model / Agent Description: Since Alibaba launched the Qwen series of large models in 2023, the Qwen series of large language models and multimodal large models have been continuously updated and improved. This presentation will introduce the latest developments in the Qwen series of models, including the large language model Qwen3, vision-language large model Qwen2.5-VL, omni model Qwen2.5-Omni, etc. 
Additionally, this presentation will also cover the future development directions of the Qwen series. ------------------------------------ Session ID: 933646 Track: AI in Action Speaker: Jesús Barrasa (AI Field CTO) Room: Juniper: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies Description: You're trying to guide how your agents think and act. Code-orchestrated workflows are too rigid, but LLMs charting their own course feel too chaotic. When you need a middle ground, it’s time to reach for the secret weapon: ontologies. These graph-shaped fragments of actionable knowledge can fill in critical gaps. In this talk, we’ll explore together how ontologies bring structure, semantics, and sanity to GenAI-powered applications. You’ll learn when they’re useful, how to apply them, and what kinds of problems they help solve. Through practical examples, we’ll show how ontologies (1) guide knowledge graph construction, (2) add a semantic layer for more efficient and accurate retrieval (GraphRAG), and (3) encode domain logic you don’t want to leave up to the LLM. ------------------------------------ Session ID: 930540 Track: AI in Action Speaker: Ilan Bigio (Developer Experience) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Model-Maxxing: RFT, DPO, SFT (Fine-tuning with OpenAI) Description: Covering all forms of fine-tuning and prompt engineering, like SFT, DPO, RFT, prompt engineering / optimization, and agent scaffolding. ------------------------------------ Session ID: 915928 Track: AI in Action Speaker: Ishan Anand (AI Consultant and educator) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 09:00 AM Session Title: How LLMs work for Web Devs: GPT in 600 lines of Vanilla JS Description: Don't be intimidated. 
Modern AI can feel like magic, but underneath the hood are principles that web developers can understand, even if you don't have a machine learning background. In this workshop, we'll explore a complete GPT-2 inference implementation built entirely in Vanilla JS. This JavaScript translation of the popular "Spreadsheets-are-all-you-need" approach will let you debug and step through a real LLM line by line without the overhead of learning a new language, framework, or even IDE. All the major LLMs, including ChatGPT, Claude, DeepSeek, and Llama, inherit from GPT-2's architecture, making this exploration a solid foundation to understand modern AI systems and comprehend the latest research. While we won't have time to cover *everything*, you'll gain the essential knowledge to understand the key concepts that matter when building with LLMs, including how they: - Convert raw text into meaningful tokens - Represent semantic meaning through vector embeddings - Train neural networks through gradient descent - Generate text with sampling algorithms like top-k, top-p, and temperature This intense but beginner-friendly workshop is designed specifically for web developers diving into ML and AI for the first time. It’s your "missing AI degree" in just two hours. You'll walk away with an intuitive mental model of how Transformers work that you can apply immediately to your own LLM-powered projects. ------------------------------------ Session ID: 933716 Track: AI in Action Speaker: Charles Frye (Building useful technology out of large neural networks) Room: Juniper: Expo Sessions Time: 4 Jun 2025 12:45 PM Session Title: How fast are LLM inference engines anyway? Description: Open weights models and open source inference servers have made massive strides in the year since we last got together at AIE World's Fair. Where once we had only pirated LLaMA 2 weights and Transformers, we now have an embarrassment of riches. In fact, we have too many choices! 
What's an AI engineer looking to self-host inference to do? In this session, we'll share our benchmarking results from hundreds of runs across models, frameworks, and hardware. We'll also share tips and tricks from working with teams deploying LLM inference at scale. ------------------------------------ Session ID: 933633 Track: AI in Action Speaker: Sam Fertig (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You Description: Sometimes it seems like Windsurf knows you a little too well. It's one thing to generate generic code, but to predict your next intent? From matching existing code patterns and styles to tracking how local changes affect the larger codebase, this talk digs into the technical challenges of context awareness and why simply indexing code falls short. Relive our journey tackling the core issue in the AI IDE space: balancing retrieval quality with latency constraints and scaling effectively as codebases grow. For those curious about the infrastructure behind context-aware AI, this talk offers insights into our approach of turning massive codebases into collections of useful context. ------------------------------------ Session ID: 949405 Track: AI in Action Speaker: Hf0 Residency (N/A) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 01:25 PM Session Title: [New Session] The Next AI Unicorns Description: 10 CEOs. 2 minutes each. Cutting edge voice models. Post-transformer architectures. Game changing tech. Insane business plans. Be the first to hear updates from the leaders of 10 of the fastest growing Seed+ & Series A startups in the world. Who do you think will be the first to hit $1B? 
Teams: Area, OpenRouter, Favorited, OpenAudio, Coframe, OpenHome, Upside, Recursal, Glow, Generation Lab ------------------------------------ Session ID: 933688 Track: AI in Action Speaker: Zack Proser (Open source hacker. Dev Education at WorkOS) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 03:30 PM Session Title: AI Pipelines and Agents in Pure TypeScript with Mastra.ai Description: This hands-on workshop introduces Mastra.ai, a TypeScript framework that streamlines the development of agentic AI systems compared to traditional approaches using LangChain and vector databases. Participants will learn to build structured AI workflows with composable tools and reliable control, enabling them to create internal AI assistants that can handle requests like data cleaning, email drafting, and document summarization with minimal code. The session covers Mastra installation, running a local MCP server, defining tools and agents in TypeScript, using the Mastra playground, and implementing practical examples such as RAG setups and tool-chaining agents—all designed to equip attendees with the skills to develop scalable AI-driven internal tools based on sound software engineering principles rather than just experimental prompts. ------------------------------------ Session ID: 933688 Track: AI in Action Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 03:30 PM Session Title: AI Pipelines and Agents in Pure TypeScript with Mastra.ai Description: This hands-on workshop introduces Mastra.ai, a TypeScript framework that streamlines the development of agentic AI systems compared to traditional approaches using LangChain and vector databases. 
Participants will learn to build structured AI workflows with composable tools and reliable control, enabling them to create internal AI assistants that can handle requests like data cleaning, email drafting, and document summarization with minimal code. The session covers Mastra installation, running a local MCP server, defining tools and agents in TypeScript, using the Mastra playground, and implementing practical examples such as RAG setups and tool-chaining agents—all designed to equip attendees with the skills to develop scalable AI-driven internal tools based on sound software engineering principles rather than just experimental prompts. ------------------------------------ Session ID: 933656 Track: AI in Action Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Room: Willow: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agents, Access, and the Future of Machine Identity Description: AI agents are calling APIs, submitting forms, and sending emails—but how do you control what they’re allowed to do? As agents act on behalf of users or organizations, traditional patterns like OAuth, session tokens, and role-based access often fall short. In this talk, we’ll explore how machine identity is evolving to meet this new landscape. You’ll learn: - How to think about authentication for agents (not just humans) - What it means to authorize an action when the actor is an LLM or headless service - Real-world strategies from WorkOS and Cloudflare for assigning, managing, and revoking agent identity and access By the end, you’ll walk away with practical tools and mental models to build agent-powered systems that are secure, auditable, and scalable. 
------------------------------------ Session ID: 929790 Track: AI in Action Speaker: Danielle Perszyk (Cognitive Scientist, PhD) Format: Keynote Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 12:40 PM Session Title: Useful General Intelligence Description: We’re all hearing that AI agents will enable AGI, but they can’t yet reliably perform even basic computer tasks. It turns out that getting AI to click, type, and scroll is more challenging than getting it to generate code. How can we build general-purpose agents that can do anything we can do on a computer? This is our goal at the Amazon AGI SF Lab. In this talk, I’ll propose a new approach to agents that we call Useful General Intelligence. After describing how we’re solving the biggest challenges in computer use while enabling developers to access our tech in its earliest developmental stages, I’ll show real workflows that developers have built with Nova Act, our agentic model and SDK. ------------------------------------ Session ID: 949016 Track: AI in Action Speaker: swyx (Curator, smol.ai) Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 09:10 AM Session Title: Designing AI-Intensive Applications Description: Whether you call it a workflow or an agent, AI-engineered applications are seeing user-input:LLM-call ratios go from 1:1 (ChatGPT) to 1:100 (Deep Research, Codex) and even 0:n (Ambient/Proactive agents). How does AI Engineering change as you build increasingly AI-intensive applications? ------------------------------------ Session ID: 915465 Track: AI in Action Speaker: Mark Myshatyn (Enterprise AI Architect - Los Alamos National Laboratory) Format: Talk Room: Grand Assembly Time: 4 Jun 2025 10:55 AM Session Title: Government Agents - AI Agents Meet Tough Regulations Description: What does it mean to field not only LLMs, but whole agentic solutions to highly regulated problems? Come join Los Alamos National Laboratory to hear about fielding AI in hard places. 
------------------------------------ Session ID: 936933 Track: AI in Action Speaker: Rajiv Shah (Chief Evangelist) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. ------------------------------------ Session ID: 933629 Track: AI in Action Speaker: Sam Alba (Co-Founder of Dagger) Room: Willow: Expo Sessions Time: 4 Jun 2025 12:45 PM Session Title: How to trust an agent with software delivery Description: AI-powered agents promise faster, easier software delivery, but their unpredictable behavior often makes engineers hesitant to fully trust them with critical workflows. Sam Alba, Co-founder of Dagger (and previously co-creator of Docker), explains how teams can reliably integrate agents into their delivery pipelines by shifting how they structure and manage automation. He'll share four practical strategies learned from real-world experience: 1. Treat agents as workflow participants, not isolated tools. Stop using agents as disconnected scripts or IDE plugins. 
Treating them as first-class parts of your delivery process simplifies your architecture, reduces hidden complexity, and makes agent outcomes more predictable. 2. Use many small agents instead of one big one. Just as software evolved from monoliths to microservices, software delivery benefits from smaller, specialized agents with clearly defined responsibilities. Smaller agents are easier to understand, maintain, and integrate. 3. Define clear environments—the real lever for reliability. Instead of chasing perfect prompts or models, focus on clearly defining the tools, resources, and permissions around your agents. Precisely controlling their environments makes agents behave consistently and reliably. 4. Design workflows for easy debugging and observability. Agents will sometimes fail unexpectedly. Sam will share simple, effective ways to build clear tracing and observability into your workflows from the start, making debugging quicker and less frustrating. You'll leave with practical, immediately usable techniques that give you the confidence to trust AI agents in your software delivery pipelines. ------------------------------------ Session ID: 933684 Track: AI in Action Speaker: Eashan Sinha (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 4 Jun 2025 10:55 AM Session Title: Mastering Engineering Flow with Windsurf Description: As experienced engineers, especially senior and staff engineers, our focus shifts towards complex problem-solving, architectural decisions, and mentoring. While AI tools promise productivity gains, Windsurf offers more than just code completion and chat assistance – it's an agentic IDE built to enhance engineering flow. This talk explores how experienced engineers can leverage Windsurf's deep contextual awareness, structured guidance, and automated workflows to tackle sophisticated and complex tasks. 
We'll demonstrate practical strategies for accelerating feature development, automating code maintenance and reviews, and ultimately freeing up cognitive load to focus on high-impact engineering challenges. Learn how to move beyond basic AI assistance and truly partner with Windsurf to excel in your role. ------------------------------------ Session ID: 933685 Track: AI in Action Speaker: Tejashwa Tiwari (Analytics and Automation Engineer) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development. ------------------------------------ Session ID: 936006 Track: AI in Action Speaker: swyx (Curator) Format: Keynote Session Title: Building AI-Intensive Applications Description: swyx updates the Martin Kleppmann classic. Whether you call it a workflow or an agent, AI-engineered applications are seeing user-input:LLM-call ratios go from 1:1 (ChatGPT) to 1:100 (Deep Research, Codex) and even 0:n (Ambient/Proactive agents). How does AI Engineering change as you build increasingly AI-intensive applications? Let's call a spade a SPADE. ====================================================================== --- Track: AI IN FORTUNE 500 (June 4-5) --- ====================================================================== Session ID: 936800 Track: AI in Fortune 500 Speaker: Asha Sharma (CVP, Head of Product, Microsoft AI Platform) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 09:20 AM Session Title: Spark to System: Building the Open Agentic Web Description: AI builders no longer ask whether to use agents—but how many and how fast. In this kickoff keynote, Microsoft’s Asha Sharma shows what happens when natural language creation meets an industrial-grade backbone. Watch live demos to see agents move from idea to production in real time. 
Walk out with the commands, repos, and open protocols to build your piece of the agentic web. ------------------------------------ Session ID: 903524 Track: AI in Fortune 500 Speaker: Joel Hron (CTO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 11:15 AM Session Title: From Copilot to Colleague: Building Trustworthy Productivity Agents for High-Stakes Work Description: This keynote will explore what it takes to move from basic generative assistants to fully agentic AI—systems that don’t just suggest but plan, act, and adapt—all within the structured, high-trust environments where professionals actually work. ------------------------------------ Session ID: 937225 Track: AI in Fortune 500 Speaker: Harrison Chase (CEO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 12:15 PM Session Title: 3 ingredients for building reliable enterprise agents Description: It's easy to build a prototype of an agent, but hard to put an agent in production, especially in an enterprise setting. In this session, I will talk about three ingredients for building reliable agents in the enterprise. ------------------------------------ Session ID: 932429 Track: AI in Fortune 500 Speaker: Ben Kus (CTO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:00 PM Session Title: Building an Agentic Platform Description: Explore the technical evolution of metadata extraction at Box and how it shaped the foundation of our AI platform. We’ll walk through our transition to an agentic-first design—why it was necessary, how we approached the rebuild, challenges we encountered along the way, and the advantages it unlocked. ------------------------------------ Session ID: 916025 Track: AI in Fortune 500 Speaker: Hariharan Ganesan (Sr. Solutions Architect) Format: Talk Room: SOMA: AI Architects Time: 5 Jun 2025 11:35 AM Session Title: CIOs and Industry Leaders: Do You Trust Your AI’s Inferences? 
Description: Enterprise AI adoption is accelerating, but with it comes a hard question: Do we trust the model’s decisions? In this 18-minute talk, I’ll explore the invisible risks behind automated decision-making in safety-critical and revenue-sensitive environments. Drawing on case studies across manufacturing, telecom, and industrial IoT, I’ll highlight how explainability, traceability, and robust guardrails drive adoption and protect enterprise value. Attendees will walk away with: • A 3-step framework for operationalizing AI trust • Real-world lessons from building guardrails in on-prem and hybrid systems • Tools and techniques for debugging and explaining inferences at scale • A blueprint for building trust between models, engineers, and executive stakeholders ------------------------------------ Session ID: 916085 Track: AI in Fortune 500 Speaker: Jaspreet Singh (Senior Staff Software Engineer) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 11:55 AM Session Title: How Intuit uses LLMs to explain taxes to millions of taxpayers Description: I will talk about how Intuit uses LLMs to explain tax situations to TurboTax users. Users want explanations of their tax situations; this drives confidence in the product. Over the course of the last two tax years, Intuit has used Anthropic’s and OpenAI’s models to develop GenAI-powered explanations. This includes designing a complex system with prompt-engineered solutions and both LLM- and human-powered evaluations to ensure the high quality bar that our users expect when filing taxes with us. During my talk, I will cover the GenAI development lifecycle at scale, including development, evaluations, scaling, and security evaluations. We also developed a fine-tuned version of Claude Haiku, which I will cover in the presentation. 
We also expanded into tax question answering powered by RAG, including GraphRAG, and I will cover those developments too. ------------------------------------ Session ID: 915738 Track: AI in Fortune 500 Speaker: Christopher Lovejoy (Head of Clinical AI) Format: Talk Room: Foothill G 1&2: Product Management Time: 4 Jun 2025 02:20 PM Session Title: Make your LLM app a Domain Expert: How to Build an LLM-Native Expert System Description: Vertical AI is a multi-trillion-dollar opportunity. But you can't build a domain-expert application simply by grabbing the latest LLMs off-the-shelf: you need a system for codifying latent insights from domain experts and using that to drive development of your application. In this talk, we'll describe the system we've built at Anterior which has enabled us to achieve SOTA clinical reasoning and serve health insurance providers covering 50 million American lives. We'll share: - how and why to encode domain-specific failure modes as an ontology - a practical system for converting domain expertise into quantifiable eval metrics - how we structure work and collaboration between our clinicians, engineers, and PMs - our eval-driven AI iteration process and how this can be adapted to any industry ------------------------------------ Session ID: 914891 Track: AI in Fortune 500 Speaker: Adam Behrens (CEO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 11:35 AM Session Title: Machines of Buying & Selling Grace Description: How to go beyond browser automation to truly agentic commerce, where AI can buy, sell and negotiate on behalf of users and merchants. ------------------------------------ Session ID: 933575 Track: AI in Fortune 500 Speaker: Beyang Liu (Co-founder and CTO, Sourcegraph) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The emerging skillset of wielding coding agents Description: It's raining coding agents. 
But while many are saying they're feeling the AGI, others say they're not that useful for serious programming. How much is hype and how much is a skill issue? We'll share empirical observations that help explain the divergence of developer opinion. And we'll cover emergent strategies uncovered by users of Amp, a new coding agent in research preview, that can help you employ agents to complete more complex tasks in production codebases. ------------------------------------ Session ID: 914049 Track: AI in Fortune 500 Speaker: Yogi Miraje (Lead AI Engineer) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 02:00 PM Session Title: How to Build Agents without losing control Description: LLMs are getting smarter—but Agents are still unpredictable, unreliable, and hard to control. In this talk, I’ll share practical lessons from building real-world plan-and-execute agents, covering how to steer autonomous agents using agentic workflows, blueprints, and evals. If you’re struggling to make your agents behave (without giving up flexibility), this one’s for you. ------------------------------------ Session ID: 933685 Track: AI in Fortune 500 Speaker: Tejashwa Tiwari (Automation Engineer @ Windsurf) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development. ------------------------------------ Session ID: 915616 Track: AI in Fortune 500 Speaker: Donald Hruska (Engineering lead for Retool Agents) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 11:55 AM Session Title: How agents will unlock the $500B promise of AI Description: AI agents are on the cusp of revolutionizing work as we know it. The number of use cases software can tackle is set to explode as AI handles tasks requiring real judgment. 
But to cross the gap between an interesting AI prototype and an essential business tool, you need agents built by developers with real guardrails and security. This means blending AI assistance with traditional coding in a multimodal approach that maximizes efficiency and control. The future isn't about dropping in an LLM — it requires integrating any model, any data, any system to deliver results. Companies utilizing this approach can finally turn their slice of the $500B+ of total AI investment into real business results. ------------------------------------ Session ID: 936933 Track: AI in Fortune 500 Speaker: Nina Lopatina (Lead Developer Advocate) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. 
------------------------------------ Session ID: 915431 Track: AI in Fortune 500 Speaker: Thor 雷神 Schaeff (DX at ElevenLabs) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 11:00 AM Session Title: Build multilingual Conversational AI Agents Description: In this workshop you will learn how to build multilingual Conversational AI agents that can automatically detect your user's spoken language and can seamlessly switch to their preferred language. ------------------------------------ Session ID: 904722 Track: AI in Fortune 500 Speaker: Infant Vasanth (Senior Director of Engineering) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 11:35 AM Session Title: Accelerating Investment Operations: How BlackRock Builds Custom Knowledge Apps at Scale. Description: Investment Operations teams are the backbone of asset and investment management firms. Their day-to-day work not only enables portfolio managers to respond swiftly to market events but also ensures that complex, unstructured data flows seamlessly across the organization. In this talk, we introduce a modular, Kubernetes-native AI framework purpose-built to scale custom Knowledge Apps across the enterprise. Designed with speed, flexibility, and compliance in mind, the framework empowers teams to launch production-grade document extraction applications in minutes instead of months, unlocking new levels of automation and efficiency for investment management workflows. We’ll also share how this framework has helped BlackRock streamline document extraction processes, generate investment signals, reduce operational overhead, and accelerate the delivery of high-impact business use cases—all while maintaining the robustness and control required in a regulated industry. 
------------------------------------ Session ID: 914890 Track: AI in Fortune 500 Speaker: Kevin Madura (Director, Advanced Technologies) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:20 PM Session Title: The Billable Hour is Dead; Long Live the Billable Hour? Description: If software was eating the world before, knowledge work will soon be devoured by AI. In corporate America there are thousands of hours spent on rote tasks every day by employees, consultants, and lawyers alike. But is AI really capable of replacing work in the real world yet? Productivity estimates from GenAI range from 1.5% (NBER) to 96% (☝ us!). In this talk we'll share war stories of where the answer is yes (and no) and how we reduced human time spent on tasks from days to minutes in high-impact situations. The path from promise to actual product, used in real world settings, from our experience, is still unmapped. Learn what we built, how we built it - with code - and how we got stakeholder buy-in to deploy it. ------------------------------------ Session ID: 933633 Track: AI in Fortune 500 Speaker: Sam Fertig (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You Description: Sometimes it seems like Windsurf knows you a little too well. It's one thing to generate generic code, but to predict your next intent? From matching existing code patterns and styles to tracking how local changes affect the larger codebase, this talk digs into the technical challenges of context awareness and why simply indexing code falls short. Relive our journey tackling the core issue in the AI IDE space: balancing retrieval quality with latency constraints and scaling effectively as codebases grow. 
For those curious about the infrastructure behind context-aware AI, this talk offers insights into our approach of turning massive codebases into collections of useful context. ------------------------------------ Session ID: 914890 Track: AI in Fortune 500 Speaker: Mo Bhasin (Director of AI Products) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:20 PM Session Title: The Billable Hour is Dead; Long Live the Billable Hour? Description: If software was eating the world before, knowledge work will soon be devoured by AI. In corporate America there are thousands of hours spent on rote tasks every day by employees, consultants, and lawyers alike. But is AI really capable of replacing work in the real world yet? Productivity estimates from GenAI range from 1.5% (NBER) to 96% (☝ us!). In this talk we'll share war stories of where the answer is yes (and no) and how we reduced human time spent on tasks from days to minutes in high-impact situations. The path from promise to actual product, used in real world settings, from our experience, is still unmapped. Learn what we built, how we built it - with code - and how we got stakeholder buy-in to deploy it. ------------------------------------ Session ID: 935969 Track: AI in Fortune 500 Speaker: Rita Kozlov (VP Developers & AI, Cloudflare) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 02:40 PM Session Title: Building Agents (the hard parts!) Description: AI workloads are rapidly shifting from AI being used for augmentation (co-pilots), to AI becoming responsible for full, end-to-end automation (agents). But building effective agents, and even more importantly, agent experiences that boost productivity requires many pieces. In this talk, we'll be covering the building blocks of agents, how to put them together, and what we've learned from top companies building agents along the way. 
------------------------------------ Session ID: 933605 Track: AI in Fortune 500 Speaker: Mani Khanuja (Principal ML Services SA) Room: Juniper: Expo Sessions Time: 4 Jun 2025 03:30 PM Session Title: Data is Your Differentiator: Building Secure and Tailored AI Systems Description: As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using Amazon Bedrock Knowledge Bases and Amazon Bedrock Data Automation, organizations can create AI solutions that truly understand their unique business context, terminology, and requirements. Combined with Amazon Bedrock Guardrails, these capabilities enhance the accuracy and relevance of AI-generated responses, while ensuring that sensitive information remains protected within the organization's control - enabling businesses to build secure and compliant enterprise-grade generative AI solutions that accelerate time to value. ------------------------------------ Session ID: 915990 Track: AI in Fortune 500 Speaker: Amir Haghighat (CTO) Format: Talk Room: SOMA: AI Architects Time: 4 Jun 2025 11:55 AM Session Title: The Rise of Open Models in the Enterprise Description: This year kicked off with the DeepSeek-R1 news cycle breaking out of our AI Engineering bubble into the mainstream tech and business world. Leaders at the highest levels of the largest enterprises started asking how open source models could enhance and accelerate their AI strategy. Open source models promise increased ownership of AI systems: control over performance and price, improved uptime and reliability, better compliance, and flexible hosting options. How are these promises playing out after months of implementation? 
In this talk, I’ll draw on hundreds of conversations with AI leaders at enterprise companies to discuss what has — and hasn’t — changed about enterprise AI strategy in a world where open-source models compete on the frontier of intelligence. ------------------------------------ Session ID: 925912 Track: AI in Fortune 500 Speaker: Randall Hunt (CTO at Caylent) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 02:20 PM Session Title: POC to PROD: Hard Lessons from 200+ Enterprise GenAI Deployments Description: The transition from experimental GenAI demonstrations to robust, production-grade systems involves significant technical and organizational complexities. Humans provide a ceiling on the true ROI of automations. This session synthesizes key patterns and practical strategies gathered from more than 200 GenAI implementations across multiple industries and business sizes. Beyond the general lessons that apply to most products leveraging GenAI, we'll cover detailed observations within three application areas: multimodal understanding and search, enterprise knowledge retrieval, and AI agent architectures. We will share real-world comparative performance data and metrics on embedding models, vector index implementations, and explore various implementation methodologies that balance performance and cost. Additionally, the session addresses organizational insights critical to successful AI deployments, such as the importance of clearly defined evaluation processes and understanding real-world user interaction challenges, highlighted by examples from healthcare environments. Attendees will gain an understanding of decision-making criteria, including the appropriate complexity of prompt engineering versus more elaborate orchestration methods, token/cost management strategies in multilingual settings, and the challenges in driving behavioral change with new UX and application interaction capabilities. 
Participants will leave equipped with practical, data-supported insights for effectively navigating their own GenAI projects, including benchmarks and criteria for informed technology selection, and techniques to streamline the transition from initial concept to sustainable operational deployment. Please note, we all know this field evolves rapidly and we will mark which lessons we believe are immutable. ------------------------------------ Session ID: 904722 Track: AI in Fortune 500 Speaker: Vaibhav Page (Principal Engineer ) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 11:35 AM Session Title: Accelerating Investment Operations: How BlackRock Builds Custom Knowledge Apps at Scale. Description: Investment Operations teams are the backbone of asset and investment management firms. Their day-to-day work not only enables portfolio managers to respond swiftly to market events but also ensures that complex, unstructured data flows seamlessly across the organization. In this talk, we introduce a modular, Kubernetes-native AI framework purpose-built to scale custom Knowledge Apps across the enterprise. Designed with speed, flexibility, and compliance in mind, the framework empowers teams to launch production-grade document extraction applications in minutes instead of months, unlocking new levels of automation and efficiency for investment management workflows. We’ll also share how this framework has helped BlackRock streamline document extraction processes, generate investment signals, reduce operational overhead, and accelerate the delivery of high-impact business use cases—all while maintaining the robustness and control required in a regulated industry. ------------------------------------ Session ID: 916025 Track: AI in Fortune 500 Speaker: Sahil Yadav (Head of AI Products (Sr. 
Director, Product Management)) Format: Talk Room: SOMA: AI Architects Time: 5 Jun 2025 11:35 AM Session Title: CIOs and Industry Leaders: Do You Trust Your AI’s Inferences? Description: Enterprise AI adoption is accelerating, but with it comes a hard question: Do we trust the model’s decisions? In this 18-minute talk, I’ll explore the invisible risks behind automated decision-making in safety-critical and revenue-sensitive environments. Drawing on case studies across manufacturing, telecom, and industrial IoT, I’ll highlight how explainability, traceability, and robust guardrails drive adoption and protect enterprise value. Attendees will walk away with: • A 3-step framework for operationalizing AI trust • Real-world lessons from building guardrails in on-prem and hybrid systems • Tools and techniques for debugging and explaining inferences at scale • A blueprint for building trust between models, engineers, and executive stakeholders ------------------------------------ Session ID: 936933 Track: AI in Fortune 500 Speaker: Rajiv Shah (Chief Evangelist) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. 
By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. ------------------------------------ Session ID: 933684 Track: AI in Fortune 500 Speaker: Eashan Sinha (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 4 Jun 2025 10:55 AM Session Title: Mastering Engineering Flow with Windsurf Description: As experienced engineers, especially senior and staff engineers, our focus shifts towards complex problem-solving, architectural decisions, and mentoring. While AI tools promise productivity gains, Windsurf offers more than just code completion and chat assistance – it's an agentic IDE built to enhance engineering flow. This talk explores how experienced engineers can leverage Windsurf's deep contextual awareness, structured guidance, and automated workflows to tackle sophisticated and complex tasks. We'll demonstrate practical strategies for accelerating feature development, automating code maintenance and reviews, and ultimately freeing up cognitive load to focus on high-impact engineering challenges. Learn how to move beyond basic AI assistance and truly partner with Windsurf to excel in your role. ------------------------------------ Session ID: 933685 Track: AI in Fortune 500 Speaker: Tejashwa Tiwari (Analytics and Automation Engineer) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development. 
====================================================================== --- Track: AGENT RELIABILITY (June 4) --- ====================================================================== Session ID: 939640 Track: Agent Reliability Speaker: Christian Szegedy (Former co-founder of xAI, discoverer of adversarial examples) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 02:40 PM Session Title: Towards Verified Superintelligence Description: I describe a new paradigm towards open-endedly self-improving intelligence by scaling verification to remove the human data and supervision bottleneck. The objective is to achieve trustless alignment of superintelligence. ------------------------------------ Session ID: 913755 Track: Agent Reliability Speaker: Preeti Somal (SVP Engineering) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 11:55 AM Session Title: Scaling AI agents without breaking reliability Description: As AI agents move from prototypes to production, developers are running into new challenges with orchestration, failure handling, and infrastructure. This session will unpack lessons from teams already building real-world systems and share how to design for reliability from the start. ------------------------------------ Session ID: 933669 Track: Agent Reliability Speaker: Anushrut Gupta (Applied AI Lead, PromptQL) Format: Talk Room: Juniper: Expo Sessions Time: 4 Jun 2025 01:30 PM Session Title: "Data readiness" is a myth: Make AI Reliable with an Agentic Semantic Layer Description: The rapid progress in LLM capability has not translated to increased reliability for business-critical AI use cases. The root cause? Data is "not ready". Conversational analytics doesn't go beyond the analyst team because it's hard to verify if the generated queries are actually doing what they are supposed to. 
RAG based systems often fail to handle the breadth and depth of real world use-cases because they require a prohibitive amount of preparation & maintenance of an underlying knowledge graph. Agentic AI systems need to hard-code specific workflows to work reliably and end up looking more like software engineering with LLM calls instead of delivering on the promise of truly agentic workflows. In all of these failure modes, the common culprit is that the planning or reasoning done by the LLM fails to accurately capture the user's intent or the domain's context, i.e. the lack of a well prepared semantic data layer. Enterprise data is siloed and of vastly varying quality, and the perfect "semantic layer" and "metadata" are a moving target. New data is continuously being created and business definitions are rapidly changing and often entirely on-demand. In this talk we'll share how you can build and maintain a semantic data layer that is maintained entirely by AI, and show (with live examples) how that dramatically improves reliability of the AI system that needs dynamic access to data. We'll demonstrate how this sufficiently augments existing RAG, text-to-SQL and tool calling techniques and starts opening the door to reliable AI deployments. ------------------------------------ Session ID: 933599 Track: Agent Reliability Speaker: Rohit Talluri (WW Generative AI Specialist) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. 
------------------------------------ Session ID: 933575 Track: Agent Reliability Speaker: Beyang Liu (Co-founder and CTO, Sourcegraph) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The emerging skillset of wielding coding agents Description: It's raining coding agents. But while many are saying they're feeling the AGI, others say they're not that useful for serious programming. How much is hype and how much is a skill issue? We'll share empirical observations that help explain the divergence of developer opinion. And we'll cover emergent strategies uncovered by users of Amp, a new coding agent in research preview, that can help you employ agents to complete more complex tasks in production codebases. ------------------------------------ Session ID: 933474 Track: Agent Reliability Speaker: Kenneth DuMez (DevRel Lead, Graphite) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: Cattle, not genies: building AI agents from first principles Description: As magical as they may seem, AI agents should be treated like any other software system. This talk will cover the best practices in designing and building AI systems including observability, security hardening, and proper UX. ------------------------------------ Session ID: 933612 Track: Agent Reliability Speaker: Antje Barth (Principal Developer Advocate) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:05 PM Session Title: Building Agents at Cloud-Scale Description: Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! 
------------------------------------ Session ID: 914080 Track: Agent Reliability Speaker: Dexter Horthy (Founder) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 11:35 AM Session Title: 12 Factor Agents - Principles of Reliable LLM Applications Description: Hi, I'm Dex. I've been hacking on AI agents for a while. I've tried every agent framework out there, from the plug-and-play crew/langchains to the "minimalist" smolagents of the world to the "production grade" langgraph, griptape, etc. I've talked to a lot of really strong founders who are all building really impressive things with AI. Most of them are rolling the stack themselves. I don't see a lot of frameworks in production customer-facing agents. I've been surprised to find that most of the products out there billing themselves as "AI Agents" are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical. Agents, at least the good ones, don't follow the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern. Rather, they are comprised of mostly just software. So, I set out to answer: What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers? ------------------------------------ Session ID: 905792 Track: Agent Reliability Speaker: Ahmad Awais (Founder & CEO, CHAI.new by Langbase) Format: Online Talk Session Title: Why the Best AI Agents Are Built Without Frameworks (Primitives over Frameworks) Description: Cursor, v0, chai.new, lovable, bolt — what do they all have in common? They weren’t built on AI frameworks—they're built using primitives optimized for speed, scale, and flexibility. LLMs are evolving fast—like, literally every week. New standards pop up (looking at you, MCP), and APIs change faster than you can keep track. Frameworks just can't move at this speed. 
In this talk, I'll challenge conventional engineering wisdom, sharing my real-world experience scaling thousands of AI agents to handle over 100 million monthly runs. You'll discover how using AI primitives can dramatically speed up iteration, provide bigger scale, and simplify maintenance. I'll share eight practical agent architectures—covering memory management, auto tool integration, and simple serverless deployment—to help you quickly build reliable and scalable AI agents. By the end of this session, you'll clearly see why we must rethink and rebuild our infrastructure and focus on AI-native primitives instead of heavy, bloated, and quickly outdated frameworks. I wonder if we need another S3-moment but for the AI agent infrastructure. ------------------------------------ Session ID: 933610 Track: Agent Reliability Speaker: Mike Chambers (AI/ML Specialist DA AWS) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 12:15 PM Session Title: Ship it! Building Production-Ready Agents Description: Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies for building and orchestrating agent systems at scale. We'll cover critical infrastructure decisions, scalability frameworks, and best practices for creating robust, production-ready agent architectures. ------------------------------------ Session ID: 933702 Track: Agent Reliability Speaker: Mikiko Bazeley (Staff Developer Advocate, MongoDB) Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 01:15 PM Session Title: Smarter Together: Designing Multi-Agent Systems with Shared, Evolving Memory Description: In today’s most advanced AI systems, intelligence is no longer confined to a single model or agent—it emerges from coordination. But coordination requires memory: short-term, long-term, and shared. 
In this talk, we’ll break down how agent systems can store, retrieve, and evolve shared memory to become smarter over time. You'll learn what it takes to architect these continuously learning systems, how to track and improve memory quality, and why robust, flexible infrastructure is the foundation of it all. Stick around to see how this works in practice—live. ------------------------------------ Session ID: 933621 Track: Agent Reliability Speaker: Laurie Voss (VP Developer Relations, LlamaIndex) Room: Juniper: Expo Sessions Time: 4 Jun 2025 01:00 PM Session Title: Effective agent design patterns in production Description: At LlamaIndex we see a lot of agents built every day, and we've got a sense of what works and what doesn't. We've distilled those learnings down into a series of patterns and best practices for building real-world, production agents, and we're here to share them. You'll learn patterns for applying structure and guidance to famously nondeterministic LLMs and get concrete instruction on how to implement them. ------------------------------------ Session ID: 933605 Track: Agent Reliability Speaker: Mani Khanuja (Principal ML Services SA) Room: Juniper: Expo Sessions Time: 4 Jun 2025 03:30 PM Session Title: Data is Your Differentiator: Building Secure and Tailored AI Systems Description: As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using Amazon Bedrock Knowledge Bases and Amazon Bedrock Data Automation, organizations can create AI solutions that truly understand their unique business context, terminology, and requirements. 
Combined with Amazon Bedrock Guardrails, these capabilities enhance the accuracy and relevance of AI-generated responses, while ensuring that sensitive information remains protected within the organization's control - enabling businesses to build secure and compliant enterprise-grade generative AI solutions that accelerate time to value. ------------------------------------ Session ID: 933692 Track: Agent Reliability Speaker: Richmond Alake (Staff Developer Advocate, AI/ML at MongoDB) Room: Willow: Expo Sessions Time: 4 Jun 2025 10:40 AM Session Title: Architecting Agent Memory: Principles, Patterns, and Best Practices Description: In the rapidly evolving landscape of agentic systems, memory management has emerged as a key pillar for building intelligent, context-aware AI Agents. Inspired by the complexity of human memory systems—such as episodic, working, semantic, and procedural memory—this talk unpacks how AI agents can achieve believability, reliability, and capability by retaining and reasoning over past experiences. 
We’ll begin by establishing a conceptual framework based on real-world implementations from memory management libraries and system architectures: Memory Components representing various structured memory types (e.g., conversation, workflow, episodic, persona) Memory Modes reflecting operational strategies for short-term, long-term, and dynamic memory handling Next, the talk transitions to practical implementation patterns critical for effective memory lifecycle management: Maintaining rich conversation history and contextual awareness Persistence strategies leveraging vector databases and hybrid search Memory augmentation using embeddings, relevance scoring, and semantic retrieval Production-ready practices for scaling memory in multi-agent ecosystems We’ll also examine advanced memory strategies within agentic systems: Memory cascading and selective deletion Integration of tool use and persona memory Optimizing performance around memory retrieval and LLM context window limits Whether you're developing autonomous agents, chatbots, or complex workflow orchestration systems, this talk offers knowledge and tactical insights for building AI that can remember, adapt, and improve over time. This session is ideal for: AI engineers and agent framework developers Architects designing Agentic RAG or multi-agent systems Practitioners building contextual, personalized AI experiences By the end of the session, you’ll understand how to leverage memory as a strategic asset in agentic design—and walk away ready to build agents that not only act and reason but also remember. 
------------------------------------ Session ID: 941906 Track: Agent Reliability Speaker: Alex Atallah (CEO OpenRouter, co-founder of OpenSea) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:35 PM Session Title: fun stories from building OpenRouter and where all this is going Description: How the first LLM aggregator got started, some of the weird moments in its early growth, architecture challenges, and where we'll be taking it down the road. ------------------------------------ Session ID: 914015 Track: Agent Reliability Speaker: Sam Bhagwat (Co-founder) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 02:20 PM Session Title: Agents vs Workflows: Why Not Both? Description: One current hot debate: should you make your top-level abstraction a ReAct-type agent running in a loop, or should you make it a structured workflow graph? OpenAI is launching their new framework and throwing shade on workflow graph approaches. TBH we think this whole debate is kinda dumb. We've seen a lot of folks be able to structure the problem in a way that a workflow graph makes a lot of sense. We also see a ton of agents where you need to run the core bit in a loop for a long time. You can also give your agents structured workflow graphs as a tool. You can use structured workflow graphs as a handoff mechanism between agents. What we've seen from the community is frankly that folks need to tinker with multiple approaches and combine primitives in interesting ways. We'll share a couple stories where teams ended up with workflow graph based approaches, a couple where teams ended up with agent based approaches, and a couple where a blended approach made sense. 
------------------------------------ Session ID: 916117 Track: Agent Reliability Speaker: Tanmai Gopal (CEO, Co-founder) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 11:15 AM Session Title: AI Automation that actually works: $100M, messy data, zero surprises Description: We will review the different kinds of automation use-cases, and the approach we used, that will drive over $100M of expected annual impact by deploying AI for business critical initiatives. We will discuss what kinds of automation initiatives become possible because of Gen AI. These were not tenable before because of the amount of customization required per customer or per scenario, and the kind of data involved in these workflows. Previously, these workflows were driven manually, which was both error-prone and required expensive training. To replace or augment these manual business critical processes, automation _has_ to cross a very high bar of reliability. We will share how we addressed the inherent non-determinism of Gen AI to create a predictable system that doesn’t have any surprising failure modes. We’ll also discuss how we worked with our existing data that was spread across various systems without an expensive centralisation and clean up effort. ------------------------------------ Session ID: 933629 Track: Agent Reliability Speaker: Sam Alba (Co-Founder of Dagger) Room: Willow: Expo Sessions Time: 4 Jun 2025 12:45 PM Session Title: How to trust an agent with software delivery Description: AI-powered agents promise faster, easier software delivery, but their unpredictable behavior often makes engineers hesitant to fully trust them with critical workflows. Sam Alba, Co-founder of Dagger (and previously co-creator of Docker), explains how teams can reliably integrate agents into their delivery pipelines by shifting how they structure and manage automation. He'll share four practical strategies learned from real-world experience: 1. 
Treat agents as workflow participants, not isolated tools. Stop using agents as disconnected scripts or IDE plugins. Treating them as first-class parts of your delivery process simplifies your architecture, reduces hidden complexity, and makes agent outcomes more predictable. 2. Use many small agents instead of one big one. Just as software evolved from monoliths to microservices, software delivery benefits from smaller, specialized agents with clearly defined responsibilities. Smaller agents are easier to understand, maintain, and integrate. 3. Define clear environments—the real lever for reliability. Instead of chasing perfect prompts or models, focus on clearly defining the tools, resources, and permissions around your agents. Precisely controlling their environments makes agents behave consistently and reliably. 4. Design workflows for easy debugging and observability. Agents will sometimes fail unexpectedly. Sam will share simple, effective ways to build clear tracing and observability into your workflows from the start, making debugging quicker and less frustrating. You'll leave with practical, immediately usable techniques that give you the confidence to trust AI agents in your software delivery pipelines. ------------------------------------ Session ID: 933599 Track: Agent Reliability Speaker: Arjun Desai (Co-Founder, Cartesia) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. 
====================================================================== --- Track: AUTONOMY+ROBOTICS (TBA) --- ====================================================================== Session ID: 938258 Track: Autonomy+Robotics Speaker: Quan Vuong (Cofounder) Format: Keynote Session Title: Physical AGI (tbc) Description: Quan Vuong (Cofounder) and Jost Tobias Springenberg (Research Scientist) from Physical Intelligence discuss the advancements and future of Physical AGI in the Autonomy/Robotics track. ------------------------------------ Session ID: 938258 Track: Autonomy+Robotics Speaker: Jost Tobias Springenberg (Research Scientist) Format: Keynote Session Title: Physical AGI (tbc) Description: Quan Vuong (Cofounder) and Jost Tobias Springenberg (Research Scientist) from Physical Intelligence discuss the advancements and future of Physical AGI in the Autonomy+Robotics track. ------------------------------------ Session ID: 915934 Track: Autonomy+Robotics Speaker: Jyh-Jing Hwang (Research Scientist & TLM ) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 02:00 PM Session Title: Teaching Cars to Think: Language Models and Autonomous Vehicles Description: This session explores Waymo's latest research on the End-to-End Multimodal Model for Autonomous Driving (EMMA) and advanced sensor simulation techniques. Jyh-Jing Hwang will demonstrate how multimodal large language models like Gemini could improve autonomous driving through unified end-to-end architectures that process raw sensor data directly into driving decisions. The presentation will showcase EMMA's state-of-the-art performance in trajectory planning, 3D object detection, and road graph understanding, as well as another Drive&Gen research approach to sensor simulation for evaluating an end-to-end motion planning model. 
Attendees will gain insights into the benefits of co-training across multiple autonomous driving tasks and the potential of controlled video generation for testing under various environmental conditions. More on EMMA here: https://waymo.com/blog/2024/10/introducing-emma ------------------------------------ Session ID: 916103 Track: Autonomy+Robotics Speaker: Annika Brundyn (GenAI Architect) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 11:55 AM Session Title: What Is a Humanoid Foundation Model? An Introduction to GR00T N1 Description: Foundation models don’t just write or draw anymore—they’re starting to move. GR00T N1 is NVIDIA’s open Vision-Language-Action (VLA) foundation model for humanoid robots. Built with a dual-system architecture, it combines a System 2 module for high-level reasoning with a System 1 module for real-time, fluid motor control. It’s trained end-to-end on an impressive mix of data—from human videos to robot trajectories to synthetic simulations—and deployed on a full-sized humanoid robot performing bimanual manipulation tasks in the real world. This talk is a high-level, beginner-friendly overview of GR00T N1: - What makes a robot foundation model different from an LLM or vision model - How GR00T’s architecture is inspired by cognitive systems - Why grounding language, vision, and action together unlocks new generalist capabilities If you’ve ever wondered how large-scale AI is crossing over into the physical world, this session will get you up to speed—no robotics PhD required. ------------------------------------ Session ID: 916103 Track: Autonomy+Robotics Speaker: Aastha Jhunjhunwala (Solutions Architect) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 11:55 AM Session Title: What Is a Humanoid Foundation Model? An Introduction to GR00T N1 Description: Foundation models don’t just write or draw anymore—they’re starting to move. 
GR00T N1 is NVIDIA’s open Vision-Language-Action (VLA) foundation model for humanoid robots. Built with a dual-system architecture, it combines a System 2 module for high-level reasoning with a System 1 module for real-time, fluid motor control. It’s trained end-to-end on an impressive mix of data—from human videos to robot trajectories to synthetic simulations—and deployed on a full-sized humanoid robot performing bimanual manipulation tasks in the real world. This talk is a high-level, beginner-friendly overview of GR00T N1: - What makes a robot foundation model different from an LLM or vision model - How GR00T’s architecture is inspired by cognitive systems - Why grounding language, vision, and action together unlocks new generalist capabilities If you’ve ever wondered how large-scale AI is crossing over into the physical world, this session will get you up to speed—no robotics PhD required. ------------------------------------ Session ID: 945538 Track: Autonomy+Robotics Speaker: Nikhil Abraham (CEO) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 02:20 PM Session Title: General purpose robots as professional Chefs Description: How we converted a bimanual robot into a professional chef that works in novel kitchens and learns new recipes from a single demonstration. ------------------------------------ Session ID: 916140 Track: Autonomy+Robotics Speaker: Stefania Druga (Independent AI Research Scientist) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 11:35 AM Session Title: Real-time Experiments with an AI Co-Scientist Description: The sheer volume of data and complexity of modern scientific challenges necessitate tools that go beyond mere analysis. The vision of an "AI Co-scientist" – a true collaborative partner in the lab – requires sophisticated engineering to bridge the gap between powerful AI reasoning and the dynamic reality of physical experiments. 
This talk dives into the engineering required to build robust AI Co-scientists for hands-on research. We will explore scalable architectures, such as multi-agent systems leveraging foundation models like Gemini for complex reasoning, hypothesis refinement (inspired by the "generate, debate, evolve" paradigm described in recent AI Co-scientist research), and intelligent tool use. The core focus will be on the engineering challenges and solutions for integrating diverse, real-time empirical data streams – visual data from cameras, quantitative readings from sensors, positional feedback from actuators, and instrument outputs – directly into the AI's reasoning loop. I will illustrate this with concrete, technically detailed examples in chemistry (adaptive reaction monitoring), robotics (vision-guided assembly with SO Arm 100 and LeRobot library), and synthetic biology (real-time bacterial growth monitoring & interpretation). We'll discuss engineering strategies for handling data heterogeneity, latency, noise, and enabling the AI to interpret, correlate, and act upon live experimental feedback. Finally, we will touch upon how thoughtful engineering of these AI Co-scientists can contribute to democratizing access to advanced scientific capabilities. ------------------------------------ Session ID: 948652 Track: Autonomy+Robotics Speaker: JingXiang Mo (Founding Engineer @ K-Scale Labs, Robotics Product & Engineering Lead) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 02:40 PM Session Title: Scaling Open-source Humanoid Robots Description: Introducing developer ready robots that are open-source, affordable, and easy to use. 
------------------------------------ Session ID: 949382 Track: Autonomy+Robotics Speaker: Rishabh Garg (Robotics Engineer at Tesla Optimus) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 12:15 PM Session Title: Communication and System Software in Robotics Description: A journey into building a small software stack for a robot and discussing the issues that may commonly come up along the way. ====================================================================== --- Track: DESIGN ENGINEERING (TBA) --- ====================================================================== Session ID: 915428 Track: Design Engineering Speaker: Maximillian Piras (Product Designer) Format: Online Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 11:55 AM Session Title: The Bitter Layout or: How I Learned to Love the Model Picker Description: Are conversational interfaces the future or, as many designers have suggested, a lazy solution that is bottlenecking AI-HCI? Despite well-documented usability issues, the design of many AI applications defaults to an input field, turn-by-turn flow, and an endless model picker — I call this “The Bitter Layout”. In this talk, we’ll explore how Clay Christensen’s theory of commoditization from the early PC industry can explain why scaling laws require AI interfaces to remain modular until models fully commoditize. The killer feature of conversational interfaces may not be that they’re natural, but that they’re conformable. Learn how to evolve interfaces as inference scales, spot shifts in the basis of competition, and stop worrying about the next model update steamrolling your design decisions. 
------------------------------------ Session ID: 914027 Track: Design Engineering Speaker: Victor Dibia (Principal Research Engineer) Format: Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 11:15 AM Session Title: UX Design Principles for (Semi) Autonomous Multi-Agent Systems Description: Autonomous or semi-autonomous multi-agent systems (MAS) involve exponentially complex configurations (system config, agent configs, task management and delegation, etc.). These present unique interface design challenges for both developer tooling and end-user experiences. In this session, I'll explore UX design principles for multi-agent systems, addressing critical questions: What is the true configuration space for autonomous MAS? How can users arrive at the correct mental model of an MAS's capabilities, if at all? How can we improve trust and safety through techniques like cost-aware action delegation? What makes agent actions observable? How do we enable seamless interruptibility? Attendees will gain actionable insights to create more transparent, trustworthy, and user-centered multi-agent applications, illustrated through real-world implementations in AutoGen Studio - a low code developer tool built on AutoGen (44k stars on GitHub, MIT license) and similar tools. ------------------------------------ Session ID: 914361 Track: Design Engineering Speaker: Shafik Quoraishee (AI Game Engineer) Format: Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 12:15 PM Session Title: AI and Game Theory: A Case Study on NYT's Connections Description: This session will examine the interplay between human intuition and artificial intelligence in puzzle-solving, using the popular New York Times Connections game as a practical case study. We'll investigate how gameplay can be systematically evaluated through AI algorithms, exploring machine learning strategies such as clustering, semantic mapping, and natural language processing. 
Attendees will gain insights into building AI-driven puzzle solvers, learn methods for quantitatively assessing gameplay complexity, and discuss the potential impacts of AI on puzzle game design and player engagement. ------------------------------------ Session ID: 915389 Track: Design Engineering Speaker: Christopher Chedeau (Frenchy Front-end Engineer) Format: Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 02:00 PM Session Title: AI and Human Whiteboarding Partnership Description: Covid sent everybody home and created the space of virtual whiteboards. At first the experience reused the physical constraints, but soon it became better than a physical whiteboard thanks to virtual-native concepts like copy-paste and keyboard input. The next step in this evolution is to integrate AI into the workflow. We've tried a lot of things with Excalidraw and ended up landing on turning a prompt into a diagram. Come to the talk to understand how it fits into the workflow and how we implemented it. ------------------------------------ Session ID: 933474 Track: Design Engineering Speaker: Kenneth DuMez (DevRel Lead, Graphite) Room: Willow: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: Cattle, not genies: building AI agents from first principles Description: As magical as they may seem, AI agents should be treated like any other software system. This talk will cover the best practices in designing and building AI systems including observability, security hardening, and proper UX. ------------------------------------ Session ID: 915783 Track: Design Engineering Speaker: Craig Wattrus (AI Design Engineer) Format: Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 02:40 PM Session Title: Form factors for your new AI coworkers Description: Designing user experiences for AI means moving beyond traditional interfaces. 
Designers are grappling with how to create intuitive and effective interactions for these new AI capabilities, while growing their practice to include philosophy, ethics and coding. What if AI interactions could be reimagined as new 'coworkers'? This talk explores AI systems as your new coworkers, covering novel UX patterns we’ve implemented and are researching at Flatfile as well as a state of the union on emergent patterns we’re seeing and using from the industry. Attendees will get a peek into explorations into AI cursors, forward-leaning chat paradigms and tool UX. We will discuss both work that’s in production today at some of our biggest customers as well as thought-provoking demos, offering a vision for the future of AI UX. ------------------------------------ Session ID: 916107 Track: Design Engineering Speaker: Jun Yu Tan (Founding Engineer, Tusk) Format: Online Talk Session Title: Designing AI to Scale Human Thought Description: Forget the hype of AI automation replacing jobs. The future lies in human augmentation — revealing blind spots, sparking creativity, and amplifying thoughtful decision-making. In this talk, we’ll explore the principles that distinguish augmentation from automation in AI UX design, covering interaction patterns, design principles, and trust-building feedback loops. Drawing from real-world experiences building AI-powered tools and beyond, we’ll dive into concepts for crafting interfaces that empower users to think smarter, not just work faster. Expect practical insights and a fresh perspective on AI’s role as a collaborative partner. AI Augmentation: https://jytan.net/blog/2025/ai-augmentation/ Tusk: https://www.usetusk.ai/ ------------------------------------ Session ID: 914845 Track: Design Engineering Speaker: John Pham (Head of Design) Format: Talk Room: Foothill G 1&2: Design Engineering Time: 5 Jun 2025 11:35 AM Session Title: Good design hasn’t changed with AI Description: Bad designs are still bad. AI doesn’t make it good. 
The novelty of AI makes the bad things tolerable, for a short time. Building great designs and experiences with AI relies on the same first principles as pre-AI. When people use software, they want it to feel responsive, safe, accessible and delightful. We’ll go over the big and small details that go into software that people want to use, not software they are forced to use. ====================================================================== --- Track: EVALS (June 5) --- ====================================================================== Session ID: 936564 Track: Evals Speaker: Micah Hill-Smith (CEO) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:00 PM Session Title: Trends Across the AI Frontier Description: The entire AI stack is developing faster than ever - from chips to infrastructure to models. How do you sort the signal from the noise? Artificial Analysis is an independent benchmarking and insights company dedicated to helping developers and companies pick the right models and technologies for building applications. This talk will walk through the state of the frontier across the AI stack.
------------------------------------ Session ID: 939130 Track: Evals Speaker: Ankur Goyal (CEO, Braintrust) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 02:00 PM Session Title: [Evals Keynote] tba Description: tbc ------------------------------------ Session ID: 943899 Track: Evals Speaker: Ankur Goyal (CEO, Braintrust) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:20 PM Session Title: Evals Closing Keynote Description: The final word on Evals ------------------------------------ Session ID: 915684 Track: Evals Speaker: Taylor Jordan Smith (Senior Developer Advocate) Format: Workshop Room: Nobhill A&B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Beyond Benchmarks: Strategies for Evaluating LLMs in Production Description: Accuracy scores and leaderboard metrics look impressive—but production-grade AI requires evals that reflect real-world performance, reliability, and user happiness. Traditional benchmarks rarely help you understand how your LLM will perform when embedded in complex workflows or agentic systems. How can you realistically and adequately measure reasoning quality, agent consistency, MCP integration, and user-focused outcomes? In this practical, example-driven talk, we'll go beyond standard benchmarks and dive into tangible evaluation strategies using various open-source frameworks like GuideLLM and lm-eval-harness. You'll see concrete examples of how to create custom eval suites tailored to your use case, integrate human-in-the-loop feedback effectively, and implement agent reliability checks that reflect production conditions. Walk away with actionable insights and best practices for evaluating and improving your LLMs, ensuring they meet real-world expectations—not just leaderboard positions! 
------------------------------------ Session ID: 936156 Track: Evals Speaker: Omar Khattab (Databricks Research Scientist) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 11:15 AM Session Title: On Engineering AI Systems that Endure The Bitter Lesson Description: Will discuss the principles for building AI software that underpin DSPy, highlighting the differences between conventional prompting (or finetuning/RL) versus the design and programming of truly modular AI systems. ------------------------------------ Session ID: 942167 Track: Evals Speaker: Manu Goyal (Founding Engineer, Braintrust) Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 09:45 AM Session Title: Why should anyone care about Evals? Description: An introduction to the evals track ------------------------------------ Session ID: 907684 Track: Evals Speaker: David Karam (CEO) Format: Workshop Room: Foothill G1&2: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Solving for the hardest Eval challenge: Building Metrics that actually work Description: One of the biggest challenges in building evals you can trust is building metrics that reliably measure goodness in your application; metrics that are highly accurate, fast, and tunable to ground truth rater and user behavior. This workshop is inspired by decades of AI and machine learning development in Google Search, reinvented for the modern LLM stack by the Pi team over the past year. In this workshop you will learn how to: 1) Brainstorm and design custom metrics tailored to your specific application needs. 2) Identify which types of signals (natural language, code, other models) work best for your use case through rapid trial and error. 3) Combine & calibrate your metrics against ground truth data using real examples from your domain. 4) Use simple tools like Google Sheets for visualizing and analyzing your inputs and outputs with those metrics.
5) Integrate your scoring models into both online workflows like agent control and offline ones like model comparison and training evaluation. ------------------------------------ Session ID: 933676 Track: Evals Speaker: Samuel Colvin (Founder of Pydantic) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:00 PM Session Title: Human-seeded Evals Description: In this talk I'll introduce the concept of Human-seeded Evals, explain the principle and demo them with Pydantic Logfire. ------------------------------------ Session ID: 933603 Track: Evals Speaker: Suman Debnath (Principal Developer Advocate, AI/ML, AWS) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: Introducing Strands Agents, an Open Source AI Agents SDK Description: Building AI agents used to require complex orchestration, extensive scaffolding, and months of tuning. With Strands Agents, an open source SDK from AWS, you can now build, test, and deploy intelligent agents in just a few lines of code. This session introduces the model-driven approach behind Strands, where a model, a prompt, and a set of tools are all you need to create powerful, production-ready agents. Learn how Strands leverages modern foundation models to handle reasoning, tool use, and reflection, reducing development time from months to days. ------------------------------------ Session ID: 933712 Track: Evals Speaker: Nir Gazit (CEO @ Traceloop, OpenLLMetry co-creator) Room: Juniper: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Prompt Engineering is Dead Description: Manual prompt crafting doesn't scale. In this session, we'll explore how to replace it with a test-driven, automated approach. You'll see how to define output evaluators, write minimal prompts, and let agents iterate toward optimal performance—all without manual tweaking. If you're still hand-tuning prompts, you're doing it wrong.
------------------------------------ Session ID: 933702 Track: Evals Speaker: Mikiko Bazeley (Staff Developer Advocate, MongoDB) Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 01:15 PM Session Title: Smarter Together: Designing Multi-Agent Systems with Shared, Evolving Memory Description: In today’s most advanced AI systems, intelligence is no longer confined to a single model or agent—it emerges from coordination. But coordination requires memory: short-term, long-term, and shared. In this talk, we’ll break down how agent systems can store, retrieve, and evolve shared memory to become smarter over time. You'll learn what it takes to architect these continuously learning systems, how to track and improve memory quality, and why robust, flexible infrastructure is the foundation of it all. Stick around to see how this works in practice—live. ------------------------------------ Session ID: 930540 Track: Evals Speaker: Ilan Bigio (Developer Experience) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Model-Maxxing: RFT, DPO, SFT (Fine-tuning with OpenAI) Description: Covering all forms of fine-tuning and prompt engineering, like SFT, DPO, RFT, prompt engineering / optimization, and agent scaffolding. ------------------------------------ Session ID: 916104 Track: Evals Speaker: Jason Liu (Principal) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 02:40 PM Session Title: How to look at your data; what to look for, how to measure Description: By the end of this talk, you'll understand what it takes to apply clustering techniques and data analysis to understand what is the valuable work that your AI application is doing through analyzing conversation histories and how to create generative evals to benchmark your newly discovered superpowers. 
------------------------------------ Session ID: 942858 Track: Evals Speaker: Carlos Esteban (Solutions Engineer, Braintrust) Format: Workshop Room: Golden Gate Ballroom C: Workshops Time: 3 Jun 2025 03:30 PM Session Title: How to build world-class AI products (featuring Sarah Sachs, AI lead @ Notion) Description: Join us for a hands-on workshop where you'll learn practical strategies to evaluate AI applications throughout their lifecycle—from initial testing of prompts to ongoing monitoring in production. We’re excited to host Sarah Sachs, AI Lead at Notion, who will share insights into how Notion built their acclaimed Notion AI. ------------------------------------ Session ID: 936133 Track: Evals Speaker: Vitor Balocco (Staff AI Engineer) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 11:35 AM Session Title: Turning Fails into Features: Zapier’s Hard-Won Eval Lessons Description: Every agent failure can be a roadmap to your next breakthrough. This talk reveals how Zapier's evaluation system transforms frustrating user experiences into targeted improvements, creating a data flywheel that continuously strengthens our agents. You'll learn practical approaches for building the data flywheel, detecting implicit feedback signals, building solid evals, prioritizing metrics that actually matter, and why your most reliable evals might secretly be sabotaging your performance. ------------------------------------ Session ID: 949432 Track: Evals Speaker: Diego Rodriguez (N/A) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 12:15 PM Session Title: Perceptual Evaluation Description: Special session with KREA.ai's cofounder Diego Rodriguez on how evals for aesthetics and image/generative media work — the hardest kinds of evals. 
------------------------------------ Session ID: 915826 Track: Evals Speaker: Nathan Sobo (CEO & Co-founder of Zed, co-creator of Atom and Electron) Format: Talk Room: Juniper: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: CI in the Era of AI: From Unit Tests to Stochastic Evals Description: Software engineers have long understood that high-quality code requires comprehensive automated testing. For decades, our industry has relied on deterministic tests with clear pass/fail outcomes to ensure reliability. High-quality software depends on automated testing. That's certainly true at Zed, where we're building a next-generation native IDE in Rust. Zed runs at 120 frames per second, but it would also crash once a second if we didn't maintain and run a comprehensive suite of unit tests on every change. But what happens when AI enters the equation? In this talk, we'll explore how continuous integration evolves when working with AI components. "Evals" - parlance from the machine learning field - are fundamentally a continuation of the software testing tradition, but with a critical difference: they're inherently stochastic. Zed's traditional CI goes to extreme lengths to eliminate non-determinism, as nobody likes having their pull requests blocked by flaky builds. We've even fully simulated network interactions with a deterministic random scheduler. AI components, however, forced us to confront a fundamental paradigm shift—uncertainty isn't a bug but an intrinsic feature of these systems, compelling us to embrace what we couldn't avoid. We'll share our journey of reconceptualizing evals as "stochastic unit tests" - still verifying system behavior, but without binary pass/fail grades. 
We'll discuss practical approaches to: - Thoughtfully building test suites for AI components - Shifting from red/green outcomes to "shades of gray" - Replacing build gates with trend analysis and performance monitoring - Maintaining engineering confidence despite statistical variance Whether you're incorporating AI into existing systems or building new AI-powered tools, this talk will provide practical insights into maintaining quality when determinism gives way to probability. ------------------------------------ Session ID: 942803 Track: Evals Speaker: Doug Guthrie (Solutions Engineer, Braintrust) Format: Workshop Room: Golden Gate Ballroom C: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Mastering AI Evaluation: From Playground to Production with Braintrust Description: This hands-on workshop will guide participants through the complete AI evaluation lifecycle using Braintrust, from initial prompt testing to production monitoring. Attendees will learn to build evaluation frameworks that ensure their AI applications perform reliably in real-world scenarios. Topics covered include both offline and online evaluation strategies, logging and feedback systems, and human review processes. ------------------------------------ Session ID: 949122 Track: Evals Speaker: Doug Guthrie (Solutions Engineer, Braintrust) Format: Workshop Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 01:00 PM Session Title: Evals 101: Lunch & Learn Description: This hands-on workshop guides participants through the full AI evaluation lifecycle with Braintrust, from initial prompt testing to production monitoring. Attendees will build evaluation frameworks, practice offline and online strategies, and implement logging systems. 
------------------------------------ Session ID: 916104 Track: Evals Speaker: Jeff Huber (CEO) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 02:40 PM Session Title: How to look at your data; what to look for, how to measure Description: By the end of this talk, you'll understand what it takes to apply clustering techniques and data analysis to understand what is the valuable work that your AI application is doing through analyzing conversation histories and how to create generative evals to benchmark your newly discovered superpowers. ------------------------------------ Session ID: 936564 Track: Evals Speaker: George Cameron (CPO) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:00 PM Session Title: Trends Across the AI Frontier Description: The entire AI stack is developing faster than ever - from chips to infrastructure to models. How do you sort the signal from the noise? Artificial Analysis is an independent benchmarking and insights company dedicated to helping developers and companies pick the right models and technologies for building applications. This talk will walk through the state of the frontier across the AI stack. ------------------------------------ Session ID: 939231 Track: Evals Speaker: Ido Pesok (AI Engineer, v0) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 11:55 AM Session Title: Evals Are Not Unit Tests Description: How to think about evaluating a non-deterministic system — and how to actually succeed at it. ------------------------------------ Session ID: 936133 Track: Evals Speaker: Rafal Wilinski (AI Agents Lead) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 11:35 AM Session Title: Turning Fails into Features: Zapier’s Hard-Won Eval Lessons Description: Every agent failure can be a roadmap to your next breakthrough.
This talk reveals how Zapier's evaluation system transforms frustrating user experiences into targeted improvements, creating a data flywheel that continuously strengthens our agents. You'll learn practical approaches for building the data flywheel, detecting implicit feedback signals, building solid evals, prioritizing metrics that actually matter, and why your most reliable evals might secretly be sabotaging your performance. ------------------------------------ Session ID: 905421 Track: Evals Speaker: Ofer Mendelevitch (Vectara - the trusted GenAI product platform) Format: Online Talk Session Title: open-rag-eval: RAG Evaluation without "golden" answers. Description: Open-RAG-Eval is an open-source framework that revolutionizes RAG evaluation by harnessing the power of LLM judges for scalable, automated evaluation without the need for golden answers or golden chunks. Building on pioneering research from the University of Waterloo, this framework integrates innovative tools like UMBRELA for reference-free relevance scoring and AutoNuggetizer for automated fact-checking. Designed with a flexible connectors architecture, it seamlessly plugs into any RAG pipeline while delivering fast, transparent, and interpretable metrics on retrieval, generation, and hallucination in RAG. ------------------------------------ Session ID: 915059 Track: Evals Speaker: John Dickerson (CEO, Mozilla AI) Format: Talk Room: Golden Gate Ballroom B: Evals Time: 5 Jun 2025 02:20 PM Session Title: 2025 is the Year of Evals! Just like 2024, and 2023, and … Description: AI is getting deployed without guardrails, without governance, without due diligence. Surely this is the year we’ll see a Fortune 500 CEO fired because of a preventable AI incident. Surely this is the year we’ll see enterprises wake up to pre-deployment evaluation and post-deployment monitoring being an urgent need. This story hasn’t changed for a decade, but surely this is the year it will. 
In this talk, I’ll cover what enterprise-level AI/ML evaluation has looked like for the last decade - what’s changed, what hasn’t, what sells, what doesn’t, and where I see things going from here on out. Evaluation matters - we all know this - but using my experience in the trenches over the last decade or so I hope to bridge the gap between what practitioners need and what the C-suite pays for in the space of AI evaluations. ====================================================================== --- Track: GENERATIVE MEDIA (June 5) --- ====================================================================== Session ID: 910197 Track: Generative Media Speaker: Kelvin Ma (Software Engineer ) Format: Talk Room: Foothill F: Generative Media Time: 5 Jun 2025 11:55 AM Session Title: Magic Editor Under the Hood: Weaving Generative AI into a Billion-User App Description: Go behind the scenes of Google Photos' Magic Editor. Explore the engineering feats required to integrate complex CV and cutting-edge generative AI models into a seamless mobile experience. We'll discuss optimizing massive models for latency/size, the crucial interplay with graphics rendering (OpenGL/Halide), and the practicalities of turning research concepts into polished features people actually use. ------------------------------------ Session ID: 947560 Track: Generative Media Speaker: Comfy Anonymous (Original creator of ComfyUI) Room: Salons 2-6: Workshops Time: 3 Jun 2025 09:00 AM Session Title: ComfyUI Description: Quick introduction to ComfyUI and what's new followed by a QA session. ------------------------------------ Session ID: 933493 Track: Generative Media Speaker: Chad Bailey (Senior Voice Bots Engineer, Daily) Room: Juniper: Expo Sessions Time: 4 Jun 2025 10:40 AM Session Title: Realtime conversational video with Pipecat and Tavus Description: Tavus shipped the world's first realtime video avatar platform last year. 
Developers use Tavus' conversational video APIs to create education, social, and customer support agents. The Tavus team built their innovative product using the Pipecat open source framework and Daily's global WebRTC infrastructure. Join us for a technical deep dive into conversational video. ------------------------------------ Session ID: 910158 Track: Generative Media Speaker: Gorkem Yurtseven (CTO) Format: Talk Room: Foothill F: Generative Media Time: 5 Jun 2025 11:15 AM Session Title: The State of Generative Media Today Description: Generative AI is reshaping the creative landscape, enabling the production of images, audio, and video with unprecedented speed and sophistication. This session offers an in-depth exploration of the current state of generative media, highlighting cutting-edge models, platforms, and tools that are transforming the industry. ------------------------------------ Session ID: 943701 Track: Generative Media Speaker: Paige Bailey (Engineering Lead - Developer Relations @ Google DeepMind) Format: Talk Room: Foothill F: Generative Media Time: 5 Jun 2025 11:35 AM Session Title: Veo 3 for developers Description: This talk will briefly trace the history of video generation models before diving into Veo 3, Google DeepMind's latest state-of-the-art model that marks a significant leap by generating video with synchronized audio—including dialogue, sound effects, and music—all from text and image prompts. We'll show how it can understand intricate details, maintain coherence over longer sequences, and simulate realistic physics and camera movements. For developers, Veo 3, accessible via Vertex AI (preview), unlocks many new capabilities. We'll discuss how its advanced capabilities, such as semantic context rendering and cinematic control, can empower innovation in filmmaking, game development, education, and more.
This session will cover how developers can integrate Veo 3 into their workflows, or test it out today in the Gemini App, Flow, and via the Gemini APIs on Google Cloud. ------------------------------------ Session ID: 933493 Track: Generative Media Speaker: Brian Johnson (Staff Engineer, Tavus) Room: Juniper: Expo Sessions Time: 4 Jun 2025 10:40 AM Session Title: Realtime conversational video with Pipecat and Tavus Description: Tavus shipped the world's first realtime video avatar platform last year. Developers use Tavus' conversational video APIs to create education, social, and customer support agents. The Tavus team built their innovative product using the Pipecat open source framework and Daily's global WebRTC infrastructure. Join us for a technical deep dive into conversational video. ------------------------------------ Session ID: 943296 Track: Generative Media Speaker: Zeke Sikelianos (Founding Designer) Format: Workshop Room: Foothill F: Generative Media Time: 5 Jun 2025 12:15 PM Session Title: Design like Karpathy is watching 😎 Description: Legendary AI engineer and educator Andrej Karpathy recently blogged about his experiences building, deploying, and monetizing a vibe-coded web app called MenuGen. Let's dig into the challenges he faced and learn what we as AI designers can do to make life better for the Andrejs of the world. ------------------------------------ Session ID: 947929 Track: Generative Media Speaker: Sharif Shameem (Lexica Founder) Format: Talk Room: Foothill F: Generative Media Time: 5 Jun 2025 02:20 PM Session Title: Good Demos are Important Description: Creating and sharing demos is the easiest way to influence the future. It gets people to think about what's possible. A good tech demo doesn't have to be fully fleshed out. It doesn't even have to be fully functional. The purpose of a demo is to inspire. A good demo makes you feel like someone jumped into the future and pulled back an idea to the present. 
------------------------------------ Session ID: 914081 Track: Generative Media Speaker: Keegan McCallum (Head of ML infrastructure at Luma AI) Format: Talk Room: Foothill F: Generative Media Time: 5 Jun 2025 02:00 PM Session Title: General Intelligence is Multimodal Description: Talking about Luma AI, our mission, and how our ML infrastructure enables SOTA multimodal model development ====================================================================== --- Track: GRAPHRAG (June 4) --- ====================================================================== Session ID: 932429 Track: GraphRAG Speaker: Ben Kus (CTO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:00 PM Session Title: Building an Agentic Platform Description: Explore the technical evolution of metadata extraction at Box and how it shaped the foundation of our AI platform. We’ll walk through our transition to an agentic-first design—why it was necessary, how we approached the rebuild, challenges we encountered along the way, and the advantages it unlocked. ------------------------------------ Session ID: 915992 Track: GraphRAG Speaker: Mitesh Patel (Developer Advocate Manager) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 11:15 AM Session Title: HybridRAG: A Fusion of Graph and Vector Retrieval to Enhance Data Interpretation Description: Interpreting complex information from unstructured text data poses significant challenges to Large Language Models (LLM), with difficulties often arising from specialized terminology and the multifaceted relationships between entities in document architectures. Conventional Retrieval Augmented Generation (RAG) methods face limitations in capturing these nuanced interactions, leading to suboptimal performance. 
In our talk, we introduce a novel approach integrating Knowledge Graph-based RAG (GraphRAG) with VectorRAG, designed to refine question-answering (Q&A) systems for more effective information extraction from complex texts. Our approach employs a dual retrieval strategy that harnesses both knowledge graphs and vector databases, enabling the generation of precise and contextually appropriate answers, thereby setting a new standard for LLMs in processing sophisticated data. ------------------------------------ Session ID: 933671 Track: GraphRAG Speaker: Zach Blumenfeld (AI/ML Product Specialist) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agentic GraphRAG: Simplifying Retrieval Across Structured & Unstructured Data Description: Agentic workflows often become complex, brittle, and hard to maintain when they need to retrieve and reason across both structured data (typically requiring precise query execution) and unstructured data (commonly handled via vector search in RAG). In this talk, we’ll explore how mapping key information into a knowledge graph can simplify these workflows and improve retrieval quality. You’ll learn core concepts behind GraphRAG, how to integrate it into agent tools, and get access to end-to-end code examples so you can start building right away. ------------------------------------ Session ID: 912811 Track: GraphRAG Speaker: Sam Julien (Director of Developer Relations ) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 11:55 AM Session Title: When Vectors Break Down: Graph-Based RAG for Dense Enterprise Knowledge Description: Enterprise knowledge bases are filled with "dense mapping," thousands of documents where similar terms appear repeatedly, causing traditional vector retrieval to return the wrong version or irrelevant information. When our customers kept hitting this wall with their RAG systems, we knew we needed a fundamentally different approach. 
In this talk, I'll share Writer's journey developing a graph-based RAG architecture that achieved 86.31% accuracy on the RobustQA benchmark while maintaining sub-second response times, significantly outperforming vector approaches. I'll survey the key techniques behind this performance leap and why graph-based approaches excel with complex enterprise information structures like product documentation, financial documents, and technical specifications that challenge traditional RAG systems. You'll learn about using specialized LLMs to build semantic relationships, how compression techniques efficiently handle concentrated enterprise data patterns, and how infusing key data points in the memory layer of the LLM lowers hallucination. The presentation will provide practical insights into identifying when graph-based approaches make sense for your organization's specific data challenges, helping you make informed architectural decisions for your next enterprise RAG system. ------------------------------------ Session ID: 915740 Track: GraphRAG Speaker: Michael Hunger (VP of Product Innovation) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 02:00 PM Session Title: Practical GraphRAG - Making LLMs smarter with Knowledge Graphs Description: RAG has become one standard architecture component for GenAI applications to address hallucinations and integrate factual knowledge. While vector search over text is common, knowledge graphs represent a proven advancement by leveraging advanced RAG patterns to access and integrate interconnected factual information, complementing the language skills of LLMs. This talk explores GraphRAG challenges, implementation patterns, and real-world agentic examples with Google's ADK, demonstrating how this approach delivers more trustworthy and explainable GenAI solutions with enhanced reasoning capabilities. 
------------------------------------ Session ID: 900332 Track: GraphRAG Speaker: Tom Smoker (Technical Founder) Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 02:40 PM Session Title: Beyond Documents: Implementing Knowledge Graphs in Legal Agents Description: Structured Representations are pretty important in the law, where the relationships between clauses, documents, entities, and multiple parties matter. Structured Representation means Structured Context Injection. Better Context, Fewer Hallucinations. We walk through a couple of case studies of systems that we’ve built in production for legal use-cases - from recursive contractual clause retrieval, to HITL legal reasoning news agents. You'll gain insights into how structured representations significantly improve the effectiveness and reliability of legal agents. ------------------------------------ Session ID: 916063 Track: GraphRAG Speaker: Ola Mabadeje (Product Leader) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 02:20 PM Session Title: Multi-Agent AI and Network Knowledge Graphs for Change Management and Network Testing Description: Traditional ticketing and testing workflows for change management and network operations often operate independently and lack critical real-world context and adaptive decision making capabilities. This fragmented approach results in delayed resolutions, repeated incidents, escalations, and dissatisfied stakeholders. This session explores an innovative solution leveraging the synergy of natural language processing from IT Service Management (ITSM) systems, Multi-agent reasoning, and dynamic context derived from live knowledge network graphs. Attendees will gain insights into an end-to-end architecture where natural language intents from ITSM tickets seamlessly integrate with expert AI agents for complex workflow tasks, supported by continuous network knowledge graph ingestion pipelines.
Through a detailed production case study, we will demonstrate how Agentic reasoning combined with dynamic network knowledge graph contexts significantly improves critical validation and workflow interactions. The showcased results will highlight dramatic improvements in ticket resolution efficiency, accuracy of network testing, and overall execution quality, delivering tangible value to both technical teams and business stakeholders. ------------------------------------ Session ID: 915740 Track: GraphRAG Speaker: Jesús Barrasa (AI Field CTO) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 02:00 PM Session Title: Practical GraphRAG - Making LLMs smarter with Knowledge Graphs Description: RAG has become one standard architecture component for GenAI applications to address hallucinations and integrate factual knowledge. While vector search over text is common, knowledge graphs represent a proven advancement by leveraging advanced RAG patterns to access and integrate interconnected factual information, complementing the language skills of LLMs. This talk explores GraphRAG challenges, implementation patterns, and real-world agentic examples with Google's ADK, demonstrating how this approach delivers more trustworthy and explainable GenAI solutions with enhanced reasoning capabilities. ------------------------------------ Session ID: 933646 Track: GraphRAG Speaker: Jesús Barrasa (AI Field CTO) Room: Juniper: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies Description: You're trying to guide how your agents think and act. Code-orchestrated workflows are too rigid, but LLMs charting their own course feel too chaotic. When you need a middle ground, it’s time to reach for the secret weapon: ontologies. These graph-shaped fragments of actionable knowledge can fill in critical gaps. 
In this talk, we’ll explore together how ontologies bring structure, semantics, and sanity to GenAI-powered applications. You’ll learn when they’re useful, how to apply them, and what kinds of problems they help solve. Through practical examples, we’ll show how ontologies (1) guide knowledge graph construction, (2) add a semantic layer for more efficient and accurate retrieval (GraphRAG), and (3) encode domain logic you don’t want to leave up to the LLM. ------------------------------------ Session ID: 933714 Track: GraphRAG Speaker: Zach Blumenfeld (Graph Data Science & AI Specialist, Neo4j) Format: Workshop Room: Golden Gate Ballroom C: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Intro to GraphRAG Description: Learn the foundations of GraphRAG, starting with knowledge graph construction and then common retrieval patterns. ------------------------------------ Session ID: 916143 Track: GraphRAG Speaker: Mark Bain (Founder & Research Scientist) Format: Workshop Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 01:00 PM Session Title: Make Your AI Agents Remember What They Do! Description: Are you ready to give your AI agents a memory upgrade? Join us for a fast-paced workshop exploring how memory can transform your agents. What You'll Do: Learn Leading Memory Solutions: Gain practical experience with open-source tools like Neo4j, Cognee, Graphiti, and Mem0. Explore Memory Types: Understand the theory behind long-term, short-term, episodic, semantic, and other memory types. Discover Memory Benefits: Learn how memory improves recall, contextual awareness, and reasoning in autonomous agents. Compare Implementations: Get a snapshot of how different solutions implement memory—what’s built-in, flexible, and experimental. We'll also demonstrate GraphRAG memory solutions and a GraphRAG chat implemented with Google ADK. 
Whether you’re working on AI copilots, agentic workflows, or research prototypes, this workshop will help you embed real memory into your AI stack. ------------------------------------ Session ID: 933549 Track: GraphRAG Speaker: Stephen Chin (VP of Developer Relations) Room: Willow: Expo Sessions Time: 4 Jun 2025 01:30 PM Session Title: Agentic GraphRAG: AI’s Logical Edge Description: AI models are being asked to handle increasingly complex, industry-specific tasks where different retrieval approaches provide distinct advantages in accuracy, explainability, and cost to execute. GraphRAG retrieval models have become a powerful tool to solve domain-specific problems where answers require logical reasoning and correlation that can be aided by graph relationships and proximity algorithms. We will demonstrate how an agent architecture combining RAG and GraphRAG retrieval patterns can bridge the gap in data analysis, strategic planning, and retrieval to solve complex domain-specific problems. ------------------------------------ Session ID: 915740 Track: GraphRAG Speaker: Stephen Chin (VP of Developer Relations) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 02:00 PM Session Title: Practical GraphRAG - Making LLMs smarter with Knowledge Graphs Description: RAG has become one standard architecture component for GenAI applications to address hallucinations and integrate factual knowledge. While vector search over text is common, knowledge graphs represent a proven advancement by leveraging advanced RAG patterns to access and integrate interconnected factual information, complementing the language skills of LLMs. This talk explores GraphRAG challenges, implementation patterns, and real-world agentic examples with Google's ADK, demonstrating how this approach delivers more trustworthy and explainable GenAI solutions with enhanced reasoning capabilities. 
------------------------------------ Session ID: 921229 Track: GraphRAG Speaker: Alison Cossette (Data Science Strategist, Advocate, Educator) Format: Workshop Room: Golden Gate Ballroom C: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Graph Intelligence: Enhance Reasoning and Retrieval Using Graph Analytics Description: Advanced GraphRAG techniques apply graph ML and algorithms, wrapped into tidy notebooks. ------------------------------------ Session ID: 921229 Track: GraphRAG Speaker: Andreas Kollegger (GenAI Lead) Format: Workshop Room: Golden Gate Ballroom C: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Graph Intelligence: Enhance Reasoning and Retrieval Using Graph Analytics Description: Advanced GraphRAG techniques apply graph ML and algorithms, wrapped into tidy notebooks. ------------------------------------ Session ID: 915023 Track: GraphRAG Speaker: Daniel Chalef (Founder, Zep AI) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 12:15 PM Session Title: Stop Using RAG as Memory Description: RAG is great for static knowledge retrieval—but terrible at memory. Vectorstore-based systems sold as memory lack relational and temporal awareness, leading agents astray with outdated or ambiguous information. Discover how temporally-aware knowledge graphs—built by the open-source Graphiti framework—solve these limitations. You’ll learn practical strategies to maintain precise, context-rich memory, enabling agents to reason accurately about historical context and knowledge provenance. 
------------------------------------ Session ID: 914548 Track: GraphRAG Speaker: Chin Keong Lam (AI Engineer & Co-Founder ) Format: Talk Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 11:35 AM Session Title: Wisdom Discovery at Scale: Code Less KAG with n8n MultiAI Agents Description: "Wisdom Discovery at Scale: Code Less KAG with n8n MultiAI Agents" ------------------------------------ Session ID: 933706 Track: GraphRAG Speaker: Thibaut Gourdel (Senior Technical Product Marketing Manager, MongoDB) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 11:00 AM Session Title: GraphRAG: Integrating LLMs with Knowledge Graphs Description: While traditional RAG is effective, it can struggle with complex relationships and reasoning across large knowledge bases. GraphRAG, an advanced variant, addresses these challenges by leveraging knowledge graphs to enable deeper understanding and improved response accuracy. Learn how LLMs extract key entities and relationships from your data to construct a graph structure, and how the system uses graph traversal to find related entities and enrich prompts. Stay for a live demo showcasing these concepts in action. ====================================================================== --- Track: INFRASTRUCTURE (June 4) --- ====================================================================== Session ID: 916189 Track: Infrastructure Speaker: Jesse Han (Founder) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 10:10 AM Session Title: The infrastructure for the singularity Description: We're at an inflection point where AI agents are transitioning from experimental tools to practical coworkers. This new world will demand new infrastructure for RL training, test-time scaling, and deployment. This is why Morph Labs developed Infinibranch last year, and we are excited to finally unveil what's next. 
------------------------------------ Session ID: 932429 Track: Infrastructure Speaker: Ben Kus (CTO) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 4 Jun 2025 02:00 PM Session Title: Building an Agentic Platform Description: Explore the technical evolution of metadata extraction at Box and how it shaped the foundation of our AI platform. We’ll walk through our transition to an agentic-first design—why it was necessary, how we approached the rebuild, challenges we encountered along the way, and the advantages it unlocked. ------------------------------------ Session ID: 916116 Track: Infrastructure Speaker: Solomon Hykes (CEO & Co-founder of Dagger, creator of Docker) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 09:50 AM Session Title: Containing Agent Chaos Description: AI agents promise breakthroughs but often deliver operational chaos. Building reliable, deployable systems with unpredictable LLMs feels like wrestling fog – testing outputs alone is insufficient when the underlying workflow is opaque and flaky. How do we move beyond fragile prototypes? This talk, from the creator of Docker, argues the solution lies *outside* the model: engineering **reproducible execution workflows** built on rigorous architectural discipline. Learn how **containerization**, applied not just to deployment but to *each individual step* of an agent's workflow, provides the essential **isolation and environmental consistency** needed. Discover how combining this granular container approach with patterns like immutable state management allows us to **contain agent chaos**, unlock effective testing, simplify debugging, and bring essential control and predictability back to building powerful AI agents you can actually ship with confidence. ------------------------------------ Session ID: 933641 Track: Infrastructure Speaker: Kshitij Grover (Co-Founder & CTO, Orb Inc.) 
Room: Juniper: Expo Sessions Time: 4 Jun 2025 01:15 PM Session Title: Revenue Engineering: How to Price (and Reprice) Your AI Product Description: You’ve trained the model—now it’s time to train the business. This talk dives into the engineering behind pricing systems that can evolve as fast as your AI stack. Orb CTO Kshitij Grover will walk through how leading AI companies design infrastructure to support experimentation, scale, and real-world monetization constraints. Topics include: - How to meter usage and map it to pricing with accuracy and auditability - Factoring in margins and underlying costs when designing pricing strategy - Handling complexity across motions: self-serve vs. enterprise, pay-as-you-go vs. committed contracts - How to test pricing changes safely (and roll them back when needed) Whether you’re bootstrapping a pricing system from scratch or replacing a brittle V1, you’ll leave with architectural patterns and mental models to make pricing a first-class engineering concern. ------------------------------------ Session ID: 937137 Track: Infrastructure Speaker: Alex Cheema (Co-Founder) Format: Talk Room: Foothill F: Infrastructure Time: 4 Jun 2025 11:55 AM Session Title: Large Scale AI on Apple Silicon using EXO Description: The hardware lottery: when a research idea wins because it is better suited to current hardware and software, and not because it is universally superior. Machine learning researchers often treat hardware as a fixed constraint and stop exploring beyond it. Yet historically, breakthroughs have come from algorithms that best align with the dominant hardware-software stack - neural networks being a classic example. In this talk, EXO Labs co-founder Alex Cheema will share recent algorithmic improvements for running large scale AI workloads on Apple Silicon. 
Alex will demonstrate how the EXO Framework enables inference, fine-tuning, and training of large ML models on Apple Silicon, from the scale of one MacBook locally to clusters of colocated M3 Ultra Mac Studios. ------------------------------------ Session ID: 933599 Track: Infrastructure Speaker: Rohit Talluri (WW Generative AI Specialist) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. ------------------------------------ Session ID: 933652 Track: Infrastructure Speaker: Mason Egger (Sr. Developer Advocate - Temporal ) Room: Willow: Expo Sessions Time: 4 Jun 2025 01:00 PM Session Title: Events are the Wrong Abstraction for Your AI Agents Description: AI Agents are distributed systems. Agents need to connect and communicate with tools, data repositories, other agents, etc., all over a network. Event-Driven Architecture is a common pattern for facilitating this connectivity, using Events as the communication abstraction. However, this pattern introduces complexities as well, such as fragmented logic, increased latency, decreased observability, and more. But what if there were a way to get the benefits of Event-Driven Architecture without the complexities? Enter Durable Execution. In this talk, we'll discuss the pitfalls of Event-Driven Architecture, how Durable Execution solves these issues, and why Durable Execution, not Events, is the correct abstraction for building AI Agents. 
------------------------------------ Session ID: 916069 Track: Infrastructure Speaker: Iman Makaremi (Head of AI Product, Catio building Enterprise Architecture Copilot) Format: Online Talk Session Title: The Multi-agent Orchestration Stack: Building an AI Copilot with Bedrock, Flyte & LangGraph Description: As LLMs move into enterprise workflows, developers face a new kind of architecture challenge: how do you build reliable, interpretable systems powered by agents and reasoning? This talk unpacks how we designed and implemented an AI orchestration framework for enterprise architecture — combining LangGraph for multi-agent workflows, Flyte for distributed execution, and AWS Bedrock for LLM inference using Claude 3. The product: an AI copilot for enterprise architects, deeply rooted in your tech stack context. At the core of this system is a domain-specific **knowledge graph** that acts as long-term memory for the agents. It enables persistent, structured representations of architectural state, system dependencies, and business context — giving the agents the grounding they need to generate accurate recommendations, translate natural language into SQL or code, and maintain continuity across workflows. We’ll also cover how we’ve integrated observability practices — including planned OpenTelemetry instrumentation — to trace and debug autonomous AI systems in production. If you’re a developer or AI engineer thinking beyond the chatbot and looking to embed reasoning into complex system design and data tasks, this talk offers an end-to-end blueprint — from orchestration and grounding to production monitoring. 
------------------------------------ Session ID: 933719 Track: Infrastructure Speaker: Charles Frye (Developer Advocate, Modal Labs) Format: Workshop Room: Foothill F: Infrastructure Time: 4 Jun 2025 11:15 AM Session Title: What every AI engineer needs to know about GPUs Description: Every programmer needs to know a few things about hardware, like processors, memory, and disks. Due to AI systems' extreme demand for mathematical processing power, AI engineers need to know a few things about GPUs -- the world's most popular high-throughput mathematical co-processor. In this talk, I will explain the fundamental engineering constraints and design decisions that shape GPUs and trace those up to some counter-intuitive facts about the performance characteristics of AI systems, with actionable insights for their deployers and consumers. ------------------------------------ Session ID: 905305 Track: Infrastructure Speaker: Dr. Jasper Zhang, PhD (CEO) Format: Talk Room: Foothill F: Infrastructure Time: 4 Jun 2025 11:35 AM Session Title: Why We Don’t Need More Data Centers Description: AI infrastructure today is caught in an endless cycle: build more data centers, deploy more GPUs, repeat. But this approach is fundamentally flawed—expensive, inefficient, and environmentally unsustainable. In this talk, we will unpack why continuously expanding data centers masks deeper infrastructure inefficiencies, and why leveraging a GPU marketplace to dynamically allocate existing resources is essential. We will explore practical use-cases where companies scale GPU capacity flexibly, startups gain affordable compute, and idle GPUs are monetized, enabling a future of sustainable and democratized AI infrastructure. 
------------------------------------ Session ID: 916066 Track: Infrastructure Speaker: Philip Kiely (Head of Developer Relations) Format: Workshop Room: SOMA: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Introduction to LLM serving with SGLang Description: Do you want to learn how to serve models like DeepSeek and Qwen with SOTA speeds on launch day? SGLang is an open-source fast serving framework for LLMs and VLMs that generates trillions of tokens per day at companies like xAI, AMD, and Meituan. This workshop guides AI engineers who are familiar with serving models using frameworks like vLLM, Ollama, and TensorRT-LLM through deploying and optimizing their first model with SGLang, as well as providing guidance on when SGLang is the appropriate tool for LLM workloads. ------------------------------------ Session ID: 933612 Track: Infrastructure Speaker: Antje Barth (Principal Developer Advocate) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:05 PM Session Title: Building Agents at Cloud-Scale Description: Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! ------------------------------------ Session ID: 942943 Track: Infrastructure Speaker: John Welsh (Member of technical staff, Anthropic) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 11:35 AM Session Title: What we learned from shipping remote MCP support at Anthropic Description: We recently released remote MCP support for both claude.ai and the Anthropic API. 
This talk will cover architectural decisions we made in our implementation, remote MCP authentication, supporting engineers who are building out agentic AI tools, implementing custom internal transports, and whatever else we can fit into 18 minutes of your time. ------------------------------------ Session ID: 933625 Track: Infrastructure Speaker: Philipp Krenn (Code and conference monkey) Room: Willow: Expo Sessions Time: 4 Jun 2025 03:15 PM Session Title: Vector Search Benchmark[eting] Description: Every vector database out there is both faster and slower than any other competitor — if you believe all the benchmarketing out there. Let's turn the marketing into useful benchmarks that actually help you: 1. How not to benchmark (spoiler: don’t trust the glossy charts). 2. What’s uniquely tricky about benchmarking vector search. 3. How to build meaningful benchmarks tailored to your use case. PS: Yes, you will have to get your hands dirty. Never believe a benchmark that you haven't tweaked yourself. ------------------------------------ Session ID: 933610 Track: Infrastructure Speaker: Mike Chambers (AI/ML Specialist DA AWS) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 12:15 PM Session Title: Ship it! Building Production-Ready Agents Description: Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies for building and orchestrating agent systems at scale. We'll cover critical infrastructure decisions, scalability frameworks, and best practices for creating robust, production-ready agent architectures. 
------------------------------------ Session ID: 916066 Track: Infrastructure Speaker: Yineng Zhang (Inference lead at SGLang) Format: Workshop Room: SOMA: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Introduction to LLM serving with SGLang Description: Do you want to learn how to serve models like DeepSeek and Qwen with SOTA speeds on launch day? SGLang is an open-source fast serving framework for LLMs and VLMs that generates trillions of tokens per day at companies like xAI, AMD, and Meituan. This workshop guides AI engineers who are familiar with serving models using frameworks like vLLM, Ollama, and TensorRT-LLM through deploying and optimizing their first model with SGLang, as well as providing guidance on when SGLang is the appropriate tool for LLM workloads. ------------------------------------ Session ID: 912986 Track: Infrastructure Speaker: Robert Wachen (Co-founder of Etched ) Format: Talk Room: Foothill F: Infrastructure Time: 4 Jun 2025 02:40 PM Session Title: Flipping the Inference Stack: Why GPUs Bottleneck Real-Time AI at Scale Description: Current AI inference systems rely on brute-force scaling—adding more GPUs for each user—creating unsustainable compute demands and spiraling costs. Real-time use cases are bottlenecked by their latency and costs per user. In this talk, AI hardware expert and founder Robert Wachen will break down why the current approach to inference is not scalable, and how rethinking hardware is the only way to unlock real-time AI at scale. ------------------------------------ Session ID: 937936 Track: Infrastructure Speaker: Matthias Loibl (Director of Polar Signals Cloud) Room: Willow: Expo Sessions Time: 4 Jun 2025 03:30 PM Session Title: Maximize GPU Efficiency with Continuous Profiling for GPUs Description: Polar Signals Continuous Profiling for GPUs extends our industry-leading continuous profiling platform to provide deep, always-on visibility into your GPU workloads. 
Now you can see exactly how your GPUs are being utilized millisecond by millisecond. Our solution helps you move from guesswork to data-driven optimization. ------------------------------------ Session ID: 914934 Track: Infrastructure Speaker: Paul Klein IV (Founder) Format: Talk Room: SOMA: AI Architects Time: 5 Jun 2025 02:40 PM Session Title: The Web Browser Is All You Need Description: With the rise of MCP servers, A2A, and our trusty friend, OpenAPI, it turns out the web browser may be the default MCP server for the rest of the internet. In this talk, we'll walk through how a web browsing tool is probably the only tool you'll need to enable production AI Agents. ------------------------------------ Session ID: 933605 Track: Infrastructure Speaker: Mani Khanuja (Principal ML Services SA) Room: Juniper: Expo Sessions Time: 4 Jun 2025 03:30 PM Session Title: Data is Your Differentiator: Building Secure and Tailored AI Systems Description: As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using Amazon Bedrock Knowledge Bases and Amazon Bedrock Data Automation, organizations can create AI solutions that truly understand their unique business context, terminology, and requirements. Combined with Amazon Bedrock Guardrails, these capabilities enhance the accuracy and relevance of AI-generated responses, while ensuring that sensitive information remains protected within the organization's control - enabling businesses to build secure and compliant enterprise-grade generative AI solutions that accelerate time to value. 
------------------------------------ Session ID: 933656 Track: Infrastructure Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Room: Willow: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agents, Access, and the Future of Machine Identity Description: AI agents are calling APIs, submitting forms, and sending emails—but how do you control what they’re allowed to do? As agents act on behalf of users or organizations, traditional patterns like OAuth, session tokens, and role-based access often fall short. In this talk, we’ll explore how machine identity is evolving to meet this new landscape. You’ll learn: - How to think about authentication for agents (not just humans) - What it means to authorize an action when the actor is an LLM or headless service - Real-world strategies from WorkOS and Cloudflare for assigning, managing, and revoking agent identity and access By the end, you’ll walk away with practical tools and mental models to build agent-powered systems that are secure, auditable, and scalable. ------------------------------------ Session ID: 914371 Track: Infrastructure Speaker: Henry Weller (Senior Product Manager, Vector Search @ MongoDB) Format: Talk Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 03:00 PM Session Title: Building Vector Search Experiences with MongoDB: Access patterns, data models, and scaling considera Description: This talk will explore typical and forward-looking use cases for Atlas Vector Search, as well as how different types of data models and query patterns can be implemented and effectively scaled to meet the needs of those use cases. There will be a focus on the "Iron Triangle of Search" balancing accuracy, speed, and cost and talking about practical considerations that emerge within those use cases. 
This will be a technical talk focused on the "how" of Atlas Vector Search and considerations when building information retrieval systems given by a technical PM, not a sales pitch explaining how basic vector retrieval "solves" hallucinations. ------------------------------------ Session ID: 915471 Track: Infrastructure Speaker: Kyle Kranen US (Engineering Manager - Deep Learning Algorithms ) Format: Talk Room: Foothill F: Infrastructure Time: 4 Jun 2025 02:20 PM Session Title: Hacking the Inference Pareto Frontier for Cheaper and Faster Tokens Without Breaking SLAs Description: Your model works! It aces the evals! It even passes the vibe check! All that’s required is inference, right? Oops, you’ve just stepped into a minefield: -Not low-latency enough? Choppy experience. Users churn from your app. -Not cheap enough? You’re losing money on every query. -Not high enough output quality? Your system can’t be used for that application. A model and the inference system around it form a “token factory” associated with a Pareto frontier— a curve representing the best possible trade-offs between cost, throughput, latency and quality, outside of which your LLM system cannot be applied successfully. Outside of the Pareto frontier? You’re back to square one. That is, unless you’re able to change the shape of the Pareto frontier. 
In this session, we’ll introduce NVIDIA Dynamo, a datacenter-scale distributed inference framework as well as the bleeding-edge techniques it enables to hack the Pareto frontier of your inference systems, including: -Disaggregation - separating phases of LLM generation to make them more efficient -Speculation - predicting multiple tokens per cycle -KV routing, storage, and manipulation - ensuring that we don’t redo work that has already been done -Pipelining improvements for agents - accelerating our workflows using information about the agent By the end of the talk, we’ll understand how the Pareto frontier limits where models can be applied, the intuition behind how inference techniques can be used to modify it, as well as the mechanics of how these techniques work. ------------------------------------ Session ID: 948652 Track: Infrastructure Speaker: JingXiang Mo (Founding Engineer @ K-Scale Labs, Robotics Product & Engineering Lead) Format: Talk Room: Foothill E: Autonomy + Robotics Time: 5 Jun 2025 02:40 PM Session Title: Scaling Open-source Humanoid Robots Description: Introducing developer ready robots that are open-source, affordable, and easy to use. ------------------------------------ Session ID: 928676 Track: Infrastructure Speaker: Dylan Patel (Chief Analyst) Format: Talk Room: Foothill F: Infrastructure Time: 4 Jun 2025 02:00 PM Session Title: [Infra Keynote] Geopolitics of AI Infrastructure Description: As AI reshapes the global balance of power, the infrastructure behind it—chips, data centers, power, and supply chains—has become a new arena for geopolitical competition. This talk explores how nations are racing to secure critical AI hardware, control compute capacity, and assert influence over the technologies and talent that define the future. 
------------------------------------ Session ID: 933599 Track: Infrastructure Speaker: Arjun Desai (Co-Founder, Cartesia) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. ====================================================================== --- Track: MCP (June 4) --- ====================================================================== Session ID: 911821 Track: MCP Speaker: Damien Murphy (Founding Engineer) Format: Workshop Room: Foothill G1&2: Workshops Time: 3 Jun 2025 09:00 AM Session Title: A2A & MCP: Automating Business Processes with LLMs Description: Ever wished your webhooks could think for themselves? Join us to discover how A2A agents can transform passive webhook endpoints into intelligent workflow processors. In this session, we'll show you how to build a system that automatically spawns AI Agents to handle incoming webhooks. Using Google's Agent-to-Agent framework and MCP, you'll learn how to create dynamic AI agents that respond to events, communicate with external services, and make decisions based on content analysis. See the future of workflow automation where webhooks don't just trigger actions—they trigger intelligence! ------------------------------------ Session ID: 933709 Track: MCP Speaker: Tobin South (Secure AI Agents & MCP, WorkOS) Room: Willow: Expo Sessions Time: 4 Jun 2025 10:55 AM Session Title: What does Enterprise Ready MCP mean? Description: Everyone is building MCP servers: from Slack integrations to personal data tools. They're good demos, but not ready to turn into production. 
So, what does it take to make MCP *enterprise-ready?* We're going to cover the end-to-end process of getting a hacky MCP server authenticated, permissioned, and secure. We'll talk about registries, SSO, audit logs, agent identifiers, autonomy for agents, and oversight. Oh, and we'll use MCP to buy some stuff. Come learn the stack needed to scale your MCP to the enterprise and some fun hacks along the way. ------------------------------------ Session ID: 933686 Track: MCP Speaker: Michael Grinich (Founder & CEO, WorkOS) Format: Talk Room: SOMA: AI Architects Time: 5 Jun 2025 02:00 PM Session Title: CIAM for AI: Who Are Your Agents and What Can They Do? Description: AI agents are changing the way modern SaaS products operate. Whether automating workflows, integrating with APIs, or acting on behalf of users, AI-driven assistants and autonomous systems are becoming core product features. But securing these agents presents a fundamental challenge: How do you authenticate AI agents? How do you control what they can access? How do you ensure they act within the right permissions? This talk will explore these concepts and more while highlighting current research and best practices. ------------------------------------ Session ID: 936816 Track: MCP Speaker: Den Delimarsky (DEVDIV) (Principal Product Engineer) Format: Talk Room: Nobhill C&D: Microsoft Time: 5 Jun 2025 12:45 PM Session Title: Building Protected MCP Servers Description: Join us to see how VS Code and GitHub Copilot's expanding suite of AI features can match or even surpass the benefits of other popular AI developer tools. 
We'll focus on practical scenarios to ensure immediate applicability and work through live demos of Copilot features such as: Code generation using Edits, Planning/problem solving using Chat, Inline terminal command generation, Boilerplate code generation using Agent mode, Improving boilerplate with custom instructions and then refactoring using Agent mode and Edits, Improving test generation and code reviews with custom instructions, as well as an Introduction to MCP. ------------------------------------ Session ID: 911925 Track: MCP Speaker: Samuel Colvin (Founder of Pydantic) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 02:00 PM Session Title: MCP is all you need Description: Everyone is talking about agents, and right after that, they’re talking about agent-to-agent communications. Not surprisingly, various nascent, competing protocols are popping up to handle it. But maybe all we need is MCP — the OG of GenAI communication protocols (it's from way back in 2024!). Last year, Jason Liu gave the second most watched AIE talk — “Pydantic is all you need”. This year, I (the creator of Pydantic) am continuing the tradition by arguing that MCP might be all we need for agent-to-agent communications. What I’ll cover: - Misusing Common Patterns: MCP was designed for desktop/IDE applications like Claude Code and Cursor. How can we adapt MCP for autonomous agents? - Many Common Problems: MCP is great, but what can go wrong? How can you work around it? Can the protocol be extended to solve these issues? - Monitoring Complex Phenomena: How does observability work (and not work) with MCP? - Multiple Competing Protocols: A quick run-through of other agent communication protocols like A2A and AGNTCY, and probably a few more by June 😴 - Massive Crustaceans Party: What might success look like if everything goes to plan? ------------------------------------ Session ID: 933607 Track: MCP Speaker: Duan Lightfoot (AWS, Sr. 
Cloud Networking Developer Advocate) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Building Agents with Amazon Nova Act and MCP Description: In this 2-hour workshop, participants will gain practical hands-on experience building sophisticated AI agents using Amazon's agent technologies. You'll learn to build agents that can navigate the web like humans, perform complex multi-step tasks, and leverage specialized tools through natural language commands. You’ll explore Amazon Nova Act for reliable web navigation, Model Context Protocol (MCP) for connecting agents to external data sources and APIs, and Amazon Bedrock Agents for orchestrating complex workflows. Through guided exercises, you'll create agents capable of retrieving information and taking action across web applications, all through natural language interactions. By the end of this workshop, you'll have the practical skills to build AI agents that can browse websites, interact with web interfaces, and solve multi-step problems by combining these powerful Amazon technologies. ------------------------------------ Session ID: 933612 Track: MCP Speaker: Antje Barth (Principal Developer Advocate) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:05 PM Session Title: Building Agents at Cloud-Scale Description: Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! 
------------------------------------ Session ID: 942943 Track: MCP Speaker: John Welsh (Member of technical staff, Anthropic) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 11:35 AM Session Title: What we learned from shipping remote MCP support at Anthropic Description: We recently released remote MCP support for both claude.ai and the Anthropic API. This talk will cover architectural decisions we made in our implementation, remote MCP authentication, supporting engineers who are building out agentic AI tools, implementing custom internal transports, and whatever else we can fit into 18 minutes of your time. ------------------------------------ Session ID: 933626 Track: MCP Speaker: Philipp Krenn (Code and conference monkey) Format: Online Talk Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 01:00 PM Session Title: Hope is Not a Strategy: Retrieval Patterns for MCP Description: MCP is a solid integration layer — but how does it hold up when it comes to output quality? Often, not as well as you'd like. Here are some practical retrieval patterns, from basic to advanced, that worked well in my experiments: * Naive: Just plug in plain MCP and hope the LLM gets it right. Sometimes it does. Sometimes you’ll need a miracle. * Semantic: Add more descriptive field names and extra metadata. It helps — but usually just a bit. * Templated: Use a structured template and have the LLM fill it out step by step. More effort, but by far the most reliable results. ------------------------------------ Session ID: 947165 Track: MCP Speaker: Theodora Chu (Product Manager for MCP @ Anthropic) Format: Keynote Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 11:15 AM Session Title: MCP Origins & RFS Description: Learn more about the latest updates on MCP and get ideas for what startups to build. 
------------------------------------ Session ID: 933610 Track: MCP Speaker: Mike Chambers (AI/ML Specialist DA AWS) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 12:15 PM Session Title: Ship it! Building Production-Ready Agents Description: Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies for building and orchestrating agent systems at scale. We'll cover critical infrastructure decisions, scalability frameworks, and best practices for creating robust, production-ready agent architectures. ------------------------------------ Session ID: 916149 Track: MCP Speaker: Henry Mao (Founder @ Smithery.ai) Format: Online Talk Session Title: Are MCPs Overhyped? A Rant about MCPs Description: AI agents are becoming smarter but lack the broad capability to take action in practice. At Smithery, we believe the missing link is an AI orchestration layer—a unified interface that gives agents context, action, and a way to learn from real interactions. This talk explores the problem space in the Model Context Protocol (MCP) ecosystem and how we're tackling it at Smithery. ------------------------------------ Session ID: 935982 Track: MCP Speaker: Kent C. Dodds (Software Engineer Educator) Format: Online Talk Session Title: Letting AI Interface with Your App with MCP Description: We are entering a new era of user interaction. It's being built right before our very eyes and changing rapidly. As crazy as it sounds, soon each one of us will get our own Jarvis capable of performing actually useful tasks for us with a completely different user interaction mechanism than we're used to. But someone's gotta give Jarvis the tools to perform these tasks, and that's where we come in. 
In this talk, Kent will demonstrate an MCP server with an AI assistant to help us catch the vision of what this future could look like and our role in it. ------------------------------------ Session ID: 933688 Track: MCP Speaker: Zack Proser (Open source hacker. Dev Education at WorkOS) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 03:30 PM Session Title: AI Pipelines and Agents in Pure TypeScript with Mastra.ai Description: This hands-on workshop introduces Mastra.ai, a TypeScript framework that streamlines the development of agentic AI systems compared to traditional approaches using LangChain and vector databases. Participants will learn to build structured AI workflows with composable tools and reliable control, enabling them to create internal AI assistants that can handle requests like data cleaning, email drafting, and document summarization with minimal code. The session covers Mastra installation, running a local MCP server, defining tools and agents in TypeScript, using the Mastra playground, and implementing practical examples such as RAG setups and tool-chaining agents—all designed to equip attendees with the skills to develop scalable AI-driven internal tools based on sound software engineering principles rather than just experimental prompts. ------------------------------------ Session ID: 933688 Track: MCP Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 03:30 PM Session Title: AI Pipelines and Agents in Pure TypeScript with Mastra.ai Description: This hands-on workshop introduces Mastra.ai, a TypeScript framework that streamlines the development of agentic AI systems compared to traditional approaches using LangChain and vector databases. 
Participants will learn to build structured AI workflows with composable tools and reliable control, enabling them to create internal AI assistants that can handle requests like data cleaning, email drafting, and document summarization with minimal code. The session covers Mastra installation, running a local MCP server, defining tools and agents in TypeScript, using the Mastra playground, and implementing practical examples such as RAG setups and tool-chaining agents—all designed to equip attendees with the skills to develop scalable AI-driven internal tools based on sound software engineering principles rather than just experimental prompts. ------------------------------------ Session ID: 933656 Track: MCP Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Room: Willow: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agents, Access, and the Future of Machine Identity Description: AI agents are calling APIs, submitting forms, and sending emails—but how do you control what they’re allowed to do? As agents act on behalf of users or organizations, traditional patterns like OAuth, session tokens, and role-based access often fall short. In this talk, we’ll explore how machine identity is evolving to meet this new landscape. You’ll learn: - How to think about authentication for agents (not just humans) - What it means to authorize an action when the actor is an LLM or headless service - Real-world strategies from WorkOS and Cloudflare for assigning, managing, and revoking agent identity and access By the end, you’ll walk away with practical tools and mental models to build agent-powered systems that are secure, auditable, and scalable. 
------------------------------------ Session ID: 926313 Track: MCP Speaker: Jan Curn (CEO) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 02:40 PM Session Title: The rise of the agentic economy on the shoulders of MCP Description: Thanks to MCP and all the MCP server directories, agents can now autonomously discover new tools and other agents. This lays down the foundation for the future agentic economy, where businesses will sell to autonomous agents (B2A) and eventually agents will sell to other agents (A2A). But one key part is still missing: agents do not have a standard way to subscribe to external services and pay for them. In this talk, we’ll show how to give agents full autonomy to discover and pay for new external MCP-enabled services, even if those services don’t support it, using a little-known MCP server nesting capability. We’ll also cover how to monetize AI agents and the B2A/A2A business models. ------------------------------------ Session ID: 915013 Track: MCP Speaker: Benjamin Eckel (Co-Founder of Dylibso) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 02:20 PM Session Title: Observable tools - the state of MCP observability Description: AI Engineers deserve observable tools! MCP getting adoption means that less and less of your agent's code is running under your control, and this has DX and observability challenges. Let's fix that! Join Alex Volkov from Weights & Biases and Steve Manuel from mcp.run on this recap of the current state of MCP observability, including the observable.tools initiative, a recap of where the field stands and what to look forward to + a practical example of an MCP tool usage evaluation framework from mcp.run! 
------------------------------------ Session ID: 915013 Track: MCP Speaker: Steve Manuel (CEO) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 02:20 PM Session Title: Observable tools - the state of MCP observability Description: AI Engineers deserve observable tools! MCP getting adoption means that less and less of your agent's code is running under your control, and this has DX and observability challenges. Let's fix that! Join Alex Volkov from Weights & Biases and Steve Manuel from mcp.run on this recap of the current state of MCP observability, including the observable.tools initiative, a recap of where the field stands and what to look forward to + a practical example of an MCP tool usage evaluation framework from mcp.run! ------------------------------------ Session ID: 915013 Track: MCP Speaker: Alex Volkov (AI Evangelist) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 02:20 PM Session Title: Observable tools - the state of MCP observability Description: AI Engineers deserve observable tools! MCP getting adoption means that less and less of your agent's code is running under your control, and this has DX and observability challenges. Let's fix that! Join Alex Volkov from Weights & Biases and Steve Manuel from mcp.run on this recap of the current state of MCP observability, including the observable.tools initiative, a recap of where the field stands and what to look forward to + a practical example of an MCP tool usage evaluation framework from mcp.run! ------------------------------------ Session ID: 947995 Track: MCP Speaker: David Cramer (Founder, Sentry) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 12:15 PM Session Title: MCP isn’t good, yet Description: You’ve heard a lot about MCP, probably been given an AI mandate or two, and are trying to figure out what’s real and what’s make believe. 
This session will give practical advice for how you should be thinking about MCP, the implementation pitfalls, and where the speaker thinks things are going. ------------------------------------ Session ID: 914489 Track: MCP Speaker: Harald Kirschner (VS Code Team Member) Format: Talk Room: Yerba Buena Ballroom Salons 7-8: MCP Time: 4 Jun 2025 11:55 AM Session Title: Full Spectrum MCP: Uncovering Hidden Servers and Clients Capabilities Description: The true power of Model Context Protocol emerges when clients and servers collaborate across the full spectrum of the specification. This talk presents practical examples of how VS Code's comprehensive implementation of MCP transforms the capabilities of AI assistants, making them more contextual, efficient, and user-friendly. We'll showcase advanced features like dynamic tool discovery and workspace-aware roots, demonstrating how they create experiences impossible with standard tool integrations while confronting the reality gap between MCP's theoretical potential and practical implementation challenges. ====================================================================== --- Track: REASONING+RL (TBA) --- ====================================================================== Session ID: 947233 Track: Reasoning+RL Speaker: Jack Rae (Principal Research Scientist) Format: Keynote Session Title: (tbc) Gemini Thinking and the Future of Reasoning Description: Jack Rae, Principal Research Scientist at Google DeepMind, will keynote on Gemini Thinking and the future of reasoning in AI: advances, challenges, and what’s next for reasoning capabilities in next-gen foundation models. 
------------------------------------ Session ID: 916189 Track: Reasoning+RL Speaker: Jesse Han (Founder) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 10:10 AM Session Title: The infrastructure for the singularity Description: We're at an inflection point where AI agents are transitioning from experimental tools to practical coworkers. This new world will demand new infrastructure for RL training, test-time scaling, and deployment. This is why Morph Labs developed Infinibranch last year, and we are excited to finally unveil what's next. ------------------------------------ Session ID: 929509 Track: Reasoning+RL Speaker: Daniel Han (CEO) Format: Workshop Room: Foothill C: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Advanced: Reinforcement Learning, Kernels, Reasoning, Quantization & Agents Description: Why is Reinforcement Learning (RL) suddenly everywhere, and is it truly effective? Have LLMs hit a plateau in terms of intelligence and capabilities, or is RL the breakthrough they need? In this workshop, we'll dive into the fundamentals of RL, what makes a good reward function, and how RL can help create agents. We'll also talk about kernels: are they still worth your time, and what should you focus on? And finally, we’ll explore how LLMs like DeepSeek-R1 can be quantized down to 1.58-bits and still perform well, along with techniques to maintain accuracy. ------------------------------------ Session ID: 935461 Track: Reasoning+RL Speaker: Logan Kilpatrick (Product, Google DeepMind) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 09:05 AM Session Title: A year of Gemini progress + what comes next Description: Over the last year, Google and Gemini models have shown rapid progress across all dimensions (model, product, etc). Let's highlight all the work that has happened, how we got the world's best models, and where we are going next (across both the model landscape and our AI products). 
------------------------------------ Session ID: 939640 Track: Reasoning+RL Speaker: Christian Szegedy (Former co-founder of xAI, discoverer of adversarial examples) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 02:40 PM Session Title: Towards Verified Superintelligence Description: I describe a new paradigm towards open-endedly self-improving intelligence by scaling verification to remove the human data and supervision bottleneck. The objective is to achieve trustless alignment of superintelligence. ------------------------------------ Session ID: 914856 Track: Reasoning+RL Speaker: Will Brown (Research Engineering Lead) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 11:15 AM Session Title: Training Agentic Reasoners Description: This talk will be a technical deep dive into RL for agentic reasoning via multi-turn tool calling, similar to OpenAI's o3 and Deep Research. In particular, we'll cover: - When, why, and how - GRPO vs PPO vs etc - Designing environments and rewards - Survey of recent research highlights - Results on example tasks - Overview of open-source ecosystem (libraries, compute requirements, tradeoffs, etc.) ------------------------------------ Session ID: 916074 Track: Reasoning+RL Speaker: Ryan Marten (Founding Engineer at Bespoke Labs) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 12:15 PM Session Title: OpenThoughts: Data Recipes for Reasoning Models Description: Peel back the curtain on state of the art model post-training through the story of OpenThinker, a SOTA small reasoning model (outperforming DeepSeek distill), built in the open. Learn about the dataset recipe used to build the strongest reasoning models which you can apply to your own domain-specific specialized reasoning models. 
Hear about the strategies that scale (and that don't) based on our rigorous experimentation on the journey from thousands of data points (Bespoke-Stratos) to millions of data points (OpenThinker3). Build upon our open source engineering solutions for large-scale synthetic data generation, training on multiple supercomputing clusters, and building out fast reliable evaluations. ------------------------------------ Session ID: 914786 Track: Reasoning+RL Speaker: Greg Kamradt (President) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 11:35 AM Session Title: Measuring AGI: Interactive Reasoning Benchmarks Description: ARC Prize Foundation is building the North Star for AGI: rigorous, open benchmarks that track reasoning progress in modern AI. We'll show why static AGI evaluations are useful, but fall short when comparing models to human intelligence. Sneak peek of ARC-AGI-3: a dynamic, game-like benchmark launching Q1 '26. ------------------------------------ Session ID: 939091 Track: Reasoning+RL Speaker: Junyang Lin (Alibaba Qwen) Format: Online Talk Session Title: Qwen: Towards a Generalist Model / Agent Description: Since Alibaba launched the Qwen series of large models in 2023, the Qwen series of large language models and multimodal large models have been continuously updated and improved. This presentation will introduce the latest developments in the Qwen series of models, including the large language model Qwen3, vision-language large model Qwen2.5-VL, omni model Qwen2.5-Omni, etc. Additionally, this presentation will also cover the future development directions of the Qwen series. 
------------------------------------ Session ID: 930540 Track: Reasoning+RL Speaker: Ilan Bigio (Developer Experience) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Model-Maxxing: RFT, DPO, SFT (Fine-tuning with OpenAI) Description: Covering all forms of fine-tuning and prompt engineering, like SFT, DPO, RFT, prompt engineering / optimization, and agent scaffolding. ------------------------------------ Session ID: 926721 Track: Reasoning+RL Speaker: Nathan Lambert (Research Lead) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 02:20 PM Session Title: A taxonomy for next-generation reasoning models Description: Current AI models are extremely skilled, as seen in the step change in evaluation scores across the industry in the first half of 2025, but often fail when presented with even medium time-horizon tasks. This talk presents a taxonomy of 4 traits of reasoning models -- skills, calibration, strategy, and abstraction -- that will be crucial to creating the next generation of AI applications. With this, we focus on the latter two, strategy and abstraction, and discuss how these traits will enable long-horizon and reliable agents. The talk concludes with a scenario where these agentic behaviors are the foundation for RL continuing to scale in the coming years and post-training techniques reaching compute parity with pretraining methods sooner rather than later. ------------------------------------ Session ID: 914533 Track: Reasoning+RL Speaker: Kyle Corbitt (CEO) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 02:00 PM Session Title: How to Train Your Agent: Building Reliable Agents with RL Description: Have you ever launched an awesome agentic demo, only to realize no amount of prompting will make it reliable enough to deploy in production? Agent reliability is a famously difficult problem to solve! 
In this talk we’ll learn how to use GRPO to help your agent learn from its successes and failures and improve over time. We’ve seen dramatic results with this technique, such as an email assistant agent whose success rate jumped from 74% to 94% after replacing o4-mini with an open source model optimized using GRPO. We’ll share case studies as well as practical lessons learned around the types of problems this works well for and the unexpected pitfalls to avoid. ====================================================================== --- Track: RECSYS (TBA) --- ====================================================================== Session ID: 906567 Track: RecSys Speaker: Devansh Tandon (Principal Product Manager) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 02:40 PM Session Title: Teaching Gemini to Speak YouTube: Adapting LLMs for Video Recommendations to 2B+ DAU Description: YouTube recommendations drive the majority of video watch time for billions of daily users. Traditionally powered by large embedding models (LEMs), we're undertaking a fundamental shift: rebuilding our recommendation stack using foundation models like Gemini. This talk dives into our engineering journey adapting general-purpose LLMs (Gemini) for the highly specialized, dynamic, and massive-scale task of YouTube recommendations. We'll discuss: - SemanticID: creating a "language" for YouTube videos, from our paper last year – Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations - Adapting Gemini checkpoints to understand SemanticID - Generative Video Retrieval with prompts There’s a lot of attention on the LLM-led transformation of Search (with AI Overviews, Perplexity, ChatGPT-Search etc). However, across large consumer apps, it’s the recommendation systems & feeds that drive most consumer engagement, not just search. 
This talk is about the LLM-led transformation of recommendations & feeds – building a recommendation engine on top of Gemini. ------------------------------------ Session ID: 936205 Track: RecSys Speaker: Hamed Firooz (Principal Scientist -- LinkedIn Core AI) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 11:55 AM Session Title: 360Brew LLM-based Foundation Model for Personalized Ranking and Recommendation Description: We will give a talk about our journey of building a foundation model for solving ranking and recommendation tasks across the LinkedIn platform. ------------------------------------ Session ID: 929231 Track: RecSys Speaker: Vinesh Gudla (Staff Machine Learning Engineer, Instacart) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 02:20 PM Session Title: How Instacart transformed its search and discovery using an LLM-driven approach Description: - Learn how Instacart uses cutting-edge LLMs to redefine search and product discovery. - Explore innovative solutions overcoming traditional search engine limitations for grocery shopping. - Discover how LLMs enhance user intent understanding and generate engaging content. - See practical applications of LLM technology to improve search relevance and user experience. 
------------------------------------ Session ID: 936205 Track: RecSys Speaker: Maziar Sanjabi (LinkedIn AI, Principal Scientist) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 11:55 AM Session Title: 360Brew LLM-based Foundation Model for Personalized Ranking and Recommendation Description: We will give a talk about our journey of building a foundation model for solving ranking and recommendation tasks across the LinkedIn platform. ------------------------------------ Session ID: 932498 Track: RecSys Speaker: Mukuntha Narayanan (Machine Learning Engineer) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 11:35 AM Session Title: What We Learned from Using LLMs in Pinterest Search Description: Pinterest Search integrates Large Language Models (LLMs) to enhance relevance scoring by combining search queries with rich multimodal content, including visual captions, link-based text, and user curation signals. A semi-supervised learning framework enables scaling to large and multilingual datasets, going beyond English and limited human labels. These LLM-driven models are distilled into efficient architectures for real-time serving, with experimental validation and large-scale deployment demonstrating substantial improvements in search relevance for Pinterest users worldwide. ------------------------------------ Session ID: 932583 Track: RecSys Speaker: Yesu Feng (Netflix, Staff Research Scientist) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 02:00 PM Session Title: One model to rule recommendations: Netflix's Big Bet Description: Discuss the foundation model strategy for personalization at Netflix based on this post https://netflixtechblog.com/foundation-model-for-personalized-recommendation-1a0bd8e02d39 and recent developments. 
------------------------------------ Session ID: 932498 Track: RecSys Speaker: Han Wang (Machine Learning Engineer) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 11:35 AM Session Title: What We Learned from Using LLMs in Pinterest Search Description: Pinterest Search integrates Large Language Models (LLMs) to enhance relevance scoring by combining search queries with rich multimodal content, including visual captions, link-based text, and user curation signals. A semi-supervised learning framework enables scaling to large and multilingual datasets, going beyond English and limited human labels. These LLM-driven models are distilled into efficient architectures for real-time serving, with experimental validation and large-scale deployment demonstrating substantial improvements in search relevance for Pinterest users worldwide. ------------------------------------ Session ID: 929231 Track: RecSys Speaker: Tejaswi Tenneti (Director of Machine Learning) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 02:20 PM Session Title: How Instacart transformed its search and discovery using an LLM-driven approach Description: - Learn how Instacart uses cutting-edge LLMs to redefine search and product discovery. - Explore innovative solutions overcoming traditional search engine limitations for grocery shopping. - Discover how LLMs enhance user intent understanding and generate engaging content. - See practical applications of LLM technology to improve search relevance and user experience. 
------------------------------------ Session ID: 929337 Track: RecSys Speaker: Eugene Yan (Principal Applied Scientist) Format: Talk Room: Golden Gate Ballroom A: LLM RecSys Time: 4 Jun 2025 11:15 AM Session Title: Recsys Keynote: Improving Recommendation Systems & Search in the Age of LLMs Description: Recommendation systems and search have long adopted advances in language modeling, from early adoption of Word2vec for embedding-based retrieval to the transformative impact of GRUs, Transformers, and BERT on predicting user interactions. Now, the rise of large language models (LLMs) is inspiring innovations in model architecture, scalable system designs, and richer customer experiences. In this keynote, we'll dive into cutting-edge industry applications of LLMs in recommendation and search systems, exploring real-world implementations and measurable outcomes. Join us for a look at current trends and an exciting vision of how LLM-driven techniques will shape the future of content discovery and intelligent search. ====================================================================== --- Track: RETRIEVAL+SEARCH (TBA) --- ====================================================================== Session ID: 913839 Track: Retrieval+Search Speaker: Julia Neagu (CEO) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:55 AM Session Title: Evaluating AI Search: A Practical Framework for Augmented AI Systems Description: AI search is becoming the front door to information, whether through Retrieval-Augmented Generation (RAG), Search-Augmented Generation (SAG), or custom agents that synthesize answers on top of indexed content. As users rely more heavily on these systems, evaluating their quality becomes mission-critical. But traditional metrics like precision and recall don’t capture the full picture. 
In this talk, we introduce a practical evaluation framework for AI-powered search, across three dimensions: - Are the retrieved sources relevant to the query? - And is the final answer complete? - Are the sources faithfully used in the generated answer? We’ll share lessons from working with search companies and present early findings from a new benchmark evaluating popular augmented AI systems across these dimensions. Rather than ranking winners and losers, we explore where different systems excel or break down, and how these tradeoffs inform product decisions. This talk is for AI engineers and product teams who want to build trusted, high-quality AI search experiences, and need a way to measure if it’s actually working. ------------------------------------ Session ID: 903966 Track: Retrieval+Search Speaker: Chang She (CEO) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:35 AM Session Title: Scaling Enterprise-Grade RAG Systems: Lessons from the Legal Frontier Description: In domains like law, compliance, and tax, building enterprise-grade RAG means very large scale, spikey workloads, a focus on accuracy, and non-negotiable privacy. In this talk, we'll share war stories and battle scars of how Harvey has built the world's most advanced AI agents for the legal profession on top of a highly optimized retrieval architecture. We'll cover how to get better retrieval via both sparse and dense retrieval methods, why domain-specific reranking is essential, and how to handle ambiguity in real-world queries. We'll also touch on how LanceDB's search engine enables this architecture by delivering low-latency, high-throughput retrieval across millions of documents of varying sizes without compromising privacy. This solid foundation enables Harvey to build a product that brings highly accurate answers to hundreds of law firms and professional services firms across 45 countries. 
------------------------------------ Session ID: 913839 Track: Retrieval+Search Speaker: Deanna Emery (Founding AI Researcher) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:55 AM Session Title: Evaluating AI Search: A Practical Framework for Augmented AI Systems Description: AI search is becoming the front door to information, whether through Retrieval-Augmented Generation (RAG), Search-Augmented Generation (SAG), or custom agents that synthesize answers on top of indexed content. As users rely more heavily on these systems, evaluating their quality becomes mission-critical. But traditional metrics like precision and recall don’t capture the full picture. In this talk, we introduce a practical evaluation framework for AI-powered search, across three dimensions: - Are the retrieved sources relevant to the query? - And is the final answer complete? - Are the sources faithfully used in the generated answer? We’ll share lessons from working with search companies and present early findings from a new benchmark evaluating popular augmented AI systems across these dimensions. Rather than ranking winners and losers, we explore where different systems excel or break down, and how these tradeoffs inform product decisions. This talk is for AI engineers and product teams who want to build trusted, high-quality AI search experiences, and need a way to measure if it’s actually working. ------------------------------------ Session ID: 933671 Track: Retrieval+Search Speaker: Zach Blumenfeld (AI/ML Product Specialist) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agentic GraphRAG: Simplifying Retrieval Across Structured & Unstructured Data Description: Agentic workflows often become complex, brittle, and hard to maintain when they need to retrieve and reason across both structured data (typically requiring precise query execution) and unstructured data (commonly handled via vector search in RAG). 
In this talk, we’ll explore how mapping key information into a knowledge graph can simplify these workflows and improve retrieval quality. You’ll learn core concepts behind GraphRAG, how to integrate it into agent tools, and get access to end-to-end code examples so you can start building right away. ------------------------------------ Session ID: 915338 Track: Retrieval+Search Speaker: Chau Tran (AI engineer at Glean) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 02:00 PM Session Title: How to build Enterprise-aware agents Description: While LLMs demonstrated impressive reasoning capabilities, their out-of-the-box reasoning is akin to hiring a brilliant but brand-new employee who doesn’t have the enterprise context of “how things are done at this company”. In this talk, I'll introduce “Workflow Search” as a paradigm to build enterprise-aware agents that can balance predictability on common tasks, and flexibility on unforeseen tasks. ------------------------------------ Session ID: 907695 Track: Retrieval+Search Speaker: David Karam (CEO) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 02:40 PM Session Title: Layering every technique in RAG, one query at a time Description: Start with the simplest Search - in-memory embeddings with relevance ranking. End with the most complex planet-scale Search - 70+ corpus mix of token, embeddings, and knowledge graphs, all jointly retrieved, custom ranked, joint re-ranked, and then LLM-processed, at 160,000 queries per second in under 200msec. This talk will be a fun “one query at a time” survey of all techniques in RAG in incremental complexity, showing the limits of each technique and what the next layered one opens up in terms of capabilities to handle ever-more complex queries in RAG. 
You’ll learn why queries like [falafel] are notoriously hard to Search over, why chunking your documents can be disastrous, how you can sometimes get away with a simple BM25, and how some Search problems are so hard to solve that you’re better off punting the problem to the LLM or the UX. Brought to you by the team that worked on 50+ Search products, in the context of Google.com and custom Enterprise Search. ------------------------------------ Session ID: 903966 Track: Retrieval+Search Speaker: Calvin Qi (Tech Lead Manager) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:35 AM Session Title: Scaling Enterprise-Grade RAG Systems: Lessons from the Legal Frontier Description: In domains like law, compliance, and tax, building enterprise-grade RAG means very large scale, spikey workloads, a focus on accuracy, and non-negotiable privacy. In this talk, we'll share war stories and battle scars of how Harvey has built the world's most advanced AI agents for the legal profession on top of a highly optimized retrieval architecture. We'll cover how to get better retrieval via both sparse and dense retrieval methods, why domain-specific reranking is essential, and how to handle ambiguity in real-world queries. We'll also touch on how LanceDB's search engine enables this architecture by delivering low-latency, high-throughput retrieval across millions of documents of varying sizes without compromising privacy. This solid foundation enables Harvey to build a product that brings highly accurate answers to hundreds of law firms and professional services firms across 45 countries.
------------------------------------ Session ID: 914024 Track: Retrieval+Search Speaker: Will Bryk (CEO & Co-founder, building perfect search at Exa) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 02:20 PM Session Title: Building a Smarter AI Agent with Neural RAG Description: RAG quality for AI agents is critical, and traditional keyword-based search engines consistently underperform in agentic or multi-step tasks, where semantic grounding and contextual nuance matter most. In this talk, Will Bryk, CEO of Exa, will live-code two AI agent applications: one using traditional keyword search RAG and one using neural network RAG via vector search. He’ll then evaluate both applications based on task performance, relevance, and latency. With a live demo (no theory or pre-baked applications), the audience will get a firsthand look at the practical differences between keyword and semantic systems in production, and learn embedding strategies, indexing trade-offs, hybrid retrieval techniques, prompt tuning, and more. ------------------------------------ Session ID: 933678 Track: Retrieval+Search Speaker: Tengyu Ma (Chief AI Scientist, MongoDB) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 12:15 PM Session Title: RAG in 2025: State of the Art and the Road Forward Description: The talk will have three parts: 1. Roadmap debate: RAG vs. finetuning vs.
long-context 2. RAG today: benefits, challenges, and current solutions 3. RAG tomorrow: AI models do more work ------------------------------------ Session ID: 933721 Track: Retrieval+Search Speaker: Suman Debnath (Principal Developer Advocate, AI/ML, AWS) Format: Workshop Room: Foothill G1&2: Workshops Time: 3 Jun 2025 03:30 PM Session Title: VoiceVision RAG - Integrating Visual Document Intelligence with Voice Response Description: In this workshop we will explore the integration of ColPali, a cutting-edge vision-based retrieval model, with voice synthesis for next-generation RAG systems. We'll demonstrate how ColPali's ability to generate multi-vector embeddings directly from document images bypasses traditional OCR and complex preprocessing, while adding voice output creates a more intuitive and accessible user experience. Attendees will see how this combination handles documents with mixed textual and visual information, leading to more efficient and accurate information retrieval with natural voice responses. ------------------------------------ Session ID: 933603 Track: Retrieval+Search Speaker: Suman Debnath (Principal Developer Advocate, AI/ML, AWS) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: Introducing Strands Agents, an Open Source AI Agents SDK Description: Building AI agents used to require complex orchestration, extensive scaffolding, and months of tuning. With Strands Agents, an open source SDK from AWS, you can now build, test, and deploy intelligent agents in just a few lines of code. This session introduces the model-driven approach behind Strands, where a model, a prompt, and a set of tools are all you need to create powerful, production-ready agents. Learn how Strands leverages modern foundation models to handle reasoning, tool use, and reflection, reducing development time from months to days.
------------------------------------ Session ID: 916214 Track: Retrieval+Search Speaker: Philipp Krenn (Code and conference monkey) Format: Workshop Room: Foothill G1&2: Workshops Time: 3 Jun 2025 01:00 PM Session Title: Information Retrieval from the Ground Up Description: Vector search is only a feature. Search engines and information retrieval have retaken their position as the foundation of RAG. This workshop takes you through decades of research, what has been working for a long time, and how it got better with Machine Learning. ------------------------------------ Session ID: 915338 Track: Retrieval+Search Speaker: Chau Tran (Technical Lead Delete) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 02:00 PM Session Title: How to build Enterprise-aware agents Description: While LLMs demonstrated impressive reasoning capabilities, their out-of-the-box reasoning is akin to hiring a brilliant but brand-new employee who doesn’t have the enterprise context of “how things are done at this company”. In this talk, I'll introduce “Workflow Search” as a paradigm to build enterprise-aware agents that can balance predictability on common tasks, and flexibility on unforeseen tasks. ------------------------------------ Session ID: 936933 Track: Retrieval+Search Speaker: Nina Lopatina (Lead Developer Advocate) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. 
Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. ------------------------------------ Session ID: 933702 Track: Retrieval+Search Speaker: Mikiko Bazeley (Staff Developer Advocate, MongoDB) Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 01:15 PM Session Title: Smarter Together: Designing Multi-Agent Systems with Shared, Evolving Memory Description: In today’s most advanced AI systems, intelligence is no longer confined to a single model or agent—it emerges from coordination. But coordination requires memory: short-term, long-term, and shared. In this talk, we’ll break down how agent systems can store, retrieve, and evolve shared memory to become smarter over time. You'll learn what it takes to architect these continuously learning systems, how to track and improve memory quality, and why robust, flexible infrastructure is the foundation of it all. Stick around to see how this works in practice—live. ------------------------------------ Session ID: 933646 Track: Retrieval+Search Speaker: Jesús Barrasa (AI Field CTO) Room: Juniper: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies Description: You're trying to guide how your agents think and act. Code-orchestrated workflows are too rigid, but LLMs charting their own course feel too chaotic. When you need a middle ground, it’s time to reach for the secret weapon: ontologies. These graph-shaped fragments of actionable knowledge can fill in critical gaps. In this talk, we’ll explore together how ontologies bring structure, semantics, and sanity to GenAI-powered applications. 
You’ll learn when they’re useful, how to apply them, and what kinds of problems they help solve. Through practical examples, we’ll show how ontologies (1) guide knowledge graph construction, (2) add a semantic layer for more efficient and accurate retrieval (GraphRAG), and (3) encode domain logic you don’t want to leave up to the LLM. ------------------------------------ Session ID: 933689 Track: Retrieval+Search Speaker: Frank Liu (Staff Product Manager, MongoDB) Room: Juniper: Expo Sessions Time: 5 Jun 2025 11:00 AM Session Title: The State of AI-Powered Search and Retrieval Description: In this talk, we examine the state-of-the-art in AI-powered search and retrieval. We detail techniques for enhancing performance beyond base embedding models, including hybrid search, reranking strategies, query decomposition and document enrichment, the use of domain-specific and fine-tuned embeddings, custom data processing pipelines (ETL), and contextualized chunking methods. ------------------------------------ Session ID: 933605 Track: Retrieval+Search Speaker: Mani Khanuja (Principal ML Services SA) Room: Juniper: Expo Sessions Time: 4 Jun 2025 03:30 PM Session Title: Data is Your Differentiator: Building Secure and Tailored AI Systems Description: As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using Amazon Bedrock Knowledge Bases and Amazon Bedrock Data Automation, organizations can create AI solutions that truly understand their unique business context, terminology, and requirements. Combined with Amazon Bedrock Guardrails, these capabilities enhance the accuracy and relevance of AI-generated responses, while ensuring that sensitive information remains protected within the organization's control - enabling businesses to build secure and compliant enterprise-grade generative AI solutions that accelerate time to value. 
------------------------------------ Session ID: 933692 Track: Retrieval+Search Speaker: Richmond Alake (Staff Developer Advocate, AI/ML at MongoDB) Room: Willow: Expo Sessions Time: 4 Jun 2025 10:40 AM Session Title: Architecting Agent Memory: Principles, Patterns, and Best Practices Description: In the rapidly evolving landscape of agentic systems, memory management has emerged as a key pillar for building intelligent, context-aware AI Agents. Inspired by the complexity of human memory systems—such as episodic, working, semantic, and procedural memory—this talk unpacks how AI agents can achieve believability, reliability, and capability by retaining and reasoning over past experiences. We’ll begin by establishing a conceptual framework based on real-world implementations from memory management libraries and system architectures: Memory Components representing various structured memory types (e.g., conversation, workflow, episodic, persona) Memory Modes reflecting operational strategies for short-term, long-term, and dynamic memory handling Next, the talk transitions to practical implementation patterns critical for effective memory lifecycle management: Maintaining rich conversation history and contextual awareness Persistence strategies leveraging vector databases and hybrid search Memory augmentation using embeddings, relevance scoring, and semantic retrieval Production-ready practices for scaling memory in multi-agent ecosystems We’ll also examine advanced memory strategies within agentic systems: Memory cascading and selective deletion Integration of tool use and persona memory Optimizing performance around memory retrieval and LLM context window limits Whether you're developing autonomous agents, chatbots, or complex workflow orchestration systems, this talk offers knowledge and tactical insights for building AI that can remember, adapt, and improve over time. 
This session is ideal for: AI engineers and agent framework developers Architects designing Agentic RAG or multi-agent systems Practitioners building contextual, personalized AI experiences By the end of the session, you’ll understand how to leverage memory as a strategic asset in agentic design—and walk away ready to build agents that not only act and reason but also remember. ------------------------------------ Session ID: 914371 Track: Retrieval+Search Speaker: Henry Weller (Senior Product Manager, Vector Search @ MongoDB) Format: Talk Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 03:00 PM Session Title: Building Vector Search Experiences with MongoDB: Access patterns, data models, and scaling considera Description: This talk will explore typical and forward-looking use cases for Atlas Vector Search, as well as how different types of data models and query patterns can be implemented and effectively scaled to meet the needs of those use cases. There will be a focus on the "Iron Triangle of Search" balancing accuracy, speed, and cost and talking about practical considerations that emerge within those use cases. This will be a technical talk focused on the "how" of Atlas Vector Search and considerations when building information retrieval systems given by a technical PM, not a sales pitch explaining how basic vector retrieval "solves" hallucinations. ------------------------------------ Session ID: 925259 Track: Retrieval+Search Speaker: Jerry Liu (CEO) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:15 AM Session Title: Building AI Agents that actually automate Knowledge Work Description: Agents are all the rage in 2025, and every single b2b SaaS startup/incumbent promises AI agents that can "automate work" in some way. But how do you actually build this? The answer is two fold: 1. really really good tools 2. carefully tailored agent reasoning over these tools that range from assistant-to-automation based UXs. 
The main goal of this talk is to give a practical overview of agent architectures that can automate real-world work, with a focus on document-centric tasks. Learn the core building blocks of best-in-class "tools" around processing, manipulating, and indexing/retrieving PDFs to Excel spreadsheets. Also learn the range of agent architectures suited for different tasks, from chat assistant-based UXs with high human-in-the-loop, to automation UXs that rely on encoding a business process into an end-to-end task solver. These architectures have to be generalizable but also highly accurate as agents get increasingly better at reasoning and code-writing. ------------------------------------ Session ID: 916215 Track: Retrieval+Search Speaker: Sherwood Callaway (Tech Lead, Alice) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 02:00 PM Session Title: Building Alice’s Brain: How We Built an AI Sales Rep that Learns Like a Human Description: AI agents are becoming essential tools for teams of all sizes and industries - but training them to become experts in your product, business, and customerbase remains a challenge. What if onboarding a digital worker was as simple as uploading your pitch deck? At 11x, we built Alice, an AI SDR that writes outbound emails with the nuance and context of a top-performing human sales rep - because she learns like one too! In this talk, we'll share how we built a knowledge base that allows 11x customers to "train" Alice on their internal materials: PDFs, websites, call recordings, and more. We'll talk through the ingestion pipeline in detail, discuss storage/retrieval technologies and their tradeoffs, and explain how Alice uses the knowledge base to drive high-performance email outreach at scale.
------------------------------------ Session ID: 916215 Track: Retrieval+Search Speaker: Satwik Singh (Member of Technical Staff ) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 02:00 PM Session Title: Building Alice’s Brain: How We Built an AI Sales Rep that Learns Like a Human Description: AI agents are becoming essential tools for teams of all sizes and industries - but training them to become experts in your product, business, and customerbase remains a challenge. What if onboarding a digital worker was as simple as uploading your pitch deck? At 11x, we built Alice, an AI SDR that writes outbound emails with the nuance and context of a top-performing human sales rep - because she learns like one too! In this talk, we'll share how we built a knowledge base that allows 11x customers to "train" Alice on their internal materials: PDFs, websites, call recordings, and more. We'll talk through the ingestion pipeline in detail, discuss storage/retrieval technologies and their tradeoffs, and explain how Alice uses the knowledge base to drive high-performance email outreach at scale. ------------------------------------ Session ID: 916143 Track: Retrieval+Search Speaker: Mark Bain (Founder & Research Scientist) Format: Workshop Room: Golden Gate Ballroom B: GraphRAG Time: 4 Jun 2025 01:00 PM Session Title: Make Your AI Agents Remember What They Do! Description: Are you ready to give your AI agents a memory upgrade? Join us for a fast-paced workshop exploring how memory can transform your agents. What You'll Do: Learn Leading Memory Solutions: Gain practical experience with open-source tools like Neo4j, Cognee, Graphiti, and Mem0. Explore Memory Types: Understand the theory behind long-term, short-term, episodic, semantic, and other memory types. Discover Memory Benefits: Learn how memory improves recall, contextual awareness, and reasoning in autonomous agents. 
Compare Implementations: Get a snapshot of how different solutions implement memory—what’s built-in, flexible, and experimental. We'll also demonstrate GraphRAG memory solutions and a GraphRAG chat implemented with Google ADK. Whether you’re working on AI copilots, agentic workflows, or research prototypes, this workshop will help you embed real memory into your AI stack. ------------------------------------ Session ID: 913839 Track: Retrieval+Search Speaker: Maitar Asher (Head of Engineering) Format: Talk Room: Golden Gate Ballroom A: Retrieval + Search Time: 5 Jun 2025 11:55 AM Session Title: Evaluating AI Search: A Practical Framework for Augmented AI Systems Description: AI search is becoming the front door to information, whether through Retrieval-Augmented Generation (RAG), Search-Augmented Generation (SAG), or custom agents that synthesize answers on top of indexed content. As users rely more heavily on these systems, evaluating their quality becomes mission-critical. But traditional metrics like precision and recall don’t capture the full picture. In this talk, we introduce a practical evaluation framework for AI-powered search, across three dimensions: - Are the retrieved sources relevant to the query? - And is the final answer complete? - Are the sources faithfully used in the generated answer? We’ll share lessons from working with search companies and present early findings from a new benchmark evaluating popular augmented AI systems across these dimensions. Rather than ranking winners and losers, we explore where different systems excel or break down, and how these tradeoffs inform product decisions. This talk is for AI engineers and product teams who want to build trusted, high-quality AI search experiences, and need a way to measure if it’s actually working. 
------------------------------------ Session ID: 936933 Track: Retrieval+Search Speaker: Rajiv Shah (Chief Evangelist) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 09:00 AM Session Title: Forget RAG Pipelines—Build Production-Ready AI Agents in 15 Minutes Description: Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. We'll explore how Contextual handles intelligent parsing and chunking of your data, retrieves information with state of the art accuracy, and generates responses with a multi layered set of guardrails against hallucinations. Together, we'll build an end-to-end Agentic RAG pipeline and demonstrate its integration with Claude Desktop via MCP, so you can see how this could plug into your existing ecosystem. By the end of this session, you'll have a functioning Agentic RAG prototype that you can easily customize and deploy to production for your specific use cases, even with complex, unstructured documents. ------------------------------------ Session ID: 905421 Track: Retrieval+Search Speaker: Ofer Mendelevitch (Vectara - the trusted GenAI product platform) Format: Online Talk Session Title: open-rag-eval: RAG Evaluation without "golden" answers. Description: Open-RAG-Eval is an open-source framework that revolutionizes RAG evaluation by harnessing the power of LLM judges for scalable, automated evaluation without the need for golden answers or golden chunks. Building on pioneering research from the University of Waterloo, this framework integrates innovative tools like UMBRELA for reference-free relevance scoring and AutoNuggetizer for automated fact-checking. Designed with a flexible connectors architecture, it seamlessly plugs into any RAG pipeline while delivering fast, transparent, and interpretable metrics on retrieval, generation, and hallucination in RAG. 
------------------------------------ Session ID: 933706 Track: Retrieval+Search Speaker: Thibaut Gourdel (Senior Technical Product Marketing Manager, MongoDB) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 11:00 AM Session Title: GraphRAG: Integrating LLMs with Knowledge Graphs Description: While traditional RAG is effective, it can struggle with complex relationships and reasoning across large knowledge bases. GraphRAG, an advanced variant, addresses these challenges by leveraging knowledge graphs to enable deeper understanding and improved response accuracy. Learn how LLMs extract key entities and relationships from your data to construct a graph structure, and how the system uses graph traversal to find related entities and enrich prompts. Stay for a live demo showcasing these concepts in action. ====================================================================== --- Track: SWE AGENTS (TBA) --- ====================================================================== Session ID: 929855 Track: SWE Agents Speaker: Scott Wu (CEO, Cognition AI) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 11:15 AM Session Title: Devin 2.0 and the Future of SWE Description: A talk on the future of software engineering with Scott Wu of Cognition AI, the makers of Devin. ------------------------------------ Session ID: 939640 Track: SWE Agents Speaker: Christian Szegedy (Former co-founder of xAI, discoverer of adversarial examples) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 02:40 PM Session Title: Towards Verified Superintelligence Description: I describe a new paradigm towards open-endedly self-improving intelligence by scaling verification to remove the human data and supervision bottleneck. The objective is to achieve trustless alignment of superintelligence. 
------------------------------------ Session ID: 939942 Track: SWE Agents Speaker: Boris Cherny (Member of Technical Staff, created Claude Code) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 02:00 PM Session Title: Claude Code & the evolution of agentic coding Description: A ten thousand foot view of the coding space, the UX of coding, and the Claude Code team's approach ------------------------------------ Session ID: 914012 Track: SWE Agents Speaker: Rustin Banks (Product Manager, AI Coding ) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 11:35 AM Session Title: Your Coding Agent Just Got Cloned And Your Brain Isn't Ready Description: Will the future engineer code alongside a single coding agent, or will they spend their day orchestrating many agents? Traditional development rewards synchronous focus. This session dives into the significant mindshift required to move from sequential coding to orchestrating parallel agents. We are the builders of "Jules", Google's massively parallel asynchronous coding agent (to be opened up in May). We'll share real-world insights from building Jules and explore how to rewire your brain for this powerful new "post-IDE" development paradigm. ------------------------------------ Session ID: 915745 Track: SWE Agents Speaker: Aakanksha Chowdhery (Research Leader, Reflection AI) Format: Talk Room: Yerba Buena Ballroom 2-6: Reasoning + RL Time: 5 Jun 2025 11:55 AM Session Title: Post-Training Open Models with RL for Autonomous Coding Description: The models and techniques to build fully autonomous coding agents - not just coding copilots - are already here. In this talk, former Google DeepMind staff research scientist, now CEO of Reflection Misha Laskin will present new research on post-training open weight LLMs for autonomous SWE tasks. 
He’ll focus on how scaling LLMs with Reinforcement Learning improves the autonomous coding capabilities of LLMs, and provide insight into the technical challenges involved in training such systems at scale. ------------------------------------ Session ID: 933462 Track: SWE Agents Speaker: Kenneth DuMez (DevRel Lead, Graphite) Format: Workshop Room: Willow: Expo Sessions Time: 5 Jun 2025 10:45 AM Session Title: The fastest software dev workflow in the world: AI meets stacked diffs Description: Learn the secrets behind the workflows that engineers at the fastest-moving companies in the world are using to build software for billions of users worldwide. This workshop gives a comprehensive overview of how to leverage generative AI to write code, how to stack and submit these pull requests, and finally how to use AI to review them. ------------------------------------ Session ID: 933685 Track: SWE Agents Speaker: Tejashwa Tiwari (Automation Engineer @ Windsurf) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn about why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development. ------------------------------------ Session ID: 933560 Track: SWE Agents Speaker: Matt Ball (Empowering Developers with AI ) Format: Workshop Room: Foothill C: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Navigating deep context in legacy code with Augment Agent Description: Attendees will learn to use an AI coding agent as a fast and intuitive part of navigating and working with complex, production-grade legacy code bases. We will drop directly into the code, written in assembly, that landed the 1969 Apollo 11 astronauts on the moon and, through a series of challenges, locate parts of the code tied to key functionality.
Using the agent to convert a key guidance computer algorithm into a more modern programming language, attendees will then compete to see whose code has what it takes to land on the moon. ------------------------------------ Session ID: 916115 Track: SWE Agents Speaker: Itamar Friedman (CEO) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 02:40 PM Session Title: Vibe Coding, with Confidence Description: Everyone wants to do Vibe Code, even large Enterprises. But how can we ensure that the generated code is well-grounded with the dev team's code and software development standards? In this talk, Itamar will present how to use various tools and agents, including MCP and A2A, to achieve precisely that. ------------------------------------ Session ID: 915961 Track: SWE Agents Speaker: Jeremy Adams (Head of Ecosystem) Format: Workshop Room: Nobhill A&B: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Ship Agents that Ship: A Hands-On Workshop for SWE Agent Builders Description: Coding agents are transforming how software gets built, tested, and deployed, but engineering teams face a critical challenge: how to embrace this automation wave without sacrificing trust, control, or reliability. In this 80 minute workshop, you’ll go beyond toy demos and build production-minded AI agents using Dagger, the programmable delivery engine designed for real CI/CD and AI-native workflows. Whether you're debugging failures, triaging pull requests, generating tests, or shipping features, you'll learn how to orchestrate autonomous agents that live in and around your codebase: from your laptop to your CI platform. We’ll guide you through: Building real-world agents with Dagger and popular LLMs (GPT, Claude, etc.) 
Programming agent environments using real languages (Go, Python, TypeScript) Executing agent workflows locally and in GitHub Actions, so you can bring them to production Using a composable runtime that ensures isolation, determinism, traceability, and repeatability Designing agents that automate and enhance debugging, test generation, code review, bug fixing, and feature implementation By the end of the workshop, you’ll walk away ready to build your own army of autonomous agents, working collaboratively across your codebase, locally and in CI, accelerating development without ceding control. Let’s build agents that don’t just talk, they ship! ------------------------------------ Session ID: 914017 Track: SWE Agents Speaker: Josh Albrecht (CTO and Co-founder) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 02:40 PM Session Title: Beyond the Prototype: Using AI to Write High-Quality Code Description: In this case study-based keynote, Josh Albrecht, CTO of Imbue, examines the critical engineering challenges in building AI coding systems that create more than just prototypes. Drawing from Imbue's research developing Sculptor, an experimental coding agent environment, Josh shares key insights into the fundamental technical obstacles encountered when evolving AI-assisted coding from toy applications to more robust software systems. The session will explore approaches to core challenges like safely executing code, managing context across large codebases, automating test generation, and creating systems that can identify potential pitfalls in AI-generated code. Attendees will gain practical insights into the technical underpinnings of next-generation coding agents that aim to handle complex software engineering challenges architecting larger systems, increasing meaningful test coverage and designing systems that are easy to debug—moving us closer to AI systems that can help create maintainable software. 
------------------------------------ Session ID: 915312 Track: SWE Agents Speaker: Raaz Dwivedi (Chief Scientist) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 12:15 PM Session Title: Production software keeps breaking and it will only get worse. Here’s how Traversal is fixing it. Description: Software is eating the world. AI is eating software. AI-powered SWE means a whole lot more software is going to be written that powers mission critical systems in the coming years, with hardly any of it written by humans. Hence, when these software systems inevitably break, it’s going to be next to impossible to troubleshoot them. Towards addressing this issue, we’ll do a product launch of Traversal’s AI, a significant step towards self-healing software systems. We will showcase how it is already used to autonomously troubleshoot production incidents in some of the most complex enterprise environments. ------------------------------------ Session ID: 933632 Track: SWE Agents Speaker: Numair Baseer (Deployed Engineer, Windsurf) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Agentic Coding with Windsurf Description: Agentic coding marks a new era in software development, where AI agents take on autonomous roles in coding tasks. The Windsurf IDE embodies this shift by integrating intelligent agents like Cascade, which maintain full codebase context to perform multi-file edits, run terminal commands, and suggest changes through tools like Supercomplete and Flows. In this session, we will explore features that allow developers to guide strategy while the AI handles execution, enhancing productivity and enabling more creative, high-level work. 
------------------------------------ Session ID: 915387 Track: SWE Agents Speaker: Robert Brennan (CEO) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 02:20 PM Session Title: Software Development Agents: What Works and What Doesn't Description: The adoption of AI into software development has been bumpy. While autocomplete tools like Copilot have gone mainstream, autonomous agents like Devin and OpenHands have generated both enthusiasm and skepticism. Some engineers claim they generate a 10x productivity boost; others that they just create noise and tech debt. The difference between the enthusiasts and the skeptics is that the enthusiasts have reasonable expectations for what these agents can do, and have both practical and intuitive knowledge for how to use them effectively. In this session, we'll talk about what tasks are appropriate for today's software agents, what tasks they might start to succeed at in 2025, and what tasks are best left to humans no matter how good they get. Session Outline: Learn how to use software development agents like OpenHands (fka OpenDevin) effectively, without creating noise and tech debt. ------------------------------------ Session ID: 936298 Track: SWE Agents Speaker: Matt Ball (Empowering Developers with AI ) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: To the moon! Navigating deep context in legacy code with Augment Agent Description: Shortened presentation-only version of our Apollo 11 workshop ------------------------------------ Session ID: 933633 Track: SWE Agents Speaker: Sam Fertig (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You Description: Sometimes it seems like Windsurf knows you a little too well. It's one thing to generate generic code, but to predict your next intent? 
From matching existing code patterns and styles to tracking how local changes affect the larger codebase, this talk digs into the technical challenges of context awareness and why simply indexing code falls short. Relive our journey tackling the core issue in the AI IDE space: balancing retrieval quality with latency constraints and scaling effectively as codebases grow. For those curious about the infrastructure behind context-aware AI, this talk offers insights into our approach of turning massive codebases into collections of useful context. ------------------------------------ Session ID: 915312 Track: SWE Agents Speaker: Anish Agarwal (CEO and Co-founder) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 12:15 PM Session Title: Production software keeps breaking and it will only get worse. Here’s how Traversal is fixing it. Description: Software is eating the world. AI is eating software. AI-powered SWE means a whole lot more software is going to be written that powers mission critical systems in the coming years, with hardly any of it written by humans. Hence, when these software systems inevitably break, it’s going to be next to impossible to troubleshoot them. Towards addressing this issue, we’ll do a product launch of Traversal’s AI, a significant step towards self-healing software systems. We will showcase how it is already used to autonomously troubleshoot production incidents in some of the most complex enterprise environments. ------------------------------------ Session ID: 915312 Track: SWE Agents Speaker: Raj Agrawal (CTO, Cofounder) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 12:15 PM Session Title: Production software keeps breaking and it will only get worse. Here’s how Traversal is fixing it. Description: Software is eating the world. AI is eating software.
AI-powered SWE means a whole lot more software is going to be written that powers mission critical systems in the coming years, with hardly any of it written by humans. Hence, when these software systems inevitably break, it’s going to be next to impossible to troubleshoot them. Towards addressing this issue, we’ll do a product launch of Traversal’s AI, a significant step towards self-healing software systems. We will showcase how it is already used to autonomously troubleshoot production incidents in some of the most complex enterprise environments. ------------------------------------ Session ID: 915312 Track: SWE Agents Speaker: Matthew Schoenbauer (Founding Engineer) Format: Talk Room: Foothill C: Agent Reliability Time: 4 Jun 2025 12:15 PM Session Title: Production software keeps breaking and it will only get worse. Here’s how Traversal is fixing it. Description: Software is eating the world. AI is eating software. AI-powered SWE means a whole lot more software is going to be written that powers mission critical systems in the coming years, with hardly any of it written by humans. Hence, when these software systems inevitably break, it’s going to be next to impossible to troubleshoot them. Towards addressing this issue, we’ll do a product launch of Traversal’s AI, a significant step towards self-healing software systems. We will showcase how it is already used to autonomously troubleshoot production incidents in some of the most complex enterprise environments. ------------------------------------ Session ID: 933545 Track: SWE Agents Speaker: Eric Hou (Member of Technical Staff, Augment Code) Format: Talk Room: SOMA: AI Architects Time: 4 Jun 2025 12:15 PM Session Title: Mentoring the Machine Description: You’d never let a swarm of fresh interns ship to prod on day one—same deal with AI agents. Mentoring the Machine dives into how acting like a tech lead (not just a user) turns those bots into real leverage. 
In this talk, Eric will deliver practical advice for working with AI agents in the SDLC. He'll also preview how effective use of AI agents changes the calculus of software engineering at both a micro and macro level. ------------------------------------ Session ID: 936299 Track: SWE Agents Speaker: Chris Kelly (All Things Developer @ Augment Code) Format: Talk Room: Juniper: Expo Sessions Time: 4 Jun 2025 10:55 AM Session Title: Vibes won't cut it Description: What's the role of vibe coding in production-grade applications? Join Augment Code's Chris Kelly as he talks about the role of context in software engineering, not code. ------------------------------------ Session ID: 933560 Track: SWE Agents Speaker: Forrest Brazeal (Speaker and Workshop Organizer) Format: Workshop Room: Foothill C: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Navigating deep context in legacy code with Augment Agent Description: Attendees will learn to use an AI coding agent as a fast and intuitive part of navigating and working with complex, production-grade legacy code bases. We will drop directly into the code–written in assembly–that landed the 1969 Apollo 11 astronauts on the moon and, through a series of challenges, locate parts of the code tied to key functionality. Using the agent to convert a key guidance computer algorithm into a more modern programming language, attendees will then compete to see whose code has what it takes to land on the moon. ------------------------------------ Session ID: 936298 Track: SWE Agents Speaker: Forrest Brazeal (Speaker and Workshop Organizer) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 03:15 PM Session Title: To the moon!
Navigating deep context in legacy code with Augment Agent Description: Shortened presentation-only version of our Apollo 11 workshop ------------------------------------ Session ID: 933629 Track: SWE Agents Speaker: Sam Alba (Co-Founder of Dagger) Room: Willow: Expo Sessions Time: 4 Jun 2025 12:45 PM Session Title: How to trust an agent with software delivery Description: AI-powered agents promise faster, easier software delivery, but their unpredictable behavior often makes engineers hesitant to fully trust them with critical workflows. Sam Alba, Co-founder of Dagger (and previously co-creator of Docker), explains how teams can reliably integrate agents into their delivery pipelines by shifting how they structure and manage automation. He'll share four practical strategies learned from real-world experience: 1. Treat agents as workflow participants, not isolated tools. Stop using agents as disconnected scripts or IDE plugins. Treating them as first-class parts of your delivery process simplifies your architecture, reduces hidden complexity, and makes agent outcomes more predictable. 2. Use many small agents instead of one big one. Just as software evolved from monoliths to microservices, software delivery benefits from smaller, specialized agents with clearly defined responsibilities. Smaller agents are easier to understand, maintain, and integrate. 3. Define clear environments—the real lever for reliability. Instead of chasing perfect prompts or models, focus on clearly defining the tools, resources, and permissions around your agents. Precisely controlling their environments makes agents behave consistently and reliably. 4. Design workflows for easy debugging and observability. Agents will sometimes fail unexpectedly. Sam will share simple, effective ways to build clear tracing and observability into your workflows from the start, making debugging quicker and less frustrating. 
You'll leave with practical, immediately usable techniques that give you the confidence to trust AI agents in your software delivery pipelines. ------------------------------------ Session ID: 933684 Track: SWE Agents Speaker: Eashan Sinha (Deployed Engineer, Windsurf) Format: Talk Room: Nobhill A&B: Expo Sessions Time: 4 Jun 2025 10:55 AM Session Title: Mastering Engineering Flow with Windsurf Description: As experienced engineers, especially senior and staff engineers, our focus shifts towards complex problem-solving, architectural decisions, and mentoring. While AI tools promise productivity gains, Windsurf offers more than just code completion and chat assistance – it's an agentic IDE built to enhance engineering flow. This talk explores how experienced engineers can leverage Windsurf's deep contextual awareness, structured guidance, and automated workflows to tackle sophisticated and complex tasks. We'll demonstrate practical strategies for accelerating feature development, automating code maintenance and reviews, and ultimately freeing up cognitive load to focus on high-impact engineering challenges. Learn how to move beyond basic AI assistance and truly partner with Windsurf to excel in your role. ------------------------------------ Session ID: 944044 Track: SWE Agents Speaker: Jonathon Belotti (Member of Technical Staff @ Modal) Format: Talk Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:15 PM Session Title: Run 1000 branches of code with sandbox snapshotting Description: A peek under the hood of how we built container checkpoints and restores to enable massively parallel agentic workflows. No one wants to wait on infrastructure. In this short talk we’ll go through a demo and system design of container checkpoint/restore, which supports both burst autoscaling and agent branching for Modal's serverless Functions and Sandboxes. Can you save a live container to a file? Can you save a live GPU? Come by the Modal booth to find out! 
------------------------------------ Session ID: 904751 Track: SWE Agents Speaker: Eno Reyes (CTO) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 03:00 PM Session Title: Ship Production Software in Minutes, Not Months Description: Planning, coding, testing, monitoring—the endless cycle that spans 10+ tools that fragment our focus and slow delivery to a crawl. Vibe coding doesn't work when you've got 10TB of code. If you just sighed, you're one of many professional software engineers trapped in the traditional software development lifecycle (SDLC) that was designed before AI could parallelize your entire workflow. But what if you could orchestrate multiple AI agents on tasks beyond just generating code, while you focus on the creative decisions that matter? In this talk, I'll demonstrate how real enterprise organizations are changing their entire SDLC—going from understanding, planning, coding, and testing all the way to incident response—using AI agents. You'll witness the next evolution of software engineering—where AI doesn't just generate code, but orchestrates the entire development lifecycle. ------------------------------------ Session ID: 933685 Track: SWE Agents Speaker: Tejashwa Tiwari (Analytics and Automation Engineer) Room: Salons 9-15: Expo Hall Time: 4 Jun 2025 01:45 PM Session Title: Windsurf & Wonders Description: Come learn about why Windsurf is the premier choice for engineers and enterprises alike in applications of AI for development.
------------------------------------ Session ID: 915961 Track: SWE Agents Speaker: Kyle Penfound (Solutions Engineer at Dagger) Format: Workshop Room: Nobhill A&B: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Ship Agents that Ship: A Hands-On Workshop for SWE Agent Builders Description: Coding agents are transforming how software gets built, tested, and deployed, but engineering teams face a critical challenge: how to embrace this automation wave without sacrificing trust, control, or reliability. In this 80 minute workshop, you’ll go beyond toy demos and build production-minded AI agents using Dagger, the programmable delivery engine designed for real CI/CD and AI-native workflows. Whether you're debugging failures, triaging pull requests, generating tests, or shipping features, you'll learn how to orchestrate autonomous agents that live in and around your codebase: from your laptop to your CI platform. We’ll guide you through: Building real-world agents with Dagger and popular LLMs (GPT, Claude, etc.) Programming agent environments using real languages (Go, Python, TypeScript) Executing agent workflows locally and in GitHub Actions, so you can bring them to production Using a composable runtime that ensures isolation, determinism, traceability, and repeatability Designing agents that automate and enhance debugging, test generation, code review, bug fixing, and feature implementation By the end of the workshop, you’ll walk away ready to build your own army of autonomous agents, working collaboratively across your codebase, locally and in CI, accelerating development without ceding control. Let’s build agents that don’t just talk, they ship! 
====================================================================== --- Track: SECURITY (June 5) --- ====================================================================== Session ID: 915873 Track: Security Speaker: Rene Brandel (CEO) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 02:40 PM Session Title: How we hacked YC Spring 2025 batch’s AI agents Description: We hacked 7 of the 16 publicly-accessible YC X25 AI agents. This allowed us to leak user data, execute code remotely, and take over databases. All within 30 minutes each. In this session, we'll walk through the common mistakes these companies made and how you can mitigate these security concerns before your agents put your business at risk. ------------------------------------ Session ID: 915751 Track: Security Speaker: Jared Hanson (Co-Founder) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 02:20 PM Session Title: How to Secure Agents using OAuth Description: We all know sharing passwords is bad (unless you want free TV), so why are we sharing API keys with AI? We shouldn't, and that’s why we need to talk about OAuth. In this talk, we will give a brief intro to OAuth. Then we will talk about the state of authorization in MCP. We will show how an MCP client uses OAuth to authenticate a user and securely access private resources and tools hosted by an MCP server. Then we’ll look at ways autonomous agents can use OAuth on their own behalf, talking to other agents and MCP servers directly. We’ll learn how to use OAuth to build agents that humans and machines can trust. ------------------------------------ Session ID: 933686 Track: Security Speaker: Michael Grinich (Founder & CEO, WorkOS ) Format: Talk Room: SOMA: AI Architects Time: 5 Jun 2025 02:00 PM Session Title: CIAM for AI: Who Are Your Agents and What Can They Do? Description: AI agents are changing the way modern SaaS products operate.
Whether automating workflows, integrating with APIs, or acting on behalf of users, AI-driven assistants and autonomous systems are becoming core product features. But securing these agents presents a fundamental challenge: How do you authenticate AI agents? How do you control what they can access? How do you ensure they act within the right permissions? This talk will explore these concepts and more while highlighting current research and best practices. ------------------------------------ Session ID: 938753 Track: Security Speaker: Fouad Matin (Member of Technical Staff, OpenAI) Format: Keynote Room: Foothill C: Security Time: 5 Jun 2025 11:15 AM Session Title: Safety and security for code-executing agents Description: Code is the lingua franca for both software engineers and highly capable AI models. As we give agents the ability to build, test, and run code that they generate, the command line becomes their canvas—and their attack surface. This keynote explores what it takes to bring code-executing agents from research to real-world deployment while maintaining control and security. We’ll cover how terminals offer AI an ideal interface, why they’re deceptively risky, and what it means to embed security, guardrails, and trust at every layer. It’s not just about what agents can do—it’s about what they should do, and how we make sure they do it safely. ------------------------------------ Session ID: 933612 Track: Security Speaker: Antje Barth (Principal Developer Advocate) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:05 PM Session Title: Building Agents at Cloud-Scale Description: Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! 
------------------------------------ Session ID: 909905 Track: Security Speaker: Jonathan Mortensen (CEO) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 11:35 AM Session Title: The Unofficial Guide to Apple’s Private Cloud Compute Description: In October 2024, Apple released a new private AI technology onto millions of devices called “Private Cloud Compute”. It brings the same level of privacy and security a local device offers but on an “untrusted” remote server. This talk discusses how Private Cloud Compute represents a paradigm shift in confidential computing and explores the core advancements that made it possible to become mainstream. We’ll explore its novel architecture that allows developers to run sensitive, multi-tenant workloads with cryptographically provable privacy guarantees at scale and at reasonable cost. Attendees will leave with an understanding of how to leverage this technology for data and AI applications where privacy and security are paramount. ------------------------------------ Session ID: 936795 Track: Security Speaker: Bobby Tiernay (Auth0, Principal Architect) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 12:15 PM Session Title: Securing Agents with Open Standards Description: Shipping AI agents that are safe for production means solving some tough identity and authorization challenges that are not always obvious at the prototype stage. In practice, this comes down to a handful of deeply technical questions: - How do you make sure agents are only acting for the right user? - How do you prevent over-broad API access or data leaks? - How do you handle user approvals when there is no UI, or you need a human in the loop? - And how do you avoid the usual pain points like manual credential sharing, stale keys, or unpredictable scopes without writing a lot of brittle, custom code? This talk digs into the real technical trade-offs behind building secure, user-aware AI agents.
We will go beyond what to do and explain why, sharing the architectural decisions, open standards, and hard lessons learned from integrating OAuth, OIDC, RAR, and async authorization into agent-driven workflows. You will see a hands-on demo using an open-source Node.js agent and open protocols, with a focus on practical integration and no magic. The session will show how these solutions have shaped our approach to identity in GenAI and where we see the field heading next. If you are an engineer building AI apps that need real guardrails, not just a happy-path demo, we hope to leave you with some practical patterns, design rationale, and a clear view of the trade-offs for making your own agents production ready. ------------------------------------ Session ID: 933569 Track: Security Speaker: Lou Bichard (Product Manager, Gitpod) Room: Juniper: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Building CISO-approved agent fleet architecture Description: Security is the biggest blocker for agent orchestration adoption in regulated industries for SWE agents. Gitpod's agent orchestration went from an originally self-hosted Kubernetes architecture to the current 'bring your own cloud' model that enables deployment of our SWE agent orchestration platform in secure environments. The architecture allows customers to securely connect their foundational models and agent memory solutions and comes with features like auto-suspend and resume for agent fleets. In this talk we deep dive into the architecture to share our years of learnings in how to secure agent workloads at scale in secure and regulated environments.
------------------------------------ Session ID: 936795 Track: Security Speaker: Kam Sween (Auth0, Staff Engineer, AIFS) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 12:15 PM Session Title: Securing Agents with Open Standards Description: Shipping AI agents that are safe for production means solving some tough identity and authorization challenges that are not always obvious at the prototype stage. In practice, this comes down to a handful of deeply technical questions: - How do you make sure agents are only acting for the right user? - How do you prevent over-broad API access or data leaks? - How do you handle user approvals when there is no UI, or you need a human in the loop? - And how do you avoid the usual pain points like manual credential sharing, stale keys, or unpredictable scopes without writing a lot of brittle, custom code? This talk digs into the real technical trade-offs behind building secure, user-aware AI agents. We will go beyond what to do and explain why, sharing the architectural decisions, open standards, and hard lessons learned from integrating OAuth, OIDC, RAR, and async authorization into agent-driven workflows. You will see a hands-on demo using an open-source Node.js agent and open protocols, with a focus on practical integration and no magic. The session will show how these solutions have shaped our approach to identity in GenAI and where we see the field heading next. If you are an engineer building AI apps that need real guardrails, not just a happy-path demo, we hope to leave you with some practical patterns, design rationale, and a clear view of the trade-offs for making your own agents production ready. ------------------------------------ Session ID: 933610 Track: Security Speaker: Mike Chambers (AI/ML Specialist DA AWS) Format: Talk Room: Golden Gate Ballroom C: AI in the Fortune 500 Time: 5 Jun 2025 12:15 PM Session Title: Ship it! 
Building Production-Ready Agents Description: Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies for building and orchestrating agent systems at scale. We'll cover critical infrastructure decisions, scalability frameworks, and best practices for creating robust, production-ready agent architectures. ------------------------------------ Session ID: 913351 Track: Security Speaker: David Mytton (Founder) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 02:00 PM Session Title: How to defend your sites from AI bots Description: Constantly seeing CAPTCHAs? It used to be easy to tell the humans from the droids, but what else can we do when synthetic clients make up nearly half of all web requests? Rotating IPs, spoofed browsers, and agents acting on behalf of real users - are we doomed to forever be solving puzzles? In this talk, we’ll explore user agents, HTTP fingerprints, and IP reputation signals that make humans and agents stand out from scrapers, build a realistic threat model, and dig into the behaviors that reveal the LLM-mimicry. Leave with AX- and UX-safe code, benchmarks, and tools to help you take back control. ------------------------------------ Session ID: 915978 Track: Security Speaker: Leonard Tang (Founder & CEO) Format: Talk Room: Foothill C: Security Time: 5 Jun 2025 11:55 AM Session Title: Fuzzing in the GenAI Era Description: "Evaluation" is one of those concepts that every AI practitioner vaguely knows is important, but few practitioners truly understand. Is "eval" the dataset for measuring the quality of your AI system? Is "eval" the measure, the metric of quality? Is "eval" the process of human annotation and scoring? Or is "eval" a third-party dataset run once to benchmark a model?
To mitigate this cacophony, this talk will provide an opinionated and principled perspective for what we actually mean when we say “evaluation”, beyond the traditional for-loop-over-a-static dataset. In particular, this perspective draws heavy inspiration from *fuzzing*, i.e. bombarding AI with simulated, unexpected user inputs to uncover corner cases at scale. This factors into sub-problems regarding: - Quality Metric. What are the actual criteria we, as humans, are using to determine if an AI system is producing good or bad responses? How do we elicit these criteria before the human SME can articulate them? How do we, as efficiently as possible, operationalize these criteria with an automated *Judge*? - Stimuli Generation. Given a metric, how do we know, with confidence, that an AI system is performing well with respect to the metric? What data is representative and sufficient for discovering all potential bugs of an AI system? And how do we generate this complex, diverse, faithful data at scale? We will discuss in detail the philosophy, technology, and case studies behind both problems of Quality Metric and Stimuli Generation, and how they interact in concert. ------------------------------------ Session ID: 933656 Track: Security Speaker: Nick Nisi (Software developer and panelist on the JS Party podcast) Room: Willow: Expo Sessions Time: 5 Jun 2025 12:45 PM Session Title: Agents, Access, and the Future of Machine Identity Description: AI agents are calling APIs, submitting forms, and sending emails—but how do you control what they’re allowed to do? As agents act on behalf of users or organizations, traditional patterns like OAuth, session tokens, and role-based access often fall short. In this talk, we’ll explore how machine identity is evolving to meet this new landscape.
You’ll learn: - How to think about authentication for agents (not just humans) - What it means to authorize an action when the actor is an LLM or headless service - Real-world strategies from WorkOS and Cloudflare for assigning, managing, and revoking agent identity and access By the end, you’ll walk away with practical tools and mental models to build agent-powered systems that are secure, auditable, and scalable. ------------------------------------ Session ID: 947798 Track: Security Speaker: Sander Schulhoff (CEO) Format: Workshop Room: SOMA: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Prompt Engineering & AI Red Teaming Description: Learn from the creator of Learn Prompting, the internet's 1st Prompt Engineering guide (released 2 months before ChatGPT), and HackAPrompt, the World's 1st AI Red Teaming competition. My talk will cover topics ranging from the history of prompt engineering to the most advanced research-backed prompt engineering techniques. I will also discuss the origins of prompt injection and AI red teaming, as well as the current state of industry and the need for agentic red teaming. Finally, we will have an interactive competition where you will be able to hone your prompt hacking skills and win prizes from swyx! https://www.hackaprompt.com ------------------------------------ Session ID: 902517 Track: Security Speaker: Allie Howe (vCISO | Founder of Growth Cyber) Format: Online Talk Session Title: How to Build Trustworthy AI Description: Trust is a multifaceted outcome that results when product and engineering teams work together to build AI that is aligned, explainable, and secure. Learn strategies for how to build trustworthy AI and why trust is paramount for AI systems. Trustworthy AI = AI Security + AI Safety Learn about the differences between AI Security and AI Safety and how the three focus areas of MLSecOps + AI Red Teaming + AI Runtime Security can help you achieve both and ultimately build Trustworthy AI. 
Trustworthy AI Issues in the news: https://x.com/syddiitwt/status/1923427722241487297 https://fingfx.thomsonreuters.com/gfx/legaldocs/egvblxokkvq/Walters%20v%20OpenAI%20-%20order.pdf?ref=claritasgrc.ai MLSecOps Resources Modelscan https://github.com/protectai/modelscan Community: mlsecops.com AI Red Teaming Resources: https://azure.github.io/PyRIT/ https://ashy-coast-00aeb501e.6.azurestaticapps.net/MS_AIRT_Lessons_eBook.pdf AI Runtime Security Resources: https://www.pillar.security/solutions#ai-detection https://noma.security/ Showcasing Trustworthy AI to Customers/Prospects https://www.vanta.com/collection/trust/what-is-a-trust-center ====================================================================== --- Track: TINY TEAMS (June 4) --- ====================================================================== Session ID: 933675 Track: Tiny Teams Speaker: Kevin Hou (Head of Product, Windsurf) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:25 PM Session Title: Windsurf everywhere, doing everything, all at once Description: abstract tbd ------------------------------------ Session ID: 911846 Track: Tiny Teams Speaker: Hassan El Mghari (DevRel lead ) Format: Talk Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 11:55 AM Session Title: Using OSS models to build AI apps with millions of users Description: In this talk, Hassan will go over how he builds open source AI apps that get millions of users like roomGPT.io (2.9 million users), restorePhotos.io (1.1 million users), Blinkshot.io (1 million visitors), and LlamaCoder.io (1.4 million visitors). He'll go over his journey in AI, demo some of the apps that he's built, and dig into his tech stack and code to explain how he builds these apps from scratch. He’ll also go over how to market them and go over his top tips and tricks for building great full-stack AI applications quickly and efficiently. 
This talk will start from first principles and give you a glimpse into Hassan’s workflow of idea -> working app -> many users. Attendees should come out of this session equipped with the resources to build impressive AI applications and understand some of the behind the scenes of how they’re built and marketed. This will hopefully serve as an educational and inspirational talk that encourages builders to go build cool things. ------------------------------------ Session ID: 914550 Track: Tiny Teams Speaker: Sid Bendre (Co-Founder) Format: Talk Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 11:35 AM Session Title: The New Lean Startup Description: In this session, I will be presenting a case study of Oleve's journey, revealing how we've scaled a profitable multi-product portfolio with a tiny team. I'll walk you through the emergence of "tiny teams," our two-track engineering methodology that has become our blueprint, as well as an inside look at our technical alpha – specifically how we've engineered deterministic AI agents to deliver magical and reliable consumer experiences to millions. You'll learn how we've built internal tools to grow leanly and created operating playbooks to scale operations without traditional headcount requirements. I'll also share our approach to scrappy infrastructure innovation and how our investment in internal tooling has served as a critical force multiplier. Finally, I'll give an overview of parts of the profitable portfolio playbook that keeps us lean, adaptable, and profitable across multiple product lines. 
Structure of talk: - the tiny teams revolution - the two-track engineering approach - technical alpha: deterministic ai agents at scale - scrappy infrastructure innovation - internal tooling as a multiplier - the profitable portfolio playbook ------------------------------------ Session ID: 941906 Track: Tiny Teams Speaker: Alex Atallah (CEO OpenRouter, co-founder of OpenSea) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 04:35 PM Session Title: fun stories from building OpenRouter and where all this is going Description: How the first LLM aggregator got started, some of the weird moments in its early growth, architecture challenges, and where we'll be taking it down the road ------------------------------------ Session ID: 939097 Track: Tiny Teams Speaker: Vikas Paruchuri (CEO) Format: Talk Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 02:20 PM Session Title: Datalab: 40k stars, 7-figure ARR, SoTA models, team of 3 Description: We scaled Datalab 5x this year - to 7-figure ARR, with customers that include tier 1 AI labs. We train custom models for document intelligence (OCR, layout), with popular repos surya and marker. I'll talk about a new approach to building AI teams, including lessons I learned from Jeremy Howard, and how we manage building popular repos, scaling revenue, and training models with a tiny team. ------------------------------------ Session ID: 948608 Track: Tiny Teams Speaker: Eric Simons (CEO of Bolt.new) Format: Keynote Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 11:15 AM Session Title: Bolt.new: How we scaled $0-20m ARR in 60 days, with 15 people Description: Tiny Teams are the future of how startups are built, and it all comes down to team culture, decision making, tooling choices, and endless grit. In this talk, Eric will share the high octane insights & learnings of how the 2nd fastest growing product in history _made it_ with a team of less than 15 people. 
------------------------------------ Session ID: 923914 Track: Tiny Teams Speaker: Grant Lee (CEO) Format: Talk Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 02:00 PM Session Title: Tiny Teams Description: Sean reached out on X, happy to do a talk on how to build a tiny team ------------------------------------ Session ID: 940118 Track: Tiny Teams Speaker: Max Brodeur-Urbas (Founder and CEO of Gumloop) Format: Talk Room: Yerba Buena Ballroom Salons 2-6: Tiny Teams Time: 4 Jun 2025 12:15 PM Session Title: Gumloop's Path to be a 10 person unicorn Description: An overview of how Gumloop is scaling automation across companies like Instacart, Webflow and Shopify with less than 10 people. ====================================================================== --- Track: UNCATEGORIZED (TBA) --- ====================================================================== Session ID: 936006 Track: Uncategorized Speaker: Greg Brockman (Cofounder, OpenAI) Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:45 PM Session Title: #define AI Engineer Description: Greg Brockman's career and advice for AI Engineers ------------------------------------ Session ID: 936046 Track: Uncategorized Speaker: Shawn Wang (N/A) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 05:15 PM Session Title: AI Engineer World's Fair Hackathon - Grand Prize Description: Not Available ------------------------------------ Session ID: 944039 Track: Uncategorized Speaker: Keiji Kanazawa (Product @ Azure AI Foundry) Room: Nobhill C&D: Microsoft Time: 4 Jun 2025 12:45 PM Session Title: AI Red Teaming Agent: Accelerate your AI safety and security journey with Azure AI Foundry Description: In the age of autonomous AI agents, ensuring their safety and reliability is paramount. But how can we proactively uncover vulnerabilities before they impact real-world scenarios? 
Enter Azure AI Evaluation SDK’s Red Teaming Agent—a cutting-edge tool designed to rigorously challenge your AI agents, exposing hidden risks and unexpected behaviors. This session will guide you through the powerful capabilities of Azure’s Red Teaming Agent, demonstrating how it simulates adversarial scenarios, stress-tests agentic decision-making, and ensures your applications remain robust, ethical, and safe. You’ll learn practical techniques for systematically identifying weaknesses, interpreting evaluation results, and integrating safety checks into your development lifecycle. Join us to explore how embracing adversarial testing not only mitigates risks but strengthens trust in your AI solutions—keeping you ahead in the rapidly evolving landscape of responsible AI. ------------------------------------ Session ID: 939093 Track: Uncategorized Speaker: Anush Dsouza (Senior Product Manager for Heroku AI) Format: Workshop Room: Nobhill A&B: Expo Sessions Time: 4 Jun 2025 01:00 PM Session Title: Building Agentic Applications with Heroku Managed Inference and Agents Description: In this workshop, you’ll learn how to use Heroku Managed Inference and Agents to build agentic applications. We’ll cover how to provision and deploy LLM models to your app, run untrusted code securely in Python, Node.js, Go, and Ruby using built-in tools, and use the Model Context Protocol (MCP) to connect tools and actions that extend your agents' capabilities. 
------------------------------------ Session ID: 936046 Track: Uncategorized Speaker: Benjamin Dunphy (N/A) Format: Keynote Room: Keynote/General Session (Yerba Buena 7&8) Time: 5 Jun 2025 05:15 PM Session Title: AI Engineer World's Fair Hackathon - Grand Prize Description: Not Available ------------------------------------ Session ID: 936907 Track: Uncategorized Speaker: Emma Ning (Principal PM in Microsoft) Format: Talk Room: Juniper: Expo Sessions Time: 5 Jun 2025 03:30 PM Session Title: Empowering Developers to build Cutting-Edge AI experiences on device Description: Not Available ------------------------------------ Session ID: 940839 Track: Uncategorized Speaker: Christopher Harrison (Senior Enterprise Advocate, GitHub) Format: Workshop Room: Nobhill C&D: Microsoft Time: 3 Jun 2025 03:30 PM Session Title: Collaborating with Agents in your Software Development Workflow Description: GitHub Copilot's agentic capabilities enhance its ability to act as a peer programmer. From the IDE to the repository, Copilot can generate code, run tests, and perform tasks like creating pull requests using Model Context Protocol (MCP). This instructor-led lab will guide you through using agent capabilities on both the client and the server: Key takeaways include: Understanding how to bring agents into your software development workflow Identifying scenarios where agents can be most impactful, as well as tips and tricks to provide the right context to lead to success Discovering how Model Context Protocol provides access to an additional set of external tools and capabilities that the agent can use Recommended practices to accelerate your development while maintaining code quality. 
------------------------------------ Session ID: 936814 Track: Uncategorized Speaker: Christopher Harrison (Senior Enterprise Advocate, GitHub) Format: Talk Room: Yerba Buena Ballroom 7&8: SWE Agents Time: 5 Jun 2025 11:55 AM Session Title: The Agent Awakens: Collaborative Development with GitHub Copilot Description: Not Available ------------------------------------ Session ID: 939088 Track: Uncategorized Speaker: Daria Soboleva (Head Research Scientist, Cerebras) Format: Workshop Room: Nobhill A&B: Workshops Time: 3 Jun 2025 01:00 PM Session Title: From Mixture of Experts to Mixture of Agents … with Super Fast Inference Description: Our hands-on workshop will walk you through how to build your own Mixture of Agents (MoA) system using the fastest, and most capable open models available: Qwen3-32B and Llama 3.3-70B. MoA is an emerging architecture that combines the strengths of multiple large language models in a layered, agent-based design. This approach delivers superior performance by enabling specialized agents to collaborate across layers—outperforming today’s frontier models in both accuracy and efficiency. To ground this new paradigm in its roots, we’ll also explore how Mixture of Experts (MoE) architectures continue to push the boundaries of scale and specialization. Learn how Cerebras trains state-of-the-art MoEs from Daria Soboleva, Head Research Scientist. ------------------------------------ Session ID: 935459 Track: Uncategorized Speaker: Christopher Harrison (Senior Developer Advocate) Format: Workshop Room: Nobhill C&D: Microsoft Time: 3 Jun 2025 10:40 AM Session Title: Piloting agents in GitHub Copilot Description: The agent capabilities added to GitHub Copilot have enhanced its ability to act as a peer programmer. Copilot can now discover and generate code based on existing standards, run tests, recover from errors, and call tools using Model Context Protocol (MCP). 
This workshop will guide you through piloting Copilot's agent capabilities and how to best integrate with the most widely adopted AI coding assistant in the world. Key takeaways include: - Understanding how and when to bring agents into your software development workflow - Providing context through the use of custom instructions and prompt files to ensure consistency across your team - Discovering how MCP provides access to an additional set of external tools and capabilities that the agent can use - Configuring Copilot's agentic capabilities to take advantage of your custom MCP server - Recommended best practices to help you responsibly accelerate your development while maintaining code quality and governance ------------------------------------ Session ID: 944039 Track: Uncategorized Speaker: Nagkumar Arkalgud (Senior Software Engineer) Room: Nobhill C&D: Microsoft Time: 4 Jun 2025 12:45 PM Session Title: AI Red Teaming Agent: Accelerate your AI safety and security journey with Azure AI Foundry Description: In the age of autonomous AI agents, ensuring their safety and reliability is paramount. But how can we proactively uncover vulnerabilities before they impact real-world scenarios? Enter Azure AI Evaluation SDK’s Red Teaming Agent—a cutting-edge tool designed to rigorously challenge your AI agents, exposing hidden risks and unexpected behaviors. This session will guide you through the powerful capabilities of Azure’s Red Teaming Agent, demonstrating how it simulates adversarial scenarios, stress-tests agentic decision-making, and ensures your applications remain robust, ethical, and safe. You’ll learn practical techniques for systematically identifying weaknesses, interpreting evaluation results, and integrating safety checks into your development lifecycle. Join us to explore how embracing adversarial testing not only mitigates risks but strengthens trust in your AI solutions—keeping you ahead in the rapidly evolving landscape of responsible AI. 
------------------------------------ Session ID: 936251 Track: Uncategorized Speaker: Windsurf Speaker (N/A) Session Title: Windsurf Booth Session #2 Description: Not Available ------------------------------------ Session ID: 939088 Track: Uncategorized Speaker: Daniel Kim (Head of Growth, Cerebras) Format: Workshop Room: Nobhill A&B: Workshops Time: 3 Jun 2025 01:00 PM Session Title: From Mixture of Experts to Mixture of Agents … with Super Fast Inference Description: Our hands-on workshop will walk you through how to build your own Mixture of Agents (MoA) system using the fastest, and most capable open models available: Qwen3-32B and Llama 3.3-70B. MoA is an emerging architecture that combines the strengths of multiple large language models in a layered, agent-based design. This approach delivers superior performance by enabling specialized agents to collaborate across layers—outperforming today’s frontier models in both accuracy and efficiency. To ground this new paradigm in its roots, we’ll also explore how Mixture of Experts (MoE) architectures continue to push the boundaries of scale and specialization. Learn how Cerebras trains state-of-the-art MoEs from Daria Soboleva, Head Research Scientist. ------------------------------------ Session ID: 936905 Track: Uncategorized Speaker: Cedric Vidal (Principal AI Advocate) Format: Workshop Room: Nobhill C&D: Microsoft Time: 3 Jun 2025 01:00 PM Session Title: Building Code-First AI Agents with Azure AI Agent Service: A Practical introduction Description: This workshop offers a hands-on introduction to developing Large Language Model (LLM)-powered AI agents using Microsoft’s Azure AI Agent Service. Participants will build a conversational agent capable of analyzing sales data, generating visualizations, and delivering actionable insights. 
The session takes a code-first approach using the Azure AI Foundry SDK for Python, and demonstrates how to integrate core Azure services including Azure OpenAI, Azure AI Search, and Azure Storage. Attendees will explore key concepts such as function calling, document grounding, and leveraging code interpreters to generate diagrams. The workshop also covers how to connect agents to external data sources like SQL databases (e.g., SQLite), enabling access to legacy relational systems. By the end of the session, participants will have a solid foundation for building and deploying intelligent, code-first AI agents with Azure AI Agent Service—ready to power real-world applications. ------------------------------------ Session ID: 936906 Track: Uncategorized Speaker: Jonathan Larson (Senior Principal Data Architect) Format: Talk Room: Juniper: Expo Sessions Time: 4 Jun 2025 03:15 PM Session Title: GraphRAG methods to create optimized LLM context windows for retrieval Description: Not Available ------------------------------------ Session ID: 936908 Track: Uncategorized Speaker: Marco Casalaina (VP of Products, Azure AI ) Format: Talk Room: Nobhill C&D: Microsoft Time: 4 Jun 2025 01:35 PM Session Title: Agentic RAG: build a reasoning retrieval engine with Azure AI Search Description: Not Available ------------------------------------ Session ID: 936006 Track: Uncategorized Speaker: swyx . 
(Curator, smol.ai) Room: Keynote/General Session (Yerba Buena 7&8) Time: 4 Jun 2025 04:45 PM Session Title: #define AI Engineer Description: Greg Brockman's career and advice for AI Engineers ------------------------------------ Session ID: 939093 Track: Uncategorized Speaker: Julián Duque (Principal Developer Advocate at Heroku) Format: Workshop Room: Nobhill A&B: Expo Sessions Time: 4 Jun 2025 01:00 PM Session Title: Building Agentic Applications with Heroku Managed Inference and Agents Description: In this workshop, you’ll learn how to use Heroku Managed Inference and Agents to build agentic applications. We’ll cover how to provision and deploy LLM models to your app, run untrusted code securely in Python, Node.js, Go, and Ruby using built-in tools, and use the Model Context Protocol (MCP) to connect tools and actions that extend your agents' capabilities. ------------------------------------ Session ID: 936818 Track: Uncategorized Speaker: Cedric Vidal (Principal AI Advocate) Format: Talk Room: Nobhill C&D: Microsoft Time: 4 Jun 2025 01:10 PM Session Title: Agentic Excellence: Mastering Evaluation of AI Agents with Azure AI Evaluation SDK Description: As AI agents transition from experimental assistants to critical components of enterprise workflows, reliably evaluating their performance becomes essential. But how do you systematically measure an AI agent’s capabilities, contextual understanding, and accuracy across diverse scenarios? In this talk, we'll dive deep into the Azure AI Evaluation SDK, an innovative tool designed to rigorously assess agentic applications. Learn how to create powerful evaluations using structured test plans, scenarios, and advanced analytics that pinpoint strengths and expose hidden weaknesses. Through practical examples and real-world case studies, you'll discover how companies are already leveraging this SDK to enhance agent trustworthiness, reliability, and performance. 
Whether you're developing conversational agents, data-driven decision-makers, or autonomous workflow orchestrators, this session equips you with the techniques and insights needed to ensure your AI solutions deliver exceptional value and exceed user expectations. ------------------------------------ Session ID: 936935 Track: Uncategorized Speaker: Lunch Learn (N/A) Session Title: Braintrust Lunch & Learn Description: Not Available ------------------------------------ Session ID: 936902 Track: Uncategorized Speaker: Harald Kirschner (VS Code Team Member) Format: Talk Room: Juniper: Expo Sessions Time: 5 Jun 2025 01:00 PM Session Title: Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments Description: "Vibe coding" often falters in complex enterprise environments. Drawing from real implementations, this talk demonstrates systematic approaches to customizing AI assistants for challenging codebases. We'll explore specialized techniques for navigating complex architectures, evidence-based strategies for undocumented legacy systems, methodologies for maintaining context across polyglot environments, and frameworks for standardizing AI usage while preserving developer autonomy. Through case studies from finance and healthcare, we'll present a comprehensive evaluation framework that bridges the gap between AI's theoretical capabilities and practical enterprise implementation, enabling true flow-state collaboration even within the most complex development ecosystems. ------------------------------------ Session ID: 940839 Track: Uncategorized Speaker: Jon Peck (Technical Advocate & Software Developer) Format: Workshop Room: Nobhill C&D: Microsoft Time: 3 Jun 2025 03:30 PM Session Title: Collaborating with Agents in your Software Development Workflow Description: GitHub Copilot's agentic capabilities enhance its ability to act as a peer programmer. 
From the IDE to the repository, Copilot can generate code, run tests, and perform tasks like creating pull requests using Model Context Protocol (MCP). This instructor-led lab will guide you through using agent capabilities on both the client and the server: Key takeaways include: Understanding how to bring agents into your software development workflow Identifying scenarios where agents can be most impactful, as well as tips and tricks to provide the right context to lead to success Discovering how Model Context Protocol provides access to an additional set of external tools and capabilities that the agent can use Recommended practices to accelerate your development while maintaining code quality. ------------------------------------ Session ID: 936903 Track: Uncategorized Speaker: Jon Peck (Technical Advocate & Software Developer) Format: Talk Room: Willow: Expo Sessions Time: 4 Jun 2025 01:15 PM Session Title: Real-world MCPs in GitHub Copilot Agent Mode Description: As developers, we don't spend most of our time vibe-coding prototypes. More often, we're adding features, squashing bugs, and building tests for existing apps across a wide variety of services and technologies. Come learn how MCPs help GitHub Copilot to untangle real engineering problems. By allowing agent mode to securely work with data sources, testing tools, infrastructure providers, and even core DevOps tooling -- we can go beyond the hype, and solve the actual engineering problems we face every day. ------------------------------------ Session ID: 936901 Track: Uncategorized Speaker: Jon Peck (Technical Advocate & Software Developer) Format: Talk Room: Nobhill C&D: Microsoft Time: 5 Jun 2025 01:10 PM Session Title: Unlocking AI-Powered DevOps Within Your Organization Description: "Software development is a team sport, with many different roles, where everyone can win. 
But success isn't guaranteed; it depends on specific practices, policies, and tools which enable minimally-siloed, AI-accelerated collaboration across all parts of the DevOps process, from PM to development to CI/CD and security. Discover the patterns and tools which lead to success, methods for changing the status quo, and perhaps a few horror stories. We'll touch on innersourcing, cloud development, AI, automation, governance, security, scaling and more -- with actionable learnings for everyone from small maintainer communities to F500 Enterprises." ====================================================================== --- Track: VOICE (June 4) --- ====================================================================== Session ID: 916131 Track: Voice Speaker: Shrestha Basu Mallick (Product lead, Gemini Developer API) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Building Voice Agents with Gemini and Pipecat Description: Voice AI Agents are being deployed today in a wide range of business contexts. For example: - handling an increasing variety of call center tasks, - collecting patient data prior to healthcare appointments, - following up on inbound sales leads, - coordinating scheduling and logistics between companies, and - answering the phone for nearly every kind of small business. On the consumer side, conversational voice (and video) AI is also starting to make its way into social applications and games. And developers are sharing innovative personal voice AI projects and experiments every day on GitHub and social media. Building production-ready voice agents is complicated. Many elements are non-trivial to implement from scratch. This workshop will start with an overview of the voice AI landscape today. - The models, APIs, and infrastructure used for Voice AI applications that are operating at production scale. 
- How to write voice agent code that achieves ultra low latency conversation and enterprise-quality reliability. - What new models and tools are coming in the second half of 2025. Then we will shift to a hands-on format: build and deploy a voice agent. Engineers from Google and Daily will help you get set up with a starter kit repo for your intended use case, then help you extend that code to create your own, customized Voice AI application. ------------------------------------ Session ID: 933596 Track: Voice Speaker: Shrestha Basu Mallick (Product lead, Gemini Developer API) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 02:40 PM Session Title: Milliseconds to Magic: Real‑Time Workflows using the Gemini Live API and Pipecat Description: The Gemini Live API GA is now powered by Google's best cost-effective thinking model, Gemini 2.5 Flash. We will do a deep dive on the capabilities that the Gemini Live API combined with Pipecat unlocks for devs, with special focus on session management, turn detection, tool use (including async function calls), proactivity, multilinguality and integration with telephony and other infra. We will demo some of the more innovative capabilities. We will also talk through some customer use cases - especially how customers can use Pipecat to extend these realtime multimodal capabilities to client side applications such as customer support agents, gaming agents, tutoring agents etc. In addition, we also have an experimental version of the Live API, powered by Google's native audio offering, that can be tried in an experimental capacity. This experimental model can communicate with seamless, emotive, steerable, multilingual dialogue and enhances use cases where more natural voices can be a big differentiator. 
------------------------------------ Session ID: 915031 Track: Voice Speaker: Brooke Hopkins (Founder ) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 11:55 AM Session Title: What we can learn from self driving in autonomous voice agents Description: The reliability challenges facing voice & chat AI deployment today mirror those that the autonomous vehicle industry confronted years ago. This talk explores how evaluation methodologies developed for self-driving cars can be transferred to create autonomous, self-improving evaluation systems for conversational AI. Drawing from my experience building evaluation infrastructure at Waymo and now developing Coval, an enterprise-grade reliability platform for conversational agents, I'll demonstrate how systematic testing infrastructure is not just a technical requirement but a competitive advantage in the rapidly evolving AI landscape. ------------------------------------ Session ID: 914537 Track: Voice Speaker: Dominik Kundel (Developer Experience ) Format: Workshop Room: Golden Gate Ballroom B: Workshops Time: 3 Jun 2025 03:30 PM Session Title: Building voice agents with OpenAI Description: We'll walk through the differences between chained and speech-to-speech powered voice agents, how to approach them, best practices and transform a text-based agent into our first voice-enabled agent ------------------------------------ Session ID: 933599 Track: Voice Speaker: Rohit Talluri (WW Generative AI Specialist) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. 
------------------------------------ Session ID: 933491 Track: Voice Speaker: Chad Bailey (Senior Voice Bots Engineer, Daily) Room: Salons 9-15: Expo Hall Time: 5 Jun 2025 01:30 PM Session Title: Pipecat Cloud: Open Source Enterprise Voice AI Description: Learn about building voice agents for customer support, call center workflows, market research, and many other use cases. Pipecat is the open source, vendor neutral realtime agent framework used by teams at NVIDIA, OpenAI, Google DeepMind, AWS, and hundreds of startups and scale-ups. ------------------------------------ Session ID: 915028 Track: Voice Speaker: Sean DuBois (WebRTC and Realtime API) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 11:15 AM Session Title: [Voice Keynote] Your realtime AI is ngmi Description: Sean DuBois of OpenAI and Pion, and Kwindla Hultman Kramer of Daily and Pipecat, will talk about why you have to design realtime AI systems from the network layer up. Most people who build realtime AI apps and frameworks get it wrong. They build from either the model out or the app layer down. But unless you start with the network layer and build up, you'll never be able to deliver realtime audio and video streams reliably. And perhaps even worse, you'll get core primitives wrong: interruption handling, conversation state management, asynchronous function calling. Sean and Kwin agree on most things: old-school realtime systems people against the rest of the world. But they disagree on some important things, too, and will argue about those things live on stage. Do you need to give developers "thick" client-side realtime SDKs? Can you build truly great vendor neutral APIs? (You'll be surprised which of them argues which side, on that topic.) 
------------------------------------ Session ID: 932493 Track: Voice Speaker: Tom Shapland, PhD (Product Manager) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 02:20 PM Session Title: Why ChatGPT Keeps Interrupting You Description: ChatGPT Advanced Voice Mode isn’t interrupting just you. Interruptions, and turn-taking in general, are unsolved problems for all Voice AI agents. Nobody likes being cut short – and people have much less patience for machines than they do for other humans. Turn-taking failures take many forms (e.g., the agent interrupts the user, the agent mistakes a cough for an interruption), and all of them lead to users immediately hanging up the phone. In this talk, we use human conversation as a framework for understanding both today’s approaches to turn detection and where the field is headed. You’ll learn about how linguists think about turn detection in human dialogue, what’s working (and what’s broken) in current methods, and how we might build Voice AIs that interrupt you less than your human brother. ------------------------------------ Session ID: 933580 Track: Voice Speaker: Philip Kiely (Head of Developer Relations) Room: Willow: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Optimizing inference for voice models in production Description: How do you get time to first byte (TTFB) below 150 milliseconds for voice models -- and scale it in production? As it turns out, open-source TTS models like Orpheus have an LLM backbone that lets us use familiar tools and optimizations like TensorRT-LLM and FP8 quantization to serve the models with low latency. But client code, network infrastructure, and other outside-the-GPU factors can introduce latency in the production stack. In this talk, we'll cover the basic mechanics of TTS inference, common pitfalls to avoid in integrating them into production systems, and how to extend this high-performance system to serve customized models with voice cloning and fine-tuning. 
------------------------------------ Session ID: 933721 Track: Voice Speaker: Suman Debnath (Principal Developer Advocate, AI/ML, AWS) Format: Workshop Room: Foothill G1&2: Workshops Time: 3 Jun 2025 03:30 PM Session Title: VoiceVision RAG - Integrating Visual Document Intelligence with Voice Response Description: In this workshop we will explore the integration of Colpali, a cutting-edge Vision based Retrieval Model, with voice synthesis for next-generation RAG systems. We'll demonstrate how Colpali's ability to generate multi-vector embeddings directly from document images bypasses traditional OCR and complex preprocessing, while adding voice output creates a more intuitive and accessible user experience. Attendees will see how this combination handles documents with mixed textual and visual information, leading to more efficient and accurate information retrieval with natural voice responses. ------------------------------------ Session ID: 937506 Track: Voice Speaker: Neil Dwyer (CTO) Room: Foothill E: Voice Time: 4 Jun 2025 12:15 PM Session Title: Serving Voice AI at $1/hr: Open-source, LoRAs, Latency, Load Balancing Description: This is a talk that goes over our experience deploying Orpheus (Emotive, Realtime TTS) to production. It will cover topics: - Latency and optimizations - High fidelity voice clones w/ examples - Load balancing w/ multiple GPUs and multiple LoRas ------------------------------------ Session ID: 915431 Track: Voice Speaker: Thor 雷神 Schaeff (DX at ElevenLabs) Format: Workshop Room: Golden Gate Ballroom A: Workshops Time: 3 Jun 2025 11:00 AM Session Title: Build multilingual Conversational AI Agents Description: In this workshop you will learn how to build multilingual Conversational AI agents that can automatically detect your user's spoken language and can seamlessly switch to their preferred language. 
------------------------------------ Session ID: 931123 Track: Voice Speaker: Peter Bar (Product Lead, Voice AI) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 11:35 AM Session Title: Shipping an Enterprise Voice AI Agent in 100 Days Description: What does it take to go from blank page to live enterprise voice agent in 100 days? That’s the challenge we took on with Fin Voice at Intercom. Enterprise customer service demands high-quality, reliable voice interactions - but delivering that fast means wrestling with tough problems like latency, hallucinations, voice quality, and answer accuracy. We rapidly evaluated and integrated a full voice stack - including transcription, language model, text-to-speech, retrieval-augmented generation, and telephony - while designing tools that fit seamlessly into existing human support workflows. In this session, I’ll share key lessons from our accelerated development of Fin Voice. We'll explore the technical and operational hurdles we faced, the trade-offs we made, and how we built deployment and handover tools that work for customer service teams. You'll leave with insights into building AI-driven voice products that are both powerful and practical. ------------------------------------ Session ID: 916079 Track: Voice Speaker: Jordan Dearsley (CEO) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 02:00 PM Session Title: Building the Voice-First Future: Omnipresent Agents that Listen, Talk and Act Description: We’re entering a world where talking to machines feels as natural as talking to people. Voice is about to become the dominant interface for technology - ambient, always-on, and human by default. To get there, we need infrastructure that can orchestrate voice, tools, memory, real-time reasoning and telephony. This talk explores the vision for voice and how we're making it work at scale. 
------------------------------------ Session ID: 933589 Track: Voice Speaker: Mark Backman (Head of Product, Daily) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 11:00 AM Session Title: Pipecat Cloud: Enterprise Voice Agents Built On Open Source Description: Voice AI agents today can conduct natural, human-like conversations and perform a wide variety of tasks: customer support, lead qualification, healthcare patient intake, market research, and more. Today's best voice agents combine realtime responsiveness, open-ended conversational intelligence, reliable instruction following, and flexible integration with existing back-end systems. Learn how to build state-of-the-art voice agents using Pipecat's open-source, vendor-neutral tooling. You can deploy Pipecat agents to your own infrastructure or to Pipecat Cloud. Pipecat is used and supported by teams at NVIDIA, AWS, Google DeepMind, OpenAI, and hundreds of other companies. ------------------------------------ Session ID: 937506 Track: Voice Speaker: Jack Dwyer (CEO @ Gabber - End to End Backend For Realtime AI) Room: Foothill E: Voice Time: 4 Jun 2025 12:15 PM Session Title: Serving Voice AI at $1/hr: Open-source, LoRAs, Latency, Load Balancing Description: This talk covers our experience deploying Orpheus (emotive, realtime TTS) to production. Topics include: - Latency and optimizations - High-fidelity voice clones with examples - Load balancing with multiple GPUs and multiple LoRAs ------------------------------------ Session ID: 916131 Track: Voice Speaker: Kwindla Kramer (CEO) Format: Workshop Room: Salons 2-6: Workshops Time: 3 Jun 2025 10:40 AM Session Title: Building Voice Agents with Gemini and Pipecat Description: Voice AI Agents are being deployed today in a wide range of business contexts. 
For example: - handling an increasing variety of call center tasks, - collecting patient data prior to healthcare appointments, - following up on inbound sales leads, - coordinating scheduling and logistics between companies, and - answering the phone for nearly every kind of small business. On the consumer side, conversational voice (and video) AI is also starting to make its way into social applications and games. And developers are sharing innovative personal voice AI projects and experiments every day on GitHub and social media. Building production-ready voice agents is complicated. Many elements are non-trivial to implement from scratch. This workshop will start with an overview of the voice AI landscape today. - The models, APIs, and infrastructure used for Voice AI applications that are operating at production scale. - How to write voice agent code that achieves ultra-low-latency conversation and enterprise-quality reliability. - What new models and tools are coming in the second half of 2025. Then we will shift to a hands-on format: build and deploy a voice agent. Engineers from Google and Daily will help you get set up with a starter kit repo for your intended use case, then help you extend that code to create your own customized Voice AI application. ------------------------------------ Session ID: 915028 Track: Voice Speaker: Kwindla Kramer (CEO) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 11:15 AM Session Title: [Voice Keynote] Your realtime AI is ngmi Description: Sean DuBois of OpenAI and Pion, and Kwindla Hultman Kramer of Daily and Pipecat, will talk about why you have to design realtime AI systems from the network layer up. Most people who build realtime AI apps and frameworks get it wrong. They build from either the model out or the app layer down. But unless you start with the network layer and build up, you'll never be able to deliver realtime audio and video streams reliably. 
And perhaps even worse, you'll get core primitives wrong: interruption handling, conversation state management, asynchronous function calling. Sean and Kwin agree on most things: old-school realtime systems people against the rest of the world. But they disagree on some important things, too, and will argue about those things live on stage. Do you need to give developers "thick" client-side realtime SDKs? Can you build truly great vendor-neutral APIs? (You'll be surprised which of them argues which side on that topic.) ------------------------------------ Session ID: 933596 Track: Voice Speaker: Kwindla Kramer (CEO) Format: Talk Room: Foothill E: Voice Time: 4 Jun 2025 02:40 PM Session Title: Milliseconds to Magic: Real‑Time Workflows using the Gemini Live API and Pipecat Description: The Gemini Live API GA is now powered by Google's best cost-effective thinking model, Gemini 2.5 Flash. We will do a deep dive on the capabilities that the Gemini Live API combined with Pipecat unlocks for devs, with a special focus on session management, turn detection, tool use (including async function calls), proactivity, multilinguality, and integration with telephony and other infra. We will demo some of the more innovative capabilities. We will also talk through some customer use cases, especially how customers can use Pipecat to extend these realtime multimodal capabilities to client-side applications such as customer support agents, gaming agents, and tutoring agents. In addition, we have an experimental version of the Live API, powered by Google's native audio offering, that is available to try. This experimental model can communicate with seamless, emotive, steerable, multilingual dialogue and enhances use cases where more natural voices can be a big differentiator. 
------------------------------------ Session ID: 933599 Track: Voice Speaker: Arjun Desai (Co-Founder, Cartesia) Room: Nobhill A&B: Expo Sessions Time: 5 Jun 2025 01:15 PM Session Title: Serving Voice AI at Scale Description: Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Cartesia, in a fireside chat for a technical deep dive into learnings and best practices for building a state-of-the-art inference stack that serves global enterprise customers. ## Expo Open across all 3 days. Featuring 30+ booths and demo areas showcasing the most relevant and forward-thinking AI infrastructure and developer tools. Meet the engineers and founders behind: Microsoft, AWS, MongoDB, Neo4j, Hasura, Galileo, Sourcegraph, LlamaIndex, Temporal, Baseten, Elastic, Orb, Gitpod, Freeplay, Dagger, Traceloop, Pydantic, Arize, Arcjet, Zed, Modal, Agentuity, Weights & Biases, Fly.io, Sierra, Vellum, GenSx, Redis, Langbase, Twilio, Descope, SuperAnnotate, Unstructured, Baz, VESSL AI, Riza, Tambo, Sentry, Xpander, Thomson Reuters, ElevenLabs, Pomerium, Daytona, Polar Signals, Vercel, Ampersand, Together AI, Distributional, and many more. **[Buy Tickets](https://ti.to/software-3/ai-engineer-worlds-fair-2025?source={{UTM_SOURCE}}) | [Watch 2023/2024/2025 Talks](https://youtube.com/@aidotengineer)** ## AI Architects Track Invite-only track for AI executives (VPs, CTOs, Heads of AI at enterprises with >1000 employees). 
- Closed-door briefings and roundtables - Topics include technical org design, model building, FMOps, evals, inference optimization, build/buy decisions - Exclusive access to premium lunches and networking in the View Lounge ## Side Events (2024 Examples) A full week of satellite events hosted by our partners: - Hackathons (e.g., AI21, GenLab x AIEWF) - Deep Tech Week launch parties - RAG++ pre-party, AI DevTools nights - Rooftop happy hours and after-parties - Demo Days, quality conferences, founders dinners If you are organizing an event around June 1–8, email **sponsorships@ai.engineer** to be added to the official calendar. ## Venue & Hotel **Marriott Marquis, San Francisco** 780 Mission St, San Francisco, CA 94103 - Yerba Buena Ballroom: Keynotes, expo, and large sessions - Golden Gate Ballroom: Dedicated space for workshops - View Lounge: Reserved for AI Architects and Leadership Track networking ## Sponsors ### Community Partners - Data Council, Hall C, Ai LA, Open Web Foundation, SF Python, SF Node, SF Java, Weaviate, Prompt Engineering, R meetup, Hypergrowth, Vector DAO, GenAI Collective, Cambrian ML, Ai Tinkerers, CodingNomads, Ai Product Builders, Ai Salon, Ai Makers SF, OpenSourceGrill, Ai Breakfast Club, Ai Stack, Nexus Events, Seattle Ai Society, RVC, Ai Happy Hour, SF Ai, FourthBrain, Ai Comic Books, Ai Engineer Foundation ### Presenting Sponsor Microsoft ### Innovation Partner AWS ### Track Sponsors Neo4j, Braintrust, Hasura ### Platinum Sponsors Graphite, Daily, Windsurf, MongoDB, AugmentCode, WorkOS ### Gold Sponsors Neo4j, Hasura, Galileo, Sourcegraph, LlamaIndex, Temporal, Baseten, Elastic, Orb, Gitpod, Freeplay, Dagger, Traceloop, Pydantic, Arize, Arcjet, Zed, Modal, Agentuity ### Silver Sponsors Weights & Biases, Fly.io, Sierra, Vellum, GenSx, Redis, Langbase, Twilio, Descope, SuperAnnotate, Unstructured, Baz, VESSL AI, Riza, Tambo, Sentry, Xpander, Thomson Reuters, ElevenLabs, Pomerium, Daytona, Polar Signals, Vercel, Ampersand, Together AI, 
Distributional ### Supporters Circle ## Testimonials > "The most insightful and exciting conference I ever attended. High signal, deeply technical, and community-focused." > — Yanick J. S. > "By far the best AI conference I've ever attended." > — Dedy Kredo > "Reminded me of the early Twitter dev scene—a spark for a decade of innovation." > — Eric Ryan > "Months of effort distilled into powerful 20-minute talks." > — Yubrew > "You could feel the buzz and optimism everywhere." > — Eric Ness ## Stay Updated - **[Buy Tickets](https://ti.to/software-3/ai-engineer-worlds-fair-2025?source={{UTM_SOURCE}})** Early bird discounts available until sell-out. - **[Watch Talks](https://youtube.com/@aidotengineer)** Browse sessions from 2023, 2024, and upcoming 2025. - **[Subscribe to Newsletter](https://ai.engineer/newsletter)** Get notified about speakers, tickets, livestreams, and community events. - **[Follow on X](https://twitter.com/aiDotEngineer)** Live updates, real-time speaker quotes, and behind-the-scenes moments. - **[Subscribe on YouTube](https://www.youtube.com/@aiengineer)** Access full talk recordings and curated playlists from every year. ## Contact & Connect - [Sponsor Inquiry](mailto:sponsorships@ai.engineer) - [Volunteer](https://ai.engineer/volunteer) - [Jobs](https://ai.engineer/jobs) - [Scholarships](https://ai.engineer/scholarships) - [Code of Conduct](https://ai.engineer/code-of-conduct) - [About](https://ai.engineer/about) - [What is an AI Engineer?](https://ai.engineer/what-is-an-ai-engineer) **Copyright 2025 Software 3.0 LLC** **Note:** The 2025 tracks are subject to change. Check the website for the latest updates. **[Apply to Speak](https://sessionize.com/ai-engineer-worlds-fair-2025)**