Varun Badrinath Krishna & Petro Junior Milan - Build enterprise generative AI apps using Llama-3 at 1,000 tokens/s on the SambaNova AI platform


In this workshop, you will learn how to build LLM-based apps, such as a question-answering system with RAG, in LangChain using Llama-3 at 1,000 tokens per second on the SambaNova AI Platform.

Level: Intermediate

Abstract: SambaNova delivers generative AI capabilities to the enterprise. In this workshop, you will learn:

● About SambaNova’s full-stack generative AI platform, powered by the SN40L AI chip and delivering unparalleled performance for training and inference

● About Samba-1, a trillion-parameter composition of experts (CoE) model, and how it can be used in enterprise settings

● How to build and deploy an end-to-end question-answering app with retrieval-augmented generation (RAG) for enterprise search, using the following stack: LangChain as the framework, Unstructured for pre-processing text documents, the E5-large-v2 embedding model, the ChromaDB vector store, and Llama-3-8B-Instruct running at a record speed of 1,000 tokens per second via SambaNova.
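To make the RAG flow in the last bullet concrete, here is a minimal, dependency-free sketch of the retrieve-then-generate pattern. It is not the workshop's code and does not use LangChain, Unstructured, E5-large-v2, ChromaDB, or a SambaNova endpoint: the bag-of-words `embed` function, the `InMemoryVectorStore` class, and the `echo_llm` placeholder are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: stores (chunk, vector) pairs."""

    def __init__(self):
        self._items = []

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k stored chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble retrieved chunks and the question into a single prompt string."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, store, llm, k=2):
    """Retrieve the top-k chunks, then pass the augmented prompt to the LLM callable."""
    return llm(build_prompt(question, store.search(question, k=k)))


def echo_llm(prompt):
    """Placeholder LLM (assumption): returns the prompt unchanged instead of calling a model."""
    return prompt


store = InMemoryVectorStore()
store.add([
    "The SN40L is SambaNova's AI chip.",
    "ChromaDB is a vector store.",
    "Streamlit is used to build the demo apps.",
])
top = store.search("Which chip powers SambaNova?", k=1)
print(top)
print(answer("Which chip powers SambaNova?", store, echo_llm, k=1))
```

In a real pipeline, each stand-in would be swapped for the corresponding component from the list above, while the overall shape (index chunks, retrieve by similarity, build a prompt, call the model) stays the same.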

This workshop is designed for tech professionals, engineers, and anyone interested in enterprise generative AI applications.

Prerequisites: Programming experience (ideally in Python), a GitHub account, and a laptop

Assets: We will provide a link to the GitHub repo with step-by-step instructions on how to install the required libraries and how to run the Jupyter notebooks and Streamlit apps. We will also provide SambaNova API keys for the CoE and Llama-3 endpoints.

GitHub Repo: Dev Setup for Exercise 1: Dev Setup for Exercise 2:

Varun Badrinath Krishna

Varun is a Sr Principal AI Solutions Engineer at SambaNova Systems. He is currently investigating the benefits of fine-tuning embedding and decoder LLMs in retrieval-augmented generation (RAG). Previously, he led the deployment of AI/ML applications across CRM, e-commerce, healthcare, finance, energy, manufacturing, fraud detection, and cybersecurity at Fortune 500 enterprises. He has worked at Cisco, IBM Research, ABB, and research institutes in Singapore. He holds a Ph.D. in Computer Engineering from the University of Illinois at Urbana-Champaign.

Petro Junior Milan

Petro is a Principal Engineer in the AI Solutions team at SambaNova Systems, where he is currently working on developing applications powered by large language models. His expertise spans AI for Science and Generative AI. He obtained his PhD and MS from the Georgia Institute of Technology and previously interned at Argonne National Laboratory. He has given several tutorials at conferences, e.g., Supercomputing, and in 2023, he was selected to participate in the prestigious US Frontiers of Engineering organized by the National Academy of Engineering.

