Architecting GenAI Virtual Assistants

Main Speaker

Learning Tracks

Course ID

52074

Date

17-07-2025

Time

Daily seminar
9:00-16:30

Location

Daniel Hotel, 60 Ramat Yam st. Herzliya

Overview

Join Eyal Rubin, CEO of 1 2 3 Completed, for a hands-on look at how to architect backend systems for GenAI-powered virtual assistants. Based on real-world enterprise projects, this session covers how to build flexible, scalable platforms using modular architecture, prompt/code separation, and multi-LLM support. We’ll explore practical approaches to RAG pipelines with Pinecone, orchestration using LangChain, and techniques for tracking and optimizing prompt performance with tools like Phoenix. The session also covers integrating Text-to-Speech (TTS) and Speech-to-Text (STT) capabilities and building systems that are not only powerful but also flexible and maintainable. Whether you’re launching your first assistant or scaling an existing platform, this session offers field-tested strategies and actionable insights—with space for shared learning and peer exchange.

Who Should Attend

Developers, Architects, Engineering Managers, and Technical Leaders interested in understanding how to build scalable and flexible backend architectures for AI-powered applications.

Prerequisites

Course Contents

 
  1. Virtual Assistant Backend Architecture: The Essentials
    • High-level flow of a GenAI-powered virtual assistant
    • Core components: Frontend (chat/web/voice), Backend API, LLM orchestrator, RAG server, STT/TTS layer, and analytics
    • Example: the role of a Phoenix server in managing state and flow
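The high-level flow above can be sketched as a chain of components. This is a stdlib-only illustration with stubbed services and hypothetical names, not a real implementation; a production backend would call actual STT/TTS, vector-store, and LLM services at each stage:

```python
# Minimal sketch of the request flow in a GenAI virtual assistant backend.
# All component names are illustrative; each function stubs out a real service.

def stt_layer(audio_or_text: str) -> str:
    """Speech-to-text stub: passes text through unchanged."""
    return audio_or_text

def rag_server(query: str) -> list[str]:
    """Retrieves supporting context for the query (stubbed)."""
    return [f"context for: {query}"]

def llm_orchestrator(query: str, context: list[str]) -> str:
    """Assembles a prompt from query + retrieved context and 'calls' the LLM."""
    prompt = f"Answer using context {context}: {query}"
    return f"LLM answer to '{query}'"   # stand-in for a real model call

def backend_api(user_input: str) -> str:
    """Backend entry point: STT -> RAG -> orchestrator -> (TTS) -> frontend."""
    text = stt_layer(user_input)
    context = rag_server(text)
    answer = llm_orchestrator(text, context)
    return answer  # a TTS layer would synthesize audio here for voice clients

reply = backend_api("What are my open tickets?")
```

The point of the sketch is the separation of responsibilities: each layer can be replaced (a different STT vendor, a different vector store) without touching the others.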
 
  2. Separation of Code and Prompts
    • Benefits of decoupling logic from prompt design
    • Using templating and versioning for prompts
    • Integration strategies for storing prompts in external services (e.g., DBs or repos)
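One way to picture the decoupling described above: code references prompts only by name and version, while the wording lives in an external store. The in-memory `PROMPT_STORE` below is a hypothetical stand-in for a DB table or a prompts repo:

```python
# Sketch: prompts stored outside the code as versioned templates.
# PROMPT_STORE stands in for an external DB or repo; all names are illustrative.
from string import Template

PROMPT_STORE = {
    ("summarize", "v1"): Template("Summarize the following text: $text"),
    ("summarize", "v2"): Template("Summarize in $style style: $text"),
}

def render_prompt(name: str, version: str, **params) -> str:
    """Fetch a template by (name, version) and fill in its parameters."""
    template = PROMPT_STORE[(name, version)]
    return template.substitute(**params)

# Application code only references (name, version); prompt wording can be
# edited, A/B tested, or rolled back without a code deployment.
p1 = render_prompt("summarize", "v1", text="Q3 revenue grew 12%.")
p2 = render_prompt("summarize", "v2", style="bullet-point", text="Q3 revenue grew 12%.")
```

Versioning the key, not just the text, is what makes rollbacks and side-by-side prompt experiments cheap.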
 
  3. Supporting Multiple LLMs: Abstraction and Flexibility
    • How to switch between models (e.g., OpenAI, Anthropic, Mistral, custom models)
    • Building modular backends to support pluggable LLMs
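The pluggable-LLM idea can be reduced to a thin provider-agnostic interface. The stub classes below are hypothetical; real adapters would wrap each vendor's SDK behind the same `complete` method:

```python
# Sketch: a provider-agnostic LLM interface so the backend can swap models.
# The provider classes are stubs standing in for real vendor SDK adapters.
from typing import Protocol

class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIStub:
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"

class AnthropicStub:
    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"

# New providers (Mistral, a custom model) register here without touching callers.
REGISTRY: dict[str, LLMClient] = {
    "openai": OpenAIStub(),
    "anthropic": AnthropicStub(),
}

def answer(provider: str, prompt: str) -> str:
    """Application code depends only on LLMClient, never on a vendor SDK."""
    return REGISTRY[provider].complete(prompt)

out = answer("anthropic", "hello")
```

Because callers see only `LLMClient`, replacing or adding a model is a registry change, not a refactor.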
 
  4. LLM Communication: LangChain & Alternatives
    • Introduction to LangChain and how it simplifies agent-based architectures
    • Examples of chaining steps, memory, and tools
    • Comparison with other orchestration layers and custom implementations
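The chaining-with-memory pattern these orchestration layers implement can be shown in a few lines. This is deliberately not the LangChain API, just the underlying idea a custom implementation would start from: each step reads and writes a shared context that acts as memory:

```python
# Stdlib sketch of the chaining pattern: steps share a context dict ("memory").
# This illustrates the concept, not the actual LangChain API.
from typing import Callable

Step = Callable[[dict], dict]

def normalize(ctx: dict) -> dict:
    ctx["query"] = ctx["query"].strip().lower()
    return ctx

def add_history(ctx: dict) -> dict:
    ctx.setdefault("history", []).append(ctx["query"])  # conversation memory
    return ctx

def respond(ctx: dict) -> dict:
    ctx["answer"] = f"answering '{ctx['query']}' (turn {len(ctx['history'])})"
    return ctx

def run_chain(steps: list[Step], ctx: dict) -> dict:
    """Run each step in order, threading the shared context through."""
    for step in steps:
        ctx = step(ctx)
    return ctx

ctx = run_chain([normalize, add_history, respond], {"query": "  Hello  "})
```

A framework adds tool invocation, retries, and tracing on top; the core control flow is this loop.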
 
  5. Retrieval-Augmented Generation (RAG): Core Patterns
    • What RAG is and why it matters
    • Using vector stores like Pinecone to enhance responses
    • Workflow: Ingest – Embed – Store – Retrieve – Inject
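The Ingest – Embed – Store – Retrieve – Inject workflow can be sketched end to end. A toy bag-of-words "embedding" and an in-memory list stand in for a real embedding model and a vector store like Pinecone:

```python
# Sketch of the RAG workflow. The toy word-count "embedding" and in-memory
# STORE stand in for a real embedding model and a Pinecone index.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (real systems use dense vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

STORE: list[tuple[Counter, str]] = []

def ingest(doc: str) -> None:
    STORE.append((embed(doc), doc))        # Embed + Store

def retrieve(query: str, k: int = 1) -> list[str]:
    qv = embed(query)
    ranked = sorted(STORE, key=lambda item: cosine(qv, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

def inject(query: str) -> str:
    context = "\n".join(retrieve(query))   # Retrieve + Inject into the prompt
    return f"Context:\n{context}\n\nQuestion: {query}"

ingest("Refunds are processed within 5 business days.")
ingest("Our office is in Herzliya.")
prompt = inject("How long do refunds take?")
```

Swapping the toy pieces for a real embedder and vector database changes the quality of retrieval, not the shape of the pipeline.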
 
  6. Prompt Engineering Analytics: Phoenix + Beyond
    • Importance of understanding prompt performance over time
    • Using Phoenix for experiment tracking, prompt scoring, and observability
    • Visualization of token usage, LLM latency, output drift
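The core of prompt observability is logging every call with its prompt version, token count, and latency, then aggregating per version. The sketch below shows only that idea; a tool like Phoenix provides the real tracing, scoring, and visualization on top:

```python
# Sketch of prompt-performance tracking: log each LLM call, then aggregate
# by prompt version. Illustrative only; not the Phoenix API.
from statistics import mean

RUNS: list[dict] = []

def log_run(prompt_version: str, tokens: int, latency_ms: float) -> None:
    """Record one LLM call's cost and latency under its prompt version."""
    RUNS.append({"version": prompt_version, "tokens": tokens, "latency_ms": latency_ms})

def summarize(version: str) -> dict:
    """Aggregate token usage and latency for one prompt version."""
    rows = [r for r in RUNS if r["version"] == version]
    return {
        "calls": len(rows),
        "avg_tokens": mean(r["tokens"] for r in rows),
        "avg_latency_ms": mean(r["latency_ms"] for r in rows),
    }

# Hypothetical measurements for two prompt versions:
log_run("v1", tokens=120, latency_ms=850.0)
log_run("v1", tokens=140, latency_ms=950.0)
log_run("v2", tokens=90, latency_ms=600.0)

stats_v1 = summarize("v1")
```

Comparing these aggregates across versions over time is what surfaces output drift and cost regressions.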
🧭 What Attendees Can Expect:
  • A field-tested reference architecture for building a robust Virtual Assistant platform
  • Guidance on designing a flexible system that supports multiple LLMs and can adapt to LLM replacements
  • Practical insights into monitoring and tuning both LLMs and prompt strategies
  • Real-world lessons and best practices from experienced experts actively developing and deploying GenAI systems
 
