Agentic Workflows

Build intelligent agents that route between models, implement RAG, and coordinate multiple AI systems

Table of contents

  1. Model Routing
  2. RAG with PostgreSQL
    1. Setup
    2. Document Model with Embeddings
    3. RAG Tool
  3. Multi-Agent Systems
    1. Researcher and Writer Team
    2. Parallel Agent Execution with Async
    3. Supervisor Pattern
  4. Error Handling
  5. Next Steps

After reading this guide, you will know:

  • How to build a model router that selects the best AI for each task
  • How to implement RAG with PostgreSQL and pgvector
  • How to run multiple agents in parallel with async
  • How to create multi-agent systems with specialized roles

Model Routing

Different models excel at different tasks. A router can analyze requests and delegate to the most appropriate model.

class ModelRouter < RubyLLM::Tool
  description "Routes requests to the optimal model"
  param :query, desc: "The user's request"

  def execute(query:)
    task_type = classify_task(query)

    case task_type
    when :code
      RubyLLM.chat(model: 'claude-3-5-sonnet').ask(query).content
    when :creative
      RubyLLM.chat(model: 'gpt-4o').ask(query).content
    when :factual
      RubyLLM.chat(model: 'gemini-1.5-pro').ask(query).content
    else
      RubyLLM.chat.ask(query).content
    end
  end

  private

  def classify_task(query)
    classifier = RubyLLM.chat(model: 'gpt-4o-mini')
                     .with_instructions("Classify: code, creative, or factual. One word only.")
    classifier.ask(query).content.downcase.to_sym
  end
end

# Usage
chat = RubyLLM.chat.with_tool(ModelRouter)
response = chat.ask "Write a Ruby function to parse JSON"

RAG with PostgreSQL

Use pgvector and the neighbor gem for production-ready RAG implementations.

Setup

# Gemfile
gem 'neighbor'
gem 'ruby_llm'

# Generate migration for pgvector
rails generate neighbor:vector
rails db:migrate

# Create documents table
class CreateDocuments < ActiveRecord::Migration[7.1]
  def change
    create_table :documents do |t|
      t.text :content
      t.string :title
      t.vector :embedding, limit: 1536 # OpenAI embedding size
      t.timestamps
    end

    add_index :documents, :embedding, using: :hnsw, opclass: :vector_l2_ops
  end
end

Document Model with Embeddings

class Document < ApplicationRecord
  has_neighbors :embedding

  before_save :generate_embedding, if: :content_changed?

  private

  def generate_embedding
    response = RubyLLM.embed(content)
    self.embedding = response.vectors
  end
end

RAG Tool

class DocumentSearch < RubyLLM::Tool
  description "Searches knowledge base for relevant information"
  param :query, desc: "Search query"

  def execute(query:)
    # Generate embedding for query
    embedding = RubyLLM.embed(query).vectors

    # Find similar documents using neighbor
    documents = Document.nearest_neighbors(
      :embedding,
      embedding,
      distance: "euclidean"
    ).limit(3)

    # Return formatted context
    documents.map { |doc|
      "#{doc.title}: #{doc.content.truncate(500)}"
    }.join("\n\n---\n\n")
  end
end

# Usage
chat = RubyLLM.chat
      .with_tool(DocumentSearch)
      .with_instructions("Search for context before answering. Cite sources.")

response = chat.ask "What is our refund policy?"

Multi-Agent Systems

Researcher and Writer Team

class ResearchAgent < RubyLLM::Tool
  description "Researches topics"
  param :topic, desc: "Topic to research"

  def execute(topic:)
    RubyLLM.chat(model: 'gemini-1.5-pro')
           .ask("Research #{topic}. List key facts.")
           .content
  end
end

class WriterAgent < RubyLLM::Tool
  description "Writes content based on research"
  param :research, desc: "Research findings"

  def execute(research:)
    RubyLLM.chat(model: 'claude-3-5-sonnet')
           .ask("Write an article:\n#{research}")
           .content
  end
end

# Coordinator uses both tools
coordinator = RubyLLM.chat.with_tools(ResearchAgent, WriterAgent)
article = coordinator.ask("Create an article about Ruby 3.3 features")

Parallel Agent Execution with Async

require 'async'

class ParallelAnalyzer
  def analyze(text)
    results = {}

    Async do |task|
      task.async do
        results[:sentiment] = RubyLLM.chat
          .ask("Sentiment of: #{text}. One word: positive/negative/neutral")
          .content
      end

      task.async do
        results[:summary] = RubyLLM.chat
          .ask("Summarize in one sentence: #{text}")
          .content
      end

      task.async do
        results[:keywords] = RubyLLM.chat
          .ask("Extract 5 keywords: #{text}")
          .content
      end
    end

    results
  end
end

# Usage
analyzer = ParallelAnalyzer.new
insights = analyzer.analyze("Your text here...")
# All three analyses run concurrently

Supervisor Pattern

class CodeReviewSystem
  def review_code(code)
    Async do |task|
      # Run reviews in parallel
      reviews = {}

      task.async do
        reviews[:security] = RubyLLM.chat(model: 'claude-3-5-sonnet')
          .ask("Security review:\n#{code}")
          .content
      end

      task.async do
        reviews[:performance] = RubyLLM.chat(model: 'gpt-4o')
          .ask("Performance review:\n#{code}")
          .content
      end

      task.async do
        reviews[:style] = RubyLLM.chat(model: 'gpt-4o-mini')
          .ask("Style review (Ruby conventions):\n#{code}")
          .content
      end

      task.wait # Wait for all reviews

      # Synthesize findings
      RubyLLM.chat.ask(
        "Summarize these code reviews:\n" +
        reviews.map { |type, review| "#{type}: #{review}" }.join("\n\n")
      ).content
    end
  end
end

Error Handling

For robust error handling in agent workflows, leverage the patterns from the Tools guide:

  • Return { error: "description" } for recoverable errors the LLM might fix
  • Raise exceptions for unrecoverable errors (missing config, service down)
  • Use the retry middleware for transient failures

See the Error Handling section in Tools for detailed patterns.

Next Steps