Agentic Workflows
Build intelligent agents that route between models, implement RAG, and coordinate multiple AI systems
Table of contents
After reading this guide, you will know:
- How to build a model router that selects the best AI for each task
- How to implement RAG with PostgreSQL and pgvector
- How to run multiple agents in parallel with async
- How to create multi-agent systems with specialized roles
Model Routing
Different models excel at different tasks. A router can analyze requests and delegate to the most appropriate model.
class ModelRouter < RubyLLM::Tool
description "Routes requests to the optimal model"
param :query, desc: "The user's request"
def execute(query:)
task_type = classify_task(query)
case task_type
when :code
RubyLLM.chat(model: 'claude-3-5-sonnet').ask(query).content
when :creative
RubyLLM.chat(model: 'gpt-4o').ask(query).content
when :factual
RubyLLM.chat(model: 'gemini-1.5-pro').ask(query).content
else
RubyLLM.chat.ask(query).content
end
end
private
def classify_task(query)
classifier = RubyLLM.chat(model: 'gpt-4o-mini')
.with_instructions("Classify: code, creative, or factual. One word only.")
classifier.ask(query).content.downcase.to_sym
end
end
# Usage
chat = RubyLLM.chat.with_tool(ModelRouter)
response = chat.ask "Write a Ruby function to parse JSON"
RAG with PostgreSQL
Use pgvector and the neighbor gem for production-ready RAG implementations.
Setup
# Gemfile
gem 'neighbor'
gem 'ruby_llm'
# Generate migration for pgvector
rails generate neighbor:vector
rails db:migrate
# Create documents table
class CreateDocuments < ActiveRecord::Migration[7.1]
def change
create_table :documents do |t|
t.text :content
t.string :title
t.vector :embedding, limit: 1536 # OpenAI embedding size
t.timestamps
end
add_index :documents, :embedding, using: :hnsw, opclass: :vector_l2_ops
end
end
Document Model with Embeddings
class Document < ApplicationRecord
has_neighbors :embedding
before_save :generate_embedding, if: :content_changed?
private
def generate_embedding
response = RubyLLM.embed(content)
self.embedding = response.vectors
end
end
RAG Tool
class DocumentSearch < RubyLLM::Tool
description "Searches knowledge base for relevant information"
param :query, desc: "Search query"
def execute(query:)
# Generate embedding for query
embedding = RubyLLM.embed(query).vectors
# Find similar documents using neighbor
documents = Document.nearest_neighbors(
:embedding,
embedding,
distance: "euclidean"
).limit(3)
# Return formatted context
documents.map { |doc|
"#{doc.title}: #{doc.content.truncate(500)}"
}.join("\n\n---\n\n")
end
end
# Usage
chat = RubyLLM.chat
.with_tool(DocumentSearch)
.with_instructions("Search for context before answering. Cite sources.")
response = chat.ask "What is our refund policy?"
Multi-Agent Systems
Researcher and Writer Team
class ResearchAgent < RubyLLM::Tool
description "Researches topics"
param :topic, desc: "Topic to research"
def execute(topic:)
RubyLLM.chat(model: 'gemini-1.5-pro')
.ask("Research #{topic}. List key facts.")
.content
end
end
class WriterAgent < RubyLLM::Tool
description "Writes content based on research"
param :research, desc: "Research findings"
def execute(research:)
RubyLLM.chat(model: 'claude-3-5-sonnet')
.ask("Write an article:\n#{research}")
.content
end
end
# Coordinator uses both tools
coordinator = RubyLLM.chat.with_tools(ResearchAgent, WriterAgent)
article = coordinator.ask("Create an article about Ruby 3.3 features")
Parallel Agent Execution with Async
require 'async'
class ParallelAnalyzer
def analyze(text)
results = {}
Async do |task|
task.async do
results[:sentiment] = RubyLLM.chat
.ask("Sentiment of: #{text}. One word: positive/negative/neutral")
.content
end
task.async do
results[:summary] = RubyLLM.chat
.ask("Summarize in one sentence: #{text}")
.content
end
task.async do
results[:keywords] = RubyLLM.chat
.ask("Extract 5 keywords: #{text}")
.content
end
end
results
end
end
# Usage
analyzer = ParallelAnalyzer.new
insights = analyzer.analyze("Your text here...")
# All three analyses run concurrently
Supervisor Pattern
class CodeReviewSystem
def review_code(code)
Async do |task|
# Run reviews in parallel
reviews = {}
task.async do
reviews[:security] = RubyLLM.chat(model: 'claude-3-5-sonnet')
.ask("Security review:\n#{code}")
.content
end
task.async do
reviews[:performance] = RubyLLM.chat(model: 'gpt-4o')
.ask("Performance review:\n#{code}")
.content
end
task.async do
reviews[:style] = RubyLLM.chat(model: 'gpt-4o-mini')
.ask("Style review (Ruby conventions):\n#{code}")
.content
end
task.wait # Wait for all reviews
# Synthesize findings
RubyLLM.chat.ask(
"Summarize these code reviews:\n" +
reviews.map { |type, review| "#{type}: #{review}" }.join("\n\n")
).content
end
end
end
Error Handling
For robust error handling in agent workflows, leverage the patterns from the Tools guide:
- Return
{ error: "description" }
for recoverable errors the LLM might fix - Raise exceptions for unrecoverable errors (missing config, service down)
- Use the retry middleware for transient failures
See the Error Handling section in Tools for detailed patterns.
Next Steps
- Using Tools - Learn the fundamentals of tool usage
- Rails Integration - Build agent workflows in Rails
- Scale with Async - Deep dive into async patterns
- Error Handling - Build resilient systems