AI Agent

The AI agent orchestrates the RAG pipeline, retrieving relevant context and generating responses.

Overview

The agent uses the @convex-dev/agent framework to:

  1. Receive user messages
  2. Search for relevant document chunks
  3. Build a context-aware prompt
  4. Generate and stream responses
  5. Include source citations

Agent Configuration

```typescript
import { Agent } from '@convex-dev/agent'

const agent = new Agent({
  model: 'openai/gpt-4o', // via OpenRouter
  instructions: `You are a helpful assistant that answers questions
    based on the provided documents. Always cite your sources.`,
})
```

RAG Pipeline

1. Query Processing

When a user sends a message, the agent:

  • Extracts the core question
  • Generates an embedding for similarity search
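Before embedding, the raw message is typically cleaned up so that incidental whitespace doesn't change the vector. A minimal preprocessing sketch (illustrative; the agent framework and embedding provider handle the actual embedding call):

```typescript
// Normalize the user message before it is embedded: trim the ends and
// collapse runs of whitespace into single spaces. The embedding call itself
// (not shown) sends this normalized string to the provider.
function normalizeQuery(message: string): string {
  return message.trim().replace(/\s+/g, " ")
}
```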

2. Context Retrieval

Vector search finds relevant chunks:

```typescript
// Retrieve the closest matches for the query embedding.
const relevantChunks = await vectorSearch(query, {
  limit: 5,       // return at most 5 chunks
  threshold: 0.7, // drop matches scoring below 0.7 similarity
})
```
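Convex's vector index implements the scoring natively, but the idea behind `limit` and `threshold` can be sketched in plain TypeScript: score every stored chunk against the query embedding with cosine similarity, keep only scores above the threshold, and return the top few.

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

// Keep the top `limit` chunks scoring at or above `threshold`.
function topChunks(
  query: number[],
  chunks: { embedding: number[]; text: string }[],
  limit = 5,
  threshold = 0.7,
) {
  return chunks
    .map((c) => ({ ...c, score: cosine(query, c.embedding) }))
    .filter((c) => c.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}
```

This is an illustration of the semantics, not the production path: a real index avoids the linear scan.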

3. Prompt Construction

The retrieved chunks are formatted into the prompt:

```
Based on the following documents:

[Document 1: filename.pdf]
...chunk content...

[Document 2: notes.md]
...chunk content...

Answer the user's question: {question}
```
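The formatting above reduces to a pure function over the retrieved chunks (a simplified sketch; the field names `filename` and `content` are illustrative):

```typescript
type RetrievedChunk = { filename: string; content: string }

// Assemble the RAG prompt from retrieved chunks and the user's question,
// matching the template shown above.
function buildPrompt(chunks: RetrievedChunk[], question: string): string {
  const context = chunks
    .map((c, i) => `[Document ${i + 1}: ${c.filename}]\n${c.content}`)
    .join("\n\n")
  return `Based on the following documents:\n\n${context}\n\nAnswer the user's question: ${question}`
}
```

Numbering the documents in the prompt gives the model stable labels to cite in its answer.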

4. Response Generation

The AI generates a response that:

  • Answers based on the retrieved context
  • Cites specific documents
  • Acknowledges when information isn't available
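The last point has a concrete consequence: when retrieval returns nothing above the threshold, the agent should say so rather than guess. A hedged sketch of that policy (illustrative, not the framework's built-in behavior):

```typescript
// If no chunks cleared the similarity threshold, short-circuit with an
// explicit "not in the documents" answer instead of generating from
// an empty context.
function contextOrFallback(chunks: { text: string }[]): string | null {
  if (chunks.length === 0) {
    return "I couldn't find that information in the uploaded documents."
  }
  return null // context exists; proceed to normal generation
}
```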

Streaming

Responses stream in real-time using Convex subscriptions, providing immediate feedback as the AI generates text.
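Conceptually, the client receives the response as incremental deltas and appends them as they arrive. A minimal local sketch with an async generator (in the app the deltas arrive through a Convex subscription, not a generator):

```typescript
// Simulated token stream; stands in for deltas arriving over a subscription.
async function* tokenStream(text: string, chunkSize = 4) {
  for (let i = 0; i < text.length; i += chunkSize) {
    yield text.slice(i, i + chunkSize)
  }
}

// Append each delta as it arrives, yielding the full response at the end.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let out = ""
  for await (const delta of stream) out += delta
  return out
}
```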

Model Selection

Users can choose from various models via OpenRouter:

  • GPT-4o (default)
  • Claude 3.5 Sonnet
  • Gemini Pro
  • And many more

The model can be changed per-conversation in the UI.
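Per-conversation selection amounts to mapping the UI's display name to an OpenRouter model slug, with a fallback to the default. A sketch (the slugs follow OpenRouter's `provider/model` convention; verify exact identifiers against OpenRouter's catalog):

```typescript
// Illustrative display-name → OpenRouter slug mapping.
const MODELS: Record<string, string> = {
  "GPT-4o": "openai/gpt-4o",
  "Claude 3.5 Sonnet": "anthropic/claude-3.5-sonnet",
  "Gemini Pro": "google/gemini-pro",
}

// Resolve the user's choice, falling back to the default model.
function resolveModel(uiChoice: string): string {
  return MODELS[uiChoice] ?? MODELS["GPT-4o"]
}
```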

Released under the MIT License.