The AI system uses LangChain for multi-provider support with a modular agent architecture.

Multi-LLM Support

The AI system supports multiple LLM providers through a unified interface built on LangChain. You can use models from OpenAI, Anthropic, and Google interchangeably across all agents.

File Structure

config/
└── ai.ts                 # Models, providers, pricing configuration

lib/ai/
├── index.ts              # Main exports
├── providers/
│   └── index.ts          # LangChain model factory (OpenAI, Anthropic, Google)
├── agents/
│   ├── index.ts          # Agent registry
│   ├── types.ts          # Agent interfaces
│   ├── base-agent.ts     # Base class for all agents
│   ├── chat/index.ts     # Chat agent
│   ├── code-assistant/   # Code assistant agent
│   ├── translator/       # Translator agent
│   └── writer/           # Writer agent
├── cache/
│   └── index.ts          # Prompt caching utilities — applied to the Anthropic system prompt in base-agent.ts (default `5m`, optional `1h` TTL via `AnthropicCacheTTL`)
└── langgraph.ts          # Message building utilities

app/api/ai/
└── stream/route.ts       # SSE streaming endpoint

Supported Providers

| Provider | Models | Environment Variable |
| --- | --- | --- |
| OpenAI | GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, GPT-4o, GPT-4o Mini, O3, O3-mini, O4-mini | OPENAI_API_KEY |
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku 4.5 + legacy models (Claude 4.1/4/3.7/3.5) | ANTHROPIC_API_KEY |
| Google | Gemini 3 Pro, 3 Flash, 2.5 Pro, 2.5 Flash, 2.5 Flash Lite, 2.0 Flash + image variants | GOOGLE_AI_API_KEY |

Model Configuration

Models are defined in config/ai.ts with pricing and capabilities. Each model entry specifies its provider, model ID, display name, input/output token costs (per million tokens), maximum context length, and any feature flags (like vision support or reasoning capability). To add a new model, add an entry to the models array in the config file and ensure the corresponding provider API key is set in your environment.
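
As a rough illustration of what one entry might look like, the sketch below models the fields listed above. The interface name, the field names, and the helper functions are hypothetical, and the example entry uses placeholder numbers rather than real pricing or limits; check config/ai.ts for the actual shape.

```typescript
// Hypothetical shape of a model entry; the real field names in config/ai.ts may differ.
type Provider = "openai" | "anthropic" | "google";

interface ModelDefinition {
  provider: Provider;
  id: string;                   // model ID sent to the provider
  name: string;                 // display name shown in the UI
  inputCostPerMillion: number;  // cost per 1M input tokens
  outputCostPerMillion: number; // cost per 1M output tokens
  maxContextTokens: number;     // maximum context length
  features?: { vision?: boolean; reasoning?: boolean };
}

// Placeholder values for illustration only, not real pricing or limits.
const models: ModelDefinition[] = [
  {
    provider: "openai",
    id: "example-model",
    name: "Example Model",
    inputCostPerMillion: 1,
    outputCostPerMillion: 2,
    maxContextTokens: 100_000,
    features: { vision: true },
  },
];

// Look up a model entry by its ID.
function findModel(id: string): ModelDefinition | undefined {
  return models.find((m) => m.id === id);
}

// Estimate the provider cost of one request from its token counts.
function estimateCost(
  model: ModelDefinition,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens / 1_000_000) * model.inputCostPerMillion +
    (outputTokens / 1_000_000) * model.outputCostPerMillion
  );
}
```

Keeping costs per million tokens in the entry itself means pricing updates stay a config change rather than a code change.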

Provider Factory

The provider factory (in lib/ai/providers/index.ts) creates LangChain-compatible model instances dynamically based on the configuration. It reads the model definition, determines the provider, and instantiates the correct LangChain class (ChatOpenAI, ChatAnthropic, or ChatGoogleGenerativeAI). This means you can switch models or add new providers without changing your agent code.
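
The dispatch described above can be sketched roughly as follows. This is not the code in lib/ai/providers/index.ts: the names createModel, FactoryDeps, and PROVIDER_ENV are hypothetical, and the constructors are injected so the example runs without LangChain installed. In a real setup each constructor would presumably wrap the matching LangChain class, for example `(opts) => new ChatOpenAI(opts)`.

```typescript
type Provider = "openai" | "anthropic" | "google";

// Environment variable per provider, taken from the Supported Providers table.
const PROVIDER_ENV: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  google: "GOOGLE_AI_API_KEY",
};

// Minimal, hypothetical view of a model entry: only what the factory needs.
interface ModelDefinition {
  id: string;
  provider: Provider;
}

interface ModelOptions {
  model: string;
  apiKey: string;
}

// Constructors and environment are injected (an assumption made for this sketch)
// so the dispatch logic can be exercised without network access or real keys.
interface FactoryDeps<T> {
  constructors: Record<Provider, (opts: ModelOptions) => T>;
  env: Record<string, string | undefined>;
}

// Pick the provider from the model definition, check its API key,
// and build the model instance with the matching constructor.
function createModel<T>(def: ModelDefinition, deps: FactoryDeps<T>): T {
  const envVar = PROVIDER_ENV[def.provider];
  const apiKey = deps.env[envVar];
  if (!apiKey) {
    throw new Error(`Missing ${envVar} for model ${def.id}`);
  }
  return deps.constructors[def.provider]({ model: def.id, apiKey });
}
```

Because agents only ever call the factory, swapping a model is a matter of passing a different definition; no agent code needs to know which provider class sits behind it.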

Built-in Agents

Five agents ship out of the box — Chat (default), Code Assistant, Translator, Writer, and Knowledge Base (RAG) — each defined in lib/ai/agents/ and registered in lib/ai/agents/index.ts. All extend BaseAgent, declare an allowedModels list, and surface in the agent selector dropdown. Credits are charged uniformly across agents — 1 credit per LLM token (input + output), deducted post-stream against provider-reported usage.
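
The credit rule above amounts to simple arithmetic. The sketch below is illustrative only: the TokenUsage shape and the function names are assumptions, and how the real code handles a balance that would go negative is not specified here.

```typescript
// Token counts as reported by the provider once the stream has finished.
// The field names are hypothetical.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// 1 credit per LLM token, input and output alike.
const CREDITS_PER_TOKEN = 1;

// Credits to charge for one completed request.
function creditsForUsage(usage: TokenUsage): number {
  return (usage.inputTokens + usage.outputTokens) * CREDITS_PER_TOKEN;
}

// Balance after the post-stream deduction. This sketch does not clamp at zero;
// the real handling of an overdrawn balance may differ.
function balanceAfterDeduction(balance: number, usage: TokenUsage): number {
  return balance - creditsForUsage(usage);
}
```

For example, a request that used 1,200 input tokens and 300 output tokens would cost 1,500 credits under this rule.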

For the full architecture, registry pattern, custom-agent recipe, and advanced override hooks (like the RAG agent's prepareRAGContext() pre-fetch), see Built-in Agents.

Chat Interface

A full-featured AI conversation page lives at /private-dashboard/chat. It is a Server Component shell with a Client Component (components/private/chat-interface.tsx) that handles streaming, agent switching, and session history. Data reads come from core/chat/queries.ts; writes go through core/chat/mutations.ts. Sessions are persisted in chat_sessions and messages in chat_messages, both RLS-scoped by membership.

For the file structure, hook API (useChat), session management rules, SSE handling, and pagination details, see Chat Interface.

RAG Document Chat

The Knowledge Base agent lets users upload TXT/MD/PDF files at /private-dashboard/documents. Uploaded files are chunked and embedded via OpenAI, and the agent answers questions by retrieving relevant chunks from a pgvector HNSW index. Similarity search runs through match_document_chunks_text, a SECURITY DEFINER RPC that enforces membership; the match cutoff is configurable via embeddingConfig.ragMatchThreshold (default 0.1 in config/ai.ts). Credits are deducted 1:1 with the embedding tokens consumed.
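
To make the cutoff concrete, the sketch below computes cosine distance (the quantity pgvector's `<=>` operator returns, equal to 1 minus cosine similarity) and applies a cutoff in application code. In the real system this happens inside the RPC, not in TypeScript. The function names and the ScoredChunk shape are hypothetical, and the comparison direction (keep chunks whose distance is at or below the cutoff) is an assumption; the actual RPC may compare similarity instead.

```typescript
// Cosine distance between two embedding vectors: 1 - cosine similarity.
// Returns NaN if either vector has zero length (norm of 0).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical result row: a chunk ID plus its distance to the query embedding.
interface ScoredChunk {
  id: string;
  distance: number;
}

// ASSUMPTION: a chunk matches when its distance is at or below the cutoff.
// Matches are returned closest first.
function applyCutoff(chunks: ScoredChunk[], cutoff: number): ScoredChunk[] {
  return chunks
    .filter((c) => c.distance <= cutoff)
    .sort((x, y) => x.distance - y.distance);
}
```

Identical directions give a distance of 0 and orthogonal vectors give 1, so a smaller cutoff under this reading means a stricter match.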

For the embedding-model table, credit math, RPC details, API routes, and the Knowledge Base page walkthrough, see RAG & Documents.

| Config Key | Default | Description |
| --- | --- | --- |
| embeddingConfig.defaultModel | text-embedding-3-small | Default embedding model for new documents |
| embeddingConfig.ragMatchThreshold | 0.1 | Cosine-distance cutoff for RAG chunk retrieval (pgvector <=>). Single source of truth: both core/documents/ reads and the rag agent must read this instead of inlining a literal. |
| aiConfig.minCreditsRequired | 100 | Minimum balance for the pre-flight check before any LLM call (chat or RAG) |
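
A minimal sketch of how these keys might be declared and consumed is shown below. The key names and defaults come from the table; the object layout and the two helper functions are assumptions, not the actual contents of config/ai.ts.

```typescript
// Key names and defaults from the table above; the object layout is assumed.
const embeddingConfig = {
  defaultModel: "text-embedding-3-small",
  ragMatchThreshold: 0.1,
} as const;

const aiConfig = {
  minCreditsRequired: 100,
} as const;

// Hypothetical pre-flight check: allow an LLM call (chat or RAG) only when
// the balance is at least the configured minimum.
function passesPreflight(balance: number): boolean {
  return balance >= aiConfig.minCreditsRequired;
}

// Hypothetical accessor: callers read the threshold from config
// instead of inlining a literal such as 0.1.
function getRagMatchThreshold(): number {
  return embeddingConfig.ragMatchThreshold;
}
```

Routing every read through the config object keeps the threshold and the minimum balance changeable in one place.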