
Providers

OpenKoi is model-agnostic. It supports multiple LLM providers out of the box and can assign different models to different roles within the iteration engine. Providers are auto-discovered from environment variables and local services -- no manual configuration required for the common case.

Provider Trait

Every provider implements the same Rust trait, making them interchangeable:

```rust
#[async_trait]
pub trait ModelProvider: Send + Sync {
    fn id(&self) -> &str;
    fn name(&self) -> &str;
    fn models(&self) -> &[ModelInfo];

    async fn chat(
        &self,
        request: ChatRequest,
    ) -> Result<ChatResponse, ProviderError>;

    async fn chat_stream(
        &self,
        request: ChatRequest,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<ChatChunk>>>>, ProviderError>;

    async fn embed(
        &self,
        texts: &[&str],
    ) -> Result<Vec<Vec<f32>>, ProviderError>;
}
```

The chat method is used for non-streaming calls (evaluation, planning). The chat_stream method is used for execution output that streams to the terminal. The embed method generates vector embeddings for semantic memory search.


Built-in Providers

Subscription-Based (OAuth)

These providers use device-code OAuth — you authenticate through your browser using your existing subscription. No API key needed.

GitHub Copilot

| Property | Value |
| --- | --- |
| Provider ID | github-copilot |
| Auth | Device-code OAuth (GitHub login) |
| Command | openkoi connect copilot |
| Default model | Copilot Chat model |
| API | GitHub Copilot Chat API |
| Streaming | Yes (SSE) |
| Embeddings | No |

Works with any GitHub Copilot subscription (Individual, Business, or Enterprise). When you run openkoi connect copilot, OpenKoi displays a device code and opens your browser. Sign in to GitHub, enter the code, and you're connected.

$ openkoi connect copilot

  Visit: https://github.com/login/device
  Enter code: ABCD-1234

  Waiting for authorization...
  Connected to GitHub Copilot.

Tokens are stored in ~/.openkoi/auth.json and refreshed automatically before expiry.
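
The refresh-before-expiry behavior can be sketched as a small helper. This is a hypothetical illustration; the refresh margin below is an assumption, not OpenKoi's actual value.

```rust
use std::time::{Duration, SystemTime};

// Hypothetical helper: decide whether a stored OAuth token should be
// refreshed. The margin is an illustrative assumption; OpenKoi's real
// refresh window may differ.
fn needs_refresh(expires_at: SystemTime, margin: Duration) -> bool {
    // Refresh once "now + margin" reaches or passes the expiry time.
    SystemTime::now() + margin >= expires_at
}
```

Checking slightly ahead of expiry avoids making a request with a token that lapses mid-flight.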

ChatGPT (OpenAI)

| Property | Value |
| --- | --- |
| Provider ID | openai-oauth |
| Auth | Device-code OAuth (OpenAI login) |
| Command | openkoi connect chatgpt |
| Default model | ChatGPT model |
| API | OpenAI ChatGPT API |
| Streaming | Yes (SSE) |
| Embeddings | No |

Works with ChatGPT Plus or Pro subscriptions. The flow is identical to GitHub Copilot — device code, browser login, done.

$ openkoi connect chatgpt

  Visit: https://login.chatgpt.com/device
  Enter code: WXYZ-5678

  Waiting for authorization...
  Connected to ChatGPT.

Tokens are stored in ~/.openkoi/auth.json alongside any other OAuth tokens.

API Key Providers

Anthropic

| Property | Value |
| --- | --- |
| Provider ID | anthropic |
| API key env var | ANTHROPIC_API_KEY |
| Default model | claude-sonnet-4-5 |
| API | Anthropic Messages API |
| Streaming | Yes (SSE) |
| Embeddings | No (use OpenAI embedder) |

Anthropic is the highest-priority API key provider when auto-detecting. OpenKoi uses Anthropic's prompt caching feature to reduce costs on repeated system prompts across iterations within a session.

Prompt Caching

The system prompt (soul + task context + skill descriptions) is marked with cache_control: Ephemeral. This tells Anthropic to cache the system prompt across consecutive API calls, reducing the input-token cost of the cached portion by roughly 90%.

```rust
// Simplified from src/provider/anthropic.rs
SystemBlock {
    text: system_prompt,
    cache_control: Some(CacheControl::Ephemeral),
}
```

Prompt caching is automatic and requires no configuration. It only applies to the Anthropic provider.

Credential Sources

Anthropic keys can come from multiple sources (checked in order):

  1. ANTHROPIC_API_KEY environment variable
  2. Claude CLI credentials at ~/.claude/.credentials.json
  3. macOS Keychain entry Claude Code-credentials
  4. Saved credentials at ~/.openkoi/credentials/anthropic.key

OpenAI

| Property | Value |
| --- | --- |
| Provider ID | openai |
| API key env var | OPENAI_API_KEY |
| Default model | gpt-5.2 |
| API | OpenAI Chat Completions API |
| Streaming | Yes (SSE) |
| Embeddings | Yes (text-embedding-3-small) |

OpenAI is the second-priority API key provider. It also serves as the default embedder -- openai/text-embedding-3-small is used for vector embeddings unless overridden.

Credential Sources

  1. OPENAI_API_KEY environment variable
  2. OpenAI Codex CLI credentials
  3. Saved credentials at ~/.openkoi/credentials/openai.key

Google

| Property | Value |
| --- | --- |
| Provider ID | google |
| API key env var | GOOGLE_API_KEY |
| Default model | gemini-2.5-pro |
| API | Google Generative AI API |
| Streaming | Yes |
| Embeddings | Yes |

Google is the third-priority provider in auto-detection.

Local Providers

Ollama

| Property | Value |
| --- | --- |
| Provider ID | ollama |
| Connection | Local probe at localhost:11434 |
| Default model | Best available (see priority below) |
| API | Ollama REST API (OpenAI-compatible) |
| Streaming | Yes |
| Embeddings | Yes (model-dependent) |

Ollama is the zero-cost, zero-key path. OpenKoi probes localhost:11434 on startup. If Ollama is running, it queries the available models and selects the best one.

Model Priority

When multiple Ollama models are available, OpenKoi picks the best one by quality:

| Priority | Model | Notes |
| --- | --- | --- |
| 1 | qwen2.5-coder | Best coding model available locally |
| 2 | codestral | Strong coding model from Mistral |
| 3 | deepseek-coder-v2 | Good code + general capability |
| 4 | llama3.3 | Strong general-purpose model |
| 5 | llama3.1 | Older but capable |
| 6 | mistral | Lightweight general-purpose |
| 7 | gemma2 | Google's open model |
| fallback | First available model | If none of the above are found |

If no models are installed, OpenKoi suggests running ollama pull llama3.3.
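
The priority-based selection can be sketched as follows. This is a hypothetical illustration mirroring the table above, not OpenKoi's actual internals; the tag-stripping behavior is an assumption based on Ollama's model-name format.

```rust
// Priority order from the table above (highest first).
const PRIORITY: &[&str] = &[
    "qwen2.5-coder",
    "codestral",
    "deepseek-coder-v2",
    "llama3.3",
    "llama3.1",
    "mistral",
    "gemma2",
];

fn pick_ollama_model(installed: &[String]) -> Option<&str> {
    // Installed names may carry a tag suffix (e.g. "llama3.3:70b"), so
    // compare against the base name before the colon.
    for wanted in PRIORITY {
        if let Some(m) = installed
            .iter()
            .find(|m| m.split(':').next() == Some(*wanted))
        {
            return Some(m.as_str());
        }
    }
    // Otherwise fall back to the first available model, per the table.
    installed.first().map(|m| m.as_str())
}
```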

AWS Bedrock

| Property | Value |
| --- | --- |
| Provider ID | bedrock |
| Auth | AWS credentials (IAM / SigV4) |
| API | AWS Bedrock Runtime API |
| Streaming | Yes |
| Embeddings | Yes (model-dependent) |

Required Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| AWS_ACCESS_KEY_ID | Yes | AWS IAM access key |
| AWS_SECRET_ACCESS_KEY | Yes | AWS IAM secret key |
| AWS_SESSION_TOKEN | No | Temporary session token (for assumed roles) |
| AWS_REGION | No | AWS region (defaults to us-east-1) |

Bedrock uses SigV4 signing for all API requests. OpenKoi handles the signing internally using the standard AWS credential chain.

Available Models

| Model | Bedrock Model ID |
| --- | --- |
| Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514 |
| Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022 |
| Amazon Nova Pro | amazon.nova-pro-v1:0 |
| Llama 3.3 70B | meta.llama3-3-70b-instruct-v1:0 |

OpenAI-Compatible Providers

Any provider that implements the OpenAI Chat Completions API can be used. These are configured via environment variables or the config file.

| Provider | API Key Env Var | Default Model | Base URL |
| --- | --- | --- | --- |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile | https://api.groq.com/openai/v1 |
| OpenRouter | OPENROUTER_API_KEY | auto | https://openrouter.ai/api/v1 |
| Together | TOGETHER_API_KEY | meta-llama/Llama-3.3-70B-Instruct-Turbo | https://api.together.xyz/v1 |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-chat | https://api.deepseek.com/v1 |
| MiniMax | MINIMAX_API_KEY | MiniMax-M2.5 | https://api.minimax.io/v1 |
| Moonshot / Kimi | MOONSHOT_API_KEY | kimi-k2.5 | https://api.moonshot.cn/v1 |
| xAI | XAI_API_KEY | grok-4-0709 | https://api.x.ai/v1 |
| Custom | User-defined | User-defined | User-defined |

Custom Endpoint Configuration

For a self-hosted or unlisted OpenAI-compatible endpoint, use the provider picker during openkoi init and select "Other (OpenAI-compatible URL)", or set it up directly in the config file.


Credential Discovery

On startup, OpenKoi scans for credentials in the following order. The first match wins.

| Priority | Source | Example |
| --- | --- | --- |
| 1 | Environment variables | ANTHROPIC_API_KEY, OPENAI_API_KEY, etc. |
| 2 | OAuth store | GitHub Copilot, ChatGPT tokens from openkoi connect (~/.openkoi/auth.json) |
| 3 | Claude CLI credentials | ~/.claude/.credentials.json |
| 4 | Claude CLI Keychain (macOS) | macOS Keychain entry Claude Code-credentials |
| 5 | OpenAI Codex CLI | Codex CLI auth credentials |
| 6 | Qwen CLI | ~/.qwen/oauth_creds.json |
| 7 | Saved OpenKoi credentials | ~/.openkoi/credentials/&lt;provider&gt;.key |
| 8 | Ollama probe | TCP connection to localhost:11434 |

This means if you already have Claude Code or Codex CLI installed and authenticated, OpenKoi will automatically use those credentials with zero setup. And if you have a GitHub Copilot or ChatGPT subscription, openkoi connect is all you need.
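
The first-match-wins scan can be sketched as an ordered walk over probe functions. This is a hypothetical illustration; the `Source` type and probe functions here stand in for the real sources and are not OpenKoi's actual code.

```rust
// Hypothetical credential source: a name plus a probe that returns a
// credential if this source has one.
struct Source {
    name: &'static str,
    probe: fn() -> Option<String>,
}

fn discover(sources: &[Source]) -> Option<(&'static str, String)> {
    // Walk the sources in priority order; the first one that yields a
    // credential wins and the scan stops.
    sources
        .iter()
        .find_map(|s| (s.probe)().map(|cred| (s.name, cred)))
}
```

Because `find_map` is lazy, lower-priority probes (like the Ollama TCP check) never run once a higher-priority source matches.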


Default Model Priority

When no model is specified (no --model flag, no OPENKOI_MODEL, no config file), OpenKoi picks the best available model:

| Priority | Provider | Model | Why |
| --- | --- | --- | --- |
| 1 | anthropic | claude-sonnet-4-5 | Best overall quality for coding and reasoning |
| 2 | openai | gpt-5.2 | Strong general-purpose model |
| 3 | google | gemini-2.5-pro | Competitive with large context window |
| 4 | ollama | Best local model | Free, no API key needed |

If no provider is found at all, OpenKoi launches an interactive provider picker that helps you set up Ollama (free) or paste an API key.


Role-Based Model Assignment

OpenKoi assigns models to four distinct roles in the iteration engine:

| Role | What it does | Recommended characteristics |
| --- | --- | --- |
| Executor | Performs the task (writes code, generates text, analyzes data) | Fast, high-quality generation |
| Evaluator | Judges the executor's output against rubrics | Precise, critical reasoning |
| Planner | Creates the initial plan and refines it between iterations | Good at decomposition and strategy |
| Embedder | Generates vector embeddings for semantic memory search | Fast, inexpensive |

Default Behavior

If you specify a single model (via --model or OPENKOI_MODEL), it is used for executor, evaluator, and planner. The embedder always defaults to openai/text-embedding-3-small unless explicitly overridden.

```bash
# All three roles use claude-sonnet-4-5
openkoi --model anthropic/claude-sonnet-4-5 "Fix the bug"
```

Per-Role Assignment

For more control, assign different models to different roles:

```bash
# CLI flags
openkoi --executor anthropic/claude-sonnet-4-5 --evaluator anthropic/claude-opus-4-6 "Fix the bug"
```

```toml
# config.toml
[models]
executor  = "anthropic/claude-sonnet-4-5"
evaluator = "anthropic/claude-opus-4-6"
planner   = "anthropic/claude-sonnet-4-5"
embedder  = "openai/text-embedding-3-small"
```

A common pattern is to use a fast, cheaper model for execution and a more capable model for evaluation:

```toml
[models]
executor  = "openai/gpt-5.2"
evaluator = "anthropic/claude-opus-4-6"
```

Fallback Chain

When a provider returns a transient error (rate limit, server error, timeout), OpenKoi automatically falls back to the next model in the chain.

How It Works

  1. The primary model is tried first.
  2. On transient failure, the model enters a cooldown period and is temporarily skipped.
  3. The next model in the fallback chain is tried.
  4. If all models in the chain fail, the task returns an AllCandidatesExhausted error.
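
The cooldown-and-skip logic above can be sketched as follows. This is a minimal illustration assuming a single fixed cooldown window; OpenKoi's real engine tracks this internally, and the names here are hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical cooldown tracker: a model hit by a transient error is
// skipped until its cooldown expires.
struct Cooldowns {
    until: HashMap<String, Instant>,
    window: Duration,
}

impl Cooldowns {
    fn is_available(&self, model: &str) -> bool {
        self.until.get(model).map_or(true, |t| Instant::now() >= *t)
    }

    fn penalize(&mut self, model: &str) {
        // Transient failure: skip this model for one cooldown window.
        self.until.insert(model.to_string(), Instant::now() + self.window);
    }
}

// Pick the first model in the fallback chain that is not cooling down.
fn next_candidate<'a>(chain: &'a [String], cd: &Cooldowns) -> Option<&'a str> {
    chain.iter().map(|m| m.as_str()).find(|m| cd.is_available(m))
}
```

If `next_candidate` returns `None`, every model in the chain is cooling down, which corresponds to the AllCandidatesExhausted error.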

Configuration

```toml
[models.fallback]
executor = [
  "anthropic/claude-sonnet-4-5",
  "openai/gpt-5.2",
  "ollama/llama3.3",
]
```

What Triggers a Fallback

| Error Type | Fallback? | Notes |
| --- | --- | --- |
| Rate limit (429) | Yes | Model enters cooldown |
| Server error (5xx) | Yes | Model enters cooldown |
| Timeout | Yes | Model enters cooldown |
| Authentication error (401/403) | No | Permanent error, not retriable |
| Invalid request (400) | No | Permanent error, not retriable |
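
The transient-vs-permanent distinction reduces to a small classifier. This sketch mirrors the table above; the function name and signature are hypothetical.

```rust
// Hypothetical classifier: transient HTTP failures and timeouts trigger a
// fallback; auth and request errors are permanent and do not.
fn should_fall_back(status: Option<u16>, timed_out: bool) -> bool {
    if timed_out {
        return true;
    }
    match status {
        Some(429) => true,                           // rate limited
        Some(s) if (500..600).contains(&s) => true,  // server error
        _ => false,                                  // 400/401/403 etc. are permanent
    }
}
```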

Model Reference Format

Throughout OpenKoi -- CLI flags, config files, REPL commands -- models are referenced using the provider/model-name format:

anthropic/claude-sonnet-4-5
openai/gpt-5.2
google/gemini-2.5-pro
ollama/llama3.3
ollama/codestral
bedrock/anthropic.claude-sonnet-4-20250514
groq/llama-3.3-70b-versatile
openrouter/auto
together/meta-llama/Llama-3.3-70B-Instruct-Turbo
deepseek/deepseek-chat
minimax/MiniMax-M2.5
xai/grok-4-0709

The provider prefix is required to disambiguate models that may share names across providers.
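
Parsing this format only requires splitting on the first slash, since model names themselves may contain slashes (as in the Together example above). This is a hypothetical sketch, not OpenKoi's actual parser.

```rust
// Split a provider/model-name reference on the FIRST '/' only, so slashes
// inside the model name are preserved.
fn parse_model_ref(s: &str) -> Option<(&str, &str)> {
    match s.split_once('/') {
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => {
            Some((provider, model))
        }
        _ => None,
    }
}
```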


Connect and Disconnect

Use openkoi connect to authenticate with subscription-based providers, and openkoi disconnect to remove stored credentials. Both commands support interactive selection when the provider argument is omitted.

Connecting

```bash
# Interactive picker (shows all providers and integrations with descriptions)
openkoi connect

# Or specify directly
openkoi connect copilot     # GitHub Copilot
openkoi connect chatgpt     # ChatGPT Plus/Pro

# Check connection status for all providers
openkoi connect status
```

When run without an argument, connect shows an interactive picker listing all 12 supported targets with descriptive hints:

$ openkoi connect

? Select a provider or integration to connect:
> GitHub Copilot — Device-code OAuth (use your existing subscription)
  ChatGPT Plus/Pro — Device-code OAuth (use your existing subscription)
  Anthropic — API key
  OpenAI — API key
  ...

Each connect command initiates a device-code flow: OpenKoi displays a URL and a short code, opens your browser, and waits for you to authorize. Once complete, tokens are saved to ~/.openkoi/auth.json and refreshed automatically.

Disconnecting

```bash
# Interactive picker (shows only currently connected providers)
openkoi disconnect

# Or specify directly
openkoi disconnect copilot      # Remove GitHub Copilot OAuth token
openkoi disconnect chatgpt      # Remove ChatGPT OAuth token
openkoi disconnect anthropic    # Remove saved API key

# Remove all OAuth tokens
openkoi disconnect all
```

When run without an argument, disconnect dynamically scans your currently connected providers and integrations and shows only those in the picker, plus an "All" option:

$ openkoi disconnect

? Select a provider to disconnect:
> GitHub Copilot (connected)
  Anthropic (API key from env)
  Slack (connected)
  All — remove all stored credentials

Disconnecting deletes the stored token or key. You can reconnect at any time by running openkoi connect again.

Connection Status

```bash
$ openkoi connect status

  GitHub Copilot:  connected (token valid, expires in 47m)
  ChatGPT:         connected (token valid, expires in 2h 13m)
  Anthropic:       connected (API key from env)
  Ollama:          connected (localhost:11434, 3 models)
```

Adding a New Provider

OpenKoi's provider layer is designed for extensibility. There are three paths:

1. OpenAI-Compatible (Easiest)

If the provider implements the OpenAI Chat Completions API, no code changes are needed. Set the API key and base URL:

```bash
export MY_PROVIDER_API_KEY=sk-...
```

Then reference it via config or CLI with an OpenAI-compatible prefix.

2. WASM Plugin

For providers with non-standard APIs, implement the provider interface as a WASM plugin:

```toml
[plugins]
wasm = ["~/.openkoi/plugins/wasm/my-provider.wasm"]
```

WASM plugins run sandboxed and must declare network capabilities in their manifest.

3. Native (Rust)

For first-class support, implement the ModelProvider trait in Rust and submit a contribution. See src/provider/ for existing implementations.


Token Usage Tracking

Every API call returns a TokenUsage struct that feeds into cost tracking:

```rust
pub struct TokenUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub cache_read_tokens: u32,   // Anthropic prompt caching
    pub cache_write_tokens: u32,  // Anthropic prompt caching
}
```

The cache_read_tokens and cache_write_tokens fields are specific to Anthropic's prompt caching. For other providers, these are always zero. Cost is calculated per-model using the provider's published pricing.
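
The per-model cost calculation can be sketched as follows. The `Pricing` struct and the per-million-token rates in the test are placeholders for illustration, not any provider's published pricing.

```rust
// Mirrors the TokenUsage struct above.
struct TokenUsage {
    input_tokens: u32,
    output_tokens: u32,
    cache_read_tokens: u32,
    cache_write_tokens: u32,
}

// Hypothetical per-model price card, in USD per million tokens.
struct Pricing {
    input_per_m: f64,
    output_per_m: f64,
    cache_read_per_m: f64,
    cache_write_per_m: f64,
}

fn cost_usd(u: &TokenUsage, p: &Pricing) -> f64 {
    // Each token class is billed at its own rate; for non-Anthropic
    // providers the cache fields are zero and contribute nothing.
    (u.input_tokens as f64 * p.input_per_m
        + u.output_tokens as f64 * p.output_per_m
        + u.cache_read_tokens as f64 * p.cache_read_per_m
        + u.cache_write_tokens as f64 * p.cache_write_per_m)
        / 1_000_000.0
}
```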

View your token usage and costs with:

```bash
openkoi status --costs
```

Released under the MIT License.