Providers are the bridge between Goose and AI models. They abstract different LLM APIs behind a common interface, enabling Goose to work with 25+ AI services, including Anthropic, OpenAI, and local model runtimes.

What is a Provider?

A provider in Goose is a component that:
  • Connects to an AI model service (cloud or local)
  • Translates Goose’s conversation format to the model’s API format
  • Handles authentication and API keys
  • Streams responses back to the agent
  • Manages tool calling protocols
  • Tracks token usage and costs
// Core provider trait (simplified)
#[async_trait]
pub trait Provider: Send + Sync {
    /// Send a completion request to the model
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>>;
    
    /// List available models
    fn list_models(&self) -> Vec<ModelInfo>;
    
    /// Generate a session name from conversation
    async fn generate_session_name(
        &self,
        messages: &[Message],
    ) -> Result<String>;
}

Built-in Providers

Goose includes native support for many popular AI providers:

Cloud Providers

Anthropic

Claude models (Sonnet, Opus, Haiku)
  • Native tool calling
  • Prompt caching
  • Extended context windows

OpenAI

GPT models (GPT-4o, GPT-4, GPT-3.5)
  • Function calling
  • Vision support
  • Structured outputs

Google

Gemini models via Vertex AI
  • Multi-modal support
  • Large context windows
  • OAuth authentication

AWS Bedrock

Multiple model families
  • Anthropic Claude
  • Meta Llama
  • AWS credentials

Local & Open Source

Ollama

Local model execution
  • Privacy-first
  • No API keys required
  • Qwen, Llama, Mistral, etc.

LiteLLM

Unified gateway to 100+ models
  • Consistent API across providers
  • Load balancing
  • Fallback handling

OpenAI Compatible

Custom OpenAI-compatible servers
  • vLLM, LocalAI, etc.
  • Self-hosted models
  • Custom endpoints

Local Inference

Direct local model execution
  • llama.cpp integration
  • GGUF model support
  • CPU/GPU acceleration

Enterprise Providers

  • Azure OpenAI: Enterprise OpenAI deployment
  • Databricks: Databricks model serving
  • Snowflake Cortex: Snowflake’s AI models
  • GitHub Copilot: GitHub’s code models
  • OpenRouter: Multi-provider routing
  • Venice.ai: Privacy-focused inference

Provider Architecture

Configuration

Via Environment Variables

The simplest way to configure a provider:
# Anthropic
export GOOSE_PROVIDER=anthropic
export GOOSE_MODEL=claude-sonnet-4-20250514
export ANTHROPIC_API_KEY=sk-ant-...

# OpenAI
export GOOSE_PROVIDER=openai
export GOOSE_MODEL=gpt-4o
export OPENAI_API_KEY=sk-...

# Ollama (local)
export GOOSE_PROVIDER=ollama
export GOOSE_MODEL=qwen3-coder:latest
export OLLAMA_HOST=http://localhost:11434

Via Configuration File

# ~/.config/goose/config.yaml
GOOSE_PROVIDER: anthropic
GOOSE_MODEL: claude-sonnet-4-20250514
API keys are stored separately, in the system keyring or a secrets file:
# ~/.config/goose/secrets.yaml (if GOOSE_DISABLE_KEYRING=1)
ANTHROPIC_API_KEY: sk-ant-...

Via Recipe

# recipe.yaml
title: GPT-4o Code Review
settings:
  goose_provider: openai
  goose_model: gpt-4o
  temperature: 0.2

Provider Implementation

Example: Anthropic Provider

// From crates/goose/src/providers/anthropic.rs
pub struct AnthropicProvider {
    api_key: String,
    base_url: String,
    http_client: reqwest::Client,
}

#[async_trait]
impl Provider for AnthropicProvider {
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>> {
        // Convert to Anthropic API format
        let request = self.build_request(
            system,
            messages,
            tools,
        )?;
        
        // Make streaming API call
        let response = self.http_client
            .post(&format!("{}/v1/messages", self.base_url))
            .header("x-api-key", &self.api_key)
            .header("anthropic-version", "2023-06-01")
            .json(&request)
            .send()
            .await?;
        
        // Stream and parse chunks
        let stream = response
            .bytes_stream()
            .map(|chunk| self.parse_chunk(chunk))
            .boxed();
        
        Ok(stream)
    }
    
    fn list_models(&self) -> Vec<ModelInfo> {
        vec![
            ModelInfo::with_cost(
                "claude-sonnet-4-20250514",
                200_000,  // context window
                0.000003, // input cost per token
                0.000015, // output cost per token
            ),
            // ... other models
        ]
    }
}

Tool Calling Translation

Different providers have different tool calling formats. Goose translates between them:
// Anthropic format
{
  "tools": [{
    "name": "read_file",
    "description": "Read a file",
    "input_schema": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      }
    }
  }]
}

// OpenAI format
{
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Read a file",
      "parameters": {
        "type": "object",
        "properties": {
          "path": {"type": "string"}
        }
      }
    }
  }]
}
Goose providers handle these translations automatically.
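The translation above is mostly a matter of renaming keys and adding the OpenAI wrapper object. A minimal sketch, using illustrative struct shapes rather than Goose's actual types (the schema payload is kept as a raw string for simplicity):

```rust
// Hypothetical simplified tool shapes, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct AnthropicTool {
    name: String,
    description: String,
    input_schema: String, // JSON schema, kept as a raw string here
}

#[derive(Debug, Clone, PartialEq)]
struct OpenAiFunction {
    name: String,
    description: String,
    parameters: String, // the same schema under a different key
}

fn anthropic_to_openai(t: &AnthropicTool) -> OpenAiFunction {
    // The schema payload is identical; only the key name differs
    // ("input_schema" vs "parameters"), plus OpenAI's extra
    // {"type": "function", "function": {...}} wrapper at the JSON level.
    OpenAiFunction {
        name: t.name.clone(),
        description: t.description.clone(),
        parameters: t.input_schema.clone(),
    }
}

fn main() {
    let tool = AnthropicTool {
        name: "read_file".into(),
        description: "Read a file".into(),
        input_schema: r#"{"type":"object","properties":{"path":{"type":"string"}}}"#.into(),
    };
    let f = anthropic_to_openai(&tool);
    println!("{} -> parameters: {}", f.name, f.parameters);
}
```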

Model Capabilities

Providers expose model capabilities through ModelInfo:
pub struct ModelInfo {
    pub name: String,
    pub context_limit: usize,              // Max tokens
    pub input_token_cost: Option<f64>,     // Cost per input token
    pub output_token_cost: Option<f64>,    // Cost per output token
    pub supports_cache_control: Option<bool>,  // Prompt caching
}
Goose uses this information to:
  • Manage context windows
  • Estimate costs
  • Enable/disable features (like prompt caching)
  • Choose appropriate models for subagents
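Because the cost fields are Option<f64>, a cost estimate is only meaningful when both rates are known. A sketch of how such an estimate could be computed (ModelCost is an illustrative stand-in for the cost fields of ModelInfo):

```rust
// Hypothetical mirror of ModelInfo's cost fields, for illustration.
struct ModelCost {
    input_token_cost: Option<f64>,
    output_token_cost: Option<f64>,
}

/// Returns None when either rate is unknown, rather than guessing.
fn estimate_cost(m: &ModelCost, input_tokens: usize, output_tokens: usize) -> Option<f64> {
    Some(input_tokens as f64 * m.input_token_cost?
        + output_tokens as f64 * m.output_token_cost?)
}

fn main() {
    // Rates from the Anthropic example above: $3/M input, $15/M output.
    let sonnet = ModelCost {
        input_token_cost: Some(0.000003),
        output_token_cost: Some(0.000015),
    };
    // 10k input + 2k output tokens: 0.03 + 0.03 = 0.06
    let cost = estimate_cost(&sonnet, 10_000, 2_000).unwrap();
    println!("${:.3}", cost);
}
```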

Custom Providers

You can add custom providers without modifying Goose’s code:

1. Declarative Provider (JSON)

For OpenAI-compatible APIs:
// ~/.config/goose/custom_providers/my-provider.json
{
  "name": "my_provider",
  "engine": "openai",
  "display_name": "My Custom LLM",
  "description": "Internal LLM endpoint",
  "api_key_env": "MY_PROVIDER_API_KEY",
  "base_url": "https://llm.company.internal/v1",
  "models": [
    {
      "name": "company-llm-v1",
      "context_limit": 32768,
      "input_token_cost": 0.000001,
      "output_token_cost": 0.000002
    }
  ],
  "supports_streaming": true,
  "requires_auth": true
}
Supported engines:
  • openai: OpenAI-compatible API
  • anthropic: Anthropic-compatible API
  • ollama: Ollama-compatible API

2. Code-Based Provider (Rust)

For custom protocols:
// 1. Create a new file: crates/goose/src/providers/my_provider.rs

use super::base::{Provider, ModelInfo, ProviderMessage};
use async_trait::async_trait;
use anyhow::Result;

pub struct MyProvider {
    api_key: String,
    endpoint: String,
}

impl MyProvider {
    pub fn new(api_key: String, endpoint: String) -> Self {
        Self { api_key, endpoint }
    }
}

#[async_trait]
impl Provider for MyProvider {
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>> {
        // Your implementation here
        todo!()
    }
    
    fn list_models(&self) -> Vec<ModelInfo> {
        vec![ModelInfo::new("my-model-v1", 8192)]
    }
}

// 2. Register in crates/goose/src/providers/init.rs
pub fn create_provider(config: &Config) -> Result<Box<dyn Provider>> {
    match config.provider.as_str() {
        "anthropic" => Ok(Box::new(AnthropicProvider::new(config)?)),
        "openai" => Ok(Box::new(OpenAIProvider::new(config)?)),
        "my_provider" => Ok(Box::new(MyProvider::new(
            config.get_secret("MY_PROVIDER_API_KEY")?,
            config.get("MY_PROVIDER_ENDPOINT")?,
        ))),
        // ... other providers
        other => Err(anyhow::anyhow!("unknown provider: {other}")),
    }
}

Provider Selection

Goose determines which provider to use via configuration precedence:
  1. Subagent settings (highest priority)
    subagent(settings: {provider: "openai", model: "gpt-4o-mini"})
    
  2. Recipe settings
    settings:
      goose_provider: anthropic
      goose_model: claude-sonnet-4-20250514
    
  3. Environment variables
    GOOSE_PROVIDER=ollama
    GOOSE_MODEL=qwen3-coder:latest
    
  4. Config file
    GOOSE_PROVIDER: anthropic
    
  5. Default (Anthropic Claude)
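The precedence chain above maps naturally onto Option chaining. A sketch with each layer as an Option (names are illustrative, not Goose's actual API):

```rust
// Sketch of configuration precedence: first layer with a value wins.
// Each parameter stands in for one of the five sources listed above.
fn resolve_provider(
    subagent: Option<&str>,
    recipe: Option<&str>,
    env: Option<&str>,
    config_file: Option<&str>,
) -> String {
    subagent
        .or(recipe)
        .or(env)
        .or(config_file)
        .unwrap_or("anthropic") // default: Anthropic Claude
        .to_string()
}

fn main() {
    // Subagent settings beat everything else.
    assert_eq!(
        resolve_provider(Some("openai"), Some("anthropic"), Some("ollama"), None),
        "openai"
    );
    // With nothing set anywhere, fall back to the default.
    assert_eq!(resolve_provider(None, None, None, None), "anthropic");
    println!("ok");
}
```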

Streaming

All providers support streaming responses:
pub enum ProviderMessage {
    Text(String),              // Text chunk
    ToolUse(ToolRequest),      // Tool call request
    Thinking(String),          // Model reasoning (if supported)
    Usage(TokenUsage),         // Token counts
    Done,                      // Stream complete
}

// Agent consumes stream:
let mut stream = provider.complete(system, messages, tools).await?;
while let Some(msg) = stream.next().await {
    match msg {
        ProviderMessage::Text(text) => {
            // Stream to user immediately
            send_to_user(text).await?;
        }
        ProviderMessage::ToolUse(tool) => {
            // Execute tool
            execute_tool(tool).await?;
        }
        ProviderMessage::Done => break,
        _ => {} // Thinking, Usage, etc.
    }
}

Token Usage Tracking

Providers report token usage for cost estimation:
pub struct TokenUsage {
    pub input_tokens: usize,
    pub output_tokens: usize,
    pub cache_read_tokens: Option<usize>,   // For prompt caching
    pub cache_write_tokens: Option<usize>,
}

// Stored in session
session.accumulated_input_tokens += usage.input_tokens;
session.accumulated_output_tokens += usage.output_tokens;

// Calculate cost (per-token rates are Option<f64>; treat missing as zero)
let cost = (usage.input_tokens as f64 * model_info.input_token_cost.unwrap_or(0.0))
         + (usage.output_tokens as f64 * model_info.output_token_cost.unwrap_or(0.0));

Error Handling

Providers return standardized errors:
pub enum ProviderError {
    AuthenticationError(String),   // Invalid API key
    RateLimitError(String),        // Rate limit hit
    InvalidRequestError(String),   // Bad request
    ModelNotFoundError(String),    // Unknown model
    NetworkError(String),          // Connection issues
    StreamError(String),           // Streaming failure
    // ...
}
The agent’s retry manager handles transient errors automatically:
loop {
    match provider.complete(...).await {
        Ok(stream) => return Ok(stream),
        Err(ProviderError::RateLimitError(_)) => {
            // Wait and retry
            sleep(backoff.next_delay()).await;
        }
        Err(ProviderError::NetworkError(_)) => {
            // Retry with backoff
            sleep(backoff.next_delay()).await;
        }
        Err(e) => return Err(e), // Don't retry auth errors, etc.
    }
}

Multi-Provider Workflows

You can use different providers for different tasks:
title: Multi-Provider Analysis
instructions: |
  Use different models for different tasks:
  - GPT-4o-mini for simple file operations
  - Claude Sonnet for complex analysis
  - Local Ollama for privacy-sensitive data

settings:
  goose_provider: anthropic
  goose_model: claude-sonnet-4-20250514

prompt: |
  # Use cheap model for file listing
  subagent(
    instructions: "List all Python files",
    settings: {provider: "openai", model: "gpt-4o-mini"}
  )
  
  # Use powerful model for analysis
  subagent(
    instructions: "Analyze the architecture for security issues",
    settings: {provider: "anthropic", model: "claude-sonnet-4-20250514"}
  )
  
  # Use local model for sensitive data
  subagent(
    instructions: "Process customer data locally",
    settings: {provider: "ollama", model: "qwen3-coder:latest"}
  )

Provider Comparison

Provider         Tool Calling      Streaming  Vision             Local  Cost
Anthropic        Native            Yes        Yes (Claude 3.5+)  No     $$$
OpenAI           Function calling  Yes        Yes (GPT-4V)       No     $$$
Ollama           Via toolshim      Yes        Some models        Yes    Free
Google Gemini    Native            Yes        Yes                No     $$
AWS Bedrock      Model-dependent   Yes        Model-dependent    No     $$
LiteLLM          Pass-through      Yes        Model-dependent    No     Varies
Local Inference  Via toolshim      Yes        No                 Yes    Free

Tool Calling Approaches

Native: Provider API has built-in tool calling support
  • Anthropic, OpenAI, Google Gemini
  • Best accuracy and performance
Toolshim: Goose adds tool calling via system prompts
  • Ollama, local models
  • Works but less reliable
  • Good for experimentation
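At its core, a toolshim renders the tool list into the system prompt and asks the model to answer with a structured line that Goose can parse back into a tool call. A minimal sketch of the prompt-building half (the format shown is illustrative, not Goose's actual shim protocol):

```rust
// Minimal toolshim sketch: describe tools in the system prompt and ask
// the model to emit a single JSON line when it wants to call one.
struct ToolSpec {
    name: &'static str,
    description: &'static str,
}

fn toolshim_prompt(tools: &[ToolSpec]) -> String {
    let mut p = String::from(
        "You can call tools. To call one, reply with exactly one JSON line:\n\
         {\"tool\": \"<name>\", \"arguments\": { ... }}\n\nAvailable tools:\n",
    );
    for t in tools {
        p.push_str(&format!("- {}: {}\n", t.name, t.description));
    }
    p
}

fn main() {
    let tools = [ToolSpec { name: "read_file", description: "Read a file" }];
    let prompt = toolshim_prompt(&tools);
    println!("{prompt}");
}
```

The unreliable part is the other half: parsing the model's free-form reply back into a tool call, which is why native tool calling is preferred when the provider supports it.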

Best Practices

  • Simple tasks: Use cheaper/faster models (GPT-4o-mini, Claude Haiku)
  • Complex reasoning: Use powerful models (Claude Sonnet, GPT-4o)
  • Code generation: Use code-specialized models (Qwen Coder, Claude)
  • Privacy-sensitive: Use local models (Ollama)
// Check model context before sending
let token_count = count_tokens(&messages);
let model_info = provider.list_models()
    .into_iter()
    .find(|m| m.name == model_name)?;

if token_count as f64 > model_info.context_limit as f64 * 0.75 {
    // Compact messages to stay within the context window
    messages = compact_messages(messages);
}
# Configure retry behavior
retry:
  max_retries: 3
  initial_delay_ms: 1000
  max_delay_ms: 30000
  backoff_multiplier: 2.0
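Under the assumed semantics of these settings (delay = initial_delay_ms × backoff_multiplier^attempt, capped at max_delay_ms), the retry schedule can be sketched as:

```rust
// Exponential backoff matching the retry settings above.
// Assumed semantics: delay = initial * multiplier^attempt, capped at max.
fn backoff_delays(max_retries: u32, initial_ms: u64, max_ms: u64, mult: f64) -> Vec<u64> {
    (0..max_retries)
        .map(|attempt| {
            let d = initial_ms as f64 * mult.powi(attempt as i32);
            (d as u64).min(max_ms)
        })
        .collect()
}

fn main() {
    // max_retries: 3, initial 1000ms, cap 30000ms, multiplier 2.0
    let delays = backoff_delays(3, 1000, 30_000, 2.0);
    assert_eq!(delays, vec![1000, 2000, 4000]);
    println!("{delays:?}");
}
```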
Anthropic and some other providers support caching system prompts:
// Automatically enabled for supported models
// Significantly reduces cost for repeated requests
if model_info.supports_cache_control {
    // System prompt is cached
    // Only pay for new user messages
}

Troubleshooting

Common Issues

“Authentication failed”
# Check API key is set
echo $ANTHROPIC_API_KEY

# Verify in config
goose configure get ANTHROPIC_API_KEY
“Model not found”
# List available models
goose configure list-models

# Check provider documentation for exact model names
“Rate limit exceeded”
  • Wait and retry (automatic)
  • Upgrade API tier
  • Use multiple API keys with load balancing (via LiteLLM)
“Context length exceeded”
# Reduce max_turns or enable aggressive compaction
settings:
  max_turns: 20  # Limit conversation length

Next Steps

Extensions

Learn about the tools providers can use

Recipes

Configure providers in recipes

Configuration

Advanced provider configuration

Custom Distributions

Bundle custom providers
