REST API Overview

The ArchiveBox REST API is currently in ALPHA status and may change in future releases.

Introduction

ArchiveBox provides a REST API for programmatic access to your web archiving server. The API is built with django-ninja and follows RESTful conventions.

API Base URL

The API is available at:

http://your-archivebox-server/api/v1/

For example, if running locally:

Local development: http://127.0.0.1:8000/api/v1/
Docker setup: http://web.archivebox.localhost:8000/api/v1/
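
As a minimal sketch of how a client might assemble endpoint URLs from the base URL above, the helper below joins a server root, the `/api/v1/` prefix, and an endpoint path. The function name `api_url` is hypothetical and not part of ArchiveBox.

```python
def api_url(server: str, path: str = "") -> str:
    """Build a full API URL from a server root and an endpoint path.

    `server` is the root of your ArchiveBox server (no /api/v1/ suffix);
    `path` is the endpoint path relative to /api/v1/.
    """
    # Normalize slashes so "http://host/" + "/core/snapshots" does not
    # produce a double slash.
    return f"{server.rstrip('/')}/api/v1/{path.lstrip('/')}"


if __name__ == "__main__":
    print(api_url("http://127.0.0.1:8000", "core/snapshots"))
    print(api_url("http://web.archivebox.localhost:8000", "docs"))
```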

Interactive API Documentation

ArchiveBox provides an interactive Swagger UI for exploring and testing the API:

http://your-archivebox-server/api/v1/docs

The Swagger UI allows you to:

Browse all available endpoints
View request/response schemas
Test API calls directly from your browser
Generate code examples

API Status

ALPHA Status: This API is still in early development and may change. Key considerations:

API endpoints may be added, modified, or removed
Request/response schemas may change
Breaking changes may occur between versions
Not all features are exposed via the API yet

Available Endpoints

The API is organized into several categories:

Core Models (`/api/v1/core/`)

Snapshots - Manage archived URLs and their metadata
ArchiveResults - Access individual archiving outputs (PDF, screenshot, etc.)
Tags - Organize snapshots with tags

Crawls (`/api/v1/crawls/`)

Crawls - Manage crawl sessions (groups of snapshots from a single import)

Authentication (`/api/v1/auth/`)

API Tokens - Generate and validate API tokens

CLI (`/api/v1/cli/`)

Command Execution - Run ArchiveBox CLI commands via API

Workers (`/api/v1/workers/`)

Background Workers - Monitor and manage background processing

Machine (`/api/v1/machine/`)

System Info - View server and system information

Response Format

All API responses return JSON with a consistent structure:

{
  "TYPE": "core.models.Snapshot",
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "url": "https://example.com",
  "timestamp": "2024-01-15T10:30:00Z",
  ...
}
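
A client might read such an object as sketched below. The sample string reuses only the fields shown above (the elided fields are left out), and the helper names `short_type` and `parse_timestamp` are hypothetical.

```python
import json
from datetime import datetime

# Sample body containing only the fields shown in the example above.
SAMPLE = """
{
  "TYPE": "core.models.Snapshot",
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "url": "https://example.com",
  "timestamp": "2024-01-15T10:30:00Z"
}
"""


def short_type(record: dict) -> str:
    """Return the last component of the dotted TYPE string."""
    return record["TYPE"].rsplit(".", 1)[-1]


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    # Replace 'Z' with an explicit offset so this also works on
    # Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


if __name__ == "__main__":
    record = json.loads(SAMPLE)
    print(short_type(record), record["url"], parse_timestamp(record["timestamp"]))
```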

Error Responses

Errors return JSON with error details:

{
  "succeeded": false,
  "message": "ObjectDoesNotExist: Snapshot matching query does not exist.",
  "errors": [
    "Traceback details..."
  ]
}

Common HTTP status codes:

200 - Success
400 - Bad request (invalid parameters)
403 - Forbidden (authentication failed or insufficient permissions)
404 - Resource not found
503 - Service unavailable (server-side error)
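
One way a client could turn these error bodies into exceptions is sketched below. It assumes error bodies carry the `succeeded` and `message` fields from the example above; the `ArchiveBoxAPIError` class and `check_response` function are hypothetical names, not part of ArchiveBox.

```python
import json


class ArchiveBoxAPIError(Exception):
    """Raised when the API reports a failure (hypothetical client-side class)."""

    def __init__(self, status: int, message: str, errors=None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.errors = errors or []


def check_response(status: int, body: str):
    """Decode a JSON body and raise if the status or payload signals an error."""
    payload = json.loads(body)
    # Only dict payloads can carry the "succeeded" flag shown above.
    failed = isinstance(payload, dict) and payload.get("succeeded") is False
    if status != 200 or failed:
        info = payload if isinstance(payload, dict) else {}
        raise ArchiveBoxAPIError(
            status,
            info.get("message", "unknown error"),
            info.get("errors"),
        )
    return payload
```

Calling `check_response(404, body)` with the error body shown above would raise `ArchiveBoxAPIError` with `status` set to 404.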

Authentication

API access requires authentication. See the Authentication page for details on:

API token generation
Authentication methods
Permission requirements

Pagination

List endpoints support pagination with the following query parameters:

limit - Number of items per page (default: 200, max: 500)
offset - Number of items to skip
page - Page number (alternative to offset); the sample response below uses page 0 for the first page

Paginated responses include:

{
  "total_items": 1000,
  "total_pages": 5,
  "page": 0,
  "limit": 200,
  "offset": 0,
  "num_items": 200,
  "items": [...]
}
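
A client can walk a paginated list by advancing `offset` until `total_items` is reached. The sketch below assumes the response fields shown above; `iter_items` and the `fetch_page` callable are hypothetical, and `fetch_page` stands in for whatever HTTP call your client makes.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict], limit: int = 200) -> Iterator[dict]:
    """Yield every item from a paginated endpoint.

    `fetch_page(limit, offset)` must return a decoded response with
    "items", "num_items", and "total_items" keys.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["items"]
        offset += page["num_items"]
        # Stop on an empty page or once every item has been seen.
        if page["num_items"] == 0 or offset >= page["total_items"]:
            break


def make_fake_fetch(records: list) -> Callable[[int, int], dict]:
    """Build an in-memory stand-in for the server, for demonstration only."""
    def fetch(limit: int, offset: int) -> dict:
        chunk = records[offset:offset + limit]
        return {
            "total_items": len(records),
            "limit": limit,
            "offset": offset,
            "num_items": len(chunk),
            "items": chunk,
        }
    return fetch


if __name__ == "__main__":
    fake = make_fake_fetch([{"n": i} for i in range(450)])
    print(sum(1 for _ in iter_items(fake, limit=200)))
```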

Filtering

Many list endpoints support filtering via query parameters. See individual endpoint documentation for available filters. Example:

GET /api/v1/core/snapshots?search=example.com&status=succeeded
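
The query string above can be built programmatically so that values are escaped correctly. This is a small sketch using the standard library; `snapshots_query` is a hypothetical helper, and the filter names are the two from the example (other filters depend on the endpoint).

```python
from urllib.parse import urlencode


def snapshots_query(**filters: str) -> str:
    """Return the snapshots list path with URL-encoded filter parameters."""
    path = "/api/v1/core/snapshots"
    if not filters:
        return path
    # urlencode escapes reserved characters such as spaces and ampersands.
    return f"{path}?{urlencode(filters)}"


if __name__ == "__main__":
    print(snapshots_query(search="example.com", status="succeeded"))
```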

CORS and Security

The API uses token-based authentication and does not set session cookies by default. This allows cross-origin API access when properly configured.

Security features:

All endpoints require superuser authentication
API responses include Cache-Control: no-store header
Debug headers include execution details (stdout/stderr)
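
Because authentication is token-based, a client typically attaches the token to each request itself instead of relying on cookies. The sketch below builds such a request with the standard library without sending it. The `Authorization: Bearer` header is an assumption made for illustration; the Authentication page is the authoritative source for the header or parameter your server expects. `build_request` is a hypothetical helper.

```python
from urllib.request import Request


def build_request(url: str, token: str,
                  header: str = "Authorization",
                  scheme: str = "Bearer") -> Request:
    """Prepare (but do not send) a GET request carrying an API token.

    The default header name and scheme are assumptions; adjust them to
    match the method described on the Authentication page.
    """
    value = f"{scheme} {token}" if scheme else token
    return Request(url, headers={header: value, "Accept": "application/json"})


if __name__ == "__main__":
    req = build_request("http://127.0.0.1:8000/api/v1/core/snapshots", "YOUR_TOKEN")
    print(req.full_url, req.get_header("Authorization"))
```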

Next Steps

Authentication - Set up API tokens and authentication
Snapshots API - Manage archived snapshots
Crawls API - Work with crawl sessions
Tags API - Organize with tags
