Overview

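The Snapshots API lets you manage archived URLs (snapshots) programmatically. Each snapshot represents a single archived URL together with its metadata and archiving results.

Base URL: /api/v1/core/snapshots (list endpoint). Individual snapshots are addressed through the singular path /api/v1/core/snapshot/{snapshot_id}.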

Snapshot Schema

A snapshot object contains:
{
  "TYPE": "core.models.Snapshot",
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "created_by_id": "1",
  "created_by_username": "admin",
  "created_at": "2024-01-15T10:30:00Z",
  "modified_at": "2024-01-15T10:35:00Z",
  "status": "succeeded",
  "retry_at": null,
  "bookmarked_at": "2024-01-15T10:30:00Z",
  "downloaded_at": "2024-01-15T10:31:00Z",
  "url": "https://example.com",
  "tags": ["important", "tutorial"],
  "title": "Example Domain",
  "timestamp": "2024-01-15T10:30:00",
  "archive_path": "archive/2024-01-15T10:30:00",
  "num_archiveresults": 12,
  "archiveresults": []
}

Field Descriptions

| Field | Type | Description |
| --- | --- | --- |
| id | UUID | Unique identifier for the snapshot |
| created_by_id | string | ID of the user who created the snapshot |
| created_by_username | string | Username of the user who created the snapshot |
| created_at | datetime | When the snapshot was created |
| modified_at | datetime | When the snapshot was last modified |
| status | string | Current status (see Status Values) |
| retry_at | datetime? | When to retry archiving (null if not scheduled) |
| bookmarked_at | datetime | When the URL was bookmarked |
| downloaded_at | datetime? | When archiving completed |
| url | string | The archived URL |
| tags | string[] | List of tag names |
| title | string? | Page title |
| timestamp | string | Snapshot timestamp identifier |
| archive_path | string | Filesystem path to archived content |
| num_archiveresults | int | Number of archiving results |
| archiveresults | array | List of archive results (populated if with_archiveresults=true) |
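
As an illustration of the schema above, here is a minimal sketch of a typed wrapper for one snapshot object. The `Snapshot` class and the `parse_dt` helper are hypothetical names chosen for this example and are not part of ArchiveBox; the field names and nullability follow the table, and unknown keys such as `TYPE` are ignored.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


def parse_dt(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string; null stays None, a trailing 'Z' means UTC."""
    if value is None:
        return None
    # datetime.fromisoformat() only accepts a literal 'Z' from Python 3.11 on,
    # so normalise it to an explicit offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Snapshot:  # hypothetical helper, not an ArchiveBox class
    id: str
    url: str
    status: str
    timestamp: str
    archive_path: str
    created_at: Optional[datetime]
    modified_at: Optional[datetime]
    bookmarked_at: Optional[datetime]
    retry_at: Optional[datetime] = None       # datetime? in the table
    downloaded_at: Optional[datetime] = None  # datetime? in the table
    title: Optional[str] = None               # string? in the table
    tags: List[str] = field(default_factory=list)
    num_archiveresults: int = 0
    archiveresults: List[dict] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict) -> "Snapshot":
        """Build a Snapshot from one decoded JSON object, ignoring unknown keys."""
        return cls(
            id=data["id"],
            url=data["url"],
            status=data["status"],
            timestamp=data["timestamp"],
            archive_path=data["archive_path"],
            created_at=parse_dt(data["created_at"]),
            modified_at=parse_dt(data["modified_at"]),
            bookmarked_at=parse_dt(data["bookmarked_at"]),
            retry_at=parse_dt(data.get("retry_at")),
            downloaded_at=parse_dt(data.get("downloaded_at")),
            title=data.get("title"),
            tags=list(data.get("tags", [])),
            num_archiveresults=data.get("num_archiveresults", 0),
            archiveresults=list(data.get("archiveresults", [])),
        )
```

Calling `Snapshot.from_dict(item)` on each entry of a list response's `items` array gives attribute access and real `datetime` values for the date fields.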

Status Values

  • queued - Waiting to be archived
  • started - Currently being archived
  • succeeded - Successfully archived
  • failed - Archiving failed
  • sealed - Cancelled/frozen (no further archiving)

List Snapshots

curl http://127.0.0.1:8000/api/v1/core/snapshots \
  -H "X-ArchiveBox-API-Key: your-token-here"

Query Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| limit | int | Items per page (default: 200, max: 500) |
| offset | int | Number of items to skip |
| page | int | Page number (alternative to offset) |
| with_archiveresults | bool | Include archiveresults array (default: false) |

Filter Parameters

| Parameter | Description |
| --- | --- |
| search | Search URL, title, tags, ID, or timestamp |
| id | Filter by ID or timestamp (prefix match) |
| url | Exact URL match |
| tag | Filter by tag name |
| title | Filter by title (case-insensitive) |
| timestamp | Filter by timestamp (prefix match) |
| created_by_id | Filter by creator user ID |
| created_by_username | Filter by creator username |
| created_at | Exact creation date |
| created_at__gte | Created on or after date |
| created_at__lt | Created before date |
| modified_at | Last modified date |
| modified_at__gte | Modified on or after date |
| modified_at__lt | Modified before date |
| bookmarked_at__gte | Bookmarked on or after date |
| bookmarked_at__lt | Bookmarked before date |
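
Query and filter parameters are combined in a single query string. The sketch below shows one way to assemble it so that values such as full URLs are percent-encoded correctly. The `snapshots_url` function is a hypothetical helper for this example, and the host matches the curl examples on this page; it only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000/api/v1"  # same host as the curl examples


def snapshots_url(base_url: str = BASE_URL, **filters) -> str:
    """Build a list-snapshots URL from keyword filters, skipping None values."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Booleans are written as true/false, as in with_archiveresults=true
            value = "true" if value else "false"
        params[key] = value
    query = urlencode(params)  # percent-encodes reserved characters
    url = f"{base_url}/core/snapshots"
    return f"{url}?{query}" if query else url


print(snapshots_url(tag="important", created_at__gte="2024-01-01", limit=10))
print(snapshots_url(url="https://example.com"))
```

Building the string through `urlencode` matters mostly for the `url` and `search` filters, whose values often contain `:`, `/`, `?`, or `&`.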

Example: Filter by Tag

curl "http://127.0.0.1:8000/api/v1/core/snapshots?tag=important" \
  -H "X-ArchiveBox-API-Key: your-token-here"

Example: Search with Pagination

curl "http://127.0.0.1:8000/api/v1/core/snapshots?search=example.com&limit=10&page=0" \
  -H "X-ArchiveBox-API-Key: your-token-here"

Response

{
  "total_items": 150,
  "total_pages": 15,
  "page": 0,
  "limit": 10,
  "offset": 0,
  "num_items": 10,
  "items": [
    { /* snapshot object */ },
    { /* snapshot object */ }
  ]
}
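
To walk through every page of a result set, a client can keep requesting pages until `total_pages` is reached. The following is a minimal sketch that assumes pages are zero-based, as the `"page": 0` value in the example response suggests. `iter_snapshots` and the `fetch_page` callable are hypothetical names: `fetch_page(n)` stands for whatever code performs the GET with `page=n` and returns the decoded JSON body.

```python
from typing import Callable, Dict, Iterator


def iter_snapshots(
    fetch_page: Callable[[int], Dict], max_pages: int = 1000
) -> Iterator[Dict]:
    """Yield every item across all pages of a list response."""
    page = 0  # assumption: the first page is page 0
    while page < max_pages:  # safety cap against an endless loop
        body = fetch_page(page)
        yield from body["items"]
        page += 1
        if page >= body["total_pages"]:
            break
```

Passing the fetch step in as a callable keeps the paging logic independent of the HTTP library and makes it easy to exercise with canned responses.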

Get Single Snapshot

Retrieve a specific snapshot by ID or timestamp.
curl http://127.0.0.1:8000/api/v1/core/snapshot/01234567 \
  -H "X-ArchiveBox-API-Key: your-token-here"

Path Parameters

| Parameter | Description |
| --- | --- |
| snapshot_id | Snapshot UUID (full or prefix) or timestamp |

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| with_archiveresults | bool | true | Include archiveresults array |

Response

Returns a single snapshot object (see Snapshot Schema above).
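
The single-snapshot path uses the singular `snapshot`, unlike the list endpoint. As a sketch of the same request in Python using only the standard library, the hypothetical `snapshot_request` helper below prepares the request without sending it; the host and header name are taken from the curl example above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def snapshot_request(
    snapshot_id: str,
    api_key: str,
    base_url: str = "http://127.0.0.1:8000/api/v1",
    with_archiveresults: bool = True,
) -> Request:
    """Prepare (but do not send) a GET request for a single snapshot."""
    # Note the singular "snapshot" in the path; the ID is percent-encoded
    # in case a timestamp-style identifier contains reserved characters.
    url = f"{base_url}/core/snapshot/{quote(snapshot_id, safe='')}"
    if not with_archiveresults:
        # true is the documented default, so only the opt-out is sent
        url += "?" + urlencode({"with_archiveresults": "false"})
    return Request(url, headers={"X-ArchiveBox-API-Key": api_key}, method="GET")
```

To execute it, pass the result to `urllib.request.urlopen` and decode the body with `json.load`.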

Update Snapshot

Update snapshot status or retry time.
curl -X PATCH http://127.0.0.1:8000/api/v1/core/snapshot/01234567 \
  -H "X-ArchiveBox-API-Key: your-token-here" \
  -H "Content-Type: application/json" \
  -d '{
    "status": "sealed"
  }'

Request Body

Both fields are optional; send only the ones you want to change.
{
  "status": "queued",
  "retry_at": "2024-01-20T10:00:00Z"
}
  • status - New status value (see Status Values)
  • retry_at - When to retry archiving

Use Cases

Cancel queued archiving:
{"status": "sealed"}
Setting status to sealed automatically sets retry_at to null.

Schedule a retry:
{"retry_at": "2024-01-20T10:00:00Z"}

Valid Status Transitions

You can update status to any of these values:
  • queued
  • started
  • succeeded
  • failed
  • sealed
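
A client can validate the request body before sending it, which avoids a round trip that would end in a 400 response. The sketch below is one way to do that under the rules described above. `build_update` and `VALID_STATUSES` are hypothetical names, and rejecting `sealed` together with `retry_at` is a choice made for this example (the server is documented to clear `retry_at` when sealing), not documented API behaviour.

```python
from datetime import datetime, timezone
from typing import Optional

VALID_STATUSES = {"queued", "started", "succeeded", "failed", "sealed"}


def build_update(
    status: Optional[str] = None, retry_at: Optional[datetime] = None
) -> dict:
    """Return a JSON-serialisable PATCH body, validating it client-side."""
    body = {}
    if status is not None:
        if status not in VALID_STATUSES:
            raise ValueError(f"Invalid status: {status}")
        body["status"] = status
    if retry_at is not None:
        if status == "sealed":
            # Example-only rule: sealing clears retry_at, so sending both
            # in one request is treated as a mistake.
            raise ValueError("retry_at cannot be combined with status 'sealed'")
        if retry_at.tzinfo is None:
            raise ValueError("retry_at must be timezone-aware")
        # Serialise as UTC with a trailing Z, matching the examples above
        utc = retry_at.astimezone(timezone.utc)
        body["retry_at"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if not body:
        raise ValueError("nothing to update")
    return body


print(build_update(status="sealed"))
print(build_update(retry_at=datetime(2024, 1, 20, 10, 0, tzinfo=timezone.utc)))
```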

Response

Returns the updated snapshot object.

Archive Results

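Each snapshot can have multiple archive results (PDF, screenshot, DOM, etc.). The single-snapshot endpoint includes them by default (with_archiveresults defaults to true there); the list endpoint omits them unless you pass with_archiveresults=true. To request them explicitly: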
curl "http://127.0.0.1:8000/api/v1/core/snapshot/01234567?with_archiveresults=true" \
  -H "X-ArchiveBox-API-Key: your-token-here"
See ArchiveResults for details on the archiveresults schema.

Common Workflows

Find Recently Failed Snapshots

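Note that status is not listed under Filter Parameters above, so confirm that your server accepts it as a filter before relying on this query.
curl "http://127.0.0.1:8000/api/v1/core/snapshots?status=failed&created_at__gte=2024-01-01" \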
  -H "X-ArchiveBox-API-Key: your-token-here"

Get All Snapshots for a Tag

curl "http://127.0.0.1:8000/api/v1/core/snapshots?tag=important" \
  -H "X-ArchiveBox-API-Key: your-token-here"

Cancel All Queued Work

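import requests

api_key = "your-token-here"
base_url = "http://127.0.0.1:8000/api/v1"
headers = {"X-ArchiveBox-API-Key": api_key}

# Get queued snapshots (limit is capped at 500; request further pages if
# total_items in the response is larger than that)
response = requests.get(
    f"{base_url}/core/snapshots",
    headers=headers,
    params={"status": "queued", "limit": 500},
    timeout=30,
)
response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body

# Seal each one
for snapshot in response.json()["items"]:
    # Re-check the status client-side so that nothing else gets sealed
    # if the server ignores the status filter
    if snapshot["status"] != "queued":
        continue
    result = requests.patch(
        f"{base_url}/core/snapshot/{snapshot['id']}",
        headers=headers,  # json= already sets Content-Type: application/json
        json={"status": "sealed"},
        timeout=30,
    )
    result.raise_for_status()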

Search by URL Pattern

curl "http://127.0.0.1:8000/api/v1/core/snapshots?search=github.com" \
  -H "X-ArchiveBox-API-Key: your-token-here"

Error Responses

404 Not Found

{
  "succeeded": false,
  "message": "ObjectDoesNotExist: Snapshot matching query does not exist."
}

400 Bad Request

{
  "succeeded": false,
  "message": "Invalid status: invalid-status"
}
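
Both error bodies share the same shape, so a client can turn them into exceptions in one place. Here is a minimal sketch, assuming every error response carries the `succeeded` and `message` keys shown above. `SnapshotAPIError` and `parse_response` are hypothetical names for this example.

```python
import json


class SnapshotAPIError(Exception):
    """Raised when the API reports a failure (hypothetical helper class)."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def parse_response(status_code: int, body: str):
    """Return the decoded JSON body, or raise SnapshotAPIError on failure."""
    try:
        data = json.loads(body)
    except ValueError:  # body was not JSON (json.JSONDecodeError is a ValueError)
        data = {}
    is_error_body = isinstance(data, dict) and data.get("succeeded") is False
    if status_code >= 400 or is_error_body:
        # Prefer the API's own message; fall back to the raw body text
        message = data.get("message", body) if isinstance(data, dict) else body
        raise SnapshotAPIError(status_code, message)
    return data
```

With `requests`, this would be called as `parse_response(response.status_code, response.text)`.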

ArchiveResults API

Access individual archiving outputs

Tags API

Manage snapshot tags

Crawls API

View snapshot’s parent crawl
