Contributing to ArchiveBox
All contributions to ArchiveBox are welcome! We appreciate your help in making this project better.
Getting Started
Confirm Your Idea Fits
- Check our Roadmap to confirm your desired features fit into our bigger project goals
- Open an issue with your planned implementation to discuss
- Check in before starting development to make sure your work won’t conflict with or duplicate existing work
Find Something to Work On
For low-hanging fruit and easy first tickets, see:
Development Setup
Prerequisites
- Python 3.11+ (3.13 recommended)
- uv package manager
- A non-root user for running tests (e.g., testuser)
Clone and Setup
Run Development Server
Docker Development
Code Style Guidelines
Naming Conventions
Use consistent naming for grep-ability and logical grouping:
- Group related functions with common prefixes: fs_migrate(), fs_migration_needed()
- Use _prefix for private helpers: _log_error(), _fs_next_version()
- All logging methods must start with log_ or _log
Minimize Unique Names
Reuse existing field names and data structures to keep the codebase predictable:
Testing
Running Tests
Test Writing Standards
- NO MOCKS: Tests must exercise real code paths with real databases
- NO SKIPS: Never use @skip or pytest.mark.skip - fix the test or code instead
- Strict Assertions: Use exact counts (==) not loose bounds (>=)
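As a rough illustration of these standards, the sketch below writes to a real SQLite file instead of mocking the database and asserts an exact row count. The add_snapshots() and count_snapshots() functions, the snapshot table, and the file name are hypothetical stand-ins, not ArchiveBox code.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def add_snapshots(db_path: Path, urls: list[str]) -> None:
    """Hypothetical code under test: insert one row per unique URL."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS snapshot (url TEXT UNIQUE)")
        conn.executemany(
            "INSERT OR IGNORE INTO snapshot (url) VALUES (?)",
            [(url,) for url in urls],
        )
        conn.commit()


def count_snapshots(db_path: Path) -> int:
    """Read the row count back from the real database file."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM snapshot").fetchone()[0]


def test_add_snapshots_deduplicates() -> None:
    # Real database on disk, no mocks.
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "test.sqlite3"
        add_snapshots(
            db_path,
            ["https://example.com", "https://example.com", "https://example.org"],
        )
        # Exact count: `>= 1` would also pass if deduplication were broken.
        assert count_snapshots(db_path) == 2
```

A loose bound such as >= 1 would still pass if the duplicate URL were stored twice, so the exact == 2 is what actually checks the behavior.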
Linting
Making Changes
Database Migrations
Adding a New Extractor
Extractors are external binaries or scripts that archive content. See examples:
- Issue #399 + PR #403 - Adding SingleFile extractor
- Check archivebox/extractors/ for existing extractors
Submitting Changes
- Make your changes and test thoroughly
- Commit with clear messages describing the why, not just the what
- Push to your fork and submit a Pull Request
- Wait for review feedback and be patient - we all have day jobs!
- Don’t abandon your PR - ping @theSquashSH if you need faster response
Pull Request Guidelines
- Open an issue first to discuss your proposed changes
- Keep PRs focused on a single feature or fix
- Include tests for new functionality
- Update documentation as needed
- Follow existing code style and conventions
- Make sure all tests pass before submitting
Code Coverage
We track code coverage to find dead code and improve test quality:
Developer Resources
- Python API Documentation: https://docs.archivebox.io/en/dev/archivebox.html
- Architecture Diagrams: https://github.com/ArchiveBox/ArchiveBox/wiki/ArchiveBox-Architecture-Diagrams
- Issue Tracker: https://github.com/ArchiveBox/ArchiveBox/issues
- Roadmap: https://github.com/ArchiveBox/ArchiveBox/wiki/Roadmap
- Development Guide: CLAUDE.md in the repository
Getting Help
- Open issues on GitHub
- Join discussions on GitHub Discussions
- Contact via Twitter: @ArchiveBoxApp or @theSquashSH
- Visit https://sweeting.me/#contact
Common Development Tasks
See the ./bin/ folder for bash scripts covering common tasks. Examples: