Thoth IP Reputation Service

Thoth is the reputation database for Anubis. Thoth feeds information to Anubis so that it can make better decisions about which traffic is innocuous and which traffic is suspicious.
Thoth is hosted by Techaro and is a paid service. Thoth is opt-in and requires manual intervention (including payment) to use. The code that powers Thoth is currently closed source.
To get access to Thoth, please subscribe on GitHub Sponsors and email Xe. This will be self-service soon.

What is Thoth?

Anubis instances are normally isolated. Each Anubis instance has its own configuration and exists in roughly its own world, without any long-term memory between requests. As threats, workarounds, and AI scraper toolchains evolve, administrators need a way to get up-to-date information faster than Anubis’ release cycle allows. Thoth solves this problem by providing:
  • Real-time threat intelligence: Get up-to-date information about malicious actors faster than Anubis’ release cycle
  • Shared reputation data: Benefit from collective threat intelligence across Anubis deployments
  • ASN and GeoIP filtering: Make decisions based on autonomous system numbers and geographic locations
  • Informative, not authoritative: Thoth influences request weight but doesn’t arbitrarily block traffic

Implementation

Thoth is a web service that listens over gRPC. Thoth’s API is documented in protocol buffer definitions in the GitHub repo TecharoHQ/thoth-proto.

Thoth is designed to be informative, not authoritative. It cannot and will not arbitrarily block requests, origins, or other traffic. It exists to inform Anubis and influence the weight of requests so that upstream resources can be protected.

Additionally, Anubis aggressively caches data from Thoth so that, over time, it rarely needs to query Thoth. This makes the fast path for repeat visitors even faster and reduces the amount of data that Thoth is exposed to.
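
The caching idea can be illustrated with a small sketch. This is not Anubis' actual cache: the expiry-based (TTL) strategy, the `TTLCache` class, and the `lookup_asn` and `remote_lookup` names are all assumptions made for illustration. It only shows why a repeat visitor does not trigger a second remote lookup.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (hypothetical, illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, int]] = {}

    def get(self, key: str) -> Optional[int]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it so the caller asks the remote service again.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: int) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)


def lookup_asn(ip: str, cache: TTLCache, remote_lookup: Callable[[str], int]) -> int:
    """Return the ASN for ip, calling remote_lookup only on a cache miss."""
    cached = cache.get(ip)
    if cached is not None:
        return cached
    asn = remote_lookup(ip)  # stands in for a request to the reputation service
    cache.put(ip, asn)
    return asn
```

With a cache like this, only the first request from a given address within the expiry window reaches the remote service. Later requests are answered locally.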

Configuration

To enable Thoth integration, configure the following environment variables:

THOTH_URL

The URL for your Thoth instance.
THOTH_URL=thoth.example.com:443

THOTH_TOKEN

Your API token for authenticating with Thoth.
THOTH_TOKEN=your-api-token-here
Both THOTH_URL and THOTH_TOKEN must be set together. If you set one without the other, Anubis will log a warning and Thoth integration will not be enabled.
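
The both-or-neither rule can be sketched as follows. This is a minimal illustration of the behaviour described above, not Anubis' implementation. The `thoth_settings` function, the logger name, and the warning text are hypothetical.

```python
import logging
import os
from typing import Mapping, Optional, Tuple

log = logging.getLogger("thoth-config-sketch")  # hypothetical logger name


def thoth_settings(env: Mapping[str, str] = os.environ) -> Optional[Tuple[str, str]]:
    """Return (url, token) when both variables are set, otherwise None."""
    url = env.get("THOTH_URL", "")
    token = env.get("THOTH_TOKEN", "")
    if url and token:
        return url, token
    if url or token:
        # Exactly one of the two is set: warn and leave the integration off.
        log.warning("THOTH_URL and THOTH_TOKEN must be set together; Thoth integration disabled")
    return None
```

Returning `None` in both the "neither set" and "only one set" cases keeps the integration disabled unless the configuration is complete. Only the half-configured case logs a warning.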

Example Configuration

# Enable Thoth integration
THOTH_URL=thoth.techaro.lol:443
THOTH_TOKEN=your-secret-token
The client automatically:
  • Uses TLS for secure communication (unless THOTH_INSECURE is set)
  • Sets a 500ms timeout for Thoth requests
  • Includes the Anubis version in the User-Agent header
  • Provides Prometheus metrics for monitoring gRPC performance
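
To illustrate what a per-request timeout means in practice, here is a rough sketch that puts a deadline around an arbitrary lookup function. It is not how the Anubis client is written, since a real gRPC client would set a deadline on the call itself. The `lookup_with_deadline` helper is hypothetical, and returning `None` on timeout is an assumption, because the text above does not say what Anubis does when a request times out.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout
from typing import Callable, Optional

THOTH_TIMEOUT_SECONDS = 0.5  # the 500ms figure from the list above


def lookup_with_deadline(
    lookup: Callable[[str], int],
    ip: str,
    timeout: float = THOTH_TIMEOUT_SECONDS,
) -> Optional[int]:
    """Run lookup(ip), giving up and returning None once the deadline passes."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(lookup, ip)
    try:
        return future.result(timeout=timeout)
    except LookupTimeout:
        # Assumption: a slow answer is treated the same as no answer.
        return None
    finally:
        # Do not block the caller on the abandoned worker thread.
        pool.shutdown(wait=False)
```

The point of the bound is that a slow or unreachable reputation service can delay a request by at most the timeout and never indefinitely.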

Features

ASN-based Filtering

When companies link their backbone infrastructure to the Internet, they do so via a BGP Autonomous System, denoted by a number (the Autonomous System Number or ASN). Every routed IP address on the Internet belongs to an autonomous system, and that mapping does not change very frequently. Anubis uses Thoth to match IP addresses to BGP Autonomous Systems so that you can either issue arbitrary challenges to individual network operators (such as Cloudflare or Huawei Cloud) or, at the administrator’s explicit instruction, block them altogether.

Example: Add 10 weight points to requests from Cloudflare, Huawei Cloud, and Alibaba Cloud:
- name: aggressive-asns-without-functional-abuse-contact
  action: WEIGH
  asns:
    match:
      - 13335   # Cloudflare
      - 136907  # Huawei Cloud
      - 45102   # Alibaba Cloud
  weight:
    adjust: 10
You can look up details for AS13335 or any of these other top offenders on bgp.tools.
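
Conceptually, a rule like the one above adds a fixed amount to a request's weight when the client's ASN is in the list. The sketch below models that idea only. The `ASNWeightRule` class and `apply_rule` function are hypothetical names and do not reflect Anubis' internal types.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ASNWeightRule:
    """Hypothetical in-memory form of an ASN-based WEIGH rule."""
    name: str
    asns: FrozenSet[int]
    adjust: int


def apply_rule(rule: ASNWeightRule, asn: int, weight: int) -> int:
    """Return the request weight after applying rule to a client in asn."""
    if asn in rule.asns:
        return weight + rule.adjust
    return weight  # no match: the weight is left untouched


# Mirrors the YAML example: Cloudflare, Huawei Cloud, Alibaba Cloud.
RULE = ASNWeightRule(
    name="aggressive-asns-without-functional-abuse-contact",
    asns=frozenset({13335, 136907, 45102}),
    adjust=10,
)
```

Because the rule only adjusts weight, a match makes a challenge more likely without rejecting the request outright.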

GeoIP-based Filtering

In extreme cases, an administrator may have to take action against an entire country. This is not an ideal circumstance, but sometimes reality forces an administrator’s hand and they just want to sleep at night. Anubis uses Thoth to look up the geographic location registered to an IP address. This lookup is not perfect and will get better with time, but you ship what you can so you can make it better next time.

Example: Add 10 weight points to requests from Brazil and China:
- name: countries-with-aggressive-scrapers
  action: WEIGH
  geoip:
    countries:
      - BR
      - CN
  weight:
    adjust: 10
Use geographic filtering with care. Blocking entire countries can impact legitimate users.
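
The matching step behind a country rule can be sketched like this. It rests on assumptions: that country codes are two-letter codes compared case-insensitively, and that an address with no known location gets no extra weight. The `country_weight` function is a hypothetical helper, not part of Anubis.

```python
from typing import Iterable, Optional


def country_weight(country_code: Optional[str], flagged: Iterable[str], adjust: int = 10) -> int:
    """Return the weight to add for a client whose GeoIP lookup gave country_code."""
    if not country_code:
        # Assumption: an unknown location is never penalised.
        return 0
    flagged_set = {code.upper() for code in flagged}
    return adjust if country_code.upper() in flagged_set else 0
```

Treating a failed lookup as "no adjustment" keeps an imperfect GeoIP database from penalising clients whose location simply could not be determined.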

Work-in-Progress Features

The following features are planned for future releases:
  • Private rulesets: Advanced patterns, current known exploits, and other recognition tactics that need to be kept confidential for operational security reasons
  • Private challenge implementations: Advanced browser detection logic via WebAssembly
  • Reputation querying: Arbitrarily influence request weight based on net aggregate pass rate so common browsers can get through with no challenge
  • Abuse reporting APIs: Allow trusted administrators to report abusive request fingerprints for faster threat response
  • Pass rate reporting: Periodic reporting of pass rates per ASN and other fingerprints for methodology improvement

Benefits

  • Faster threat response: React to new threats faster than Anubis’ release cycle
  • Reduced false positives: Better intelligence leads to more accurate traffic classification
  • Performance optimization: Aggressive caching means minimal latency impact
  • Collective intelligence: Benefit from threat data across multiple Anubis deployments
  • Fine-grained control: Use ASN and GeoIP data to create sophisticated filtering rules
