Policies in Anubis define how to identify and handle different types of traffic. They consist of bot rules (pattern-based detection) and thresholds (weight-based triggers).
Policy Structure
Anubis policies are configured in YAML and loaded at startup:
bots:
  - name: verified-googlebot
    remote_addresses:
      - "66.249.64.0/19"
    action: ALLOW
  - name: suspicious-user-agents
    user_agent_regex: "(curl|wget|scrapy)"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 3
thresholds:
  - name: high-suspicion
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4
Bot Rules
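Bot rules are evaluated sequentially. The first matching rule with a terminal action (ALLOW, DENY, CHALLENGE, DEBUG_BENCHMARK) determines the request’s fate. WEIGH is non-terminal: it adjusts the request's weight and evaluation continues with the next rule.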
Rule Definition
// From lib/config/config.go:58
type BotConfig struct {
	UserAgentRegex *string           `json:"user_agent_regex,omitempty"`
	PathRegex      *string           `json:"path_regex,omitempty"`
	HeadersRegex   map[string]string `json:"headers_regex,omitempty"`
	Expression     *ExpressionOrList `json:"expression,omitempty"`
	Challenge      *ChallengeRules   `json:"challenge,omitempty"`
	Weight         *Weight           `json:"weight,omitempty"`
	GeoIP          *GeoIP            `json:"geoip,omitempty"`
	ASNs           *ASNs             `json:"asns,omitempty"`
	Name           string            `json:"name"`
	Action         Rule              `json:"action"`
	RemoteAddr     []string          `json:"remote_addresses,omitempty"`
}
Matching Conditions
Rules can match requests using multiple conditions (AND logic):
Match against the User-Agent header:
- name: block-python-scrapers
  user_agent_regex: "python-requests|httpx|aiohttp"
  action: DENY
Implementation: lib/policy/checker.go:NewUserAgentChecker()
Match against the request path:
- name: protect-admin
  path_regex: "^/admin/.*"
  action: CHALLENGE
  challenge:
    difficulty: 5
Implementation: lib/policy/checker.go:NewPathChecker()
Match against client IP addresses:
- name: internal-network
  remote_addresses:
    - "10.0.0.0/8"
    - "172.16.0.0/12"
    - "192.168.0.0/16"
  action: ALLOW
Implementation: lib/policy/checker.go:NewRemoteAddrChecker() using gaissmai/bart prefix table
Advanced matching with Common Expression Language:
- name: rate-limit-trigger
  expression:
    - "req.headers['x-forwarded-for'].size() > 0"
    - "req.path.startsWith('/api/')"
    - "req.method in ['POST', 'PUT', 'DELETE']"
  action: WEIGH
  weight:
    adjust: 5
Available variables:
req.method (string)
req.path (string)
req.headers (map)
req.query (map)
DNS lookups (via expressions)
Implementation: lib/policy/celchecker.go:NewCELChecker()
GeoIP (Thoth Integration)
Match by country code (requires Thoth):
- name: block-regions
  geoip:
    countries:
      - CN
      - RU
  action: DENY
Requires: Thoth service configured via ANUBIS_THOTH_URL
Implementation: lib/thoth/geoipchecker.go
Match by Autonomous System Number:
- name: cloud-providers
  asns:
    match:
      - 16509 # Amazon AWS
      - 15169 # Google Cloud
      - 8075 # Microsoft Azure
  action: CHALLENGE
  challenge:
    difficulty: 2
Requires: Thoth service
Implementation: lib/thoth/asnchecker.go
All conditions within a single bot rule are AND-ed together. If you specify both user_agent_regex and path_regex, the request must match both.
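The AND semantics can be illustrated with a minimal, self-contained sketch. It is not the Anubis checker implementation: the `request` and `condition` types and the helper names are hypothetical, and the standard library's `net/netip` stands in for the gaissmai/bart prefix table. Note that the entries inside one `remote_addresses` list are alternatives (any prefix may match), while distinct conditions are combined with AND.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
)

// request is a hypothetical, simplified view of an incoming request.
type request struct {
	userAgent string
	path      string
	addr      netip.Addr
}

// condition reports whether a request satisfies one matching condition.
type condition func(request) bool

// userAgentMatches builds a condition from a User-Agent regular expression.
func userAgentMatches(pattern string) condition {
	re := regexp.MustCompile(pattern)
	return func(r request) bool { return re.MatchString(r.userAgent) }
}

// pathMatches builds a condition from a request path regular expression.
func pathMatches(pattern string) condition {
	re := regexp.MustCompile(pattern)
	return func(r request) bool { return re.MatchString(r.path) }
}

// addrIn matches when the client address falls inside any listed CIDR prefix.
func addrIn(cidrs ...string) condition {
	prefixes := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		prefixes = append(prefixes, netip.MustParsePrefix(c))
	}
	return func(r request) bool {
		for _, p := range prefixes {
			if p.Contains(r.addr) {
				return true
			}
		}
		return false
	}
}

// allOf matches only when every condition matches (AND logic).
func allOf(conds ...condition) condition {
	return func(r request) bool {
		for _, c := range conds {
			if !c(r) {
				return false
			}
		}
		return true
	}
}

func main() {
	rule := allOf(
		userAgentMatches("python-requests|httpx|aiohttp"),
		pathMatches("^/admin/.*"),
	)
	r := request{
		userAgent: "python-requests/2.32",
		path:      "/admin/users",
		addr:      netip.MustParseAddr("10.1.2.3"),
	}
	fmt.Println(rule(r)) // true: both conditions match
	r.path = "/index.html"
	fmt.Println(rule(r)) // false: the path condition fails
	fmt.Println(addrIn("10.0.0.0/8", "192.168.0.0/16")(r)) // true
}
```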
Rule Validation
Rules are validated on startup:
// From lib/config/config.go:95
func (b *BotConfig) Valid() error {
	var errs []error
	if b.Name == "" {
		errs = append(errs, ErrBotMustHaveName)
	}
	// Must have at least one matching condition
	allFieldsEmpty := b.UserAgentRegex == nil &&
		b.PathRegex == nil &&
		len(b.RemoteAddr) == 0 &&
		len(b.HeadersRegex) == 0 &&
		b.ASNs == nil &&
		b.GeoIP == nil
	if allFieldsEmpty && b.Expression == nil {
		errs = append(errs, ErrBotMustHaveUserAgentOrPath)
	}
	// Validate regexes compile
	if b.UserAgentRegex != nil {
		if _, err := regexp.Compile(*b.UserAgentRegex); err != nil {
			errs = append(errs, ErrInvalidUserAgentRegex, err)
		}
	}
	return errors.Join(errs...)
}
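Because validation collects problems with `errors.Join` rather than returning at the first one, a single bad rule can report several errors at once. The following is a reduced sketch of that pattern, not the Anubis code: the `botConfig` type, the error variables, and the `failsWith` helper are hypothetical stand-ins. It requires Go 1.20 or later for `errors.Join`.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Hypothetical sentinel errors standing in for the ones in lib/config.
var (
	errMustHaveName = errors.New("bot must have a name")
	errInvalidRegex = errors.New("invalid user agent regex")
)

// botConfig is a trimmed-down, hypothetical version of BotConfig.
type botConfig struct {
	Name           string
	UserAgentRegex *string
}

// valid collects every problem instead of stopping at the first one.
func (b botConfig) valid() error {
	var errs []error
	if b.Name == "" {
		errs = append(errs, errMustHaveName)
	}
	if b.UserAgentRegex != nil {
		if _, err := regexp.Compile(*b.UserAgentRegex); err != nil {
			errs = append(errs, errInvalidRegex, err)
		}
	}
	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

// failsWith reports whether validation produced the given sentinel error.
func failsWith(b botConfig, target error) bool {
	return errors.Is(b.valid(), target)
}

func main() {
	bad := "(" // unbalanced parenthesis, does not compile as a regex
	cfg := botConfig{UserAgentRegex: &bad}
	fmt.Println(failsWith(cfg, errMustHaveName)) // true
	fmt.Println(failsWith(cfg, errInvalidRegex)) // true

	good := "curl|wget"
	ok := botConfig{Name: "ok", UserAgentRegex: &good}
	fmt.Println(ok.valid() == nil) // true
}
```

Callers can test for a specific failure with `errors.Is`, since a joined error still matches each of the errors it wraps.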
Actions
Bot rules can specify five different actions:
ALLOW: Immediately proxy the request to upstream without challenge.
- name: verified-bot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW
DENY: Block the request with a 403 Forbidden response.
- name: blacklisted-ips
  remote_addresses:
    - "203.0.113.0/24"
  action: DENY
CHALLENGE: Issue a proof-of-work challenge. Requires challenge configuration.
- name: suspicious-bot
  user_agent_regex: "bot|crawler|spider"
  action: CHALLENGE
  challenge:
    algorithm: fast
    difficulty: 3
WEIGH: Adjust the request’s suspicion weight and continue evaluation.
- name: missing-common-headers
  expression:
    - "!('accept-language' in req.headers)"
  action: WEIGH
  weight:
    adjust: 5
Default weight adjustment is 5 if not specified.
DEBUG_BENCHMARK: Render a benchmark page for testing challenge performance.
- name: benchmark-endpoint
  path_regex: "^/__benchmark$"
  action: DEBUG_BENCHMARK
Action Flow
// From lib/anubis.go:609
for _, b := range s.policy.Bots {
	match, err := b.Rules.Check(r)
	// (handling of err is omitted from this excerpt)
	if match {
		switch b.Action {
		case config.RuleDeny, config.RuleAllow,
			config.RuleBenchmark, config.RuleChallenge:
			// Terminal action - return immediately
			return cr("bot/"+b.Name, b.Action, weight), &b, nil
		case config.RuleWeigh:
			// Non-terminal - accumulate weight and continue
			weight += b.Weight.Adjust
		}
	}
}
Order matters! Place ALLOW rules for verified bots first, then WEIGH rules to accumulate suspicion, and finally DENY/CHALLENGE rules.
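To see why ordering matters, here is a small simulation of the first-match loop. It is a sketch under simplifying assumptions, not the Anubis code: the `rule` type, the substring matchers, and the rule names are hypothetical, and thresholds are left out. The same three rules produce different results depending on their order.

```go
package main

import (
	"fmt"
	"strings"
)

type action string

const (
	allow     action = "ALLOW"
	challenge action = "CHALLENGE"
	weigh     action = "WEIGH"
)

// rule is a hypothetical, simplified bot rule that matches on User-Agent only.
type rule struct {
	name   string
	match  func(userAgent string) bool
	action action
	adjust int // only used by WEIGH rules
}

// contains returns a matcher that looks for a substring in the User-Agent.
func contains(s string) func(string) bool {
	return func(ua string) bool { return strings.Contains(ua, s) }
}

var (
	verifiedBot = rule{name: "verified-bot", match: contains("Googlebot"), action: allow}
	weighBot    = rule{name: "bot-hint", match: contains("bot"), action: weigh, adjust: 5}
	genericBot  = rule{name: "generic-bot", match: contains("bot"), action: challenge}
)

// evaluate walks the rules in order. WEIGH rules add to the weight and
// evaluation continues; any other matching rule ends evaluation.
func evaluate(rules []rule, userAgent string) (string, action, int) {
	weight := 0
	for _, r := range rules {
		if !r.match(userAgent) {
			continue
		}
		if r.action == weigh {
			weight += r.adjust
			continue
		}
		return "bot/" + r.name, r.action, weight
	}
	return "default/allow", allow, weight
}

func main() {
	ua := "Mozilla/5.0 (compatible; Googlebot/2.1)"

	// ALLOW first: the verified crawler passes before any other rule runs.
	fmt.Println(evaluate([]rule{verifiedBot, weighBot, genericBot}, ua))

	// ALLOW last: "Googlebot" also contains "bot", so it is challenged.
	fmt.Println(evaluate([]rule{weighBot, genericBot, verifiedBot}, ua))
}
```

In the second ordering the ALLOW rule is never reached, because the CHALLENGE rule ahead of it already matched.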
Thresholds
Thresholds evaluate accumulated weight from WEIGH actions using CEL expressions:
thresholds:
  - name: low-suspicion
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2
  # Bounded above so that extreme-suspicion below stays reachable
  - name: high-suspicion
    expression: "weight >= 10 && weight < 20"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 5
  - name: extreme-suspicion
    expression: "weight >= 20"
    action: DENY
Threshold Evaluation
// From lib/anubis.go:627
for _, t := range s.policy.Thresholds {
	result, _, err := t.Program.ContextEval(
		r.Context(),
		&policy.ThresholdRequest{Weight: weight},
	)
	if err != nil {
		// Skip this threshold; error handling is abbreviated in this excerpt
		continue
	}
	// A threshold matches when its expression evaluates to true
	matches, _ := result.Value().(bool)
	if matches {
		return cr("threshold/"+t.Name, t.Action, weight), &policy.Bot{
			Challenge: t.Challenge,
			Rules:     &checker.List{},
		}, nil
	}
}
Thresholds are evaluated in order. The first matching threshold determines the action.
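Since the first match wins, a threshold with an open-ended range can shadow every threshold listed after it. The sketch below shows the pitfall with plain Go predicates standing in for compiled CEL expressions; the `threshold` type and `firstMatch` function are hypothetical and only mirror the evaluation order described here.

```go
package main

import "fmt"

// threshold is a hypothetical stand-in for a configured threshold.
// matches plays the role of the compiled CEL expression.
type threshold struct {
	name    string
	matches func(weight int) bool
	action  string
}

// firstMatch returns the first threshold, in list order, that matches.
func firstMatch(ts []threshold, weight int) (name, action string) {
	for _, t := range ts {
		if t.matches(weight) {
			return "threshold/" + t.name, t.action
		}
	}
	return "", ""
}

// overlapping lists "weight >= 10" before "weight >= 20".
var overlapping = []threshold{
	{"high-suspicion", func(w int) bool { return w >= 10 }, "CHALLENGE"},
	{"extreme-suspicion", func(w int) bool { return w >= 20 }, "DENY"},
}

// bounded caps the first range so the second one is reachable.
var bounded = []threshold{
	{"high-suspicion", func(w int) bool { return w >= 10 && w < 20 }, "CHALLENGE"},
	{"extreme-suspicion", func(w int) bool { return w >= 20 }, "DENY"},
}

func main() {
	fmt.Println(firstMatch(overlapping, 25)) // threshold/high-suspicion CHALLENGE
	fmt.Println(firstMatch(bounded, 25))     // threshold/extreme-suspicion DENY
}
```

Either bound each range on both sides or list the most severe threshold first.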
Threshold Definition
// From lib/config/threshold.go:32
type Threshold struct {
	Expression *ExpressionOrList `json:"expression"`
	Challenge  *ChallengeRules   `json:"challenge"`
	Name       string            `json:"name"`
	Action     Rule              `json:"action"`
}
Thresholds cannot use the WEIGH action; config loading fails with a validation error:
if t.Action == RuleWeigh {
	errs = append(errs, ErrThresholdCannotHaveWeighAction)
}
Rule Hashing
Each bot rule is hashed to detect policy changes:
// From lib/policy/bot.go:19
func (b Bot) Hash() string {
	return internal.FastHash(fmt.Sprintf("%s::%s", b.Name, b.Rules.Hash()))
}
This hash is embedded in JWTs. When you update your policy, the rule hash changes, existing JWTs that carry the old hash fail validation, and clients must re-solve challenges.
This prevents bypassing updated security rules with old tokens.
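The invalidation effect can be sketched in a few lines. This is an illustration only: SHA-256 is an assumed stand-in for `internal.FastHash`, the rule description strings are made up, and `tokenStillValid` is a hypothetical helper that reduces JWT validation to the hash comparison.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ruleHash mimics the "name::rules" hashing scheme.
// SHA-256 is an assumption here, not the hash Anubis actually uses.
func ruleHash(name, rulesHash string) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s::%s", name, rulesHash)))
	return hex.EncodeToString(sum[:])
}

// tokenStillValid is a hypothetical check: a token issued under one rule
// hash is only accepted while the current rule produces the same hash.
func tokenStillValid(tokenRuleHash, currentRuleHash string) bool {
	return tokenRuleHash == currentRuleHash
}

func main() {
	// Hash recorded in a token when the challenge was solved.
	issued := ruleHash("suspicious-bot", "user_agent_regex=bot|crawler")

	// The policy is unchanged: the token is still accepted.
	fmt.Println(tokenStillValid(issued, ruleHash("suspicious-bot", "user_agent_regex=bot|crawler")))

	// The regex was edited: the hash differs, so the old token is rejected.
	fmt.Println(tokenStillValid(issued, ruleHash("suspicious-bot", "user_agent_regex=bot|crawler|spider")))
}
```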
Check Result
Policy evaluation returns a CheckResult:
// From lib/policy/checkresult.go:9
type CheckResult struct {
	Name   string      // e.g., "bot/suspicious-crawler" or "threshold/high-suspicion"
	Rule   config.Rule // ALLOW, DENY, CHALLENGE, etc.
	Weight int         // Accumulated weight
}
This is logged and exposed in Prometheus metrics:
anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"} 1234
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"} 567
Import Statements
Reuse bot rules across multiple configs:
# main-policy.yaml
bots:
  - import: "(data)/verified-bots.yaml" # Built-in
  - import: "/etc/anubis/custom-rules.yaml" # External
  - name: site-specific-rule
    path_regex: "^/protected/"
    action: CHALLENGE
# verified-bots.yaml
- name: googlebot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW
- name: bingbot
  remote_addresses:
    - "40.77.167.0/24"
  action: ALLOW
Use the (data)/ prefix to import built-in bot policies shipped with Anubis. These are embedded at compile time.
CEL Expressions
Anubis supports Common Expression Language for advanced matching:
Available Functions
String Operations
req.path.startsWith('/api/')
req.headers['user-agent'].contains('Mobile')
req.method.matches('^(POST|PUT|DELETE)$')
Environment Variables
Expressions have access to:
// From lib/policy/expressions/
- req.method (string)
- req.path (string)
- req.headers (map<string, string>)
- req.query (map<string, string>)
- loadavg() (float, Linux only)
- dns.forward(hostname) ([]string)
- dns.reverse(ip) ([]string)
Example Policies
Progressive Challenge
# Escalate difficulty based on behavior
bots:
  # Known good bots
  - import: "(data)/verified-bots.yaml"
  # Add suspicion for missing headers
  - name: missing-language
    expression:
      - "!('accept-language' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  - name: missing-encoding
    expression:
      - "!('accept-encoding' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  # Add suspicion for scraper user agents
  - name: scraper-ua
    user_agent_regex: "(curl|wget|python|scrapy)"
    action: WEIGH
    weight:
      adjust: 10
thresholds:
  # Light challenge for moderate suspicion
  - name: moderate
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2
  # Heavy challenge for high suspicion
  - name: high
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4
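It helps to work through the arithmetic of this policy for a few requests. The sketch below hard-codes the three WEIGH rules and the two thresholds from the example; it is not driven by the YAML, it ignores the verified-bot import, and it treats an empty header as a missing one. The function names `weightFor`, `outcome`, and `headers` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

var scraperUA = regexp.MustCompile("(curl|wget|python|scrapy)")

// headers builds an http.Header from alternating key/value arguments.
func headers(kv ...string) http.Header {
	h := http.Header{}
	for i := 0; i+1 < len(kv); i += 2 {
		h.Set(kv[i], kv[i+1])
	}
	return h
}

// weightFor sums the adjustments of the example's WEIGH rules that match.
func weightFor(h http.Header) int {
	weight := 0
	if h.Get("Accept-Language") == "" {
		weight += 3 // missing-language
	}
	if h.Get("Accept-Encoding") == "" {
		weight += 3 // missing-encoding
	}
	if scraperUA.MatchString(h.Get("User-Agent")) {
		weight += 10 // scraper-ua
	}
	return weight
}

// outcome maps a weight onto the example's thresholds, in list order.
func outcome(weight int) string {
	switch {
	case weight >= 5 && weight < 10:
		return "threshold/moderate" // CHALLENGE, difficulty 2
	case weight >= 10:
		return "threshold/high" // CHALLENGE, difficulty 4
	default:
		return "default/allow"
	}
}

func main() {
	// curl sends neither header by default in this scenario: 3 + 3 + 10 = 16.
	curl := headers("User-Agent", "curl/8.5.0")
	fmt.Println(weightFor(curl), outcome(weightFor(curl)))

	// A browser-like client missing both headers: 3 + 3 = 6.
	bare := headers("User-Agent", "Mozilla/5.0")
	fmt.Println(weightFor(bare), outcome(weightFor(bare)))

	// Missing only Accept-Language: 3, below every threshold.
	partial := headers("User-Agent", "Mozilla/5.0", "Accept-Encoding", "gzip")
	fmt.Println(weightFor(partial), outcome(weightFor(partial)))
}
```

A single missing header stays below the lowest threshold, so such a request falls through to the default behavior.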
Default Behavior
If no bot rules or thresholds match, Anubis allows the request:
// From lib/anubis.go:648
return cr("default/allow", config.RuleAllow, weight), &policy.Bot{
	Challenge: &config.ChallengeRules{
		Difficulty: s.policy.DefaultDifficulty,
		Algorithm:  config.DefaultAlgorithm,
	},
	Rules: &checker.List{},
}, nil
This “default allow” behavior means Anubis is not a firewall by itself. It only challenges/blocks traffic that matches your rules. Combine it with proper network security.
Metrics and Monitoring
Policy decisions are tracked:
# Rule application counts
anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"}
anubis_policy_results{rule="bot/suspicious-crawler",action="CHALLENGE"}
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"}
anubis_policy_results{rule="bot/blocklist",action="DENY"}
Request headers include policy metadata:
X-Anubis-Rule: bot/suspicious-crawler
X-Anubis-Action: CHALLENGE
X-Anubis-Status: PASS
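An upstream service can read these headers to log or branch on the policy decision. The following sketch uses only the standard library and the three header names shown above; the `policyInfo`, `handler`, and `demoRequest` names are hypothetical. Only trust these headers when the upstream is reachable exclusively through Anubis, since a client talking to the upstream directly could set them itself.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// policyInfo extracts the policy metadata headers from a proxied request.
func policyInfo(r *http.Request) (rule, action, status string) {
	return r.Header.Get("X-Anubis-Rule"),
		r.Header.Get("X-Anubis-Action"),
		r.Header.Get("X-Anubis-Status")
}

// handler is an example upstream handler that echoes the policy decision.
func handler(w http.ResponseWriter, r *http.Request) {
	rule, action, status := policyInfo(r)
	fmt.Fprintf(w, "rule=%s action=%s status=%s", rule, action, status)
}

// demoRequest builds a request carrying the headers from the example above.
func demoRequest() *http.Request {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Anubis-Rule", "bot/suspicious-crawler")
	req.Header.Set("X-Anubis-Action", "CHALLENGE")
	req.Header.Set("X-Anubis-Status", "PASS")
	return req
}

func main() {
	rec := httptest.NewRecorder()
	handler(rec, demoRequest())
	fmt.Println(rec.Body.String())
	// rule=bot/suspicious-crawler action=CHALLENGE status=PASS
}
```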
Best Practices
Order Rules Carefully: Place ALLOW rules first, then WEIGH rules, then the remaining terminal actions.
Use Imports: Reuse verified bot lists with import: "(data)/verified-bots.yaml".
Start Conservative: Begin with WEIGH actions and observe metrics before adding DENY rules.
Test Expressions: Use the DEBUG_BENCHMARK action on test endpoints to verify CEL expressions.
Next Steps
Challenges: Configure challenge difficulty and algorithms
Architecture: Understand how policies integrate with the proxy