Build and Runtime
Node.js Version
Requirement: Pin Node to a supported LTS line (>=18)
Why: Runner requires Node.js 18+ for fetch, AbortController, and modern async features.
Check: run node --version on the deployed runtime and confirm it reports 18 or later.
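The version requirement can also be enforced at startup, so a misconfigured host fails fast rather than at the first missing API. The following is a minimal sketch; the function names and error text are hypothetical and not part of Runner.

```typescript
// Hypothetical startup guard; names and messages are illustrative only.
const MIN_NODE_MAJOR = 18;

/** Returns the major version from a string such as "18.19.0" or "v20.11.1". */
function parseNodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]);
}

/** Throws when the given version is older than the required major. */
function assertSupportedNode(version: string, minMajor: number = MIN_NODE_MAJOR): void {
  const major = parseNodeMajor(version);
  if (major < minMajor) {
    throw new Error(`Node.js ${minMajor}+ is required, found ${version}`);
  }
}
```

At the top of the entry point this could be called as assertSupportedNode(process.versions.node), which passes the running version without the leading "v".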
Build Validation
Requirement: Build in CI with npm run qa
Why: Catches type errors, linting issues, and test failures before deployment.
What npm run qa does:
- TypeScript type checking
- ESLint/Prettier validation
- Full test suite with 100% coverage enforcement
Compiled Output
Requirement: Run from compiled output (no ts-node in production)
Why: ts-node adds significant startup time and memory overhead.
Never start the production process through ts-node.
Security
Tunnel Authentication
Requirement: Configure exposure auth for tunnels and avoid anonymous exposure
Never expose a tunnel without auth.
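Runner's own exposure configuration is not shown here, so the following is only a generic sketch of what rejecting anonymous callers can look like: a bearer-token check with a comparison that does not stop at the first mismatching character. Every name in it is hypothetical.

```typescript
// Generic sketch, not Runner's exposure API. All names are hypothetical.

/** Compares two strings without returning early on the first mismatch. */
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

/** Accepts only "Bearer <token>" headers whose token matches the configured secret. */
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  if (expectedToken.length === 0) {
    return false; // an empty secret would amount to anonymous exposure
  }
  const prefix = "Bearer ";
  if (!authorizationHeader || !authorizationHeader.startsWith(prefix)) {
    return false;
  }
  return constantTimeEqual(authorizationHeader.slice(prefix.length), expectedToken);
}
```

In a Node.js service, crypto.timingSafeEqual is the usual choice for the comparison step; the hand-written loop is used here only to keep the sketch free of imports.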
Task/Event Allow-lists
Requirement: Use allow-lists for remotely callable task/event ids
Why: Prevents accidental exposure of internal tasks/events.
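The idea behind an allow-list is deny-by-default: an id is remotely callable only if it was listed explicitly. A minimal sketch follows, with made-up ids and a hypothetical helper name; how Runner itself accepts an allow-list is not covered here.

```typescript
// Hypothetical ids; substitute the task/event ids your application exposes.
const REMOTE_ALLOW_LIST: ReadonlySet<string> = new Set([
  "app.tasks.createOrder",
  "app.tasks.getOrderStatus",
]);

/** Throws unless the requested id was explicitly listed as remotely callable. */
function assertRemotelyCallable(
  id: string,
  allowList: ReadonlySet<string> = REMOTE_ALLOW_LIST,
): void {
  if (!allowList.has(id)) {
    // Deny by default: anything not listed stays internal.
    throw new Error(`Task or event "${id}" is not exposed for remote calls`);
  }
}
```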
Payload Limits
Requirement: Set payload limits for JSON/multipart traffic
Why: Protects against denial-of-service attacks via large payloads.
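A payload limit is most useful when it is enforced while the body is still streaming in, so an oversized request is cut off before it is buffered. The sketch below assumes a hypothetical ByteBudget helper; the 1 MiB default is an example value, not a Runner default.

```typescript
// Hypothetical helper; the 1 MiB default is an example value only.
const DEFAULT_MAX_BODY_BYTES = 1024 * 1024;

class PayloadTooLargeError extends Error {
  readonly statusCode = 413; // HTTP status for an oversized request body

  constructor(limit: number) {
    super(`Payload exceeds ${limit} bytes`);
    this.name = "PayloadTooLargeError";
  }
}

/** Tracks bytes received so far and rejects as soon as the limit is crossed. */
class ByteBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number = DEFAULT_MAX_BODY_BYTES) {
    this.limit = limit;
  }

  /** Call once per received chunk with that chunk's byte length. */
  consume(chunkLength: number): void {
    this.used += chunkLength;
    if (this.used > this.limit) {
      throw new PayloadTooLargeError(this.limit);
    }
  }
}
```

A request handler would create one ByteBudget per request and call consume for each incoming chunk, replying with the error's statusCode when it throws.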
Log Sanitization
Requirement: Review logs for sensitive data before enabling external sinks
Common sensitive fields to avoid logging:
- Passwords, tokens, API keys
- Credit card numbers, SSNs
- Email addresses (in some jurisdictions)
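One way to keep the fields listed above out of external sinks is to redact them by key before a log entry is serialized. This is a sketch for plain JSON-like data; the key list and the redact helper are hypothetical and would need extending for your own payloads.

```typescript
// Hypothetical key list; extend it to match the fields your services handle.
const SENSITIVE_KEYS = new Set([
  "password", "token", "apikey", "authorization", "cardnumber", "ssn", "email",
]);

/**
 * Returns a deep copy of `value` with sensitive fields replaced by "[REDACTED]".
 * Intended for plain JSON-like data (objects, arrays, primitives).
 */
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      // Normalize so "apiKey", "api_key" and "API-KEY" all match "apikey".
      const normalized = key.toLowerCase().replace(/[^a-z0-9]/g, "");
      out[key] = SENSITIVE_KEYS.has(normalized) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value;
}
```

Key-based redaction does not catch secrets embedded in free-text messages, so it complements the log review rather than replacing it.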
Reliability
Timeout/Retry Defaults
Requirement: Define timeout/retry/circuit-breaker defaults for external I/O tasks
Why: External services fail. Proper error handling prevents cascading failures.
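As an illustration of what such defaults can look like in code, the sketch below wraps an operation with a per-attempt timeout and exponential backoff. The policy shape, the default values, and callWithPolicy are all hypothetical, and a circuit breaker is left out to keep it short.

```typescript
// Hypothetical defaults; tune them per dependency. None of these names come from Runner.
interface IoPolicy {
  timeoutMs: number;   // per-attempt deadline
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // backoff base, doubled after each failure
}

const DEFAULT_IO_POLICY: IoPolicy = { timeoutMs: 5000, maxAttempts: 3, baseDelayMs: 200 };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Runs `operation` with a per-attempt timeout and exponential backoff.
 * The timeout only takes effect if `operation` honors the AbortSignal,
 * for example by passing it to fetch.
 */
async function callWithPolicy<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  policy: IoPolicy = DEFAULT_IO_POLICY,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), policy.timeoutMs);
    try {
      return await operation(controller.signal);
    } catch (error) {
      lastError = error;
    } finally {
      clearTimeout(timer);
    }
    if (attempt < policy.maxAttempts) {
      await sleep(policy.baseDelayMs * 2 ** (attempt - 1));
    }
  }
  throw lastError;
}
```

A call site might look like callWithPolicy((signal) => fetch(url, { signal })), where url is whatever external endpoint the task depends on.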
Graceful Shutdown
Requirement: Verify the graceful shutdown path with SIGTERM in staging
Test in staging: send SIGTERM to the running process and confirm it shuts down cleanly.
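A SIGTERM handler typically runs cleanup once and forces an exit if cleanup hangs. The sketch below assumes a hypothetical installGracefulShutdown helper; the signal source and exit function are passed in so the behavior can be exercised in tests without killing the test process.

```typescript
// Minimal shape of the signal emitter; Node's global `process` satisfies it.
interface SignalSource {
  once(signal: "SIGTERM", listener: () => void): unknown;
}

/**
 * Registers a SIGTERM handler that runs `cleanup` once and reports an exit code:
 * 0 on success, 1 if cleanup rejects or exceeds `timeoutMs`.
 */
function installGracefulShutdown(
  source: SignalSource,
  cleanup: () => Promise<void>,
  exit: (code: number) => void,
  timeoutMs: number = 10000,
): void {
  source.once("SIGTERM", () => {
    // Force a non-zero exit if cleanup hangs past the deadline.
    const timer = setTimeout(() => exit(1), timeoutMs);
    const finish = (code: number) => {
      clearTimeout(timer);
      exit(code);
    };
    cleanup().then(() => finish(0), () => finish(1));
  });
}
```

In an application this would be wired as installGracefulShutdown(process, cleanup, (code) => process.exit(code)), where cleanup is whatever function stops your app. In staging, kill -TERM with the process id delivers the signal.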
Resource Disposal Order
Requirement: Ensure resource disposal order is validated in integration tests
Why: Incorrect disposal order can cause connection leaks or errors.
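To make disposal order something a test can assert on, it helps to record the order in which resources were actually disposed. The sketch below assumes the common convention that resources are disposed in the reverse of their initialization order; ReverseDisposer is a hypothetical helper, not Runner's resource lifecycle.

```typescript
// Hypothetical helper: tracks resources in init order and disposes them in reverse.
type Disposer = () => Promise<void> | void;

class ReverseDisposer {
  private readonly entries: { name: string; dispose: Disposer }[] = [];

  /** Call right after a resource finishes initializing. */
  register(name: string, dispose: Disposer): void {
    this.entries.push({ name, dispose });
  }

  /** Disposes last-registered first and returns the order used, so tests can assert on it. */
  async disposeAll(): Promise<string[]> {
    const order: string[] = [];
    // splice(0) empties the list, so a second call disposes nothing twice.
    const pending = this.entries.splice(0).reverse();
    for (const entry of pending) {
      await entry.dispose();
      order.push(entry.name);
    }
    return order;
  }
}
```

An integration test can then compare the returned array against the expected order, for example an HTTP server being closed before the database it depends on.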
Observability
Without observability, you’re flying blind in production. These are the baseline requirements.
Structured Logging
Requirement: Emit structured logs with stable source ids
Log format:
- timestamp: ISO 8601
- level: debug/info/warn/error
- source: task/resource ID
- data: structured payload
- error: stack trace and details
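The log format above maps naturally onto a typed entry serialized as one JSON line. This is a minimal sketch: formatLogLine is a hypothetical helper, and the message field is an addition that the format list does not mention.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogEntry {
  timestamp: string; // ISO 8601
  level: LogLevel;
  source: string;    // stable task/resource id
  message: string;   // assumption: not part of the listed format
  data?: Record<string, unknown>;
  error?: { name: string; message: string; stack: string | undefined };
}

/** Builds one JSON log line. `now` is injectable so tests stay deterministic. */
function formatLogLine(
  level: LogLevel,
  source: string,
  message: string,
  details: { data?: Record<string, unknown>; error?: Error } = {},
  now: Date = new Date(),
): string {
  const entry: LogEntry = { timestamp: now.toISOString(), level, source, message };
  if (details.data) {
    entry.data = details.data;
  }
  if (details.error) {
    const { name, message: errorMessage, stack } = details.error;
    entry.error = { name, message: errorMessage, stack };
  }
  return JSON.stringify(entry);
}
```

Keeping source as a stable id rather than a free-form label is what lets log queries and dashboards survive refactors.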
Metrics Collection
Requirement: Track latency and error-rate metrics per critical task path
Key metrics to track:
- Request rate (requests/second)
- Error rate (errors/second)
- Latency (p50, p95, p99)
- Task execution time
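To make the metrics above concrete, the sketch below derives them from raw samples collected over a time window, using the nearest-rank method for percentiles. In practice a metrics library or backend usually computes these; the function names are hypothetical.

```typescript
/** Nearest-rank percentile over an unsorted sample of latencies in milliseconds. */
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing so integer percentiles produce exact ranks.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

/** Summarizes one task path over a window of `windowSeconds`. */
function summarizeWindow(latenciesMs: readonly number[], errorCount: number, windowSeconds: number) {
  return {
    requestsPerSecond: latenciesMs.length / windowSeconds,
    errorsPerSecond: errorCount / windowSeconds,
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
  };
}
```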
Distributed Tracing
Requirement: Export traces for cross-service flows
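How traces are exported depends on the tracing backend, which is not specified here. One backend-independent piece of cross-service tracing is propagating the trace context between services. The sketch below assumes the W3C Trace Context traceparent header (version 00); that choice and the helper names are assumptions, not something Runner is known to require.

```typescript
interface TraceContext {
  traceId: string;  // 32 lowercase hex characters
  spanId: string;   // 16 lowercase hex characters
  sampled: boolean; // lowest bit of the trace flags
}

const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

/** Parses a version-00 traceparent header; returns null when it is malformed. */
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_V00.exec(header.trim());
  if (!match) {
    return null;
  }
  const [, traceId, spanId, flags] = match;
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) {
    return null; // all-zero ids are invalid
  }
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

/** Serializes a context for the outgoing request to the next service. */
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}
```

A service that receives the header parses it, starts its own span under the same traceId, and sends a header carrying the new span id on outgoing calls.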
Baseline Alerts
Requirement: Configure baseline alerts for error-rate spikes and sustained p95 latency
Example alert rules:
| Metric | Threshold | Duration | Action |
|---|---|---|---|
| Error rate | > 5% | 5 minutes | Page on-call |
| p95 latency | > 1000ms | 10 minutes | Notify team |
| Availability | < 99.9% | 5 minutes | Page on-call |
| Memory usage | > 85% | 5 minutes | Auto-scale or alert |
Common alerting platforms:
- Datadog
- Prometheus + Alertmanager
- New Relic
- Sentry
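Each row of the alert rules table pairs a threshold with a duration: the alert fires only when the breach is sustained. The sketch below shows that evaluation in a simplified form, where every sample in the trailing window must breach the threshold. The alerting platforms listed implement their own, more nuanced versions of this, and all names here are hypothetical.

```typescript
interface AlertRule {
  metric: string;
  comparison: "above" | "below";
  threshold: number;
  durationMs: number; // how long the breach must persist
}

interface MetricSample {
  timestampMs: number;
  value: number;
}

/** True when the trailing window has data and every sample in it breaches the threshold. */
function isFiring(rule: AlertRule, samples: readonly MetricSample[], nowMs: number): boolean {
  const windowStart = nowMs - rule.durationMs;
  const recent = samples.filter((s) => s.timestampMs >= windowStart && s.timestampMs <= nowMs);
  if (recent.length === 0) {
    return false; // missing data is not treated as a breach in this sketch
  }
  return recent.every((s) =>
    rule.comparison === "above" ? s.value > rule.threshold : s.value < rule.threshold,
  );
}

// The "Error rate > 5% for 5 minutes" row expressed as a rule.
const ERROR_RATE_RULE: AlertRule = {
  metric: "error_rate_percent",
  comparison: "above",
  threshold: 5,
  durationMs: 5 * 60 * 1000,
};
```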
Operations
Health Checks
Requirement: Expose /health (or equivalent) and wire container/platform checks
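A health endpoint usually aggregates a few dependency checks into one status code that the container platform can probe. The sketch below covers only that aggregation step; runHealthChecks and the report shape are hypothetical, and the HTTP route that serves the result is left to whatever server the application already uses.

```typescript
type HealthCheck = () => Promise<void>; // resolves when healthy, rejects otherwise

interface HealthReport {
  statusCode: 200 | 503;
  body: { status: "ok" | "degraded"; checks: Record<string, "ok" | "failed"> };
}

/** Runs all checks concurrently and folds the results into a single report. */
async function runHealthChecks(checks: Record<string, HealthCheck>): Promise<HealthReport> {
  const entries = Object.entries(checks);
  // The async wrapper turns synchronous throws into rejections as well.
  const results = await Promise.allSettled(entries.map(async ([, check]) => check()));
  const outcome: Record<string, "ok" | "failed"> = {};
  entries.forEach(([name], i) => {
    outcome[name] = results[i].status === "fulfilled" ? "ok" : "failed";
  });
  const healthy = results.every((r) => r.status === "fulfilled");
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "degraded", checks: outcome },
  };
}
```

The /health handler would respond with report.statusCode and report.body serialized as JSON, so platform probes can rely on the status code alone.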
Runbooks
Requirement: Maintain runbooks for incident triage and rollback
Example runbook structure:
Escalation
- On-call: #oncall-team
- Engineering lead: @lead
- Incident commander: @ic
- Production deployment:
  - Deploy during low-traffic window
  - Use canary or blue-green deployment
  - Monitor metrics closely
  - Have rollback plan ready
Deployment Checklist
Use this final checklist before promoting to production:
| Category | Check | Status |
|---|---|---|
| Build | Node.js >= 18 | [ ] |
| Build | CI runs npm run qa | [ ] |
| Build | Using compiled output (not ts-node) | [ ] |
| Security | Tunnel auth configured | [ ] |
| Security | Task/event allow-lists defined | [ ] |
| Security | Payload limits set | [ ] |
| Security | Logs sanitized | [ ] |
| Reliability | Timeout/retry configured | [ ] |
| Reliability | Graceful shutdown tested | [ ] |
| Reliability | Resource disposal order validated | [ ] |
| Observability | Structured logging enabled | [ ] |
| Observability | Metrics collection configured | [ ] |
| Observability | Distributed tracing enabled | [ ] |
| Observability | Alerts configured | [ ] |
| Operations | Health check endpoint exposed | [ ] |
| Operations | Runbooks documented | [ ] |
| Operations | Release notes reviewed | [ ] |
Support and SLAs
For enterprise deployments with SLA requirements, see the Enterprise Support guide. Current support channels:
- Stable: 5.x (current feature line)
- Maintenance/LTS: 4.x (critical fixes only)
Additional Resources
- Observability Strategy: Detailed guide on logs, metrics, and traces
- Enterprise Support: Professional and enterprise support plans
- Migration Guide: Upgrading between Runner versions
- Troubleshooting: Common issues and solutions