Incident Handling
This guide outlines the complete incident response lifecycle for the Enterprise SOC, from initial detection through post-incident review. It covers the use of TheHive for case management, escalation procedures, and best practices for effective incident response.
Incident Response Workflow
Detection and Identification
Identify potential security incidents through:
- Automated alerts from Wazuh, Snort, Suricata
- Threat hunting activities
- User reports
- Third-party notifications
Initial actions:
- Acknowledge the alert in the monitoring system
- Perform initial triage to confirm it’s a genuine incident
- Gather preliminary information (affected systems, timeline, indicators)
Incident Classification
Classify the incident by:
- Type: Malware, unauthorized access, data breach, DDoS, insider threat, etc.
- Severity: Critical, High, Medium, Low
- Scope: Single endpoint, multiple systems, network-wide
- Impact: Confidentiality, integrity, availability
Then record the incident in TheHive:
- Open a new case with a descriptive title
- Set the appropriate severity and TLP
- Assign the case to an appropriate analyst
- Add initial observables (IPs, domains, hashes)
Containment
Implement containment measures to prevent spread:
Short-term Containment:
- Isolate affected systems from network
- Block malicious IPs/domains at firewall
- Disable compromised user accounts
- Implement emergency firewall rules
Long-term Containment:
- Apply temporary patches or workarounds
- Segment network to limit lateral movement
- Enhanced monitoring of related systems
- Prepare for eradication phase
Investigation and Analysis
Conduct thorough investigation:
- Collect forensic evidence from affected systems
- Analyze logs in Elasticsearch for full attack timeline
- Review Wazuh EDR data for endpoint artifacts
- Examine IDS/IPS logs for network indicators
- Identify root cause and attack vector
- Determine full scope of compromise
- Document all findings in TheHive
Automate enrichment where possible:
- Analyze file hashes with VirusTotal
- Check IP reputation with AbuseIPDB
- Investigate domains with passive DNS
- Extract IOCs automatically
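As an illustration of automatic IOC extraction, the sketch below pulls IPv4 addresses, SHA256 hashes, and domain names out of free text such as a log excerpt. The function name `extract_iocs` and the regular expressions are assumptions made for this example, not part of any tool named in this guide, and the patterns are intentionally simple.

```python
import ipaddress
import re

# Deliberately simple patterns. Real extraction also needs defanging support
# (for example "hxxp" or "[.]") and a public-suffix check, because the domain
# pattern will also match file names such as "report.pdf".
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted IOC candidates found in free text."""
    ips = set()
    for candidate in IPV4_RE.findall(text):
        try:
            ips.add(str(ipaddress.IPv4Address(candidate)))
        except ipaddress.AddressValueError:
            continue  # matched the pattern but is not a valid address
    hashes = {h.lower() for h in SHA256_RE.findall(text)}
    domains = {d.lower() for d in DOMAIN_RE.findall(text)}
    return {"ip": sorted(ips), "sha256": sorted(hashes), "domain": sorted(domains)}
```

Candidates produced this way still need analyst review before they are added to a case as observables.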
Eradication
Remove threat from environment:
- Remove malware from infected systems
- Delete unauthorized accounts or access
- Close exploited vulnerabilities
- Update security controls (IDS rules, firewall policies)
- Verify complete removal of threat
- Scan systems with updated antivirus
- Search logs for indicators of persistence
- Monitor for reinfection attempts
Recovery
Restore normal operations:
- Restore systems from clean backups if necessary
- Rebuild compromised systems from known-good images
- Reset credentials for affected accounts
- Gradually restore services with enhanced monitoring
- Verify business operations are normal
- Continue monitoring for 72+ hours post-recovery
Post-Incident Review
Conduct lessons learned session:
- Document complete incident timeline
- Analyze response effectiveness
- Identify improvements for detection and response
- Update playbooks and procedures
- Share findings with stakeholders
- Implement preventive measures
- Close TheHive case with final report
Using TheHive for Case Management
Creating a New Case
Create New Case
Click “New Case” and fill in required fields:
- Title: Descriptive incident name (e.g., “Malware Detection on HR-WKS-042”)
- Severity: Critical/High/Medium/Low
- TLP: Traffic Light Protocol classification (Red/Amber/Green/White; TLP 2.0 renames White to Clear)
- PAP: Permissible Actions Protocol
- Description: Detailed incident summary
- Tags: Categorization tags (malware, phishing, data-breach, etc.)
Add Observables
Add indicators of compromise:
- IP addresses (source and destination)
- Domain names
- File hashes (MD5, SHA1, SHA256)
- Email addresses
- URLs
- Filenames
- Registry keys
- User accounts
TheHive integrates automatically with Wazuh for high-severity alerts. Configure alert thresholds to auto-create cases for critical events.
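The case fields described above can also be assembled programmatically, for example by a script that turns an alert into a case. The sketch below only builds and prints the dictionary; it does not call TheHive. The numeric encodings for severity, TLP, and PAP are assumptions based on how TheHive commonly represents them and should be checked against the API documentation for your version. `build_case` is a hypothetical helper.

```python
import json

# Assumed numeric encodings; confirm them against your TheHive version.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}
TLP = {"white": 0, "green": 1, "amber": 2, "red": 3}
PAP = {"white": 0, "green": 1, "amber": 2, "red": 3}


def build_case(title, description, severity, tlp="amber", pap="amber", tags=()):
    """Build a case dictionary from the fields listed in this section."""
    if not title.strip():
        raise ValueError("title must not be empty")
    return {
        "title": title,
        "description": description,
        "severity": SEVERITY[severity.lower()],
        "tlp": TLP[tlp.lower()],
        "pap": PAP[pap.lower()],
        "tags": sorted(set(tags)),
    }


case = build_case(
    title="Malware Detection on HR-WKS-042",
    description="Wazuh alert: suspicious executable on an HR workstation.",
    severity="High",
    tags=["malware", "endpoint"],
)
print(json.dumps(case, indent=2))
```

Keeping the label-to-number mapping in one place makes it easier to adjust if your deployment uses different values.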
Using Cortex for Analysis
Cortex provides automated analysis and response capabilities:
Threat Intelligence
Query multiple threat intel sources (VirusTotal, AbuseIPDB, OTX) for observable reputation
File Analysis
Analyze suspicious files with sandboxes and static analysis tools
Domain Investigation
Perform WHOIS, passive DNS, and reputation checks on domains
Response Actions
Execute automated containment through responders (block IP, quarantine endpoint)
To run analyzers on an observable:
- Select an observable in the TheHive case
- Click “Run Analyzers”
- Choose relevant analyzers (e.g., VirusTotal for hash, MaxMind for IP)
- Review analyzer reports when complete
- Use results to inform investigation decisions
Commonly used analyzers:
- VirusTotal_GetReport: Check file/URL/domain/IP reputation
- AbuseIPDB: IP address abuse history
- Shodan: Internet exposure analysis
- MaxMind: IP geolocation
- MISP: Query MISP threat intelligence platform
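Choosing analyzers by observable type can be written down as a small routing table. The sketch below is illustrative only: the analyzer names come from the list above, but which data types each analyzer accepts, and which analyzers count as external, are assumptions to verify in your Cortex instance. The rule of skipping external services for TLP Red observables is a design choice for this example, not a documented SOC policy.

```python
# Hypothetical routing table: observable data type -> analyzers to run.
ANALYZERS_BY_TYPE = {
    "hash": ["VirusTotal_GetReport", "MISP"],
    "ip": ["VirusTotal_GetReport", "AbuseIPDB", "Shodan", "MaxMind", "MISP"],
    "domain": ["VirusTotal_GetReport", "MISP"],
    "url": ["VirusTotal_GetReport", "MISP"],
}

# Assumption: these analyzers send the observable to a third-party service.
EXTERNAL_ANALYZERS = {"VirusTotal_GetReport", "AbuseIPDB", "Shodan"}


def analyzers_for(data_type: str, tlp: str = "amber") -> list[str]:
    """Return analyzers for a data type, skipping external ones for TLP Red."""
    candidates = ANALYZERS_BY_TYPE.get(data_type, [])
    if tlp.lower() == "red":
        return [name for name in candidates if name not in EXTERNAL_ANALYZERS]
    return list(candidates)
```

An unknown data type returns an empty list, which signals that the analyst should pick analyzers manually.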
Case Documentation Best Practices
What to Document:
- Timeline: Precise timestamps for all events and actions
- Evidence: Screenshots, log excerpts, forensic artifacts
- Analysis: Your thought process and investigative steps
- Actions Taken: Every containment, eradication, and recovery action
- Communications: Stakeholder notifications and approvals
- Outcomes: Resolution status and lessons learned
Incident Severity Classification
Critical Severity
Criteria:
- Active data exfiltration in progress
- Ransomware encryption of critical systems
- Complete compromise of domain controller or core infrastructure
- Confirmed APT or nation-state activity
- Public-facing breach with customer data exposure
High Severity
Criteria:
- Malware detected on multiple systems
- Successful exploitation of critical vulnerability
- Unauthorized access to sensitive data
- Privilege escalation to administrative level
- Confirmed command and control communication
Medium Severity
Criteria:
- Malware detected on isolated endpoint
- Suspicious activity requiring investigation
- Policy violations with security implications
- Failed exploitation attempts
- Anomalous network traffic
Low Severity
Criteria:
- Minor policy violations
- Informational security events
- False positive confirmation
- Routine security operations
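One way to apply these criteria consistently is to map each observed indicator to a severity and take the highest. The sketch below encodes a subset of the criteria above. The indicator keys are hypothetical names invented for this example, and defaulting to Low when nothing matches is an assumption.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical indicator keys, each tied to one criterion from the lists above.
CRITERIA = {
    "active_exfiltration": Severity.CRITICAL,
    "ransomware_on_critical_systems": Severity.CRITICAL,
    "domain_controller_compromised": Severity.CRITICAL,
    "malware_on_multiple_systems": Severity.HIGH,
    "privilege_escalation_to_admin": Severity.HIGH,
    "c2_communication_confirmed": Severity.HIGH,
    "malware_on_isolated_endpoint": Severity.MEDIUM,
    "failed_exploitation_attempt": Severity.MEDIUM,
    "anomalous_network_traffic": Severity.MEDIUM,
    "minor_policy_violation": Severity.LOW,
    "false_positive_confirmed": Severity.LOW,
}


def classify(indicators) -> Severity:
    """Return the highest severity among the observed indicators."""
    unknown = set(indicators) - CRITERIA.keys()
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    # Assumption: with no indicators at all, treat the event as Low.
    return max((CRITERIA[name] for name in indicators), default=Severity.LOW)
```

For example, `classify(["malware_on_isolated_endpoint", "c2_communication_confirmed"])` yields `Severity.HIGH`, because the most severe matching criterion wins.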
Escalation Procedures
When to Escalate
Tier 1 to Tier 2 Escalation
Escalate when:
- Incident complexity exceeds Tier 1 capabilities
- Deep forensic analysis required
- Advanced malware analysis needed
- Incident duration exceeds 4 hours without resolution
- Multiple systems affected
Escalation steps:
- Document all findings in TheHive
- Assign case to Tier 2 queue
- Provide verbal briefing to Tier 2 analyst
- Remain available for questions
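The Tier 1 to Tier 2 criteria are concrete enough to check mechanically. The following sketch returns the reasons an incident qualifies for escalation. The `Incident` fields are hypothetical and would need to be mapped to whatever your case records actually contain; the judgment-based criterion (complexity exceeding Tier 1 capabilities) is left to the analyst.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TIER1_TIME_LIMIT = timedelta(hours=4)  # taken from the criteria above


@dataclass
class Incident:
    # Hypothetical fields; map them to your own case data.
    opened_at: datetime
    affected_systems: int = 1
    needs_deep_forensics: bool = False
    needs_malware_analysis: bool = False


def tier2_escalation_reasons(incident: Incident, now: datetime) -> list[str]:
    """Return every Tier 2 criterion met; an empty list means stay at Tier 1."""
    reasons = []
    if now - incident.opened_at > TIER1_TIME_LIMIT:
        reasons.append("unresolved for more than 4 hours")
    if incident.affected_systems > 1:
        reasons.append("multiple systems affected")
    if incident.needs_deep_forensics:
        reasons.append("deep forensic analysis required")
    if incident.needs_malware_analysis:
        reasons.append("advanced malware analysis needed")
    return reasons
```

Returning the reasons rather than a plain yes/no gives the Tier 1 analyst text that can go straight into the TheHive case before handoff.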
Tier 2 to Tier 3 / Management Escalation
Escalate when:
- Critical infrastructure compromised
- Data breach confirmed or suspected
- Legal or regulatory implications
- External support needed (vendors, law enforcement)
- Incident declared as “major incident”
Escalation steps:
- Update TheHive case with severity increase
- Notify SOC Manager via phone/SMS
- Prepare executive summary
- Join incident bridge call if activated
External Escalation
Escalate externally when:
- Law enforcement involvement required
- Regulatory reporting obligations (GDPR, HIPAA, etc.)
- Customer notification needed
- Vendor/partner notification required
- Cyber insurance claim activation
Escalation steps:
- Obtain management approval
- Follow legal/compliance notification procedures
- Coordinate with public relations if media involved
- Document all external communications
Escalation Contact Matrix
| Severity | Initial Contact | Timeframe | Additional Notifications |
|---|---|---|---|
| Critical | SOC Manager + CISO | Immediate | Executive team, Legal, PR |
| High | SOC Manager | Within 30 min | System owners, IT Management |
| Medium | SOC Team Lead | Within 2 hours | Affected department heads |
| Low | Document in system | Next business day | None required |
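The matrix can be kept as data so that tooling and analysts read the same source. The sketch below encodes the four rows and computes a notification deadline from the detection time. The function name `escalation_plan` is hypothetical, and "next business day" is left uncomputed because it depends on the local calendar.

```python
from datetime import datetime, timedelta

# Rows of the matrix above: (initial contact, time window, additional notifications).
# A window of None stands for "next business day", which is not computed here.
ESCALATION_MATRIX = {
    "critical": ("SOC Manager + CISO", timedelta(0), ["Executive team", "Legal", "PR"]),
    "high": ("SOC Manager", timedelta(minutes=30), ["System owners", "IT Management"]),
    "medium": ("SOC Team Lead", timedelta(hours=2), ["Affected department heads"]),
    "low": ("Document in system", None, []),
}


def escalation_plan(severity: str, detected_at: datetime) -> dict:
    """Look up who to contact and by when for a given severity."""
    contact, window, extra = ESCALATION_MATRIX[severity.lower()]
    return {
        "contact": contact,
        "notify_by": None if window is None else detected_at + window,
        "also_notify": extra,
    }
```

For a High severity incident detected at 14:00, this gives a notification deadline of 14:30 with the SOC Manager as the initial contact.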
Incident Response Playbooks
Malware Incident
Immediate Actions
- Isolate infected system from network (disconnect network cable or disable in firewall)
- Create TheHive case with malware sample hash
- Run Cortex analyzers on file hash (VirusTotal, reverse.it)
- Identify other systems with same IOCs using Wazuh/Elasticsearch
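Finding other systems with the same IOCs usually comes down to one Elasticsearch query. The sketch below builds a Query DSL body that matches any of the given hashes or IPs within a time window and groups hits by host. The field names follow the Elastic Common Schema and are assumptions: Wazuh indices often use different field names, so adjust them to your mapping. `build_ioc_query` is a hypothetical helper and only returns the request body.

```python
def build_ioc_query(sha256_hashes=(), ips=(), since="now-7d"):
    """Build an Elasticsearch query body that lists hosts matching any IOC."""
    should = []
    if sha256_hashes:
        # Assumed ECS field name; adjust to your index mapping.
        should.append({"terms": {"file.hash.sha256": list(sha256_hashes)}})
    if ips:
        should.append({"terms": {"destination.ip": list(ips)}})
    if not should:
        raise ValueError("provide at least one hash or IP")
    return {
        "size": 0,  # only the per-host aggregation is needed
        "query": {
            "bool": {
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
                "should": should,
                "minimum_should_match": 1,
            }
        },
        "aggs": {"hosts": {"terms": {"field": "host.name", "size": 100}}},
    }
```

The resulting dictionary can be sent as the body of a search request against the relevant index pattern; each bucket in the `hosts` aggregation is a system to triage.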
Containment
- Block C2 domains/IPs at firewall and DNS
- Update IDS/IPS signatures for detection
- Push Wazuh rule to detect malware across all endpoints
- Search for related IOCs (file paths, registry keys, network connections)
Eradication
- Run full antivirus scan with updated signatures
- Remove malware files and persistence mechanisms
- Reset credentials for accounts used on infected system
- Apply missing patches that may have been exploited
Phishing Incident
Initial Response
- Obtain copy of phishing email (forward as attachment to preserve headers)
- Create TheHive case with email observables (sender, URLs, attachments)
- Check if users clicked links or opened attachments
- Search email logs for other recipients
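Collecting the email observables by hand is error-prone, so a small parser can help. The sketch below uses only the Python standard library to read a saved message (for example, the attachment forwarded by the user) and return the sender, subject, URLs, and attachment hashes. The function name `email_observables` and the URL pattern are assumptions for this example; the pattern is simple and will miss obfuscated links.

```python
import hashlib
import re
from email import message_from_bytes, policy

# Simple pattern: stops at whitespace, quotes, and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def email_observables(raw: bytes) -> dict:
    """Extract sender, subject, URLs, and attachment hashes from a raw email."""
    msg = message_from_bytes(raw, policy=policy.default)
    urls = set()
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        if part.is_attachment():
            payload = part.get_payload(decode=True) or b""
            attachments.append(
                {
                    "filename": part.get_filename(),
                    "sha256": hashlib.sha256(payload).hexdigest(),
                }
            )
        elif part.get_content_maintype() == "text":
            urls.update(URL_RE.findall(part.get_content()))
    return {
        "sender": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "urls": sorted(urls),
        "attachments": attachments,
    }
```

The returned values map directly onto TheHive observable types (email address, URL, filename, hash). Note that the `From` header is easily spoofed, so the header analysis in the investigation steps is still required.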
Containment
- Block sender address and domain at email gateway
- Delete phishing emails from all mailboxes
- Block malicious URLs in web proxy/firewall
- Reset credentials for users who entered passwords
Investigation
- Analyze email headers for origin
- Analyze attachments in sandbox (use Cortex analyzers)
- Check suspicious URLs with urlscan.io
- Review logs for credential use post-phishing
- Check for data exfiltration or account compromise
Data Breach Incident
Immediate Containment
- Identify and stop ongoing exfiltration
- Isolate compromised systems
- Preserve evidence (memory dumps, disk images, logs)
- Create TheHive case with “data-breach” tag
- Notify management immediately
Scope Assessment
- Identify what data was accessed/exfiltrated
- Determine number of records affected
- Classify data sensitivity (PII, PHI, financial, trade secrets)
- Establish timeline of unauthorized access
- Identify affected individuals/customers
Legal and Regulatory
- Engage legal counsel
- Assess regulatory notification requirements
- Prepare notification templates (customers, regulators, media)
- Activate cyber insurance if applicable
- Consider law enforcement notification
Post-Incident Review Process
Post-incident reviews are learning opportunities, not blame sessions. Focus on process improvement, not individual mistakes.
Review Meeting Agenda
Incident Overview (10 minutes)
- Present incident timeline
- Describe attack vector and techniques
- Summarize impact and scope
- Review response timeline
What Went Well (15 minutes)
- Effective detection mechanisms
- Successful containment actions
- Good communication and collaboration
- Useful tools and processes
What Went Wrong (20 minutes)
- Delayed detection or response
- Missing visibility or monitoring
- Tool or process failures
- Communication breakdowns
- Documentation gaps
Review Documentation
Create a formal post-incident report including:
- Executive Summary: High-level overview for management
- Detailed Timeline: Complete event sequence
- Technical Analysis: Attack methods, tools, and IOCs
- Impact Assessment: Business impact, costs, data affected
- Response Evaluation: What worked and what didn’t
- Recommendations: Specific improvements with priorities
- Action Plan: Assigned tasks with deadlines
Communication During Incidents
Internal Communications
SOC Team:
- Use dedicated Slack/Teams channel for real-time coordination
- Update TheHive case frequently with progress
- Hold standup calls every 2-4 hours for major incidents
Management:
- Provide initial notification within 30 minutes of a critical incident
- Send status updates every 2 hours or when major developments occur
- Keep updates concise: status, impact, next steps, ETA
End Users:
- Notify of service disruptions promptly
- Provide clear guidance on required actions
- Set expectations for resolution timeframe
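To keep status updates concise and uniform, the four elements named above (status, impact, next steps, ETA) can be rendered from a fixed template. This is a minimal sketch; the `StatusUpdate` class, its field names, and the example case identifier are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StatusUpdate:
    # Hypothetical structure mirroring: status, impact, next steps, ETA.
    case_id: str
    status: str
    impact: str
    next_steps: list[str]
    eta: str

    def render(self) -> str:
        """Render the update as plain text for chat or email."""
        lines = [
            f"[{self.case_id}] Incident status update",
            f"Status: {self.status}",
            f"Impact: {self.impact}",
            "Next steps:",
        ]
        lines += [f"  - {step}" for step in self.next_steps]
        lines.append(f"ETA: {self.eta}")
        return "\n".join(lines)


update = StatusUpdate(
    case_id="CASE-EXAMPLE",
    status="Contained",
    impact="One HR workstation offline",
    next_steps=["Reimage host", "Reset user credentials"],
    eta="Service restored by end of day",
)
print(update.render())
```

A fixed layout lets recipients find the same information in the same place in every update.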
External Communications
Guidelines:
- Coordinate with PR/communications team
- Stick to approved messaging
- Never speculate or provide unconfirmed information
- Refer media inquiries to designated spokesperson
- Document all external communications
Metrics and Reporting
Track key incident response metrics:
- Mean Time to Detect (MTTD): Time from incident occurrence to detection
- Mean Time to Respond (MTTR): Time from detection to containment
- Mean Time to Resolve (also abbreviated MTTR; spell out the name in reports to avoid confusion with Mean Time to Respond): Time from detection to full resolution
- Incident Count by Type: Trends in incident categories
- Incident Count by Severity: Distribution of severity levels
- False Positive Rate: Alerts that were not incidents
- Escalation Rate: Percentage requiring escalation
Regular metric review helps identify areas for improvement in detection and response capabilities.
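The time-based metrics follow directly from four timestamps per incident. The sketch below computes them using the definitions given above; the record keys (`occurred`, `detected`, `contained`, `resolved`) are hypothetical names for this example and would come from your case data in practice.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_interval(incidents, start_key: str, end_key: str) -> timedelta:
    """Average the time between two timestamps across incidents."""
    seconds = [(i[end_key] - i[start_key]).total_seconds() for i in incidents]
    return timedelta(seconds=mean(seconds))


# Hypothetical incident records with the four timestamps of interest.
incidents = [
    {
        "occurred": datetime(2024, 1, 1, 8, 0),
        "detected": datetime(2024, 1, 1, 9, 0),
        "contained": datetime(2024, 1, 1, 10, 0),
        "resolved": datetime(2024, 1, 1, 14, 0),
    },
    {
        "occurred": datetime(2024, 1, 2, 8, 0),
        "detected": datetime(2024, 1, 2, 8, 30),
        "contained": datetime(2024, 1, 2, 11, 30),
        "resolved": datetime(2024, 1, 2, 12, 30),
    },
]

mttd = mean_interval(incidents, "occurred", "detected")          # 0:45:00
time_to_respond = mean_interval(incidents, "detected", "contained")  # 2:00:00
time_to_resolve = mean_interval(incidents, "detected", "resolved")   # 4:30:00
print(mttd, time_to_respond, time_to_resolve)
```

A mean is sensitive to a single long-running incident, so reporting the median alongside it can give a more representative picture.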
Related Resources
- Monitoring Guide - Daily monitoring operations and alert management
- Threat Hunting - Proactive threat detection techniques
- Maintenance - System maintenance and tuning procedures
