Types of Red Team Engagements
Introduction
Not every organization needs the same style of red team engagement, and selecting the wrong model wastes budget while leaving critical gaps unexplored. A startup with a flat network and ten employees has fundamentally different testing needs than a multinational bank with segmented OT environments and a 24/7 SOC. This page breaks down the major engagement types, explains when each is appropriate, and provides practical guidance for scoping and execution.
Before diving in, make sure you are comfortable with the core concepts covered in Red Team Fundamentals. The terminology, rules of engagement, and planning frameworks introduced there apply to every engagement type discussed below.
Engagement Type Selection Decision Tree
Use this decision tree as a starting point when deciding which engagement model fits your organization’s maturity, goals, and constraints.
```mermaid
flowchart TD
    A[What is the primary goal?] --> B{Test full attack lifecycle?}
    A --> C{Validate detection & response?}
    A --> D{Protect specific critical assets?}
    A --> E{Train leadership & IR teams?}
    A --> F{Understand APT exposure?}
    A --> G{Ongoing security validation?}
    A --> H{Test wireless attack surface?}
    A --> I{Assess third-party risk?}
    B -->|Yes| B1[Full-Scope Red Team]
    C -->|Yes| C1{Mature SOC in place?}
    C1 -->|Yes| C2[Assumed Breach]
    C1 -->|No| C3[Start with Purple Team]
    D -->|Yes| D1[Objective-Based / Crown Jewel]
    E -->|Yes| E1[Tabletop Exercise]
    F -->|Yes| F1[Adversary Emulation]
    G -->|Yes| G1[Continuous Red Teaming]
    H -->|Yes| H1[Wireless Red Teaming]
    I -->|Yes| I1[Supply Chain Red Teaming]
    B1 --> Z[Define Rules of Engagement]
    C2 --> Z
    D1 --> Z
    E1 --> Z
    F1 --> Z
    G1 --> Z
    H1 --> Z
    I1 --> Z
```
1. Full-Scope Red Team
Description
A full-scope red team engagement is the most comprehensive and realistic form of adversarial testing. The red team operates as a genuine external (or insider) threat actor, beginning with open-source intelligence (OSINT) gathering and progressing through initial access, persistence, lateral movement, privilege escalation, and ultimately mission impact — such as exfiltrating sensitive data or disrupting critical systems.
The engagement exercises every layer of defense: perimeter controls, endpoint detection, network monitoring, identity and access management, physical security, and human awareness. The blue team (defenders) is typically unaware the engagement is taking place, which provides a true measure of detection and response capability.
When to Use
- The organization has a reasonably mature security program and wants a realistic end-to-end assessment.
- Leadership wants to understand the full blast radius of a motivated attacker.
- Previous penetration tests or vulnerability assessments have been completed, and the organization is ready for a higher-fidelity test.
- Regulatory or compliance frameworks require adversarial simulation (e.g., TIBER-EU, CBEST, iCAST).
Typical Scope
- Phases: OSINT, social engineering (phishing, vishing, pretexting), physical intrusion attempts, technical exploitation, post-exploitation, data exfiltration, reporting.
- Targets: All externally reachable assets, employees (social engineering), physical facilities, cloud environments, partner/vendor portals.
- Exclusions: Typically excludes destructive actions (ransomware deployment, data destruction) unless explicitly authorized. Safety-critical systems (ICS/SCADA, medical devices) are often out of scope or require additional safety controls.
Duration
4 to 8 weeks on average, though complex engagements for large enterprises can extend to 12 weeks. The breakdown often looks like:
| Phase | Duration |
|---|---|
| OSINT & Reconnaissance | 1–2 weeks |
| Initial Access Attempts | 1–2 weeks |
| Post-Exploitation & Lateral Movement | 1–2 weeks |
| Impact & Objective Completion | 0.5–1 week |
| Reporting & Debrief | 1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| Most realistic assessment of organizational risk | Most expensive engagement type |
| Tests all defense layers end-to-end | Requires significant planning and coordination |
| Reveals gaps that siloed testing misses | Longer timeline before actionable results |
| Exercises blue team detection under real conditions | Risk of operational disruption if controls fail |
| Provides compelling narrative for executive reporting | Requires highly skilled operators |
Example Scenario
A financial services firm engages a red team for an 8-week full-scope assessment. The team begins with OSINT, discovering employee names and roles on LinkedIn, leaked credentials on paste sites, and details about internal technology from job postings. They craft a targeted phishing campaign impersonating the firm’s HR benefits portal and gain initial access when an employee enters credentials on a cloned page. From there, the team pivots through the internal network, exploits a misconfigured service account with excessive Active Directory privileges, obtains domain admin, and exfiltrates a sample of customer PII to a controlled external server — all without triggering a single alert. The final report maps every finding to MITRE ATT&CK techniques (see MITRE ATT&CK Mapping) and includes a step-by-step attack narrative.
2. Assumed Breach
Description
An assumed breach engagement starts from a position where the attacker already has a foothold inside the environment. Instead of spending weeks on OSINT, social engineering, and initial access, the red team is given a starting point — such as a workstation with standard user credentials, a VPN connection, or a compromised web application session. This allows the engagement to focus entirely on post-exploitation: privilege escalation, lateral movement, persistence, and objective completion.
The model acknowledges a fundamental truth: initial access is almost always achievable given enough time and resources. Assumed breach tests what happens after the inevitable occurs.
When to Use
- The organization has a mature security posture and wants to stress-test internal detection and response capabilities.
- Previous full-scope engagements have demonstrated that initial access is reliably achievable.
- Budget or timeline constraints prevent a full-scope engagement, but the organization still wants meaningful adversarial testing.
- The SOC or DFIR team wants to validate their ability to detect lateral movement, credential abuse, and data staging.
- Post-merger integration: testing whether a newly acquired company’s network is adequately segmented.
Typical Scope
- Starting Position: Standard domain user workstation, VPN credentials, compromised web application, cloud tenant user, or insider access.
- Objectives: Escalate privileges, move laterally to sensitive segments, access crown jewel assets, establish persistence, exfiltrate data.
- Out of Scope: External reconnaissance, social engineering, physical access testing (these are bypassed by the starting assumptions).
Duration
2 to 4 weeks, significantly shorter than full-scope since the initial access phase is skipped.
| Phase | Duration |
|---|---|
| Environment Familiarization | 2–3 days |
| Privilege Escalation & Lateral Movement | 1–2 weeks |
| Objective Completion & Impact | 0.5–1 week |
| Reporting & Debrief | 0.5–1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| Faster time to actionable results | Does not test perimeter or initial access controls |
| Lower cost than full-scope | Misses social engineering and physical security gaps |
| Focuses on highest-impact post-exploitation gaps | Starting assumptions may not reflect realistic scenarios |
| Excellent for validating SOC detection capabilities | Less compelling executive narrative (no “how they got in” story) |
| Repeatable — easy to run quarterly | May underestimate the difficulty of actual initial access |
Example Scenario
A healthcare organization provides the red team with a standard employee workstation connected to the corporate network and Active Directory credentials for a non-privileged user in the nursing department. Within four days, the team discovers a misconfigured Group Policy Preferences file containing a local admin password, uses it to move laterally to a file server in the research department, and finds unencrypted patient records. They stage the data and exfiltrate it over DNS tunneling. The SOC detects the DNS anomaly on day six but fails to contain it before the exfiltration completes. The engagement directly drives improvements to the SOC’s DNS monitoring playbooks and AD hygiene practices.
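The DNS tunneling gap in this scenario is a common one. As an illustrative sketch of the kind of heuristic a SOC might add to its DNS monitoring playbook — flagging query names whose leftmost label is unusually long or random-looking, both traits of encoded exfiltration payloads — something like the following could serve as a starting point (the thresholds are assumptions, not tuned production values):

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character in the string."""
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_like_dns_tunneling(qname: str,
                             max_label_len: int = 40,
                             entropy_threshold: float = 3.5) -> bool:
    """Flag a DNS query name whose leftmost label is suspiciously long
    or high-entropy. Thresholds are illustrative defaults only."""
    label = qname.split(".")[0]
    if len(label) > max_label_len:
        return True
    return len(label) > 12 and shannon_entropy(label) > entropy_threshold

# A benign hostname vs. a base32-style exfiltration chunk
print(looks_like_dns_tunneling("mail.example.com"))                       # False
print(looks_like_dns_tunneling("mzxw6ytboi2dqmjzgiztinjw.evil.example"))  # True
```

A real deployment would also baseline per-domain query volume and unique-subdomain counts, since tunneling tools can keep individual labels short and low-entropy at the cost of many more queries.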
3. Objective-Based / Crown Jewel
Description
An objective-based engagement (sometimes called “crown jewel” testing) defines specific, high-value targets that the red team must reach. Rather than broadly exploring the environment, the team focuses its effort on achieving predefined objectives that represent the organization’s worst-case scenarios. Success criteria are established before the engagement begins, making results clear-cut and directly tied to business risk.
Common objectives include: obtaining domain admin privileges, accessing the CEO’s email, exfiltrating a production database, initiating a wire transfer in a test environment, accessing source code repositories, or compromising a CI/CD pipeline.
When to Use
- The organization has identified specific critical assets or processes it needs to protect above all else.
- Board or executive leadership wants a clear answer to “Can an attacker do X?”
- Budget is limited and needs to be focused on the highest-risk areas.
- Compliance or regulatory requirements mandate testing of specific controls (e.g., PCI DSS cardholder data environment isolation).
- Previous assessments have been too broad, and findings were not actionable.
Typical Scope
- Objectives: 3 to 5 specific, measurable goals agreed upon during scoping.
- Attack Path: The red team determines the attack path — any method is fair game (OSINT, social engineering, technical exploitation) unless explicitly excluded.
- Boundaries: The team may have broad scope for methods but narrow scope for targets.
Duration
2 to 6 weeks, depending on the number of objectives and whether initial access is in scope.
| Phase | Duration |
|---|---|
| Objective Definition & Scoping | 0.5–1 week |
| Reconnaissance & Planning | 0.5–1 week |
| Execution | 1–3 weeks |
| Reporting & Debrief | 0.5–1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| Results directly tied to business risk | Narrow focus may miss vulnerabilities outside the attack path |
| Clear success/failure criteria | Risk of tunnel vision — team ignores interesting side findings |
| Easy for non-technical stakeholders to understand | May not exercise blue team broadly |
| Efficient use of budget | Defining the right objectives requires organizational maturity |
| Drives targeted remediation | Success depends heavily on objective selection |
Example Scenario
A manufacturing company defines three objectives: (1) access the ERP system’s financial module with approval authority, (2) exfiltrate product design files from the engineering network, and (3) compromise the building access control system. The red team achieves objectives one and two within three weeks — the ERP system is reached through a compromised vendor VPN account, and engineering files are accessed after exploiting a Jenkins server with hardcoded credentials in the CI/CD pipeline. Objective three fails: the physical access control system is on an air-gapped network with no viable bridge. The result gives the organization a clear picture: financial and intellectual property controls need immediate attention, while physical access controls are adequately isolated.
4. Tabletop Exercises
Description
A tabletop exercise (TTX) is a discussion-based simulation where participants walk through a hypothetical attack scenario without any live testing. A facilitator presents an evolving scenario in stages (called “injects”), and participants — typically from IT, security, legal, communications, and executive leadership — discuss how they would detect, respond to, and recover from the situation.
Tabletop exercises are not technically “red teaming” in the traditional sense, but they are a critical component of the adversarial testing spectrum. They test people and processes rather than technology, and they are often the most effective way to engage executive leadership in security discussions.
When to Use
- The organization is not yet mature enough for live red team testing but wants to exercise incident response capabilities.
- Executive leadership, legal, or communications teams need to practice their roles during a security incident.
- New incident response plans or playbooks need to be validated before a real incident tests them.
- Regulatory requirements mandate incident response exercises (e.g., NIST CSF, ISO 27001, HIPAA).
- As a precursor to a live red team engagement, to ensure the organization is ready and to refine scope.
Typical Scope
- Participants: 8 to 20 people from security, IT operations, legal, communications, HR, and executive leadership.
- Scenario: A realistic, multi-phase attack narrative tailored to the organization’s industry and threat landscape.
- Injects: 5 to 10 escalating scenario developments that force participants to make decisions under uncertainty.
- Deliverables: Observations report documenting gaps in processes, communication, and decision-making.
Duration
2 to 4 hours for the exercise itself, plus 1 to 2 weeks for preparation and reporting.
| Phase | Duration |
|---|---|
| Scenario Development | 1–2 weeks |
| Exercise Facilitation | 2–4 hours |
| After-Action Report | 1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| No risk of operational disruption | Does not test technical controls |
| Engages non-technical stakeholders | Participants may give “textbook” answers |
| Low cost relative to live engagements | Results depend heavily on facilitator skill |
| Reveals process and communication gaps | No proof of exploitability |
| Builds organizational muscle memory | Can feel abstract without live demonstration |
Example Scenario
A retail company conducts a tabletop exercise simulating a ransomware attack during the holiday shopping season. The scenario begins with the SOC receiving alerts about unusual encryption activity on point-of-sale systems. Injects progressively reveal that the attackers have exfiltrated customer payment card data, that the backup system was also compromised, and that a ransom note has been posted to the company’s public website. Participants must decide: Do they pay the ransom? When do they notify payment card brands? How do they communicate with customers? The exercise reveals that the legal team was unaware of the 72-hour GDPR notification requirement and that the communications team had no pre-drafted holding statement for a data breach. These gaps are addressed immediately.
5. Adversary Emulation
Description
Adversary emulation is a threat-intelligence-driven engagement where the red team replicates the specific tactics, techniques, and procedures (TTPs) of a known threat actor. Rather than using whatever works, the team constrains itself to the tools, methods, and behaviors documented for a particular adversary group — such as APT29 (Cozy Bear), APT28 (Fancy Bear), FIN7, Lazarus Group, or a sector-specific threat.
The engagement is mapped directly to the MITRE ATT&CK framework, with each step corresponding to specific technique IDs. This creates a structured test that answers a precise question: “If [specific threat actor] targeted us, would our defenses detect and stop them?”
MITRE’s own Center for Threat-Informed Defense publishes adversary emulation plans that can serve as blueprints. Open-source tools such as MITRE CALDERA and Red Canary’s Atomic Red Team, along with commercial platforms like SCYTHE, provide automation support for emulation exercises.
When to Use
- The organization has identified specific threat actors relevant to its industry or geopolitical exposure.
- Threat intelligence indicates active campaigns targeting the organization’s sector.
- The SOC wants to validate detection coverage against specific TTP sets.
- Regulatory frameworks require threat-intelligence-led testing (TIBER-EU, CBEST, AASE).
- The organization wants structured, repeatable tests that track detection improvement over time.
Typical Scope
- Threat Actor Selection: Based on industry, geography, and current threat landscape. Common choices include APT29 for government/defense, FIN7 for retail/hospitality, APT41 for technology/healthcare.
- TTP Mapping: Each phase of the engagement maps to MITRE ATT&CK techniques documented for the selected actor.
- Tooling: The red team uses tools and techniques consistent with the emulated actor (e.g., Cobalt Strike for APT29, Carbanak for FIN7) or open-source equivalents.
- Detection Validation: Each executed technique is logged and checked against the blue team’s detection capability.
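The technique-by-technique detection validation described above can be tracked with something as simple as a list of executed techniques and SOC outcomes. A hypothetical sketch — the ATT&CK IDs are real, but the data structure and the pass/fail results are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class EmulatedTechnique:
    technique_id: str   # MITRE ATT&CK technique ID
    name: str
    detected: bool      # did the blue team alert on this step?

def detection_coverage(results: list[EmulatedTechnique]) -> float:
    """Fraction of executed techniques the SOC detected."""
    return sum(t.detected for t in results) / len(results)

# Invented outcomes for a small APT29-style run
run = [
    EmulatedTechnique("T1566.002", "Spearphishing Link", True),
    EmulatedTechnique("T1059.001", "PowerShell", True),
    EmulatedTechnique("T1003.001", "LSASS Memory Dump", False),
    EmulatedTechnique("T1074",     "Data Staged", False),
]

print(f"Coverage: {detection_coverage(run):.0%}")  # Coverage: 50%
for t in run:
    if not t.detected:
        print(f"GAP: {t.technique_id} {t.name}")
```

Keeping results in a structured form like this is what makes emulation repeatable: re-running the same plan next quarter yields a directly comparable coverage number.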
Duration
3 to 6 weeks, depending on the complexity of the emulated actor’s playbook.
| Phase | Duration |
|---|---|
| Threat Intelligence Research & TTP Mapping | 1–2 weeks |
| Emulation Plan Development | 0.5–1 week |
| Execution | 1–2 weeks |
| Detection Gap Analysis & Reporting | 1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| Directly tests against realistic, documented threats | Constrained to known TTPs — may miss novel attack paths |
| Produces measurable detection coverage metrics | Requires strong threat intelligence capability or partnership |
| Repeatable and trackable over time | Emulated actor’s TTPs may be outdated |
| Maps cleanly to MITRE ATT&CK for reporting | More rigid than full-scope — less room for operator creativity |
| Aligns security investment with actual threat landscape | May not test physical or social engineering vectors |
Example Scenario
A government agency’s threat intelligence team identifies APT29 as the most relevant threat based on recent campaigns targeting similar agencies. The red team builds an emulation plan based on MITRE’s published APT29 evaluation, covering techniques like spearphishing with a malicious link (T1566.002), PowerShell execution (T1059.001), credential dumping via LSASS (T1003.001), and data staging for exfiltration (T1074). Over two weeks, the team executes each technique in sequence. Results show that the agency’s EDR detects PowerShell-based execution reliably but completely misses the credential dumping technique because the tool used (a custom memory-only implementation) does not match known signatures. The agency uses this finding to deploy virtualization-based credential protection (Windows Credential Guard) and adds a YARA rule for memory-resident credential access patterns.
6. Continuous Red Teaming
Description
Continuous red teaming replaces the traditional point-in-time engagement with an ongoing adversarial program. An internal or embedded external red team operates persistently, testing defenses on a rolling basis. This model treats red teaming as an operational function rather than a periodic project.
Continuous programs often integrate closely with purple team activities, where red and blue teams collaborate to iteratively test, detect, and improve. The red team maintains persistent access in test environments, continuously probes for new attack paths as the environment changes, and provides real-time feedback to defenders.
When to Use
- Large organizations with dedicated security operations centers and mature security programs.
- Environments that change rapidly (frequent deployments, cloud-native architectures, M&A activity) where point-in-time assessments quickly become stale.
- Organizations subject to sustained targeting by advanced adversaries (defense, finance, critical infrastructure).
- When the goal is continuous improvement rather than periodic compliance.
- Organizations building or scaling an internal red team capability.
Typical Scope
- Operational Cadence: Weekly or monthly micro-engagements, quarterly deep-dive exercises, annual full-scope assessments.
- Coverage: Rotates through different business units, attack surfaces, and threat scenarios.
- Integration: Close coordination with SOC, DFIR, vulnerability management, and engineering teams.
- Metrics: Tracks mean-time-to-detect (MTTD), mean-time-to-respond (MTTR), detection coverage percentage, and vulnerability recurrence rates.
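The metrics listed above fall straight out of engagement logs once each test's execution, detection, and containment timestamps are recorded. A minimal sketch — the field names and timestamps are assumptions for illustration:

```python
from datetime import datetime, timedelta

def mean_delta(starts: list[datetime], ends: list[datetime]) -> timedelta:
    """Average elapsed time between paired start/end timestamps."""
    deltas = [e - s for s, e in zip(starts, ends)]
    return sum(deltas, timedelta()) / len(deltas)

# Hypothetical log for two micro-engagements: when each test began,
# when the SOC first alerted, and when it was contained.
executed  = [datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 8, 9, 0)]
detected  = [datetime(2024, 3, 2, 1, 0), datetime(2024, 3, 8, 17, 0)]
contained = [datetime(2024, 3, 2, 9, 0), datetime(2024, 3, 8, 19, 0)]

mttd = mean_delta(executed, detected)   # mean time to detect
mttr = mean_delta(detected, contained)  # mean time to respond
print(f"MTTD: {mttd.total_seconds() / 3600:.1f} h")  # MTTD: 12.0 h
print(f"MTTR: {mttr.total_seconds() / 3600:.1f} h")  # MTTR: 5.0 h
```

Trended quarter over quarter, these two numbers (plus detection coverage) are what turn a continuous program from a series of tests into a measurable improvement curve.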
Duration
Ongoing — typically structured as 12-month contracts with external providers or a permanent internal team.
| Activity | Cadence |
|---|---|
| Micro-engagements (targeted tests) | Weekly to bi-weekly |
| Attack simulations | Monthly |
| Comprehensive assessments | Quarterly |
| Full-scope red team exercise | Annually |
| Purple team workshops | Monthly |
Pros and Cons
| Pros | Cons |
|---|---|
| Keeps pace with environment changes | Highest ongoing cost |
| Builds institutional adversarial knowledge | Requires dedicated staff or long-term contracts |
| Enables iterative improvement with measurable metrics | Risk of familiarity bias (same team, same environment) |
| Provides near-real-time feedback to defenders | Requires strong program governance |
| Reduces risk of stale findings | Operational overhead of managing ongoing engagements |
Example Scenario
A large cloud services provider maintains a six-person internal red team. Each week, the team selects a different service area — container orchestration, identity platform, customer-facing APIs — and conducts focused testing. Monthly, they execute more complex multi-step attack chains that cross service boundaries. Every test is logged in a shared platform that the SOC monitors; if the SOC detects the activity, both teams document the detection and the red team adjusts their approach. Quarterly, the red team runs a no-notice exercise where the SOC is not informed. Over 12 months, the program drives MTTD from 96 hours to 14 hours and detection coverage from 42% to 78% of MITRE ATT&CK techniques relevant to their threat model.
7. Wireless Red Teaming
Description
Wireless red teaming focuses on attacking the organization’s wireless communications infrastructure: WiFi networks, Bluetooth devices, RFID/NFC access systems, and other radio-frequency technologies. These attacks often require physical proximity to the target and specialized hardware, making them a distinct discipline within the broader red team toolkit.
Wireless attacks are frequently underestimated because organizations focus their security investments on network perimeter and endpoint controls while leaving wireless infrastructure with default or weak configurations. A compromised wireless network can provide an attacker with internal network access that bypasses all perimeter defenses.
When to Use
- The organization relies heavily on wireless connectivity (retail, healthcare, hospitality, warehousing).
- Physical security testing is in scope and wireless is a potential entry vector.
- BYOD policies create a large and diverse wireless device population.
- IoT devices communicate over Bluetooth, Zigbee, or other wireless protocols.
- Previous assessments identified wireless controls as a gap area.
- Badge cloning or physical access bypass is a concern.
Typical Scope
- WiFi Attacks: Evil Twin access points, KARMA/MANA attacks, WPA2 handshake capture and cracking, WPA3 downgrade attacks (Dragonblood), rogue AP detection bypass, captive portal exploitation, EAP relay attacks against WPA-Enterprise.
- Bluetooth Attacks: BlueSmack (DoS), BlueBorne exploitation, KNOB (Key Negotiation of Bluetooth) attacks, BLE (Bluetooth Low Energy) sniffing and replay.
- RFID/NFC: Badge cloning (HID ProxCard, MIFARE), relay attacks, skimming, NFC payment terminal manipulation.
- Hardware: Hak5 WiFi Pineapple, Alfa adapters, Proxmark3, Flipper Zero, HackRF, Ubertooth One.
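As a defensive counterpart to the Evil Twin technique above, one simple heuristic a wireless site survey can apply is flagging known SSIDs advertised by radios (BSSIDs) that are not in the AP inventory. A sketch with invented inventory and scan data — a real survey would feed this from a packet capture rather than a hardcoded list:

```python
from collections import defaultdict

# Known inventory: SSID -> set of legitimate BSSIDs (invented values)
known_aps = {"Corp-WiFi": {"aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"}}

# Hypothetical scan results: (ssid, bssid) pairs observed on site
scan = [
    ("Corp-WiFi", "aa:bb:cc:00:00:01"),
    ("Corp-WiFi", "aa:bb:cc:00:00:02"),
    ("Corp-WiFi", "de:ad:be:ef:00:99"),  # unknown radio -- possible Evil Twin
]

def find_rogue_aps(scan, known_aps):
    """Return (ssid, bssid) pairs broadcasting a known SSID from a
    BSSID that is not in the inventory."""
    seen = defaultdict(set)
    for ssid, bssid in scan:
        seen[ssid].add(bssid)
    return [(ssid, b)
            for ssid, bssids in seen.items()
            for b in sorted(bssids - known_aps.get(ssid, set()))]

print(find_rogue_aps(scan, known_aps))  # [('Corp-WiFi', 'de:ad:be:ef:00:99')]
```

A capable attacker can spoof a legitimate BSSID, so this check catches careless rogue APs rather than all of them; certificate-based (WPA-Enterprise) authentication remains the structural fix.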
Duration
1 to 3 weeks, often conducted as part of a larger engagement.
| Phase | Duration |
|---|---|
| Wireless Reconnaissance | 2–3 days |
| Attack Execution | 1–2 weeks |
| Reporting | 2–3 days |
Pros and Cons
| Pros | Cons |
|---|---|
| Tests a frequently overlooked attack surface | Requires physical proximity to target |
| Can bypass perimeter defenses entirely | Specialized hardware and expertise required |
| Validates wireless security policies | Limited scope — wireless only |
| Tests IoT and BYOD exposure | Signal propagation is unpredictable |
| Directly supports physical penetration testing | Legal considerations for radio frequency transmission |
Example Scenario
A hospital engages a red team to test its wireless security posture. The team sets up in the visitor parking lot with a directional antenna and a WiFi Pineapple. They deploy an Evil Twin mimicking the hospital’s guest WiFi network (which shares a flat network with some medical devices due to a segmentation failure). A nurse’s workstation automatically connects to the rogue AP, and the team captures credentials as the device attempts to reach internal resources. Using these credentials, they access the clinical network and discover unpatched DICOM imaging servers. Separately, using a Proxmark3, they clone an RFID badge from a facilities worker during a coffee shop visit and use it to access a restricted server room. The findings drive immediate network segmentation work and a migration to certificate-based WiFi authentication.
8. Supply Chain Red Teaming
Description
Supply chain red teaming targets the organization’s software and hardware supply chain — the third-party code, services, and components that are implicitly trusted. This engagement type has surged in relevance following high-profile attacks like SolarWinds (2020), Codecov (2021), and the 3CX compromise (2023).
The red team evaluates whether an attacker could compromise the organization by targeting its dependencies, build pipelines, package registries, or vendor relationships rather than attacking the organization directly. This requires a different mindset: the target is not a network or an application but a trust relationship.
When to Use
- The organization develops and deploys software (internal or customer-facing).
- Heavy reliance on open-source dependencies or third-party libraries.
- CI/CD pipelines are complex and involve multiple external services.
- Vendor and partner access to internal systems is extensive.
- The organization has been affected by or is concerned about supply chain compromises.
- Software Bill of Materials (SBOM) requirements are being adopted or mandated.
Typical Scope
- Dependency Confusion: Registering internal package names on public registries to test whether build systems pull from public sources before private ones.
- Typosquatting: Creating packages with names similar to popular dependencies to test developer awareness and tooling safeguards.
- CI/CD Pipeline Attacks: Testing for secrets in build logs, insecure pipeline configurations, unauthorized artifact injection, pipeline poisoning (compromising build scripts or Dockerfiles).
- Third-Party Access Review: Evaluating vendor VPN access, API keys, OAuth applications, and partner portal security.
- Artifact Integrity: Testing whether code signing, SLSA provenance, and artifact verification controls are enforced or bypassable.
- Hardware Supply Chain: Where applicable, testing for counterfeit components, firmware tampering, or unauthorized hardware modifications.
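The dependency confusion test described above reduces to a name comparison: which internal package names already exist on the public registry (version-based confusion possible), and which are unregistered (an attacker could claim them)? A simplified sketch using a stubbed set of public names — a real check would query the registry's API, and all package names here are invented:

```python
# Internal packages the build system resolves (names are invented)
internal_packages = {"acme-auth-utils", "acme-billing-core", "requests"}

# Stub: names already present on the public registry. A real check
# would query the registry (e.g. PyPI's JSON API) per package instead.
public_registry = {"requests", "numpy", "acme-auth-utils"}

def classify(internal: set[str], public: set[str]):
    """Split internal names into collision risks (already public, so a
    higher-versioned public package could win resolution) and claimable
    names (unregistered, so an attacker could squat them)."""
    collisions = sorted(internal & public)
    claimable = sorted(internal - public)
    return collisions, claimable

collisions, claimable = classify(internal_packages, public_registry)
print("Collides with public registry:", collisions)
print("Claimable by an attacker:", claimable)
```

Note that a collision is not automatically malicious — `requests` legitimately lives on PyPI — but every collision is a name where build configuration, not naming, is the only thing preventing confusion.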
Duration
3 to 6 weeks, with significant upfront research.
| Phase | Duration |
|---|---|
| Supply Chain Mapping & Analysis | 1–2 weeks |
| Attack Development & Testing | 1–2 weeks |
| Execution (controlled) | 0.5–1 week |
| Reporting & Remediation Guidance | 1 week |
Pros and Cons
| Pros | Cons |
|---|---|
| Tests a critical and growing attack vector | Highly specialized skill set required |
| Reveals trust relationships that are rarely examined | Risk of impacting production if controls fail |
| Directly relevant to modern software development | Scope can be difficult to define |
| Drives adoption of SBOM, SLSA, and signing practices | May require coordination with third-party vendors |
| Uniquely high impact findings | Relatively new discipline — fewer established methodologies |
Example Scenario
A SaaS company engages a red team to assess its software supply chain. The team begins by mapping all dependencies in the company’s Node.js and Python applications using SBOM analysis. They discover that the company uses an internal Python package called `acme-auth-utils` hosted on a private PyPI server. The team registers `acme-auth-utils` on the public PyPI registry with a version number higher than the internal one. During the next CI/CD build, the build system pulls the public (attacker-controlled) package instead of the internal one, executing a benign payload that phones home to the red team’s infrastructure. The team also identifies that three GitHub Actions workflows use `pull_request_target` with unsafe checkout patterns, allowing a forked PR to execute arbitrary code with write permissions to the main repository. Both findings lead to immediate remediation: pip is configured to use `--index-url` exclusively pointing to the private registry, and all GitHub Actions workflows are audited and hardened.
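The pip-side remediation from the scenario above fits in a few lines of configuration; a sketch of a `pip.conf` pinning resolution to the private index (the URL is a placeholder):

```ini
# pip.conf -- resolve all installs against the private index only,
# never falling back to the public PyPI.
[global]
index-url = https://pypi.internal.example.com/simple/
```

The design point is that `index-url` *replaces* the default index, whereas `extra-index-url` adds a second one and lets pip pick the highest version across both — which is precisely the resolution behavior dependency confusion exploits.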
Engagement Type Comparison
The following table provides a side-by-side comparison of all engagement types to help with selection.
| Engagement Type | Scope | Duration | Stealth Required | Relative Cost | Best For |
|---|---|---|---|---|---|
| Full-Scope Red Team | Entire organization | 4–8 weeks | High | $$$$$ | Comprehensive risk assessment |
| Assumed Breach | Internal network | 2–4 weeks | Medium-High | $$$ | Post-exploitation detection validation |
| Objective-Based | Specific critical assets | 2–6 weeks | Medium-High | $$$$ | Business-risk-focused testing |
| Tabletop Exercise | People and processes | 2–4 hours (+ prep) | None | $ | IR readiness, executive engagement |
| Adversary Emulation | TTP-specific coverage | 3–6 weeks | High | $$$$ | Threat-specific detection testing |
| Continuous Red Teaming | Rotates across org | Ongoing (12+ months) | Varies | $$$$$ (annual) | Continuous improvement programs |
| Wireless Red Teaming | Wireless infrastructure | 1–3 weeks | Medium | $$ | WiFi, Bluetooth, RFID exposure |
| Supply Chain | Dependencies & pipelines | 3–6 weeks | Low-Medium | $$$ | Software development organizations |
Combining Engagement Types
In practice, organizations rarely rely on a single engagement type. A mature adversarial testing program layers multiple approaches throughout the year.
Recommended Annual Program (Mature Organization)
| Quarter | Activity | Type |
|---|---|---|
| Q1 | Full-scope red team engagement | Full-Scope |
| Q1 | Executive tabletop exercise | Tabletop |
| Q2 | Assumed breach focused on cloud environment | Assumed Breach |
| Q2 | Supply chain assessment of CI/CD pipeline | Supply Chain |
| Q3 | Adversary emulation of top threat actor | Adversary Emulation |
| Q3 | Wireless assessment of new office locations | Wireless |
| Q4 | Objective-based test of crown jewel assets | Objective-Based |
| Ongoing | Continuous micro-engagements by internal team | Continuous |
Building Maturity Over Time
Organizations that are new to adversarial testing should not start with a full-scope red team engagement. A practical maturity progression looks like this:
- Year 1: Tabletop exercises and vulnerability assessments. Build baseline awareness and IR processes.
- Year 2: Assumed breach engagements and objective-based testing. Validate internal detection and response.
- Year 3: Full-scope red team and adversary emulation. Test the full attack lifecycle against specific threats.
- Year 4+: Continuous red teaming with purple team integration. Operationalize adversarial testing.
Scoping Best Practices
Regardless of the engagement type, effective scoping is critical to a successful outcome. The following principles apply universally.
Define Success Criteria
Every engagement should have clear, measurable success criteria agreed upon before testing begins. Vague objectives like “test our security” produce vague results. Good criteria are specific:
- “Obtain domain admin credentials without triggering a SOC alert within 10 business days.”
- “Access the production database containing customer PII from a standard user workstation.”
- “The incident response team detects and contains a simulated ransomware deployment within 4 hours.”
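Criteria like these can be captured as structured, measurable records rather than free text, which makes final reporting unambiguous. The sketch below is a hypothetical illustration; the field names and helper are assumptions, not part of any standard tooling.

```python
from dataclasses import dataclass

@dataclass
class SuccessCriterion:
    objective: str        # what the red team must achieve
    constraint: str       # the condition under which it counts
    deadline_days: int    # agreed time box, in business days
    met: bool = False     # updated as the engagement progresses

# Example criteria mirroring the list above (illustrative values)
criteria = [
    SuccessCriterion("Obtain domain admin credentials",
                     "without triggering a SOC alert", deadline_days=10),
    SuccessCriterion("Access the production PII database",
                     "from a standard user workstation", deadline_days=10),
]

def engagement_summary(items):
    """Return (criteria met, total criteria) for the final report."""
    return sum(c.met for c in items), len(items)

met, total = engagement_summary(criteria)
print(f"{met}/{total} success criteria met")
```

Tracking criteria this way also makes retests trivial to compare: the same records are re-evaluated after remediation.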
Establish Communication Protocols
Define how the red team communicates with the trusted agent (the internal point of contact who knows about the engagement) throughout the operation:
- Check-in cadence: Daily status updates? Weekly? Only on milestones?
- Emergency procedures: What happens if the red team discovers evidence of a real compromise? What if testing causes an unintended outage?
- Deconfliction: How does the trusted agent distinguish red team activity from real attacks in SOC alerts?
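One common deconfliction approach is for the red team to pre-share identifiers (source IPs, tool hashes, account names) with the trusted agent so SOC alerts can be matched against them. A minimal sketch, assuming alerts arrive as simple dictionaries; all indicator values are illustrative:

```python
# Pre-shared red team indicators held only by the trusted agent.
# IPs below are from the documentation range (illustrative assumption).
RED_TEAM_INDICATORS = {
    "source_ips": {"203.0.113.10", "203.0.113.11"},
    "accounts": {"svc-redteam"},
}

def is_red_team_activity(alert: dict) -> bool:
    """Match a SOC alert against the pre-shared indicator list."""
    return (alert.get("src_ip") in RED_TEAM_INDICATORS["source_ips"]
            or alert.get("account") in RED_TEAM_INDICATORS["accounts"])

print(is_red_team_activity({"src_ip": "203.0.113.10", "account": "jdoe"}))  # True
print(is_red_team_activity({"src_ip": "198.51.100.7", "account": "jdoe"}))  # False
```

Any alert that does not match must be treated as a potential real attack until confirmed otherwise.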
Set Realistic Constraints
Constraints should reflect the organization’s risk tolerance:
- Hours of operation: Can the red team operate 24/7 or only during business hours?
- Geographic boundaries: Physical testing in all offices or only headquarters?
- Technique restrictions: Are kernel exploits, denial-of-service, or production data access permitted?
- Notification thresholds: At what point must the red team notify the trusted agent (e.g., discovery of critical zero-day, evidence of real compromise)?
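Constraints like these are easier to enforce when captured as data rather than left in a scoping document. The following is a hypothetical sketch; the structure, names, and thresholds are assumptions chosen to mirror the bullet points above:

```python
from datetime import datetime, time

# Rules of engagement agreed at scoping (illustrative values)
ROE = {
    "allowed_hours": (time(9, 0), time(17, 0)),   # business hours only
    "allowed_sites": {"headquarters"},            # physical testing scope
    "banned_techniques": {"kernel_exploit", "denial_of_service"},
    "notify_on": {"real_compromise", "critical_zero_day"},
}

def action_permitted(technique: str, site: str, when: datetime) -> bool:
    """True only if the action falls inside every agreed constraint."""
    start, end = ROE["allowed_hours"]
    return (start <= when.time() <= end
            and site in ROE["allowed_sites"]
            and technique not in ROE["banned_techniques"])

def must_notify(event: str) -> bool:
    """Trusted-agent notification thresholds from the scoping agreement."""
    return event in ROE["notify_on"]

print(action_permitted("phishing", "headquarters", datetime(2024, 3, 4, 10, 30)))  # True
print(must_notify("real_compromise"))  # True
```

Operators can run such a check before each action, and any `must_notify` event pauses the engagement until the trusted agent responds.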
Budget Allocation
A common mistake is allocating the entire budget to execution and leaving nothing for remediation validation. A reasonable split:
- 70% — Engagement execution and reporting
- 15% — Remediation support and consultation
- 15% — Retest and validation of fixes
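The split above is simple arithmetic, but writing it down up front prevents the remediation and retest shares from being quietly consumed by execution overruns. A minimal sketch (the total figure is an illustrative assumption):

```python
# 70/15/15 split from the list above
SPLIT = {
    "execution_and_reporting": 0.70,
    "remediation_support": 0.15,
    "retest_and_validation": 0.15,
}

def allocate(total_budget: float) -> dict:
    """Split a total engagement budget across the three phases."""
    assert abs(sum(SPLIT.values()) - 1.0) < 1e-9  # shares must cover the whole budget
    return {phase: round(total_budget * share, 2) for phase, share in SPLIT.items()}

print(allocate(100_000))
# {'execution_and_reporting': 70000.0, 'remediation_support': 15000.0, 'retest_and_validation': 15000.0}
```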
Selecting the Right Engagement Type
The decision should be driven by three factors:
1. Organizational Maturity
Organizations with limited security programs benefit most from tabletop exercises and assumed breach engagements. Full-scope engagements against immature organizations produce overwhelming finding lists that cannot be acted upon. Match the engagement to the organization’s ability to consume and act on findings.
2. Threat Landscape
What are the most likely and most impactful threats? A software company’s primary risk might be supply chain compromise, while a hospital’s might be ransomware delivered through phishing. Use threat intelligence to select engagement types that test against the most relevant attack scenarios. Adversary emulation is particularly effective when specific threat actors can be identified.
3. Specific Questions
The best engagements start with a question:
- “Can an attacker reach our SWIFT environment from a compromised email account?” → Objective-Based
- “Would our SOC detect APT29-style lateral movement?” → Adversary Emulation
- “How would our executives handle a public ransomware incident?” → Tabletop Exercise
- “What does our full exposure look like to a motivated external attacker?” → Full-Scope Red Team
- “Are our internal segmentation controls effective?” → Assumed Breach
- “Could someone clone our badges and walk into the data center?” → Wireless Red Teaming (RFID component)
- “Is our CI/CD pipeline vulnerable to dependency attacks?” → Supply Chain Red Teaming
Common Mistakes
Skipping the tabletop. Organizations jump to live testing before their incident response processes are defined. When the red team compromises the environment and nobody knows who to call or what to do, the exercise provides limited value beyond proving that attackers can get in — which was already known.
One and done. A single annual engagement creates a false sense of security. The environment changes daily; a test from six months ago may no longer reflect current risk. Continuous or at least quarterly testing is necessary for meaningful assurance.
Testing without a hypothesis. Engagements that lack specific objectives produce generic findings. Start with a hypothesis about what you expect the red team to find, and design the engagement to test it.
Ignoring remediation. The engagement report is not the finish line. Without dedicated time and budget for remediation and retesting, findings persist indefinitely. Track red team findings in the same system as vulnerability management and hold teams accountable for remediation timelines.
Choosing based on cost alone. The cheapest engagement is rarely the right one. A tabletop exercise is inexpensive but will not reveal technical vulnerabilities. A full-scope engagement is expensive but may be the only way to understand true organizational risk. Match the investment to the question being asked.
Key Takeaways
- There is no single “best” engagement type — the right choice depends on organizational maturity, threat landscape, and the specific question being asked.
- Full-scope engagements are the gold standard for realism but require significant investment and organizational readiness.
- Assumed breach engagements provide the best return on investment for organizations focused on improving detection and response.
- Adversary emulation produces the most structured, measurable, and repeatable results, especially when mapped to MITRE ATT&CK.
- Supply chain and wireless engagements test attack surfaces that traditional network-focused testing misses entirely.
- A mature program combines multiple engagement types throughout the year, supported by robust infrastructure and clear rules of engagement.
- Always define success criteria, communication protocols, and constraints before the first keystroke.
Further Reading
- MITRE Center for Threat-Informed Defense — Adversary Emulation Library
- CREST — A Guide to the CBEST Threat Intelligence-Led Penetration Testing Standard
- European Central Bank — TIBER-EU Framework
- NIST SP 800-53 Rev. 5 — CA-8: Penetration Testing
- Red Team Journal — Red Teaming Laws and Ethics
- Atomic Red Team — Technique Library