
Purple Teaming & Detection Engineering


What Is Purple Teaming?

Purple teaming is a collaborative security methodology that integrates red team (offensive) and blue team (defensive) operations into a unified, iterative feedback loop. Unlike traditional red team engagements — where the offensive team operates in isolation and delivers a report weeks later — purple teaming places attackers and defenders side by side, sharing techniques, telemetry, and insights in real time.

The name comes from mixing red and blue, but the concept goes far beyond color theory. Purple teaming fundamentally changes the relationship between offense and defense from adversarial to cooperative.


The Problem Purple Teaming Solves

Traditional red team engagements suffer from a well-documented failure mode: findings go unactioned. A 2024 SANS survey found that over 40% of red team recommendations are never fully implemented, and the average time from finding to remediation exceeds 90 days. The reasons are structural:

  • Communication gap — Red team reports describe attack paths in offensive terminology. Blue team analysts think in terms of log sources, detection rules, and alert pipelines. Translation between these worldviews is lossy.
  • Delayed feedback — By the time a red team report lands, the blue team’s environment has changed. New tools have been deployed, configurations have shifted, and the specific conditions the red team exploited may no longer exist — or may have gotten worse.
  • Missing context — A red team report says “we used DCSync to extract credentials.” The blue team needs to know: What specific telemetry should we have seen? What logs were generated? What detection logic would catch this? The report rarely answers these questions at the depth required.
  • No validation loop — Even when the blue team builds a detection, there is no structured process to verify it actually works against the technique that was demonstrated.

Purple teaming addresses all four problems by collapsing the gap between execution and detection into a single, iterative session.


Red Team vs. Blue Team vs. Purple Team

| Aspect | Red Team | Blue Team | Purple Team |
| --- | --- | --- | --- |
| Objective | Find weaknesses, simulate adversaries | Detect, respond, and recover | Improve detection and response together |
| Operating Model | Isolated, covert | Reactive, continuous | Collaborative, iterative |
| Communication | Report-based (post-engagement) | Alert-based (real-time) | Continuous dialogue during exercises |
| Success Metric | Objectives achieved, paths documented | Incidents detected, MTTD/MTTR | Detection coverage improved, gaps closed |
| Failure Mode | Findings ignored, no detection improvement | Blind spots persist, unknown unknowns | Requires commitment from both sides |
| Typical Duration | 2–6 weeks | Ongoing | Structured sessions (days) or continuous |
| Output | Attack narrative, recommendations | Alerts, incident reports | New/improved detections, validated playbooks |

Purple Team Roles and Responsibilities

A well-structured purple team exercise involves distinct roles:

Red Team Operator — Executes techniques from the agreed-upon scope. Provides real-time narration of what they are doing, what artifacts they expect to generate, and what telemetry the blue team should observe. Adjusts techniques based on feedback.

Blue Team Analyst — Monitors SIEM, EDR, and other detection platforms during technique execution. Reports what was detected, what was missed, and what telemetry is available. Develops or tunes detection rules in response to gaps.

Purple Team Lead (Facilitator) — Coordinates the exercise. Maintains the test plan, tracks progress through the ATT&CK matrix, documents results, and ensures both sides are communicating effectively. Often the most experienced person in the room.

Detection Engineer — Writes and deploys new detection logic during the exercise. Tests rules against replayed or re-executed techniques. Ensures detections are production-ready before the session ends.

Stakeholder / Management Observer — Observes the exercise to understand defensive capabilities and gaps. Provides resourcing decisions for remediation priorities.


The Test-Analyze-Refine (TAR) Cycle

The TAR cycle is the engine of purple teaming. Every technique tested follows this iterative loop until detection coverage reaches the desired threshold.

graph LR
    A["1. Plan & Scope<br/>Select technique"] --> B["2. Execute<br/>Run atomic test"]
    B --> C["3. Analyze<br/>Review telemetry"]
    C --> D{"Detection<br/>exists?"}
    D -->|"Yes — effective"| E["6. Document<br/>Record coverage"]
    D -->|"No or partial"| F["4. Refine<br/>Build/tune detection"]
    F -->|"5. Re-test"| B
    E --> G["Next technique"]
    G --> A

Phase 1: Plan and Scope

Before executing anything, the team selects a specific ATT&CK technique (or sub-technique) and defines the test parameters:

  • Technique ID and name — e.g., T1003.001 (OS Credential Dumping: LSASS Memory)
  • Tool or method — e.g., Mimikatz sekurlsa::logonpasswords, ProcDump LSASS dump, direct memory read via NtReadVirtualMemory
  • Target system — Specific host, OS version, endpoint protection stack
  • Expected telemetry — Process creation events (Sysmon Event ID 1), process access events (Sysmon Event ID 10 with GrantedAccess 0x1010 or 0x1410), Windows Security Event 4656
  • Existing detections — Any current SIEM rules, EDR alerts, or custom analytics that should trigger
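These scope fields can be captured as a structured record so results stay machine-trackable across the exercise. A minimal sketch in Python (the field names are illustrative, not a standard schema):

```python
from dataclasses import dataclass, field

@dataclass
class TechniqueTestPlan:
    """One planned purple team test, mirroring the Phase 1 scope fields."""
    technique_id: str                 # ATT&CK (sub-)technique, e.g. "T1003.001"
    technique_name: str
    method: str                       # tool or procedure to be executed
    target: str                       # host / OS / protection stack under test
    expected_telemetry: list = field(default_factory=list)
    existing_detections: list = field(default_factory=list)

plan = TechniqueTestPlan(
    technique_id="T1003.001",
    technique_name="OS Credential Dumping: LSASS Memory",
    method="ProcDump -ma lsass.exe",
    target="WKSTN-042 (Windows 11, EDR agent v5)",
    expected_telemetry=["Sysmon EID 1", "Sysmon EID 10 (GrantedAccess 0x1010)"],
    existing_detections=[],
)
print(plan.technique_id, len(plan.expected_telemetry))  # T1003.001 2
```

Storing plans this way makes it trivial to diff planned versus observed telemetry in the Analyze phase.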

Phase 2: Execute

The red team operator runs the technique while narrating their actions. This is not a stealth exercise — the purpose is detection validation, not evasion testing. The operator announces:

  • The exact command or tool being run
  • The timestamp of execution (for log correlation)
  • The expected artifacts (files dropped, registry keys modified, network connections made)
  • Any cleanup actions taken

Phase 3: Analyze

The blue team immediately examines available telemetry:

  • Did any existing detection fire? Check SIEM alerts, EDR detections, and custom analytics.
  • What raw telemetry is available? Review log sources for relevant events, even if no alert triggered.
  • What telemetry is missing? Identify gaps in logging coverage — perhaps Sysmon is not configured to capture the required event type, or the relevant Windows audit policy is not enabled.
  • What is the signal-to-noise ratio? If telemetry exists, how distinguishable is the malicious activity from normal baseline behavior?

Phase 4: Refine

If detection is absent or insufficient, the detection engineer builds or tunes a rule:

  • Write a SIGMA rule or native SIEM query targeting the observed telemetry
  • Deploy the rule to the detection pipeline
  • Configure appropriate severity, tagging, and response actions
  • Document the rule’s logic, expected false positive rate, and tuning guidance

Phase 5: Re-Test

The red team re-executes the technique to validate the new detection fires correctly. This may iterate multiple times as the detection is tuned to minimize false positives while maintaining true positive coverage.

Phase 6: Document

Record the final state: technique tested, detection status, rule ID, telemetry requirements, any remaining gaps, and follow-up actions needed.
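The six phases above can be sketched as a small loop driver (a toy model: the three callables stand in for red team execution, blue team analysis, and detection refinement):

```python
def tar_cycle(execute, detected, refine, max_iterations=5):
    """Drive the Test-Analyze-Refine loop for one technique.

    execute()  -- run the atomic test (red team step)
    detected() -- True if an alert fired for the execution (blue team analysis)
    refine()   -- build or tune a detection (detection engineer step)
    Returns (detected?, iterations used).
    """
    for iteration in range(1, max_iterations + 1):
        execute()
        if detected():
            return True, iteration      # document coverage, next technique
        refine()                        # close the gap, then re-test
    return False, max_iterations        # record as a residual gap

# Toy stand-ins: detection is missing at first and works after one refinement.
state = {"rule_deployed": False}
result = tar_cycle(
    execute=lambda: None,
    detected=lambda: state["rule_deployed"],
    refine=lambda: state.update(rule_deployed=True),
)
print(result)  # (True, 2): first run missed, refined rule caught the re-test
```

The `max_iterations` bound mirrors exercise practice: if a technique cannot be detected after a few refinement rounds, it is documented as a gap rather than blocking the session.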


TAR Cycle: Real-World Workflow Example

Scenario: Testing detection of Kerberoasting (T1558.003)

Iteration 1:

  1. Red team runs Rubeus.exe kerberoast /outfile:hashes.txt on a domain-joined workstation
  2. Blue team checks SIEM — no alert fires
  3. Analysis reveals Windows Security Event 4769 is being collected, but no detection rule targets the anomalous encryption type (0x17 = RC4) that Kerberoasting produces
  4. Detection engineer writes a SIGMA rule targeting Event ID 4769 where TicketEncryptionType = 0x17 and ServiceName does not end in $
  5. Rule deployed to SIEM

Iteration 2:

  1. Red team re-runs Rubeus.exe kerberoast
  2. Alert fires — but also generates 12 false positives from a legacy application using RC4
  3. Detection engineer adds an exclusion for the known service account
  4. Re-test confirms: true positive fires, false positives eliminated

Iteration 3:

  1. Red team runs Kerberoasting with AES encryption downgrade (/enctype:aes) to test evasion
  2. Original detection misses this variant
  3. Detection engineer adds a second rule correlating 4769 events with unusual volume of TGS requests from a single source within a short time window
  4. Both rules now provide layered detection coverage

Final documentation: Two SIGMA rules deployed, covering both RC4-based and volume-based Kerberoasting detection. Telemetry requirement: Windows Security Event 4769 with Advanced Audit Policy “Audit Kerberos Service Ticket Operations” enabled. Known limitation: AES-only Kerberoasting with low-volume, spread-out requests remains difficult to detect without behavioral baselines.
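The volume-based rule added in iteration 3 reduces to a sliding-window count of Event 4769 per requesting account. A Python sketch (the threshold and window are illustrative; production values come from baselining):

```python
from collections import defaultdict, deque

def volume_alerts(events, threshold=10, window_seconds=300):
    """Flag accounts requesting an unusual number of service tickets
    (Event 4769) within a short window.

    events: iterable of (timestamp_seconds, source_account), sorted by time.
    Returns the set of accounts that met or exceeded the threshold.
    """
    recent = defaultdict(deque)     # account -> timestamps inside the window
    flagged = set()
    for ts, source in events:
        q = recent[source]
        q.append(ts)
        while q and ts - q[0] > window_seconds:
            q.popleft()             # expire requests older than the window
        if len(q) >= threshold:
            flagged.add(source)
    return flagged

# 12 TGS requests from one account in a minute versus background noise
burst = [(i * 5, "svc_hunter") for i in range(12)]
noise = [(i * 120, "alice") for i in range(5)]
print(volume_alerts(sorted(burst + noise)))  # {'svc_hunter'}
```

The low-and-slow limitation noted above is visible here: requests spread beyond the window never accumulate enough hits to cross the threshold.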


Atomic Red Team

Atomic Red Team is an open-source library maintained by Red Canary that provides a collection of small, focused, portable tests mapped to the MITRE ATT&CK framework. With over 1,070 atomic tests covering hundreds of techniques and sub-techniques, it is the most widely used testing library for purple team exercises and detection validation.

Why Atomic Tests Matter

The core philosophy is atomicity: each test exercises exactly one technique, produces known artifacts, and can be executed independently in under a minute. This makes them ideal for the TAR cycle — you can rapidly test, analyze, refine, and re-test without the overhead of running a full attack chain.

Atomic Test Structure

Every atomic test is defined in a YAML file with a standardized structure:

attack_technique: T1003.001
display_name: "OS Credential Dumping: LSASS Memory"
atomic_tests:
  - name: "Dump LSASS memory using ProcDump"
    auto_generated_guid: 0be2230c-9ab3-4ac2-8826-3199b9a0ebf8
    description: |
      Uses Sysinternals ProcDump to dump the LSASS process memory
      to disk for offline credential extraction.
    supported_platforms:
      - windows
    input_arguments:
      output_file:
        description: "Path to write the dump file"
        type: path
        default: "C:\\Windows\\Temp\\lsass_dump.dmp"
    dependency_executor_name: powershell
    dependencies:
      - description: "ProcDump must be available on disk"
        prereq_command: |
          if (Test-Path "C:\Tools\procdump.exe") { exit 0 } else { exit 1 }
        get_prereq_command: |
          New-Item -ItemType Directory -Path "C:\Tools" -Force | Out-Null
          Invoke-WebRequest -Uri "https://live.sysinternals.com/procdump.exe" `
            -OutFile "C:\Tools\procdump.exe"
    executor:
      command: |
        C:\Tools\procdump.exe -accepteula -ma lsass.exe #{output_file}
      cleanup_command: |
        Remove-Item -Path #{output_file} -Force -ErrorAction SilentlyContinue
      name: command_prompt
      elevation_required: true

Key elements of the YAML structure:

  • attack_technique — Maps directly to an ATT&CK technique or sub-technique ID
  • supported_platforms — Specifies where the test runs (windows, linux, macos)
  • input_arguments — Parameterized values that can be customized at runtime
  • dependencies — Prerequisites that must be satisfied before the test can execute, with automatic resolution commands
  • executor — The actual command to run, the cleanup command to reverse changes, and whether elevation is required
  • cleanup_command — Critical for safe testing; reverses changes made by the test
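A custom-atomic linter can enforce these elements before a test enters your library. A minimal sketch operating on an already-parsed test dictionary (the checks and messages are illustrative; the official schema in the atomic-red-team repository is more extensive):

```python
REQUIRED_TEST_KEYS = {"name", "description", "supported_platforms", "executor"}
REQUIRED_EXECUTOR_KEYS = {"name", "command"}

def validate_atomic(test: dict) -> list:
    """Return a list of schema problems for one parsed atomic test entry."""
    problems = []
    for key in REQUIRED_TEST_KEYS - test.keys():
        problems.append(f"missing key: {key}")
    executor = test.get("executor", {})
    for key in REQUIRED_EXECUTOR_KEYS - executor.keys():
        problems.append(f"missing executor key: {key}")
    # Elevated tests that change the host should reverse their changes
    if executor.get("elevation_required") and "cleanup_command" not in executor:
        problems.append("elevated test should define cleanup_command")
    return problems

# A test missing its cleanup command, as parsed from YAML into a dict:
test = {
    "name": "Dump LSASS memory using ProcDump",
    "description": "ProcDump -ma lsass.exe",
    "supported_platforms": ["windows"],
    "executor": {"name": "command_prompt",
                 "command": "procdump.exe -accepteula -ma lsass.exe out.dmp",
                 "elevation_required": True},
}
print(validate_atomic(test))  # ['elevated test should define cleanup_command']
```

Running a check like this in CI keeps the custom-atomic workflow described below from accumulating tests that leave artifacts behind.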

Invoke-AtomicRedTeam

The primary execution framework is Invoke-AtomicRedTeam, a PowerShell module that automates the download, prerequisite installation, execution, and cleanup of atomic tests.

# Install the module
Install-Module -Name invoke-atomicredteam -Scope CurrentUser

# Import and download the atomics library
Import-Module invoke-atomicredteam
IEX (IWR 'https://raw.githubusercontent.com/redcanaryco/invoke-atomicredteam/master/install-atomicredteam.ps1' -UseBasicParsing)
Install-AtomicRedTeam -getAtomics

# List all tests for a technique
Invoke-AtomicTest T1003.001 -ShowDetailsBrief

# Check and install prerequisites for a specific test
Invoke-AtomicTest T1003.001 -TestNumbers 1 -CheckPrereqs
Invoke-AtomicTest T1003.001 -TestNumbers 1 -GetPrereqs

# Execute a specific atomic test
Invoke-AtomicTest T1003.001 -TestNumbers 1

# Execute with custom input arguments
Invoke-AtomicTest T1003.001 -TestNumbers 1 -InputArgs @{
    "output_file" = "C:\Temp\custom_dump.dmp"
}

# Run cleanup after testing
Invoke-AtomicTest T1003.001 -TestNumbers 1 -Cleanup

# Execute all tests for a technique (useful for comprehensive coverage testing)
Invoke-AtomicTest T1003.001

# Run tests and log execution details for correlation
Invoke-AtomicTest T1003.001 -LoggingModule "Attire-ExecutionLogger"

Creating Custom Atomics

When a technique you need to test is not covered by existing atomics, or you need a test tailored to your specific environment, you can create custom atomic tests:

  1. Create a new YAML file following the standard schema under atomics/T<technique_id>/
  2. Define the test with clear descriptions, input arguments, and cleanup commands
  3. Test locally using Invoke-AtomicTest with the -PathToAtomicsFolder parameter
  4. Submit a pull request to the upstream repository if the test has broad applicability

CI/CD Integration for Detection Testing

One of the most powerful applications of Atomic Red Team is integrating it into CI/CD pipelines for automated detection validation:

# detection-validation-pipeline.ps1
# Run as part of CI/CD after SIEM rule changes are deployed

$techniques = @("T1003.001", "T1059.001", "T1053.005", "T1548.002")
$results = @()

foreach ($technique in $techniques) {
    # Execute atomic test
    $executionTime = Get-Date
    Invoke-AtomicTest $technique -TestNumbers 1

    # Wait for log ingestion pipeline
    Start-Sleep -Seconds 60

    # Query SIEM for the corresponding alert
    # (Search-SIEMAlert is a placeholder for your SIEM's query cmdlet or API wrapper)
    $alert = Search-SIEMAlert -Technique $technique -After $executionTime

    $results += [PSCustomObject]@{
        Technique   = $technique
        Detected    = [bool]$alert
        AlertName   = if ($alert) { $alert.Name } else { $null }
        TimeTaken   = if ($alert) { ($alert.Timestamp - $executionTime).TotalSeconds } else { $null }
    }

    # Cleanup
    Invoke-AtomicTest $technique -TestNumbers 1 -Cleanup
}

# Report results
$results | Format-Table -AutoSize
$failedDetections = $results | Where-Object { -not $_.Detected }
if ($failedDetections) {
    Write-Error "Detection regression: $($failedDetections.Count) techniques not detected"
    exit 1
}

This approach treats detections as code — any change to detection logic is automatically validated against known attack techniques before deployment.


SIGMA Rules

SIGMA is a generic, open, and vendor-agnostic signature format for SIEM systems. Created by Florian Roth (Neo23x0) and Thomas Patzke, SIGMA does for SIEM detection what YARA does for file scanning and Snort does for network traffic: it provides a common language for describing detection logic that can be translated to any backend platform.

Why SIGMA Matters for Purple Teaming

Without SIGMA, a detection engineer who writes a rule for Splunk must rewrite it entirely for Microsoft Sentinel, and again for Elastic SIEM. SIGMA eliminates this duplication by defining detections in a platform-independent YAML format that compilers convert to native query languages.

For purple teams specifically, SIGMA provides:

  • Portable detections — Rules written during a purple team exercise work regardless of which SIEM the organization uses
  • Community sharing — The SigmaHQ repository contains thousands of community-contributed rules mapped to ATT&CK
  • Version control — SIGMA rules are text files that live in Git repositories, enabling detection-as-code workflows
  • Standardized testing — Automated validation pipelines can process SIGMA rules the same way regardless of target platform

SIGMA Rule Structure

A SIGMA rule consists of several key sections:

title: Kerberoasting - RC4 Ticket Encryption
id: abf6e870-4c76-43b3-b2a5-1d2c63b9e7a4
status: stable
level: high
description: |
    Detects potential Kerberoasting activity by identifying TGS ticket
    requests using RC4 encryption (etype 0x17), which is anomalous in
    modern Active Directory environments using AES by default.
references:
    - https://attack.mitre.org/techniques/T1558/003/
    - https://www.ired.team/offensive-security-experiments/active-directory-kerberos-abuse/t1208-kerberoasting
author: Purple Team Exercise - Detection Sprint 2026-Q1
date: 2026/03/15
modified: 2026/03/18
tags:
    - attack.credential_access
    - attack.t1558.003
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4769
        TicketEncryptionType: "0x17"
    filter_machine_accounts:
        ServiceName|endswith: "$"
    filter_known_services:
        ServiceName:
            - "krbtgt"
            - "kadmin/changepw"
    condition: selection and not filter_machine_accounts and not filter_known_services
falsepositives:
    - Legacy applications that explicitly request RC4 encryption
    - Service accounts configured before AES migration
    - Cross-forest trust authentication in some configurations

Key sections explained:

  • logsource — Defines the log category, product, and service. This is what makes SIGMA portable — backends map logsources to their specific index/table names.
  • detection — Contains named selection and filter blocks. Each block defines field-value pairs to match.
  • condition — A boolean expression combining selection and filter blocks. Supports and, or, not, 1 of, all of, and aggregation functions like count() and near.
  • level — Severity classification: informational, low, medium, high, critical.
  • tags — ATT&CK mappings using the attack. prefix convention.
  • falsepositives — Documented known false positive scenarios, essential for tuning guidance.
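To make the condition semantics concrete, the rule above can be hand-translated into a predicate. This is an illustration only, not how pySigma evaluates rules (backends compile to native queries instead):

```python
def matches(block: dict, event: dict) -> bool:
    """True if every field/value pair in a selection or filter block matches."""
    for field, expected in block.items():
        values = expected if isinstance(expected, list) else [expected]
        if event.get(field) not in values:
            return False
    return True

def kerberoast_rc4(event: dict) -> bool:
    """'selection and not filter_machine_accounts and not filter_known_services',
    hand-translated from the Kerberoasting rule above."""
    selection = matches({"EventID": 4769, "TicketEncryptionType": "0x17"}, event)
    machine = str(event.get("ServiceName", "")).endswith("$")
    known = event.get("ServiceName") in ("krbtgt", "kadmin/changepw")
    return selection and not machine and not known

print(kerberoast_rc4({"EventID": 4769, "TicketEncryptionType": "0x17",
                      "ServiceName": "MSSQLSvc/db01"}))   # True
print(kerberoast_rc4({"EventID": 4769, "TicketEncryptionType": "0x17",
                      "ServiceName": "WKSTN01$"}))        # False
```

Reading a rule this way during an exercise is a quick sanity check that the filters exclude exactly what the falsepositives section claims.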

SIGMA Modifiers

SIGMA supports field value modifiers that provide pattern matching capabilities:

  • contains — Substring match. Example: CommandLine|contains: "-nop"
  • startswith — Prefix match. Example: Image|startswith: "C:\\Windows\\Temp"
  • endswith — Suffix match. Example: TargetFilename|endswith: ".dmp"
  • re — Regular expression. Example: CommandLine|re: "mimikatz|sekurlsa"
  • base64 — Match the base64-encoded value. Example: CommandLine|base64: "IEX"
  • base64offset — Match base64 at any encoding offset. Example: ScriptBlockText|base64offset: "Invoke-Mimikatz"
  • all — All listed values must match. Example: CommandLine|contains|all: ["-nop", "-w hidden"]
  • cidr — CIDR network match. Example: DestinationIp|cidr: "10.0.0.0/8"
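The matching semantics of several modifiers can be sketched directly in Python; real backends compile these into native query syntax rather than evaluating them at runtime:

```python
import re
import base64
import ipaddress

def apply_modifier(modifier: str, field_value: str, pattern: str) -> bool:
    """Evaluate one SIGMA field modifier against a single event value."""
    if modifier == "contains":
        return pattern in field_value
    if modifier == "startswith":
        return field_value.startswith(pattern)
    if modifier == "endswith":
        return field_value.endswith(pattern)
    if modifier == "re":
        return re.search(pattern, field_value) is not None
    if modifier == "base64":
        # Match the base64 encoding of the pattern inside the field value
        encoded = base64.b64encode(pattern.encode()).decode()
        return encoded in field_value
    if modifier == "cidr":
        return ipaddress.ip_address(field_value) in ipaddress.ip_network(pattern)
    raise ValueError(f"unsupported modifier: {modifier}")

print(apply_modifier("endswith", "C:\\Temp\\lsass.dmp", ".dmp"))           # True
print(apply_modifier("re", "mimikatz.exe sekurlsa", "mimikatz|sekurlsa"))  # True
print(apply_modifier("cidr", "10.20.30.40", "10.0.0.0/8"))                 # True
```

The base64offset variant (omitted here) additionally tries the three byte alignments a string can occupy inside a base64 stream.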

SIGMA-to-SIEM Conversion

The pySigma framework and its sigma-cli front end (successors to the original sigmac tool) handle conversion from SIGMA rules to native SIEM query languages:

# Install the sigma CLI (built on pySigma) and backend plugins
pip install sigma-cli
pip install pysigma-backend-splunk
pip install pysigma-backend-microsoft365defender
pip install pysigma-backend-elasticsearch

# Convert a single rule to Splunk SPL
sigma convert -t splunk -p sysmon rule.yml

# Convert to Microsoft 365 Defender KQL (advanced hunting)
sigma convert -t microsoft365defender -p sysmon rule.yml

# Convert to Elasticsearch Lucene query
sigma convert -t elasticsearch -p ecs_windows rule.yml

# Batch convert all rules in a directory
sigma convert -t splunk -p sysmon rules/ --output splunk_rules/

SIGMA Community Rules

The SigmaHQ repository contains over 3,000 community-contributed rules organized by log source category. These rules represent the collective detection knowledge of the security community and serve as an excellent foundation for any detection engineering program. During purple team exercises, the SigmaHQ rules provide a baseline — if a technique is tested and no SigmaHQ rule exists for it, that represents both a gap in your program and a contribution opportunity for the community.


Detection Engineering Workflow

Detection engineering is the discipline of systematically designing, building, testing, deploying, and maintaining detection logic. Purple teaming provides the testing framework; detection engineering provides the methodology for building durable, high-quality detections.

Alert Development Lifecycle

graph TB
    A["1. Hypothesis<br/>What threat behavior<br/>do we need to detect?"] --> B["2. Data Source Mapping<br/>What telemetry is<br/>available or needed?"]
    B --> C["3. Rule Development<br/>Write detection logic<br/>(SIGMA/native query)"]
    C --> D["4. Validation<br/>Test against known-good<br/>and known-bad samples"]
    D --> E{"Meets quality<br/>thresholds?"}
    E -->|No| F["5a. Tune<br/>Reduce FPs,<br/>improve coverage"]
    F --> D
    E -->|Yes| G["5b. Deploy<br/>Push to production<br/>SIEM pipeline"]
    G --> H["6. Monitor<br/>Track performance<br/>metrics over time"]
    H --> I{"Performance<br/>degradation?"}
    I -->|Yes| F
    I -->|No| H

Hypothesis-Driven Detection

Rather than writing detections reactively (after an incident), mature detection engineering programs use hypothesis-driven development:

  1. Start with the threat — Select an ATT&CK technique relevant to your threat model. See the MITRE ATT&CK page for mapping threats to your environment.
  2. Formulate a hypothesis — “If an adversary performs T1053.005 (Scheduled Task), we should observe Windows Security Event 4698 (Scheduled Task Created) with specific characteristics distinguishing malicious from legitimate task creation.”
  3. Identify data sources — Determine which log sources provide the telemetry needed to test the hypothesis. Verify that these sources are being collected and ingested.
  4. Build the detection — Write the rule targeting the hypothesized indicators.
  5. Test the hypothesis — Execute the technique (via Atomic Red Team or manual reproduction) and verify the detection fires.
  6. Iterate — Refine based on results until the detection meets quality thresholds.
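Step 2's hypothesis can be written as a testable predicate, so that step 5 becomes an assertion against telemetry captured during the atomic run. The marker list below is illustrative only, not a vetted detection:

```python
# Characteristics uncommon in legitimate scheduled task creation (illustrative)
SUSPICIOUS_TASK_MARKERS = ("powershell", "cmd /c", "rundll32",
                           "\\appdata\\", "\\temp\\")

def hypothesis_t1053_005(event: dict) -> bool:
    """Testable form of the hypothesis: Event 4698 (Scheduled Task Created)
    whose task definition shows characteristics that distinguish malicious
    from routine task creation."""
    if event.get("EventID") != 4698:
        return False
    content = event.get("TaskContent", "").lower()
    return any(marker in content for marker in SUSPICIOUS_TASK_MARKERS)

# Telemetry captured after running the T1053.005 atomic test:
atomic_event = {
    "EventID": 4698,
    "TaskContent": "<Exec><Command>powershell -enc SQBFAFgA</Command></Exec>",
}
print(hypothesis_t1053_005(atomic_event))  # True -> hypothesis supported
```

If the predicate fires on the atomic run but also on routine software-deployment tasks, the hypothesis survives but the indicator list needs refinement, which is exactly the iterate step.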

Data Source Mapping and Telemetry Requirements

Effective detection requires the right telemetry. A common failure mode in purple team exercises is discovering that the necessary log source is not being collected at all. Map your telemetry requirements systematically:

| ATT&CK Data Source | Windows Source | Collection Tool | Key Event IDs / Fields |
| --- | --- | --- | --- |
| Process Creation | Windows Security Audit, Sysmon | Sysmon, WEF, EDR | Sysmon 1, Security 4688 |
| Process Access | Sysmon | Sysmon | Sysmon 10 (GrantedAccess) |
| File Creation | Sysmon, NTFS auditing | Sysmon, EDR | Sysmon 11, Security 4663 |
| Registry Modification | Sysmon, Registry Auditing | Sysmon, EDR | Sysmon 12/13/14, Security 4657 |
| Network Connection | Sysmon, Windows Firewall | Sysmon, EDR, NTA | Sysmon 3, Firewall 5156 |
| Command Execution | PowerShell logging, Sysmon | GPO, Sysmon | PS 4104 (ScriptBlock), Sysmon 1 |
| Authentication | Windows Security Audit | WEF, SIEM agent | Security 4624/4625/4648 |
| Kerberos Activity | Windows Security Audit | WEF, SIEM agent | Security 4768/4769/4771 |

Measuring Detection Quality

Every detection should be evaluated against quantitative quality metrics:

  • True Positive Rate (TPR) — Percentage of actual attack executions that trigger the detection. Target: >95% for critical techniques.
  • False Positive Rate (FPR) — Number of false alerts per day/week. Target: <5 per day for high-severity rules, <20 per day for medium.
  • Mean Time to Detect (MTTD) — Elapsed time from technique execution to alert generation. Includes log ingestion latency, rule evaluation frequency, and any batching delays.
  • Detection Specificity — How precisely the alert identifies the technique. A generic “suspicious PowerShell” alert has low specificity; an alert identifying “Invoke-Mimikatz via encoded PowerShell with LSASS access” has high specificity.
  • Evasion Resistance — How many known variants of the technique the detection covers. Tested by running multiple atomics for the same ATT&CK technique. Refer to Stealth & Evasion for understanding how adversaries avoid detection.
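TPR and MTTD fall out directly from per-execution exercise results. A minimal sketch (the record fields are illustrative; adapt to however your exercise log is stored):

```python
def detection_metrics(runs):
    """Compute TPR and MTTD from per-execution purple team results.

    runs: list of dicts with 'detected' (bool) and, when detected,
    'seconds_to_alert' (execution-to-alert latency).
    """
    total = len(runs)
    hits = [r for r in runs if r["detected"]]
    tpr = len(hits) / total if total else 0.0
    mttd = (sum(r["seconds_to_alert"] for r in hits) / len(hits)) if hits else None
    return {"tpr": tpr, "mttd_seconds": mttd}

# Four executions of the same technique during an exercise:
runs = [
    {"detected": True, "seconds_to_alert": 45},
    {"detected": True, "seconds_to_alert": 75},
    {"detected": False},
    {"detected": True, "seconds_to_alert": 60},
]
print(detection_metrics(runs))  # {'tpr': 0.75, 'mttd_seconds': 60.0}
```

Computing these per technique, rather than per rule, keeps the numbers honest when several rules overlap on the same behavior.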

ATT&CK Navigator Heatmaps

The MITRE ATT&CK Navigator is a web-based tool for creating layered, color-coded visualizations of the ATT&CK matrix. For purple teams, it serves as the primary tool for tracking detection coverage and identifying gaps.

Creating Detection Coverage Maps

Maintain two parallel Navigator layers:

  1. Red Layer (Threat Coverage) — Techniques that adversaries relevant to your organization are known to use. Sourced from CTI reports, threat modeling, and industry-specific intelligence.
  2. Blue Layer (Detection Coverage) — Techniques for which you have validated, production-grade detections. Color-coded by detection maturity:
    • Dark green — Automated detection with validated playbook, tested in last 90 days
    • Light green — Detection exists, basic response procedure documented
    • Yellow — Partial detection (only some variants covered) or high false positive rate
    • Red — No detection, but technique is in threat model
    • Gray — Technique not relevant to environment (e.g., macOS techniques in a Windows-only environment)
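A blue layer can be generated programmatically from coverage data. A minimal sketch of a Navigator layer document (the hex colors are illustrative; check the layer-format version metadata your Navigator deployment expects before importing):

```python
# Maturity colors from the blue-layer legend above (hex values illustrative)
MATURITY_COLORS = {
    "validated": "#006400",  # dark green: automated, tested in last 90 days
    "basic":     "#90EE90",  # light green: detection + documented procedure
    "partial":   "#FFFF00",  # yellow: partial coverage or high FP rate
    "gap":       "#FF0000",  # red: in threat model, no detection
    "n/a":       "#808080",  # gray: not relevant to the environment
}

def navigator_layer(name: str, coverage: dict) -> dict:
    """Build a minimal Navigator layer dict from {technique_id: maturity}.
    Only the fields needed for color rendering are included."""
    return {
        "name": name,
        "domain": "enterprise-attack",
        "techniques": [
            {"techniqueID": tid, "color": MATURITY_COLORS[maturity]}
            for tid, maturity in sorted(coverage.items())
        ],
    }

layer = navigator_layer("Blue Layer 2026-Q1", {
    "T1003.001": "validated",
    "T1558.003": "partial",
    "T1053.005": "gap",
})
print(layer["techniques"][0])  # {'techniqueID': 'T1003.001', 'color': '#006400'}
```

Serialize the result with `json.dumps` and import it through Navigator's "Open Existing Layer" option; regenerating it after each exercise gives a version-controllable coverage history.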

Gap Analysis and Prioritization

Overlaying the red (threat) and blue (detection) layers reveals gaps where adversary techniques lack detection coverage. Prioritize detection investment based on:

  • Threat relevance — Techniques actively used by threat actors targeting your industry
  • Impact severity — Techniques that enable high-impact objectives (credential access, lateral movement, data exfiltration)
  • Detection feasibility — Availability of telemetry and distinguishability of malicious from benign behavior
  • Detection cost — Engineering effort and potential false positive burden

The Navigator supports JSON export, enabling version-controlled tracking of coverage changes over time. Integrate Navigator layer diffs into your purple team after-action reports to demonstrate measurable improvement.


Collaborative Exercise Formats

Purple team exercises can take several forms, each appropriate for different maturity levels and objectives.

Guided Purple Team Sessions

The most common format for organizations beginning their purple team journey. A facilitator controls the exercise pace and ensures productive collaboration.

Structure:

  1. Pre-exercise: Select 10-15 ATT&CK techniques based on threat intelligence priority
  2. For each technique: Red team explains and executes while blue team observes telemetry
  3. Real-time discussion of detection gaps and immediate rule development
  4. Re-test after detection is deployed
  5. Post-exercise: Document results, assign follow-up actions

Duration: 1-3 days
Best for: Building initial detection coverage, training blue team on adversary tradecraft

Open Purple Team Exercises

More advanced format where the red team has broader freedom to chain techniques and the blue team hunts in near-real-time.

Structure:

  1. Red team executes a multi-stage attack scenario (e.g., initial access through lateral movement to objective)
  2. Blue team attempts to detect and track the intrusion as it progresses
  3. After each phase, both teams debrief: what was detected, what was missed, how the attack could have been stopped earlier
  4. Detection development happens between phases

Duration: 1-2 weeks
Best for: Testing detection chains, validating incident response workflows, training SOC analysts

Detection Sprints

Borrowed from agile methodology, detection sprints focus on rapidly building detections for a specific threat or ATT&CK tactic.

Structure:

  1. Sprint planning: Select a tactic (e.g., Persistence) and identify all relevant techniques
  2. Daily standups: Red team executes 2-3 techniques per day, blue team builds detections
  3. Sprint review: Demonstrate all new detections, update ATT&CK Navigator coverage map
  4. Sprint retrospective: What worked, what to improve in the next sprint

Duration: 1-2 weeks
Best for: Rapidly expanding coverage for a specific tactic, building team muscle memory

Tabletop-to-Technical Pipeline

Bridges the gap between executive tabletop exercises and technical validation.

Structure:

  1. Run a tabletop exercise with a specific threat scenario
  2. Extract the ATT&CK techniques implied by the scenario
  3. Execute those techniques in a guided purple team session
  4. Validate whether the organization’s actual detection and response capabilities match the assumptions made during the tabletop

Duration: Tabletop (half day) + technical validation (1-2 days)
Best for: Aligning executive understanding with operational reality


Continuous Purple Teaming

Mature organizations move beyond periodic purple team exercises toward continuous, automated detection validation. This shift treats detection coverage as a living system that requires ongoing testing — the same way software requires continuous integration and testing.

Breach and Attack Simulation (BAS) Platforms

BAS platforms automate the execution of attack techniques and the validation of detection coverage at scale:

| Platform | Key Capabilities | Deployment Model |
| --- | --- | --- |
| AttackIQ | Full ATT&CK coverage, automated TAR cycle, integrates with major SIEMs and EDRs | SaaS + on-prem agents |
| SafeBreach | Continuous simulation, breach scenario library, risk scoring | SaaS + on-prem agents |
| Picus Security | Detection analytics, mitigation suggestions, threat library | SaaS |
| Cymulate | Attack surface management + BAS, phishing simulation | SaaS + on-prem agents |
| Atomic Red Team + Custom Automation | Open-source, fully customizable, CI/CD integration | Self-hosted |

BAS platforms run attack simulations on a schedule (daily, weekly) and report on which detections fired, which missed, and whether any previously working detections have regressed. This provides the continuous feedback loop that periodic exercises cannot.

Detection-as-Code

The detection-as-code paradigm applies software engineering practices to detection development:

  • Version control — All detection rules stored in Git repositories
  • Code review — New detections require peer review before deployment (pull request workflow)
  • Automated testing — CI pipeline validates rule syntax, runs tests against sample data, and checks for regressions
  • Deployment automation — Merged rules automatically deploy to SIEM via API
  • Rollback capability — If a rule causes alert storms, revert to the previous version instantly

This workflow integrates naturally with the purple team process: each exercise produces new SIGMA rules that go through the detection-as-code pipeline before reaching production.

Automated Testing Pipelines

A mature continuous purple teaming pipeline combines several components:

  1. Scheduled atomic test execution — BAS platform or cron-scheduled Invoke-AtomicRedTeam runs
  2. SIEM alert correlation — Automated check for corresponding alerts after each test execution
  3. Coverage dashboarding — Real-time visualization of detection coverage percentage across ATT&CK matrix
  4. Regression alerting — Automated notification when a previously detected technique is no longer being caught
  5. Reporting — Weekly/monthly detection coverage reports for security leadership, feeding into the broader Metrics & Reporting framework
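Component 4, regression alerting, reduces to a diff between consecutive scheduled-run results. A minimal sketch:

```python
def detection_regressions(previous: dict, current: dict) -> list:
    """Compare two scheduled-run result sets ({technique_id: detected_bool})
    and return techniques that were detected last run but missed this run:
    the condition that should page the detection engineering team."""
    return sorted(
        tid for tid, was_detected in previous.items()
        if was_detected and current.get(tid) is False
    )

previous = {"T1003.001": True, "T1059.001": True, "T1558.003": True}
current  = {"T1003.001": True, "T1059.001": False, "T1558.003": True}
print(detection_regressions(previous, current))  # ['T1059.001']
```

Techniques missing from the current run entirely (`current.get(tid)` is `None`) are deliberately not flagged here; a skipped test is a pipeline problem, not a detection regression, and deserves its own alert.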

Building Detection Playbooks

A detection without a response playbook is an alarm with no fire department. Purple team exercises should produce not only detections but also the playbooks that analysts follow when those detections fire.

Playbook Structure

A well-structured detection playbook contains:

1. Alert Context

  • Detection rule name and ID
  • ATT&CK technique mapping
  • Severity and priority classification
  • Historical true positive / false positive ratio

2. Triage Steps

  • Initial validation: Is this a true positive?
  • Data enrichment: What additional context to gather (user info, host info, recent activity)
  • Scope assessment: Is this an isolated event or part of a broader attack chain?

3. Investigation Queries

  • Pre-written SIEM queries for pivoting from the initial alert
  • EDR queries for gathering host-level context
  • Network queries for identifying related connections

4. Response Actions

  • Containment options (isolate host, disable account, block IP)
  • Evidence preservation steps
  • Escalation criteria (when to engage incident response, when to notify management)

5. Automation Opportunities (SOAR Integration)

  • Which triage steps can be automated via SOAR playbooks
  • Automatic enrichment lookups (threat intelligence, asset inventory, user directory)
  • Automated containment actions for high-confidence alerts
  • Notification routing based on severity and affected assets
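The five sections above lend themselves to a structured, versionable format, so playbooks can live in the same repository as the detection rules they accompany. Below is a minimal sketch encoding a playbook as Python data; every field name and query string is illustrative, not a standard schema.

```python
# Illustrative playbook skeleton for a hypothetical DCSync detection rule.
playbook = {
    "alert_context": {
        "rule_id": "win_dcsync_attempt",     # hypothetical rule ID
        "attack_technique": "T1003.006",     # OS Credential Dumping: DCSync
        "severity": "high",
        "tp_fp_ratio": None,                 # filled in from historical data
    },
    "triage_steps": [
        "Validate the source account is not an authorized DC or backup service",
        "Enrich with user and host context from the asset inventory",
        "Check for related lateral-movement alerts in the last 24 hours",
    ],
    "investigation_queries": {
        # illustrative pivot query, not tied to a specific SIEM syntax
        "siem": "event_id=4662 AND properties CONTAINS 'replication'",
    },
    "response_actions": {
        "containment": ["disable account", "isolate source host"],
        "escalation": "Engage IR if replication rights were recently granted",
    },
    "automation": {
        "soar_enrichment": ["threat_intel_lookup", "asset_inventory_lookup"],
    },
}

# A simple completeness check that a CI job could run on every playbook file
REQUIRED_SECTIONS = {"alert_context", "triage_steps", "investigation_queries",
                     "response_actions", "automation"}
assert REQUIRED_SECTIONS <= playbook.keys()
```

Keeping playbooks machine-readable makes the completeness check trivial to automate and lets SOAR tooling consume the `automation` section directly.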

Playbook Testing

Playbooks must be tested during purple team exercises, not just the detections they accompany. When a detection fires during a purple team session:

  1. Have a SOC analyst walk through the playbook in real time
  2. Identify steps that are unclear, incomplete, or impractical
  3. Measure the time from alert to completed triage
  4. Verify that investigation queries return useful results
  5. Test SOAR automation components end-to-end
  6. Update the playbook based on findings

Metrics and Maturity

Measuring purple team effectiveness requires both tactical metrics (per-exercise) and strategic metrics (program-level).

Tactical Metrics (Per Exercise)

  • Techniques tested — Number of ATT&CK techniques executed during the exercise
  • Initial detection rate — Percentage of techniques detected by existing rules before any tuning
  • Final detection rate — Percentage of techniques detected after rule development and tuning
  • Detections created — Number of new detection rules produced
  • Detections improved — Number of existing rules tuned for better accuracy
  • Mean time to detect (MTTD) — Average time from technique execution to alert generation
  • Mean time to develop detection — Average time to create a new detection rule during the exercise
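Several of these tactical metrics fall out of the same per-exercise dataset. As a sketch (assuming each technique execution is recorded as a tuple of technique ID, initial detection, final detection, and seconds-to-detect), the rates and MTTD can be computed like this:

```python
def exercise_metrics(results):
    """Compute per-exercise detection metrics.

    results: list of (technique_id, detected_initially, detected_finally,
             seconds_to_detect or None) tuples.
    """
    total = len(results)
    initial_rate = sum(1 for _, init, _, _ in results if init) / total
    final_rate = sum(1 for _, _, fin, _ in results if fin) / total
    # MTTD is averaged only over techniques that actually produced an alert
    times = [t for _, _, fin, t in results if fin and t is not None]
    mttd = sum(times) / len(times) if times else None
    return {
        "initial_detection_rate": initial_rate,
        "final_detection_rate": final_rate,
        "mttd_seconds": mttd,
    }

# Example exercise: one new detection built mid-session, one technique
# still undetected at the end
results = [
    ("T1003.006", False, True, 420),   # detected after rule development
    ("T1059.001", True, True, 90),     # caught by an existing rule
    ("T1021.002", False, False, None), # remains a coverage gap
]
metrics = exercise_metrics(results)
# initial rate 1/3, final rate 2/3, MTTD (420 + 90) / 2 = 255.0 seconds
```

Note the choice to exclude undetected techniques from MTTD: averaging in a timeout value would make MTTD depend on an arbitrary cutoff rather than on real alert latency.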

Strategic Metrics (Program-Level)

  • ATT&CK coverage percentage — Percentage of threat-relevant techniques with validated detections (tracked over time)
  • Detection regression rate — Percentage of previously validated detections that stop working between exercises
  • MTTD trend — Improvement in mean detection time across quarterly exercises
  • Detection-to-response ratio — Percentage of detections that have associated response playbooks
  • Purple team ROI — Calculated as: (value of detections created + risk reduction from gaps closed) / (cost of exercise time + tooling)
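The two coverage-oriented metrics above reduce to simple set arithmetic over ATT&CK technique IDs. A minimal sketch, assuming the program tracks validated detections and threat-relevant techniques as sets of technique IDs:

```python
def coverage_percentage(validated: set[str], threat_relevant: set[str]) -> float:
    """Percent of threat-relevant techniques with a validated detection."""
    return 100 * len(validated & threat_relevant) / len(threat_relevant)

def regression_rate(prev_validated: set[str], still_working: set[str]) -> float:
    """Percent of previously validated detections that have stopped working."""
    broken = prev_validated - still_working
    return 100 * len(broken) / len(prev_validated)

# Example program state (illustrative technique sets)
validated = {"T1003.006", "T1059.001", "T1021.002"}
threat_relevant = {"T1003.006", "T1059.001", "T1021.002", "T1566.001"}
print(coverage_percentage(validated, threat_relevant))  # 75.0

prev_validated = {"T1003.006", "T1059.001"}
still_working = {"T1059.001"}
print(regression_rate(prev_validated, still_working))   # 50.0
```

Tracking both numbers per quarter gives the trend lines the strategic metrics call for: coverage should climb while the regression rate stays low.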

Purple Team Maturity Model

Level 1 — Ad Hoc

  • Characteristics: No formal purple team program. Red and blue teams operate independently. Occasional knowledge sharing.
  • Activities: Informal post-engagement debriefs, ad hoc rule writing based on red team reports
  • Tools: Standalone red team tools, manual SIEM queries
  • Outcomes: Sporadic detection improvements, no tracking

Level 2 — Defined

  • Characteristics: Formal purple team exercises scheduled quarterly. Designated facilitator. ATT&CK used for scoping.
  • Activities: Guided purple team sessions, basic TAR cycle, manual documentation
  • Tools: Atomic Red Team, ATT&CK Navigator, shared spreadsheets
  • Outcomes: Measurable detection improvement per exercise, basic coverage tracking

Level 3 — Managed

  • Characteristics: Regular exercises with detection engineering workflow. SIGMA-based rule development. Metrics tracked.
  • Activities: Detection sprints, hypothesis-driven development, playbook creation, metric dashboards
  • Tools: SIGMA + pySigma, detection-as-code pipeline, SOAR integration, coverage dashboards
  • Outcomes: Consistent coverage growth, MTTD improvement trend, playbook coverage >50%

Level 4 — Optimized

  • Characteristics: Continuous automated testing supplements regular exercises. Detection regression monitoring.
  • Activities: BAS platform integration, CI/CD detection pipelines, automated regression testing, advanced tabletop-to-technical exercises
  • Tools: BAS platform (AttackIQ/SafeBreach), automated testing pipeline, Navigator API integration
  • Outcomes: >70% ATT&CK coverage for threat-relevant techniques, <24hr MTTD for critical techniques, <5% regression rate

Level 5 — Innovating

  • Characteristics: Purple team is embedded in security culture. Custom threat emulation, original research, community contribution.
  • Activities: Custom adversary simulation development, zero-day detection research, SIGMA rule community contributions, cross-org purple team exercises
  • Tools: Custom emulation frameworks, threat intelligence platform integration, automated threat-informed defense
  • Outcomes: >85% threat-relevant coverage, near-real-time detection for priority techniques, measurable risk reduction, industry-leading detection capability

Bringing It All Together

Purple teaming is not a single event — it is a continuous improvement methodology that transforms how an organization detects and responds to threats. The most effective programs combine:

  1. Structured exercises using the TAR cycle to systematically improve detection coverage
  2. Standardized tooling — Atomic Red Team for reproducible technique execution, SIGMA for portable detections
  3. Engineering discipline — Detection-as-code pipelines with version control, testing, and automated deployment
  4. Measured outcomes — ATT&CK Navigator heatmaps showing coverage growth, MTTD trending downward, and detection regression rates staying low
  5. Continuous validation — BAS platforms or automated testing pipelines ensuring detections continue working between exercises

The organizations that mature their purple team programs from ad hoc exercises to continuous, measured operations gain a decisive advantage: they do not just know their weaknesses, they systematically eliminate them.

For measuring and communicating these improvements to leadership, see Metrics & Reporting.