
Most of a red team engagement is not the dramatic part. The phishing email and the final objective bookend a campaign whose middle, the internal reconnaissance and lateral movement, eats the majority of the calendar, because that is where an operator moves slowly enough to stay below your detection thresholds. Understanding the methodology means understanding that pacing, not just the steps.
Red team methodology is a five-stage attack lifecycle (reconnaissance, initial access, establishing a foothold, lateral movement and privilege escalation, and actions on the objective) that mirrors how a real adversary works toward a goal. Each stage maps to MITRE ATT&CK tactics, which is what lets a defender translate 'the red team got in' into 'here are the specific techniques we failed to detect at each step'. Below, each stage is broken down at the methodology level, never as an operational playbook, with the detection opportunities a blue team should be building at each one.
The five stages are reconnaissance, initial access, establishing a foothold (command and control), lateral movement and privilege escalation, and actions on the objective with reporting. Together they form an iterative attack lifecycle that takes the team from zero knowledge of the target to a specific goal, while staying below the detection threshold. The visual below lays out the full lifecycle at a glance.
The stages are not strictly linear. Once inside, operators return to internal reconnaissance constantly, discovering new hosts and credentials that open new paths. The methodology is a loop, not a straight line, and it shares its backbone with the standard penetration testing process, extended for stealth and goal-orientation. The distinction that matters throughout: a pentest wants to map the whole maze; a red team only needs the one corridor that reaches the objective.
Reconnaissance is where the red team builds an external picture of the target before touching anything that triggers an alert. It splits into passive and active. Passive recon gathers intelligence without contacting the target directly: OSINT on employees and tech stack via LinkedIn and job postings, subdomain and asset discovery, leaked-credential checks, and certificate-transparency logs. Active recon probes the attack surface directly through port and service scanning of external infrastructure, and it is done carefully, because noisy scanning is itself a detectable event.
The output is a target map: which people are likely phishing targets, which services are exposed, and which technologies might offer a way in. In ATT&CK terms this stage corresponds to the Reconnaissance and Resource Development tactics. AI-assisted attack-surface discovery is increasingly part of this stage, continuously mapping exposed assets the way an agentic pentesting system would. For defenders, the hard lesson is that most of this activity never touches your monitoring and is therefore invisible, so reducing your external footprint matters far more than trying to detect the recon itself. A common client mistake shows up here early: treating the red team like a pentest and asking for a list of every exposed asset. That is a coverage question, and the team will ignore everything that does not lead toward the goal.
Initial access is the transition from outside to inside, and for most red teams it comes through people rather than raw exploits. The most common vector is spear-phishing (MITRE ATT&CK T1566, Phishing, with sub-techniques T1566.001 for a malicious attachment and T1566.002 for a link): a crafted message that delivers a payload or harvests credentials, which is why social engineering is so central to red teaming. Other paths include exploiting an exposed vulnerable service on the perimeter, abusing valid credentials from a breach dump (T1078, Valid Accounts), or, in some engagements, physical entry to plant a device.
The defensive value is high here, and the detections that SHOULD fire are concrete: the email gateway flagging the lure, EDR catching a macro spawning a child process, or an alert on the first beacon leaving the network. A red team that cannot phish its way in tells you your awareness program is working; one that lands a foothold on the first attempt tells you exactly where to invest. The goal is a single reliable entry point, not breadth, so the team needs only one user to click. Email security, multi-factor authentication, and user-awareness training are the controls that decide whether this stage succeeds at all.
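One of the detections named above, an Office macro spawning a child process, can be sketched as a simple parent/child check. This is a minimal illustration, not a production EDR rule: the `ProcessEvent` shape, the host names, and both process lists are assumptions made for the example, and a real rule would be written in your EDR's or SIEM's own query language.

```python
from typing import NamedTuple

# Assumed, illustrative lists; tune both to your own environment.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SUSPICIOUS_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe", "cscript.exe", "mshta.exe"}


class ProcessEvent(NamedTuple):
    """Hypothetical process-creation record (not any vendor's schema)."""
    host: str
    parent_image: str
    image: str


def basename(path: str) -> str:
    """Lower-cased file name of a Windows path."""
    return path.lower().rsplit("\\", 1)[-1]


def is_suspicious_spawn(event: ProcessEvent) -> bool:
    """True when an Office application launches a script host or shell."""
    return (
        basename(event.parent_image) in OFFICE_PARENTS
        and basename(event.image) in SUSPICIOUS_CHILDREN
    )


events = [
    # Word launching cmd.exe: the pattern the text says should alert.
    ProcessEvent("WS-042", r"C:\Program Files\Microsoft Office\WINWORD.EXE", r"C:\Windows\System32\cmd.exe"),
    # Explorer launching cmd.exe: ordinary user activity, should not alert.
    ProcessEvent("WS-042", r"C:\Windows\explorer.exe", r"C:\Windows\System32\cmd.exe"),
]

alerts = [e for e in events if is_suspicious_spawn(e)]
for alert in alerts:
    print(f"ALERT {alert.host}: {basename(alert.parent_image)} spawned {basename(alert.image)}")
```

Only the first event alerts. The point of running this against a red team's phishing payload is to learn whether the rule exists at all, and whether anyone acts on it when it fires.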
Once inside, the team establishes a foothold by setting up a command-and-control (C2) channel, then expands access by moving laterally and escalating privileges toward the objective. The C2 channel (T1071, Application Layer Protocol), run through frameworks like Cobalt Strike, Sliver, or Mythic, gives operators a persistent, controlled connection to the compromised host that is designed to blend into normal traffic. Persistence mechanisms ensure the foothold survives a reboot or logout. OPSEC governs every choice: operators throttle activity and prefer living-off-the-land techniques precisely because the blue team is watching for the noisy version.
From there the work becomes internal. The team runs BloodHound to map the domain (T1087 Account Discovery, T1069 Permission Groups Discovery), which renders the shortest path to high-value targets as a graph:
$ # BloodHound shortest-path result (abridged)
JDOE --MemberOf--> IT-SUPPORT
IT-SUPPORT --GenericAll--> SVC-BACKUP <- over-privileged service acct
SVC-BACKUP --AdminTo--> FIN-JUMP01 <- finance jump host
FIN-JUMP01 --HasSession--> treasury operator session
That single GenericAll edge collapses a multi-day pivot into one step. The team then harvests credentials from memory and the domain (Credential Access, T1003), abuses the service account to escalate, and pivots host to host using legitimate protocols (T1021, Remote Services) and reused valid accounts (T1078) to reach the segmented finance systems, territory covered in our Active Directory testing checklist. Each technique maps to a detection that SHOULD fire: a high-volume LDAP query from a workstation for the BloodHound sweep, an LSASS-access alert for credential dumping, an unusual-host logon for service-account reuse. Every one is a chance for the blue team to catch the operator mid-campaign rather than after the objective is reached.
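To see why one edge matters so much, the abridged path above can be modelled as a small directed graph and searched breadth-first. This is a plain-Python stand-in for illustration only, not BloodHound's own query interface; the final session hop is left out because the source does not name the account, and `shortest_path` is a hypothetical helper.

```python
from collections import deque

# Edges copied from the abridged output above: (source, relationship, target).
EDGES = [
    ("JDOE", "MemberOf", "IT-SUPPORT"),
    ("IT-SUPPORT", "GenericAll", "SVC-BACKUP"),
    ("SVC-BACKUP", "AdminTo", "FIN-JUMP01"),
]


def shortest_path(edges, start, goal):
    """Breadth-first search; returns a list of (source, rel, target) hops, or None."""
    adjacency = {}
    for src, rel, dst in edges:
        adjacency.setdefault(src, []).append((rel, dst))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


path = shortest_path(EDGES, "JDOE", "FIN-JUMP01")
for src, rel, dst in path:
    print(f"{src} --{rel}--> {dst}")

# Defensive what-if: remove the over-privileged GenericAll edge and search again.
remediated = [edge for edge in EDGES if edge[1] != "GenericAll"]
print(shortest_path(remediated, "JDOE", "FIN-JUMP01"))  # None
```

The second search returning `None` is the defender's version of the same finding: in this toy graph, revoking one permission removes the only route from a standard user to the finance jump host.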
Actions on the objective is the final stage, where the team demonstrates it can achieve the agreed goal, then proves and documents it without causing real harm. The objective is written as a verifiable flag before the engagement starts:
OBJECTIVE Reach the treasury share from a standard workstation.
FLAG Contents of \\fin-fs01\treasury\flag.txt
+ screenshot of the payment-initiation screen.
FORBIDDEN Real funds movement, data destruction or encryption (T1485, T1486),
exfiltration of real customer records (T1041).
A real adversary at this point might detonate ransomware (T1486, Data Encrypted for Impact) or exfiltrate (T1041); the red team only proves it could, by moving a benign marker file rather than stealing real data. On a recent assessment of a regional bank, we reached this stage and captured the flag in under two days of active operation, yet the most valuable finding was that the only alert the entire campaign generated was the phishing email itself. Everything after initial access (the C2 beacon, the BloodHound storm, the credential dump) ran in silence.
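Writing the objective down as data makes the forbidden list checkable before an operator acts. The sketch below is one possible encoding, assuming a hypothetical `Objective` record and `is_permitted` helper. It treats each listed ATT&CK ID as a blanket block, which is stricter than the wording above (that forbids only real customer records and real funds movement), and it includes T1485, ATT&CK's Data Destruction ID, as an assumption alongside T1486.

```python
from typing import NamedTuple


class Objective(NamedTuple):
    """Hypothetical record of the agreed objective and its limits."""
    goal: str
    flag: tuple
    forbidden_techniques: frozenset


OBJECTIVE = Objective(
    goal="Reach the treasury share from a standard workstation.",
    flag=(
        r"\\fin-fs01\treasury\flag.txt",
        "screenshot of the payment-initiation screen",
    ),
    # T1485 Data Destruction (assumed), T1486 Data Encrypted for Impact,
    # T1041 Exfiltration Over C2 Channel.
    forbidden_techniques=frozenset({"T1485", "T1486", "T1041"}),
)


def is_permitted(technique_id: str, objective: Objective = OBJECTIVE) -> bool:
    """True when neither the technique nor its parent technique is forbidden."""
    parent = technique_id.split(".", 1)[0]
    return (
        technique_id not in objective.forbidden_techniques
        and parent not in objective.forbidden_techniques
    )


print(is_permitted("T1021.002"))  # True: Remote Services is in scope
print(is_permitted("T1486"))      # False: encryption for impact is forbidden
```

A check like this does not replace the rules of engagement; it only makes one part of them machine-readable, so a planning tool can refuse a forbidden step instead of relying on memory.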
Reporting is where that silence becomes value. A strong red team report is not a vulnerability list; it is an attack narrative with a timeline, every action mapped to MITRE ATT&CK, a clear account of what blue detected and missed, and prioritized recommendations. The debrief, ideally run as a purple team session, turns the operation into concrete detection improvements. See our guide on the types of penetration testing for how this fits a broader program.
You measure a red team by detection and response, stage by stage, not by whether the flag fell. Three numbers carry the verdict, each with a formula. Dwell time = (first detection) minus (initial access). Detection rate = (techniques that fired an alert) divided by (techniques executed). MTTR (mean time to respond) = (containment) minus (first detection). Reading detection rate per stage tells you exactly where you are blind: if it is high at initial access but collapses during lateral movement, your perimeter is louder than your interior, which is the most common pattern we see.
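The three formulas can be applied directly to an engagement timeline. The sketch below assumes a hypothetical log format of (stage, technique, executed at, alerted at) tuples, with invented timestamps; only the formulas come from the text.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical engagement log: (stage, technique, executed_at, alerted_at or None).
LOG = [
    ("initial access", "T1566.001", datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 9, 10)),
    ("foothold", "T1071", datetime(2024, 3, 4, 9, 30), None),
    ("lateral movement", "T1087", datetime(2024, 3, 4, 14, 0), None),
    ("lateral movement", "T1003", datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 5, 10, 20)),
    ("lateral movement", "T1021", datetime(2024, 3, 5, 11, 0), None),
]
CONTAINED_AT = datetime(2024, 3, 5, 12, 10)  # hypothetical containment time


def summarize(log, contained_at):
    """Compute dwell time, detection rate, MTTR, and per-stage detection rate."""
    initial_access = min(executed for _, _, executed, _ in log)
    alerts = [alerted for _, _, _, alerted in log if alerted is not None]
    first_detection = min(alerts) if alerts else None

    per_stage = defaultdict(lambda: [0, 0])  # stage -> [alerted, executed]
    for stage, _, _, alerted in log:
        per_stage[stage][1] += 1
        if alerted is not None:
            per_stage[stage][0] += 1

    return {
        "dwell_time": (first_detection - initial_access) if first_detection else None,
        "detection_rate": len(alerts) / len(log),
        "mttr": (contained_at - first_detection) if first_detection and contained_at else None,
        "rate_by_stage": {s: hit / total for s, (hit, total) in per_stage.items()},
    }


summary = summarize(LOG, CONTAINED_AT)
print("dwell time:", summary["dwell_time"])
print("detection rate:", summary["detection_rate"])
print("MTTR:", summary["mttr"])
for stage, rate in summary["rate_by_stage"].items():
    print(f"  {stage}: {rate:.2f}")
```

With this invented data the overall detection rate is 0.4, but the per-stage view is what carries the lesson: initial access is fully detected while lateral movement alerts on only one technique in three.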
In our experience the most frequent mistake is treating the engagement like a pentest and scoring it by findings count. A red team has few findings by design; its value is the timeline and the conversion of missed techniques into durable detections. Track the share of gaps closed within 30 days of each engagement, and run threat-led tests under frameworks like TIBER-EU or CBEST, or under the threat-led penetration testing requirements of DORA, when a regulator requires the scenario to mirror real adversaries. That feedback loop, not the flag, is what makes the next attacker's job harder.
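The 30-day closure share is a simple ratio once each gap has a reported date and a closed date. The gap register below is hypothetical, as is the choice to count still-open gaps in the denominator.

```python
from datetime import date, timedelta

# Hypothetical gap register: (technique, reported_on, closed_on or None if still open).
GAPS = [
    ("T1071", date(2024, 3, 8), date(2024, 3, 20)),
    ("T1087", date(2024, 3, 8), date(2024, 4, 30)),
    ("T1003", date(2024, 3, 8), None),
    ("T1021", date(2024, 3, 8), date(2024, 4, 7)),
]


def closure_share(gaps, window_days=30):
    """Fraction of all reported gaps closed within the window (open gaps count as misses)."""
    if not gaps:
        return 0.0
    window = timedelta(days=window_days)
    closed_in_time = sum(
        1
        for _, reported, closed in gaps
        if closed is not None and closed - reported <= window
    )
    return closed_in_time / len(gaps)


print(f"Closed within 30 days: {closure_share(GAPS):.0%}")
```

Counting open gaps as misses is deliberate here: excluding them would let the number improve simply by never closing the hard ones.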