
Social Engineering Penetration Testing Guide

Likhil Chekuri | December 21, 2025 | 8 min read

Table of Contents

  • What is social engineering penetration testing?
  • What are the main social engineering attack vectors?
  • How does social engineering map to MITRE ATT&CK?
  • What tools are used for social engineering tests?
  • Why do rules of engagement matter so much?
  • How do you measure social engineering results?
  • Frequently asked questions
  • Sources and references


TL;DR

  • Social engineering pentesting tests people and process, simulating phishing, vishing, smishing, and physical pretexting under written authorization.
  • It maps to the MITRE ATT&CK initial-access tactic, the same foothold a real adversary uses before any technical exploitation.
  • Tools include GoPhish and King Phisher for campaigns, Evilginx2 for MFA-bypassing reverse-proxy phishing, and SET for payloads.
  • Tight rules of engagement (scope, exclusions, safe words, data handling) keep the test legal and ethical.
  • Measure click rate, credential-submit rate, and crucially the report rate, since reporting speed reflects real resilience.

Social engineering penetration testing measures whether your people and processes hold up against the techniques attackers actually use to get in. Most breaches still begin with a human action (a clicked link, a convincing phone call, or someone holding a door open), not with a zero-day. A test that ignores the human layer therefore ignores the most common entry point.

This guide covers the four channels (phishing, vishing, smishing, and physical), the tools that run them, how to write rules of engagement that keep the test ethical and legal, and the metrics that tell you whether the organization is improving. It treats social engineering as one piece of a broader offensive program, closely tied to red team work.

What is social engineering penetration testing?

Social engineering penetration testing is an authorized assessment that uses deception to test whether employees and processes can be manipulated into granting access, revealing credentials, or performing harmful actions. It targets the human layer instead of (or before) the technical one, simulating the methods a real attacker uses to gain an initial foothold.

It is usually one component of a wider engagement. On its own it produces a phishing or pretexting assessment; combined with technical exploitation and post-compromise activity it becomes a full red team assessment. Either way it complements rather than replaces technical testing across the types of penetration testing an organization runs.

What are the main social engineering attack vectors?

There are four primary channels, and a good program tests more than one because attackers do. Phishing (email) is the most common and most scalable. Vishing (voice) uses a phone call to extract information or push an action, often impersonating IT or a vendor. Smishing (SMS) exploits the trust and urgency of text messages. Physical social engineering puts an assessor on-site to tailgate, drop devices, or talk past reception.

Each relies on the same psychological levers: authority, urgency, fear, and familiarity. The channel changes the delivery and the controls you are testing, but the pretext (the believable story behind the attack) is what makes any of them work.

  • Phishing: bulk or targeted email, including spear-phishing of specific roles.
  • Vishing: phone-based pretexting against helpdesks and individuals.
  • Smishing: SMS lures, often for credential capture or malicious links.
  • Physical: tailgating, badge cloning, USB drops, and pretext visits.

Social engineering channels at a glance

Channel  | Delivery         | Primary control tested                | Typical tool
Phishing | Email lure       | Filtering, user reporting, MFA        | GoPhish, Evilginx2
Vishing  | Phone call       | Helpdesk identity verification        | Pretext + caller ID spoof
Smishing | SMS message      | Mobile awareness, link filtering      | Custom SMS gateway
Physical | On-site presence | Access control, reception, tailgating | Badge cloner, pretext kit

How does social engineering map to MITRE ATT&CK?

Social engineering primarily maps to the MITRE ATT&CK initial-access tactic, which covers how adversaries get their first foothold. Phishing (T1566) and its sub-techniques (spearphishing attachment, link, via service, and voice) are the headline entries, and they are among the most-used techniques in real intrusions.

Framing your test against ATT&CK does two things. It lets the blue team measure detection and response against the exact techniques you simulated, and it connects the human entry point to everything that follows: lateral movement, privilege escalation, and exfiltration. That continuity is why social engineering belongs in an adversary-emulation program rather than as a standalone stunt, and why teams increasingly run it continuously, as we describe in our guide to agentic pentesting.
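
One way to make that mapping concrete is to tag each simulated activity with an ATT&CK technique ID, so the blue team can report detection coverage per technique. The sketch below is illustrative only: the activity names, the `SimulatedEvent` record, and the `coverage_by_technique` helper are hypothetical, and only the T1566 sub-technique IDs come from ATT&CK itself.

```python
from collections import defaultdict
from dataclasses import dataclass

# Sub-techniques of Phishing (T1566) in ATT&CK Enterprise.
# The activity names on the left are made up for this example.
ACTIVITY_TO_TECHNIQUE = {
    "email_attachment": "T1566.001",     # Spearphishing Attachment
    "email_link": "T1566.002",           # Spearphishing Link
    "third_party_service": "T1566.003",  # Spearphishing via Service
    "voice_call": "T1566.004",           # Spearphishing Voice
}


@dataclass(frozen=True)
class SimulatedEvent:
    activity: str   # key into ACTIVITY_TO_TECHNIQUE
    detected: bool  # did the blue team detect it or receive a user report?


def coverage_by_technique(events):
    """Return {technique_id: (detected_count, total_count)}."""
    counts = defaultdict(lambda: [0, 0])
    for event in events:
        # Activities without a mapping are kept visible rather than dropped.
        technique = ACTIVITY_TO_TECHNIQUE.get(event.activity, "unmapped")
        counts[technique][1] += 1
        if event.detected:
            counts[technique][0] += 1
    return {tech: (det, total) for tech, (det, total) in counts.items()}


events = [
    SimulatedEvent("email_link", True),
    SimulatedEvent("email_link", False),
    SimulatedEvent("voice_call", False),
    SimulatedEvent("usb_drop", True),  # no mapping in this sketch
]
coverage = coverage_by_technique(events)
```

Keeping an explicit "unmapped" bucket is a deliberate choice here: it shows which simulated activities still need a technique assigned before the results are handed to the blue team.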

What tools are used for social engineering tests?

The toolset depends on the channel. For email phishing campaigns, GoPhish and King Phisher handle template design, sending, landing pages, and per-target tracking of opens, clicks, and submissions. The Social-Engineer Toolkit (SET) generates payloads and cloned login pages for more hands-on attacks.

When the target has multi-factor authentication, Evilginx2 is the tool that matters: it acts as a reverse proxy between the victim and the real login page, capturing both credentials and the authenticated session cookie, which defeats most app-based and SMS MFA. That capability is exactly why phishing-resistant authentication (FIDO2/WebAuthn) is the recommended fix in nearly every report.
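
Part of the reason FIDO2/WebAuthn resists this attack is origin binding: the browser records the origin it is actually talking to in the client data that the authenticator's signature covers, so an assertion produced on a proxy's lookalike domain does not verify at the real site. The following is a minimal sketch of that one server-side check, not a full WebAuthn verification (signature, RP ID hash, and counter checks are omitted). The function name and the example origins are hypothetical.

```python
import hmac
import json


def check_client_data(client_data_json: bytes, expected_origin: str,
                      expected_challenge: str) -> bool:
    """Check the type, challenge, and origin fields of WebAuthn clientDataJSON."""
    data = json.loads(client_data_json)
    challenge = str(data.get("challenge", ""))
    return (
        data.get("type") == "webauthn.get"
        # Constant-time comparison of the challenge issued by the server.
        and hmac.compare_digest(challenge.encode(), expected_challenge.encode())
        # The browser, not the page, fills in the origin.
        and data.get("origin") == expected_origin
    )


challenge = "c2FtcGxlLWNoYWxsZW5nZQ"  # example value only

# Client data as the browser would build it on the genuine site.
legit = json.dumps({
    "type": "webauthn.get",
    "challenge": challenge,
    "origin": "https://login.example.com",
}).encode()

# Client data as the browser would build it on a reverse proxy's domain.
proxied = json.dumps({
    "type": "webauthn.get",
    "challenge": challenge,
    "origin": "https://login.examp1e.com",
}).encode()

legit_ok = check_client_data(legit, "https://login.example.com", challenge)
proxied_ok = check_client_data(proxied, "https://login.example.com", challenge)
```

Because the proxy cannot alter the signed client data without invalidating the signature, relaying the assertion to the real site fails this check even when the victim cooperates fully.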

  • GoPhish / King Phisher: campaign management, tracking, and metrics.
  • Evilginx2: reverse-proxy phishing that captures MFA session tokens.
  • SET: payloads, cloned pages, and quick attack scaffolding.
  • Supporting recon: OSINT for targets, lookalike domains, and pretext detail.
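
The lookalike-domain part of recon can be sketched in a few lines. The example below generates candidate names by character omission, adjacent transposition, and a small homoglyph table, which is useful both for choosing a domain for an authorized campaign and for checking which names an attacker could register. It is a simplified illustration under stated assumptions: the function name and the homoglyph subset are hypothetical, and the input is assumed to be a registrable domain without subdomains.

```python
# Small illustrative subset of visually similar characters.
HOMOGLYPHS = {"o": "0", "l": "1", "i": "1", "e": "3"}


def lookalike_candidates(domain: str) -> set[str]:
    """Return simple typo and homoglyph variants of a domain's first label."""
    label, _, tld = domain.partition(".")
    variants = set()

    # Omission: drop one character at a time.
    for i in range(len(label)):
        variants.add(label[:i] + label[i + 1:])

    # Transposition: swap each pair of adjacent characters.
    for i in range(len(label) - 1):
        variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])

    # Homoglyph substitution: replace one character with a lookalike.
    for i, ch in enumerate(label):
        if ch in HOMOGLYPHS:
            variants.add(label[:i] + HOMOGLYPHS[ch] + label[i + 1:])

    variants.discard(label)  # swapping identical letters yields the original
    variants.discard("")     # a one-character label has no omission variant
    return {f"{variant}.{tld}" for variant in variants}


candidates = lookalike_candidates("example.com")
```

Any domain actually registered or used from such a list should be named in the signed scope before the campaign starts.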

Why do rules of engagement matter so much?

Rules of engagement matter because social engineering targets people, which creates legal, ethical, and HR risk that technical testing usually does not. Written authorization is non-negotiable, and the ROE must define exactly what is in scope, what is off-limits, and how the assessment protects the individuals involved.

At minimum, agree on: the in-scope channels and target groups, explicit exclusions (no threats, no real harm, no exploiting personal crises), the physical-test safe word and emergency contacts, how captured credentials and PII are stored and destroyed, and a clear get-out-of-jail letter for on-site assessors. The aim is to measure the organization, never to punish an individual, so findings should be reported in aggregate.

  • Signed authorization and a physical get-out-of-jail letter.
  • Defined scope, target groups, and hard exclusions.
  • Safe word and live emergency contacts for physical work.
  • Strict handling and destruction of captured credentials and PII.

Pre-engagement rules of engagement

Authorization
  • Signed scope and consent from leadership
  • Physical get-out-of-jail letter
  • Named emergency contacts and safe word

Boundaries
  • Defined target groups and exclusions
  • No threats, coercion, or real harm
  • Aggregate (not individual) reporting

Data handling
  • Encrypted storage of captured credentials
  • Defined retention and destruction
  • PII minimization throughout
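
Some teams encode the agreed scope in a machine-checkable form so campaign tooling can refuse anything the ROE does not cover. The sketch below shows one possible shape for that. Every name, field, address, and date in it is hypothetical, and it complements the signed document rather than replacing it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RulesOfEngagement:
    channels: frozenset         # in-scope channels, e.g. {"phishing"}
    target_groups: frozenset    # in-scope groups, e.g. {"finance"}
    excluded_people: frozenset  # individuals who must never be targeted
    start: date                 # first authorized day
    end: date                   # last authorized day
    retention_days: int         # how long captured data may be kept

    def permits(self, channel: str, group: str, person: str, on: date) -> bool:
        """True only if every scope condition holds for this contact attempt."""
        return (
            channel in self.channels
            and group in self.target_groups
            and person not in self.excluded_people
            and self.start <= on <= self.end
        )

    def destroy_by(self) -> date:
        """Latest date by which captured credentials and PII must be destroyed."""
        return self.end + timedelta(days=self.retention_days)


roe = RulesOfEngagement(
    channels=frozenset({"phishing", "vishing"}),
    target_groups=frozenset({"finance", "helpdesk"}),
    excluded_people=frozenset({"on.leave@example.com"}),
    start=date(2026, 1, 12),
    end=date(2026, 1, 23),
    retention_days=30,
)

in_scope = roe.permits("phishing", "finance", "a@example.com", date(2026, 1, 15))
wrong_channel = roe.permits("smishing", "finance", "a@example.com", date(2026, 1, 15))
excluded = roe.permits("phishing", "finance", "on.leave@example.com", date(2026, 1, 15))
too_late = roe.permits("phishing", "finance", "a@example.com", date(2026, 2, 1))
deadline = roe.destroy_by()
```

Calling `permits` before every send or call turns an out-of-scope contact into a hard failure instead of a judgment made mid-engagement.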

How do you measure social engineering results?

Measure the funnel, not just the failures. For phishing, the core metrics are delivery rate, open rate, click rate, and credential-submit rate, which tell you how far an attack progressed. But the most important number is the report rate: how many people flagged the message to security, and how fast.

A high click rate with a fast, high report rate is a healthier outcome than a low click rate and near-zero reporting, because real resilience is detection and response, not perfection. Track trends across repeated tests rather than obsessing over one campaign, and feed results into targeted training and into technical fixes like phishing-resistant MFA and better email filtering. Tie remediation back into the same queue you use for findings across your other network and infrastructure tests.
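
The funnel above can be computed from per-target results in a few lines. This is a sketch under stated assumptions: the `TargetResult` record and its field names are hypothetical, and every rate after delivery uses delivered messages as the denominator, which is one common convention rather than a fixed standard.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class TargetResult:
    delivered: bool
    opened: bool = False
    clicked: bool = False
    submitted: bool = False
    minutes_to_report: Optional[float] = None  # None if never reported


def funnel(results):
    """Compute phishing funnel rates plus the median time to report."""
    def rate(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sent = len(results)
    delivered = sum(r.delivered for r in results)
    report_times = [r.minutes_to_report for r in results
                    if r.minutes_to_report is not None]
    return {
        "delivery_rate": rate(delivered, sent),
        "open_rate": rate(sum(r.opened for r in results), delivered),
        "click_rate": rate(sum(r.clicked for r in results), delivered),
        "submit_rate": rate(sum(r.submitted for r in results), delivered),
        "report_rate": rate(len(report_times), delivered),
        "median_minutes_to_report": median(report_times) if report_times else None,
    }


results = [
    TargetResult(delivered=True, opened=True, clicked=True, submitted=True),
    TargetResult(delivered=True, opened=True, clicked=True, minutes_to_report=12),
    TargetResult(delivered=True, opened=True, minutes_to_report=4),
    TargetResult(delivered=True),
    TargetResult(delivered=False),
]
metrics = funnel(results)
```

Running the same function over each campaign and comparing the outputs gives the trend view, with `report_rate` and `median_minutes_to_report` as the figures to watch.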

Frequently asked questions

What is social engineering penetration testing?
It is an authorized security assessment that uses deception to test whether employees and processes can be tricked into granting access, sharing credentials, or taking harmful actions. It targets the human layer through phishing, vishing, smishing, and physical pretexting, simulating how real attackers gain an initial foothold.

What is the difference between phishing, vishing, and smishing?
Phishing is delivered by email, vishing by phone call, and smishing by SMS text message. All three use the same psychological levers (authority, urgency, fear, familiarity), but each tests different controls: email filtering and reporting for phishing, helpdesk verification for vishing, and mobile awareness for smishing.

What tools are used for social engineering testing?
GoPhish and King Phisher run and track phishing campaigns, the Social-Engineer Toolkit (SET) generates payloads and cloned pages, and Evilginx2 performs reverse-proxy phishing that captures MFA session cookies. OSINT tools and lookalike domains support recon and pretext development before a campaign launches.

Can social engineering bypass multi-factor authentication?
Yes. Reverse-proxy phishing tools like Evilginx2 sit between the victim and the real login page and capture both the password and the authenticated session cookie, which defeats most app-based and SMS-based MFA. The durable fix is phishing-resistant authentication such as FIDO2/WebAuthn authenticators (for example, hardware security keys).

How is a social engineering test kept legal and ethical?
Through signed rules of engagement that define scope, target groups, and explicit exclusions, plus a get-out-of-jail letter for physical work and a safe word with emergency contacts. Captured credentials and PII are encrypted, minimized, and destroyed on a set schedule, and results are reported in aggregate so no individual is punished.

What metrics matter in a phishing assessment?
Delivery, open, click, and credential-submit rates show how far an attack progressed, but the report rate (how many people flagged the message, and how quickly) is the most telling. Strong detection and reporting beat a low click rate with no reporting, because resilience is about response, not perfection.

How does social engineering relate to a red team assessment?
Social engineering is often the initial-access phase of a broader red team engagement. On its own it produces a phishing or pretexting assessment; combined with technical exploitation, lateral movement, and exfiltration it becomes full adversary emulation. It maps to the MITRE ATT&CK initial-access tactic, primarily phishing (T1566).

Sources and references

  • MITRE ATT&CK: Phishing (T1566)
  • GoPhish Open-Source Phishing Framework
  • Evilginx2
  • CISA: Avoiding Social Engineering and Phishing Attacks
Likhil Chekuri
Application Security Engineer, Strobes
Likhil Chekuri is an AppSec engineer at Strobes who has run hundreds of web, mobile, and cloud penetration tests for regulated industries.
Tags: Social Engineering, Red Team, Offensive Security
