
A founder once forwarded us their auditor's email two weeks before fieldwork closed: "Please provide the penetration test performed during the review period." They did not have one. They had assumed SOC 2 would tell them if they needed it, and SOC 2 never did. That is the trap. If you search the AICPA Trust Services Criteria for the words "penetration test," you will get zero hits, yet an absent or out-of-period pentest is one of the most common reasons a Type II report lands with a qualification or a late scramble.
This post is precise about what the criteria say versus what auditors do. You will see which criteria a pentest actually evidences, why Type II raises the stakes, how to scope a SaaS system correctly, what your auditor's document-request list looks like in practice, and how to map findings to criteria so the report reads as control evidence rather than a vulnerability dump.
Strictly speaking, SOC 2 does not require a penetration test. SOC 2 is an attestation against the AICPA Trust Services Criteria (TSC), and none of those criteria name penetration testing as a control. What SOC 2 requires is that you define controls meeting the criteria you scoped in, then show they operate.
The distinction is load-bearing. You choose your controls and you choose how to satisfy each criterion; the auditor evaluates whether the evidence is sufficient. A pentest happens to be the most credible single artifact for several criteria, which is why its absence draws fire. Skip it and your auditor will ask how else you detect exploitable vulnerabilities and validate access controls, then judge whether that answer holds. "We run a weekly Nessus scan" rarely survives that conversation, because a scan and a test answer different questions. The honest framing: SOC 2 does not mandate a pentest, but it is very hard to evidence CC4.1 and the CC7 series without one.
One mental shift fixes most SOC 2 testing mistakes: the pentest is evidence for your controls, not the thing under audit. The auditor is not grading the pentester. They are checking whether the controls you claimed under CC4.1 and CC7.x produced and acted on evidence during the period. The pentest is that evidence.
The criteria the test actually maps to:
The strongest reports tag every finding to the criterion it touches so the auditor does not have to infer it. For the line between a real test and a scan, see penetration testing vs vulnerability scanning and automated vs manual penetration testing.
Type II grades operating effectiveness across time, not design at an instant. A Type I report attests that controls are suitably designed at a single date; a Type II attests they operated effectively over a window, usually three to twelve months. A pentest carries far more weight in a Type II, and its date has to fall inside that window.
For Type I, control design plus a recent pentest as supporting evidence is often enough. For Type II, the auditor samples effectiveness across the whole period, so a dated, in-period pentest report becomes a concrete artifact proving your detection control ran and produced action. The scheduling detail trips up more teams than the requirement. If your window is January through December and you test in November, you have almost no runway to remediate criticals and retest before the report closes, leaving visible open findings the auditor must note. Run it in the first third of the window so you can fix, retest, and show a clean loop. Most Type II shops also keep a lighter continuous cadence across the window so the annual pentest is not the only dated evidence in the file.
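The scheduling arithmetic is easy to check before you book the engagement. The sketch below is a minimal illustration, not an auditor's rule: the function name `pentest_timing` and its return keys are hypothetical, and "first third" is computed by simple integer division of the window's length in days.

```python
from datetime import date, timedelta


def pentest_timing(period_start: date, period_end: date, test_end: date) -> dict:
    """Check where a pentest's end date falls within a Type II review period."""
    in_period = period_start <= test_end <= period_end
    period_days = (period_end - period_start).days + 1
    # Last day of the first third of the window (integer division, inclusive).
    first_third_end = period_start + timedelta(days=period_days // 3 - 1)
    return {
        "in_period": in_period,
        "in_first_third": in_period and test_end <= first_third_end,
        # Days left to remediate and retest before the period closes.
        "runway_days": (period_end - test_end).days if in_period else None,
    }


# Example: a January-December 2026 window, tested in February vs. November.
start, end = date(2026, 1, 1), date(2026, 12, 31)
early = pentest_timing(start, end, date(2026, 2, 14))
late = pentest_timing(start, end, date(2026, 11, 20))
print(early)
print(late)
```

The February test leaves 320 days to fix and retest; the November test leaves 41, which is the thin runway described above.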
Scope follows your system description boundary, full stop. That is the same boundary the report attests to: the in-scope production application, its APIs, the supporting cloud infrastructure, and the authentication and authorization paths. Corporate IT outside the system description stays out.
The single most common readiness mistake is pointing the test at the whole company (marketing site, internal HR tools, employee laptops) instead of the in-scope environment. That wastes budget on findings the auditor does not care about while the systems that matter get a thinner test. Make the boundary explicit before kickoff. A scope statement that the auditor and tester both sign off on looks like this:
    SOC 2 Type II Pentest - Scope Statement (Acme SaaS)

    IN SCOPE
      app.acme.io              multi-tenant web app (prod)
      api.acme.io              REST + GraphQL, all authn paths
      tenant isolation         cross-tenant IDOR / data-segregation tests
      IAM / RBAC layer         privilege escalation, role boundaries
      AWS account 4471xxxx     prod VPC, S3, RDS reachable from app tier

    OUT OF SCOPE
      www.acme.io              marketing WordPress (not in system desc.)
      office Wi-Fi / laptops   corporate IT, outside the boundary

    WINDOW                     2026-02-01 to 2026-02-14 (review period Jan-Dec 2026)

That maps every in-scope item to something the system description actually covers. Multi-tenant isolation is the criterion-relevant part most teams underweight: a single cross-tenant read is a CC6.1 failure that reads far worse in a report than a missing security header. For setting cadence beyond the annual minimum and prepping the engagement, see how to prepare for a penetration test and our overview of penetration testing types and process.
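One way to keep the boundary explicit during the engagement is to check each proposed target against the signed scope before testing it. This is a minimal sketch under assumptions: the hostnames come from the example scope statement above, while the function `classify_target` and its return strings are hypothetical. Anything not listed is flagged for confirmation rather than silently tested.

```python
from urllib.parse import urlsplit

# Hostnames taken from the example scope statement (Acme SaaS).
IN_SCOPE_HOSTS = {"app.acme.io", "api.acme.io"}
OUT_OF_SCOPE_HOSTS = {"www.acme.io"}


def classify_target(target: str) -> str:
    """Classify a URL or bare hostname against the agreed scope."""
    # urlsplit only extracts a hostname when a scheme or leading "//" is present.
    if "://" not in target:
        target = "//" + target
    host = urlsplit(target).hostname  # lowercased by urlsplit, or None
    if host in IN_SCOPE_HOSTS:
        return "in scope"
    if host in OUT_OF_SCOPE_HOSTS:
        return "out of scope"
    return "unlisted: confirm with auditor and tester"


for candidate in ("https://api.acme.io/graphql", "www.acme.io", "hr.acme.io"):
    print(candidate, "->", classify_target(candidate))
```

Matching on exact hostnames is deliberately strict; a real engagement would also cover the cloud account and non-HTTP assets, which this sketch does not model.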
An auditor asks for a dated report inside the review period, the scope it covered, the methodology behind it, severity-rated findings, and proof you remediated or formally accepted each risk. The request list during Type II fieldwork is predictable, so prepare it as a bundle:
    SOC 2 evidence request - penetration testing

    [1] Executive summary + full technical report (PDF)
    [2] Scope document / SOW: in-scope vs out-of-scope assets
    [3] Methodology reference (PTES / NIST 800-115 / OWASP WSTG)
    [4] Findings register with CVSS severity, dated
    [5] Remediation tickets (Jira/Linear): owner + fix date
    [6] Retest letter confirming criticals/highs closed
    [7] Tester independence attestation

We have watched auditors push back hardest on two things. First, a report dated outside the window, which fails the "operated effectively during the period" test for Type II. Second, a stack of high-severity findings with no evidence of action, because CC4.1 is about controls functioning and an unremediated critical is a control that did not function. The Jira tickets showing triage, owner, and fix date are often worth more to the auditor than the report itself. Item 5 closes the loop that item 1 opens.
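A quick self-check before fieldwork is to scan the findings register for loops that are still open. The sketch below is illustrative only: the `Finding` fields, the `open_loops` helper, the ticket IDs, and the rule that criticals and highs need both a fix date and a retest are assumptions modelled on the request list above, not a format any auditor prescribes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Finding:
    finding_id: str
    severity: str                    # "critical", "high", "medium", or "low"
    ticket: Optional[str] = None     # remediation ticket key (hypothetical IDs)
    fixed_on: Optional[date] = None
    retested: bool = False
    risk_accepted: bool = False      # formally accepted instead of fixed


def open_loops(findings: list) -> list:
    """Return (finding_id, reason) pairs an auditor is likely to question."""
    gaps = []
    for f in findings:
        if f.risk_accepted:
            continue  # a documented acceptance closes the loop
        if f.ticket is None:
            gaps.append((f.finding_id, "no ticket or risk acceptance"))
        elif f.severity in {"critical", "high"} and not (f.fixed_on and f.retested):
            gaps.append((f.finding_id, "critical/high not fixed and retested"))
    return gaps


register = [
    Finding("F-01", "critical", ticket="SEC-101", fixed_on=date(2026, 3, 2), retested=True),
    Finding("F-02", "high", ticket="SEC-102"),
    Finding("F-03", "medium"),
    Finding("F-04", "low", risk_accepted=True),
]
for finding_id, reason in open_loops(register):
    print(finding_id, reason)
```

Anything this returns is worth resolving, or formally accepting, before the evidence bundle goes to the auditor.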
Treat the report as control evidence and tag each finding to the criterion it proves or disproves. An authentication bypass is CC6.1 evidence; a silent SIEM during lateral movement is CC7.2 evidence; the act of running the test on schedule and tracking fixes is CC4.1 evidence. Done well, the mapping turns a vulnerability list into an attestation artifact.
Use a recognized methodology so the report is defensible: OWASP WSTG and ASVS for web apps, the OWASP API Security Top 10 for APIs, and PTES or NIST SP 800-115 for engagement structure. Score with CVSS, and add EPSS or CISA KEV context when you want to justify why a medium got fixed before a high. The mapping below is the kind of excerpt that reads as senior work to an auditor.
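As a rough sketch of what such a mapping can look like in data form, the example below tags findings to criteria and orders fixes with KEV context ahead of raw CVSS. Everything in it is hypothetical: the finding titles, CVSS scores, the `in_kev` flag, and the helper names are invented for illustration, and the criterion tags follow the examples given earlier (authentication bypass and cross-tenant read under CC6.1, silent SIEM under CC7.2).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    title: str
    cvss: float                  # CVSS base score, 0.0-10.0 (illustrative values)
    criterion: Optional[str]     # TSC tag, or None when no clean mapping exists
    in_kev: bool = False         # assumed flag: listed in CISA KEV


def by_criterion(findings: list) -> dict:
    """Group finding titles under the criterion they evidence."""
    groups: dict = {}
    for f in findings:
        groups.setdefault(f.criterion or "untagged", []).append(f.title)
    return groups


def fix_order(findings: list) -> list:
    """Known-exploited issues first, then descending CVSS."""
    return sorted(findings, key=lambda f: (not f.in_kev, -f.cvss))


findings = [
    Finding("Authentication bypass on login", 9.1, "CC6.1"),
    Finding("Cross-tenant read via IDOR", 8.1, "CC6.1"),
    Finding("No SIEM alert during lateral movement", 6.5, "CC7.2"),
    Finding("Known-exploited flaw in a dependency", 5.9, None, in_kev=True),
    Finding("Missing security header", 3.1, None),
]

print(by_criterion(findings))
print([f.title for f in fix_order(findings)])
```

Here the medium-severity, known-exploited finding sorts ahead of the 9.1 critical, which is the kind of ordering decision KEV or EPSS context lets you justify in writing.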
One honest caveat: not every finding maps cleanly, and forcing a tenuous criterion tag is worse than leaving it untagged. Map what genuinely evidences a control, and let the rest stand as plain security findings. Continuous approaches such as agentic pentesting help you keep producing dated, in-period evidence across the whole window instead of betting the report on one annual snapshot.