
The fastest way to expose a weak pentest vendor is one question: which standard do you follow, and how do you cover what it leaves out? A strong answer names two or three standards and explains how they stack. A vague answer ("we use our own methodology") tells you the test will be ad hoc, and coverage will depend on whichever bug the tester happened to find interesting that week. We have inherited reports from exactly that kind of vendor: ten pages of nmap output, no test IDs, no mapping, no way to tell what was actually checked.
Professional testing is anchored to published standards so engagements are repeatable, defensible, and comparable. The four you will meet are PTES, OSSTMM, NIST SP 800-115, and the OWASP guides, and they are not competitors. They are different lenses: engagement process, measurable rigor, regulatory baseline, and application coverage. This guide shows what each does, how they combine, and how to read a report that uses them.
Standards make testing repeatable and comparable. Without a methodology, two testers attacking the same target could deliver wildly different results and you would have no way to judge quality. A standard defines what gets tested, in what order, and to what depth, so coverage is not left to luck.
They also support compliance and protect you. Frameworks like SOC 2 and PCI DSS expect testing aligned to a recognized methodology, and a report that maps findings to a named standard is far easier to defend in an audit. When a finding cites a specific test ID, an engineer can look up exactly what was checked and how. That traceability is the difference between a report and a receipt.
There is a second, quieter benefit: standards tell you what was not tested. A report mapped to the WSTG can show which test IDs passed, which failed, and which were out of scope, so a quiet area reads as deliberately untested rather than silently skipped. Without a methodology you have no idea whether a clean section means secure or never looked at. That distinction is exactly what an auditor, and an honest engineering lead, needs to know.
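The coverage idea above can be sketched in a few lines of Python. This is a minimal illustration, not a reporting tool: the scope list, the recorded results, and the `NOT TESTED` label are hypothetical values chosen for the example.

```python
from collections import Counter

# Hypothetical scope: the WSTG test IDs the engagement agreed to cover.
IN_SCOPE = ["WSTG-ATHZ-01", "WSTG-SESS-03", "WSTG-INPV-05", "WSTG-CRYP-01"]

# Hypothetical results recorded by the tester (illustrative statuses only).
RESULTS = {
    "WSTG-ATHZ-01": "PASS",
    "WSTG-SESS-03": "PASS",
    "WSTG-INPV-05": "FAIL",
}


def coverage(in_scope, results):
    """Return a status for every in-scope ID, marking gaps as NOT TESTED."""
    return {test_id: results.get(test_id, "NOT TESTED") for test_id in in_scope}


def summarize(status_by_id):
    """Count how many in-scope tests ended in each status."""
    return Counter(status_by_id.values())


status = coverage(IN_SCOPE, RESULTS)
summary = summarize(status)

for test_id, state in status.items():
    print(f"{test_id:<14} {state}")
```

The point of the sketch is that a test with no recorded result shows up explicitly as untested instead of disappearing from the report.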
PTES, the Penetration Testing Execution Standard, is the most complete end-to-end engagement framework. It defines seven sections: pre-engagement interactions, intelligence gathering, threat modeling, vulnerability analysis, exploitation, post-exploitation, and reporting. It is the standard most often mapped to the classic penetration testing phases. In practice you do not follow PTES line by line; you use its seven sections as a table of contents so nothing, especially scoping and reporting, gets skipped under time pressure.
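Using the seven sections as a table of contents can be as simple as a checklist. The sketch below assumes a hypothetical engagement log (`completed`) and reports which PTES sections have not been marked done; the function name and log format are illustrative, not part of PTES.

```python
# The seven PTES sections, in the order the standard lists them.
PTES_SECTIONS = [
    "pre-engagement interactions",
    "intelligence gathering",
    "threat modeling",
    "vulnerability analysis",
    "exploitation",
    "post-exploitation",
    "reporting",
]


def missing_sections(completed):
    """Return PTES sections not yet marked complete, in PTES order."""
    done = {name.lower() for name in completed}
    return [section for section in PTES_SECTIONS if section not in done]


# Hypothetical log from a rushed test that jumped straight to scanning.
completed = ["Intelligence Gathering", "Vulnerability Analysis", "Exploitation"]
gaps = missing_sections(completed)
print(gaps)
```

For this hypothetical log, the gaps include both scoping and reporting, the two sections most likely to be dropped when time is short.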
PTES is strong because it covers the whole lifecycle, including the business-side parts other standards skip entirely. The pre-engagement section alone (scope, rules of engagement, authorization) is where most engagements actually succeed or fail, and it is the part ad hoc testers are most likely to rush. Its separate technical guidelines recommend concrete tools per phase. The honest caveat: PTES has not seen major formal updates in years, so testers treat it as a structural backbone and layer current technique from OWASP and MITRE ATT&CK on top. If you want one framework to structure a full engagement, PTES is usually it, and it maps directly to the kickoff work in how to prepare for a penetration test.
OSSTMM, the Open Source Security Testing Methodology Manual from ISECOM, is a rigorous, metrics-driven approach focused on measurable operational security. Instead of a vulnerability checklist, it defines how to test across channels (physical, wireless, telecom, data networks, human) and produces a quantified RAV (Risk Assessment Value) you can track over time. A simplified RAV summary in a report reads like this:
Channel: Data Networks
Visibility 12 Access 4 Trust 3 (porosity)
Controls present: 9 of 14 expected
RAV (actual security): 81.4 / 100
^ a number you can compare against last quarter, not a vibe
Its strength is scientific rigor and repeatability: results are measured, not just described, which makes it popular where you need defensible, comparable metrics across years or business units. A board that wants to know "did our security posture improve since last year?" gets a real answer from a RAV trend line, where a vulnerability count alone is meaningless (fewer findings might just mean a shallower test). The tradeoff is that it is dense and less prescriptive about modern web tooling, so testers pair it with OWASP for application depth. Few commercial app tests run pure OSSTMM, but its measurement discipline shows up in good reports everywhere, and the channels concept (testing physical, wireless, telecom, and human alongside data networks) is a useful reminder that your attack surface is wider than your web app.
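The arithmetic behind the simplified summary above can be checked by hand. The sketch below uses only the numbers from that sample and assumes porosity is the plain sum of the visibility, access, and trust counts, as the sample groups them. It deliberately does not reproduce the full RAV calculation, which is more involved and defined in the OSSTMM itself.

```python
# Values copied from the sample RAV summary; the formulas are simplified assumptions.
visibility, access, trust = 12, 4, 3
controls_present, controls_expected = 9, 14

# Assumption: porosity is the sum of the three exposure counts.
porosity = visibility + access + trust

# Share of expected controls that were actually found in place.
control_coverage = controls_present / controls_expected

print(f"Porosity: {porosity}")                       # Porosity: 19
print(f"Control coverage: {control_coverage:.1%}")   # Control coverage: 64.3%
```

Tracking these inputs alongside the headline number makes it easier to see whether a change came from reduced exposure or from added controls.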
NIST SP 800-115, the Technical Guide to Information Security Testing and Assessment, is the US government's reference for security testing. It defines four phases (planning, discovery, attack, reporting) and covers review techniques, target identification, and vulnerability validation at a high level. It is the go-to baseline for federal agencies, contractors, and regulated industries because it is authoritative and widely accepted by auditors.
It is deliberately less granular than PTES on technique, acting more as a policy-level standard you align to than a step-by-step playbook. Many engagements cite NIST for the compliance signature while using PTES or OWASP for execution detail. Put bluntly: if an auditor asks "what standard?", NIST is the safe answer; if a tester asks "what do I actually run?", it is not.
This is why teams get confused when they read a NIST-aligned report and find it thin on technique. NIST was never meant to be a technical checklist; it is the common vocabulary that lets a federal auditor, a contractor, and a vendor agree they ran a legitimate assessment. Expecting it to tell a tester which sqlmap flags to use is like expecting a building code to tell a carpenter how to swing a hammer. Pair it with OWASP for the swing.
OWASP publishes the application-layer standards testers map findings to. The Web Security Testing Guide (WSTG) is the definitive checklist for web apps; MASVS and the MASTG (formerly the MSTG) cover iOS and Android; and the API Security Top 10 covers REST and GraphQL, including broken object-level authorization (BOLA). These guides stay current in a way engagement frameworks do not. The API Top 10 added server-side request forgery as its own category in the 2023 revision, which tells you where attacker attention moved.
The payoff is traceability. Every WSTG test has an ID, and a strong report cites it next to each finding so the reader knows exactly what was checked:
WSTG-ATHZ-04 Testing for Insecure Direct Object References FAIL (IDOR, High)
WSTG-ATHZ-01 Testing Directory Traversal / File Incl PASS
WSTG-SESS-03 Testing for Session Fixation PASS
WSTG-INPV-05 Testing for SQL Injection FAIL (blind, Critical)
A typical web test uses PTES for the overall process and the OWASP WSTG for the technical checklist, with each test ID traceable in the report. Mapping findings to OWASP categories also makes them easier for developers to act on, because they already think in those terms.
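Because every row carries a test ID, a report in this shape is also machine-readable. The sketch below parses rows laid out like the sample above and pulls out the failures. The row layout, the regular expression, and the `Row` type are assumptions made for illustration; real reports vary by vendor.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    test_id: str
    name: str
    status: str
    note: Optional[str]
    severity: Optional[str]


# Assumed layout: "<WSTG id> <test name> <PASS|FAIL> [(note, Severity)]"
ROW_RE = re.compile(
    r"^(WSTG-[A-Z]{4}-\d{2})\s+(.+?)\s+(PASS|FAIL)"
    r"(?:\s+\((.+?),\s*(\w+)\))?$"
)


def parse_row(line: str) -> Row:
    """Parse one report row, raising ValueError on an unrecognized layout."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized report row: {line!r}")
    return Row(*match.groups())


SAMPLE = [
    "WSTG-ATHZ-01 Testing Directory Traversal / File Incl PASS",
    "WSTG-SESS-03 Testing for Session Fixation PASS",
    "WSTG-INPV-05 Testing for SQL Injection FAIL (blind, Critical)",
]

rows = [parse_row(line) for line in SAMPLE]
failures = [row for row in rows if row.status == "FAIL"]

for row in failures:
    print(f"{row.test_id}: {row.name} ({row.severity})")
```

Once findings are structured this way, they can be fed into a ticketing system or compared against the previous test by ID.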
A subtle strength of the OWASP guides is that they evolve with the attack surface, where engagement frameworks do not. The API Top 10 promoting server-side request forgery to its own category in 2023, and the OWASP Top 10 moving broken access control up to the number-one web risk in its 2021 edition, are not editorial whims; they track where real breaches moved. A tester who maps to the current edition is implicitly testing for what attackers are doing now, not what mattered five years ago. This is also why pairing a static engagement framework like PTES with a living coverage standard like the WSTG works so well: PTES gives you durable structure, OWASP gives you current technique.
Testers rarely use one standard in isolation; they stack them by what each does best:
PTES -> engagement shape (scope to report)
OWASP WSTG -> per-surface technical checklist (web/API/mobile)
MITRE ATT&CK -> map the attack chain to adversary techniques
NIST 800-115 -> the language auditors recognize
In practice a strong web engagement reads as: PTES process, WSTG coverage, ATT&CK mapping for the attack chain, NIST cited for compliance. That stacking is why a structured penetration testing process beats an ad hoc one, and the table below summarizes which standard to reach for when. If you are keeping coverage current between annual tests, agentic pentesting applies the same standards-based methodology continuously rather than once a year.
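One way to picture the stack is a single finding record that carries a reference into each standard. The sketch below is illustrative only: the `Finding` type, its field names, and the specific mapping (including the ATT&CK technique chosen for the example) are assumptions, not a format any of these standards prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    ptes_section: str      # where in the engagement it was found
    wstg_id: str           # which technical check surfaced it
    attack_technique: str  # MITRE ATT&CK technique ID for the attack chain
    nist_phase: str        # NIST SP 800-115 phase, for the audit trail

    def citation(self) -> str:
        """Render a one-line reference that names all four standards."""
        return (
            f"{self.title} [PTES: {self.ptes_section}; {self.wstg_id}; "
            f"ATT&CK {self.attack_technique}; NIST 800-115: {self.nist_phase}]"
        )


# Hypothetical finding; the mapping is an example, not an authoritative one.
finding = Finding(
    title="Blind SQL injection",
    ptes_section="exploitation",
    wstg_id="WSTG-INPV-05",
    attack_technique="T1190",  # Exploit Public-Facing Application
    nist_phase="attack",
)
print(finding.citation())
```

Each field answers a different reader: the engagement lead, the developer, the threat modeler, and the auditor.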