OWASP WSTG: The Web Security Testing Guide Explained

Akhil Reni · January 10, 2025 · 7 min read

TL;DR

✓ The OWASP Web Security Testing Guide (WSTG) is a free, community-driven methodology for web app testing, currently at version 4.2 with a 5.0 rewrite in progress on GitHub.
✓ It organizes 100+ individual tests into 12 categories, and each test has a stable ID such as WSTG-INPV-05 (SQL injection) or WSTG-ATHN-03 (weak lockout) that does not change between minor revisions.
✓ WSTG is a methodology, not a tool. You execute it with Burp Suite, sqlmap, ffuf, and manual analysis, then map each finding to the OWASP Top 10 2021 and a CVSS score.
✓ Unlike the OWASP Top 10 (a risk-awareness list), WSTG tells you exactly what to test and how, which is why it is the backbone of most professional web pentest reports.
✓ The categories scanners are blind to (ATHZ, BUSL, IDNT) are where WSTG earns its keep, because every request is well-formed and only context makes it abuse.

Pick up almost any professional web application pentest report and you will find findings tagged WSTG-INPV-05 or WSTG-ATHZ-02. That is the OWASP Web Security Testing Guide doing its job: a shared vocabulary so a tester in Berlin and a reviewer in Bangalore mean the exact same check when they write down a test ID. WSTG 4.2 catalogs more than a hundred discrete tests across 12 categories, and each one carries an identifier that does not drift between minor revisions.

This guide is built around how you actually use the WSTG, not just what it lists. You will see how the categories and IDs work, how WSTG differs from the Top 10 and ASVS, what a single test looks like end to end with real request and response bytes, where scanners go blind, and how a mature finding pairs a WSTG ID with an OWASP Top 10 2021 mapping and a CVSS score.

Table of contents

What is the OWASP WSTG?
How do the 12 categories and test IDs work?
WSTG and the OWASP Top 10 do different jobs.
What does a single WSTG test look like in practice?
How do you run a full engagement with WSTG?
The categories scanners can't see are where WSTG pays off.
How do you turn raw findings into a report?

What is the OWASP WSTG?

The OWASP Web Security Testing Guide is a free, community-maintained framework that defines how to test a web application for security flaws end to end. It is published by the Open Worldwide Application Security Project, is currently at stable version 4.2 (released 2020), and has a 5.0 rewrite in active development on GitHub.

WSTG is descriptive and procedural, not a scanner you run. For each test it gives a summary of the issue, black-box and gray-box test steps, example payloads, and remediation notes. It assumes you are working through an application methodically rather than reading a tool's alert pane, which is exactly why it pairs with a hands-on web application pentesting checklist and a defined set of penetration testing steps and test cases.

How do the 12 categories and test IDs work?

WSTG groups its tests into 12 categories, each with a short code, and every individual test has a stable ID of the form WSTG-<CATEGORY>-<NUMBER>. The categories run roughly in the order you would test an app, from passive recon through to client-side issues. The ID does not change between minor revisions, which is what makes coverage auditable; because numbering can shift between major versions, the guide recommends a version-tagged form such as WSTG-v42-INFO-02 when a reference has to be unambiguous.

For example, WSTG-INPV-01 is Testing for Reflected Cross Site Scripting, WSTG-INPV-05 is SQL Injection, WSTG-ATHN-03 is Weak Lockout Mechanism, and WSTG-SESS-02 is Cookie Attributes. The first eleven categories predate version 4.2; API Testing (APIT) was added in 4.2 as the attack surface shifted toward services. If you test APIs heavily, treat APIT as a starting point and pair it with a dedicated API penetration testing methodology.

WSTG-INFO-*  Information Gathering
WSTG-CONF-*  Configuration & Deployment
WSTG-IDNT-*  Identity Management
WSTG-ATHN-*  Authentication
WSTG-ATHZ-*  Authorization
WSTG-SESS-*  Session Management
WSTG-INPV-*  Input Validation
WSTG-ERRH-*  Error Handling
WSTG-CRYP-*  Cryptography
WSTG-BUSL-*  Business Logic
WSTG-CLNT-*  Client-side
WSTG-APIT-*  API Testing
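
To make the ID format concrete, here is a minimal Python sketch that validates a test ID against the category codes listed above. The names `parse_wstg_id` and `WstgId` are hypothetical helpers, not part of any OWASP tooling, and the two-digit test number is an assumption based on the IDs shown in this article. The optional version tag follows the guide's own WSTG-v42-INFO-02 convention.

```python
import re
from typing import NamedTuple, Optional

# Category codes and names exactly as listed above.
CATEGORIES = {
    "INFO": "Information Gathering",
    "CONF": "Configuration & Deployment",
    "IDNT": "Identity Management",
    "ATHN": "Authentication",
    "ATHZ": "Authorization",
    "SESS": "Session Management",
    "INPV": "Input Validation",
    "ERRH": "Error Handling",
    "CRYP": "Cryptography",
    "BUSL": "Business Logic",
    "CLNT": "Client-side",
    "APIT": "API Testing",
}

# WSTG-<CATEGORY>-<NUMBER>, with an optional version tag such as v42.
# Assumption: test numbers are always written with two digits.
WSTG_ID = re.compile(
    r"^WSTG-(?:v(?P<version>\d+)-)?(?P<category>[A-Z]{4})-(?P<number>\d{2})$"
)


class WstgId(NamedTuple):
    category: str
    number: int
    version: Optional[str]  # None when the ID carries no version tag


def parse_wstg_id(text: str) -> WstgId:
    """Split a test ID into its parts, rejecting unknown category codes."""
    match = WSTG_ID.match(text.strip())
    if match is None or match["category"] not in CATEGORIES:
        raise ValueError(f"not a recognised WSTG ID: {text!r}")
    return WstgId(match["category"], int(match["number"]), match["version"])
```

Calling `parse_wstg_id("WSTG-INPV-05")` yields the category `INPV` and test number 5; a typo such as `WSTG-INPT-05` raises `ValueError`, which is a cheap way to catch mislabeled findings before a report goes out.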

The 12 WSTG Categories at a Glance

Recon and Config

✓ INFO: Information Gathering
✓ CONF: Configuration & Deployment
✓ IDNT: Identity Management
✓ ERRH: Error Handling

Access Control

✓ ATHN: Authentication
✓ ATHZ: Authorization
✓ SESS: Session Management
✓ CRYP: Cryptography

App Surface

✓ INPV: Input Validation
✓ BUSL: Business Logic
✓ CLNT: Client-side
✓ APIT: API Testing

WSTG and the OWASP Top 10 do different jobs.

The OWASP Top 10 is an awareness document that ranks the ten most critical web risk categories; WSTG is the methodology you use to actually find instances of those risks. They are complementary, not competing. The Top 10 answers "what should I worry about," WSTG answers "how do I test for it," and ASVS answers "what requirement must this app meet."

In practice you test with WSTG and report against both. A SQL injection found via WSTG-INPV-05 maps to A03:2021 Injection in the OWASP Top 10, and you attach a CVSS score for severity. That triple, WSTG ID plus Top 10 category plus CVSS, is what separates a mature finding from a vague claim, and it is the structure clients and auditors expect to see.
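
One way to keep that triple together is a small record type. The sketch below is illustrative only: `Finding` and `cvss_band` are hypothetical names, and the severity bands follow the CVSS v3.x qualitative rating scale. The example values come from the reflected XSS finding in the sample table later in this article.

```python
from dataclasses import dataclass


def cvss_band(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass(frozen=True)
class Finding:
    title: str
    wstg_id: str  # how the issue was found
    top10: str    # which risk category it falls under
    cvss: float   # how severe it is

    def label(self) -> str:
        """One-line summary carrying the full triple."""
        band = cvss_band(self.cvss)
        return f"{self.title} ({self.wstg_id}, {self.top10}) {self.cvss:.1f} {band}"


xss = Finding("Reflected XSS in search q", "WSTG-INPV-01", "A03:2021", 6.1)
print(xss.label())
```

Making the three fields mandatory means a finding cannot be created without its test ID, its Top 10 mapping, and its score, which is the discipline the paragraph above describes.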

What does a single WSTG test look like in practice?

Take WSTG-INPV-01, Testing for Reflected XSS. You find a parameter that echoes into the response, inject a context-breaking probe, and capture the request and response together. The proof is not that your string came back; it is that it came back unencoded in an executable context. Here is the parameter q reflecting straight into the HTML body:

GET /search?q=%22%3E%3Csvg%20onload%3Dalert(document.domain)%3E HTTP/1.1
Host: shop.target.tld

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
...
<input name="q" value=""><svg onload=alert(document.domain)>">   <-- payload broke out of the value="" attribute

That unencoded <svg onload> landing inside the markup, not as text, is the finding. Contrast it with a server that does its job: an equivalent probe comes back inert, which is itself reportable assurance. Capturing both the request and the raw response, and pointing at the telltale line, is what makes a WSTG result reproducible rather than a screenshot of a popup.

GET /search?q=%22%3E%3Csvg%20onload%3Dalert(1)%3E HTTP/1.1
Host: shop.target.tld

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
...
<input name="q" value="&quot;&gt;&lt;svg onload=alert(1)&gt;">   <-- correctly entity-encoded, NOT a finding
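
The distinction between the two responses above can be sketched as a small triage helper. This is a simplified illustration, not a scanner: `classify_reflection` is a hypothetical function, it only recognizes HTML entity encoding as produced by Python's `html.escape`, and an "unencoded" result is a lead to confirm by hand in the actual rendering context.

```python
import html

# The decoded form of the probe sent in the q parameter above.
PROBE = '"><svg onload=alert(1)>'


def classify_reflection(body: str, probe: str = PROBE) -> str:
    """Report how a probe string came back in a response body.

    "unencoded": the raw probe is present, so the markup may be injectable.
    "encoded":   only the HTML-entity-encoded form is present (inert).
    "absent":    the probe was not reflected in either form.
    """
    if probe in body:
        return "unencoded"
    if html.escape(probe, quote=True) in body:
        return "encoded"
    return "absent"


vulnerable = '<input name="q" value=""><svg onload=alert(1)>">'
safe = '<input name="q" value="&quot;&gt;&lt;svg onload=alert(1)&gt;">'

print(classify_reflection(vulnerable))  # unencoded
print(classify_reflection(safe))        # encoded
```

Servers can neutralize a probe in other ways (stripping, JavaScript escaping, URL encoding), so "absent" does not prove the parameter is safe in every context; it only says this particular probe did not survive.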

How do you run a full engagement with WSTG?

Work the categories in order, map each test to a tool or manual technique, and log the WSTG ID against every result whether it passes, fails, or is not applicable. That last part matters: an auditable matrix of pass/fail/N-A per ID is what a client pays for, not a pile of scanner alerts. Start with low-noise Information Gathering using whatweb and Wappalyzer, fingerprint the stack under CONF, then move into the authenticated surface.
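
The pass/fail/N-A matrix can be as simple as a dictionary keyed by test ID. The following is a minimal sketch under that assumption; `Status`, `new_matrix`, `record`, `summary`, and `gaps` are hypothetical names, and the four IDs in scope are just the ones mentioned earlier in this article.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    NOT_TESTED = "not tested"
    PASS = "pass"
    FAIL = "fail"
    NA = "n/a"


def new_matrix(ids_in_scope):
    """Start every in-scope test as NOT_TESTED so gaps stay visible."""
    return {test_id: Status.NOT_TESTED for test_id in ids_in_scope}


def record(matrix, test_id, status):
    """Log a result; refuse IDs that were never put in scope."""
    if test_id not in matrix:
        raise KeyError(f"{test_id} is not in scope")
    matrix[test_id] = status


def summary(matrix):
    """Count results per status for the coverage overview."""
    return Counter(status.value for status in matrix.values())


def gaps(matrix):
    """IDs that still have no result logged against them."""
    return sorted(t for t, s in matrix.items() if s is Status.NOT_TESTED)


matrix = new_matrix(["WSTG-INPV-01", "WSTG-INPV-05", "WSTG-ATHN-03", "WSTG-SESS-02"])
record(matrix, "WSTG-INPV-01", Status.FAIL)
record(matrix, "WSTG-SESS-02", Status.PASS)
print(summary(matrix))
print(gaps(matrix))
```

Starting from NOT_TESTED rather than an empty dictionary is the design choice that matters: an untested ID shows up as a gap instead of silently vanishing from the report.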

The bulk of high-severity findings cluster in two places. INPV is where injection and XSS live (Burp Suite Intruder, sqlmap, ffuf for content discovery). ATHZ and BUSL are where the money is: insecure direct object references, horizontal privilege escalation, and workflow abuse that no payload reveals. For access control, lean on Burp's Autorize extension to replay each request as a lower-privileged session and diff the responses. WSTG is the manual backbone, but keeping that coverage current between point-in-time engagements is the case for agentic pentesting that chains automated and reasoning-based checks continuously.
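
The replay-and-diff idea can be sketched in a few lines. This is not Autorize's actual logic, only a simplified illustration of the comparison it automates; `Response` and `compare_access` are hypothetical names, and treating 401/403 as "enforced" is an assumption that will not hold for every application (some return 200 with an error body, others redirect to a login page).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    status: int
    body: str


def compare_access(high: Response, low: Response) -> str:
    """Compare a privileged response with the low-privilege replay.

    "enforced": the replay was rejected outright.
    "bypassed": the replay got the same status and body as the privileged user.
    "review":   anything else; a human has to look at the difference.
    """
    if low.status in (401, 403):
        return "enforced"
    if low.status == high.status and low.body == high.body:
        return "bypassed"
    return "review"


admin = Response(200, '{"users": ["alice", "bob"]}')
print(compare_access(admin, Response(403, "Forbidden")))
print(compare_access(admin, Response(200, '{"users": ["alice", "bob"]}')))
print(compare_access(admin, Response(302, "")))
```

The "review" bucket is deliberately broad. Access-control results depend on what the lower-privileged user should have seen, which is context a diff cannot supply.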

Running an Engagement WSTG-Style

Map IDs: List every WSTG ID in scope as a pass/fail/N-A matrix before you touch the app.

Recon (INFO + CONF): fingerprint the stack with whatweb, Wappalyzer, nmap.

Access (ATHN/ATHZ/SESS): lockout, IDOR, Autorize replay, cookie attributes.

Inject (INPV): Burp Intruder, sqlmap, ffuf, XSS context probes.

Logic (BUSL/IDNT): the reasoning tests no scanner can do.

Report: Each finding = WSTG ID + Top 10 + CVSS + reproducible evidence.

The categories scanners can't see are where WSTG pays off.

WSTG forces you to test the categories scanners are structurally blind to: Authorization (ATHZ), Business Logic (BUSL), and Identity Management (IDNT), where every request is well-formed and only the context makes it abuse. A scanner can flag a missing HttpOnly flag under SESS, but it cannot reason that order #1043 belongs to another tenant, or that the password-reset flow lets you skip email verification.

On a recent assessment of a B2B logistics portal, the automated pass came back clean. Manually following WSTG-ATHZ-02, we incremented a numeric invoiceId in a JSON body and the API returned a different customer's billing record with a 200 and no error, a horizontal IDOR worth more than every INPV finding combined. Watch the false positives too: a parameter reflected into a sanitized template is not XSS, and a verbose stack trace under ERRH is only a finding if it discloses something exploitable. Map the negatives as carefully as the positives, because a passed WSTG-CRYP-01 with TLS 1.3, plus HSTS confirmed under WSTG-CONF-07, is assurance the client can show an auditor.
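
The manual check described above, stepping through neighboring IDs and asking whose record came back, can be sketched as follows. Everything here is hypothetical: the `tenantId` field name, the tenant names, and the in-memory `FAKE_API` that stands in for the real endpoint. In a live test, `fetch` would wrap an authenticated HTTP call made with the tester's own session.

```python
from typing import Callable, Iterable, List, Optional


def find_cross_tenant_reads(
    fetch: Callable[[int], Optional[dict]],
    ids: Iterable[int],
    own_tenant: str,
    tenant_field: str = "tenantId",  # hypothetical field name
) -> List[int]:
    """Return the IDs whose record was readable but belongs to another tenant."""
    leaked = []
    for object_id in ids:
        record = fetch(object_id)  # None models "not found" or "denied"
        if record is not None and record.get(tenant_field) != own_tenant:
            leaked.append(object_id)
    return leaked


# Stand-in for the API: 1042 is ours, 1043 belongs to someone else.
FAKE_API = {
    1042: {"tenantId": "acme", "total": 120},
    1043: {"tenantId": "globex", "total": 980},
}

print(find_cross_tenant_reads(FAKE_API.get, [1041, 1042, 1043], "acme"))
```

The ownership comparison is the part a generic scanner lacks: each request is well-formed and each response is a 200, so only knowing which tenant the session belongs to turns the result into a finding.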

How do you turn raw findings into a report?

A WSTG-anchored report ranks findings by CVSS, ties each to its test ID and Top 10 category, and gives evidence a reviewer can reproduce. The findings table below is the shape of what lands in the executive summary; the detail pages then carry the full request and response. For more on what reviewers expect to see, our guide on the key elements of a pentest report walks through the structure.

The trap is mistaking tool output for coverage. A clean Burp Scanner pass tells you the input-validation surface looks fine; it says nothing about WSTG-BUSL or WSTG-ATHZ, where the test is whether a valid request should have been allowed. Map every WSTG ID you intend to cover before you start, and you get a defensible matrix. For when to reach for each tool, see our breakdown of web application penetration testing tools.

Sample Findings Excerpt (WSTG-Anchored)

Finding	Severity (CVSS)	Evidence	Remediation
IDOR on /api/invoices (WSTG-ATHZ-02, A01)	8.6 High	Incremented invoiceId returned another tenant's record, 200 OK	Enforce object-level authz; scope queries to the session's tenant
Reflected XSS in search q (WSTG-INPV-01, A03)	6.1 Medium	<svg onload> broke out of value="" attribute, executed	Context-aware output encoding; add strict CSP
Weak lockout on /login (WSTG-ATHN-03, A07)	5.3 Medium	1000 attempts, no throttle or lockout observed	Rate-limit + account lockout + monitoring
Missing HSTS (WSTG-CONF-07, A02)	3.7 Low	No Strict-Transport-Security header on HTTPS responses	Add HSTS with includeSubDomains and preload
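
A table like the one above is straightforward to generate from structured findings. The sketch below assumes findings are kept as simple tuples; `Row`, `ranked`, and `render` are hypothetical names, and the three rows reuse values from the sample table in a deliberately unsorted order.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    finding: str
    wstg_id: str
    top10: str
    cvss: float


def ranked(rows: List[Row]) -> List[Row]:
    """Order findings from highest to lowest CVSS score."""
    return sorted(rows, key=lambda row: row.cvss, reverse=True)


def render(rows: List[Row]) -> str:
    """Emit a tab-separated summary table, highest severity first."""
    lines = ["Finding\tCVSS"]
    for row in ranked(rows):
        lines.append(f"{row.finding} ({row.wstg_id}, {row.top10})\t{row.cvss:.1f}")
    return "\n".join(lines)


rows = [
    Row("Weak lockout on /login", "WSTG-ATHN-03", "A07", 5.3),
    Row("IDOR on /api/invoices", "WSTG-ATHZ-02", "A01", 8.6),
    Row("Reflected XSS in search q", "WSTG-INPV-01", "A03", 6.1),
]

print(render(rows))
```

Sorting at render time rather than at entry time means testers can log findings in the order they discover them and still hand over a severity-ranked summary.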

Frequently asked questions

What is the latest version of the OWASP WSTG?

The current stable release is WSTG 4.2, published in 2020 and still the reference most testers cite. A 5.0 version is in active development on GitHub with restructured content and new tests, but it is not yet the stable baseline, so report against 4.2 IDs for now.

What is the difference between WSTG and the OWASP Top 10?

The OWASP Top 10 is a risk-awareness list of the ten most critical web risk categories. WSTG is the testing methodology that tells you how to find instances of those risks. You test with WSTG and report each finding against its Top 10 category plus a CVSS score.

How many tests are in the WSTG?

WSTG 4.2 contains more than 100 individual tests spread across its 12 categories. The exact count shifts as the guide evolves, but each test carries a stable WSTG ID so coverage can be tracked precisely across engagements and audits.

What is the difference between WSTG and ASVS?

WSTG is a testing methodology that tells you how to look for flaws; the OWASP Application Security Verification Standard (ASVS) is a list of security requirements an application should meet. Testers often use WSTG procedures to verify ASVS requirements, so the two are used together.

Does WSTG cover API and GraphQL testing?

Version 4.2 added an API Testing category (WSTG-APIT), but it is much lighter than the web categories: in 4.2 it consists of a single GraphQL test (WSTG-APIT-01). Most teams pair it with the OWASP API Security Top 10 and a dedicated API methodology for full coverage of REST and GraphQL surfaces, since modern apps push most logic behind services.

Can you automate the WSTG?

Partially. Nuclei, Burp Suite, and sqlmap automate many input-validation and configuration tests, but Authorization, Business Logic, and Identity tests need manual reasoning because every request is valid. Agentic pentesting closes more of that gap by chaining automated and reasoning-based checks continuously.

How long does a WSTG-based web pentest take?

A focused application typically runs five to ten testing days, scaling with the number of roles, endpoints, and workflows. The access-control and business-logic categories consume most of that time because they cannot be handed off to a tool the way an INPV fuzz can.

Akhil Reni

Co-founder and CTO, Strobes

Akhil Reni is co-founder and CTO of Strobes, building AI-driven penetration testing and exposure management for security teams.
