OWASP WSTG: The Web Security Testing Guide Explained

Akhil Reni | January 10, 2025 | 7 min read

Table of Contents

  • What is the OWASP WSTG?
  • How do the 12 categories and test IDs work?
  • WSTG and the OWASP Top 10 do different jobs.
  • What does a single WSTG test look like in practice?
  • How do you run a full engagement with WSTG?
  • The categories scanners can't see are where WSTG pays off.
  • How do you turn raw findings into a report?
  • Frequently asked questions
  • Sources and references

TL;DR
  • The OWASP Web Security Testing Guide (WSTG) is a free, community-driven methodology for web app testing, currently at version 4.2 with a 5.0 rewrite in progress on GitHub.
  • It organizes 100+ individual tests into 12 categories, each with a stable ID like WSTG-INPV-05 (SQL injection) or WSTG-ATHN-03 (weak lockout) that never changes between minor revisions.
  • WSTG is a methodology, not a tool. You execute it with Burp Suite, sqlmap, ffuf, and manual analysis, then map each finding to the OWASP Top 10 2021 and a CVSS score.
  • Unlike the OWASP Top 10 (a risk-awareness list), WSTG tells you exactly what to test and how, which is why it is the backbone of most professional web pentest reports.
  • The categories scanners are blind to (ATHZ, BUSL, IDNT) are where WSTG earns its keep, because every request is well-formed and only context makes it abuse.

Pick up almost any professional web application pentest report and you will find findings tagged WSTG-INPV-05 or WSTG-ATHZ-02. That is the OWASP Web Security Testing Guide doing its job: a shared vocabulary so a tester in Berlin and a reviewer in Bangalore mean the exact same check when they write down a test ID. WSTG 4.2 catalogs more than a hundred discrete tests across 12 categories, and each one carries an identifier that stays fixed across minor revisions.

This guide is built around how you actually use the WSTG, not just what it lists. You will see how the categories and IDs work, how WSTG differs from the Top 10 and ASVS, what a single test looks like end to end with real request and response bytes, where scanners go blind, and how a mature finding pairs a WSTG ID with an OWASP Top 10 2021 mapping and a CVSS score.

What is the OWASP WSTG?

The OWASP Web Security Testing Guide is a free, community-maintained framework that defines how to test a web application for security flaws end to end. It is published by the Open Worldwide Application Security Project, is currently at stable version 4.2 (released 2020), and has a 5.0 rewrite in active development on GitHub.

WSTG is descriptive and procedural, not a scanner you run. For each test it gives a summary of the issue, black-box and gray-box test steps, example payloads, and remediation notes. It assumes you are working through an application methodically rather than reading a tool's alert pane, which is exactly why it pairs with a hands-on web application pentesting checklist and a defined set of penetration testing steps and test cases.

How do the 12 categories and test IDs work?

WSTG groups its tests into 12 categories, each with a short code, and every individual test has a stable ID of the form WSTG-<CATEGORY>-<NUMBER>. The categories run roughly in the order you would test an app, from passive recon through to client-side issues, and the ID never changes between minor revisions, which is what makes coverage auditable.

For example, WSTG-INPV-01 is Testing for Reflected Cross Site Scripting, WSTG-INPV-05 is SQL Injection, WSTG-ATHN-03 is Weak Lockout Mechanism, and WSTG-SESS-02 is Cookie Attributes. The first eleven categories date from the start of the 4.x line; API Testing (APIT) was added in version 4.2 as the attack surface shifted toward services. If you test APIs heavily, treat APIT as a starting point and pair it with a dedicated API penetration testing methodology.

WSTG-INFO-*  Information Gathering
WSTG-CONF-*  Configuration & Deployment
WSTG-IDNT-*  Identity Management
WSTG-ATHN-*  Authentication
WSTG-ATHZ-*  Authorization
WSTG-SESS-*  Session Management
WSTG-INPV-*  Input Validation
WSTG-ERRH-*  Error Handling
WSTG-CRYP-*  Cryptography
WSTG-BUSL-*  Business Logic
WSTG-CLNT-*  Client-side
WSTG-APIT-*  API Testing
The 12 WSTG Categories at a Glance

Recon and Config
  • INFO: Information Gathering
  • CONF: Configuration & Deployment
  • IDNT: Identity Management
  • ERRH: Error Handling

Access Control
  • ATHN: Authentication
  • ATHZ: Authorization
  • SESS: Session Management
  • CRYP: Cryptography

App Surface
  • INPV: Input Validation
  • BUSL: Business Logic
  • CLNT: Client-side
  • APIT: API Testing
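
Because the IDs follow one fixed shape, a few lines of code can validate them before they land in a report. The sketch below is one way to do that, not an official OWASP utility: it assumes a four-letter category code and a two-digit number, and the names parse_wstg_id and WstgId are made up for illustration.

```python
import re
from typing import NamedTuple

# Category codes and names as listed above for WSTG 4.2.
WSTG_CATEGORIES = {
    "INFO": "Information Gathering",
    "CONF": "Configuration & Deployment",
    "IDNT": "Identity Management",
    "ATHN": "Authentication",
    "ATHZ": "Authorization",
    "SESS": "Session Management",
    "INPV": "Input Validation",
    "ERRH": "Error Handling",
    "CRYP": "Cryptography",
    "BUSL": "Business Logic",
    "CLNT": "Client-side",
    "APIT": "API Testing",
}

# Assumed shape: WSTG-<4 uppercase letters>-<2 digits>, e.g. WSTG-INPV-05.
_ID_PATTERN = re.compile(r"^WSTG-([A-Z]{4})-(\d{2})$")


class WstgId(NamedTuple):
    category: str
    number: int
    category_name: str


def parse_wstg_id(text: str) -> WstgId:
    """Split a test ID into category code and number, or raise ValueError."""
    match = _ID_PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a WSTG test ID: {text!r}")
    code, number = match.group(1), int(match.group(2))
    if code not in WSTG_CATEGORIES:
        raise ValueError(f"unknown WSTG category: {code}")
    return WstgId(code, number, WSTG_CATEGORIES[code])


print(parse_wstg_id("WSTG-INPV-05"))
```

Rejecting unknown category codes up front catches typos such as WSTG-AUTH-03 before they reach a client.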

WSTG and the OWASP Top 10 do different jobs.

The OWASP Top 10 is an awareness document that ranks the ten most critical web risk categories; WSTG is the methodology you use to actually find instances of those risks. They are complementary, not competing. The Top 10 answers "what should I worry about," WSTG answers "how do I test for it," and ASVS answers "what requirement must this app meet."

In practice you test with WSTG and report against both. A SQL injection found via WSTG-INPV-05 maps to A03:2021 Injection in the OWASP Top 10, and you attach a CVSS score for severity. That triple, WSTG ID plus Top 10 category plus CVSS, is what separates a mature finding from a vague claim, and it is the structure clients and auditors expect to see.
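
One way to keep that triple together is a small record type. This is a sketch rather than a prescribed schema: the Finding class and its field names are hypothetical, and the example title and 9.8 score are illustrative values, not results from a real engagement. The severity bands follow the CVSS v3.x qualitative rating scale.

```python
from dataclasses import dataclass


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass(frozen=True)
class Finding:
    """Hypothetical record pairing a WSTG test ID with Top 10 and CVSS."""
    title: str
    wstg_id: str  # e.g. "WSTG-INPV-05"
    top10: str    # e.g. "A03:2021"
    cvss: float

    def label(self) -> str:
        severity = cvss_severity(self.cvss)
        return f"{self.title} ({self.wstg_id}, {self.top10}) {self.cvss:.1f} {severity}"


# Illustrative values only.
example = Finding("SQL injection in login", "WSTG-INPV-05", "A03:2021", 9.8)
print(example.label())
```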

What does a single WSTG test look like in practice?

Take WSTG-INPV-01, Testing for Reflected XSS. You find a parameter that echoes into the response, inject a context-breaking probe, and capture the request and response together. The proof is not that your string came back; it is that it came back unencoded in an executable context. Here is the parameter q reflecting straight into the HTML body:

GET /search?q=%22%3E%3Csvg%20onload%3Dalert(document.domain)%3E HTTP/1.1
Host: shop.target.tld

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
...
<input name="q" value=""><svg onload=alert(document.domain)>">   <-- payload broke out of the value="" attribute

That unencoded <svg onload> landing inside the markup, not as text, is the finding. Contrast it with a server that does its job: the same probe comes back inert, which is itself reportable assurance. Capturing both the request and the raw response, and pointing at the telltale line, is what makes a WSTG result reproducible rather than a screenshot of a popup.

GET /search?q=%22%3E%3Csvg%20onload%3Dalert(1)%3E HTTP/1.1
Host: shop.target.tld

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
...
<input name="q" value="&quot;&gt;&lt;svg onload=alert(1)&gt;">   <-- correctly entity-encoded, NOT a finding
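
The difference between those two responses can be triaged with a short helper. The sketch below uses Python's standard html.escape and a hypothetical function name, classify_reflection. It only recognizes the named-entity encoding shown above, so a server that encodes differently (numeric entities, for instance) would show up as "absent", and a "raw" result still needs a manual look at the surrounding context before you call it XSS.

```python
import html


def classify_reflection(probe: str, body: str) -> str:
    """Return 'raw', 'encoded', or 'absent' for a probe in a response body."""
    if probe in body:
        return "raw"  # reflected byte-for-byte: inspect the context manually
    if html.escape(probe, quote=True) in body:
        return "encoded"  # reflected, but entity-encoded
    return "absent"


vulnerable_body = '<input name="q" value=""><svg onload=alert(document.domain)>">'
safe_body = '<input name="q" value="&quot;&gt;&lt;svg onload=alert(1)&gt;">'

print(classify_reflection('"><svg onload=alert(document.domain)>', vulnerable_body))
print(classify_reflection('"><svg onload=alert(1)>', safe_body))
```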

How do you run a full engagement with WSTG?

Work the categories in order, map each test to a tool or manual technique, and log the WSTG ID against every result whether it passes, fails, or is not applicable. That last part matters: an auditable matrix of pass/fail/N-A per ID is what a client pays for, not a pile of scanner alerts. Start with passive Information Gathering using whatweb and Wappalyzer, fingerprint the stack under CONF, then move into the authenticated surface.

The bulk of high-severity findings cluster in two places. INPV is where injection and XSS live (Burp Suite Intruder, sqlmap, ffuf for content discovery). ATHZ and BUSL are where the money is: insecure direct object references, horizontal privilege escalation, and workflow abuse that no payload reveals. For access control, lean on Burp's Autorize extension to replay each request as a lower-privileged session and diff the responses. WSTG is the manual backbone, but keeping that coverage current between point-in-time engagements is the case for agentic pentesting that chains automated and reasoning-based checks continuously.
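
The replay-and-diff idea can be sketched in a few lines. This is loosely in the spirit of what the extension does, not a reproduction of its logic: the Response type, the classify_replay name, and the three verdict strings are all assumptions made for illustration.

```python
from typing import NamedTuple


class Response(NamedTuple):
    status: int
    body: str


def classify_replay(original: Response, replayed: Response) -> str:
    """Compare a privileged response with the same request replayed
    under a lower-privileged session."""
    if replayed.status in (401, 403):
        return "enforced"  # the server refused the lower-privileged session
    if replayed.status == original.status and replayed.body == original.body:
        return "bypassed"  # identical answer: likely broken access control
    return "review"  # redirects, partial data, error pages: check by hand


admin_view = Response(200, '{"invoiceId": 1043, "total": 950}')
print(classify_replay(admin_view, Response(403, "Forbidden")))
print(classify_replay(admin_view, admin_view))
```

An exact body match is deliberately strict. Responses that embed timestamps or CSRF tokens will fall into "review", which is the safer default for a manual test.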

Running an Engagement WSTG-Style
  1. Map IDs: List every WSTG ID in scope as a pass/fail/N-A matrix before you touch the app.
  2. Recon: INFO + CONF: fingerprint stack with whatweb, Wappalyzer, nmap.
  3. Access: ATHN/ATHZ/SESS: lockout, IDOR, Autorize replay, cookie attributes.
  4. Inject: INPV: Burp Intruder, sqlmap, ffuf, XSS context probes.
  5. Logic: BUSL/IDNT: the reasoning tests no scanner can do.
  6. Report: Each finding = WSTG ID + Top 10 + CVSS + reproducible evidence.
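
The pass/fail/N-A matrix from the first step does not need special tooling. Here is a minimal sketch, assuming a plain dictionary keyed by test ID; the Status enum and the helper names are hypothetical.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    UNTESTED = "untested"
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "n/a"


def new_matrix(scope):
    """Start every in-scope test ID as untested."""
    return {test_id: Status.UNTESTED for test_id in scope}


def record(matrix, test_id, status):
    """Log a result; refuse IDs that were never put in scope."""
    if test_id not in matrix:
        raise KeyError(f"{test_id} is not in scope")
    matrix[test_id] = status


def untested(matrix):
    """IDs that still have no result, i.e. the coverage gap."""
    return sorted(t for t, s in matrix.items() if s is Status.UNTESTED)


def summarize(matrix):
    """Count results per status for the report summary."""
    return Counter(status.value for status in matrix.values())


matrix = new_matrix(["WSTG-INPV-01", "WSTG-ATHZ-02", "WSTG-CRYP-01"])
record(matrix, "WSTG-INPV-01", Status.FAIL)
record(matrix, "WSTG-CRYP-01", Status.PASS)
print(untested(matrix))
```

Starting every ID as untested, rather than absent, is the point: a gap in coverage stays visible until someone records a result against it.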

The categories scanners can't see are where WSTG pays off.

WSTG forces you to test the categories scanners are structurally blind to: Authorization (ATHZ), Business Logic (BUSL), and Identity Management (IDNT), where every request is well-formed and only the context makes it abuse. A scanner can flag a missing HttpOnly flag under SESS, but it cannot reason that order #1043 belongs to another tenant, or that the password-reset flow lets you skip email verification.

On a recent assessment of a B2B logistics portal, the automated pass came back clean. Manually following WSTG-ATHZ-02, we incremented a numeric invoiceId in a JSON body and the API returned a different customer's billing record with a 200 and no error, a horizontal IDOR worth more than every INPV finding combined. Watch the false positives too: a parameter reflected into a sanitized template is not XSS, and a verbose stack trace under ERRH is only a finding if it discloses something exploitable. Map the negatives as carefully as the positives, because a passed WSTG-CRYP-01 with TLS 1.3 and HSTS is assurance the client can show an auditor.
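
Once the incremented requests are captured, spotting the cross-tenant hits is a simple ownership check. The sketch below assumes each captured response has already been parsed into a dictionary and that records carry an owner field named tenantId; that field name, the tenant labels, and the sample data are hypothetical and not taken from the assessment described above.

```python
def cross_tenant_hits(session_tenant, captured):
    """Return the object IDs whose 200 response belongs to another tenant.

    captured: iterable of (object_id, status_code, record_or_None) tuples.
    """
    hits = []
    for object_id, status, record in captured:
        if status != 200 or record is None:
            continue  # denied or empty: access control held for this ID
        owner = record.get("tenantId")  # hypothetical owner field
        if owner is not None and owner != session_tenant:
            hits.append(object_id)
    return hits


# Illustrative capture: IDs enumerated while logged in as tenant "acme".
captured = [
    (1042, 200, {"tenantId": "acme", "total": 120}),
    (1043, 200, {"tenantId": "globex", "total": 950}),
    (1044, 403, None),
]
print(cross_tenant_hits("acme", captured))
```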

How do you turn raw findings into a report?

A WSTG-anchored report ranks findings by CVSS, ties each to its test ID and Top 10 category, and gives evidence a reviewer can reproduce. The findings table below is the shape of what lands in the executive summary; the detail pages then carry the full request and response. For more on what reviewers expect to see, our guide on the key elements of a pentest report walks through the structure.

The trap is mistaking tool output for coverage. A clean Burp Scanner pass tells you the input-validation surface looks fine; it says nothing about WSTG-BUSL or WSTG-ATHZ, where the test is whether a valid request should have been allowed. Map every WSTG ID you intend to cover before you start, and you get a defensible matrix. For when to reach for each tool, see our breakdown of web application penetration testing tools.

Sample Findings Excerpt (WSTG-Anchored)

| Finding | Severity (CVSS) | Evidence | Remediation |
| --- | --- | --- | --- |
| IDOR on /api/invoices (WSTG-ATHZ-02, A01) | 8.6 High | Incremented invoiceId returned another tenant's record, 200 OK | Enforce object-level authz; scope queries to the session's tenant |
| Reflected XSS in search q (WSTG-INPV-01, A03) | 6.1 Medium | <svg onload> broke out of value="" attribute, executed | Context-aware output encoding; add strict CSP |
| Weak lockout on /login (WSTG-ATHN-03, A07) | 5.3 Medium | 1000 attempts, no throttle or lockout observed | Rate-limit + account lockout + monitoring |
| Missing HSTS (WSTG-CRYP-01, A02) | 3.7 Low | No Strict-Transport-Security header on HTTPS responses | Add HSTS with includeSubDomains and preload |
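
Ranking by CVSS is easy to script once findings are structured. The sketch below reuses the four sample findings above; the dictionary keys and the render_findings name are hypothetical, and the output is a plain pipe-separated summary rather than a full report template.

```python
def render_findings(findings):
    """Sort findings by CVSS, highest first, and render one line each."""
    ordered = sorted(findings, key=lambda f: f["cvss"], reverse=True)
    lines = ["Finding | WSTG ID | Top 10 | CVSS"]
    for f in ordered:
        lines.append(
            f"{f['title']} | {f['wstg_id']} | {f['top10']} | {f['cvss']:.1f}"
        )
    return "\n".join(lines)


# The sample findings from the excerpt, deliberately out of order.
findings = [
    {"title": "Missing HSTS", "wstg_id": "WSTG-CRYP-01", "top10": "A02", "cvss": 3.7},
    {"title": "Reflected XSS in search q", "wstg_id": "WSTG-INPV-01", "top10": "A03", "cvss": 6.1},
    {"title": "IDOR on /api/invoices", "wstg_id": "WSTG-ATHZ-02", "top10": "A01", "cvss": 8.6},
    {"title": "Weak lockout on /login", "wstg_id": "WSTG-ATHN-03", "top10": "A07", "cvss": 5.3},
]
print(render_findings(findings))
```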

Frequently asked questions

What is the latest version of the OWASP WSTG?
The current stable release is WSTG 4.2, published in 2020 and still the reference most testers cite. A 5.0 version is in active development on GitHub with restructured content and new tests, but it is not yet the stable baseline, so report against 4.2 IDs for now.
What is the difference between WSTG and the OWASP Top 10?
The OWASP Top 10 is a risk-awareness list of the ten most critical web risk categories. WSTG is the testing methodology that tells you how to find instances of those risks. You test with WSTG and report each finding against its Top 10 category plus a CVSS score.
How many tests are in the WSTG?
WSTG 4.2 contains more than 100 individual tests spread across its 12 categories. The exact count shifts as the guide evolves, but each test carries a stable WSTG ID so coverage can be tracked precisely across engagements and audits.
What is the difference between WSTG and ASVS?
WSTG is a testing methodology that tells you how to look for flaws; the OWASP Application Security Verification Standard (ASVS) is a list of security requirements an application should meet. Testers often use WSTG procedures to verify ASVS requirements, so the two are used together.
Does WSTG cover API and GraphQL testing?
Version 4.2 added an API Testing category (WSTG-APIT), but it is lighter than the web categories. Most teams pair it with the OWASP API Security Top 10 and a dedicated API methodology for full coverage of REST and GraphQL surfaces, since modern apps push most logic behind services.
Can you automate the WSTG?
Partially. Nuclei, Burp Suite, and sqlmap automate many input-validation and configuration tests, but Authorization, Business Logic, and Identity tests need manual reasoning because every request is valid. Agentic pentesting closes more of that gap by chaining automated and reasoning-based checks continuously.
How long does a WSTG-based web pentest take?
A focused application typically runs five to ten testing days, scaling with the number of roles, endpoints, and workflows. The access-control and business-logic categories consume most of that time because they cannot be handed off to a tool the way an INPV fuzz can.

Sources and references

  • OWASP Web Security Testing Guide
  • OWASP WSTG v4.2 Test Index
  • OWASP Top 10:2021
Akhil Reni
Co-founder and CTO, Strobes
Akhil Reni is co-founder and CTO of Strobes, building AI-driven penetration testing and exposure management for security teams.