Why OpenAPI Documentation Alone Doesn't Make Your API Testable
The detection gap that leaves most APIs exposed — and what actually closes it
After running ApyGuard against dozens of open-source APIs over the past year, one pattern shows up again and again: most APIs have OpenAPI documentation, but very few have accurate OpenAPI documentation.
This isn't a small problem. Every modern API security scanner — including ApyGuard, Postman's tooling, ZAP's OpenAPI extension, StackHawk, and most commercial DAST products — uses OpenAPI specs to understand request shapes, generate test payloads, and traverse endpoint relationships. When the spec drifts from reality, scanners degrade in ways that are easy to miss and hard to catch in CI logs.
I want to walk through what we've actually seen in production scans, why it matters for authorization testing in particular, and the approach we ended up taking in ApyGuard to deal with it.
The Quiet Assumption Every API Scanner Makes
Most scanners — including the early version of ApyGuard — treat the OpenAPI specification as ground truth.
If the spec says a parameter is optional, the scanner sends requests without it. If a field is typed as string, the scanner generates arbitrary string values. If an endpoint accepts a payload, the scanner builds requests from the documented schema and starts attacking.
This works beautifully when the documentation actually matches backend validation logic. In real codebases, it rarely does.
Two Patterns That Break Scans Silently
Pattern 1: Required parameters marked as optional
This is the most common discrepancy we encounter. A schema looks like this:
```json
{
  "type": "object",
  "properties": {
    "email": {
      "type": "string"
    }
  }
}
```
The spec implies email is optional. The backend disagrees:
```json
{
  "error": "email is required"
}
```
A scanner that trusts the spec sends payload after payload without email, gets a wall of 400 Bad Request responses, and moves on. The vulnerability scan technically "completed" — but every authorization test against that endpoint failed at the validation layer, never reaching the auth check it was designed to probe.
For BOLA, BFLA, and IDOR testing, this is a silent failure. The scan log shows requests sent, responses received, and zero findings. Looks clean. Means nothing.
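For completeness, the documentation-side fix is a single JSON Schema keyword. An accurate version of the schema above declares the field in a required array:

```json
{
  "type": "object",
  "required": ["email"],
  "properties": {
    "email": {
      "type": "string"
    }
  }
}
```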
Pattern 2: Enums documented as generic strings
The second pattern is undocumented accepted values. A field is typed as a plain string in the spec:
```json
{
  "status": {
    "type": "string"
  }
}
```
But the backend only accepts three values:
active | inactive | pending
When the scanner generates "status": "test123", the API returns a validation error and the request chain breaks. Dependent endpoints that needed a valid resource state become unreachable. Multi-step authorization tests — the ones that catch the actual business logic flaws — stop before they reach the interesting parts.
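Again, the spec-side fix is one keyword. JSON Schema's enum pins the field to its actual accepted values, which lets any spec-driven scanner generate requests that survive validation:

```json
{
  "status": {
    "type": "string",
    "enum": ["active", "inactive", "pending"]
  }
}
```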
Why This Hits Authorization Testing Hardest
If you're scanning for SQL injection or XSS, broken request flows are annoying but recoverable — those vulnerabilities live at the surface layer.
Authorization vulnerabilities are different. BOLA, BFLA, mass assignment, and excessive data exposure all require the scanner to reach a valid authenticated state, perform an action as one user, then attempt the same action as another. Every broken validation in the chain kills the test.
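To make that concrete, here is the minimal two-user flow behind a BOLA check, sketched in Python. The base URL, tokens, and /documents endpoint are hypothetical placeholders, not a real target:

```python
import requests

BASE = "https://api.example.com"  # hypothetical target
ALICE = {"Authorization": "Bearer <alice-token>"}  # two distinct identities
BOB = {"Authorization": "Bearer <bob-token>"}

# Step 1: Alice creates a resource she owns. If an undocumented required
# field breaks this call with a 400, every step below is unreachable.
r = requests.post(f"{BASE}/documents", headers=ALICE,
                  json={"title": "q3-report", "email": "alice@example.com"})
r.raise_for_status()
doc_id = r.json()["id"]  # assumes the API returns the new resource's id

# Step 2: Bob requests Alice's resource with his own credentials.
r = requests.get(f"{BASE}/documents/{doc_id}", headers=BOB)

# A 403 or 404 means the object-level check held; a 200 is a BOLA finding.
if r.status_code == 200:
    print(f"Potential BOLA: second user read object {doc_id}")
```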
In our internal benchmarks, scans against APIs with drifted OpenAPI specs found 40-60% fewer authorization issues than the same APIs scanned with corrected specs. The vulnerabilities were there. The scanner just never got past the validation wall.
This is the real cost of documentation drift: not noisy logs, but invisible false negatives in the exact vulnerability category that matters most for modern APIs.
What Most Scanners Do (And Why It Doesn't Work)
Traditional scanners follow a static interpretation model:
- Parse the OpenAPI spec
- Generate request templates from schemas
- Send the templates with attack payloads
- Report whatever comes back
When the spec is wrong, this pipeline degrades end-to-end. Invalid requests pile up. State progression breaks. Auth flows fail at step 2 of a 5-step chain. The scan completes — but the scanner is essentially running attacks against a closed door.
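In rough Python, the static model looks like this. The helper names (build_payload_from_schema, mutate_with_attack_payloads, check_for_findings) are hypothetical stand-ins for whatever the scanner implements, not any real library:

```python
# Static interpretation: the spec is treated as ground truth, no feedback loop.
def static_scan(spec: dict, send) -> list:
    findings = []
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            # Build a baseline request purely from the documented schema.
            payload = build_payload_from_schema(operation)
            for attack in mutate_with_attack_payloads(payload):
                response = send(method, path, attack)
                findings.extend(check_for_findings(response))
                # A 400 caused by an undocumented validation rule is logged
                # like any other response; nothing upstream gets corrected.
    return findings
```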
The scanners that handle this well need to behave less like spec-readers and more like adaptive testers, which is closer to how a human pentester works.
How a Pentester Handles This (And What We Copied)
When I do manual API pentests, I don't trust the spec either. I send a request, watch what the API actually says, adjust, and try again. If the response says email is required, I add the email field and retry. If it says status must be one of: active, inactive, pending, I pick one and move on. The spec is a starting hypothesis, not a contract.
We built the same loop into ApyGuard.
When the scanner gets a validation error, it doesn't just log the failure and continue. It parses the error response, extracts the constraint, updates its internal model of the endpoint, and retries with a corrected payload. Over the course of a scan, ApyGuard's understanding of the API drifts toward the actual backend behavior — not what the spec claims.
In practice, the signals we extract from error responses include:
- Required fields the spec missed: pulled from messages like "X is required" or "missing field: X"
- Enum constraints: extracted from messages listing accepted values
- Format expectations: parsed from messages about date formats, UUID patterns, length limits
- Conditional requirements: fields that become required only when another field is present
Each correction lets the scan continue deeper instead of dying at the validation layer.
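A stripped-down version of that loop looks something like the sketch below. This is illustrative, not ApyGuard's actual implementation: the two regexes stand in for a much larger rule set, the send and example_value callbacks are hypothetical, and the code assumes the JSON error shape from the examples earlier in this post:

```python
import re

# Illustrative patterns only; real-world error formats vary widely.
REQUIRED_RE = re.compile(r'"?(\w+)"? is required|missing field:?\s*"?(\w+)"?')
ENUM_RE = re.compile(r'"?(\w+)"? must be one of:?\s*(.+)', re.IGNORECASE)

def adapt_and_retry(send, payload, example_value, max_rounds=3):
    """Send a payload; on validation errors, learn the constraint and retry."""
    response = send(payload)
    for _ in range(max_rounds):
        if response.status_code != 400:
            return response  # past the validation layer; the real test can run
        message = response.json().get("error", "")
        if m := REQUIRED_RE.search(message):
            field = m.group(1) or m.group(2)
            payload[field] = example_value(field)  # fill the missing field
        elif m := ENUM_RE.search(message):
            field, values = m.group(1), m.group(2)
            payload[field] = values.split(",")[0].strip().strip('"')
        else:
            break  # a constraint we can't parse yet; stop retrying
        response = send(payload)  # retry with the corrected payload
    return response
```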
What Changes in Real Scans
The practical effect on a scan is significant. Endpoints that previously returned only 400s start returning 200s and 403s — and 403s are where the interesting authorization findings live. Multi-step attack chains that used to fail at step 2 now run to completion. Coverage on stateful APIs (the ones with workflows: create resource, transition state, read, modify) increases substantially.
The trade-off is scan time. Adaptive learning adds round-trips. We've found this is worth it in almost every case for AuthZ-heavy scans, but for surface-level vulnerability classes (basic injection, header issues) the spec-only mode is faster and good enough.
OpenAPI Is Still Worth Maintaining
I want to be clear about something: this isn't an argument against OpenAPI. Good OpenAPI documentation is one of the most valuable things an API team can produce. It accelerates onboarding, enables auto-generated SDKs, makes contract testing possible, and — when accurate — does make security scanning faster and more reliable.
The problem isn't OpenAPI. The problem is treating any single source of truth as infallible when you're building automated systems on top of it.
The same lesson applies elsewhere. SAST tools that trust dependency manifests miss vulnerabilities in vendored code. Cloud security tools that trust resource tags miss misconfigured assets that were never tagged. API scanners that trust the spec miss everything the spec got wrong.
What This Means If You're Choosing an API Scanner
A few practical questions to ask whatever scanner you're evaluating:
- What does it do when an endpoint returns a 400 it didn't expect? If the answer is "logs it and moves on," your scan coverage is whatever the spec accuracy lets it be.
- Does it learn validation rules from responses? Or does it only consume the static spec?
- Can it run without an OpenAPI spec at all? Some APIs don't have one. Some have one that's six months stale. A scanner that can build its own model from traffic is a different category of tool.
- How does it handle stateful resources? If the test for endpoint B depends on creating a resource at endpoint A, and the creation fails because of an undocumented required field, what happens?
These four questions separate spec-readers from behavior-adaptive scanners.
What We Learned Building This
The biggest realization for us building ApyGuard wasn't technical — it was framing. We started building a scanner that consumed OpenAPI. We ended up building a scanner that uses OpenAPI as one input among many, including live traffic, error response patterns, and observed authentication flows.
API security testing isn't a documentation problem. It's an observation problem. The closer your scanner gets to how the API actually behaves — not how it's documented to behave — the more useful its findings get.
That's the line we're trying to hold as we keep building.
For automated coverage across all your endpoints, ApyGuard runs behavioral authorization testing (BOLA, BFLA, and BOPLA detection) against your full API surface in minutes.
Start your free API security scan. No credit card required.
Related reading:
- API Penetration Testing: OWASP Top 10 Coverage
- Detecting Excessive Data Exposure with Privilege Diff
- API Security in CI/CD: How to Protect APIs Without Slowing Delivery
Published by Anıl Yüksel, Founder & CEO, ApyGuard | April 2026