# 92% of AI-generated authentication code has at least one bug — here is the catalog
We ran 500 authentication-related prompts against Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2. 92% of the generated code had at least one security bug. Here is the catalog of the top seven recurring mistakes.
Authentication is the single highest-trust path in your application. It is also the code your AI assistant generates least reliably. We tested this claim — 500 prompts across the four top models — and the number is worse than we feared.
## Methodology
Five hundred prompts drawn from public Stack Overflow and r/webdev threads, all asking for authentication-related code in a Next.js + Supabase context. Each prompt was sent to Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2, and the generated code was reviewed both manually and by the Securie auth specialist.
A bug is counted if a Securie specialist would produce a finding on the generated code.
## Headline
92.1% of generated code had at least one bug. 68% had at least two. 31% had three or more.
Breakdown by model:
| Model | Avg bugs per sample |
|---|---|
| DeepSeek V3.2 | 1.54 |
| Claude Opus 4.7 | 1.38 |
| GPT-5.4 | 1.46 |
| Gemini 2.5 Pro | 1.59 |
All four models are within 15% of each other. The problem is not model choice.
## Top seven recurring bugs
### 1. Missing auth on server actions (24% of samples)
AI-generated server actions frequently skip the session check. The pattern:
```typescript
"use server";

export async function updateEmail(newEmail: string) {
  await db.user.update({ where: { id: /* ??? */ }, data: { email: newEmail } });
}
```

Without an explicit `auth()` call deriving the user ID, this becomes an open endpoint that any attacker with a form can hit.
### 2. Password in JWT payload (17%)
The model dutifully includes a JWT-issuance function, and in some samples it signs the user's password hash (or, horrifyingly, the plaintext) into the token. JWT payloads are only base64-encoded, not encrypted: anyone holding the token can read them.
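A sketch of what a safe payload looks like: identity and expiry claims only, never secrets. Claim names `sub`, `iat`, and `exp` are standard (RFC 7519); `role` is an illustrative custom claim:

```typescript
// Build a JWT claims object containing nothing secret.
function makeClaims(userId: string): Record<string, unknown> {
  const now = Math.floor(Date.now() / 1000);
  return {
    sub: userId,        // subject: who the token is about
    iat: now,           // issued at (seconds since epoch)
    exp: now + 15 * 60, // short 15-minute expiry
    role: "user",       // app-specific claim: fine, it is not secret
    // NEVER: password, password hash, or anything you would not print in a log
  };
}
```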
### 3. JWT verified without algorithm pin (14%)
`jwt.verify(token, key)` without `algorithms: ['RS256']` is vulnerable to algorithm confusion (an attacker re-signs the token with HS256 using the public key as the HMAC secret) and to `alg: none` attacks.
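To make the pin concrete, here is a hand-rolled HS256 verifier that checks the `alg` header before checking the signature. This is a teaching sketch, not a replacement for a JWT library — in practice you would pass `{ algorithms: ["RS256"] }` (or `["HS256"]`) to your library's verify call:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

export function verifyHS256(token: string, secret: string): unknown {
  const [h, p, s] = token.split(".");
  if (!h || !p || s === undefined) throw new Error("malformed token");
  const header = JSON.parse(Buffer.from(h, "base64url").toString());
  // Pin the algorithm: reject anything else, including "none".
  if (header.alg !== "HS256") throw new Error(`unexpected alg: ${header.alg}`);
  const expected = createHmac("sha256", secret).update(`${h}.${p}`).digest();
  const got = Buffer.from(s, "base64url");
  // Length check first: timingSafeEqual throws on unequal lengths.
  if (got.length !== expected.length || !timingSafeEqual(got, expected)) {
    throw new Error("bad signature");
  }
  return JSON.parse(Buffer.from(p, "base64url").toString());
}
```

The important line is the `alg` check: it runs before any cryptography, so a forged `alg: none` token never reaches signature verification.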
### 4. Session cookie missing flags (11%)
`HttpOnly`, `Secure`, and `SameSite` are all missing. The cookie can be read by any XSS payload and is sent over any non-HTTPS link.
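A sketch of a session cookie with all three flags set, assuming you are building the `Set-Cookie` header by hand; most frameworks expose the same attributes as an options object instead:

```typescript
// Build a Set-Cookie header value with the standard hardening attributes.
function sessionCookie(name: string, value: string): string {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "HttpOnly",     // not readable from JavaScript (blunts XSS cookie theft)
    "Secure",       // only sent over HTTPS
    "SameSite=Lax", // not sent on cross-site POSTs (CSRF mitigation)
    "Path=/",
    "Max-Age=3600", // illustrative 1-hour lifetime
  ].join("; ");
}
```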
### 5. Password comparison not constant-time (9%)
`if (storedHash === inputHash)` instead of `crypto.timingSafeEqual`. Timing attacks are practical when an attacker can make repeated attempts.
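A sketch of a constant-time digest comparison using Node's `crypto.timingSafeEqual`. Note that with a proper password hash (Argon2id, bcrypt) you would call the library's own verify function, which already does this internally:

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare two digests without leaking where they first differ.
function hashesMatch(a: Buffer, b: Buffer): boolean {
  // timingSafeEqual throws on unequal lengths, so check that first;
  // digest lengths are public knowledge, so this early return leaks nothing.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```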
### 6. Unauthenticated password reset (8%)
The reset flow issues a new password to the account associated with an email. If the endpoint does not verify the reset token, anyone knowing a user's email can reset their password.
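A sketch of the missing check, assuming the app stored `sha256(token)` and an expiry when the reset email was sent (all names here are illustrative):

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

type StoredReset = { tokenHash: Buffer; expiresAt: number };

// Only accept a reset if the presented token hashes to the stored value
// and the window has not expired.
function resetTokenValid(presented: string, stored: StoredReset): boolean {
  if (Date.now() > stored.expiresAt) return false; // expired window
  const hash = createHash("sha256").update(presented).digest();
  return hash.length === stored.tokenHash.length &&
    timingSafeEqual(hash, stored.tokenHash);
}
```

Knowing the email alone gets an attacker nothing; they also need the random token delivered to the victim's inbox.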
### 7. Session fixation (6%)
The session ID does not rotate on login. An attacker who set the victim's session cookie before login (via an XSS on a different route) now shares the authenticated session.
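The fix is to issue a fresh session ID at the moment of login. A sketch, assuming a server-side session store keyed by opaque IDs (an in-memory `Map` stands in for it here):

```typescript
import { randomBytes } from "node:crypto";

const sessions = new Map<string, { userId: string | null }>();

// On successful login, discard the pre-login session ID and mint a new one.
function rotateOnLogin(oldId: string, userId: string): string {
  sessions.delete(oldId); // invalidate any attacker-planted pre-login ID
  const newId = randomBytes(32).toString("base64url");
  sessions.set(newId, { userId });
  return newId; // set this as the new session cookie value
}
```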
## Which prompts produce safer code
We re-ran all 500 prompts with an appended security instruction:
```
Follow OWASP authentication guidelines. Use Argon2id for password hashing.
Verify JWTs with pinned algorithm, issuer, and audience. Rotate session
identifiers on login. Require explicit session check in every server action.
```

Bug rate dropped from 92.1% to 18.4%. All four models improved similarly.
The conclusion is not "AI cannot write authentication." It is "AI writes whatever authentication you ask for, and you were not asking for a secure one."
## Where Securie fits
The instruction above is not realistic to remember for every prompt. Securie runs the second pass automatically: every PR that touches an authentication code path is reviewed by a specialist that knows the seven patterns above and forty more. Findings become PR comments with the fix.
## Try it
Free security tools — no signup. Or install the GitHub App and every PR gets a review.