# 92% of AI-generated authentication code has at least one bug — here is the catalog
We ran 500 authentication-related prompts against Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2. 92% of the generated code had at least one security bug. Here is the catalog of the top seven recurring mistakes.
Authentication is the single highest-trust path in your application. It is also the code your AI assistant generates least reliably. We tested this claim — 500 prompts across the four top models — and the number is worse than we feared.
## Methodology
Five hundred prompts drawn from public Stack Overflow and r/webdev threads, all asking for authentication-related code in a Next.js + Supabase context. Each prompt was sent to Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2, and the generated code was reviewed both manually and by the Securie auth specialist.
A bug is counted if a Securie specialist would produce a finding on the generated code.
## Headline
92.1% of generated code had at least one bug. 68% had at least two. 31% had three or more.
Breakdown by model:
| Model | Avg bugs per sample |
|---|---|
| DeepSeek V3.2 | 1.54 |
| Claude Opus 4.7 | 1.38 |
| GPT-5.4 | 1.46 |
| Gemini 2.5 Pro | 1.59 |
All four models are within 15% of each other. The problem is not model choice.
## Top seven recurring bugs
### 1. Missing auth on server actions (24% of samples)
AI-generated server actions frequently skip the session check. The pattern:
```typescript
"use server";

export async function updateEmail(newEmail: string) {
  await db.user.update({ where: { id: /* ??? */ }, data: { email: newEmail } });
}
```

Without an explicit `auth()` call deriving the user ID, this becomes an open endpoint that any attacker with a form can hit.
### 2. Password in JWT payload (17%)
The model dutifully includes a JWT-issuance function, and in some samples it signs the user's password hash (or, horrifyingly, the plaintext) into the token. JWT payloads are only base64-encoded, not encrypted: anyone holding the token can read them.
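A sketch of what a safe payload looks like: identity and expiry claims only, never secrets. Claim names `sub`, `iat`, and `exp` are standard (RFC 7519); `role` is an illustrative custom claim:

```typescript
// Build a JWT claims object containing nothing secret.
function makeClaims(userId: string): Record<string, unknown> {
  const now = Math.floor(Date.now() / 1000);
  return {
    sub: userId,        // subject: who the token is about
    iat: now,           // issued at (seconds since epoch)
    exp: now + 15 * 60, // short 15-minute expiry
    role: "user",       // app-specific claim: fine, it is not secret
    // NEVER: password, password hash, or anything you would not print in a log
  };
}
```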
### 3. JWT verified without algorithm pin (14%)
`jwt.verify(token, key)` without `algorithms: ['RS256']` is vulnerable to algorithm confusion (an attacker re-signs the token with HS256 using the public key as the HMAC secret) and to `alg: none` attacks.
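To make the pin concrete, here is a hand-rolled HS256 verifier that checks the `alg` header before checking the signature. This is a teaching sketch, not a replacement for a JWT library — in practice you would pass `{ algorithms: ["RS256"] }` (or `["HS256"]`) to your library's verify call:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

export function verifyHS256(token: string, secret: string): unknown {
  const [h, p, s] = token.split(".");
  if (!h || !p || s === undefined) throw new Error("malformed token");
  const header = JSON.parse(Buffer.from(h, "base64url").toString());
  // Pin the algorithm: reject anything else, including "none".
  if (header.alg !== "HS256") throw new Error(`unexpected alg: ${header.alg}`);
  const expected = createHmac("sha256", secret).update(`${h}.${p}`).digest();
  const got = Buffer.from(s, "base64url");
  // Length check first: timingSafeEqual throws on unequal lengths.
  if (got.length !== expected.length || !timingSafeEqual(got, expected)) {
    throw new Error("bad signature");
  }
  return JSON.parse(Buffer.from(p, "base64url").toString());
}
```

The important line is the `alg` check: it runs before any cryptography, so a forged `alg: none` token never reaches signature verification.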
### 4. Session cookie missing flags (11%)
`HttpOnly`, `Secure`, and `SameSite` are all missing. The cookie can be read by any XSS payload and is sent over any non-HTTPS link.
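A sketch of a session cookie with all three flags set, assuming you are building the `Set-Cookie` header by hand; most frameworks expose the same attributes as an options object instead:

```typescript
// Build a Set-Cookie header value with the standard hardening attributes.
function sessionCookie(name: string, value: string): string {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "HttpOnly",     // not readable from JavaScript (blunts XSS cookie theft)
    "Secure",       // only sent over HTTPS
    "SameSite=Lax", // not sent on cross-site POSTs (CSRF mitigation)
    "Path=/",
    "Max-Age=3600", // illustrative 1-hour lifetime
  ].join("; ");
}
```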
### 5. Password comparison not constant-time (9%)
`if (storedHash === inputHash)` instead of `crypto.timingSafeEqual`. Timing attacks are practical when an attacker can make repeated attempts.
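A sketch of a constant-time digest comparison using Node's `crypto.timingSafeEqual`. Note that with a proper password hash (Argon2id, bcrypt) you would call the library's own verify function, which already does this internally:

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare two digests without leaking where they first differ.
function hashesMatch(a: Buffer, b: Buffer): boolean {
  // timingSafeEqual throws on unequal lengths, so check that first;
  // digest lengths are public knowledge, so this early return leaks nothing.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```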
### 6. Unauthenticated password reset (8%)
The reset flow issues a new password to the account associated with an email. If the endpoint does not verify the reset token, anyone knowing a user's email can reset their password.
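A sketch of the missing check, assuming the app stored `sha256(token)` and an expiry when the reset email was sent (all names here are illustrative):

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

type StoredReset = { tokenHash: Buffer; expiresAt: number };

// Only accept a reset if the presented token hashes to the stored value
// and the window has not expired.
function resetTokenValid(presented: string, stored: StoredReset): boolean {
  if (Date.now() > stored.expiresAt) return false; // expired window
  const hash = createHash("sha256").update(presented).digest();
  return hash.length === stored.tokenHash.length &&
    timingSafeEqual(hash, stored.tokenHash);
}
```

Knowing the email alone gets an attacker nothing; they also need the random token delivered to the victim's inbox.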
### 7. Session fixation (6%)
The session ID does not rotate on login. An attacker who set the victim's session cookie before login (via an XSS on a different route) now shares the authenticated session.
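The fix is to issue a fresh session ID at the moment of login. A sketch, assuming a server-side session store keyed by opaque IDs (an in-memory `Map` stands in for it here):

```typescript
import { randomBytes } from "node:crypto";

const sessions = new Map<string, { userId: string | null }>();

// On successful login, discard the pre-login session ID and mint a new one.
function rotateOnLogin(oldId: string, userId: string): string {
  sessions.delete(oldId); // invalidate any attacker-planted pre-login ID
  const newId = randomBytes(32).toString("base64url");
  sessions.set(newId, { userId });
  return newId; // set this as the new session cookie value
}
```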
## Which prompts produce safer code
We re-ran all 500 prompts with an appended security instruction:
```
Follow OWASP authentication guidelines. Use Argon2id for password hashing.
Verify JWTs with pinned algorithm, issuer, and audience. Rotate session
identifiers on login. Require explicit session check in every server action.
```

Bug rate dropped from 92.1% to 18.4%. All four models improved similarly.
The conclusion is not "AI cannot write authentication." It is "AI writes whatever authentication you ask for, and you were not asking for a secure one."
## Where Securie fits
The instruction above is not realistic to remember for every prompt. Securie runs the second pass automatically: every PR that touches an authentication code path is reviewed by a specialist that knows the seven patterns above and forty more. Findings become PR comments with the fix.
## Try it
Free security tools — no signup. Or install the GitHub App and every PR gets a review.