Case Study: A Real CVE Born from AI-Generated Code
TL;DR
A deep analysis of a real CVE that originated from AI-generated code. We trace the vulnerability from its AI-suggested implementation through discovery, exploitation, and fix to understand how AI coding assistants can introduce security flaws.
In early 2024, security researchers discovered a critical path traversal vulnerability in a popular open-source project. The interesting part? The vulnerable code was traceable to a GitHub Copilot suggestion that the developer had accepted without modification. This case study examines how AI-generated code became a real CVE.
Note: Some details have been modified to protect the affected project, but the vulnerability pattern and lessons are real and applicable.
The Vulnerable Code
The project was a Node.js file server that allowed users to download files from a designated directory. The developer was implementing a download endpoint and typed the comment '// Get file from downloads folder based on filename'. Copilot suggested the following implementation:
app.get('/download/:filename', (req, res) => {
  const filename = req.params.filename;
  const filepath = path.join(__dirname, 'downloads', filename);
  res.sendFile(filepath);
});

The code looks reasonable. It takes a filename parameter, joins it with the downloads directory path, and sends the file. The developer tested it—/download/report.pdf worked perfectly. They committed the code and moved on.
The Vulnerability: Path Traversal
The problem is that path.join() doesn't prevent directory traversal. An attacker could request:
/download/..%2F..%2F..%2Fetc%2Fpasswd

After URL decoding and path joining, this becomes:

/app/downloads/../../../etc/passwd
→ resolves to: /etc/passwd

The attacker can read any file the Node.js process has access to: configuration files with database credentials, .env files with API keys, private keys, source code, and system files. This is a textbook path traversal vulnerability—OWASP A01:2021 Broken Access Control.
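To make the failure mode concrete, here is a minimal Node.js sketch (the /app/downloads base path is illustrative) showing that path.join() normalizes the '..' segments instead of rejecting them:

// Sketch: path.join() normalizes '..' segments rather than blocking them.
const path = require('path');

const base = '/app/downloads';            // illustrative base directory
const userInput = '../../../etc/passwd';  // decoded attacker-controlled filename

const joined = path.join(base, userInput);
console.log(joined);                      // '/etc/passwd'
console.log(joined.startsWith(base));     // false: the path has escaped the base directory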
Why Copilot Suggested This
The AI's suggestion wasn't random—it's a common pattern in training data. Tutorials and examples often use path.join() for file serving because it handles cross-platform path separators. The pattern 'works' in the sense that it serves files correctly. Security wasn't part of the prompt, so security wasn't part of the solution.
Copilot can't understand that user input in a filename is an attack vector. It saw a pattern: comment about serving files + path.join + sendFile. It reproduced that pattern without understanding the security implications.
The Discovery
A security researcher was auditing the project as part of a bug bounty program. They noticed the download endpoint accepted user-controlled input in the path. Classic red flag. They tested:
curl 'https://target.com/download/..%2F..%2Fetc%2Fpasswd'

The server returned the contents of /etc/passwd. Vulnerability confirmed.
The researcher reported the vulnerability through responsible disclosure. When the maintainers investigated, they found the vulnerable code had been added in a commit with the message 'Add file download endpoint'. Looking at the git diff and the developer's workflow, they traced the exact code to a Copilot suggestion.
The Impact
Before the fix was deployed, attackers could: read any file on the server, access environment variables and configuration, steal database credentials, exfiltrate source code, and potentially pivot to other systems using stolen credentials.
The CVE was rated High severity (CVSS 7.5). The project had over 10,000 weekly downloads on npm, meaning thousands of applications were potentially vulnerable.
The Fix
The secure implementation validates that the resolved path stays within the intended directory:
app.get('/download/:filename', (req, res) => {
  const filename = req.params.filename;
  const downloadsDir = path.join(__dirname, 'downloads');
  const filepath = path.join(downloadsDir, filename);
  const resolved = path.resolve(filepath);

  // Validate the resolved path stays within the downloads directory.
  // Comparing against downloadsDir plus the path separator also blocks
  // sibling directories such as downloads-backup from passing the check.
  if (!resolved.startsWith(downloadsDir + path.sep)) {
    return res.status(403).send('Access denied');
  }

  if (!fs.existsSync(resolved)) {
    return res.status(404).send('File not found');
  }

  res.sendFile(resolved);
});

The key addition is path.resolve() followed by a check that the resolved path still starts with the allowed directory, including the trailing path separator so that sibling directories cannot slip past the prefix check. This blocks traversal attempts, even encoded ones like %2e%2e%2f or ..%5c.
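After deploying the patch, a quick manual re-test (hostname is a placeholder, as in the earlier example, and report.pdf is assumed to exist in the downloads folder) should show the traversal request rejected while legitimate downloads keep working:

curl -i 'https://target.com/download/..%2F..%2F..%2Fetc%2Fpasswd'   # expect 403 Access denied
curl -i 'https://target.com/download/report.pdf'                    # expect 200 and the file contents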
Lessons Learned
1. AI Doesn't Understand Security Context
Copilot suggested working code, but 'working' and 'secure' are different criteria. The AI optimized for functionality because that's what the prompt implied. Without explicit security requirements, you get convenient but potentially dangerous patterns.
2. User Input Anywhere = Attack Surface
Whenever user input influences file paths, database queries, or command execution, treat it as hostile. This is Security 101, but it's easy to forget when AI is writing the code and it 'just works' in testing.
3. Testing Doesn't Catch Security Issues
The developer tested the endpoint—it worked correctly for legitimate requests. Security vulnerabilities aren't about whether code works; they're about whether it can be abused. Normal testing won't find path traversal. You need security-focused testing.
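As a sketch of what security-focused testing can look like here (a Jest-style test, assuming the Express app is exported from ./app and the supertest package is installed), a regression test can assert that traversal attempts are rejected:

// security.test.js - hypothetical regression test for the download endpoint.
const request = require('supertest');
const app = require('./app');

test('rejects path traversal in the filename parameter', async () => {
  await request(app)
    .get('/download/..%2F..%2F..%2Fetc%2Fpasswd')
    .expect(403);
});

test('still serves legitimate files', async () => {
  await request(app)
    .get('/download/report.pdf')
    .expect(200);
});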
4. Code Review Must Include Security Review
If this code had been reviewed with security in mind, the path traversal would have been caught. Ask: 'What happens if the input contains ../?' for any code handling paths. Make security review part of your code review checklist.
Preventing AI-Generated Vulnerabilities
Use static analysis tools that catch security issues. ESLint security plugins, Semgrep, and CodeQL all have rules for path traversal. Run them in CI/CD to catch vulnerabilities before they're merged.
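As one concrete example, a minimal ESLint configuration enabling eslint-plugin-security is sketched below (the exact extends name varies by plugin and ESLint version); rules such as detect-non-literal-fs-filename flag filesystem calls whose path comes from a variable, which is a prompt to review user-influenced paths like this one.

// .eslintrc.js - minimal sketch, assuming eslint-plugin-security is installed.
module.exports = {
  plugins: ['security'],
  extends: ['plugin:security/recommended'],
};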
Prompt with security context. If the developer had typed '// Securely serve files from downloads folder, preventing path traversal', Copilot might have suggested the secure pattern. AI responds to the context you provide.
Review AI suggestions critically. Don't accept suggestions blindly, especially for security-sensitive operations. Pause and ask: 'What could go wrong here? What if the input is malicious?'
Conclusion
This CVE wasn't caused by AI malice or even AI incompetence—it was caused by using an AI suggestion without security consideration. The AI did exactly what it was designed to do: complete code based on patterns. The security gap was in human review, not AI generation.
As AI coding assistants become more prevalent, we'll see more vulnerabilities with this pattern. The defense is awareness: knowing that AI suggestions prioritize functionality over security, and adding security review as a mandatory step when accepting AI-generated code.
References
• OWASP Path Traversal - owasp.org/www-community/attacks/Path_Traversal
• Node.js Security Best Practices - nodejs.org/en/docs/guides/security
• Stanford - 'Security of AI-Generated Code' Study