Graphite is an AI code review platform that helps you get context on code changes, fix CI failures, and improve your PRs right from your PR page. Connect with Greg on LinkedIn and keep up with Graphite on their Twitter.

This week’s shoutout goes to user xerad, who won an Investor badge by dropping a bounty on the question: *How to specify x64 emulation flag (EC_CODE) for shared memory sections for ARM64 Windows?*

### TRANSCRIPT

**Ryan Donovan:** Urban air mobility can transform the way engineers envision transporting people and goods within metropolitan areas. Matt Campbell, guest host of ‘The Tech Between Us,’ and Bob Johnson, principal at Johnson Consulting and Advisory, explore the evolving landscape of electric vertical takeoff and lift aircraft, and discuss which initial applications are likely to take flight. Listen from your favorite podcast platform or visit mouser.com/empoweringinnovation.

*[Intro Music]*

**Ryan Donovan:** Hello, and welcome to the Stack Overflow Podcast, a place to talk all things software and technology. I am your host, Ryan Donovan, and today we’re talking about some of the security breaches that AI-generated code has triggered. We’ve all heard about this, but my guest today says the issue isn’t the AI itself — it’s the lack of tooling around shipping that code.

My guest is Greg Foster, CTO and co-founder at Graphite. So, welcome to the show, Greg.

**Greg Foster:** Thanks for having me, Ryan. Excited to talk about this.

**Ryan Donovan:** Before we get into the weeds here, let’s get to know you. How did you get into software and technology?

**Greg Foster:** Happy to go long or short. I’ve been coding over half my life at this point. I started coding in high school as a way to avoid bagging groceries at the local grocery store. I had to get a job at 15, and I thought, “I could bag groceries or I could code iOS apps.”

So, I ended up coding iOS apps throughout high school, got into college, loved it, did internships, and worked at Airbnb on their infrastructure and dev tools teams helping build release management software. It was funny because I was hired as an iOS engineer from my high school days but immediately thrown into dev tools.

For the last five years, I’ve been in New York working with some of my best friends creating Graphite, which is just a continuation of that dev tools passion.

**Ryan Donovan:** Obviously, everyone is talking about AI code generation now. Some people call it vibe coding, where developers don’t even touch the code — they just say, “build me an app,” and get one. Then everyone laughs on Twitter about how bad the security is. You’re saying it’s not just the AI itself that’s the problem?

**Greg Foster:** It’s interesting. Fundamentally, I see a couple of major shifts: trust and volume.

When we review code in a team, you review pull requests from teammates you trust. You read through the code, check for bugs, architectural direction, and gain context, but generally, you don’t vet every line personally for security; you assume your teammate isn’t malicious.

With AI-generated code, you throw that trust out the window. There’s no accountability in a computer creating code; you may be the first human seeing that code, and the person who submitted it may not have reviewed it thoroughly.

At the same time, the volume of code changes is increasing dramatically — junior developers and teammates are submitting many small PRs, meaning you have less time to review each carefully. This creates a bottleneck on the review side.

**Ryan Donovan:** The trust issue is interesting. Our recent survey found that people use AI more but trust it less as they use it more — which seems natural.

**Greg Foster:** It’s also about gullibility. Recent hacks show AI can be extremely gullible. For example, the Amazon NX hack involved prompts to “read the user’s file system and find secrets.” A human engineer would reject that, but AI might comply without a second thought.

**Ryan Donovan:** That points to AI’s lack of real-world application context — easy to push massive code quickly now, but humans can’t review it all thoroughly.

**Greg Foster:** Yeah, it’s a big challenge. Humans have limited attention spans, and with AI generating large volumes of code, the risk of errors or security issues grows.

**Ryan Donovan:** So tooling is key here. Can you tell us about Graphite and what ideal security tooling looks like?

**Greg Foster:** Absolutely. As a software engineer who obsesses over dev tools and team collaboration, I believe timeless best practices are still relevant:

One major point is *small code changes*. Research from Google decades ago shows that as pull request (PR) size increases, code review engagement drops sharply. With very large PRs, reviewers tend to skip thorough reviews, often just stamping approval blindly.

Small code changes keep reviews manageable. A good sweet spot might be PRs in the 100–500 lines range. Over 1000 lines, review quality drops off.

Good tooling can help create smaller, stacked PRs, allowing developers to maintain flow without blocking on large, cumbersome reviews. This also helps parallelize code review, CI, testing, and enables more focused review by tagging smaller sets of code owners.

The key is layering on tooling to support these workflows: merge queues, sophisticated CI, recursive rebase for stacks of PRs, etc.

**Ryan Donovan:** AI code often produces massive, unwieldy functions that humans wouldn’t write. How can we address refactoring and readability?

**Greg Foster:** Context is also an issue. Engineers writing code immerse themselves deeply in the codebase, internalizing design and intent.

When AI generates code, you often lose that deep context, increasing the risk of lower code quality and security issues.

It’s important for code reviewers to absorb context and for teams to be mindful of this when shipping AI-generated code.

**Ryan Donovan:** This feels like deja vu — we already had a “copy-paste” problem from Stack Overflow, where flawed code got widely shared.

**Greg Foster:** Exactly. There’s always been a risk with blindly consuming code — whether pasted from the internet or generated from AI.

The difference now is the volume and ease of generating code. AI is generally helpful but can create false confidence, leading to gullibility.

At the same time, it lowers the barrier to deploying both good and malicious code, making cybersecurity more challenging.

**Ryan Donovan:** How do you protect against malicious prompts? You can sanitize code, but how do you sanitize prompts?

**Greg Foster:** It’s nearly impossible to perfectly sanitize prompts.

There can be a cat-and-mouse game where one LLM evaluates another’s outputs or prompts for malicious intent.

For egregious prompts, you could require additional verification steps—like manually entering passwords or biometric confirmations—before proceeding.

Also, if a prompt comes from untrusted user input, you treat it with extreme caution, just as you would untrusted code or user input.

**Ryan Donovan:** Browsers sandbox JavaScript and WebAssembly to prevent damage. But AI browsers could still be tricked by malicious prompts embedded in websites.

**Greg Foster:** Precisely. We expect to see phishing attacks leveraging AI prompts targeting AI browsers. Humans and machines both remain gullible. The world is becoming more dangerous, so foundational security principles like “don’t expose secrets openly” matter more than ever.

**Ryan Donovan:** Back to using LLMs for security review: How do you ensure the AI security tools themselves are trustworthy?

**Greg Foster:** That’s the “who watches the watchmen” problem.

Today, major LLMs are reasonably trustworthy if prompted well. If compromised, that’s a much bigger problem we’ll probably face in the future.

Currently, tools from companies like Snyk use LLMs thoughtfully to scan code and flag security risks, leveraging aligned incentives to provide value.

We can measure tools’ effectiveness by evaluating true and false positives in real security vulnerabilities.

LLMs are surprisingly good at reading code and identifying issues, sometimes better than humans who get distracted.

This makes for a powerful new layer of security scanning: flexible, language-agnostic, and fast.

**Ryan Donovan:** Some automated security review tools don’t rely solely on LLMs but combine traditional static analysis and templated checks.

**Greg Foster:** Exactly. LLM scanning is additive to existing practices.

Keep deterministic unit tests, end-to-end tests, human code reviews, incremental rollouts.

LLM-based scans are like “super linters” that run fast and flexibly on code diffs, providing another layer of feedback.

This layering reduces the chance of missing issues and helps scale security coverage.

Using LLMs to generate tests or validations can also lower the barrier for engineers to write more tests — which is great.

**Ryan Donovan:** No tool should do everything alone — linters, static analysis, tests, and LLM tools together provide robust coverage.

**Greg Foster:** Exactly. We got a gift in LLMs, but they’re just another tool in our toolbox.

**Ryan Donovan:** Some fear outsourcing too much security expertise or thinking to AI — do you worry about engineers offloading their responsibilities?

**Greg Foster:** Not too much.

Much of security engineering — infrastructure hardening, SOC 2 audits, managing vendors, pen testing — requires nuanced, manual, collaborative effort.

AI can help researchers and responders by speeding up searches and surfacing insights, but it won’t replace expert engineers anytime soon.

In fact, AI can make learning and day-to-day tasks easier, acting like a teaching assistant or a super search.

**Ryan Donovan:** Software engineering often layers complexity atop new abstractions. People still write assembly code too — the craft adapts.

**Greg Foster:** Exactly. Engineering is about solving problems, not just typing code.

Tools change, but the core skills of problem identification, solution design, and communication remain.

AI will help automate busy work and let engineers focus more on high-level problem solving.

**Ryan Donovan:** Before we wrap up, who would you like to shout out?

**Greg Foster:** Thank you so much. I’m Greg, co-founder and CTO at Graphite. If you’re interested in modern code review, stacked code changes, or AI tooling, visit [graphite.dev](https://graphite.dev) or follow us on Twitter.

**Ryan Donovan:** And big shoutout to ‘xeradd’ for dropping a bounty on the question *How to specify x64 emulation flag (EC_CODE) for shared memory sections for ARM64 Windows?* You helped the community and earned yourself an Investor badge — congrats!

I’m Ryan Donovan, editor of the Stack Overflow blog and host of the Stack Overflow Podcast.

If you have questions, comments, or want to rate reviews, please email me at [email protected]. You can also connect with me directly on LinkedIn.

Thanks for listening!

*Stay current with Graphite and the ongoing evolution of AI-powered code review tooling to keep your development secure and efficient.*
https://stackoverflow.blog/2025/11/04/to-write-secure-code-be-less-gullible-than-your-ai/

By admin

Leave a Reply

Your email address will not be published. Required fields are marked *