Advice from a CTO: Secure Code Practices for AI Code Assistants

What we should know about the security of AI-generated code, and how we can improve it.

Published on
04 June 2025

Jenn Gile

Head of Community at Endor Labs

AI code assistants like Cursor, Copilot, and Windsurf are transforming software development, enabling the rapid translation of ideas into functioning code, sometimes in minutes. While these tools can be powerful force multipliers, particularly for experienced developers, they also introduce significant security risks. Trained on a vast pool of public projects, these models inevitably inherit both good and bad programming patterns. Furthermore, they enable non-programmers to generate code, lowering the barrier to entry without ensuring that non-developers (HR, finance, actuaries, data scientists, etc.) build secure applications.

Dimitri Stiliadis is CTO and Co-Founder at Endor Labs, where he directs strategy for the company’s responsible use of AI and how AI is incorporated into the product. From the unusual vantage point of a CTO at a security vendor, Dimitri has a lot to say about the challenges of getting AI co-pilots to generate secure code.

At the May 2025 edition of LeanAppSec Live, Dimitri shared five key lessons for security engineers and developers navigating the use of AI code assistants:

  • Implement guardrails
  • Get real-time security signal
  • Watch your dependency blast radius
  • Compensate for non-determinism
  • Invest in prompt engineering

You can watch the 30-minute session, including audience Q&A. To download his slides or see the full replay, register for the on-demand version.

Lesson #1: Implement Guardrails

One of the most critical steps is establishing clear guardrails for AI assistance. These guardrails define what the agents can and cannot do and highlight areas requiring specific attention. Examples include establishing dependency budgets or mandating the use of unit tests.

Apply org-wide AI policy

An organization should apply an org-wide AI policy to enforce standards across all AI assistance tools. This policy is vital for specifying what data may and may not be shared with these tools. Sensitive data such as secrets and certificates poses a significant risk: it is often stored in environment files that are kept out of version control via .gitignore. Vibe coding tools may not honor these ignore files, potentially sending secrets to their memory stores or indexing them in backend databases (like RAG systems), which dramatically increases the blast radius if those systems are compromised. Therefore, it is essential to explicitly tell the tools not to index such critical data. Policies should cover which files are allowed to be shared with the tools' backends.
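To make the idea of a sharing policy concrete, here is a minimal Python sketch of a check that decides which files may be sent to an assistant's backend. The deny patterns and the function names (may_share, split_by_policy) are assumptions for illustration, not part of any specific tool; a real policy would be defined by your security team.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical org-wide deny list: files that must never leave the
# developer's machine. Replace with the patterns your policy requires.
DENY_PATTERNS = (".env", ".env.*", "*.pem", "*.key", "*.p12", "secrets/*")


def may_share(path, deny_patterns=DENY_PATTERNS):
    """Return False if the file name or relative path matches a denied pattern."""
    posix = PurePosixPath(path)
    for pattern in deny_patterns:
        # Check the bare file name first, then the full relative path.
        if fnmatch(posix.name, pattern) or fnmatch(str(posix), pattern):
            return False
    return True


def split_by_policy(paths):
    """Partition candidate paths into (shareable, blocked) lists."""
    shareable = [p for p in paths if may_share(p)]
    blocked = [p for p in paths if not may_share(p)]
    return shareable, blocked
```

In this sketch, directory patterns such as secrets/* only match from the repository root, so a production policy would likely need stricter matching.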

Mandate testing before commits

Automated checks can help catch issues in AI-generated code. Mandating testing, particularly unit tests, before commits is highly recommended. Dimitri recommends a test-driven approach, where the agent first writes unit tests, a human validates the test, and then the agent writes code to satisfy those tests.

Lesson #2: Get Real-Time Security Signal

AI models are trained on data that is often outdated. In fact, models may be trained on code that is at least a year old, meaning they lack information on new vulnerabilities (CVEs) disclosed daily. They are also unlikely to be trained on the latest frameworks or libraries. Expecting an AI tool to inherently know about recently discovered dependency vulnerabilities is unrealistic.

Fresh intelligence

This is why fresh intelligence is crucial. Coupling AI code assistants with tools that provide real-time or up-to-date information, such as MCP servers, allows injecting current security intelligence into the development workflow. These tools can inform the agent about what should or should not be used based on current vulnerability data, leading to much better results. While future AI models may be better trained to avoid common static analysis (SAST) pitfalls like SQL injection, they will still need real-time signals for newly identified vulnerabilities in dependencies. Security tools providing this signal help the AI produce more secure code.
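As a rough illustration of what "fresh intelligence" means in practice, the following sketch checks a dependency the assistant proposes against a snapshot of current advisories. The feed contents, the package name, and the EXAMPLE-0001 identifier are placeholders invented for the example; a real integration would pull this data from a live source such as an MCP server or vulnerability database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    advisory_id: str              # placeholder identifier, not a real CVE
    package: str
    affected_versions: frozenset


# Hypothetical snapshot of an up-to-date advisory feed.
FEED = [
    Advisory("EXAMPLE-0001", "left-padder", frozenset({"1.0.0", "1.0.1"})),
]


def review_dependency(package, version, feed=FEED):
    """Return a verdict the assistant can act on before adding a dependency."""
    hits = [
        a.advisory_id
        for a in feed
        if a.package == package and version in a.affected_versions
    ]
    if hits:
        reason = f"{package} {version} is affected by " + ", ".join(hits)
        return {"allowed": False, "reason": reason}
    return {"allowed": True, "reason": "no known advisories in the current feed"}
```

The point of the design is that the verdict comes from data newer than the model's training set, so the assistant does not need to "know" about the advisory itself.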

Lesson #3: Watch Your Dependency Blast Radius

AI tools often don't pay attention to the number of dependencies they introduce, meaning even simple suggestions can inflate your dependency tree and lead to vulnerability multiplication. It's essential to enforce dependency budgets and provide feedback to the AI to help it limit dependency bloat.

Small features can have a huge impact

By default, AI tools pay little attention to the number of dependencies they introduce, so even simple suggestions or prompts can inflate your dependency tree. In a vibe coding experiment to build a simple board game collection app, an initial prompt resulted in 1,292 dependencies, which increased to 1,600 after attempts to fix issues. This happens because each direct dependency can bring in numerous transitive dependencies.
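To see how one direct dependency fans out, consider this small sketch that walks a dependency graph and collects everything reachable from the direct dependencies. The graph below is a made-up toy example, not data from the experiment.

```python
from collections import deque


def transitive_dependencies(direct, graph):
    """Return every package reachable from the direct dependencies."""
    seen = set()
    queue = deque(direct)
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already counted via another path
        seen.add(package)
        queue.extend(graph.get(package, []))
    return seen


# Toy graph (hypothetical package names): package -> packages it depends on.
TOY_GRAPH = {
    "web-framework": ["router", "templating"],
    "router": ["path-parser"],
    "templating": ["escape-utils", "path-parser"],
}
```

Here a single direct dependency, web-framework, resolves to five installed packages; real ecosystems multiply this effect across hundreds of packages.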

Vulnerability multiplication and risk expansion

Each dependency introduces new risks, and these vulnerabilities can multiply. Also, risks extend beyond common weaknesses (CWEs) and CVEs; AI can even make architectural changes that impact the application's security posture.

Enforce dependency budgets

To mitigate potential risks, it's vital to enforce dependency budgets and only bring in dependencies if absolutely necessary. Providing real-time feedback via tools like MCP servers can help the AI understand and limit dependency bloat.
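A dependency budget can be enforced with a simple check that compares the resolved dependency set before and after an AI-generated change. The limits and the function name below are illustrative assumptions; appropriate thresholds depend on your project.

```python
def check_budget(before, after, max_total, max_new):
    """Compare dependency sets and return (added_packages, violations)."""
    before_set, after_set = set(before), set(after)
    added = sorted(after_set - before_set)
    violations = []
    if len(after_set) > max_total:
        violations.append(
            f"total dependencies {len(after_set)} exceed budget {max_total}"
        )
    if len(added) > max_new:
        violations.append(
            f"{len(added)} new dependencies exceed per-change budget {max_new}"
        )
    return added, violations
```

The violation messages can be returned to the assistant as feedback, prompting it to retry with fewer or no new packages.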

Lesson #4: Compensate for Non-Determinism

A key characteristic of LLMs is their non-deterministic output: the exact same prompt given multiple times can produce substantially different code. This unpredictability poses a significant challenge for maintaining and debugging code generated by these tools. For example, if you need to fix a bug or add a feature later, regenerating the code from the prompt might yield vastly different outcomes.

Tool feedback brings some “sanity”

Introducing determinism into the software engineering process is necessary. Coupling AI tools with deterministic tools like code analyzers, linters, security analyzers, and dependency analyzers provides feedback to the AI. This feedback helps the AI comply with organizational policies and introduces some "sanity" or predictability into the generated code, preventing massive, unpredictable changes across the codebase when attempting minor fixes. Validation is required for fixes suggested by AI tools, especially through proper unit testing to ensure fixes don't introduce new issues. Relying solely on the AI to validate its own fixes or change unit tests to match flawed code is a pitfall to avoid; proper engineering practices are paramount.
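The feedback loop described above can be sketched as follows: deterministic checkers inspect each generated result, and their findings are fed back to the assistant until the checks pass or a round limit is reached. The fake_assistant function stands in for a real model call and the no_eval checker stands in for a real analyzer; both are hypothetical.

```python
def refine(generate, checkers, prompt, max_rounds=3):
    """Regenerate code until all deterministic checkers report no findings."""
    feedback = []
    for round_number in range(1, max_rounds + 1):
        code = generate(prompt, feedback)
        # Each checker returns a list of finding messages (empty means clean).
        feedback = [finding for check in checkers for finding in check(code)]
        if not feedback:
            return code, round_number
    raise RuntimeError(f"checks still failing after {max_rounds} rounds: {feedback}")


def no_eval(code):
    """Toy deterministic rule: flag any use of eval()."""
    return ["avoid eval(); parse input explicitly"] if "eval(" in code else []


def fake_assistant(prompt, feedback):
    """Stand-in for a model call: it only improves once it receives feedback."""
    if feedback:
        return "result = int(user_input)"
    return "result = eval(user_input)"
```

Because the checkers are deterministic, the same code always produces the same findings, which is what gives the loop its predictability even though the generator is not.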

Model dependency

Understanding that different models will perform differently is also key; results will vary. Proper testing and qualification by security teams are needed before widespread organizational use.

Models can fix issues

With enough guidance, models can be very good at fixing issues. Fixing mundane security issues may become easier because tools will be able to do it. However, architectural issues spanning multiple models or the application's architecture may not be fixable by the tools.

Lesson #5: Invest in Prompt Engineering

Effective interaction requires careful prompt engineering and tool configuration, including using rules files to drive development and desired behaviors.

Use rules files to drive development

Effective interaction with AI code assistants requires careful prompt engineering. Beyond just asking for code, prompts can be used to drive desired behaviors and enforce standards. This includes using rules files to drive development. For instance, you can instruct the AI to use specific testing frameworks or follow organizational coding guidelines by incorporating these rules into the tool's configuration or prompt structure. Instead of letting every user innovate on prompts, organizations should create and share reusable prompt patterns and rules. You can even ask the tool to generate proper software engineering rules and then use those rules as input for building the application.
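As one possible way to share reusable rules rather than having each user write their own, the sketch below renders a central rule set into plain text that could be saved as a rules file or prepended to prompts. The rule wording, section names, and output layout are assumptions; each tool defines its own rules file format.

```python
# Hypothetical shared rule set maintained by the organization.
ORG_RULES = {
    "Testing": [
        "Write unit tests before implementation code.",
        "Do not modify existing tests to make failing code pass.",
    ],
    "Dependencies": [
        "Add a new dependency only when no existing one fits.",
    ],
    "Security": [
        "Never read or transmit files that match the ignore rules.",
    ],
}


def render_rules(rules):
    """Render the rule set as plain text, one section per heading."""
    lines = []
    for section, items in rules.items():
        lines.append(f"{section}:")
        lines.extend(f"  - {item}" for item in items)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```

Keeping the rules in one versioned source means every developer and every assistant session starts from the same standards.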

Use ignore rules to protect sensitive data

A critical aspect of prompt engineering and tool configuration is using ignore rules to protect sensitive data. You must explicitly prevent tools from sending critical information like secrets, certificates, or proprietary data stored in various files to the AI's backend services. If you don't explicitly tell the tool to stop indexing critical information like secrets and certificates, that data may end up indexed in a RAG database on the tool's back end, increasing your blast radius if that database is compromised. Pay extreme attention to what data is allowed to be transmitted to these back ends.
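A simple audit can confirm that an assistant's ignore file actually lists the patterns your policy requires. The sketch below assumes the ignore file uses one pattern per line with # comments, similar to .gitignore; the required patterns are illustrative, and the actual file name and syntax depend on the tool you use.

```python
# Hypothetical patterns that the policy requires every ignore file to contain.
REQUIRED_PATTERNS = (".env", "*.pem", "*.key")


def parse_ignore(text):
    """Return the set of patterns in an ignore file, skipping blanks and comments."""
    patterns = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.add(line)
    return patterns


def missing_patterns(ignore_text, required=REQUIRED_PATTERNS):
    """List required patterns that do not appear literally in the ignore file."""
    present = parse_ignore(ignore_text)
    return [pattern for pattern in required if pattern not in present]
```

This check compares patterns literally and does not reason about equivalent globs, so it works best when the required patterns are copied verbatim from a shared template.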

Final Takeaways

AI code assistants are powerful tools that can significantly increase productivity, especially for experienced developers. However, when used without proper controls, particularly by inexperienced users, they can create substantial security work and introduce chaos. These tools were not inherently designed with robust software engineering or code security principles in mind, although they are expected to evolve.

For security engineers and developers, integrating security tools to augment AI functionality for production-quality code is a must. This involves:

  • Using rules and mandating their application
  • Paying extreme attention to what files and data are shared with the tools
  • Ensuring the tools use deterministic inputs, often provided by coupling them with static analysis, dependency analysis, and other security analysis tools that offer real-time feedback and enforce organizational policies

Ultimately, while AI can assist with mundane coding tasks and even fixing simple security issues, understanding and debugging code will become even more important than writing it. Core software engineering and security fundamentals cannot be replaced (see 5 Essential Skills for AppSec Engineers). By implementing guardrails, demanding real-time security intelligence, managing dependencies, accounting for non-determinism, and practicing diligent prompt engineering, organizations can leverage AI code assistants while mitigating the significant security risks they introduce.

