Blog
Bolstering LLM Security to Protect Against OWASP Top Ten Threats

Bolstering LLM Security to Protect Against OWASP Top Ten Threats

MARK SEMENENKO

Published June 28, 2024

Last edited: March 17, 2025

We’re all witness to the buzz around Generative AI (GenAI). These offerings, which convert user prompts into model-generated text, images, and videos, have permeated our professional and personal lives as they become increasingly accessible.

One incredibly prevalent type of GenAI is the Large Language Model (LLM), which includes the likes of ChatGPT and Google Gemini. You may already have used an LLM to write threatening letters of complaint, ghost-write your hilarious yet endearing wedding speeches, or rephrase the impolite things you would like to say to your colleagues, vendors, or customers to sound somewhat more professional. Whether for personal or business reasons, the use of LLMS is more popular than ever – and even more important to how businesses operate.

Regardless of how you’ve used LLMs, you’re not alone. Respondents to the 2024 State of Data Security Report reported that 88% of their employees are using GenAI, whether the company has officially adopted it or not. It’s easy to understand why – LLMs can provide fast, consumable access to information, shorten menial or manual tasks, and help users expand on new ideas, amongst other benefits.

The risks associated with LLMs, however, are often misunderstood or overlooked. And on top of this, about half of today’s data leaders claim that their data security strategy is failing to keep up with the pace of AI evolution. Over my twelve years in technical consulting and implementation, I’ve navigated the security and privacy implications of many exciting, powerful new technologies. In this blog, we’ll look at some of the top threats you need to consider when implementing or using LLMs, and how they can be proactively combated.

OWASP Top 10 for Large Language Model Applications

The Open Worldwide Application Security Project (OWASP) is a community-led nonprofit organization with a mission to foster secure software application development and deployment. Since 2003, OWASP has produced its “Top Ten,” an awareness document that presents a consensus view of the 10 most critical security risks for web applications.

At the end of 2023, OWASP published the Top 10 for Large Language Model Applications to specifically address the challenges facing modern LLM development. By raising awareness of these agreed-upon vulnerabilities, the organization aims to improve the way we develop, deploy, and utilize LLMs moving forward.

Most Pressing Threats to LLM Security

While each of the risks in OWASP’s top 10 list poses a legitimate threat to LLM development and use, we’ll focus on two of what we believe are the most dangerous – model poisoning and prompt injection. By examining these in detail, we can better understand their relevance – and how best to mitigate them.

Model Poisoning Threats

Let’s first consider how LLMs are created. As with any AI or Machine Learning (ML) model, LLMs are trained on large data sets. These data sets are the food that’s fed to the models, and they completely determine how the model behaves and responds to user prompts. A “healthy” diet will result in a model that returns relevant, well-written results. Eat something bad, however, and you might just end up with food poisoning.

Once a model has been trained, it’s not possible to interrogate it to determine which data was used in its training. But, by manipulating the input data, it is possible to alter or skew the outputs. This form of attack is called model poisoning, and its goal is to reduce the model’s accuracy by injecting incorrect, biased, or deceptive data into the training process.

It is extremely difficult to detect model poisoning attacks, especially when the data being used to train the model appears plausible. In addition, LLMs often draw data from non-curated sources, such as the open Internet. An adversary could deliberately plant mass amounts of false information in these public data sources, or modify this data, to poison the models that are trained on them.

To mitigate model poisoning threats, it is imperative to:

Properly secure access to training data
Verify the veracity of the training data’s lineage
Simplify continuous security monitoring and auditing
Automate where possible to detect changes to source training data

These measures will ensure there is no unauthorized access to training data, and that any modifications to these data sets are detected and logged, allowing for a retrospective analysis of the data on which a model was trained.

Prompt Injection Threats

Another threat to LLMs comes in the form of prompt injection, which occurs after the model has been trained and deployed. Similar to SQL injection – one of the most common web hacking techniques – LLMs can be manipulated by injecting malicious inputs disguised as legitimate prompts. An adversary is able to craft an input that appears legitimate, but is designed to elicit a different, often confidential response.

Prompt injection is a particularly concerning threat for Retrieval-Augmented Generation (RAG)-assisted LLMs. RAG-based AI models use a combination of LLMs and external data sources to generate grounded, accurate, and contextually relevant responses. This means that the model has access to both training data and real-time information, making it more vulnerable to a prompt injection attack.

To manipulate the LLM, a malicious user could craft a prompt and directly input it into the LLM, upload files containing harmful instructions, or ask the LLM to summarize documents containing the hostile prompt. LLMs by nature do not distinguish between user instructions and external data – making prompt injection much more difficult to protect against than SQL injection, where inputs can be more easily sanitized.

There are a number of methods for mitigating prompt injection attacks, including:

Monitoring for and detecting malicious prompt attempts
Treating your LLMs as you would untrusted users in your data ecosystem
Enforcing data security and privacy controls on both training data and real-time data sources

This multifaceted approach to protecting all data the LLM may access will significantly reduce the risk of disclosing sensitive data or harmful information posed by malicious prompts.

Enforcing LLM Security at Every Layer

To properly mitigate threats and empower your organization to leverage GenAI – whether in the form of an LLM, RAG, or any other form of AI/ML – we must consider and three layers of model security:

The storage layer, where the training and retrieval data is stored at rest.
The data layer, where data is transformed or “chunked” to be used for data engineering and model training.
The prompt layer, where users interact directly with the models.

As evidenced by the craftiness of model poisoning and prompt injection attacks, protecting solely the data storage layer is no longer sufficient for LLM security. Instead, you need to maintain comprehensive and consistent controls on each layer in order to ensure that the data being ingested and shared by the model is accurate, secure, and appropriate for user consumption.

When enforcing LLM security, there are three core facets to consider:

Data Discovery and Classification – You must understand which data is sensitive to treat and protect it appropriately. This may include manual cataloging, ideally with automated sensitive data discovery and classification, which changes depending on the context in which the data resides.
Data Access-Control – You must enforce fine-grained access control in a way that is transparent, federated, and scalable. Consider using modern access control techniques such as attribute-based access control (ABAC) or purpose-based access control (PBAC), which each add a level of extra flexibility and contextual information to your ability to grant or deny data access.
Monitoring and Auditing – It is essential to ensure that both user and LLM activity are monitored for any suspicious behavior, especially where prompt injection and model poisoning are involved. Keep accurate, available, and transparent records of activity on your data, ideally including simplified review/search and automatic alerting of anomalous actions.

The Immuta Data Security Platform gives you each of these dynamic capabilities, helping to enforce consistent security across the various layers of your RAG-based GenAI and LLM tools. To learn more about enforcing holistic LLM security and mitigating the risk of these attacks, request a demo from our team. For insight into how more than 700 data professionals are leveraging and protecting their AI models, check out The AI Security & Governance Report.

The AI Security & Governance Report

See how 700+ data professionals are leveraging and protecting their AI models.

3 Emerging Data Security Laws and What They Mean for You

The past few months have been particularly hectic for lawmakers across the European Union (EU). With Ursula von der Leyen’s leadership of the European Commission set to conclude after the 2024 elections, lawmakers have felt the pressure to advance critical files and policies as quickly as possible. Amid this legal...

3 Key Obstacles of Military Data Ops and What They Indicate

There are many lessons that I learned throughout my civilian and military careers, but one that continues to hold true is that obstacles to accessing the data national and strategic analysts need are not caused by a lack of reporting in the field. Rather, it is the technical burdens, lack...

3 Solution Patterns for Data Security Success

You’ve just adopted a data security solution – congratulations! Now what? As with any new piece of technology, it can be tempting to jump in feet first to solve all your problems. After all, the sooner you achieve ROI, the better – right? Not necessarily. Often, we see customers struggle...

your data

Put all your data to work. Safely.

Innovate faster in every area of your business with workflow-driven solutions for data access governance and data marketplaces.

Book a demo

Platform Services

Metadata Registry

Data Discovery & Classification

Policy Entitlement Engine

Unified Audit

Data Domains

Apps

Data Marketplace

Data Access Governance

Ecosystem Partners

Native and API Integrations

Get Started

Take a tour of Access Governance

Take a tour of Data Marketplace

Schedule a live demo

Find a consulting partner

Data problems we solve

Unify data access control

Publish & find data products

Create & enforce policy

Monitor & audit data usage

Speed business innovation

Roles we empower

Data Product Owner

Data Consumer

Data Steward

Data Governor

Data IT

Industries we transform

Financial Services

Health & Life Sciences

Public Sector

How Immuta Copilot Accelerates Data Provisioning at AI Scale

Get in the know

Blog

Resource Center

Data Fundamentals

Get a deeper look

Demo Hub

How-To Guides

Schedule a Live Demo

Get connected

Events & Webinars

Sign Up for Our Newsletter

Get support

Documentation

Customer Support

Get inspired

About us

Who We Are

Leadership

Customers

Partners

News

Connect with us

Careers

Upcoming Events

Contact Us

Customer spotlight