All About RAG: What It Is and How to Keep It Secure

Bar-El Tayouri October 24, 2024 7 min read

AI is growing in power and scope and many organizations have moved on from “simply” training models. In this blog, we will cover a common system of LLM use called Retrieval-Augmented Generation (RAG).

RAG adds some extra steps to typical use of a large language model (LLM) so that instead of working off just the prompt and its training data, the LLM has additional, usually more up-to-date, data “fresh in mind”.

It’s easy to see how huge this can be for business; being able to reference current company data without having to actually train an AI model on it has many useful applications. However, like all generative AI use cases, it comes with unique risks around inputs, outputs, and data flows. See our primer on generative AI security for the broader context. The adoption curve is staggering—see our roundup of generative AI statistics for a sense of how fast these systems are scaling across industries.

This article is part of a series of articles on AI Security.

How does RAG work?

RAG requires orchestration of two models, an embedder and a generator. A typical RAG system starts with a user query and a corpus of data such as company PDFs or Word documents.

Here’s how a typical architecture works:

During a pre-processing stage, the corpus is processed by an AI model called an embedder which transforms the documents into vectors of semantic meaning instead of plain words. Technically speaking, this stage is optional, but it makes things a lot faster if the documents are pre-processed and accessed from a vector database, rather than processed at runtime.

When a user query comes in, the prompt is also fed to the embedder, for the same reason.

Next, the embedded user query is used by a retrieval system to pull relevant pieces of text from the pre-embedded corpus. The retrieval system returns with a ranked set of relevant vectors.

The embedded user query and relevant documents are fed into a generative AI model, specifically a pre-trained large language model (LLM), which then combines the user query and retrieved documents to form a relevant and coherent output.

Security risks with RAG

The two biggest risks associated with RAG systems are poisoned databases and the leakage of sensitive data or personally identifiable information (PII). We’ve already seen instances where malicious actors manipulate databases by inserting harmful data. Attackers can skew the system’s outputs by making their data disproportionately influential, effectively controlling the AI’s responses, which poses a serious security threat. When implementing RAG, it’s essential to ask key questions: What models are you using for embedding and generation, and where are you storing your data?

Choosing the right models is crucial because different models handle security, accuracy, and privacy differently. Ensuring that these models are fine-tuned for security and privacy concerns or that services are blocking malicious behavior is key, as poorly selected models and third-party services can introduce vulnerabilities. Another overlooked risk comes from unauthorized AI connectivity in your stack. Our analysis of MCP security shows how hidden or unvetted connections can create serious blind spots in AI pipelines.

If you’re using a vector database like Pinecone or LlamaIndex, you must ensure that your data storage complies with security and privacy regulations, especially if you’re working with sensitive data. Beyond storage risks, protecting the underlying model itself is critical. Our guide to AI model security covers practical steps to safeguard models from theft and tampering.

These databases store the map between the embedding and text, and ensuring that they are properly encrypted and access-controlled is vital to prevent unauthorized manipulation. Developers often choose platforms like OpenSearch, a low-code vector database solution, because it offers easier management of these security aspects, with built-in monitoring, access control, and logging to help avoid data poisoning and leakage.

In addition to model selection and secure data storage, all AI systems operate with a system prompt—a hidden instruction set that initializes every task or conversation. Adjusting this system prompt can help mitigate security issues, such as preventing the model from generating harmful or sensitive content. However, while strengthening the system prompt can help reduce certain risks, it’s not a comprehensive solution. A strong system prompt serves as the first line of defense, but addressing AI vulnerabilities requires a broader approach, including fine-tuning the models for safety, ensuring data compliance, and implementing real-time monitoring, code sanitizers, and guardrails.

In summary, securing a RAG system involves more than just selecting the right models and storage solutions. It requires robust encryption, data governance policies, and continuous oversight to protect against data poisoning, information leakage, and other evolving security threats.

How to protect RAG systems

Protecting AI systems, including RAG systems, requires a multi-layered approach that combines proactive testing, security mechanisms, and safeguards to prevent vulnerabilities from being exploited. Securing a single RAG workflow is only part of the challenge. Organizations also need continuous visibility and control across their entire AI estate—something AI security posture management is designed to provide.

One effective strategy is to red-team your AI models. Red-teaming RAG systems involves simulated attacks to identify weaknesses in your AI system, such as prompt injection or data poisoning, before they can be exploited in real-world scenarios.

To protect RAG systems, there are several key approaches to consider according to OWASP top 10 for LLM applications:

1. Firewalls

In AI, firewalls act as monitoring layers that evaluate both input and output. They can use heuristic techniques to detect suspicious activity, such as attempts to inject harmful prompts or commands. For example, if a user tries to manipulate the AI to ignore its initial instructions (via prompt injection) and generate unintended or harmful output, the firewall can flag this as a potential attack. While firewalls provide an extra layer of security, they aren’t foolproof and may miss more sophisticated attacks that don’t match known patterns.

2. Guardrails

Guardrails are predefined rules or constraints that limit the behavior and output of AI systems. These can be customized based on the use case, ensuring the AI follows certain safety and ethical standards.

NVIDIA NeMo Guardrails offers several types of guardrails:

Input rails filter and control what kinds of inputs are acceptable, ensuring sensitive data (like names or email addresses) is not processed.
Dialog rails shape conversational flows to ensure AI responds appropriately, based on predefined conversation structures.
Retrieval rails ensure the AI retrieves only trusted and relevant documents, minimizing the risk of poisoned data entering the system.
Execution rails limit the types of code or commands the AI can execute, preventing improper actions.
Output rails restrict the types of outputs the model can produce, protecting against hallucinations or inappropriate content.

Garak, another tool from NVIDIA, is an open-source AI red-teaming tool for testing vulnerabilities in large language models (LLMs). Garak helps identify common vulnerabilities, such as prompt injection or toxic content generation. It learns and adapts over time, improving its detection abilities with each use. Promptfoo is another tool that might be used.

3. Fact-checking and hallucination prevention

RAG systems can also incorporate self-checking mechanisms to verify the accuracy of generated content and prevent hallucinations—instances where the AI produces false information. Integrating fact-checking features can reduce the risk of presenting incorrect or harmful responses to users.

4. Shift-left security

A shift-left approach focuses on integrating security practices early in the development process. For RAG systems, this means ensuring that the data used for training and fine-tuning is free of bias, sensitive information, or inaccuracies from the start. Additionally, many RAG vulnerabilities may be in the code itself, so it’s worth scanning the code and organizing for fixes to take place before the production stage. By addressing these issues early, you minimize the risk of the system inadvertently sharing PII or being manipulated by malicious input.

Conclusion

As AI systems like RAG become more advanced, it’s critical to implement these protective measures to guard against an increasing array of security threats. Combining firewalls, guardrails, fact-checking, early security practices, and robust monitoring tools creates a comprehensive defense against potential vulnerabilities. More broadly, it’s important to distinguish between securing AI itself and using AI as a security tool—two concepts often confused. Our article on AI security solutions explains the difference and why it matters for long-term resilience.

Increase visibility and control over the AI components in your applications

Mend AI

About the author

Bar-El Tayouri

Bar-El has been programming since the age of 12 and began hacking transportation cards while still in high school. Today, he leads Mend AI, a suite of products designed to enable the GenAI revolution for security-conscious enterprises. He co-founded Atom-Security, an AppSec prioritization company now embedded within Mend Container, following its acquisition by Mend.io. His background includes serving as an architect and the first engineer at an augmented reality startup, along with extensive expertise in data science and cybersecurity.