It's common knowledge that the best practice for data management is to break down data silos and create a single source of truth. We’ve seen companies like Databricks and Snowflake achieve multi-billion-dollar valuations on that promise. Then generative AI arrived, and in the excitement around its transformational capabilities we may have lost our way.
While it’s true that we need to react quickly to the rise of Large Language Models (LLMs), we should also make sure we are not creating walled-off data products that don’t scale and break easily in production.
With the rise of Retrieval Augmented Generation (RAG) and the specialized databases that support it, we are inadvertently creating new data silos: an AI agent, with its own data store, for every use case, across every team.

There are a few reasons we might want to take this approach, and accuracy and security are the top two that come to mind. Accuracy is obvious: a curated dataset that is fit for purpose can improve the accuracy of responses and reduce hallucination. Security is a trickier topic.
Disparate applications make sense from a security perspective given the current state of how we handle data access. An LLM will usually be provisioned with access to an entire vector or graph database of curated data serving a particular team. In that case, everyone who has access to the AI agent must also be authorized to see the data accessible by that model. Simple enough, right? But what are some of the adverse effects this might be having on our data landscape and our budgets?
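Before tallying those effects, here is a minimal sketch of the pattern just described, in plain Python. All the names here (TEAM_MEMBERS, ask_agent, search_whole_database) are hypothetical stand-ins, not any vendor’s API:

```python
# A minimal sketch of the coarse-grained "one agent, one database" pattern.
# All names below are hypothetical stand-ins, not any vendor's API.

TEAM_MEMBERS = {"finance-agent": {"alice", "bob"}}  # who may talk to each agent

def search_whole_database(question: str) -> list[str]:
    # Stand-in for the retrieval step: the agent's single service
    # credential can read *every* record in the team's database.
    return ["...retrieved context..."]

def ask_agent(agent: str, user_id: str, question: str) -> list[str]:
    # The only access check happens at the agent boundary; once a user
    # may talk to the agent, the agent can show them anything it can read.
    if user_id not in TEAM_MEMBERS.get(agent, set()):
        raise PermissionError(f"{user_id} is not authorized for {agent}")
    return search_whole_database(question)

print(ask_agent("finance-agent", "alice", "What was Q3 revenue?"))
```

Notice that the only check is “may this user talk to this agent?” - the database itself is all-or-nothing, which is exactly why each team ends up needing its own store.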
First and foremost, multiple data stores and multiple LLMs, many of them self-hosted, can create an explosion in cost, particularly if you have a dozen or more of these “RAG agents” up and running at the same time. Cost and complexity balloon quickly, not only from a licensing perspective but also from a maintenance one.
Preparing and modeling the data for these various siloed databases is itself a lengthy and costly project, and that effort will not be reusable, particularly if the data must be segregated by team and use case. That doesn’t account for the cost of constantly updating the data and making certain it is current and accurate - for graph and vector databases this is not as straightforward as using Fivetran or Airflow to set up a data pipeline. If security is the main reason you’re taking this approach, you just might be creating the problem that someone five or ten years from now will have to untangle. Is it justifiable to create technical debt just because we couldn’t figure out how to centralize and secure our retrieval datasets? And what risks are we opening ourselves up to by running decentralized AI agents with no single standard?
Another big concern is granularity. How granular can your access control be when you are assigning what is essentially a user or admin role to an LLM? As infosec teams have seen with role-based access control (RBAC), over time you are guaranteeing that you will need to build more and more data silos. Most companies are only tackling a few use cases today; think of what your systems might look like as AI proliferates into every part of the business. What is currently hard and costly will very soon become an unsolvable nightmare.
You also run the risk of the juice not being worth the squeeze. If you are spending hundreds of thousands, if not millions, of dollars on a RAG application, whatever incremental productivity gains you make may not be justifiable when weighed against the cost of running and maintaining it. One step forward, two steps back.
So is there a solution that avoids tech debt, data silos, and ballooning maintenance costs? At IndyKite, we think there is. We’ve developed what we call Privileged & Authorized Similarity Search (PASS) to help companies centralize their RAG data stores. It works by restricting RAG search to only the database nodes that the prompter (the user) is permitted to access.
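To make that concrete, here is a minimal sketch of node-level filtering at retrieval time, in plain Python. To be clear, this is an illustration under simplified assumptions - a per-node user set stands in for a real access-control model, and Node, allowed_users, and authorized_search are hypothetical names, not our PASS API:

```python
# A minimal sketch of permission-filtered similarity search.
# Illustrative only: a per-node user set stands in for a real
# access-control model; these are hypothetical names, not the PASS API.
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    text: str
    embedding: list[float]
    allowed_users: set[str] = field(default_factory=set)  # simplified ACL

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def authorized_search(query_emb: list[float], nodes: list[Node],
                      user_id: str, k: int = 5) -> list[Node]:
    # Authorization is applied *before* ranking, so the LLM never
    # receives context the prompting user is not entitled to read.
    visible = [n for n in nodes if user_id in n.allowed_users]
    visible.sort(key=lambda n: cosine(query_emb, n.embedding), reverse=True)
    return visible[:k]
```

Because authorization runs at query time against each individual user’s permissions, a single shared store can serve every team and use case, rather than one database per agent.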

IndyKite’s approach allows us to break down data silos, creating a single source of truth for retrieval augmented generation without having to worry about hallucination, data security, or compartmentalization. We can quickly and easily spin up new AI applications and speed up innovation - leading to more successful pilots in less time.

Organizations have spent considerable time and resources pursuing unified data management, but the current rush to implement generative AI risks undermining those efforts. The focus on security and accuracy for Retrieval Augmented Generation (RAG) is understandable, but it's not a reason to create new data silos in an effort to implement more granular access control over your data - that would be like spinning up a new Snowflake instance for each team in your organization. This approach, though seemingly practical in the short term, leads to increased costs, complex maintenance, and long-term technical debt.
The fragmentation of data stores and LLMs, particularly self-hosted ones, quickly escalates expenses and complicates data governance, and the incremental productivity gains may not justify the significant investment required. We should consider the long-term implications of these choices and seek solutions that avoid further data fragmentation. At IndyKite, we're addressing this challenge with a centralized, secure, and efficient approach to RAG - breaking down silos and enabling responsible AI innovation, so companies can deploy AI across the whole organization while mitigating the risks that usually accompany an enterprise-wide AI system.
Want to learn more?
Download our RAG Security E-Guide to get started.