Lasse Andresen guest post for Forbes Technology Council
Artificial intelligence (AI) holds great promise for enterprises in all industries, with industry-specific AI models arriving as the next wave of innovation. However, because generative AI products are trained on large quantities of data to identify patterns, a model's success depends entirely on the quality, availability and validity of that input data.
Data is everywhere, but it is not always easy to interrogate that data to understand its origin, age, sensitivity and reliability. The AI surge is forcing a conversation about making this kind of metadata accessible, especially when the data is to be leveraged for AI.
Recently, this was raised by the Data & Trust Alliance as it introduced data provenance standards to enhance AI trustworthiness. The standards are designed to help companies understand where, when and how data was collected or generated to provide transparency into the rapidly growing number of AI applications.
Data provenance is a critical piece of this transparency, capturing how data flows through the organization and offering insights into data quality, security and validity. By leveraging this information about the data in use, enterprises can identify and rectify potential bottlenecks, inconsistencies or inaccuracies in the data pipeline, and thereby enhance the accuracy and effectiveness of the AI product. On the flip side, without data transparency, clear data provenance and indicators of data veracity, it is difficult to trust an AI model, and its success will be hampered.
Although there are likely many ways to capture such critical information about data, I propose a novel idea. I have been working in the identity and access management (IAM) industry for a long time and founded one of the market leaders before launching my current venture, IndyKite. It is from this perspective that I believe modern IAM is considerably underutilized in today's tech stack: it can offer significant benefits for AI and other modern applications, especially in terms of data veracity. Traditional IAM has been kept separate from the rest of the stack due to security concerns. However, modern IAM aims to connect disparate data silos and treat identity data as a growth enabler.
By adopting a graph-driven, modern identity approach, you can leverage a unified identity fabric, complete with context, data provenance and risk attributes for each entity (e.g., a human, a system, a digital product or an individual data point), all without exposing sensitive data. This approach treats each application, digital entity and data point as an "identity" and captures information to support the digital trust standards described by the Data & Trust Alliance. The data can then be used in tailored AI products and automated tasks with trust while providing the necessary transparency and evidence as to its veracity.
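To make the idea concrete, here is a minimal sketch in Python of what a graph of identities carrying provenance and risk attributes might look like. All class names, attribute names and values are hypothetical illustrations of the pattern, not IndyKite's product or the Data & Trust Alliance standards:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class IdentityNode:
    """Any entity in the identity fabric: a person, system, app or data point."""
    node_id: str
    node_type: str                                 # e.g. "human", "system", "data_point"
    provenance: dict = field(default_factory=dict) # where, when and how it originated
    risk_score: float = 0.0                        # 0.0 (trusted) .. 1.0 (untrusted)

@dataclass
class Relationship:
    """A typed edge between two identities, e.g. 'produced' or 'owns'."""
    source: str
    target: str
    rel_type: str

# A tiny identity graph: an internal CRM system produced a customer record.
nodes = {
    "sys-crm": IdentityNode("sys-crm", "system", {"vendor": "internal"}, 0.1),
    "dp-cust-42": IdentityNode(
        "dp-cust-42",
        "data_point",
        provenance={
            "source": "sys-crm",
            "collected_at": datetime(2024, 1, 15, tzinfo=timezone.utc).isoformat(),
            "method": "user_submitted",
            "consent": "marketing_and_analytics",
        },
        risk_score=0.2,
    ),
}
edges = [Relationship("sys-crm", "dp-cust-42", "produced")]
```

The point is that provenance and risk travel with each identity as attributes on the graph, so downstream consumers can inspect them without ever touching the underlying sensitive payload.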
Getting started toward greater trust in your AI-enhanced applications does not need to be an arduous process or require significant overhauling of the tech you already have in place. With the right tooling, you can leverage your data as it is, from where it is, without a huge investment. The magic is in how you approach the challenge and how you think about your data.
The best place to start is in understanding what you want your application to solve, the data you need to solve it and the veracity required of that data. A flexible data model is vital here, as it will allow you to start small, then grow and change as you scale (without breaking logic or creating overly complicated data architectures). The more data-mature your organization is, the faster this step will be.
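To make the "start small and grow" point concrete, here is a hedged sketch of what such flexibility buys, building on the hypothetical IdentityNode graph above. Because properties are an open bag rather than a fixed schema, new attributes and new entity types can be layered on without a migration or a change to existing logic:

```python
# New requirements can be attached to existing records without a schema migration.
nodes["dp-cust-42"].provenance["retention_policy"] = "delete_after_24_months"

# New entity types slot into the same graph as the application grows:
# here, an AI model becomes an identity linked to the data that feeds it.
nodes["model-churn-v1"] = IdentityNode(
    "model-churn-v1", "ai_model", {"trained_on": ["dp-cust-42"]}, 0.3
)
edges.append(Relationship("dp-cust-42", "model-churn-v1", "feeds"))
```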
From here, you can create your unified data layer, enriched with provenance metadata and relationships. You can leverage this layer to drive intelligent access decisions, enhance your application logic, gain deeper insights from your analytics or power your AI and AI-enhanced applications.
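As one hypothetical example of driving an access decision from such a layer, the sketch below gates whether a data point may feed an AI pipeline based on the provenance and risk attributes defined earlier. The threshold and consent values are illustrative assumptions, not a standard:

```python
def may_use_for_training(node: IdentityNode, *, max_risk: float = 0.5) -> bool:
    """Decide whether a data point can feed an AI pipeline, based on the
    provenance and risk attributes attached to its identity node."""
    prov = node.provenance
    has_known_origin = "source" in prov and "collected_at" in prov
    has_consent = prov.get("consent") in {"analytics", "marketing_and_analytics"}
    return has_known_origin and has_consent and node.risk_score <= max_risk

# The customer record above passes: known origin, consent given, low risk.
assert may_use_for_training(nodes["dp-cust-42"])
```

The same check doubles as the transparency evidence: when the model's output is questioned later, the provenance attributes that admitted each data point are already on record.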
We are right at the start of seeing all that AI can offer, and I believe we are also only scraping the surface of what can be achieved with identity-centric data veracity. This next wave of AI must not just look at what AI can enable but how we can better enable AI.