AI

To overcome AI grounding challenges, we must dig deeper

  • Grounding techniques like retrieval-augmented generation (RAG) have become popular ways to curb AI hallucinations
  • But naive RAG models still have their limitations
  • AI developers like Google and Microsoft are working on new grounding techniques to keep AI models more accurate and up to date

Grounding AI models is like giving them a map to navigate the world, but what happens when that map isn’t up to date or misses crucial details? That's the problem hyperscalers and AI leaders are now trying to solve. 

As AI continues to weave itself into the fabric of our daily lives, ensuring that models have the most accurate and relevant information is key. And while many companies have touted retrieval-augmented generation (RAG) as a solution, the future of grounding AI might require a more sophisticated approach.

What is grounding, you ask?

In the realm of AI, grounding refers to the process of connecting a model's outputs to real-world information, ensuring that its responses are accurate and contextually relevant. Without grounding, AI models risk generating responses that are disconnected from reality, leading to potential misinformation or errors in tasks.

The most popular grounding techniques are RAG and fine-tuning the models themselves.

"Since fine tuning models can be a complex and expensive undertaking, we see RAG quickly gaining popularity," said GlobalData Chief Analyst Rena Bhattacharyya.

RAG works by retrieving relevant data from a database or corpus in response to a prompt, and then using that information to generate more accurate responses. Basically, it's like having a search engine built into the AI model, allowing it to pull in external information as needed.
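To make that concrete, here is a minimal sketch of a naive RAG loop in Python. Everything in it is a hypothetical stand-in: the toy corpus for a real document store, the bag-of-words scoring for an embedding model, and the stubbed llm_generate for an actual LLM API.

```python
from collections import Counter

# Toy corpus standing in for a real document store (invented data).
CORPUS = [
    "The 2024 outage was caused by a misconfigured load balancer.",
    "Our refund policy allows returns within 30 days of purchase.",
    "Vector databases store embeddings for similarity search.",
]

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())

def similarity(a: Counter, b: Counter) -> float:
    """Crude word-overlap score standing in for cosine similarity."""
    return float(sum((a & b).values()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Naive retrieval: score every document against the query, keep the top k."""
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda doc: similarity(q, embed(doc)), reverse=True)
    return ranked[:k]

def llm_generate(prompt: str) -> str:
    """Stub for a real LLM call; here it just echoes the prompt."""
    return f"[model response to]\n{prompt}"

def naive_rag(query: str) -> str:
    # Single shot: retrieve once, stuff the context into the prompt, generate once.
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm_generate(prompt)

print(naive_rag("What caused the 2024 outage?"))
```

The defining property is the single pass: one retrieval, one prompt, one generation, with no step that checks or refines the result.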

Without RAG or other grounding techniques, large language models (LLMs) are limited to their initial training data and that data’s timeframe, noted IDC Research Manager Hayley Sutherland. Un-grounded LLMs can lack important context or domain-specific understanding, she added, and have also been shown to frequently produce hallucinations — "responses that may sound good, but that are ultimately wrong, which can create enterprise risk and/or damage."

According to Sutherland, RAG is becoming a "battleground feature" among different vendors offering generative AI capabilities, from those offering build-it-yourself components of the RAG pipeline (embeddings models, vector storage and data management, etc.), to newer RAG-as-a-Service offerings (usually delivered via API), to built-in RAG capabilities within AI features.

Still naive

As effective as RAG might seem, it has its limitations. Naive RAG, which relies on simple retrieval mechanisms, can struggle with complex queries that require a deeper understanding of the dataset as a whole. It often falls short when the context isn’t straightforward, or when the information required for an accurate response spans multiple sources or involves nuanced reasoning.

For example, naive RAG is limited to single-shot generation, noted AI consultant Norah Sakal: it retrieves once and generates once, with no opportunity to refine either the query or the answer along the way.

Another limitation of naive RAG models is their inability to effectively handle queries that involve nuance. When a user submits a request with specific preferences, these models often struggle to interpret the full context of the query. Instead of refining the search to match the user's detailed request, they may retrieve a broader set of results that partially meet the criteria but also include less relevant options.

This can lead to a mix of recommendations, some of which might technically fit the query but miss the mark in terms of the user's precise needs.
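That failure mode is easy to reproduce with the same similarity-only retrieval sketched above. In this invented example, the retriever matches the words of the query but is blind to the "under $100" constraint:

```python
# Hypothetical catalog; the price lives in each record but is invisible to
# a retriever that ranks on text similarity alone.
PRODUCTS = [
    {"name": "Trail waterproof hiking boots", "price": 180},
    {"name": "Budget hiking boots", "price": 60},
    {"name": "Lightweight waterproof hiking boots", "price": 140},
]

def score(query: str, name: str) -> int:
    # Word-overlap stand-in for embedding similarity.
    return len(set(query.lower().split()) & set(name.lower().split()))

query = "waterproof hiking boots under $100"
for product in sorted(PRODUCTS, key=lambda p: score(query, p["name"]), reverse=True):
    print(f"{product['name']}: ${product['price']}")

# The $180 and $140 boots rank first: they match the words "waterproof
# hiking boots" but ignore "under $100". The $60 pair the user actually
# wants comes last.
```

A more capable pipeline would parse the constraint and filter or re-rank on price, which is exactly the kind of optimization a naive pipeline skips.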

“A naive RAG pipeline combines the generated response with retrieved data without any advanced optimization,” Sakal said in a blog about the limitations of RAG.

This is where more advanced techniques will need to come into play.

New RAGs in town

The opposite of naive RAG is an advanced or sophisticated RAG system that applies more complex techniques to understand and process queries. These advanced systems might include context-aware RAG, GraphRAG or self-grounding RAG, which processes and organizes data into a structured representation before any queries are made.
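One common ingredient across these more advanced pipelines is breaking out of single-shot generation: retrieve, refine the query with what came back, then retrieve again before answering. The sketch below is a rough illustration of that loop; retrieve, rewrite_query and llm_generate are invented stubs, not any vendor's API:

```python
def retrieve(query: str) -> list[str]:
    """Stub retriever; a real system would query a vector store."""
    return [f"<documents matching '{query}'>"]

def llm_generate(prompt: str) -> str:
    """Stub model call."""
    return f"[response to] {prompt}"

def rewrite_query(query: str, context: list[str]) -> str:
    # In a real pipeline an LLM would reformulate the query using what the
    # first retrieval turned up; this stub just marks the step.
    return f"{query} (refined using: {'; '.join(context)})"

def advanced_rag(query: str) -> str:
    # Step 1: initial retrieval, same as naive RAG.
    first_pass = retrieve(query)
    # Step 2: rewrite the query with that context (the step naive RAG skips).
    refined = rewrite_query(query, first_pass)
    # Step 3: retrieve again with the sharper query, then generate.
    context = "\n".join(retrieve(refined))
    return llm_generate(f"Context:\n{context}\n\nQuestion: {query}")

print(advanced_rag("waterproof hiking boots under $100"))
```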

Microsoft is working on one such advancement with a GraphRAG model, which improves on naive RAG by building a knowledge graph from a dataset, explained Jonathan Larson, senior principal data architect for special projects at Microsoft Research.

The graph can be considered “self-grounding,” he said, because it creates a snapshot summary of the dataset before any queries are run, making the system more adept at handling complex, holistic queries.

"GraphRAG builds a memory representation of the dataset as a whole, which allows it to clearly see and reason over the contents of the dataset and their relationships," Larson told Fierce Network. This approach not only keeps the data accurate and current but also allows for a deeper understanding of the dataset, enabling the model to answer questions that naive RAG might miss.

Google, on the other hand, is focusing on offering its customers a range of grounding options.

The company’s approach involves three types of grounding, explained Jason Gelman, director of product management for Vertex AI at Google Cloud. The first two are grounding with Google Search, which allows models to tap into real-time information from the web, and grounding with Vertex AI Search, which enables enterprises to use their own data to inform AI responses.

Google is also working on grounding with third-party datasets, allowing customers to connect real-time data from providers like Moody's and Thomson Reuters to their models. 
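Conceptually, that menu amounts to a pluggable retrieval layer chosen per request. The sketch below is not the Vertex AI API; the function names and sources are hypothetical, meant only to show the shape of such a grounding layer:

```python
from typing import Callable

# Hypothetical retrievers standing in for the three grounding sources.
def search_web(query: str) -> str:
    return f"<fresh public web results for '{query}'>"

def search_enterprise_data(query: str) -> str:
    return f"<matches from the company's own corpus for '{query}'>"

def search_third_party(query: str) -> str:
    return f"<licensed provider data for '{query}'>"

GROUNDING_SOURCES: dict[str, Callable[[str], str]] = {
    "web": search_web,                     # real-time public information
    "enterprise": search_enterprise_data,  # the customer's own data
    "third_party": search_third_party,     # e.g., financial or legal datasets
}

def grounded_prompt(query: str, source: str) -> str:
    # Each enterprise picks the source that matches its own definition
    # of "accurate"; the retrieved context then anchors the model's answer.
    context = GROUNDING_SOURCES[source](query)
    return f"Ground your answer in this context:\n{context}\n\nQuestion: {query}"

print(grounded_prompt("latest credit rating for Acme", "third_party"))
```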

“Each enterprise defines what they consider to be ‘accurate’ a little differently,” Gelman concluded. “We offer customers choice when it comes to how they choose to ground their models.”