Why Knowledge Graphs Are On The Rise

Enterprise data is rarely ready for an LLM.


Brian Platz is co-CEO at Fluree, the scalable semantic graph database backed by blockchain technology.

Ever since large language models (LLMs) exploded onto the scene, executives have felt the urgency to apply them enterprise-wide. Successful use cases such as expedited insurance claims, enhanced software-developer productivity and automated customer relationships lead many to assume it's possible to quickly implement a useful, real-time LLM on top of enterprise data.

Enterprise data, however, is rarely ready for an LLM. Most organizations use relational databases, which can have fatal flaws when used directly with an LLM. Knowledge graphs—machine-readable data representations that mimic human knowledge—are bridging the gap between proprietary enterprise data and safe, reliable, helpful LLMs.



A Research and Markets report projected the compound annual growth rate (CAGR) of the knowledge graph market to reach 21.8% between 2023 and 2028.

Where Relational Databases Shine

Relational databases, the most popular type of database in use today, were designed for the computing hardware of 40 years ago.

They counteract problems caused by hierarchical and network-based databases—which could not be standardized, were difficult to use and were often proprietary. Relational databases broke the mold by introducing standardization. For the first time, users could query flexibly via the common language of SQL, change data without affecting database-driven applications and reduce data redundancy, among many other benefits.

Unfortunately, those benefits don't reach far enough to make LLMs work well. When LLMs pull from relational databases, a host of new problems emerge.

What LLMs Need

LLMs are trained to understand knowledge in the form of context, semantics and patterns.

They use probabilities to predict the most likely word or sequence of words based on the input they receive. A SQL query, on the other hand, is designed to produce the same output for the same input every time. SQL queries operate on well-defined tables and columns, while LLMs are designed to vary their answers and remain flexible.
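To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module; the claims table and its rows are invented for illustration. The same query over the same data returns identical rows every time, whereas an LLM samples from a probability distribution.

```python
# Illustrative only: table name and rows are invented. Python's built-in
# sqlite3 module stands in for any relational database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO claims (status, amount) VALUES (?, ?)",
    [("open", 1200.0), ("closed", 450.0), ("open", 80.0)],
)

query = "SELECT status, COUNT(*), SUM(amount) FROM claims GROUP BY status ORDER BY status"

# The same query over the same data returns byte-for-byte identical rows.
first = conn.execute(query).fetchall()
second = conn.execute(query).fetchall()
assert first == second
print(first)  # [('closed', 1, 450.0), ('open', 2, 1280.0)]
```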

SQL queries work well with the business logic, data relationships, hierarchical views, metrics and calculations that a relational database prioritizes. However, if you throw SQL data at an LLM, it struggles to give you good answers because the model's design is incompatible with the relational, SQL-driven architecture. Worse, because LLMs are trained to respond with apparent expertise, they are likely to hallucinate: invent nonexistent tables or columns, generate incorrect data values and produce SQL queries with elements that don't exist in the database.

Results are also likely to be stale because LLMs pull static knowledge from their training data and won't automatically pull from updated data. Moreover, if nobody puts controls in place, LLMs will pull from sensitive data and put it at risk.

Bridging The Gap With Knowledge Graphs

We've established that LLMs struggle with relational databases, especially when users ask for complex information across a diverse range of data sources.

Integrating a knowledge graph gives LLMs a format they understand while broadening the depth and variety of data to pull from. Moreover, enterprise teams can control what the LLM is trained on so it interprets data correctly—by using updated data, for example, or being locked out of sensitive data. Data architects have the power to direct answers along predictable routes while giving the LLM room to strategically diversify.
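As a rough sketch of what this looks like in practice, the snippet below assumes the open-source rdflib library; the entities, relationships and the ask_llm() function are invented for illustration. It builds a small knowledge graph, retrieves only the facts relevant to a question and grounds the prompt in them rather than letting the model guess.

```python
# Minimal sketch, assuming the open-source rdflib library; the entities,
# relationships and the ask_llm() function are invented for illustration.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.Acme, RDF.type, EX.Customer))
g.add((EX.Acme, EX.hasPolicy, EX.Policy42))
g.add((EX.Policy42, EX.coverageLimit, Literal(500000)))
g.add((EX.Policy42, EX.renewalDate, Literal("2025-01-15")))

# Retrieve only the facts relevant to the user's question...
rows = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?policy ?limit ?renewal WHERE {
        ex:Acme ex:hasPolicy ?policy .
        ?policy ex:coverageLimit ?limit ;
                ex:renewalDate ?renewal .
    }
""")

# ...and ground the prompt in those facts instead of letting the model guess.
facts = "\n".join(f"{p} has coverage limit {l} and renews on {r}" for p, l, r in rows)
prompt = f"Answer using only these facts:\n{facts}\n\nWhen does Acme's policy renew?"
# answer = ask_llm(prompt)  # hypothetical call to whichever LLM you deploy
```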

Knowledge graphs:

• Ensure LLMs pull from factual data, helping to improve accuracy and reduce hallucinations.

• Structure information as machine-readable, interconnected entities and relationships. LLMs, which use statistical understanding to reason and predict, can traverse the graph structure to connect disparate pieces of information.

• Combine structured and unstructured data for the LLM to integrate while generating responses, increasing the accuracy and depth of answers.

• Provide traceable paths of reasoning so users can understand how an LLM derived its answer.

• Encode domain-specific information and ontologies, allowing LLMs to be tailored for particular industries, domain jargon or use cases.

• Enable data teams to use automated systems to update data regularly, track provenance and validate data at scale.

Knowledge graphs can also federate multiple databases together, natively bringing disparate data into one queryable whole. As a result, users can ask LLMs broader questions and trust the insights they receive in return.
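A simple sketch of the federation and redaction ideas above, again assuming rdflib; the departmental graphs and the "salary" predicate are invented for illustration. Two graphs are merged into one queryable whole, and a filtered copy strips the sensitive predicate before anything reaches the LLM.

```python
# Sketch of federation and redaction, again assuming rdflib; the departmental
# graphs and the "salary" predicate are invented for illustration.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

hr = Graph()
hr.add((EX.Alice, EX.role, Literal("Claims Adjuster")))
hr.add((EX.Alice, EX.salary, Literal(95000)))  # sensitive

sales = Graph()
sales.add((EX.Acme, EX.accountOwner, EX.Alice))

# Federate: the + operator yields a new graph holding the union of triples.
combined = hr + sales

# Redact: copy every triple except those using the sensitive predicate,
# so the LLM only ever sees the filtered view.
redacted = Graph()
for s, p, o in combined:
    if p != EX.salary:
        redacted.add((s, p, o))

print(redacted.serialize(format="turtle"))
```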

Combine that with the ability to control who sees what—limiting sensitive data to specific, qualified teams—and you have a trustworthy LLM that you feel safe deploying across the entire organization.

Challenges With Integrating Knowledge Graphs

While knowledge graphs are highly effective for structuring data in a way LLMs can leverage, integrating them into an organization's data ecosystem is not without obstacles. Introducing knowledge graphs often exposes unexpected gaps or inconsistencies in data.

Unlike relational databases, where data relationships are often limited by predefined schemas, knowledge graphs uncover more intricate interdependencies. As a result, organizations may need to adjust their data management policies or institute stricter validation practices to address these insights. Organizations may encounter scenarios where knowledge graphs initially reveal data silos, conflicting data standards or access control issues.
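One possible shape of "stricter validation" is sketched below, assuming the open-source pyshacl library; the policy records and the SHACL shape are invented for illustration. A constraint flags entities that the graph reveals to be incomplete.

```python
# Sketch assuming the open-source pyshacl library, with invented records.
from pyshacl import validate
from rdflib import Graph

data = Graph().parse(format="turtle", data="""
    @prefix ex: <http://example.org/> .
    ex:Policy42 a ex:Policy ; ex:coverageLimit 500000 .
    ex:Policy43 a ex:Policy .   # incomplete record surfaced by the graph
""")

shapes = Graph().parse(format="turtle", data="""
    @prefix ex: <http://example.org/> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    ex:PolicyShape a sh:NodeShape ;
        sh:targetClass ex:Policy ;
        sh:property [ sh:path ex:coverageLimit ; sh:minCount 1 ] .
""")

conforms, _, report = validate(data, shacl_graph=shapes)
print(conforms)  # False: ex:Policy43 has no coverage limit
print(report)
```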

To mitigate these risks, organizations must prioritize cross-functional collaboration, establish clear data ownership and invest in tools that offer visibility into the graph's evolving structure and usage.

A Graph For The Ages

While it may seem straightforward to go from ChatGPT to an enterprise-grade, real-time LLM, the reality is choppier. Integrating an LLM directly with SQL queries and a relational database is likely to give you the textual equivalent of six-fingered hands and offensive hallucinations.

Integrating a knowledge graph takes time, but it optimizes enterprise data for LLMs, resulting in answers that are more likely to be applicable, deep and useful. Gartner Inc.'s 2023 AI Hype Cycle noted that knowledge graphs are moving through the hype cycle exceptionally quickly and that they complement many other AI technologies, such as robotics and computer vision.

They promise to be the glue between enterprise data and AI implementations for the foreseeable future. As AI continues to become smarter and more broadly applicable, it's time to explore how knowledge graphs can benefit your organization.