Can LLMs be Good Graph Judge for Knowledge Graph Construction?

November 26, 2024 · Declared Dead · 🏛 Conference on Empirical Methods in Natural Language Processing

Authors: Haoyu Huang, Chong Chen, Zeang Sheng, Yang Li, Wentao Zhang
arXiv ID: 2411.17388
Category: cs.CL (Computation & Language)
Cross-listed: cs.AI
Citations: 10
Venue: Conference on Empirical Methods in Natural Language Processing
Repository: https://github.com/hhy-huang/GraphJudge
Last Checked: 1 month ago

Abstract

In real-world scenarios, most of the data obtained from information retrieval (IR) systems is unstructured. Converting natural language sentences into structured Knowledge Graphs (KGs) remains a critical challenge. We identify three limitations of existing KG construction methods: (1) Real-world documents can contain a large amount of noise, which can lead to messy extracted information. (2) Naive LLMs often extract inaccurate knowledge from domain-specific documents. (3) Hallucination cannot be overlooked when LLMs are used directly to construct KGs. In this paper, we propose GraphJudge, a KG construction framework that addresses these challenges. Within this framework, we design an entity-centric strategy to eliminate noisy information in the documents, and we fine-tune an LLM as a graph judge to enhance the quality of the generated KGs. Experiments on two general and one domain-specific text-graph pair datasets demonstrate state-of-the-art performance against various baseline methods, along with strong generalization abilities. Our code is available at https://github.com/hhy-huang/GraphJudge.
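
The "graph judge" idea in the abstract can be pictured as a filtering step over candidate triples. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Triple`, `filter_triples`, and `toy_judge` are hypothetical, and the judge here is a simple lookup standing in for a fine-tuned LLM that would score each triple.

```python
from typing import Callable, Iterable, List, Tuple

# A candidate fact: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def filter_triples(
    candidates: Iterable[Triple],
    judge: Callable[[Triple], bool],
) -> List[Triple]:
    """Keep only the candidate triples that the judge accepts."""
    return [t for t in candidates if judge(t)]


def toy_judge(triple: Triple) -> bool:
    """Hypothetical stand-in for a fine-tuned LLM judge (assumption):
    accepts a triple only if it appears in a small set of known facts."""
    known_true = {("Paris", "capital of", "France")}
    return triple in known_true


candidates = [
    ("Paris", "capital of", "France"),
    ("Paris", "capital of", "Germany"),  # plays the role of a hallucinated triple
]
kept = filter_triples(candidates, toy_judge)
print(kept)
```

Passing the judge as a callable keeps the filtering logic independent of how the verdict is produced, so the toy lookup could be swapped for a model call without changing `filter_triples`.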