Knowledge graph
{{short description|Type of knowledge base}}
{{other uses}}
File:Conceptual Diagram - Example.svg
In knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities{{snd}} objects, events, situations or abstract concepts{{snd}} while also encoding the free-form semantics or relationships underlying these entities.{{Cite web|date=2018|title=What is a Knowledge Graph?|url=https://ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph}}{{Cite web|date=2020|title=What defines a knowledge graph?|url=https://www.atulhost.com/what-is-knowledge-graph}}
Since the development of the Semantic Web, knowledge graphs have often been associated with linked open data projects, focusing on the connections between concepts and entities.{{cite conference|last1=Ehrlinger|first1=Lisa|last2=Wöß|first2=Wolfram|year=2016|title=Towards a Definition of Knowledge Graphs|url=http://ceur-ws.org/Vol-1695/paper4.pdf|conference=SEMANTiCS2016|location=Leipzig|publisher=Joint Proceedings of the Posters and Demos Track of 12th International Conference on Semantic Systems – SEMANTiCS2016 and 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS16)|pages=13–16}}{{Cite book|last=Soylu|first=Ahmet|title=The Semantic Web – ISWC 2020 |chapter=Enhancing Public Procurement in the European Union Through Constructing and Exploiting an Integrated Knowledge Graph |date=2020|chapter-url=https://doi.org/10.1007/978-3-030-62466-8_27|series=Lecture Notes in Computer Science|volume=12507|language=en|pages=430–446|doi=10.1007/978-3-030-62466-8_27|isbn=978-3-030-62465-1|s2cid=226229398}} They are also historically associated with and used by search engines such as Google, Bing, Yext and Yahoo; knowledge-engines and question-answering services such as WolframAlpha, Apple's Siri, and Amazon Alexa; and social networks such as LinkedIn and Facebook.
Recent developments in data science and machine learning, particularly in graph neural networks and representation learning, have broadened the scope of knowledge graphs beyond their traditional use in search engines and recommender systems. They are increasingly used in scientific research, with notable applications in fields such as genomics, proteomics, and systems biology.{{Cite journal |last1=Mohamed |first1=Sameh K. |last2=Nounu |first2=Aayah |last3=Nováček |first3=Vít |date=2021 |title=Biological applications of knowledge graph embedding models |journal=Briefings in Bioinformatics |volume=22 |issue=2 |pages=1679–1693 |doi=10.1093/bib/bbaa012 |pmid=32065227 |via=Oxford Academic|doi-access=free |hdl=1983/919db5c6-6e10-4277-9ff9-f86bbcedcee8 |hdl-access=free }}
History
The term was coined as early as 1972 by the Austrian linguist Edgar W. Schneider, in a discussion of how to build modular instructional systems for courses.{{cite conference|last1=Schneider|first1=Edward W.|year=1973|title=Course Modularization Applied: The Interface System and Its Implications For Sequence Control and Data Analysis|conference=Association for the Development of Instructional Systems (ADIS), Chicago, Illinois, April 1972}} In the late 1980s, the University of Groningen and University of Twente jointly began a project called Knowledge Graphs, focusing on the design of semantic networks with edges restricted to a limited set of relations, to facilitate algebras on the graph. In subsequent decades, the distinction between semantic networks and knowledge graphs was blurred.
Some early knowledge graphs were topic-specific. In 1985, WordNet was founded, capturing semantic relationships between words and meanings{{snd}} an application of this idea to language itself. In 2005, Marc Wick founded GeoNames to capture relationships between different geographic names and locales and associated entities. In 1998, Andrew Edmonds of Science in Finance Ltd in the UK created a system called ThinkBase that offered fuzzy-logic based reasoning in a graphical context.{{cite web| title=US Trademark no 75589756 | url= http://tmsearch.uspto.gov/bin/showfield?f=doc&state=4809:rjqm9h.2.1}}{{cite web|title=ThinkBase|url=https://thinkbase.ai/kgraphs/ |publisher=ThinkBase LLC |access-date=25 December 2024}}
In 2007, both DBpedia and Freebase were founded as graph-based knowledge repositories for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described themselves as a 'knowledge graph' but developed and described related concepts.
In 2012, Google introduced its Knowledge Graph,{{Cite web|last=Singhal|first=Amit|date=May 16, 2012|title=Introducing the Knowledge Graph: things, not strings|url=https://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html|access-date=21 March 2017|website=Official Google Blog}} building on DBpedia and Freebase among other sources. It later incorporated RDFa, Microdata, and JSON-LD content extracted from indexed web pages, including the CIA World Factbook, Wikidata, and Wikipedia.{{cite web|last=Schwartz|first=Barry|date=December 17, 2014|title=Google's Freebase To Close After Migrating To Wikidata: Knowledge Graph Impact?|url=https://www.seroundtable.com/google-freebase-wikidata-knowledge-graph-19591.html|access-date=December 10, 2017|website=Search Engine Roundtable}} Entity and relationship types associated with this knowledge graph have been further organized using terms from the schema.org{{Cite web|last1=McCusker|first1=James P.|last2=McGuiness|first2=Deborah L.|title=What is a Knowledge Graph?|url=https://www.authorea.com/users/6341/articles/107281-what-is-a-knowledge-graph/_show_article|access-date=21 March 2017|website=www.authorea.com}} vocabulary. The Google Knowledge Graph became a successful complement to string-based search within Google, and its popularity online brought the term into more common use.
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, Airbnb, Microsoft, Amazon, Uber and eBay.{{Cite web|date=2020|title=Knowledge Graph Enterprises|url=https://kgkg.factnexus.com/@3782~167.html}}
In 2019, IEEE combined its annual international conferences on "Big Knowledge" and "Data Mining and Intelligent Computing" into the International Conference on Knowledge Graph.{{Cite web|date=2017-07-09|title=2021 IEEE International Conference on Knowledge Graph (ICKG)*|url=https://kmeducationhub.de/ieee-international-conference-big-knowledge-icbk/|access-date=2021-03-22|website=KMedu Hub|language=en-US}}
Definitions
There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:{{cite journal|last1=Hogan|first1=Aidan|last2=Blomqvist|first2=Eva|last3=Cochez|first3=Michael|last4=d'Amato|first4=Claudia|last5=de Melo|first5=Gerard|last6=Gutierrez|first6=Claudio|last7=Labra Gayo|first7=José Emilio|last8=Kirrane|first8=Sabrina|last9=Neumaier|first9=Sebastian|last10=Polleres|first10=Axel|last11=Navigli|first11=Roberto|last12=Ngonga Ngomo|first12=Axel-Cyrille|last13=Rashid|first13=Sabbir M.|last14=Rula|first14=Anisa|last15=Schmelzeisen|first15=Lukas|last16=Sequeda|first16=Juan|last17=Staab|first17=Steffen|last18=Zimmermann|first18=Antoine|date=2021-01-24|title=Knowledge Graphs|journal=ACM Computing Surveys|volume=54|issue=4|pages=1–37|doi=10.1145/3447772| issn=0360-0300|arxiv=2003.02320|s2cid=235716181}}
- Flexible relations among knowledge in topical domains: A knowledge graph (i) defines abstract classes and relations of entities in a schema, (ii) mainly describes real world entities and their interrelations, organized in a graph, (iii) allows for potentially interrelating arbitrary entities with each other, and (iv) covers various topical domains.{{cite journal|last1=Paulheim|first1=Heiko|date=2017|title=Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods|url=http://www.semantic-web-journal.net/system/files/swj1083.pdf|journal=Semantic Web|pages=489–508|access-date=21 March 2017}}
- General structure: A network of entities, their semantic types, properties, and relationships.{{cite journal|last1=Krötsch|first1=Markus|last2=Weikum|first2=Gerhard|title=Editorial of the Special Issue on Knowledge Graphs|journal=Journal of Web Semantics|date=March 2016|volume=37-38|pages=53–54|doi=10.1016/j.websem.2016.04.002|url=https://doi.org/10.1016/j.websem.2016.04.002|access-date=10 February 2021|url-access=subscription}}{{Cite web|title=What is a Knowledge Graph?{{!}}Ontotext|url=https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph|access-date=2020-07-01|website=Ontotext|language=en-US}} To represent properties, categorical or numerical values are often used.
- Supporting reasoning over inferred ontologies: A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs, this simpler definition may be more useful:
- A digital structure that represents knowledge as concepts and the relationships between them (facts). A knowledge graph can include an ontology that allows both humans and machines to understand and reason about its contents.{{cite journal|last1=Peng|first1=Ciyuan|last2=Feng|first2=Xia|last3=Naseriparsa|first3=Mehdi|last4=Osborne|first4=Francesco|date=2023|title=Knowledge Graphs: Opportunities and Challenges|url=https://doi.org/10.1007/s10462-023-10465-9| journal=Artificial Intelligence Review|volume=56|issue=11 |pages=13071–13102|doi=10.1007/s10462-023-10465-9|pmid=37362886 |pmc=10068207 | issn=1573-7462|arxiv=2303.13948}}{{Cite web|date=2020|title=The Knowledge Graph about Knowledge Graphs|url=https://kgkg.factnexus.com/@3782~6.html}}
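The simpler definition above can be illustrated with a minimal sketch in which a knowledge graph is held as a set of (subject, predicate, object) facts. The entity and relation names, and the helper functions `build_index` and `objects`, are assumptions made for illustration and are not drawn from any particular knowledge graph or library.

```python
from collections import defaultdict

# Hypothetical facts, each a (subject, predicate, object) triple.
TRIPLES = {
    ("Marie_Curie", "instance_of", "Person"),
    ("Marie_Curie", "born_in", "Warsaw"),
    ("Warsaw", "instance_of", "City"),
    ("Warsaw", "located_in", "Poland"),
}


def build_index(triples):
    """Index triples by subject so outgoing edges can be looked up directly."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def objects(index, subject, predicate):
    """Return all objects linked to `subject` by `predicate`, sorted for stable output."""
    return sorted(o for p, o in index.get(subject, []) if p == predicate)


INDEX = build_index(TRIPLES)
birthplace = objects(INDEX, "Marie_Curie", "born_in")
```

Following a chain of such lookups (for example from `Marie_Curie` to `Warsaw` to `Poland`) is the basic graph traversal that more elaborate query languages build on.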
= Implementations =
In addition to the above examples, the term has been used to describe open knowledge projects such as YAGO and Wikidata; federations like the Linked Open Data cloud;{{Cite web|title=The Linked Open Data Cloud|url=https://lod-cloud.net/|access-date=2020-06-30|website=lod-cloud.net}} a range of commercial search tools, including Yahoo's semantic search assistant Spark, Google's Knowledge Graph, and Microsoft's Satori; and the LinkedIn and Facebook entity graphs.
The term is also used in the context of note-taking software applications that allow a user to build a personal knowledge graph.{{cite journal |last1=Pyne |first1=Yvette |last2=Stewart |first2=Stuart |date=March 2022 |title=Meta-work: how we research is as important as what we research |journal=British Journal of General Practice |volume=72 |issue=716 |pages=130–131 |pmid=35210247 |pmc=8884432 |doi=10.3399/bjgp22X718757}}
The popularization of knowledge graphs and their accompanying methods has led to the development of graph databases such as Neo4j,{{Cite web |title=Neo4j Graph Database & Analytics {{!}} Graph Database Management System |url=https://neo4j.com/ |access-date=8 November 2023 |website=Neo4j}} GraphDB{{Cite web |title=Ontotext GraphDB |url=https://www.ontotext.com/products/graphdb/ |access-date=8 November 2023 |website=Ontotext}} and AgensGraph.{{Cite web |title=An Enterprise Graph Database Management System |url=https://bitnine.net/agensgraph/ |access-date=19 February 2025 |website=Bitnine.net}} These graph databases allow users to easily store data as entities and their interrelationships, and facilitate operations such as data reasoning, node embedding, and ontology development on knowledge bases.
In contrast, virtual knowledge graphs do not store information in specialized databases. They rely on an underlying relational database or data lake to answer queries on the graph. Such a virtual knowledge graph system must be properly configured in order to answer the queries correctly. This specific configuration is done through a set of mappings that define the relationship between the elements of the data source and the structure and ontology of the virtual knowledge graph.{{Cite web |title=Virtual Knowledge Graphs: An Overview of Systems and Use Cases |url=https://direct.mit.edu/dint/article/1/3/201/9978/Virtual-Knowledge-Graphs-An-Overview-of-Systems}}
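The role of mappings in a virtual knowledge graph can be sketched as follows, using Python's standard `sqlite3` module as a stand-in for the underlying relational database. The `employee` table, the `MAPPINGS` dictionary, and the `match` function are hypothetical; this is not the mapping language of any particular system, only an illustration of answering graph patterns without materializing triples.

```python
import sqlite3

# Hypothetical relational source: a single table of employees.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (id, name, dept) VALUES (?, ?, ?)",
    [(1, "Ada", "Research"), (2, "Grace", "Engineering")],
)

# Mappings: each graph predicate is defined by a SQL query that returns
# subject (s) and object (o) columns. No triples are stored anywhere.
MAPPINGS = {
    "has_name": "SELECT 'employee/' || id AS s, name AS o FROM employee",
    "works_in": "SELECT 'employee/' || id AS s, 'dept/' || dept AS o FROM employee",
}


def match(connection, predicate, subject=None):
    """Answer a (subject?, predicate, ?object) pattern by querying the source."""
    sql = f"SELECT s, o FROM ({MAPPINGS[predicate]})"
    params = ()
    if subject is not None:
        sql += " WHERE s = ?"
        params = (subject,)
    sql += " ORDER BY s"
    return [(s, predicate, o) for s, o in connection.execute(sql, params)]


ada_dept = match(conn, "works_in", "employee/1")
```

Because every query is rewritten into SQL at request time, the graph view always reflects the current state of the source tables, at the cost of depending on the mappings being correct.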
Using a knowledge graph for reasoning over data
{{main|Ontology (information science)}}
A knowledge graph formally represents semantics by describing entities and their relationships.{{Cite web|date=2022-04-05|title=How do knowledge graphs work?|url=https://www.stardog.com/knowledge-graph/|access-date=2022-04-05|website=Stardog|language=en-US}} Knowledge graphs may make use of ontologies as a schema layer. By doing this, they allow logical inference for retrieving implicit knowledge rather than only allowing queries requesting explicit knowledge.{{Cite web |date=2023-09-01 |title=Unlocking the Power of Google Knowledge Panel: How to Obtain and Claim Yours in 2023 – RH Razu |url=https://rhrazu.com/google-knowledge-panel-obtain-and-claim-yours-in-2023/ |access-date=2023-09-05 |website=rhrazu.com |language=en-US}}
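The difference between explicit and implicit knowledge can be shown with a small sketch. It assumes a toy ontology with a single inference rule (an instance of a class is also an instance of its superclasses); the class names, the entity `Tom`, and the function `infer_types` are illustrative assumptions, and production reasoners support far richer rule sets.

```python
# Hypothetical schema layer (ontology): subclass relationships.
SUBCLASS_OF = {("Cat", "Mammal"), ("Mammal", "Animal")}
# Hypothetical explicit facts: entity types asserted in the graph.
INSTANCE_OF = {("Tom", "Cat")}


def infer_types(instance_of, subclass_of):
    """Forward-chain one rule until no new facts appear:
    (x instance_of C) and (C subclass_of D) => (x instance_of D)."""
    inferred = set(instance_of)
    changed = True
    while changed:
        changed = False
        for entity, cls in list(inferred):
            for sub, sup in subclass_of:
                if sub == cls and (entity, sup) not in inferred:
                    inferred.add((entity, sup))
                    changed = True
    return inferred


ALL_TYPES = infer_types(INSTANCE_OF, SUBCLASS_OF)
```

A query for all animals over `INSTANCE_OF` alone finds nothing, while the same query over `ALL_TYPES` returns `Tom`, a fact that was never stated explicitly.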
In order to allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow knowledge graphs to be connected to machine learning methods that require feature vectors, in the same way that word embeddings do for text. This can complement other estimates of conceptual similarity.{{Cite book|author=Hongwei Wang|title=Proceedings of the 27th ACM International Conference on Information and Knowledge Management |chapter=RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems |date=October 2018|pages=417–426|doi=10.1145/3269206.3271739|arxiv=1803.03467|isbn=9781450360142 |s2cid=3766110}}{{Citation |last1=Ristoski |first1=Petar |pages=498–514 |year=2016 |last2=Paulheim |first2=Heiko |chapter=RDF2Vec: RDF Graph Embeddings for Data Mining |title=The Semantic Web – ISWC 2016 |series=Lecture Notes in Computer Science |volume=9981 |doi=10.1007/978-3-319-46523-4_30|isbn=978-3-319-46522-7 |chapter-url=https://madoc.bib.uni-mannheim.de/41307/1/Ristoski_RDF2Vec.pdf |doi-access=free }}
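One way to picture such embeddings is a translation-based scoring function in the style of TransE, chosen here purely for illustration; the text above does not single out any particular model. In this sketch a triple is considered plausible when the head vector plus the relation vector lands close to the tail vector. The two-dimensional vectors are hand-picked assumptions, whereas real embeddings are learned from the graph.

```python
import math

# Hypothetical 2-dimensional embeddings, hand-picked for illustration;
# in practice these vectors would be learned from the graph.
ENTITY = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
    "Germany": [3.0, 1.0],
}
RELATION = {"capital_of": [0.0, 1.0]}


def translation_distance(head, relation, tail):
    """Euclidean distance ||h + r - t||; smaller means more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every entity as a candidate tail, most plausible first."""
    return sorted(ENTITY, key=lambda tail: translation_distance(head, relation, tail))


best_for_berlin = rank_tails("Berlin", "capital_of")[0]
```

Once entities and relations are vectors, downstream models can consume them as ordinary numeric features, and missing links can be proposed by ranking candidates as `rank_tails` does.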
Models for generating useful knowledge graph embeddings are commonly the domain of graph neural networks (GNNs).{{Cite journal |last1=Zhou |first1=Jie |last2=Cui |first2=Ganqu |display-authors=1 |date=2020 |title=Graph neural networks: A review of methods and applications. |journal=AI Open |volume=1 |issue=1 |pages=57–81 |doi=10.1016/j.aiopen.2021.01.001 |s2cid=56517517 |via=Elsevier Science Direct|doi-access=free |arxiv=1812.08434 }} GNNs are deep learning architectures that operate on nodes and edges, which correspond well to the entities and relationships of knowledge graphs. The topology and data structures afforded by GNNs provide a convenient domain for semi-supervised learning, wherein the network is trained to predict the value of a node embedding (provided a group of adjacent nodes and their edges) or edge (provided a pair of nodes). These tasks serve as fundamental abstractions for more complex tasks such as knowledge graph reasoning and alignment.{{Cite journal |last1=Ye |first1=Zi |last2=Kumar |first2=Yogan Jaya |last3=Sing |first3=Goh Ong |last4=Song |first4=Fengyan |last5=Wang |first5=Junsong |date=2022 |title=A comprehensive survey of graph neural networks for knowledge graphs. |journal=IEEE Access |volume=10 |pages=75729–7574 |doi=10.1109/ACCESS.2022.3191784 |bibcode=2022IEEEA..1075729Y |s2cid=250654689 |via=IEEE Xplore|doi-access=free }}
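The neighbourhood-based computation that GNNs perform can be sketched with a single round of mean aggregation, followed by a dot-product score for a candidate edge. This is a deliberately simplified illustration: the graph, the initial features, and the function names are assumptions, and the learned weight matrices, non-linearities, and training loop of an actual GNN are omitted.

```python
# Hypothetical undirected graph and 2-dimensional initial node features.
EDGES = [("a", "b"), ("b", "c")]
FEATURES = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}


def neighbours(edges):
    """Build an adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def message_passing_step(features, edges):
    """One round of mean aggregation: each node's new vector is the average
    of its own vector and its neighbours' vectors."""
    adj = neighbours(edges)
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in sorted(adj.get(node, ()))]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


def edge_score(embeddings, u, v):
    """Dot product of two node embeddings, used as an untrained link score."""
    return sum(x * y for x, y in zip(embeddings[u], embeddings[v]))


EMBEDDINGS = message_passing_step(FEATURES, EDGES)
```

Stacking several such rounds lets information from nodes that are several hops away influence each embedding, which is what makes node and edge prediction possible from local structure.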
= Entity alignment =
File:Knowledge graph entity alignment.png
As new knowledge graphs are produced across a variety of fields and contexts, the same entity will inevitably be represented in multiple graphs. However, because no single standard for the construction or representation of knowledge graphs exists, resolving which entities from disparate graphs correspond to the same real world subject is a non-trivial task. This task is known as knowledge graph entity alignment, and is an active area of research.{{Cite conference |last1=Berrendorf |first1=Max |last2=Faerman |first2=Evgeniy |last3=Melnychuk |first3=Valentyn |last4=Tresp |first4=Volker |last5=Seidl |first5=Thomas |date=April 14–17, 2020 |title=Knowledge graph entity alignment with graph convolutional networks: lessons learned |conference=Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal |series=Lecture Notes in Computer Science |volume=Proceedings, Part II |pages=3–11 |doi=10.1007/978-3-030-45442-5_1 |arxiv=1911.08342 |isbn=978-3-030-45441-8 |s2cid=208158314 |via=Springer International Publishing}}
Strategies for entity alignment generally seek to identify similar substructures, semantic relationships, shared attributes, or combinations of all three between two distinct knowledge graphs. Entity alignment methods use these structural similarities between generally non-isomorphic graphs to predict which nodes correspond to the same entity.{{Cite arXiv |last1=Chaurasiya |first1=Deepak |last2=Surisetty |first2=Anil |last3=Kumar |first3=Nitish |last4=Singh |first4=Alok |last5=Dey |first5=Vikrant |last6=Malhotra |first6=Aakarsh |last7=Dhama |first7=Gaurav |last8=Arora |first8=Ankur |date=2022 |title=Entity alignment for knowledge graphs: progress, challenges, and empirical studies |class=cs.AI |eprint=2205.08777 }}
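The shared-attribute strategy can be sketched with a simple set-overlap measure. The example below assumes two small graphs whose entities carry (relation, value) attribute sets, and pairs each entity with its most similar counterpart by Jaccard similarity. The identifiers, the threshold of 0.5, and the function names are illustrative assumptions; methods described in the literature typically combine such signals with structural and embedding-based similarity.

```python
# Two hypothetical graphs describing overlapping entities with different
# identifiers. Each entity maps to a set of (relation, value) attributes.
GRAPH_A = {
    "a:Q1": {("label", "marie curie"), ("born", "1867"), ("field", "physics")},
    "a:Q2": {("label", "albert einstein"), ("born", "1879"), ("field", "physics")},
}
GRAPH_B = {
    "b:curie_m": {("label", "marie curie"), ("born", "1867"), ("field", "chemistry")},
    "b:einstein_a": {("label", "albert einstein"), ("born", "1879")},
}


def jaccard(left, right):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def align(graph_a, graph_b, threshold=0.5):
    """For each entity in graph_a, pick the best-scoring entity in graph_b,
    keeping the pair only if the score reaches `threshold`."""
    pairs = {}
    if not graph_b:
        return pairs
    for ent_a, attrs_a in graph_a.items():
        best = max(graph_b, key=lambda ent_b: jaccard(attrs_a, graph_b[ent_b]))
        if jaccard(attrs_a, graph_b[best]) >= threshold:
            pairs[ent_a] = best
    return pairs


ALIGNMENT = align(GRAPH_A, GRAPH_B)
```

The two graphs disagree on one attribute of the first entity and omit one attribute of the second, yet both pairs are still recovered, which shows why alignment is framed as similarity prediction rather than exact matching.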
The recent successes of large language models (LLMs), in particular their effectiveness at producing syntactically meaningful embeddings, have spurred the use of LLMs in the task of entity alignment.{{Cite journal |last1=Hogan |first1=Aidan |last2=Lippolis |first2=Anna Sofia |last3=Klironomos |first3=Antonis |last4=Milon-Flores |first4=Daniela F. |last5=Zheng |first5=Heng |last6=Jouglar |first6=Alexane |last7=Norouzi |first7=Ebrahim |date=2023 |title=Enhancing Entity Alignment Between Wikidata and ArtGraph using LLMs |url=https://aidanhogan.com/docs/art_wikidata_kgs_llms.pdf |journal=Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage |via=International Workshop on Semantic Web and Ontology Design for Cultural Heritage (SWODCH), Athens, Greece}}
As the amount of data stored in knowledge graphs grows, developing dependable methods for knowledge graph entity alignment becomes an increasingly crucial step in the integration and cohesion of knowledge graph data.
See also
- {{Annotated link |Concept map}}
- {{Annotated link |Formal semantics (natural language)}}
- {{Annotated link |Graph database}}
- {{Annotated link |Knowledge base}}
- {{Annotated link |Knowledge graph embedding}}
- {{Annotated link |Logical graph}}
- {{Annotated link |Semantic integration}}
- {{Annotated link |Semantic technology}}
- {{Annotated link |Topic map}}
- {{Annotated link |Vadalog}}
- Wikibase - MediaWiki software extensions for creating knowledge bases
- Wikidata - Free knowledge database project
- {{Annotated link |YAGO (database)}}
References
{{reflist}}
External links
{{subject bar|d=y|auto=y}}
- {{cite news|url=https://www.technologyreview.com/2020/09/04/1008156/knowledge-graph-ai-reads-web-machine-learning-natural-language-processing/ | title= This know-it-all AI learns by reading the entire web nonstop | quote=Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages. | work = MIT Technology Review | author = Will Douglas Heaven | date = 4 September 2020 | access-date = 5 September 2020}}
{{Scholia|topic}}
{{Authority control}}
Category:Ontology (information science)