Knowledge Graph: 9. Data Quality and Linking
9. Data Quality and Linking
9.1 How good is Linked Open Data in practice?





Linked Open Vocabularies (LOV) project
– analyzes the usage of vocabularies









9.2 Quality
Linked Data Conformance vs. Quality
Conformance: following standards and best practices; a technical dimension that can be evaluated automatically
Quality: how complete, correct, etc. the data is; a content dimension that is hard to evaluate automatically



Example: Crowd Evaluation of DBpedia
The quality of Linked Open Data is far from perfect, in both conformance and content
Improving the quality is an active field of research
– A 2017 survey lists more than 40 approaches
– Since then: a lot of work on KG embeddings
9.3 Links
Previously on Knowledge Graphs
- Integrate data from different sources
- Make connections between entities in those sources
- Facilitate cross data source queries
- Overcome data silos
Why do we need Links?

How do we Create the Links?

There is too much data; many publishers interlink their own datasets with other datasets
9.3.1 Tool Support
A plethora of names
Mostly used for schema level:
- Ontology matching/alignment/mapping
- Schema matching/mapping
Mostly used for the instance level:
- Instance matching/alignment
- Interlinking
- Link discovery
9.3.2 Automating Interlinking



Basic Interlinking Techniques

Sources for Interlinking Signals

Simple String Based Metrics
- String equality
e.g. foo:University_of_Mannheim, bar:University_of_Mannheim
- Common prefixes
e.g. foo:United_States, bar:United_States_of_America
- Common postfixes
e.g. foo:Barack_Obama, bar:Obama
- Typical usage of prefixes/postfixes: |common|/max(length)
foo:United_States, bar:United_States_of_America → 13/24
foo:Barack_Obama, bar:Obama → 5/12
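The prefix/postfix measure |common|/max(length) can be sketched in a few lines of Python. This is only an illustration: the function names (local_name, prefix_similarity, postfix_similarity) are made up for this example, and it assumes that the namespace prefix (foo:, bar:) is stripped first and that every remaining character, including underscores, counts towards the length.

```python
def local_name(identifier: str) -> str:
    """Drop a namespace prefix such as 'foo:' (assumed naming scheme)."""
    return identifier.split(":", 1)[-1]


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_similarity(a: str, b: str) -> float:
    """|common prefix| / max(length); two empty strings count as equal."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else common_prefix_len(a, b) / longest


def postfix_similarity(a: str, b: str) -> float:
    """Same measure on the reversed strings, i.e. on the common postfix."""
    return prefix_similarity(a[::-1], b[::-1])


if __name__ == "__main__":
    us = prefix_similarity(local_name("foo:United_States"),
                           local_name("bar:United_States_of_America"))
    obama = postfix_similarity(local_name("foo:Barack_Obama"),
                               local_name("bar:Obama"))
    print(us, obama)
```

For foo:Barack_Obama and bar:Obama the common postfix is "Obama" (5 characters) and the longer name has 12 characters, so postfix_similarity returns 5/12.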
Edit Distance
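As a rough sketch, assuming that "edit distance" here means the standard Levenshtein distance (the minimum number of single-character insertions, deletions and substitutions), it can be computed with dynamic programming. The normalisation into a similarity in [0, 1] is one common choice, not necessarily the one used in the lecture; the function names are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, keeping only two DP rows."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """1 - distance / max(length); an assumed normalisation."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(edit_distance("Mannheim", "Manheim"))
    print(edit_similarity("Mannheim", "Manheim"))
```

Unlike the prefix/postfix measures, this also tolerates typos in the middle of a name: "Mannheim" and "Manheim" differ by one deleted character.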

N-gram based Similarity
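A minimal sketch of an n-gram based similarity: each string is split into its set of character n-grams (trigrams by default) and the two sets are compared. The notes do not say which set measure is used; Jaccard overlap is assumed here, and Dice is an equally common alternative. The function names are made up for this example.

```python
def ngrams(s: str, n: int = 3) -> set[str]:
    """Set of character n-grams of s; a shorter string yields itself."""
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of the n-gram sets (assumed set measure)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0  # two empty strings are treated as identical
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    print(sorted(ngrams("Obama")))
    print(ngram_similarity("United_States", "United_States_of_America"))
```

Because n-grams are compared as a set, the measure is insensitive to where in the string the shared parts occur, which helps with reordered name parts.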

Typical Preprocessing Techniques

Language-specific Preprocessing

Using External Knowledge

From Matching Literals to Matching Entities

Preprocessing and Matching Pipelines

9.4 Schema Matching





9.5 Instance-based Matching

Enforcing 1:1 Mappings



9.6 Matcher Combination



Evaluating Matchers

Challenges in Matching


