Knowledge Graph: 9. Data Quality and Linking
9. Data Quality and Linking
9.1 How good is Linked Open Data in practice?

 
 


Linked Open Vocabularies (LOV) project
 – analyzes the usage of vocabularies









9.2 Quality
Linked Data Conformance vs. Quality
Conformance: following standards and best practices (technical dimension); can be evaluated automatically
Quality: how complete, correct, etc. the data is (content dimension); hard to evaluate automatically



Example: Crowd Evaluation of DBpedia
The quality of Linked Open Data is far from perfect, in terms of both conformance and content
Improving the quality is an active field of research
 – Survey 2017: >40 approaches
 – since then: a lot of work on KG embeddings
9.3 Links
Previously on Knowledge Graphs
- Integrate data from different sources
- Make connections between entities in those sources
- Facilitate cross data source queries
- Overcome data silos
Why do we need Links?
 
How do we Create the Links?
 
There is too much data to link by hand; many datasets interlink themselves with other datasets.
9.3.1 Tool Support
A plethora of names
Mostly used for the schema level:
- Ontology matching/alignment/mapping
- Schema matching/mapping
Mostly used for the instance level:
- Instance matching/alignment
- Interlinking
- Link discovery
9.3.2 Automating Interlinking



Basic Interlinking Techniques
 
Sources for Interlinking Signals

Simple String Based Metrics
- String equality
 e.g. foo:University_of_Mannheim, bar:University_of_Mannheim
- Common prefixes
 e.g. foo:United_States, bar:United_States_of_America
- Common postfixes
 e.g. foo:Barack_Obama, bar:Obama
- Typical usage of prefixes/postfixes: |common|/max(length), counted on the local names (underscores included)
 foo:United_States, bar:United_States_of_America → 13/24
 foo:Barack_Obama, bar:Obama → 5/12
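The prefix/postfix measure above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and it assumes the namespace prefixes (foo:, bar:) have already been stripped so that only the local names are compared.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters that a and b share."""
    return common_prefix_len(a[::-1], b[::-1])


def prefix_similarity(a: str, b: str) -> float:
    """|common prefix| / max(length); two empty strings count as identical."""
    if not a and not b:
        return 1.0
    return common_prefix_len(a, b) / max(len(a), len(b))


def suffix_similarity(a: str, b: str) -> float:
    """|common postfix| / max(length); two empty strings count as identical."""
    if not a and not b:
        return 1.0
    return common_suffix_len(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    print(prefix_similarity("United_States", "United_States_of_America"))
    print(suffix_similarity("Barack_Obama", "Obama"))
```

Dividing by the longer length keeps the score in [0, 1] and penalizes pairs where one name is only a small fragment of the other.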
Edit Distance
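As a rough sketch, and assuming this heading refers to the standard Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions), the measure can be computed with dynamic programming. The normalization into a similarity score is one common choice, not something taken from the notes.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / max(length)."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(edit_similarity("Mannheim", "Manheim"))
```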
 
N-gram based Similarity
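One way to illustrate the idea: split each string into overlapping character n-grams and compare the resulting sets. The sketch below assumes sets of n-grams and Jaccard overlap; other variants (Dice coefficient, padded n-grams, multisets) are equally plausible, and the notes do not say which one is meant.

```python
def ngrams(s: str, n: int = 3) -> set:
    """Set of overlapping character n-grams; short strings yield themselves."""
    if not s:
        return set()
    if len(s) < n:
        return {s}
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of the two n-gram sets (an assumed choice of measure)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    print(sorted(ngrams("night", 2)))  # ['gh', 'ht', 'ig', 'ni']
    print(ngram_similarity("night", "nacht", n=2))
```

Unlike prefix or postfix measures, n-gram overlap still gives credit when the shared part sits in the middle of the strings or when tokens appear in a different order.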
 
Typical Preprocessing Techniques
 
Language-specific Preprocessing
 
Using External Knowledge
 
From Matching Literals to Matching Entities
 
Preprocessing and Matching Pipelines
 
9.4 Schema Matching

 



9.5 Instance-based Matching

Enforcing 1:1 Mappings
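A simple way to obtain a 1:1 mapping from scored candidate pairs is greedy selection: take pairs in order of decreasing score and skip any pair whose source or target is already used. This is only one possible strategy, chosen here for brevity; the notes do not state which method is intended, and globally optimal assignment methods exist as alternatives. The function name and tuple layout are hypothetical.

```python
def greedy_one_to_one(candidates):
    """Pick a 1:1 subset of (source, target, score) triples, best score first."""
    used_sources, used_targets = set(), set()
    mapping = []
    for source, target, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if source in used_sources or target in used_targets:
            continue  # one side is already matched with a better-scoring partner
        mapping.append((source, target, score))
        used_sources.add(source)
        used_targets.add(target)
    return mapping


if __name__ == "__main__":
    candidates = [
        ("foo:Paris", "bar:Paris", 0.95),
        ("foo:Paris", "bar:Paris_Texas", 0.60),
        ("foo:Paris_TX", "bar:Paris", 0.55),
        ("foo:Paris_TX", "bar:Paris_Texas", 0.80),
    ]
    for pair in greedy_one_to_one(candidates):
        print(pair)
```

Greedy selection is fast but can miss the assignment with the best total score, which is why it is labeled a sketch rather than the recommended method.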
 
 

9.6 Matcher Combination



Evaluating Matchers
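Matchers are commonly evaluated by comparing the links they produce against a reference alignment (gold standard) using precision, recall, and F1. The sketch below assumes that setup; the function name and the representation of links as (source, target) pairs are illustrative choices, not taken from the notes.

```python
def evaluate_matcher(found: set, reference: set):
    """Return (precision, recall, f1) of found links against a reference alignment."""
    true_positives = len(found & reference)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    found = {("foo:a", "bar:x"), ("foo:b", "bar:y"), ("foo:c", "bar:z")}
    reference = {("foo:a", "bar:x"), ("foo:b", "bar:y"),
                 ("foo:d", "bar:w"), ("foo:e", "bar:v")}
    print(evaluate_matcher(found, reference))
```

In this toy example two of the three found links are correct and two of the four reference links are found, so precision is 2/3 and recall is 1/2.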
 
Challenges in Matching
 

