Knowledge Graph: 2. RDF & 3. RDFS
2. Resource Description Framework (RDF)
Graph
Directed vs. undirected
Labeled vs. unlabeled
Homogenous vs. heterogeneous nodes
Cyclic vs. acyclic
Knowledge Graphs are Graphs
Knowledge graphs are directed, labeled graphs with heterogeneous node types (and edge types); they need not be cycle-free
Node types (“classes”) and edge types (“properties”) are also referred to as the “schema” of the graph (aka “ontology”), e.g. an edge of type “author” links a publication to a person
Metadata on the Web
Goal: more effective rating and ranking of web contents
Metadata on the Web: Dublin Core
2.1 What is RDF?
RDF = Resource Description Framework
Description of arbitrary things
A knowledge graph consists of multiple sentences
We usually think of knowledge graphs as densely connected graphs (Objects of one statement become subjects of another)
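A minimal sketch of this view, assuming each statement is stored as a plain (subject, predicate, object) tuple. The resource names (`:Munich`, `:Bavaria`, and so on) and the helper functions are hypothetical and only illustrate how the object of one statement becomes the subject of another:

```python
# Hypothetical example data: each statement is a (subject, predicate, object) tuple.
triples = {
    (":Munich", ":locatedIn", ":Bavaria"),
    (":Bavaria", ":locatedIn", ":Germany"),
    (":Germany", ":hasCapital", ":Berlin"),
}


def outgoing(graph, node):
    """Return the (predicate, object) pairs of all edges leaving `node`."""
    return sorted((p, o) for s, p, o in graph if s == node)


def reachable(graph, start):
    """Collect every node reachable from `start` by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, obj in outgoing(graph, node):
            if obj not in seen:  # the check also keeps cyclic graphs from looping
                seen.add(obj)
                stack.append(obj)
    return seen


print(reachable(triples, ":Munich"))
```

Because the graph is directed, starting from `:Berlin` instead reaches nothing in this toy data.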
2.2 Basic Building Blocks of RDF
2.2.1 Resources
- denote things
- are identified by a URI
- can have one or multiple types
- A resource can be a subject itself
Types: All resources (not literals) can have a type. Types can be arbitrarily defined. The predefined predicate rdf:type defines the type of a resource.
2.2.2 Literals
- are values like strings or integers
- A literal is an atomic value; it can only be an object, never a subject or predicate (graph view: literals can only have incoming edges). This is how literals differ from resources.
- can have a datatype or a language tag (but not both)
Datatypes for Literals
(Almost) all XML Schema datatypes may be used. Exceptions: XML-specific types, the underspecified type “duration”, and sequence types.
There are no default datatypes (not even “string”!)
Language Tags for Literals
Literals may be given in different natural languages, and a language tag marks which one: “München”@de, “Munich”@en
Knowledge Graphs can be multilingual!
Example:
:Munich :hasName "München"@de .
:Munich :hasName "Munich"@en .
:Munich :hasPopulation "1356594"^^xsd:integer .
:Munich :hasFoundingYear "1158-01-01"^^xsd:date .
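The following are three different literals (in RDF 1.0; RDF 1.1 treats a plain literal as shorthand for the xsd:string one, so the first and third coincide there):
– "München"
– "München"@de
– "München"^^xsd:string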
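The distinction between these literals can be sketched with a small value class. This is an illustrative model that follows the view in these notes (a plain literal has no datatype at all); the `Literal` class and its field names are assumptions, not part of any RDF library:

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical model of an RDF literal as described in the notes."""
    lexical: str
    datatype: Optional[str] = None
    language: Optional[str] = None

    def __post_init__(self):
        # A literal may carry a datatype or a language tag, but not both.
        if self.datatype is not None and self.language is not None:
            raise ValueError("datatype and language tag are mutually exclusive")

    def to_turtle(self) -> str:
        text = '"' + self.lexical.replace("\\", "\\\\").replace('"', '\\"') + '"'
        if self.language is not None:
            return f"{text}@{self.language}"
        if self.datatype is not None:
            return f"{text}^^<{self.datatype}>"
        return text


plain = Literal("München")
german = Literal("München", language="de")
typed = Literal("München", datatype=XSD + "string")

# Same lexical form, but three distinct values under this model.
print(len({plain, german, typed}))
```

Equality here compares the lexical form together with the datatype and language tag, which is why the three values stay distinct.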
2.2.3 Properties (Predicates)
- Link resources to other resources and to literals
2.3 Triple Notation
Triples consist of a subject, predicate, and object.
An RDF document is an unordered set of triples.
Example:
Literal with language tag:
<http://www.dws.informatik.uni-mannheim.de/teaching/semantic-web>
<http://purl.org/dc/elements/1.1/subject>
"Semantic Web"@en .
Typed literal:
<http://www.dws.informatik.uni-mannheim.de/teaching/semantic-web>
<http://www.uni-mannheim.de/mhb/creditpoints>
"6"^^<http://www.w3.org/2001/XMLSchema#integer> .
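As a rough sketch, the two example statements can be produced in triple notation with a few lines of Python. The helper names `uri`, `literal`, and `to_ntriples` are made up for this illustration; in this notation URIs are written in angle brackets and every statement ends with a dot:

```python
def uri(value):
    """Wrap a URI in angle brackets, as triple notation requires."""
    return f"<{value}>"


def literal(lexical, lang=None, datatype=None):
    """Render a literal with an optional language tag or datatype URI."""
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    if lang is not None:
        return f'"{escaped}"@{lang}'
    if datatype is not None:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def to_ntriples(triples):
    """One statement per line: subject, predicate, object, terminating dot."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


COURSE = uri("http://www.dws.informatik.uni-mannheim.de/teaching/semantic-web")
triples = [
    (COURSE, uri("http://purl.org/dc/elements/1.1/subject"),
     literal("Semantic Web", lang="en")),
    (COURSE, uri("http://www.uni-mannheim.de/mhb/creditpoints"),
     literal("6", datatype="http://www.w3.org/2001/XMLSchema#integer")),
]

print(to_ntriples(triples))
```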
2.4 Turtle Notation
A simplified triple notation
Turtle (Terse RDF Triple Language) is a more readable way of writing RDF that uses a concise syntax to describe RDF data.
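To give an impression of the notation, the following sketch prints the Munich statements in Turtle, using prefix declarations and the `;` shorthand for repeating a subject. The `to_turtle` helper and the `http://example.org/` namespace are assumptions made for this example, and the function covers only this small part of the syntax:

```python
from collections import defaultdict


def to_turtle(prefixes, triples):
    """Serialize already-abbreviated terms, grouping statements by subject."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))
    for s, pairs in by_subject.items():
        # ';' repeats the subject for the next predicate-object pair.
        body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
        lines.append(f"{s} {body} .")
    return "\n".join(lines)


prefixes = {
    "": "http://example.org/",  # hypothetical default namespace
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
triples = [
    (":Munich", ":hasName", '"München"@de'),
    (":Munich", ":hasName", '"Munich"@en'),
    (":Munich", ":hasPopulation", '"1356594"^^xsd:integer'),
]

print(to_turtle(prefixes, triples))
```

Compared with the full triple notation, the subject is written once and the long URIs are replaced by prefixed names.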
2.5 Notation RDF/XML
Encodes RDF in XML
Suitable for machine processing (plenty of XML tools)
2.6 JSON-LD Notation
JSON-LD: Standard for serializing RDF in JSON
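As an illustration, the course statements from the triple notation section might be written as a JSON-LD document along the following lines. The choice of context terms (`dc`, `xsd`, `creditpoints`) is an assumption of this sketch; only Python's standard `json` module is used, so the document is built and serialized but not processed by a JSON-LD library:

```python
import json

doc = {
    "@context": {
        "dc": "http://purl.org/dc/elements/1.1/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        # Hypothetical term: maps a short key to the property URI and its datatype.
        "creditpoints": {
            "@id": "http://www.uni-mannheim.de/mhb/creditpoints",
            "@type": "xsd:integer",
        },
    },
    # "@id" names the subject; the remaining keys are its predicates.
    "@id": "http://www.dws.informatik.uni-mannheim.de/teaching/semantic-web",
    "dc:subject": {"@value": "Semantic Web", "@language": "en"},
    "creditpoints": "6",
}

serialized = json.dumps(doc, indent=2, ensure_ascii=False)
print(serialized)
```

The `@context` plays a role similar to prefix declarations in Turtle: it maps the short keys used in the JSON body to full URIs.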
2.7 Blank Nodes
Blank Nodes in Turtle
Application of Blank Nodes: n-ary Predicates
RDF predicates always connect a subject and an object, i.e., in the sense of predicate logic, they are binary predicates.
Sometimes, n-ary predicates are needed
– has_ingredient(Recipe, Sugar, 100g)
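One common way to express such an n-ary predicate is to introduce a blank node that stands for the relation instance and to attach each argument to it with a binary predicate. A minimal sketch, in which the predicate names `:hasIngredient`, `:ingredient`, and `:amount` are hypothetical:

```python
import itertools

_counter = itertools.count(1)


def new_blank_node():
    """Return a fresh blank node label such as _:b1."""
    return f"_:b{next(_counter)}"


def has_ingredient(recipe, ingredient, amount):
    """Break the ternary relation into three binary triples around a blank node."""
    node = new_blank_node()
    return [
        (recipe, ":hasIngredient", node),
        (node, ":ingredient", ingredient),
        (node, ":amount", amount),
    ]


triples = has_ingredient(":Recipe", ":Sugar", '"100g"')
for triple in triples:
    print(triple)
```

The blank node has no URI of its own; it only ties the ingredient and its amount together.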
2.8 Semantic Principles of RDF
AAA principle: anybody can say anything about anything, also used for RDF knowledge graphs
Non-unique name assumption: there is not just one name for each thing. Just that two things have different names does not mean that they are different!
Open World Assumption: there may be more information on a resource than what we have
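The Open World Assumption can be illustrated with a toy lookup that never answers "false": a statement that is missing from the graph is merely unknown. The `ask` function and the example data are assumptions for illustration only:

```python
from typing import Optional

graph = {(":Munich", ":locatedIn", ":Germany")}


def ask(graph, triple) -> Optional[bool]:
    """Return True if the triple is stated; None (unknown) otherwise, never False."""
    return True if triple in graph else None


print(ask(graph, (":Munich", ":locatedIn", ":Germany")))
print(ask(graph, (":Munich", ":hasMayor", ":SomePerson")))
```

A closed-world system such as a typical relational database would instead treat the missing statement as false.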
2.9 RDF and HTML
The Semantic Web uses RDF
The “classic” Web uses HTML
Explicit reference to an RDF version: an agent stumbling upon the HTML page can download the RDF data file
Content Negotiation
MIME: Multipurpose Internet Mail Extensions
Link to RDF Document | Content Negotiation |
---|---|
Can be done with a simple HTML editor; No special server configuration needed | Requires particular server setup; One URI can be used for different representations |
Both cases require: two different representations + “double bookkeeping”
→ Potential source of inconsistencies!
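The server-side decision behind content negotiation can be sketched as follows: the client states its preferred media types in the Accept header, and the server picks a matching representation of the same URI. The file names are hypothetical, and this simplified version ignores wildcards such as `*/*`:

```python
# Hypothetical representations that a server holds for one and the same URI.
AVAILABLE = {
    "text/html": "semantic-web.html",
    "application/rdf+xml": "semantic-web.rdf",
    "text/turtle": "semantic-web.ttl",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending preference."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type, q = fields[0], 1.0  # q defaults to 1.0 when absent
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media_type:
            prefs.append((media_type, q))
    # sorted() is stable, so equally weighted types keep their header order.
    return sorted(prefs, key=lambda item: item[1], reverse=True)


def negotiate(header, available=AVAILABLE, default="text/html"):
    """Pick the most preferred available representation, else the default."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type, available[media_type]
    return default, available[default]


print(negotiate("text/turtle, text/html;q=0.5"))
```

A browser that asks for `text/html` gets the page, while an RDF-aware agent asking for `text/turtle` gets the data, both under one URI.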
2.10 RDF in Attributes (RDFa)
Idea of RDFa:
Why not encode HTML and RDF in one document? The essential information only has to be encoded once
RDFa combines XHTML with RDF
2.11 Microdata
Alternative to RDFa
Adding structured information to web pages
– By marking up contents
– Arbitrary vocabularies are possible
– Introduced with HTML5
RDFa vs. Microdata
Commonalities
– Arbitrary classes/predicates are possible
– Although Microdata is mainly used with schema.org
Differences
– Microdata is slightly less expressive
– No URIs, only blank nodes
– No cycles in the resulting RDF graph
– No reification (see later)
2.12 RDFa, MicroFormats, and Microdata
MicroFormats: fixed vocabularies for persons, addresses, etc
WebDataCommons: Large-Scale Extraction of RDFa, MicroFormats, Microdata, JSON-LD from the Web
2.13 RDF Tools
Storage: relational databases, graph databases
Validation: validating parsers checking consistency
Visualization: mostly graph based visualization
Reasoning: inference over graphs
Programming: APIs
2.14 Metadata for RDF
Dublin Core was designed as Metadata for the Web. Knowledge graphs may have metadata as well.
Most prominently: provenance
– Where does the data come from?
– Who created it?
– When was it created?
– What was the process creating it?
…
2.15 Reification
In RDF: Statements about statements
“Peter says that Rome is the capital of Spain.”
Implementation:
RDF Statements are considered resources themselves. Can be subject or object of other statements.
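Using the RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object), the example sentence could be encoded roughly as below. The identifiers `:stmt1`, `:Peter`, `:says`, and `:capitalOf` are hypothetical names chosen for this sketch:

```python
def reify(statement_id, subject, predicate, obj):
    """Describe a statement as a resource with four triples."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


triples = reify(":stmt1", ":Rome", ":capitalOf", ":Spain")
# The reified statement can now be the object of another statement.
triples.append((":Peter", ":says", ":stmt1"))

for triple in triples:
    print(triple)
```

Note that the triple (:Rome, :capitalOf, :Spain) itself is not part of the graph: describing a statement does not assert it.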
3. RDF Schema (RDFS)
Schemas and ontologies bring semantics to knowledge graphs
3.1 Semantics
How do Semantics Work?
3.1.1 Lexical semantics
The meaning of a word is defined by its relations to other words.
3.1.2 Extensional semantics
Meaning of a word is defined by the set of its instances.
3.1.3 Intensional semantics
e.g., feature-based semantics. Meaning of a word is defined by features of the instances
Intensional vs. Extensional Semantics
1. Intensionally different things can have the same extension. Classic example: morning star and evening star, both have the same extension (i.e., Venus)
2. The extension can change over time without the intension changing
3. Intension may also change over time
4. Extension may also be empty
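A toy sketch of the distinction, assuming an intension is modeled as a predicate over features and the extension is the set of objects that satisfy it. The universe and its feature values are made up for illustration and are not astronomical data:

```python
# Toy universe: names mapped to invented features.
universe = {
    "Venus": {"seen_at_dawn": True, "seen_at_dusk": True},
    "Mars": {"seen_at_dawn": False, "seen_at_dusk": False},
}


def extension(intension, universe):
    """The extension is the set of all instances that satisfy the intension."""
    return {name for name, features in universe.items() if intension(features)}


def morning_star(features):
    return features["seen_at_dawn"]


def evening_star(features):
    return features["seen_at_dusk"]


def unicorn(features):
    return features.get("has_single_horn", False)


print(extension(morning_star, universe))
print(extension(evening_star, universe))
print(extension(unicorn, universe))
```

The first two intensions differ but share one extension, and the third has an empty extension. Adding an object to `universe` changes the extensions without touching any of the intensions.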
3.1.4 Prototype semantics
Meaning of a word is defined by proximity to a prototypical instance.
Intensional and extensional semantics are based on Boolean logic, whereas prototype semantics is a fuzzier variant.
Semantics define the meaning of words
That is what we do with ontologies, using methods from lexical, intensional, and extensional semantics
3.2 Ontology
An ontology is an explicit specification of a conceptualization.
Ontologies encode the knowledge about a domain
They form a common vocabulary and describe the semantics of its terms
In computer science (where the word is countable and used with “a” or “the”), an ontology is:
– a formalized description of a domain
– a shared vocabulary
– a logical theory
Essential Properties of Ontologies
- Explicit: Meaning is not “hidden” between the lines
- Formal: e.g., using logic or rule languages
- Shared: An ontology just for one person does not make much sense
- Partial: There will (probably) never be a full ontology of everything in the world
3.3 Encoding Simple Ontologies: RDFS
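RDFS is an extension built on top of RDF that defines a schema for the resources in RDF and the relations between them. RDFS introduces the concepts of classes and properties, allowing users to define the types of resources and the meaning of properties. RDFS also provides some basic inference rules about resources and properties; for example, if resource A is an instance of class B, then A is also an instance of every superclass of B. RDFS is used to model hierarchies and basic constraints between resources in order to enrich the semantics of RDF data.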
3.3.1 Most important element: classes
Classes form hierarchies
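Class:
A class defines a group of resources that share similar characteristics or properties. Classes are used to group or categorize resources so that data can be organized and described more easily. A class is a kind of metadata that describes the concepts and entity types in a data model. For example, you can define a class “Person” to represent the concept of all people. Classes can form hierarchies in which one class is a subclass of another, which lets you relate more specific classes to more abstract ones. For example, “Student” can be a subclass of “Person”. Classes can also have instances, which are the concrete members of the class. For example, “Alice” and “Bob” can be instances of the class “Person”.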
3.3.2 Properties in RDF Schema
resemble two-valued predicates in predicate logic
Properties also form hierarchies
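Property:
Properties describe characteristics of resources or relations between them. They play a key role in the RDF data model, connecting the subject and the object of a statement. A property can express a characteristic of a resource; for example, “hasName” can give the name of a resource. A property can also express a relation between resources; for example, “isFriendOf” can express a friendship between two resources. The type property (rdf:type) associates a resource with a class, indicating the type of the resource.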
Domains and Ranges of Properties
In general, properties exist independently from classes.
Defining the domain and range of a property
:capitalOf rdfs:domain :City . [Domain defines the type of the instance that is in the subject position]
:capitalOf rdfs:range :Country . [Range defines the type of the instance which is in the object position]
Domain and range are inherited by sub properties
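How domain and range declarations are used can be sketched as a small inference step: whenever a property appears in a statement, its subject receives the domain class and its object the range class. The fact about `:Madrid` and the function name `infer_types` are assumptions for this example, and inheritance through sub-properties is left out:

```python
schema = {
    (":capitalOf", "rdfs:domain", ":City"),
    (":capitalOf", "rdfs:range", ":Country"),
}
# Hypothetical instance data using the property.
facts = {(":Madrid", ":capitalOf", ":Spain")}


def infer_types(schema, facts):
    """Derive rdf:type triples from the domain and range of the used properties."""
    inferred = set()
    for s, p, o in facts:
        for prop, kind, cls in schema:
            if prop != p:
                continue
            if kind == "rdfs:domain":
                inferred.add((s, "rdf:type", cls))  # subject gets the domain class
            elif kind == "rdfs:range":
                inferred.add((o, "rdf:type", cls))  # object gets the range class
    return inferred


print(sorted(infer_types(schema, facts)))
```

Domain and range therefore act as sources of new type statements, not as constraints that reject data.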
Predefined Properties
rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain, rdfs:range, rdfs:label, rdfs:comment, rdfs:seeAlso (links to other resources), rdfs:isDefinedBy (link to the defining schema)
URIs vs. Labels
A URI is only a unique identifier, and it does not need to be interpretable.
Labels are made for human interpretation, and can come in different languages
3.4 RDF Schema and RDF
Every RDF Schema document is also an RDF document, so all properties of RDF also hold for RDFS (non-unique name assumption and Open World Assumption).
3.5 Reasoning with RDF
T-Box: Definition of the Terminology
A-Box: Definition of the Assertions
RDF Schema allows for deductive reasoning on RDF (given facts and rules, derive new facts)
The corresponding tools are called reasoners
Opposite of deduction: induction, i.e. deriving models from facts, e.g. data mining and machine learning
Interpretation and Entailment
Entailment: The set of all consequences of a graph
Mapping a graph to an entailment is called interpretation
This interpretation creates all statements explicitly contained in the graph. But the implicit statements are the interesting ones!
Interpretation using Deduction Rules
RDF interpretation can be done using RDFS deduction rules. Those create an entailment using existing resources, literals, and properties, creating additional triples of the form <s, p, o>.
Deduction rules are an interpretation function
Simple reasoning algorithm (a.k.a. forward chaining)
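A minimal forward-chaining sketch: apply the deduction rules to the graph, add whatever is new, and repeat until nothing changes. Only a subset of the RDFS rules is implemented here (domain, range, sub-property, and sub-class rules), and the example graph is hypothetical. The nested loop is quadratic and meant for illustration, not for large graphs:

```python
TYPE, SUBCLASS, SUBPROP = "rdf:type", "rdfs:subClassOf", "rdfs:subPropertyOf"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"


def apply_rules(graph):
    """One pass over a subset of the RDFS deduction rules; returns derived triples."""
    derived = set()
    for s, p, o in graph:
        for a, rel, b in graph:
            if rel == DOMAIN and a == p:        # p has domain b: s is a b
                derived.add((s, TYPE, b))
            if rel == RANGE and a == p:         # p has range b: o is a b
                derived.add((o, TYPE, b))
            if rel == SUBPROP and a == p:       # p is a sub-property of b
                derived.add((s, b, o))
            if p == TYPE and rel == SUBCLASS and a == o:      # instance of subclass
                derived.add((s, TYPE, b))
            if p == SUBCLASS and rel == SUBCLASS and a == o:  # subclass transitivity
                derived.add((s, SUBCLASS, b))
            if p == SUBPROP and rel == SUBPROP and a == o:    # sub-property transitivity
                derived.add((s, SUBPROP, b))
    return derived


def forward_chain(graph):
    """Repeat rule application until a fixed point is reached."""
    closure = set(graph)
    while True:
        new = apply_rules(closure) - closure
        if not new:
            return closure
        closure |= new


graph = {
    (":capitalOf", DOMAIN, ":City"),
    (":capitalOf", RANGE, ":Country"),
    (":capitalOf", SUBPROP, ":locatedIn"),
    (":City", SUBCLASS, ":Place"),
    (":Madrid", ":capitalOf", ":Spain"),
}

closure = forward_chain(graph)
for triple in sorted(closure - graph):
    print(triple)
```

The loop terminates because the rules only combine terms that already occur in the graph, so the set of derivable triples is finite.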
Multiple Domains/Ranges
3.6 What we Cannot Express
There is no negation in RDF and RDFS.
We cannot produce any contradictions
The missing negation perfectly fits the AAA principle and Open World Assumption.
Any new knowledge will always be compatible with the knowledge that is already there. This principle is called “monotonicity”
RDF Schema is not very powerful, but it is free of contradictions.