Python Code for Building a Knowledge Graph
Date: 2023-11-19 09:59:20
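Building a knowledge graph in Python relies on a few open-source libraries, such as rdflib, SPARQLWrapper, and pandas. The following is a simple example that creates a knowledge graph containing entities such as persons, books, authors, and publishers.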
```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS, FOAF, XSD
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd
# Define the namespace
ns = Namespace("http://example.org/")
# Create an empty graph
g = Graph()
# Define the classes and properties
Person = ns.Person
Book = ns.Book
Author = ns.Author
Publisher = ns.Publisher
name = ns.name
title = ns.title
author = ns.author
publisher = ns.publisher
# Declare the classes
g.add((Person, RDF.type, RDFS.Class))
g.add((Book, RDF.type, RDFS.Class))
g.add((Author, RDF.type, RDFS.Class))
g.add((Publisher, RDF.type, RDFS.Class))
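# Declare the properties
g.add((name, RDF.type, RDF.Property))
g.add((name, RDFS.domain, Person))
# RDFS.Literal is the RDF Schema class of literals; rdflib's Literal is a Python class, not a graph term
g.add((name, RDFS.range, RDFS.Literal))
g.add((title, RDF.type, RDF.Property))
g.add((title, RDFS.domain, Book))
g.add((title, RDFS.range, RDFS.Literal))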
g.add((author, RDF.type, RDF.Property))
g.add((author, RDFS.domain, Book))
g.add((author, RDFS.range, Author))
g.add((publisher, RDF.type, RDF.Property))
g.add((publisher, RDFS.domain, Book))
g.add((publisher, RDFS.range, Publisher))
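# Load the data; data.csv is expected to have the columns person, book, author, publisher
df = pd.read_csv('data.csv')
# Add the instances
for _, row in df.iterrows():
    # Use distinct variable names so the `author` and `publisher`
    # properties defined above are not overwritten inside the loop
    person_uri = ns[row['person']]
    book_uri = ns[row['book']]
    author_uri = ns[row['author']]
    publisher_uri = ns[row['publisher']]
    g.add((person_uri, RDF.type, Person))
    g.add((book_uri, RDF.type, Book))
    g.add((author_uri, RDF.type, Author))
    g.add((publisher_uri, RDF.type, Publisher))
    g.add((person_uri, name, Literal(row['person'])))
    g.add((book_uri, title, Literal(row['book'])))
    g.add((book_uri, author, author_uri))
    g.add((book_uri, publisher, publisher_uri))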
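# Query the data. The query is sent to a SPARQL endpoint, so the graph built
# above must already have been loaded into that endpoint's dataset.
sparql = SPARQLWrapper("http://localhost:3030/knowledge-graph/sparql")
sparql.setQuery("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ns: <http://example.org/>
SELECT ?person ?book ?author ?publisher
WHERE {
    ?person rdf:type ns:Person .
    ?book rdf:type ns:Book .
    ?book ns:author ?author .
    ?book ns:publisher ?publisher .
    ?person ns:name ?person_name .
    ?book ns:title ?book_title .
    # The loading code stores no ns:name for authors or publishers,
    # so these patterns are optional; otherwise the result would be empty
    OPTIONAL { ?author ns:name ?author_name . }
    OPTIONAL { ?publisher ns:name ?publisher_name . }
    FILTER(?person_name = "Alice")
}
""")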
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
# Print the results
for result in results["results"]["bindings"]:
    print(result["book"]["value"], result["author"]["value"], result["publisher"]["value"])
```
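In the code above, we first define the namespace, create an empty graph, and declare the classes and properties. We then read the data and add the instances and their properties to the graph. Finally, we run a SPARQL query and print the results. Note that the query is sent to a SPARQL endpoint rather than to the in-memory graph `g`, so the data has to be loaded into that endpoint first. The example uses a local SPARQL endpoint; other endpoints, such as DBpedia, can be queried the same way, although they serve their own data rather than this graph.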
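The loading loop builds URIs directly from CSV cell values, for example `ns[row['book']]`. If a value contains spaces or other characters that are not allowed in a URI, the resulting identifier may be invalid. One possible way to handle this, sketched here with the standard library only, is to normalize and percent-encode the label first. The helper name `mint_uri` and the underscore convention are assumptions for illustration, not part of the original example.

```python
from urllib.parse import quote

BASE = "http://example.org/"


def mint_uri(label, base=BASE):
    """Turn a free-text label into a URI string under `base` (hypothetical helper)."""
    # Strip surrounding whitespace and join the words with underscores,
    # then percent-encode every character that is not URI-safe.
    slug = "_".join(str(label).split())
    return base + quote(slug, safe="")


if __name__ == "__main__":
    print(mint_uri("War and Peace"))
    print(mint_uri("C++ Primer"))
```

In the loop, the string returned by `mint_uri(row['book'])` could be wrapped in rdflib's `URIRef` and used in place of `ns[row['book']]`, while the original cell value is still stored as the `title` literal.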