PyRDF
By Zef Hemel
Yesterday's post and comments got me thinking. It is still fairly hard to manipulate and generate RDF data, and I don't think it really has to be. ActiveRDF (a Ruby RDF API) takes an interesting approach and I thought I'd build something similarish in Python, so I started on that and after a couple of hours I already have something quite neat. I've called it PyRDF for now, and here's a sample piece of code to give you a feel for how it works.
import pyrdf
from pyrdf import RdfStore, RdfResource, RdfType
from rdflib.Namespace import Namespace
NS_P = Namespace('http://www.zefhemel.com/ont/person#')
NS_J = Namespace('http://www.zefhemel.com/ont/job#')
store = RdfStore(defaultNS = NS_P)
store.prefix_mapping('p', NS_P)
store.prefix_mapping('j', NS_J)
pyrdf.setDefaultStore(store)
Person = RdfType(NS_P['Person'])
Website = RdfType(NS_P['Website'])
Job = RdfType(NS_J['Job'])
zef = RdfResource(NS_P['zef'], rdf_type = Person)
zef.name = 'author: Zef Hemel'
zef.age = 22
zef.country = 'Ireland'
zef.city = 'Dublin'
job1 = RdfResource(NS_J['job1'], defaultNS=NS_J, rdf_type = Job)
job1.name = 'Student System Administrator'
job1.description = 'Fiddling around with Linux servers'
job1.startYear = 2003
job1.endYear = 2005
job2 = RdfResource(NS_J['job2'], rdf_type = Job)
# And without the defaultNS set:
job2.j_name = 'Writing website'
job2.j_description = 'Writing own weblogs, not that well paid.'
job2.j_startYear = 2003
zef.hadJob = [job1, job2]
zef.website = []
zefhemelcom = RdfResource(NS_P['zefhemelcom'], rdf_type = Website)
zefhemelcom.title = 'ZefHemel.com'
zefhemelcom.url = 'http://www.zefhemel.com'
zef.website.append(zefhemelcom)
zefnu = RdfResource(NS_P['zefnu'], rdf_type = Website)
zefnu.title = 'Zef.Nu'
zefnu.url = 'http://zef.nu'
zef.website.append(zefnu)
print store.serialize(format='pretty-xml')
Here is the output of that. Saves quite some typing, eh?
OK, you probably need some understanding of XML and XML namespaces to fully follow this, but even if you don't, it should be pretty obvious. PyRDF right now has three classes:
1. RdfStore, which stores RDF triples as described before. You don't have to do much with it except register some prefixes. Later on you can also use this class to serialize your data into RDF/XML and to save it to and load it from files, but that doesn't work yet.
2. RdfResource, which represents a resource; you can simply see this as an object. When instantiating an RdfResource you have to give it at least a URI. Additionally you can pass it:
   - store, a place to store the resource's data. By default it's all stored in the defaultStore, and usually that's fine.
   - defaultNS, the default namespace that's used for the property names. More on this later.
   - A number of initial properties and values. This is the same as writing resourcename.property = value, and is just added for convenience.
3. RdfType, a direct subclass of RdfResource. It hardly does anything at the moment. Later it could potentially be used to enforce correct typing and property use.
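To make the "initial properties" option concrete, here is a minimal sketch of how a constructor could treat extra keyword arguments exactly like later attribute assignments. `SketchResource` and its parameter names are hypothetical stand-ins written for this illustration; this is not PyRDF's actual code.

```python
class SketchResource:
    """Hypothetical stand-in for RdfResource, for illustration only."""

    def __init__(self, uri, store=None, default_ns=None, **initial_properties):
        self.uri = uri
        self.store = store
        self.default_ns = default_ns
        # Every extra keyword argument is applied the same way as
        # writing `resource.name = value` after construction.
        for name, value in initial_properties.items():
            setattr(self, name, value)


URI = "http://www.zefhemel.com/ont/person#zef"

# Properties passed at construction time...
a = SketchResource(URI, name="Zef Hemel", age=22)

# ...and the same properties assigned afterwards.
b = SketchResource(URI)
b.name = "Zef Hemel"
b.age = 22

same_state = vars(a) == vars(b)
```

Both objects end up with identical attributes, which is all the convenience form promises.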
RdfResources have properties, just like objects. Properties can have other resources, literals (strings, integers etc.) or lists (of resources or literals) as values. PyRDF tries to guess automatically what kind of value a property holds. If you start using it as a list, it will function as a list; if you put literals or RdfResources in it, it will (hopefully) act as expected.
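One way such value guessing could work is to intercept attribute assignment and turn each value into one or more triples. The sketch below is an assumption about the general technique, not PyRDF's implementation: `ToyStore`, `ToyResource` and the `("resource", ...)` / `("literal", ...)` tagging are all made up for the example.

```python
class ToyStore:
    """Hypothetical in-memory store: a list of (subject, predicate, object)."""

    def __init__(self):
        self.triples = []


class ToyResource:
    """Hypothetical stand-in for RdfResource."""

    def __init__(self, uri, store, namespace):
        # object.__setattr__ keeps these bookkeeping fields out of the store.
        object.__setattr__(self, "uri", uri)
        object.__setattr__(self, "store", store)
        object.__setattr__(self, "namespace", namespace)

    def __setattr__(self, name, value):
        predicate = self.namespace + name
        # Re-assigning a property replaces whatever it held before.
        self.store.triples[:] = [
            t for t in self.store.triples
            if (t[0], t[1]) != (self.uri, predicate)
        ]
        # A list yields one triple per item; anything else yields one triple.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, ToyResource):
                obj = ("resource", item.uri)   # points at another resource
            else:
                obj = ("literal", item)        # plain string, integer, etc.
            self.store.triples.append((self.uri, predicate, obj))


NS_P = "http://www.zefhemel.com/ont/person#"
store = ToyStore()
zef = ToyResource(NS_P + "zef", store, NS_P)
zefnu = ToyResource(NS_P + "zefnu", store, NS_P)

zef.age = 22             # literal value
zef.website = [zefnu]    # list of resources
```

This sketch only handles assigning a whole list at once. Supporting in-place calls such as `zef.website.append(...)`, as in the sample above, would additionally need a list-like proxy object that writes back to the store.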
By default the property name is combined with the default namespace of the resource (or store), so for example if your default namespace is http://www.zefhemel.com/ont/person# and your property name is age, then the URI of the property will be http://www.zefhemel.com/ont/person#age. If you use a prefix followed by an underscore in the property name, like j_description, the default namespace will be overridden by the namespace associated with the j prefix. So in this case the URI will be http://www.zefhemel.com/ont/job#description.
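The naming rule can be sketched as a small function. `resolve_property_uri` is a hypothetical helper written for this example, and the fallback for an underscore with an unregistered prefix is an assumption; the post does not say how PyRDF handles that case.

```python
NS_P = "http://www.zefhemel.com/ont/person#"
NS_J = "http://www.zefhemel.com/ont/job#"


def resolve_property_uri(name, default_ns, prefixes):
    """Map a Python attribute name to a property URI (hypothetical helper)."""
    prefix, underscore, local_name = name.partition("_")
    # A registered prefix followed by an underscore overrides the default.
    if underscore and prefix in prefixes:
        return prefixes[prefix] + local_name
    # Assumption: anything else, including names with an unregistered
    # prefix, is resolved against the default namespace unchanged.
    return default_ns + name


prefixes = {"p": NS_P, "j": NS_J}

age_uri = resolve_property_uri("age", NS_P, prefixes)
description_uri = resolve_property_uri("j_description", NS_P, prefixes)
```

Here `age_uri` is `http://www.zefhemel.com/ont/person#age` and `description_uri` is `http://www.zefhemel.com/ont/job#description`, matching the two cases described above.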
That's it, that's all there is to it, and I think it's pretty neat. I will now work on the querying capabilities, but I think it's already quite nice like this.
If you want to play around with it, you can do a Subversion checkout from http://svn.zefhemel.com/pyrdf, or you can just visit that address and download it with your browser. You need rdflib to run it; it is a separate package, so install it if your Python setup doesn't already include it.