Archive for the ‘RDF’ Category

Sir Tim Inaugural Lecture

Wednesday, March 14th, 2007

Just watching the live video feed of Prof. Sir Tim Berners-Lee's inaugural lecture in the Electronics and Computer Science department at Southampton Uni. I can't see the slides which is a nuisance. I thought I'd type up a few notes as I listen.

He started off with some wishy-washy guff about engineering versus analysis of network systems. And creativity, which is part of engineering.

Now he's found his feet a bit more. I thought it was amusing that he was trying to talk about Web 2.0 sites but without mentioning the actual term "Web 2.0".

He made a big point about macroscopic social elements (the web community) deriving from microscopic ones (URI schemes and HTTP and HTML and stuff and junk). (This is exactly the point I make when trying to explain where TBL fits into the history of the web: TBL is not responsible for the massive cultural system built on top of the web. It's mere chance that his distributed hypermedia system took root. A lot of people can't distinguish the utility of the web now from the seed protocols (not even ideas, as such, which were already established) that TBL gave us.)

He mentioned something about email and how it's abused.

The web – what it was intended to do and the primary concepts that drive it. Layering technologies on top of one another. Wow. Abstraction.

The web is an information space. A mapping between a URI and some information.

PageRank. Google. Deriving macroscopic web usage models from something very simple like number of links. Audio went a bit rubbish for a while but it's back now.
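
To make the "something very simple like number of links" point concrete, here is a minimal sketch of the textbook PageRank power iteration in Python. The four-page link graph, the damping factor of 0.85 and the iteration count are illustrative assumptions, not anything shown in the lecture, and this is not a description of what Google actually runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages using nothing but their outgoing links (power iteration)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = links.get(page, [])
            # A page with no outgoing links spreads its rank over all pages.
            recipients = targets if targets else list(pages)
            share = damping * rank[page] / len(recipients)
            for target in recipients:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
```

Here `best` comes out as the page that the others link to, which is the microscopic-to-macroscopic step in miniature.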

Wiki. How microscopic behaviour like collaborative editing grows into macroscopic systems like Wikipedia. This will revolutionise democracy and politics.

Blogs. Woo. The Blogosphere. May be rubbish. Who knows. Probably both rubbish and excellent at the same time.

Information in HTML format is not manipulatable. So we need a semantic web to re-use data as data. RDF, OWL, SPARQL. Use URIs for things rather than web pages. And the relationships between overhead projectors and colours. Merge and query is very easy. FOAF networks. (Yay! I know all about those. Oh, I have to rebrand Mauvespace btw, following a conversation with a friend of a friend who is an IP lawyer. Just need to think of a name.)
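
The "merge and query is very easy" claim can be sketched in a few lines of Python. This is a toy, assuming graphs are plain sets of (subject, predicate, object) tuples with no blank nodes; the example.org people are made up, and only the FOAF namespace and the `foaf:knows` and `foaf:name` terms are real.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

ALICE = "http://example.org/people#alice"  # hypothetical URIs for things
BOB = "http://example.org/people#bob"
CAROL = "http://example.org/people#carol"

# Two graphs published independently, e.g. on two different homepages.
graph_a = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
}
graph_b = {
    (BOB, FOAF + "name", "Bob"),
    (BOB, FOAF + "knows", CAROL),
}

# Because both graphs use the same URI for Bob, merging is just set union.
merged = graph_a | graph_b


def objects(graph, subject, predicate):
    """All objects of arcs leaving `subject` along `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def friends_of_friends(graph, person):
    """People known by the people `person` knows (excluding `person`)."""
    found = set()
    for friend in objects(graph, person, FOAF + "knows"):
        found |= objects(graph, friend, FOAF + "knows")
    return found - {person}
```

Neither source graph alone can answer the friend-of-a-friend question; the merged one can, which is the whole attraction.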

Some websites are tables, some are trees, some are "hypercubes". (He keeps calling tables and matrices "rectangles". That strikes me as such a cute web-kiddy thing to do, labelling arrays as "Square, daddio" while graphs are new and "cool".)

Something to do with trees and top-down OOP. (*shrug*)

What shape is the Internet? It's a net. (It's not. It's a fluffy cumulus cloud. Every first-year computer science student knows that.) It's robust.

The web is a web. What shape is that? What does that mean? (I would have thought it's a directed graph). It should be shaped like the world.

Common vocabularies for describing things with RDF. You get local collaboration to produce specific ontologies and you use some terms from global ontologies. Spatial things can be used in lots of applications. Overlapping ontologies.

The web is actually fractal. Structure at all different levels. (Fractal is not the right word).

Much less work is done in describing ontologies than using them.

Web Science includes

  • User interface for the web. SemWeb doesn't have this.
  • Building resilient systems. Against slashdotting, attack. At an architectural level.
  • New devices – handheld and large screens.
  • Creativity. Connecting people and making them more effective. Allowing them to understand one another; letting half-formed ideas in two different people's heads on different sides of the planet connect.

Right, done.

It was a whistlestop tour of web science I suppose, but I didn't really feel that it was particularly insightful. Of course I'm not in the business of rationalising the way that the web works. I just program. I think TBL has to try to rationalise it because that's what he's famous for; at a personal level he probably feels people look to him to explain the ways of the beast. But of course he didn't create it. Mainly people just create a web app and it either catches on or not, or it needs a bit of pointling to actually make it work the way people want it to. With a lot of Web 2.0 sites, it just involves a huge amount of development to get to the point of having a web app that works well enough and scales, and then creative ideas can be tried out on pieces, beta tested and deployed.

This is exactly how the web started and evolved and I don't think I understand how we got to where we are now any better than I did before. I don't think it's possible to either; the web evolves in parallel across the globe. It doesn't have a single history behind it or a single motivation driving it. Deconstructing the web appears to me to be analogous to Psychohistory.

There is a podcast available, but don't feel obliged.

Writing an RSS client

Tuesday, February 13th, 2007

Interestingly, my latest paid project is to build an RSS reader. I am doing this not out of bloody-minded determination to reinvent the wheel, and I would be perfectly happy to adapt an existing project to work in the way I want, but none of the apps I have seen or tried does what I want it to do.

This project is a desktop feed notifier. It will poll feeds and pop up messages (non-intrusively) either when it starts or when it first sees them.
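
The core of such a notifier, deciding which items are being seen for the first time, might look like the following Python sketch. It assumes plain RSS 2.0 without namespaces and works on feed text that has already been fetched; the feed contents, the `new_items` name and the intranet.example URLs are all made up for illustration, and the polling loop and the pop-up itself are left out.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen):
    """Return (title, link) pairs for items not seen before, and remember them."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer the guid as identity; fall back to the link if there is none.
        key = item.findtext("guid") or item.findtext("link")
        if not key or key in seen:
            continue
        seen.add(key)
        fresh.append((item.findtext("title", default=""),
                      item.findtext("link", default="")))
    return fresh


# Hypothetical feed as returned by two successive polls.
FEED_V1 = (
    '<rss version="2.0"><channel><title>Service notices</title>'
    "<item><title>Mail server restart</title>"
    "<link>http://intranet.example/notices/1</link>"
    "<guid>notice-1</guid></item>"
    "</channel></rss>"
)
FEED_V2 = FEED_V1.replace(
    "</channel>",
    "<item><title>Backup complete</title>"
    "<link>http://intranet.example/notices/2</link>"
    "<guid>notice-2</guid></item></channel>",
)
```

Calling `new_items` with the same `seen` set on every poll yields each notice exactly once; persisting that set between runs would decide whether old items pop up again at startup.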

I have mentioned my views on RSS before, but happily they don't conflict with this project. Because this is aimed at intranet service notifications there is a contract between producer and consumer, not merely a shared protocol.

I think that one good aspect of RSS is its ubiquity. Several apps already in use on this intranet are RSS-aware and can be wired into this system with a minimum of work.

Without wanting to revisit the previous arguments too much, I might as well summarise them for completeness. I can envisage only two useful strategies for a syndication format:

  • Fixed contract: Specify a unique set of obligations for producer and consumer including both syntax and semantics, e.g. RSS 0.90.
  • Negotiated contract: Specify obligations of syntax, but encourage producers to offer a complete semantic representation, and allow consumers to build a customised syndication from within it, e.g. RDF.
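
The negotiated contract can be pictured as the consumer picking the properties it cares about out of whatever the producer publishes. A rough Python sketch, assuming the published data is a list of triples; the RSS 1.0 and Dublin Core namespaces are real, while the `OPS` vocabulary, the notice and the `customised_view` helper are hypothetical.

```python
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
OPS = "http://intranet.example/ops#"  # hypothetical local vocabulary

NOTICE = "http://intranet.example/notices/1"

# The producer publishes everything it knows about each item.
published = [
    (NOTICE, RSS + "title", "Mail server restart"),
    (NOTICE, RSS + "link", NOTICE),
    (NOTICE, DC + "creator", "ops team"),
    (NOTICE, OPS + "severity", "low"),
]


def customised_view(triples, wanted_predicates):
    """Build a per-item record containing only the predicates a consumer wants."""
    view = {}
    for subject, predicate, obj in triples:
        if predicate in wanted_predicates:
            view.setdefault(subject, {})[predicate] = obj
    return view


# One consumer only wants titles and severities; another could pick differently.
pager_view = customised_view(published, {RSS + "title", OPS + "severity"})
```

Under a fixed contract the `OPS` severity would have nowhere to live; here a consumer that doesn't know the term simply never asks for it.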

Mauvespace vs Facebook

Monday, January 29th, 2007

I find Facebook very annoying. I can't seem to make it do anything useful. It gets certain key things stunningly wrong, making assumptions which don't hold in my case and which make it seem broken. I can't find any friends on it and I'm getting bombarded by junk which isn't applicable to me. I can't find options to do many of the things which I'm sure are possible.

However, I'm impressed with what Facebook is supposed to do. It's far and away the closest of the social networking sites to what Mauvespace aims to do. That in itself is interesting. I didn't invent very many of the concepts regarding what Mauvespace can do: many of the suggestions about the combined expressibility of RDF vocabularies come from the web. However, it occurs to me that a fair number of those might have been inspired by Facebook or others, and Mauvespace merely inherits those suggestions (albeit mostly unimplemented as yet).

Specifically, things like annotating not only pictures as depicting a person, but regions of pictures, are things that I've read specifically about in comments describing RDF ontologies. I'm surprised Facebook isn't semantic.

Still, several key factors differentiate Mauvespace as a social network even if it could do everything Facebook can (and the eventual plan is certainly to implement some of those things):

  • It's open source.
  • It's entirely themable.
  • It's semantic.
  • It's distributed and interoperable (as a result of being semantic).

Not all of these will matter to all people. Many people I've spoken to simply say "I'm interested, but only because I tried x and didn't like it." But regardless of what matters to other people, these things are exactly the most important things to me personally:

  • I can make it work the way I want it to (as can anyone else).
  • I can make it look as pretty as I like without resort to hackery (as can anyone else).
  • I can use whatever data users make available in any way I see fit.
  • No for-profit organisation controls my data, forces me to use their system to talk to my friends, forces my friends to use their system to talk to me, requires me to pay them money or requires me to view their ads.

I don't think any proprietary social networking site could ever meet these requirements. That is why Mauvespace exists. Or very soon will.

Burning SQL bras

Thursday, January 25th, 2007

RDF makes me feel so liberated now that I've actually got it all up and running! Storing data in a freeform RDF graph is so easy when you don't have to worry about setting up tables or writing queries or anything. Add an arc, remove an arc. It's that simple.
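
For anyone who hasn't tried it, "add an arc, remove an arc" amounts to something like this Python sketch. It is a deliberately tiny in-memory stand-in, not the RDF store I am actually using; the `Graph` class and the short `me` and `foaf:nick` strings are illustrative only.

```python
class Graph:
    """A toy RDF-style graph: a set of (subject, predicate, object) arcs."""

    def __init__(self):
        self._arcs = set()

    def add(self, subject, predicate, obj):
        self._arcs.add((subject, predicate, obj))

    def remove(self, subject, predicate, obj):
        # discard() means removing a missing arc is not an error.
        self._arcs.discard((subject, predicate, obj))

    def objects(self, subject, predicate):
        return sorted(o for s, p, o in self._arcs
                      if s == subject and p == predicate)


graph = Graph()
# No schema to declare first: a property can simply have several values...
graph.add("me", "foaf:nick", "mauve")
graph.add("me", "foaf:nick", "mauveweb")
# ...and dropping one of them is a single call.
graph.remove("me", "foaf:nick", "mauveweb")
```

The point is what is absent: no table definition, no migration when a new property turns up, and no query language for the simple cases.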

Liberation is not enough of an incentive on its own, perhaps, but the fact that your web applications are trivially Semantic Web-ready when using an RDF database means that this is definitely the way I will be writing web applications from now on! (Subject to caveats about speed and optimisation and legacy code and pure appropriateness).

In particular, I've been able to write a single, easy-to-use class that displays a configurable form, pre-filled from the model, and saves changes back to the model on submit. Code for this looks like this:

// Bind a form to the resource $me within the RDF model $model.
$form = new RDFForm($model, $me);
$form->setAction('profile.php?view=basics');

// Map FOAF properties to labelled form fields.
$form->addMultipleFieldMapping(vocab('foaf:name'),
        new StringLiteralProperty(_('Name')));

$form->addFieldMapping(vocab('foaf:title'),
        new StringLiteralProperty(_('Title'), 4));

$form->addFieldMapping(vocab('foaf:givenName'),
        new StringLiteralProperty(_('First Name')));

$form->addFieldMapping(vocab('foaf:surname'),
        new StringLiteralProperty(_('Surname')));

$form->addMultipleFieldMapping(vocab('foaf:nick'),
        new StringLiteralProperty(_('Nickname')));

// Gender is chosen from a fixed list of options.
$gender = new LiteralEnumProperty(_('Gender'));
$gender->addOption('male', _('Male'));
$gender->addOption('female', _('Female'));
$form->addFieldMapping(vocab('foaf:gender'), $gender);

// On submit, save the changes back to the model.
if (isset($_POST['save'])) {
    $form->updateModel();
}

// Display the form, pre-filled from the model.
$form->render();

Of course, this is possible with relational databases too given enough layers of wrappers, but this approach makes it trivial to implement new fields and new field types. Here is a screenshot of how this appears on the page.

[Image: RDF Form Screenshot]
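
The idea behind the form class is simple enough to sketch independently of the PHP. The Python below is not the RDFForm implementation, just a minimal illustration of the same pre-fill and save-back cycle under some assumptions: the model is a set of triples, every field is single-valued, and `FormSketch` and its method names are invented for the example.

```python
class FormSketch:
    """Maps predicates on one subject to form fields (single-valued only)."""

    def __init__(self, model, subject):
        self.model = model      # a set of (subject, predicate, object) triples
        self.subject = subject
        self.fields = []        # (predicate, label) in display order

    def add_field_mapping(self, predicate, label):
        self.fields.append((predicate, label))

    def initial_values(self):
        """Values to pre-fill the form with, read straight from the model."""
        values = {}
        for predicate, _label in self.fields:
            found = sorted(o for s, p, o in self.model
                           if s == self.subject and p == predicate)
            values[predicate] = found[0] if found else ""
        return values

    def update_model(self, posted):
        """Replace the arcs for each submitted field with the posted value."""
        for predicate, _label in self.fields:
            if predicate not in posted:
                continue
            old = {t for t in self.model
                   if t[0] == self.subject and t[1] == predicate}
            self.model.difference_update(old)
            if posted[predicate]:
                self.model.add((self.subject, predicate, posted[predicate]))


model = {("me", "foaf:givenName", "Dan")}  # hypothetical starting data
form = FormSketch(model, "me")
form.add_field_mapping("foaf:givenName", "First Name")
form.add_field_mapping("foaf:surname", "Surname")
```

Adding a new field is one more `add_field_mapping` call, with no change to any storage layer, which is the property the paragraph above is getting at.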

Atom and RDF

Wednesday, January 17th, 2007

I'm annoyed with Atom.

I was hoping to use Atom to describe a range of things within Mauvespace, such as blogs, logs and so on, but it appears that even though there is a mapping from Atom to RDF, there is no inverse mapping from that RDF vocabulary to Atom, because it is not universally possible to convert in that direction.

For example, the <atom:author> element mandates exactly one child element <atom:name>. Even if an RDF reasoner can assume an author has a name, it does not necessarily know what it is. Also, if you pull Atom data from two feeds written under different pseudonyms into an RDF model and then claim that two authors are the same, the model stops being able to distinguish which name each feed was written under, unless you add a vocabulary to subclass Atom authors as pseudonyms of FOAF people.

These may seem like gripes about the mapping, but it's more serious than that. It means that there is no bijection between an arbitrary RDF model and a valid Atom 1.0 document.
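
The pseudonym problem is easy to reproduce with a toy model. In this Python sketch the triples, the `ex:` predicate names and the pseudonyms are placeholders rather than the real Atom-to-RDF vocabulary, and "claiming two authors are the same" is modelled crudely by rewriting one node into the other.

```python
# Two feeds, each written under a different pseudonym.
triples = {
    ("feed1", "ex:author", "author1"),
    ("author1", "ex:name", "Pseudonym A"),
    ("feed2", "ex:author", "author2"),
    ("author2", "ex:name", "Pseudonym B"),
}


def merge_nodes(graph, keep, drop):
    """Treat `drop` as the same resource as `keep` by rewriting every arc."""
    def swap(node):
        return keep if node == drop else node
    return {(swap(s), p, swap(o)) for s, p, o in graph}


def author_names(graph, feed):
    """Names attached to the author(s) of `feed`."""
    authors = {o for s, p, o in graph if s == feed and p == "ex:author"}
    return sorted(o for s, p, o in graph if s in authors and p == "ex:name")


before = author_names(triples, "feed1")
merged = merge_nodes(triples, "author1", "author2")
after = author_names(merged, "feed1")
```

Before the merge each feed has exactly one author name, as Atom requires; afterwards both feeds report both names, and nothing in the model says which one belongs on which feed.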

I can see a few options:

  • Map RDF to Atom only. Construct a mapping from any sensible RDF model structure to Atom. That this would not be bijective means that the software could not import from Atom. An alternative RDF version would have to be provided to import triples.
  • Work instead with RSS 1.0, which is pure RDF but is widely considered deprecated and isn't as expressive as Atom anyway. Trivially, however, this is bijective.
  • Map Atom to RDF only. Create a separate Atom store and map into RDF only temporarily when I need to reason upon it or style it with the RDF template code. Lack of bijection means the store could not invariably re-export Atom intact. Synchronising updates bidirectionally between Atom and triple store becomes an issue.
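
As a rough illustration of the first option, a one-way RDF-to-Atom mapping has to refuse (or guess) whenever the model doesn't fit Atom's shape. This Python sketch covers only the author element; the Atom namespace is the real one, but the `ex:name` predicate and the `author_element` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
NAME = "ex:name"  # placeholder predicate, not a real vocabulary term


def author_element(triples, author):
    """Build an atom:author element, which needs exactly one atom:name."""
    names = sorted(o for s, p, o in triples if s == author and p == NAME)
    if len(names) != 1:
        raise ValueError(
            f"{author} has {len(names)} names; atom:author needs exactly one")
    element = ET.Element(f"{{{ATOM_NS}}}author")
    ET.SubElement(element, f"{{{ATOM_NS}}}name").text = names[0]
    return element


# A model that fits Atom's shape exports cleanly.
exported = author_element({("author1", NAME, "Pseudonym A")}, "author1")
```

An author with no known name, or with two, raises instead of exporting, which is exactly the case where a "sensible RDF model structure" would have to be defined more carefully.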

Plan for 2007

Monday, January 8th, 2007

New Year is a good time to look forward to the things we hope to achieve over the next year. So I thought I'd define now my main (technological) priorities for the year ahead so that I can get some sense of focus.

  1. Get up to speed on RDF and get using it in applications. I am not a total stranger to RDF but I've not used it at all so far. The main focus of my effort for now is a new project called Mauvespace. Mauvespace is an open-source web application that is a cross between a semantic CMS for personal homepages and a full social networking service. I don't want to hype it too much now though until there is something to show. But I hope very soon to roll up all of my homepage stuff from Mauveweb into Mauvespace, then throw it open to other people to use it for the same thing, either on my server or on their own. This frees up the mauveweb.co.uk domain, which could become a place for web projects. Sorry about all the 'Mauve's. I guess I'm not very imaginative with names. Although, it works as a brand, I suppose.
  2. Deploy some applications using Zope. My Python web applications are becoming increasingly Zope-like. The latest one I've been working on for a client is a self-contained web server, but that's partly because I wanted very careful handling of file uploads. I needed to remove file size and memory limits imposed by PHP, and I implement concurrent querying of the status of uploads, which allows me to provide AJAX progress bars. There are lots of parallels with Zope: that it's Python; that it's a web server; that any persistence is object-based (although in this application it's in-memory persistence; non-volatile data is retrieved from other network services mandated by the brief). Anyway, in 2007 I hope to transfer from ad-hoc Zope-like systems to Zope proper with all the advantages that brings. It's just a shame there have always been reasons not to so far. Unfortunately Mauvespace is PHP by necessity. PHP is the only language that enjoys widespread hosting support and I consider that vital.
  3. Hack Inkscape. Inkscape is of course hugely important to my work and as a result I've become quite involved with making sure it meets my needs, mainly through bug reporting, feature requesting, and so on. I would like to stretch my C++ legs and improve things, if I find time. Incidentally Inkscape 0.45 has been bug hunted and is moving to feature freeze very soon. The headline news is the Gaussian blur feature but there are a plethora of other improvements too.
  4. Continue the high standard of technical commentary on this blog :) Actually, I wish I could get it more organised and make it more accessible to people who aren't knowledgeable web developers. But it would be less personally useful to me if that were the case. So the status quo may have to suffice.
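
The upload handling mentioned in item 2 (streaming a file in without size limits while another request asks how far it has got) can be sketched in Python roughly as follows. This is not the client's code: `UploadProgress`, `receive_upload` and the 64 KB chunk size are assumptions for illustration, and the web server that would call them is left out.

```python
import io
import threading


class UploadProgress:
    """Thread-safe record of how many bytes each upload has received."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}  # upload_id -> [received, total]

    def start(self, upload_id, total):
        with self._lock:
            self._status[upload_id] = [0, total]

    def advance(self, upload_id, count):
        with self._lock:
            self._status[upload_id][0] += count

    def status(self, upload_id):
        """(received, total), or None for an unknown upload."""
        with self._lock:
            entry = self._status.get(upload_id)
            return tuple(entry) if entry else None


def receive_upload(upload_id, source, destination, total, progress,
                   chunk_size=64 * 1024):
    """Copy the upload in chunks so memory use stays flat for any file size."""
    progress.start(upload_id, total)
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        destination.write(chunk)
        progress.advance(upload_id, len(chunk))


progress = UploadProgress()
payload = b"x" * 200_000  # stand-in for a request body
stored = io.BytesIO()
receive_upload("upload-1", io.BytesIO(payload), stored, len(payload), progress)
```

A status request handled on another thread would call `progress.status("upload-1")` and return the pair as JSON for the progress bar to poll.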