Archive for January, 2007

PHP Gotcha

Tuesday, January 30th, 2007

While it has become quite common practice for me to berate PHP, I really never imagined I would come across actual rock-solid evidence that PHP is just one big practical joke the Zend people are playing on me. Noel Edmonds is probably hiding somewhere waiting to pop out and present a trophy, and then pour gunge all over all PHP developers and just generally be weird.

You know classes, right? Those things which are, by definition, not null? That are, in fact, the exact opposite of null?

<?php

class TestCls { };

$a = new TestCls();

print ($a == null) ? 'null': 'not null';

?>

Guess. Go on. I'll give you one guess what PHP prints.

Yes, I know it works if I use a === rather than an ==. But that doesn't mean the == behaviour isn't sick and wrong.

Mauvespace vs Facebook

Monday, January 29th, 2007

I find Facebook very annoying. I can't seem to make it do anything useful. It seems to get certain key things stunningly wrong, making assumptions which don't hold in my case and which make it seem broken. I can't find any friends on it and I'm getting bombarded by junk which isn't applicable to me. I can't find options to do many of the things which I'm sure are possible.

However, I'm impressed with what Facebook is supposed to do. It's far and away the closest of the social networking sites to what Mauvespace aims to do. That in itself is interesting. I didn't invent very many of the concepts regarding what Mauvespace can do: many of the suggestions about the combined expressibility of RDF vocabularies come from the web. However, it occurs to me that a fair number of those might have been inspired by Facebook or others, and Mauvespace merely inherits those suggestions (albeit mostly unimplemented as yet).

Specifically, things like annotating not only pictures as depicting a person, but regions of pictures, are things that I've read specifically about in comments describing RDF ontologies. I'm surprised Facebook isn't semantic.

Still, several key factors differentiate Mauvespace as a social network even if it could do everything Facebook can (and the eventual plan is certainly to implement some of those things):

  • It's open source.
  • It's entirely themable.
  • It's semantic.
  • It's distributed and interoperable (as a result of being semantic).

Not all of these will matter to all people. Many people I've spoken to simply say "I'm interested, but only because I tried x and didn't like it." But regardless of what matters to other people, these things are exactly the most important things to me personally:

  • I can make it work the way I want it to (as can anyone else).
  • I can make it look as pretty as I like without resorting to hackery (as can anyone else).
  • I can use whatever data users make available in any way I see fit.
  • No for-profit organisation controls my data, forces me to use their system to talk to my friends, forces my friends to use their system to talk to me, requires me to pay them money or requires me to view their ads.

I don't think any proprietary social networking site could ever meet these requirements. That is why Mauvespace exists. Or very soon will.

Burning SQL bras

Thursday, January 25th, 2007

RDF makes me feel so liberated now that I've actually got it all up and running! Storing data in a freeform RDF graph is so easy when you don't have to worry about setting up tables or writing queries or anything. Add an arc, remove an arc. It's that simple.
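
To make the "add an arc, remove an arc" idea concrete, here is a minimal sketch of a freeform graph held as a set of (subject, predicate, object) triples. It is written in JavaScript purely for illustration; the `Graph` class, the example URI and the literal values are all hypothetical and are not the store that Mauvespace actually uses.

```javascript
// A graph is just a set of arcs; there is no schema and no table setup.
class Graph {
  constructor() {
    this.arcs = new Map(); // serialised triple -> [subject, predicate, object]
  }

  static key(s, p, o) {
    return JSON.stringify([s, p, o]);
  }

  // Adding the same arc twice is harmless: the set keeps one copy.
  add(s, p, o) {
    this.arcs.set(Graph.key(s, p, o), [s, p, o]);
  }

  // Returns true if the arc existed and was removed.
  remove(s, p, o) {
    return this.arcs.delete(Graph.key(s, p, o));
  }

  // null acts as a wildcard for any position.
  find(s = null, p = null, o = null) {
    return [...this.arcs.values()].filter(
      ([ts, tp, tobj]) =>
        (s === null || s === ts) &&
        (p === null || p === tp) &&
        (o === null || o === tobj)
    );
  }
}

const g = new Graph();
const me = 'http://example.org/people/me'; // hypothetical subject URI

g.add(me, 'foaf:name', 'Example Person');
g.add(me, 'foaf:nick', 'example-nick');
g.add(me, 'foaf:nick', 'old-nick');
g.remove(me, 'foaf:nick', 'old-nick');

const nicks = g.find(me, 'foaf:nick').map(([, , o]) => o);
console.log(nicks);
```

A property with several values (two nicknames, say) needs no special handling here: it is simply two arcs with the same subject and predicate.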

Liberation is not enough of an incentive on its own, perhaps, but the fact that your web applications are trivially Semantic Web-ready when using an RDF database means that this is definitely the way I will be writing web applications from now on! (Subject to caveats about speed, optimisation, legacy code and pure appropriateness.)

In particular, I've been able to write a single, easy-to-use class that displays a configurable form, pre-filled from the model, and saves changes back to the model on submit. Code for this looks like this:

$form=new RDFForm($model, $me);
$form->setAction('profile.php?view=basics');

$form->addMultipleFieldMapping(vocab('foaf:name'),
        new StringLiteralProperty(_('Name')));

$form->addFieldMapping(vocab('foaf:title'),
        new StringLiteralProperty(_('Title'), 4));

$form->addFieldMapping(vocab('foaf:givenName'),
        new StringLiteralProperty(_('First Name')));

$form->addFieldMapping(vocab('foaf:surname'),
        new StringLiteralProperty(_('Surname')));

$form->addMultipleFieldMapping(vocab('foaf:nick'),
        new StringLiteralProperty(_('Nickname')));

$gender=new LiteralEnumProperty(_('Gender'));
$gender->addOption('male', _('Male'));
$gender->addOption('female', _('Female'));
$form->addFieldMapping(vocab('foaf:gender'), $gender);

if (isset($_POST['save']))
{
        $form->updateModel();
}

$form->render();

Of course, this is possible with relational databases too, given enough layers of wrappers, but this approach makes it trivial to implement new fields and new field types. Here is a screenshot of how this appears on the page.

[Screenshot: RDF form]

File uploads

Tuesday, January 23rd, 2007

I have briefly mentioned the work I was doing to wrap file uploading in AJAX for a proper user experience. Browser-based file uploads have been neglected over the past few years.

In client terms, file uploads work in almost exactly the same way as they have always done: the page blocks while the data is posted, and a very small progress bar shows up in the status bar. This is a user interface disaster for big files.

On the server side, the situation is more varied, but there is often little support for streaming of file uploads. In PHP, file uploads are read wholly into memory, parsed and saved out to a temporary folder before a script even gets called. The request must fit within both PHP's file upload size limit and its memory limit. As far as I can tell, something similar happens in Zope although you can argue that Zope allows other standards for upload such as DAV and FTP natively. In plain CGI, of course, there is no handling of the uploads, so if you're using a CGI wrapper, it can do whatever you want to handle this. Perl's CGI.pm module allows a hook, at least. Python's cgi module doesn't, nor is it easy to subclass.

All in all, binding file uploads to form submissions, and the way common server-side languages then process them, is wholly inadequate as file sizes get large. File uploads are convenient because they are a commonly-supported fall-back, but the workarounds, although solving some of these problems, don't have the simplicity of a browser-native solution.

In my recent project I looked at ways of working around these limitations. The best workaround for the client-side problems I have found so far is to perform the upload in an <iframe>, using AJAX queries to present a progress bar. This still has problems, notably that it's one file at a time, both when choosing and when uploading. In Firefox I can actually perform two concurrent uploads in different <iframe>s, but the AJAX progress bar doesn't then update.

Server-side, I wrote the whole thing as a webserver so that the AJAX queries could talk directly to the thread streaming the upload. Additionally, I wrote my own parser to parse the uploaded data on the fly, so that the daemon knows what is uploading at any given stage. It works quite well, and the system is extensible in that it could be combined with a daemon that allows other forms of upload; feedback for these would also appear in the browser windows.
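
The on-the-fly parsing can be sketched roughly as follows. This is not the parser from the project (that one is part of a Python web server); it is a simplified JavaScript illustration of the technique, and the class name `MultipartProgressParser` is made up. It works on strings for brevity where a real parser would work on byte buffers, and it assumes every part carries at least one header, as browser uploads do.

```javascript
// Incrementally scans a multipart/form-data stream, counting body bytes
// per part so that a progress query can be answered mid-upload.
class MultipartProgressParser {
  constructor(boundary) {
    this.delimiter = '\r\n--' + boundary;
    this.buffer = '\r\n'; // pretend a CRLF precedes the stream so the first delimiter matches
    this.state = 'preamble';
    this.parts = []; // [{ headers, bytes, complete }]
    this.done = false;
  }

  // Feed the next chunk as it arrives from the socket.
  write(chunk) {
    this.buffer += chunk;
    let progressed = true;
    while (progressed && !this.done) progressed = this.step();
  }

  // Advance the state machine once; returns false when more data is needed.
  step() {
    const buf = this.buffer;
    if (this.state === 'preamble' || this.state === 'body') {
      const part = this.parts[this.parts.length - 1];
      const at = buf.indexOf(this.delimiter);
      if (at === -1) {
        // Keep back a tail that might be the start of a split delimiter.
        const safe = Math.max(0, buf.length - (this.delimiter.length - 1));
        if (this.state === 'body') part.bytes += safe;
        this.buffer = buf.slice(safe);
        return false;
      }
      if (this.state === 'body') {
        part.bytes += at;
        part.complete = true;
      }
      this.buffer = buf.slice(at + this.delimiter.length);
      this.state = 'after-delimiter';
      return true;
    }
    if (this.state === 'after-delimiter') {
      if (buf.length < 2) return false;
      if (buf.startsWith('--')) { // closing delimiter: upload finished
        this.done = true;
        return false;
      }
      this.buffer = buf.slice(2); // skip the CRLF after the delimiter
      this.state = 'headers';
      return true;
    }
    // state === 'headers': wait for the blank line that ends them
    const end = buf.indexOf('\r\n\r\n');
    if (end === -1) return false;
    this.parts.push({ headers: buf.slice(0, end).split('\r\n'), bytes: 0, complete: false });
    this.buffer = buf.slice(end + 4);
    this.state = 'body';
    return true;
  }
}

// Usage: feed a small two-part body in 7-character chunks.
const body =
  '--XyZ\r\nContent-Disposition: form-data; name="title"\r\n\r\nhello' +
  '\r\n--XyZ\r\nContent-Disposition: form-data; name="file"; filename="a.txt"\r\n\r\n0123456789' +
  '\r\n--XyZ--\r\n';
const parser = new MultipartProgressParser('XyZ');
for (let i = 0; i < body.length; i += 7) parser.write(body.slice(i, i + 7));
console.log(parser.parts.map((p) => p.bytes), parser.done);
```

Because the parser object always holds the current part's headers and byte count, a concurrent status request only has to read `parts` to report which file is in flight and how far along it is.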

Even so, I wish that file uploading was something people were thinking about more. It's central to so many web applications now.

There are numerous problems:

  • File uploads are synchronous. Downloads can happen in the background on their own, but uploads can't.
  • File uploads don't have a proper UI. Current browsers appear to show a tiny upload bar that isn't really very accurate and doesn't give data rates or estimated time remaining.
  • Uploads are chosen one at a time.
  • Javascript can't be used for polish. The model that has empowered Web 2.0 improvements is that of taking an existing HTML/HTTP model and allowing it to be controlled by Javascript. However, there is no way into the uploading or the file selection processes with Javascript.
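
The data rate and estimated time remaining mentioned in the second point are cheap to derive once the page can poll the server for bytes received. A minimal sketch, assuming the server reports a byte count and the total size is known from the request; the function name `uploadStats` and the sample format are hypothetical:

```javascript
// samples: [{ timeMs, bytes }] oldest first, as collected by polling.
// Returns percent complete, plus rate and ETA when they can be computed.
function uploadStats(samples, totalBytes) {
  if (samples.length === 0) {
    return { percent: 0, bytesPerSecond: null, secondsRemaining: null };
  }
  const first = samples[0];
  const last = samples[samples.length - 1];
  const elapsedSeconds = (last.timeMs - first.timeMs) / 1000;

  // The rate is undefined until two samples at different times exist.
  const rate = elapsedSeconds > 0 ? (last.bytes - first.bytes) / elapsedSeconds : null;

  return {
    percent: (100 * last.bytes) / totalBytes,
    bytesPerSecond: rate,
    // A stalled upload (rate of zero) has no meaningful ETA.
    secondsRemaining: rate > 0 ? (totalBytes - last.bytes) / rate : null,
  };
}

const stats = uploadStats(
  [
    { timeMs: 0, bytes: 0 },
    { timeMs: 2000, bytes: 500000 },
  ],
  2000000
);
console.log(stats);
```

Passing only the most recent few samples gives a rate that follows current network conditions rather than the average over the whole upload.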

The most general solution I can see would provide a Javascript API for uploading. This would allow Javascript to show a (native) file chooser dialog, and instruct the browser on what to do with the files it returns. POST or PUT to the origin server seem useful, as does FTP upload. Clearly there are security concerns, but I fail to see how, as long as Javascript may instigate an operation, read upload statistics, but not read the filesystem, this presents a problem.

Perhaps an AJAX-style API could be along these lines:

// configure a native dialog to present to the user
var ufc = new UploadFileChooser();
ufc.setAcceptableFileTypes(['image/jpeg', 'image/png']);
var uploads = ufc.chooseFiles();

for (var i = 0; i < uploads.length; i++)
{
        var u = uploads[i];
        u.onreadystatechange = doSomething; // callback
        // this URL is constrained to the origin server to prevent XSS
        u.beginHttpPost('http://example.com/upload');
}

After this, the user could close the tab or leave the page, and the browser would upload the files in the background, perhaps with a progress bar appearing within the Downloads window. Note that it could queue the files rather than uploading them all at once, depending on user settings. The Javascript, and indeed the user, should be able to request that an upload is aborted. The Javascript should also be able to query the upload, using the object reference provided.

Atom and RDF

Wednesday, January 17th, 2007

I'm annoyed with Atom.

I was hoping to use Atom to describe a range of things within Mauvespace, such as blogs, logs and so on, but it appears that even though there is a mapping from Atom to RDF, there is no inverse mapping from that RDF vocabulary to Atom, because it is not universally possible to convert in that direction.

For example, the <atom:author> element mandates exactly one child element <atom:name>. Even if an RDF reasoner can assume an author has a name, it does not necessarily know what it is. Also, if you pull Atom data from two feeds written under different pseudonyms into an RDF model and then claim that two authors are the same, the model stops being able to distinguish which name each feed was written under, unless you add a vocabulary to subclass Atom authors as pseudonyms of FOAF people.
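
The pseudonym problem can be shown with a handful of triples. This is only an illustrative sketch: the node identifiers, the names, and the `atom:author` / `atom:name` predicates are stand-ins, not a real Atom-to-RDF vocabulary, and merging is modelled naively by renaming one node to the other.

```javascript
// Two feeds, each with its own author node and pseudonym.
const triples = [
  ['feedA', 'atom:author', 'authorA'],
  ['authorA', 'atom:name', 'Pseudonym One'],
  ['feedB', 'atom:author', 'authorB'],
  ['authorB', 'atom:name', 'Pseudonym Two'],
];

// Assert that two nodes are the same resource by rewriting one into the other.
function mergeNodes(graph, from, into) {
  const seen = new Set();
  const merged = [];
  for (const triple of graph) {
    const renamed = triple.map((term) => (term === from ? into : term));
    const key = JSON.stringify(renamed);
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(renamed);
    }
  }
  return merged;
}

// All names reachable from a feed's author(s).
function authorNames(graph, feed) {
  const authors = graph
    .filter(([s, p]) => s === feed && p === 'atom:author')
    .map(([, , o]) => o);
  return graph
    .filter(([s, p]) => authors.includes(s) && p === 'atom:name')
    .map(([, , o]) => o);
}

const before = authorNames(triples, 'feedA');
const after = authorNames(mergeNodes(triples, 'authorB', 'authorA'), 'feedA');
console.log(before, after);
```

Before the merge, feedA's author has exactly one name, which is what <atom:author> requires. Afterwards the same query returns both pseudonyms, and nothing in the graph says which one belongs in which feed.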

These may seem like gripes about the mapping, but it's more serious than that. It means that there is no bijection between an arbitrary RDF model and a valid Atom 1.0 document.

I can see a few options:

  • Map RDF to Atom only. Construct a mapping from any sensible RDF model structure to Atom. That this would not be bijective means that the software could not import from Atom. An alternative RDF version would have to be provided to import triples.
  • Work instead with RSS 1.0, which is pure RDF but is widely considered deprecated and isn't as expressive as Atom anyway. Trivially, however, this is bijective.
  • Map Atom to RDF only. Create a separate Atom store and map into RDF only temporarily when I need to reason upon it or style it with the RDF template code. Lack of bijection means the store could not invariably re-export Atom intact. Synchronising updates bidirectionally between Atom and triple store becomes an issue.

PHP4 must die

Tuesday, January 9th, 2007

Sitting down this evening to code some PHP from scratch after a couple of months of working exclusively in Python, I am stunned to realise how bad plain PHP (PHP4, we're talking about) is. The shop code provides a fairly comprehensive framework on top of standard PHP, wrapping database, output and error handling. Without it, PHP is so much more dreadful than I remember.

I'm actually amazed that PHP doesn't print a stack trace when there's an error. You're clearly not supposed to write functions.

Perhaps I will make the jump to PHP5. There are compatibility issues even now, but maybe it's simply time.

Plan for 2007

Monday, January 8th, 2007

New Year is a good time to look forward to the things we hope to achieve over the next year. So I thought I'd define now my main (technological) priorities for the year ahead so that I can get some sense of focus.

  1. Get up to speed on RDF and get using it in applications. I am not a total stranger to RDF but I've not used it at all so far. The main focus of my effort for now is a new project called Mauvespace. Mauvespace is an open-source web application that is a cross between a semantic CMS for personal homepages and a full social networking service. I don't want to hype it too much now though until there is something to show. But I hope very soon to roll up all of my homepage stuff from Mauveweb into Mauvespace, then throw it open to other people to use it for the same thing, either on my server or on their own. This frees up the mauveweb.co.uk domain, which could become a place for web projects. Sorry about all the 'Mauve's. I guess I'm not very imaginative with names. Although, it works as a brand, I suppose.
  2. Deploy some applications using Zope. My Python web applications are becoming increasingly Zope-like. The latest one I've been working on for a client is a self-contained web server, but that's partly because I wanted very careful handling of file uploads. I needed to remove file size and memory limits imposed by PHP, and I implemented concurrent querying of the status of uploads, which allows me to provide AJAX progress bars. There are lots of parallels with Zope: that it's Python; that it's a web server; that any persistence is object-based (although in this application it's in-memory persistence; non-volatile data is retrieved from other network services mandated by the brief). Anyway, in 2007 I hope to transfer from ad-hoc Zope-like systems to Zope proper with all the advantages that brings. It's just a shame there have always been reasons not to so far. Unfortunately Mauvespace is PHP by necessity. PHP is the only language that enjoys widespread hosting support and I consider that vital.
  3. Hack Inkscape. Inkscape is of course hugely important to my work and as a result I've become quite involved with making sure it meets my needs, mainly through bug reporting, feature requesting, and so on. I would like to stretch my C++ legs and improve things, if I find time. Incidentally Inkscape 0.45 has been bug hunted and is moving to feature freeze very soon. The headline news is the Gaussian blur feature but there are a plethora of other improvements too.
  4. Continue the high standard of technical commentary on this blog :) Actually, I wish I could get it more organised and make it more accessible to people who aren't knowledgeable web developers. But it would be less personally useful to me if that were the case. So the status quo may have to suffice.

Cineworld Cinemas

Friday, January 5th, 2007

Cineworld Cinemas' website has been revamped again recently. It was not all that long ago that it was last done, but it has frequently had performance problems, which leads me to believe that this is why it has been redone (more or less from scratch). I use our local Cineworld Cinema a lot. I saw 37 films there last year. This stuff matters a lot to me.

This makes it the third iteration in a row with severe accessibility and usability problems.

  1. The earliest website I saw was static, but ugly with a large spinning raytraced star. This was their branding style at the time. Although it had weekly film times, you could not book online. You had to phone a telephone number which had a horrific voice recognition system to book. Film times were displayed in one weekly timetable, by cinema.
  2. This was replaced by a much more contemporary website in their new branding style with AJAX drop-down menus for booking and searches for film times. This was clumsy and unintuitive; the menus looked exactly like tabs, and you were supposed to select one item from each tab/menu – cinema, film, showing, number of tickets – before moving on to book. The link I needed was a less prominent "What's On" at the top of the page to get showing times for the week ahead. However there was no way to bookmark the showing times for my local cinema, because it was a form POST. Most people are unlikely to want to search for their nearest cinema every time they are thinking of going! As I mentioned, this site ground to a halt regularly.
  3. The new one looks similar but works even worse. There are three somewhat cryptic tab/buttons called Cinemas, Films and Dates, plus a larger button saying "Find out what's on here and book now" which doesn't do anything. Cinemas takes you to a horrid Flash map to select region and cinema, but will then display showing times for today only: much less useful than a week's timetable. But it can now be bookmarked. Films lists all films that are showing at Cineworld Cinemas. But not necessarily cinemas anywhere near me. Dates takes you, via the Flash region selection map, to a screen which lets you pick one date, one time, and one cinema to see which films are on. It then ignores the cinema you chose and displays film times for all cinemas in the "region" (19 cinemas covering the whole of the South of England). The page title, for the whole site, is "Cinematheque1". I can't operate this site on my smartphone, perhaps because it doesn't support the latest versions of Flash.

I just find it bizarre that their website should get so steadily worse, especially when Odeon was so strongly criticised for lack of accessibility.

Big World Travelog

Wednesday, January 3rd, 2007

I finished updating my brother's blog over New Year, and I'm now quite happy with it. I spent ages on a graphical title block in Inkscape. I'm not totally satisfied with the caricatures, which are a little more cartoony than I would have liked. The new theme is based on K2, which I do think is pretty.

I had a strange bug with the Google Maps on the gallery which, I found after a bit of searching, was caused by Google Maps not supporting embedding within XHTML. But I'm pleased that the thumbnailing/unthumbnailing works so well. Previously the map enlarged and reduced; this is much less intrusive in either state.

I did have to disable all of K2's shiny Javascript features to make it work with my Javascript (for map markers) :( But they were a UI disaster anyway. One of the advantages to JS over Flash is that it allows us not to create horrific new UI paradigms. So going to great lengths to do so is missing the point.