File Uploads in Firefox 3.5

June 27th, 2009

Apparently, improved developer control of browser-native file uploads, something I wishlisted back in 2007, is going to be available in the upcoming Firefox 3.5.

RSS: Error-prone

June 24th, 2009

I subscribe to only about a dozen RSS or Atom feeds, but more than half of them suffer from one problem or another.

  • Intermittently dumping a dozen duplicate posts.
  • Dumping a dozen duplicate posts on every refresh.
  • Duplicating the most recent post on every refresh.
  • Double-escaping HTML entities, so I see &ldquo;, &rdquo;, &hellip; and suchlike in post names.
  • XML syntax errors causing total feed outage until some improperly encoded post drops off the feed.
  • <pre> code snippets that have lost their formatting.
  • And, of course, the occasional snippet of HTML that doesn't work as intended when removed from the context of the original HTML document and embedded in RSS.

I often have to search for Pipes to get a useful feed, which is a consequence of the way RSS specifies only a data format, not an obligation on producers, an architectural flaw I've discussed before.

But quite aside from this, it seems that a significant proportion of feeds aren't implemented properly.

Obviously we can blame developers for bugs, but the design of RSS may well be a contributing factor. The process of encapsulating HTML fragments in XML is not as straightforward as it looks. The requirement for a unique ID for each post at first glance does not look onerous. But does the ID correspond to the specific version of a post? Or does it correspond to the current version, however it may have changed since it was first published?
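To make the escaping pitfall concrete, here is a minimal sketch in modern Python using only the standard library. The sample title and the variable names are invented for illustration, and it does not model any particular feed generator. It shows a title escaped once (correct), a title that was already HTML-encoded and then escaped again (the double-escaping symptom), and an HTML-encoded title dropped into XML unescaped (the total-outage symptom).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# What the reader is supposed to see (hypothetical sample title).
title = "“Quotes” & more…"
# The same title as an HTML fragment, using HTML named entities.
html_title = "&ldquo;Quotes&rdquo; &amp; more&hellip;"

# Plain text escaped exactly once for XML: round-trips cleanly.
once = "<title>%s</title>" % escape(title)
decoded_once = ET.fromstring(once).text

# HTML-encoded text escaped again for XML: the feed reader decodes one
# layer and displays the raw entity names to the user.
twice = "<title>%s</title>" % escape(html_title)
decoded_twice = ET.fromstring(twice).text

# HTML-encoded text not escaped at all: &ldquo; is not defined in XML,
# so the whole document fails to parse.
try:
    ET.fromstring("<title>%s</title>" % html_title)
    unescaped_parses = True
except ET.ParseError:
    unescaped_parses = False

print(decoded_once)
print(decoded_twice)
print(unescaped_parses)
```

The producer has to know, for every field, whether it holds plain text or an HTML fragment, and escape accordingly. Nothing in the data format itself catches the mistake.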

RSS may be useful, but it should also just work, and it doesn't. Developers and standardistas alike should start thinking why.

Sample Code for Employers

June 10th, 2009

If you are looking for a job programming, you need to demonstrate to a potential employer that more than anything else, you will be profitable. In terms of programming, profitable code is robust, is produced quickly, is readable, understandable and maintainable by the rest of the team, and depending on the job you're applying for, may have to be secure or efficient too.

As an employer I consider it vital to see a sample of a developer's code before employing them - it's the most reliable way of assessing how competent a candidate is. The standard of code I've encountered from candidates over the past few weeks was generally weak. If you want to stand out, you may be interested in a few tips. I've tried to put these in order of importance.

  1. Submit only your own code. If you worked on the code with anyone else, it's worthless to me.
  2. Submit your best code. If you have more than one project, submit the best. I'm liable to judge you on the worst. The most recent code you've written is normally the best, assuming your programming skills have improved over time.
  3. Make it easy to run the code. I can't run most of the code I receive. Maybe it runs, but generally it's not clear how I get it running (web applications are generally hard to run). I am interested most in the code itself, not seeing the program run, but it's a big bonus if I can run it. So why not include an INSTALL file documenting the process? For web applications, you could include database dumps with sample data, or swap your application onto SQLite and include the database file. But an even easier way is just to get it hosted somewhere and send a link.
  4. Submit code that does something interesting. A lot of code is either boilerplate or performs very common tasks. For example, a web application that takes input from a user, puts it into a database, retrieves data from a database, and outputs it into a page is the simplest web application it's possible to write. It's covered step-by-step in any web programming textbook. If that's all the application does, it had better be pretty dazzlingly tidy code. But I prefer to see code that is outside the scope of textbooks.
  5. Use plenty of third-party libraries. The more libraries I see neatly integrated into your software, the more efficient you look as a programmer. There can be reasons to re-invent the wheel in practice, such as to overcome license restrictions, but when you're submitting sample code to employers it makes you look inefficient. Moreover, when I see code draw on more and more appropriate libraries (or web APIs, or data sets) it also means that you know what's available and you're thinking about how to combine them creatively. (Incidentally, if you bundle third-party code with the source you supply, put it in a directory called "lib" or something so that I can see to ignore it.)
  6. Write good HTML. I'm less tolerant of bad HTML than a web browser is. I'm not going to pass or fail you purely for HTML that's not standards compliant because it's enormously widespread, relatively low-impact, and fairly easy to teach - but you're applying for a job writing software that outputs in a well-defined language. It comes across better if you're actually outputting in that language, and not some misinterpretation of it. It begs the question, would you do that for any other data format or protocol? Anyway, bad HTML breeds bugs.

If you can't find code that meets the above criteria, why not write something especially? It's possible you could write something in a weekend that can improve your job prospects significantly.

But if you're not employed at the moment, and you're looking for a job as a programmer, you should be constantly either writing code or reading articles on the web about writing code. Employers can teach you skills on the job, but it costs money to do this, and that's money that won't be going into your salary.

Job Opportunities

April 7th, 2009

My company, Mauve Internet, is looking to take on staff in the Chichester/Portsmouth area. I am looking for a Django developer, so if you have experience with that, or perhaps only a little Python experience but are enthusiastic to learn more, please check out the job details or perhaps get in touch.

Mauve Internet is currently based in Chichester but I've been looking at offices in Portsmouth and Havant too; perhaps I'm even leaning that way. Also, the job spec I've written up is fairly narrow. In truth, we may have positions open for a much wider range of talents and levels of experience in the field of web design and web development. I'd be happy to consider all applications accompanied by a CV.

Exercise: Recreate an attractive website

March 3rd, 2009

I pick up a fair bit of business from companies that already have a website created by, say, the boss's son's friend who has gone off to university, but also I'm forced to use hundreds of websites that have cringingly poor design or cripplingly poor usability. I am very supportive of amateur and beginner web designers striving to improve, but when I browse the web it's obvious there are still very many web designers who don't understand what they need to do to improve, and turn out poor websites time and again. It's a widening gap: while the beaten track consists of websites that scream 2009 in both look and feel, many amateur websites are stuck in 1999.

Anyone interested in learning to design websites would do well to study the existing web in a bit more detail - look and learn, as well as practice! I'm going to present a simple exercise a web designer of any level can benefit from.

Instructions

  1. Choose a topic. Perhaps a hobby, or a movie; something you are passionate about is best, but definitely something that you can write at least a few pages about.
  2. Pick a website you personally like a lot, but nothing too simple. The idea is to stretch yourself, so pick something elaborate.
  3. Use the design of the website you have chosen as the template for a new website about your chosen topic.

What you need to end up with is an attractive website that takes a lot of inspiration or perhaps even outright rips off the website you chose as a template - but recreated from scratch. It's OK to explore a variation on the style, but don't skirt around some aspect that looks difficult. That's cheating. You might like to source images using Google or from a stock photo website like iStockphoto.

Pepsi Rebranding

February 12th, 2009

There's a hilarious document doing the rounds after turning up on reddit: a document detailing why Pepsi's new brand identity is so good. In terms of general relativity. And smileys.

Pepsi Rebranding Advice (PDF, 6MB)

As I've pointed out previously, the criteria for whether a logo is an improvement are whether it meets the design criteria and then whether you get positive feedback from focus groups. This 27-page document full of guff, if authentic, seems to have convinced the board of PepsiCo, who on this evidence must be credulous idiots. What would convince me is a 27-page document showing that a significant majority of a representative group not just like the new logo, but can associate it with the product they already know, and the values the brand is seeking to convey.

The new logo doesn't seem to have hit the UK yet, so here it is if you've not seen it:

Pepsi's New Logo

It's not a bad logo in itself. But it's no longer as unique and iconic as the old one had become. It reminds me a little of the Sony-Ericsson logo (which is prettier). The new packaging looks almost like own-brand supermarket cola. I'm sure Pepsi will throw money behind promoting this new logo. But it's money they did not need to spend. They already had a distinctive and versatile logo.

Sky News

February 9th, 2009

I don't watch Sky News but in Googling today I happened to stumble onto their site. I was amazed how poor it was compared to the BBC news website that I visit whenever I do want news.

I've annotated a copy of the Sky News homepage as of 2009-02-06 with my comments on the graphic design. This is very similar to the way I actually work when I'm working with a designer. Click the image below to embiggenify.

A critique of Sky News' graphic design (Image-heavy SVG)

Profanity

January 28th, 2009

The web has never responded very well to censorship. So much of the web is about freedom of expression that whenever someone tries to express himself, and is prevented from doing so, he feels disenfranchised. That applies even more so in the case of the Scunthorpe problem, because people who weren't trying to swear in the first place feel much more aggrieved.

On the other hand, website owners do not want their image damaged by users who can't keep their potty mouths shut.

When developing sites that allow users a voice, we need to find ways to protect the website owners, or the atmosphere of a community, without damaging the goodwill of the user base. Any website that depends on user input, and which doesn't have any users, is a failure.

Profanity filtering is not the answer because, at least, I've never seen it done well enough to be both comprehensive and unintrusive. All problems that relate to processing natural language are extremely complicated. We have barely started to scrape the surface in terms of parsing English text, let alone extracting the semantics from it that we would need to determine if a word is offensive. So any attempt at a naïve profanity filter is doomed to failure. For example, you can be profane without being offensive:

She turned round and screamed, "Fuck off, you stuck-up bitch". I was appalled!

You're a grumpy old bastard, but I love you.

and you can be offensive without being profane:

I did your mum last night. She's fatter than a blue whale, but she knows a trick or two. Your sister does too actually.

and let's not forget the cases where you can't tell:

Do you have a cock or do you just keep hens? Oh, we have a big gold cock. You know, the pussy is afraid of him!

OK, the last example is contrived and of course nobody would type it with a straight face. Still, in the right context, it's innuendo, not profanity.
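The Scunthorpe problem mentioned above is easy to reproduce. The following is a rough sketch in modern Python, with a deliberately mild and entirely hypothetical blocklist and two made-up function names. It is not a recommendation for how to build a filter, only an illustration of why the naïve versions misfire.

```python
import re

# Hypothetical blocklist, kept deliberately mild for the example.
BLOCKLIST = {"ass", "hell"}

def substring_filter(text):
    """Flag any blocked word appearing anywhere, even inside another word."""
    lowered = text.lower()
    return sorted(word for word in BLOCKLIST if word in lowered)

def whole_word_filter(text):
    """Flag blocked words only when they appear as complete words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & BLOCKLIST)

innocent = "Hello, I passed my class last week."
insult = "She's fatter than a blue whale."

print(substring_filter(innocent))   # flags an innocent sentence
print(whole_word_filter(innocent))  # no false positive here...
print(whole_word_filter(insult))    # ...but the insult passes untouched
```

Matching whole words removes the false positives in this sample, but neither version knows anything about context, which is the point of the examples above.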

With those insurmountable problems, there's simply no substitute for a human keeping an eye on things. However, even with moderation, there are problems to face. Exactly what is acceptable? Moderators can easily pronounce on clear-cut cases of abusiveness or offensiveness, but people have different sensibilities as to what's acceptable. It's also fairly easy for moderators to miss the odd bit of abuse, especially if it's only offensive in some contexts.

One trick to help keep control of the situation is to carefully set the tone. If you can use the language and style of the website to convey a sense of what might be appropriate, you can influence the tone users are likely to take. Though moderators still have to check the same amount of content, this reduces the chance that something untoward will slip through. Phrases like "Interglobal Inc do not take any responsibility for the content of this service" - phrases which are of dubious merit anyway - may have the opposite effect, by giving users the impression that the owners don't care what the tone is. You also stand to lose control of the tone in the subconscious minds of users if you use some well-known software - phpBB for example - which users might have used elsewhere and come to associate with a certain mode of speech.

If you do censor people, a light touch is often better than a heavy hand.

Mauvesoft

January 26th, 2009

I've overhauled Mauvesoft, my programming projects website. Check it out.

How to program a calendar

January 26th, 2009

Programming a calendar sounds deceptively easy. And it is, until you come to realise that there's very little point in displaying a calendar that doesn't show information about events and periods. You have a potentially overlapping set of periods to display, each spanning days or months. It becomes much more complicated.

At the moment I'm programming a calendar for the booking of accommodation, which is particularly complicated because a) you book nights, not days, and month planners have cells for days, not nights, and b) the dates that are available are the dates not booked, not the dates booked.
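Both complications can be shown in a few lines. This is a sketch in modern Python under my own assumptions, not the code from the project: a booking is a hypothetical (check_in, check_out) pair of dates, and a night is identified by the date on which it begins, so a stay from the 5th to the 8th occupies the nights of the 5th, 6th and 7th.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def booked_nights(bookings):
    """Return the set of dates whose night is occupied.

    Each booking is an assumed (check_in, check_out) pair; the night of
    the check-out date itself is not occupied.
    """
    nights = set()
    for check_in, check_out in bookings:
        night = check_in
        while night < check_out:
            nights.add(night)
            night += ONE_DAY
    return nights

def available_nights(first, last, bookings):
    """Return the nights from first to last (inclusive) that are NOT booked."""
    taken = booked_nights(bookings)
    free = []
    night = first
    while night <= last:
        if night not in taken:
            free.append(night)
        night += ONE_DAY
    return free

# Arrive on the 5th, leave on the 8th.
bookings = [(date(2009, 1, 5), date(2009, 1, 8))]
free = available_nights(date(2009, 1, 4), date(2009, 1, 9), bookings)
print(free)
```

Note that the 8th shows as available even though a guest is still there that morning, which is exactly the mismatch with a month planner's day cells.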

I'm using a simpler approach, converting all calendar periods into a stream of events in date order. The interface between producers and consumers of calendar events looks like this:

class CalendarListener(object):
  def start_month(self, month):
    """Called before the first day of the month, and before any periods in that month."""
   
  def end_month(self, month):
    """Called after the last day of the month, and after any periods in that month."""
   
  def start_day(self, date):
    """Called once for each day to display"""
   
  def start_period(self, date, period):
    """Called before the day in which the period begins"""
   
  def end_period(self, date, period):
    """Ends the previously started period"""

This interface makes it very easy to produce, filter, and consume calendar data. What was previously a complicated process of intersecting, splitting, joining, structuring and outputting date ranges suddenly becomes very simple. All of the events received via this interface are guaranteed to be in chronological order, so no date comparison is needed. Almost all calendar operations can be performed with a simple state machine.
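For illustration, a producer for this interface might look roughly like the sketch below. It is written in modern Python and rests on several assumptions of mine rather than on the original code: a month is passed as a plain (year, month) tuple instead of a Month object, a period is a (start, end, label) tuple with an inclusive end date, and a period that crosses a month boundary is closed at the end of one month and reopened at the start of the next, which is one reading of the docstrings above. The names emit_events and RecordingListener are hypothetical.

```python
import datetime

class RecordingListener:
    """Hypothetical listener that only records the order of the calls."""
    def __init__(self):
        self.calls = []

    def start_month(self, month):
        self.calls.append(("start_month", month))

    def end_month(self, month):
        self.calls.append(("end_month", month))

    def start_day(self, date):
        self.calls.append(("start_day", date))

    def start_period(self, date, period):
        self.calls.append(("start_period", date))

    def end_period(self, date, period):
        self.calls.append(("end_period", date))

def emit_events(first, last, periods, listener):
    """Walk first..last (inclusive) once, firing callbacks in date order."""
    one_day = datetime.timedelta(days=1)
    day = first
    while day <= last:
        month = (day.year, day.month)
        tomorrow = day + one_day
        month_opens = day == first or day.day == 1
        month_closes = tomorrow > last or tomorrow.day == 1

        if month_opens:
            listener.start_month(month)
        for period in periods:
            start, end, _label = period
            # Start periods beginning today, plus any carried into this month.
            if start == day or (month_opens and start < day <= end):
                listener.start_period(day, period)

        listener.start_day(day)

        for period in periods:
            start, end, _label = period
            # End periods finishing today, plus any cut off by the month end.
            if end == day or (month_closes and start <= day < end):
                listener.end_period(day, period)
        if month_closes:
            listener.end_month(month)
        day = tomorrow

listener = RecordingListener()
periods = [(datetime.date(2009, 1, 30), datetime.date(2009, 2, 2), "booking")]
emit_events(datetime.date(2009, 1, 29), datetime.date(2009, 2, 3), periods, listener)
for call in listener.calls:
    print(call)
```

This version rescans every period on every day, which is fine for a handful of bookings. Sorting the periods first would avoid the rescan but would obscure the idea.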

A consumer that renders to HTML, for example, is as simple as this:

from StringIO import StringIO  # Python 2 module, like the print >> statements below

class MonthRenderer(CalendarListener):
  def __init__(self):
    self.buf = StringIO()  # accumulates the rendered HTML

  def start_month(self, month):
    print >>self.buf, """<div class="month"><h4>%s</h4>
      <img class="week" src="/assets/cal/week.png" alt=""/>""" % month.name()

    # Pad the first row so day 1 sits under its weekday (21px per skipped column)
    w = month.first_day().weekday()
    if w:
      print >>self.buf, '<div class="padding" style="width: %dpx"></div>' % (w * 21)

  def end_month(self, month):
    print >>self.buf, "</div>"

  def start_day(self, date):
    print >>self.buf, '<span class="day">%d</span>' % date.day

(Note: date and datetime are standard Python classes. Month, however, is my own class. Also, some people use a table rather than CSS for this; that's obviously a fairly simple alteration.)
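A consumer need not render anything. As a sketch of the state-machine idea, here is a hypothetical listener, in modern Python, that works out which days are free by counting how many periods are currently open. It follows the method names of the interface above but is a standalone class, and it is driven by hand here instead of by a real producer, so the month argument is just a placeholder string.

```python
from datetime import date

class AvailabilityListener:
    """Collects the days on which no period is open (hypothetical example)."""
    def __init__(self):
        self.open_periods = 0   # the entire state of the state machine
        self.free_days = []

    def start_month(self, month):
        pass  # month boundaries do not matter to this consumer

    def end_month(self, month):
        pass

    def start_period(self, date, period):
        self.open_periods += 1

    def end_period(self, date, period):
        self.open_periods -= 1

    def start_day(self, date):
        # Events arrive in chronological order, so no date comparison is needed.
        if self.open_periods == 0:
            self.free_days.append(date)

listener = AvailabilityListener()
listener.start_month("January 2009")
listener.start_day(date(2009, 1, 1))
listener.start_period(date(2009, 1, 2), "booking A")
listener.start_day(date(2009, 1, 2))
listener.start_day(date(2009, 1, 3))
listener.end_period(date(2009, 1, 3), "booking A")
listener.start_day(date(2009, 1, 4))
listener.end_month("January 2009")
print(listener.free_days)
```

Because the counter handles overlapping periods as well as single ones, no intersecting or joining of date ranges is required.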

It took me quite a few false starts before I realised the relative simplicity and convenience of this pattern, which is why I wanted to recommend this. It's very easy to fall into a trap of building complexity and tackling problems using ever-more complicated calendar classes and processors and never take the step back to find a better approach.

The naïve approach for programming a calendar is to write a function, say, print_month() which renders a month of a calendar. Then call this 12 times. Then wrap it up in a class so you can subclass it to retrieve a list of events and modify output. This quickly became excessively complicated, as I wrote methods to chop and join periods together, work out what the formatting of each day should be, and render it.

Alas, the calendar also requires JavaScript, and that part doesn't benefit quite as much from an event-driven approach because it needs to operate on the structured HTML DOM.