Archive for the ‘Web Standards’ Category

Fonts and font-family

Wednesday, February 3rd, 2010

Yesterday, on Twitter, I watched a discussion emerge as one person I follow pointed out that another person's hosted wordpress.com blog was illegible on her computer, with all of the content appearing in ugly bold italics.

While we never got to the bottom of that issue (I couldn't reproduce it), it's worth backing up and examining font use on the web.

Fonts, unlike any other aspect of web browser rendering, depend on the platform, not the browser version.

The reason is simple: fonts are not bundled with the browser, but with the operating system, or installed with some creative applications.

If you select fonts based on how they look on your computer, they will look different on another computer with a different set of fonts installed. Also, fonts are matched by name in CSS, so when you write

font-family: "Helvetica", "Arial", sans-serif;

you are requesting a font named "Helvetica", then one named "Arial", then the default sans-serif fonts. This is a very common thing for people to write, because Helvetica is a popular choice on Mac, Arial is a popular choice on Windows, and sans-serif is a catch-all. The intention is to select a nice sans-serif font on each platform.

Unfortunately, "Helvetica" can exist on Windows and Linux as well as Mac. Helvetica has been around as a typeface since 1957, and there are different versions of it around – by what route, or with what degree of intellectual property infringement, I do not know. There are also a fair number of variants that your computer might consider, if they are installed.

On Linux, Helvetica was historically an X bitmap font (ugly, impractical things that are now effectively dead). These days it is generally a fontconfig alias for a free sans-serif font, but it renders with iffy hinting and kerning, perhaps to conform to the original font's metrics (i.e. it has been shoehorned into exactly the same space, so that printed publications don't come out wrong). I actually find this font quite uncomfortable to read.

On Windows, you may occasionally find that Helvetica exists, perhaps even as the same font, installed on its own or with some software suite. If you do, you'll find that several browsers on Windows render fonts with Microsoft's ClearType renderer, which is optimised for legibility, rather than with the Mac's quality-optimised renderer (which is also used in Safari 3 on Windows). Microsoft's own fonts have been tweaked to work well with ClearType – others may not. Linux is (as ever) more flexible: it's possible to configure the amount of hinting to use through fontconfig, although most users will keep their distribution's defaults.

Ultimately it's an impenetrable picture – you cannot be sure that the fonts you list will give anything like the browsing experience you were expecting. The same overall picture applies with serif fonts and monospace fonts.
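To make the name-matching problem concrete, here is a minimal sketch of how a font-family list gets resolved. It is a simplification and not any browser's actual code: the `resolveFontFamily` function and the three font inventories are invented for illustration. Note that a match on the name "Helvetica" says nothing about which Helvetica is actually installed.

```javascript
// Generic family keywords always resolve to something on the system.
const GENERIC = new Set(["serif", "sans-serif", "monospace", "cursive", "fantasy"]);

// Walk the list in order: the first name matching an installed font wins.
// Family names are compared case-insensitively.
function resolveFontFamily(familyList, installedFonts) {
  const installed = new Set(installedFonts.map((name) => name.toLowerCase()));
  for (const family of familyList) {
    if (GENERIC.has(family)) {
      return `system default ${family}`;
    }
    if (installed.has(family.toLowerCase())) {
      return family;
    }
  }
  return "browser default";
}

// Invented font inventories, for illustration only.
const machineA = ["Helvetica", "Arial", "Verdana"];
const machineB = ["Arial", "Verdana"];
const machineC = ["Verdana"];

const declaration = ["Helvetica", "Arial", "sans-serif"];
console.log(resolveFontFamily(declaration, machineA)); // Helvetica
console.log(resolveFontFamily(declaration, machineB)); // Arial
console.log(resolveFontFamily(declaration, machineC)); // system default sans-serif
```

The same declaration lands on three different fonts, and only the last one is a choice the platform made for itself.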

The best solution (unless you want to try downloadable fonts, which I wouldn't recommend for body fonts) is to side-step the specifics of fonts entirely and delegate to the user/browser/operating system. There are three suitable aliases for font families: sans-serif, serif, and monospace. These will reliably give you a good font of that category. There are two other aliases, cursive and fantasy, which are too poorly defined – you could get practically anything.

Is this really the only option? If you're prepared to go to the enormous lengths required, can you not pick a list of named fonts, test broadly and claim it works? Well, yes, if you test broadly enough you can get say 99.9% coverage. Unfortunately, that's not always good enough.

The topic of a site turns out to significantly affect the statistics of users that visit it. For example, a site about Linux will get more Linux hits. A site about using Photoshop will get most hits from people with Adobe Creative Suites installed, and that comes with fonts. So as a theme designer, what was 99%+ for you could be 90% for some of the people who use your theme.

So, in summary, stick to the safe fonts: sans-serif, serif and monospace. Fonts that are ubiquitous and designed for the screen are also quite safe – Arial and Verdana. You might be able to find some other safe places by consulting statistics if you are feeling creatively hemmed in. But please, don't make font assumptions.

Seam Carving

Saturday, August 30th, 2008

I've just come across this awesome technique:

One of the reasons designers like to use fixed rather than fluid website layouts is the difficulty of providing attractive images at unknown aspect ratios. This technique offers a really beautiful solution.

The presentation shows that all that is needed to apply the effect is an image plane containing the priority of each pixel or, effectively, the seam removal in which it is to be eliminated.
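As a rough sketch of what a browser could do with such a plane, assume it stores, for each pixel, the index of the seam removal that eliminates it (0 is removed first), and that every row holds each index exactly once. Narrowing the image is then just a filter per row. The `shrinkWidth` function and the tiny sample data are made up for illustration; computing the plane in the first place is the hard part and is not shown.

```javascript
// pixels: array of rows of pixel values.
// removalOrder: same shape; removalOrder[y][x] is the seam removal step at
// which pixel (x, y) disappears. Assumed to be precomputed by the encoder.
function shrinkWidth(pixels, removalOrder, newWidth) {
  return pixels.map((row, y) => {
    const seamsToRemove = row.length - newWidth;
    if (seamsToRemove < 0) {
      throw new RangeError("this sketch only shrinks images");
    }
    // Keep only the pixels whose seam has not been removed yet.
    return row.filter((_, x) => removalOrder[y][x] >= seamsToRemove);
  });
}

// A 4x2 "image" of letters, with a made-up removal order.
const image = [
  ["a", "b", "c", "d"],
  ["e", "f", "g", "h"],
];
const order = [
  [2, 0, 3, 1],
  [0, 2, 1, 3],
];

console.log(shrinkWidth(image, order, 2)); // [ [ 'a', 'c' ], [ 'f', 'h' ] ]
```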

I demand that:

  1. There be a PNG extension chunk defined to encode this plane.
  2. Web browsers support this new chunk when non-proportionally scaling images.

WCAG + Samurai

Wednesday, July 9th, 2008

The WCAG Samurai Errata to WCAG 1.0 is an independently-produced set of amendments ("errata" is not really the right word) that fairly accurately nail how WCAG is and isn't working for pragmatic but standards-compliant developers like me.

I stumbled across this while browsing Google for references while writing a proposal for a client. I was writing my usual set of caveats about the Web Content Accessibility Guidelines (WCAG), in which I lament how rubbish they are in practice. My usual spiel goes along the lines of "WCAG 1.0 informs all of our work at Mauve Internet. However, because the WCAG is subject to interpretation, and is now somewhat dated, we do not warrant conformance." I would like to make a clearer statement of how and why but a specification isn't really the place to challenge sections of the WCAG.

Happily, I can now get rid of this kind of non-committal note when writing specifications, and offer a firm commitment to WCAG+Samurai.

I will also be revising my company website – indeed, my company policy – to recommend conformance to the WCAG guidelines as amended by the WCAG Samurai Errata where possible.

IE8 actually does not pass Acid2

Tuesday, February 12th, 2008

I've just been reading the latest goss on Acid3 and I've come across the most wonderfully perverse announcement from Microsoft (this is a few weeks old, but it's not like I regularly read Microsoft blags).

Since announcing that IE8's renderer can pass Acid2, it transpires that they have a Catch-22:

To have IE8 conform to standards, you have to use a non-standard workaround.

Wow. I'd assumed Microsoft knew what standards were but couldn't seem to achieve compliance. Not so! Turns out they can achieve compliance but they don't know what web standards are!

Web Standards in the Next Generation

Saturday, December 22nd, 2007

With the news this week that Microsoft have a build of Internet Explorer that can pass Acid2, I wonder if I will be forced to eat my words when I suggested recently that Internet Explorer may be falling further behind with web standards, not closing the gap.

Well, we have an interesting opportunity to measure an aspect of that gap. A quick glance at Bugzilla shows that Gecko was able to generate a correct screenshot by 2006-04-17. Internet Explorer claimed correct rendering on 2007-12-12. The gap is 604 days for Gecko, but obviously, greater for other browsers who have been compliant for much longer.
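The 604-day figure is plain date arithmetic on the two dates quoted above, and can be checked with a few lines. The `daysBetween` helper is only an illustration; date-only ISO strings are parsed as UTC, so daylight saving doesn't skew the count.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between two ISO dates (YYYY-MM-DD), both parsed as UTC midnight.
function daysBetween(startIso, endIso) {
  return Math.round((Date.parse(endIso) - Date.parse(startIso)) / MS_PER_DAY);
}

console.log(daysBetween("2006-04-17", "2007-12-12")); // 604
```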

If Internet Explorer 8 progresses anything like Gecko, there will be a large number of bugs still to fix. If Internet Explorer 8 progresses anything like Internet Explorer historically progresses, most of those bugs won't get fixed. In other words, I'll believe it when I see it. That might not be for 20 months and might not be available on Windows XP. In fact, in the same post on the IE blog they are keen to excuse themselves from commitment to specific web standards, offering only a general tone in favour of them but excusing themselves with respect to backward compatibility. Taken as a preamble to a compliant-looking Acid2 rendering, I take this to mean, "we may not deliver this in IE8". I think everyone hopes they will, but by comparison, some of the Acid2 patches could have been in Firefox 2 but weren't because Firefox 2 was built with a frozen earlier build of Gecko.

Meanwhile, Firefox 3 is drawing closer. My impression is that the gap between 2 and 3 is not huge, which (hear me out!) is because Firefox 2 was excellent and Firefox 3 struggles to improve upon it. The difference for users is relatively minor. Although the new approach to bookmarking is hugely refreshing I think many users, including my parents, just won't get it.

The difference for developers is significantly less marked – the difference between having functional support for a technology that isn't portable to IE and having good support for a technology that isn't portable to IE is not something that will revolutionise the web. In fact, looking at the Firefox 3 for Developers page, the changes are disappointing and even worrying. In some ways it's a return to the browser wars of the late 1990s when competition between browser vendors' extensions demolished the concept of web standards.

  • Support for aspects of HTML5 – there isn't even a first working draft of HTML5. Although it was the WHATWG spec before, complying with a specification this early will mean that the implementation may not conform to the final specification, by which time, developers will be relying on the non-standard behaviour.
  • APNG – APNG is a Mozilla-sponsored bastardisation of PNG to add animation. It doesn't subscribe to the contract of PNG (which expressly forbids animation) and it isn't negotiable properly because it hijacks PNG's MIME type, extension and magic. This spells very bad news for the PNG format. In future it will be impossible to tell if a PNG is animated or not, and of course all legacy software will believe not. Despite the best efforts of a number of people, myself included, but most particularly Glenn Randers-Pehrson, Mozilla refused to adopt amendments which would resolve the conflicting standards and the PNG group failed to ratify APNG as an official extension. Although APNG was an ad-hoc solution to offer animated UI elements in Mozilla, it is being released and promoted as the new web standard for animation and MNG support, although a superior and established format, has been canned.
  • Microformats – Firefox 3 builds in support for Microformats, which could just as easily be a standalone Javascript library. There's no reason why this should be built in, except to create a de-facto standard in an API which Mozilla controls. Moreover it promotes microformats as a de-facto standard, which I'm not comfortable with, because I think Microformats are an ugly hack in lieu of a proper solution.

The HTML Standards, Part 1

Thursday, July 12th, 2007

I am an XML addict. XML has that simplicity and elegance that programmers crave. XML represents a flow of structured data between applications in a form that is an ideal blend of computer-readability and human readability, and that makes profound sense to a lot of people.

XHTML bottles that for web markup. HTML does not.

(more…)

Sir Tim Inaugural Lecture

Wednesday, March 14th, 2007

Just watching the live video feed of Prof. Sir Tim Berners-Lee's inaugural lecture in the Electronics and Computer Science department at Southampton Uni. I can't see the slides which is a nuisance. I thought I'd type up a few notes as I listen.

He started off talking about wishy-washy guff about engineering versus analysis of network systems. And creativity, which is part of engineering.

Now he's found his feet a bit more. I thought it was amusing that he was trying to talk about Web 2.0 sites but without mentioning the actual term "Web 2.0".

He made a big point about macroscopic social elements (the web community) deriving from microscopic (URI schemes and HTTP and HTML and stuff and junk). (This is exactly the point I make when trying to explain where TBL fits in to the history of the web: TBL is not responsible for the massive cultural system built on top of the web. It's mere chance that his distributed hypermedia system took root. A lot of people can't distinguish the utility of the web now from the seed protocols (not even ideas, as such, which were already established) that TBL gave us.)

He mentioned something about email and how it's abused.

The web – what it was intended to do and the primary concepts that drive it. Layering technologies on top of one another. Wow. Abstraction.

The web is an information space. A mapping between a URI and some information.

PageRank. Google. Deriving macroscopic web usage models from something very simple like number of links. Audio went a bit rubbish for a while but it's back now.

Wiki. How microscopic behaviour like collaborative editing grows into macroscopic systems like Wikipedia. This will revolutionise democracy and politics.

Blogs. Woo. The Blogosphere. May be rubbish. Who knows. Probably both rubbish and excellent at the same time.

Information in HTML format is not manipulatable. So we need a semantic web to re-use data as data. RDF, OWL, SPARQL. Use URIs for things rather than web pages. And the relationships between overhead projectors and colours. Merge and query is very easy. FOAF networks. (Yay! I know all about those. Oh, I have to rebrand Mauvespace btw, following a conversation with a friend of a friend who is an IP lawyer. Just need to think of a name.)

Some websites are tables, some are trees, some are "hypercubes". (He keeps calling tables and matrices "rectangles". That strikes me as such a cute web-kiddy thing to do, labelling arrays as "Square, daddio" while graphs are new and "cool")

Something to do with trees and top-down OOP. (*shrug*)

What shape is the Internet? It's a net. (It's not. It's a fluffy cumulus cloud. Every first-year computer science student knows that.) It's robust.

The web is a web. What shape is that? What does that mean? (I would have thought it's a directed graph). It should be shaped like the world.

Common vocabularies for describing things with RDF. You get local collaboration to produce specific ontologies and you use some terms from global ontologies. Spatial things can be used in lots of applications. Overlapping ontologies.

The web is actually fractal. Structure at all different levels. (Fractal is not the right word).

Much less work is done in describing ontologies than using them.

Web Science includes

  • User interface for the web. SemWeb doesn't have this.
  • Building resilient systems. Against slashdotting, attack. At an architectural level.
  • New devices – handheld and large screens.
  • Creativity. Connecting people and making them more effective. Allowing them to understand one another; letting half-formed ideas in two different people's heads on different sides of the planet connect.

Right, done.

It was a whistlestop tour of web science I suppose, but I didn't really feel that it was particularly insightful. Of course I'm not in the business of rationalising the way that the web works. I just program. I think TBL has to try to rationalise it because that's what he's famous for; at a personal level he probably feels people look to him to explain the ways of the beast. But of course he didn't create it. Mainly people just create web apps and it either catches on or not, or it needs a bit of pointling to actually make it work the way people want it to. With a lot of Web 2.0 sites, it just involves a huge amount of development to get to the point of having a web app that works well enough and scales, and then creative ideas can be tried out on pieces, beta tested and deployed.

This is exactly how the web started and evolved and I don't think I understand how we got to where we are now any better than I did before. I don't think it's possible to either; the web evolves in parallel across the globe. It doesn't have a single history behind it or a single motivation driving it. Deconstructing the web appears to me to be analogous to Psychohistory.

There is a podcast available, but don't feel obliged.

Writing an RSS client

Tuesday, February 13th, 2007

Interestingly, my latest paid project is to build an RSS reader. I am doing this not out of bloody-minded determination to reinvent the wheel, and I would be perfectly happy to adapt an existing project to work in the way I want, but none of the apps I have seen or tried does what I want it to do.

This project is a desktop feed notifier. It will poll feeds and pop up messages (non-intrusively) either when it starts or when it first sees them.

I have mentioned my views on RSS before, but happily they don't conflict with this project. Because this is aimed at intranet service notifications there is a contract between producer and consumer, not merely a shared protocol.

I think that one good aspect of RSS is its ubiquity. Several apps already in use in this Intranet are RSS-aware and can be wired into this system with a minimum of work.

Without wanting to revisit the previous arguments too much, I might as well summarise them for completeness. I can envisage only two useful strategies for a syndication format:

  • Fixed contract: Specify a unique set of obligations for producer and consumer including both syntax and semantics. eg. RSS 0.90
  • Negotiated contract: Specify obligations of syntax, but encourage producers to offer a complete semantic representation, and allow consumers to build a customised syndication from within it. eg. RDF.

File uploads

Tuesday, January 23rd, 2007

I have mentioned briefly work that I was doing to wrap file uploading in AJAX for a proper experience. Browser-based file uploads have been downtrodden over the past few years.

In client terms, file uploads work in almost exactly the same way as they have always done: the page blocks while the data is posted, and a very small progress bar shows up in the status bar. This is a user interface disaster for big files.

On the server side, the situation is more varied, but there is often little support for streaming of file uploads. In PHP, file uploads are read wholly into memory, parsed and saved out to a temporary folder before a script even gets called. The request must fit within both PHP's file upload size limit and its memory limit. As far as I can tell, something similar happens in Zope although you can argue that Zope allows other standards for upload such as DAV and FTP natively. In plain CGI, of course, there is no handling of the uploads, so if you're using a CGI wrapper, it can do whatever you want to handle this. Perl's CGI.pm module allows a hook, at least. Python's cgi module doesn't, nor is it easy to subclass.

All in all, the situation of binding file uploads to form submissions, and processing of those in common server-side languages is wholly inadequate as file size gets large. File uploads are convenient because they are a commonly-supported fall-back, but the workarounds, although solving some of these problems, don't have the simplicity of a browser-native solution.

In my recent project I looked at ways of working around these limitations. The best workaround for the client-side problems I have found so far is to perform the upload in an <iframe>, using AJAX queries to present a progress bar. This still has problems, notably that it's one file at a time, both on the choosing and the uploading. In Firefox I can actually perform two concurrent uploads in different <iframes>, but the AJAX progress bar doesn't then update.

Server-side, I wrote the whole thing as a webserver so that the AJAX queries could talk directly to the thread streaming the upload. Additionally I wrote my own parser to parse on-the-fly the data uploaded, so that the daemon knows what is uploading at any given stage. It works quite well, and the system is extensible in that it could combine a daemon that allows other forms of upload; feedback for these would also appear in the browser windows.
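The key idea is that the code receiving the upload and the code answering progress queries share state. The following is a minimal sketch of that shared state only, not my daemon: the `UploadRegistry` class and its method names are invented for illustration, the expected size is assumed to come from the request's Content-Length, and the HTTP plumbing and multipart parsing are left out.

```javascript
// Shared bookkeeping between the upload handler and the progress queries.
class UploadRegistry {
  constructor() {
    this.uploads = new Map();
  }

  // Called when an upload request arrives.
  begin(id, expectedBytes) {
    this.uploads.set(id, { received: 0, expected: expectedBytes, done: false });
  }

  // Called for every chunk read from the upload stream.
  receive(id, chunk) {
    const upload = this.uploads.get(id);
    if (!upload) return;
    upload.received += chunk.length;
    if (upload.received >= upload.expected) upload.done = true;
  }

  // Called by the AJAX progress query; returns null for unknown uploads.
  status(id) {
    const upload = this.uploads.get(id);
    if (!upload) return null;
    const percent = upload.expected > 0
      ? Math.floor((100 * upload.received) / upload.expected)
      : 0;
    return { received: upload.received, expected: upload.expected, percent, done: upload.done };
  }
}

const registry = new UploadRegistry();
registry.begin("upload-1", 2000);
registry.receive("upload-1", new Uint8Array(500));
console.log(registry.status("upload-1").percent); // 25
```

Because the progress query only reads counters, it stays cheap however large the file is.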

Even so, I wish that file uploading was something people were thinking about more. It's central to so many web applications now.

There are numerous problems:

  • File uploads are synchronous. Downloads can happen in the background on their own, but uploads can't.
  • File uploads don't have a proper UI. Current browsers appear to show a tiny upload bar that isn't really very accurate and doesn't give data rates or estimated time remaining.
  • Uploads are chosen one at a time.
  • Javascript can't be used for polish. The model that has empowered Web 2.0 improvements is that of taking an existing HTML/HTTP model and allowing it to be controlled by Javascript. However, there is no way into the uploading or the file selection processes with Javascript.

The most general solution I can see would provide a Javascript API for uploading. This would allow Javascript to show a (native) file chooser dialog, and instruct the browser on what to do with the files it returns. POST or PUT to the origin server seem useful, as does FTP upload. Clearly there are security concerns, but I fail to see how, as long as Javascript may instigate an operation, read upload statistics, but not read the filesystem, this presents a problem.

Perhaps an AJAX-style API could be along these lines:

// Configure a native dialog to present to the user.
var ufc = new UploadFileChooser();
ufc.setAcceptableFileTypes(['image/jpeg', 'image/png']);
var uploads = ufc.chooseFiles();

// Start each chosen upload; doSomething is the page's own callback.
for (var i = 0; i < uploads.length; i++)
{
    var u = uploads[i];
    u.onreadystatechange = doSomething; // callback
    // this URL is constrained to the origin server to prevent XSS
    u.beginHttpPost('http://example.com/upload');
}

After this, the user could close the tab or leave the page, and the browser would upload the files in the background, perhaps with a progress bar appearing within the Downloads window. Note that it could queue the files rather than uploading them all at once, depending on user settings. The Javascript, and indeed the user, should be able to request that an upload is aborted. The Javascript should also be able to query the upload, using the object reference provided.

Atom and RDF

Wednesday, January 17th, 2007

I'm annoyed with Atom.

I was hoping to use Atom to describe a range of things within Mauvespace, such as blogs, logs and so on, but it appears that even though there is a mapping from Atom to RDF, there is no inverse mapping from that RDF vocabulary to Atom, because it is not universally possible to convert in that direction.

For example, the <atom:author> element mandates exactly one child element <atom:name>. Even if an RDF reasoner can assume an author has a name, it does not necessarily know what it is. Also, if you pull Atom data from two feeds written under different pseudonyms into an RDF model and then claim that two authors are the same, the model stops being able to distinguish which name each feed was written under, unless you add a vocabulary to subclass Atom authors as pseudonyms of FOAF people.

These may seem like gripes about the mapping, but it's more serious than that. It means that there is no bijection between an arbitrary RDF model and a valid Atom 1.0 document.
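The pseudonym case can be shown with a toy triple store. This is only a sketch: the `atom:author` and `atom:name` predicates, the node identifiers and the two helper functions are stand-ins invented for illustration, not the real Atom-to-RDF vocabulary.

```javascript
// Triples are [subject, predicate, object].
const model = [
  ["feed:one", "atom:author", "author:a"],
  ["author:a", "atom:name", "Pseudonym One"],
  ["feed:two", "atom:author", "author:b"],
  ["author:b", "atom:name", "Pseudonym Two"],
];

// All names reachable from a feed's author node(s).
function authorNames(triples, feed) {
  const authors = triples
    .filter(([s, p]) => s === feed && p === "atom:author")
    .map(([, , o]) => o);
  return triples
    .filter(([s, p]) => authors.includes(s) && p === "atom:name")
    .map(([, , o]) => o);
}

// Claim that two nodes are the same thing by rewriting one into the other.
function mergeNodes(triples, keep, drop) {
  return triples.map(([s, p, o]) => [s === drop ? keep : s, p, o === drop ? keep : o]);
}

console.log(authorNames(model, "feed:one")); // [ 'Pseudonym One' ]

const merged = mergeNodes(model, "author:a", "author:b");
console.log(authorNames(merged, "feed:one")); // [ 'Pseudonym One', 'Pseudonym Two' ]
```

After the merge each feed's author has two names, so there is no way to pick the single name that a valid <atom:author> needs.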

I can see a few options:

  • Map RDF to Atom only. Construct a mapping from any sensible RDF model structure to Atom. That this would not be bijective means that the software could not import from Atom. An alternative RDF version would have to be provided to import triples.
  • Work instead with RSS 1.0, which is pure RDF but is widely considered deprecated and isn't as expressive as Atom anyway. Trivially, however, this is bijective.
  • Map Atom to RDF only. Create a separate Atom store and map into RDF only temporarily when I need to reason upon it or style it with the RDF template code. Lack of bijection means the store could not invariably re-export Atom intact. Synchronising updates bidirectionally between Atom and triple store becomes an issue.