MVC Controllers

May 3rd, 2011

MVC as a pattern has many interpretations, but an interpretation I think is common goes as follows:

  • Model - database abstraction/persistence
  • View – templates
  • Controller - all the code that interprets the request, interacts with models, and sets up and renders a template

There is a distinct problem with this: the controller becomes bloated with all manner of code, such as things which don't seem bottom-up enough to be part of the model, or code to pre-chew model data for the benefit of dumb templates. This leads to controllers dozens of lines long that are difficult to read as a whole.

In my code (Django code, where controllers are called views) I have been trying to avoid this for a long while now. Since controllers constitute the glue between the request, model interaction and the setting up of the view, I read them frequently to determine how the application is glued together. I've therefore found that they benefit from being as short and transparent as possible.

These facts should be immediately visible on reading the code of a controller:

  1. What are the input parameters from the request (GET/POST/Cookies/Headers/Session Vars)
  2. How those parameters are interpreted/validated (their domain)
  3. What operation the request performs
  4. What variables are passed to the template system

I think it's worth formalising this. My take would be something like this:

A controller should contain no code other than these distinct phases:

  1. Unpacking the request parameters/validating the request.
  2. Invoking an operation, defined elsewhere.
  3. Setting up the context for template rendering.

These can even be labelled as such.
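As a rough illustration of those labelled phases, here is a minimal sketch. It is deliberately framework-free so that it runs on its own: NotFound, render, CATEGORIES, PRODUCTS and products_in_category are hypothetical stand-ins rather than Django APIs, and the request is assumed to be a plain dict of GET parameters.

```python
class NotFound(Exception):
    """Hypothetical stand-in for Django's Http404."""


# Hypothetical in-memory data, used in place of real models.
CATEGORIES = {1: "Books", 2: "Music"}
PRODUCTS = {1: ["Emma", "Dune"], 2: ["Kind of Blue"]}


def products_in_category(category_id):
    """The operation, defined outside the controller."""
    return sorted(PRODUCTS.get(category_id, []))


def render(template_name, context):
    """Stand-in for a template engine: returns what it was given."""
    return (template_name, context)


def category(request, category_id):
    # 1. Unpack and validate the request parameters.
    if category_id not in CATEGORIES:
        raise NotFound(category_id)
    page = int(request.get("page", 1))

    # 2. Invoke an operation, defined elsewhere.
    products = products_in_category(category_id)

    # 3. Set up the context for template rendering.
    context = {
        "category_name": CATEGORIES[category_id],
        "products": products,
        "page": page,
    }
    return render("category.html", context)
```

Reading only the body of category answers all four questions listed earlier: the inputs, their valid domain, the operation performed, and the variables the template receives.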

I've conflated unpacking/validating because these can generally be stated succinctly together, using trivial code. For example, in Django, you'd see code like this:

from django.shortcuts import get_object_or_404

def category(request, category_id):
    category = get_object_or_404(Category, id=category_id)

which I think succinctly comprises these facts:

  • The controller named category receives a parameter category_id
  • The valid domain of category_id is the set of all Category ids
  • If category_id is outside that domain, Http404 is raised

When the code enacting a view's business logic grows to more than a couple of lines, I've found it best to move it out of the view: into a form's .save() method (so that a form basically defines one operation on a mixed bag of unvalidated input), into model methods (usually for simple model manipulation), or into a separate class that defines more complicated business logic.
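To make the "form defines one operation" idea concrete, here is a small sketch of a form-like class. It only imitates the shape of a Django form (is_valid() followed by save()); the class name, its fields and the dict used as a store are all hypothetical, and no Django code is involved.

```python
class RenameCategoryForm:
    """Hypothetical form-like object: validates raw input, then performs one operation."""

    def __init__(self, data, categories):
        self.data = data              # raw, unvalidated input (e.g. POST values)
        self.categories = categories  # hypothetical store mapping id -> name
        self.errors = {}
        self.cleaned = {}

    def is_valid(self):
        self.errors = {}
        self.cleaned = {}

        # category_id must be an integer naming an existing category.
        try:
            category_id = int(self.data.get("category_id", ""))
        except (TypeError, ValueError):
            self.errors["category_id"] = "must be an integer"
        else:
            if category_id in self.categories:
                self.cleaned["category_id"] = category_id
            else:
                self.errors["category_id"] = "unknown category"

        # name must be non-empty once surrounding whitespace is removed.
        name = str(self.data.get("name", "")).strip()
        if name:
            self.cleaned["name"] = name
        else:
            self.errors["name"] = "required"

        return not self.errors

    def save(self):
        """The single operation this form defines."""
        if self.errors or len(self.cleaned) != 2:
            raise ValueError("save() called without successful validation")
        self.categories[self.cleaned["category_id"]] = self.cleaned["name"]
        return self.cleaned["category_id"]
```

A controller using it would stay at a couple of lines for the operation phase: construct the form from the request data, call is_valid(), then call save().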

The template context setup matters because, when writing templates, I want to know what variables I have. This should therefore be explicit in the code of the controller I'm looking at.

Dev versus DevOps

January 20th, 2011

Having spent the past few months working in ops, I have learned a wide range of new skills in server and network infrastructure. I found that my skills as a developer augmented my then-competent ops skills. Coming back to full-time development now, I was expecting to find that my infrastructure skills would improve my development. What I wasn't expecting was such an early and staggering example.

A year ago, I solved a problem that I was then experiencing – querying ActiveDirectory with python-ldap under Debian. There was an incompatibility between GnuTLS and AD that made this impossible, due to AD missing TLS 1.1 support and no fallback from TLS 1.1 to TLS 1.0. This would happen:

$ gnutls-cli -p 636 ad.example.com
Resolving 'ad.example.com'...
Connecting to 'ad.example.com:636'...
*** Fatal error: A TLS packet with unexpected length was received.
*** Handshake has failed
GNUTLS ERROR: A TLS packet with unexpected length was received.

This worked when disabling TLS 1.1 in GnuTLS, but libldap does not expose a way to set GnuTLS options, and so nor does python-ldap.

My developer solution

My 2010 workaround was to recompile libldap against OpenSSL (which tested to work with AD). This is how it is done:

Build instructions for libldap

  1. check that source repos are available in /etc/apt/sources.list
  2. $ apt-get source openldap
  3. # apt-get build-dep openldap
  4. # apt-get install libssl-dev
  5. cd to the openldap-* directory
  6. $ CPPFLAGS=-D_GNU_SOURCE ./configure --prefix=<where> --with-tls=openssl
    See http://www.openldap.org/its/index.cgi/Build?id=5464 for reasons behind the _GNU_SOURCE flag.
  7. $ make -j <number_of_cpus> depend
  8. $ cd libraries
  9. $ make -j <number_of_cpus>
  10. $ make install
  11. $ cd ../include
  12. $ make install

Build instructions for python-ldap

libldap may provide the same binary interface (ABI) whether it's compiled with GnuTLS or OpenSSL, but there is a chance that it may differ, so recompiling python-ldap against the new libldap is recommended.

  1. # apt-get install python-dev
  2. Obtain source for stable python-ldap from http://pypi.python.org/pypi/python-ldap/
  3. Extract archive and enter extracted directory
  4. Edit setup.cfg:

    add <where>/lib after library_dirs =
    add <where>/include after include_dirs =
    add extra_link_args = -L<where>/lib -rpath <where>/lib somewhere in the [_ldap] section.

  5. $ mkdir -p <where>/lib/python<version>/site-packages (where <version> is eg. 2.5, 2.6)
  6. $ PYTHONPATH=<where>/lib/python<version>/site-packages/ python setup.py install --prefix=<where>

Running Python

To use the recompiled version of the libraries:

$ PYTHONPATH=<where>/lib/python<version>/site-packages/ python example.py

My DevOps Solution

Using stunnel it is possible to "unwrap" the SSL layer and provide unencrypted access to python-ldap. stunnel is compiled against OpenSSL and thus doesn't suffer from the GnuTLS bug.

$ sudo stunnel -c -d 127.0.0.1:389 -r ad.example.com:636
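With the tunnel running, client code would simply point at the local, unencrypted endpoint instead of at AD itself. The sketch below assumes python-ldap is installed for the search_users function; the bind DN, base DN and search filter are placeholders, and the two small helpers are hypothetical conveniences that only restate the addresses used in the command above.

```python
def stunnel_client_args(local_addr, remote_addr):
    """Build the argument list for the stunnel invocation shown above."""
    return ["stunnel", "-c", "-d", local_addr, "-r", remote_addr]


def ldap_url(local_addr):
    """Plain ldap:// URL for the local, unencrypted end of the tunnel."""
    return "ldap://" + local_addr


def search_users(local_addr, bind_dn, password, base_dn):
    """Query AD through the tunnel (requires python-ldap; DNs are placeholders)."""
    import ldap  # imported lazily so the helpers above work without python-ldap

    conn = ldap.initialize(ldap_url(local_addr))
    conn.simple_bind_s(bind_dn, password)
    try:
        # Hypothetical filter; adjust to whatever the application needs.
        return conn.search_s(base_dn, ldap.SCOPE_SUBTREE, "(objectClass=user)")
    finally:
        conn.unbind_s()
```

Because python-ldap only ever sees a plain ldap:// URL on the loopback interface, the GnuTLS incompatibility never comes into play; the encryption to AD is handled entirely by stunnel.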

As a developer there's a certain hesitance to introduce another independent service into the system. It feels like weakening the chain, going from one point of failure to many points of failure – potentially bugs or misconfigurations in the adapting component itself or misconfigurations of the server that is supposed to be hosting that component.

As a DevOp, given the tools and experience to maintain infrastructure systems that involve vastly more components than this, it seems robust – no non-standard components, just an easy-to-configure off-the-shelf tool doing what it is intended for.

Optimism

July 30th, 2010

I find this somewhat overly optimistic.

[Image: optimism]

Puppet

July 4th, 2010

Over the last couple of months I have been getting to grips with Puppet, a client-server system for applying configurations to remote machines. It is a powerful tool for network administration, allowing the configuration for your entire network to be stored in one versioned repository and applied with little effort. There are other, similar tools in this space, but Puppet seems to be particularly popular at the moment.

Puppet includes a master server that serves configuration and files, and a client daemon that connects back to the master using client-certified SSL, and applies configuration on the machine on which it runs. Configurations are defined in Puppet's own declarative language.

While Puppet is described as a tool that will apply configurations to remote machines, the fact that Puppet manifests comprise definitive knowledge about the network configuration should not be overlooked. As you configure services, the Puppet rules that you write serve as documentation of the process that you followed.

This is not to say Puppet is without problems. The client is very heavy, and can consume lots of memory to apply a configuration – this can be a showstopper on an otherwise very light VM. The Puppet language is clean for simple cases, but restrictions in its syntax, whether they stem from incompleteness or are deliberately intended to enforce configuration sanity, can defeat attempts to write complicated and re-usable recipes.

It is also difficult to test Puppet recipes. You can run them on a VM, but it's time-consuming to ensure that they apply correctly, first time, given an out-of-the-box install. It's somewhat likely that you would need to run Puppet once, then run apt-get update, and then run Puppet again.

From my point of view as a developer who generally works with normalised databases, what I find ugly is that the Puppet repository is not one fact, one place. Puppet recipes most frequently just copy configuration files onto the client, and the particulars of a configuration file may implicitly depend on facts buried in many other configuration files or Puppet manifests. For example, the IPs listed in DNS zone files must match the IPs assigned in each host's network configuration.

To avoid some of these problems, a future Puppet-like tool could perhaps take the form of a comprehensive and extensible network information system (eg. RDF), and a suite of tools and recipes for compiling that information into something as lightweight as a bash script to run on each remote machine.
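A toy version of that idea might look like the sketch below. It is not Puppet and not RDF: HOSTS and ROLE_PACKAGES are a hypothetical inventory, the addresses come from a documentation range, and the netmask, interface name and package names are assumptions. The point is only that each IP is stated once, and both the DNS records and the per-host configuration are compiled from it.

```python
# Hypothetical single source of truth for the network.
HOSTS = {
    "web1": {"ip": "192.0.2.10", "roles": ["web"]},
    "db1": {"ip": "192.0.2.20", "roles": ["db"]},
}
ROLE_PACKAGES = {"web": ["nginx"], "db": ["postgresql"]}


def zone_records(hosts):
    """DNS A records, derived from the inventory rather than written by hand."""
    return ["{0} IN A {1}".format(name, hosts[name]["ip"]) for name in sorted(hosts)]


def interfaces_stanza(name, hosts, netmask="255.255.255.0"):
    """Network configuration for one host, using the same IP as the zone file."""
    return "iface eth0 inet static\n    address {0}\n    netmask {1}\n".format(
        hosts[name]["ip"], netmask
    )


def host_script(name, hosts, role_packages):
    """Compile what is known about one host into a small bash script."""
    host = hosts[name]
    packages = sorted({pkg for role in host["roles"] for pkg in role_packages[role]})
    lines = ["#!/bin/bash", "set -e", "hostname {0}".format(name)]
    if packages:
        lines.append("apt-get install -y " + " ".join(packages))
    return "\n".join(lines) + "\n"
```

Changing an address in HOSTS and regenerating would update the zone records and the host's network configuration together, which is the "one fact, one place" property that copied configuration files lack.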

Fonts and font-family

February 3rd, 2010

Yesterday, on Twitter, I watched a discussion emerge as one person I follow pointed out that another person's hosted wordpress.com blog was illegible on her computer, with all of the content appearing in ugly bold italics.

While we never got to the bottom of that issue (I couldn't reproduce it), it's worth backing up and examining font use on the web.

Fonts, unlike any other aspect of web browser rendering, depend on the platform, not the browser version.

The reason is simple: fonts are not bundled with the browser, but with the operating system, or installed with some creative applications.

If you select fonts based on how they look on your computer, they will look different on another computer with a different set of fonts installed. Also, fonts are matched by name in CSS, so when you write

font-family: "Helvetica", "Arial", sans-serif;

you are requesting a font named "Helvetica", then one named "Arial", then the default sans-serif fonts. This is a very common thing for people to write, because Helvetica is a popular choice on Mac, Arial is a popular choice on Windows, and sans-serif is a catch-all. The intention is to select a nice sans-serif font on each platform.

Unfortunately, "Helvetica" can exist on Windows and Linux as well as Mac. Helvetica has been around as a typeface since 1957, and there are different versions of it around – by what route, or with what degree of intellectual property infringement, I do not know. There are also a fair number of variants that your computer might also consider, if they are installed.

On Linux, Helvetica was historically an X bitmap font (ugly, impractical things that are now effectively dead). These days it is generally a fontconfig alias for a free sans-serif font, but it renders with iffy hinting and kerning, perhaps to conform to the original font's metrics (ie. it has been shoehorned into exactly the same space, so that printed publications don't come out wrong). I actually find this font quite uncomfortable to read.

On Windows, you may occasionally find that Helvetica exists, perhaps even as the same font, installed on its own or with some software suite. Even if it does, several browsers on Windows render fonts with Microsoft's ClearType renderer, which is optimised for legibility, rather than with the Mac's quality-optimised renderer (which is also used in Safari 3 on Windows). Microsoft's own fonts have been tweaked to work well with ClearType – others may not. Linux is (as ever) more flexible: it's possible to configure the amount of hinting to use through fontconfig, although most users will keep their distribution's defaults.

Ultimately it's an impenetrable picture – you cannot be sure that the fonts you list will give anything like the browsing experience you were expecting. The same overall picture applies with serif fonts and monospace fonts.

The best solution (unless you want to try downloadable fonts, which I wouldn't recommend for body fonts) is to side-step the specifics of fonts entirely and delegate to the user/browser/operating system. There are three suitable aliases for font families: sans-serif, serif, and monospace. These will reliably give you a good font of that category. There are two other aliases, cursive and fantasy, but these are too poorly defined – you could get practically anything.

Is this really the only option? If you're prepared to go to the enormous lengths required, can you not pick a list of named fonts, test broadly and claim it works? Well, yes, if you test broadly enough you can get say 99.9% coverage. Unfortunately, that's not always good enough.

The topic of a site turns out to significantly affect the statistics of users that visit it. For example, a site about Linux will get more Linux hits. A site about using Photoshop will get most hits from people with Adobe Creative Suites installed, and that comes with fonts. So as a theme designer, what was 99%+ for you could be 90% for some of the people who use your theme.

So, in summary, stick to the safe fonts: sans-serif, serif and monospace. Fonts that are ubiquitous and designed for the screen are also quite safe – Arial and Verdana. You might be able to find some other safe places by consulting statistics if you are feeling creatively hemmed in. But please, don't make font assumptions.

The Virtual Revolution

January 31st, 2010

Last night's BBC documentary The Virtual Revolution, available on iPlayer now, is exactly typical of all internet documentaries I have seen, from the generic title (pick one of "The Digital", "The Cyber", "The Virtual" and one of "Revolution", "Renaissance", "Tomorrow" etc.) to using "web" and "internet" interchangeably, to cutting to shots of computer screens showing something internetty, like repeatedly typing www.com into a browser's address bar (it is a valid domain, but it is enormously more likely to be typed through incompetence), to the intentionality ascribed to the entire edifice, which, they alleged, was deliberately designed to democratise everything ever.

In fact the only unusual thing was the omission of make-up to blend Aleks Krotoski's blush red nose into the rest of her face. I don't mean to be personally insulting to Krotoski – if my nose was coming out bright red on camera I'd want the production team to address it.

The story was woven into a history of the internet as told by "key players and pioneers" including Sir Tim, Youtube, Wikipedia and Arianna Huffington of frequently alt-med promoting rag HuffPo, thus neatly side-stepping the role of the millions of faceless bloggers and web users who pump content into the web and Web 2.0 sites and who are in truth responsible for what the web is today.

Actually, I say sidestepping – blogs were mentioned.

The world of blogging is going through a crisis. Of the more than 130 million blogs active since 2002, it's estimated that over 90% are now dormant.

Ok, a lot of people set up blogs and stop posting to them. But ignore that: what they've reversed here is the fact that there are 13 million active blogs on the web. That is a HUGE number. That means there is one active blog for every 130 Internet users.

Youtube in particular is noteworthy only for being the most popular video distribution site. As the site is neither pioneering nor unique, you wonder how its CEO's opinion could possibly be more valuable than that of its more popular video bloggers. Incidentally, unlike with many sites, such as Facebook, there's almost no drawback to switching to a competitor, such as Vimeo.

Jimmy Wales, founder of Wikipedia, on the other hand, is truly visionary. Nobody would have thought a wiki could scale to the size of an encyclopaedia and beyond without its quality suffering a lot more than Wikipedia's actually does. The result is the most useful site on the Internet outside of Google. Wales did not, of course, invent the wiki or prove the wiki concept itself.

But the main thing this programme gets wrong is simple definitions. The whole episode laments the fact that the internet was supposed to be democratic, but they claim it isn't because everyone uses Facebook, or Youtube, and sites like HuffPo get more traffic than your average blog. The word oligarchy was used.

Wrong. People can choose which websites to use or not use. Remember Myspace? Owned by News Corp, one of the world's biggest media companies? What happened to that? I suppose, as oligarchs, they must have decided for us that we weren't going to use it any more, right?

I won't bother with the rest of the series.

Facebook Account Hacked

January 22nd, 2010

Today my Facebook account was hacked. Messages were sent to 42 of my friends, with a random subject and contents of the form:

hi! <recipient's first name>! <link>

All of the messages were shown as sent via Facebook Mobile, which, to my knowledge, I have never used.

I did several things:

  1. I posted on my Facebook wall advising people not to open these messages.
  2. I reported the intrusion to Facebook.
  3. I changed my Facebook password.
  4. I replied manually to every message sent, warning people not to click on the links.

Below is the reply from Facebook. I've not replied yet, but it's frustrating that Facebook have not listened to a word I've said.

Subject: Re: Messages or Posts Were Sent From My Account, and I Didn't Send Them

Hi,

We have detected suspicious activity on your Facebook account and have reset your password as a security precaution.

Er… I told you about it. You're replying to an e-mail which I sent you about it. Detected my arse.

It is possible that malicious software was downloaded to your computer or that your password was stolen by a phishing website designed to look like Facebook. Please carefully follow the steps provided:

1. Run Anti-Virus Software: If your computer has been infected with a virus or with malware, you will need to run anti-virus software to remove these harmful programs and keep your information secure.

For Microsoft
http://www.microsoft.com/protect/viruses/xp/av.mspx
http://www.microsoft.com/protect/computer/viruses/default.mspx

For Apple
http://support.apple.com/kb/HT1222

As I told you in my e-mail, I run Linux and it is up-to-date.

2. Reset Password: From the Account Settings page, you will need to create a new password. Be sure that you use a complex string of numbers, letters, and punctuation marks that is at least six characters in length. It should also be different from other passwords you use elsewhere on the internet. Here is your new login information:

<redacted>

As I told you in my e-mail, I have already changed my password. Changing it again and sending it to me in cleartext e-mail is actually making the security of my account worse.

3. Secure Email: Make sure that any email addresses associated with your account are secure, since anyone who can read your email can probably also access your Facebook account. If you believe someone has accessed one of your email accounts, you should change its password.

As I told you in my e-mail, I don't believe anyone has access to my e-mail.

4. Never Click Suspicious Links: It is possible that your friends could unknowingly send spam, viruses, or malware through Facebook if their accounts are infected. Do not click this material and do not run any .exe files on your computer without knowing what they are. Also, be sure to use the most current version of your browser as they contain important security warnings and protection features.

As I said in my e-mail, my operating system is Linux and it is up-to-date. I cannot run any .exe files without serious difficulty. In practical terms, it is very unlikely to have been compromised.

5. Log in at Facebook.com: Make sure that when you access the site, you always log in from a legitimate Facebook page with the facebook.com domain. If something looks or feels suspicious, go directly to www.facebook.com to log in.

Please. If I want to visit Facebook I select it from the AwesomeBar. I don't even receive e-mails from Facebook any more because I've disabled them, so I'd spot a phishing attack a mile off.

6. Learn More: Please visit the following page for further information about Facebook security and information on reporting material http://www.facebook.com/security

Wow, practical.

Finally, if this did not resolve your issue, please revisit the Help Center to select the appropriate contact form and submit a new inquiry:

http://www.facebook.com/help/?ref=pf

So that you can ignore what I say all over again?

Thanks,

The Facebook Team

Thanks for nothing.

These e-mails include random links, and it's probable that the nature of the attack could be uncovered by finding out more about what these links contain. It seems very probable that the page you would see will try in some way to continue the attack. That is the definition of a worm: an attack that propagates itself over the network. I tried downloading the contents of a link with wget. It timed out.

Worms are not unknown on Facebook. As always, think very carefully before clicking on untrusted links or installing untrusted apps, and check carefully that the site you are entering your credentials into is the one you expect.

My thanks to Sammy and Marit for alerting me to the attack.

What is Twitter?

December 27th, 2009

Over the past few months, I have found myself in conversations about Twitter. Judging from the way people have voiced their preconceptions, Twitter is one of the more misunderstood websites on the intertubes, with common misconceptions including "Why would I want to read about every little thing someone is doing?", "It's just the latest fad" and "I don't know anybody on Twitter". Trying to avoid sounding like a shill, I would like to address these misconceptions.

Twitter is usually described as a microblogging service, a term which is not really descriptive and is slightly disingenuous. Users write tweets of up to 140 characters. They can select other users to follow, thus building a stream of tweets that, hopefully, matches their interests. They can reply to or mention other users. It's also possible to retweet or "RT" a tweet, distributing it to their own followers.

This misses the point. Twitter provides three main things: identity, a voice, and the ability to build channels from other users' voices.

Supporting this, it also provides numerous ways to find different voices to add into the mix, with searches, links from other tweets, and trending topics. Unlike on other social networks, you are generally free to follow whoever seems interesting: your voice is public, and your followers are not your friends but those interested in your tweets.

Your identity, tweets and channels can be used on third-party sites as well as Twitter, which means that Twitter can be used as a platform for other applications. Whereas Facebook provides a photo albums tool – like it or lump it – use whatever photo-sharing website you like with Twitter. There are several in widespread use. It's quite a democratic system. You can often log into third party websites with your Twitter identity, tying your actions there to your Twitter voice.

Twitter is more like IRC than blogs; the short tweet length demands snippets, ideas, jokes and links, and – though it's not quite as 'real-time' as IRC – it's quite possible to conduct a conversation.

People do not tweet about every little thing they are doing. Such a Twitterer would not be interesting to follow. It's not just the latest fad; it's a platform for sharing news and interesting tidbits that has already broken major news stories, made and buried film releases, and on which is built a rich and growing collection of social tools that, unlike Facebook, compete with and improve upon one another. And you don't need to know people, because there are already thousands of people tweeting about exactly those things you are interested in. Follow them, reply to them… maybe you'll even make some new friends. When's the last time you did that purely on Facebook?

The best advice I can give to anyone who has heard the buzz about Twitter but didn't "get" it is just to try it. Twitter is new, and people are constantly discovering new ways to use it. Tweet about what interests you. Follow people who interest you. If you do, you'll probably find Twitter interesting and engaging.

The dangers of double resizing

October 25th, 2009

Amazon have made a bit of a mess of building their thumbnails. On their homepage I was greeted with these:

[Images: two Amazon "Look inside" book cover thumbnails]

The moiré pattern of blurriness is an artifact – evidence of the fact that these "Look inside" thumbnails are produced by resizing already thumbnailed images – probably the thumbnail of the book cover without the "Look inside" banner. To avoid this on your sites, you need to build thumbnails from a sufficiently high-resolution image – ideally a high-resolution original. In practice, it can be faster and less memory-hungry to thumbnail from a medium-sized image, and this will generally not show visible artifacts. Of course, if you've already got a high-resolution image loaded into memory, you can side-step all of the quality issues by building all of the thumbnails you might need from it at once. Note also that you need to resize down enough to hide any JPEG compression artifacts.

To understand how the tell-tale moiré pattern comes about, let's imagine the source and destination pixel grids:

[Image: the source and destination pixel grids]

When we overlay them you can see the moiré pattern appearing.

[Image: the two grids overlaid, showing the moiré pattern]

Where the grid intersections are aligned, one source pixel maps fairly closely to a destination pixel, which makes that spot in the thumbnail crisp. But as you move away from those spots and the error builds up, the grid intersections drift out of alignment, and one source pixel is smeared over four destination pixels. That makes for a blurry spot.
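The drift between the two grids can be computed along a single axis. The sketch below is a simplified model, assuming pixel centres are compared the way a bilinear-style resampler would compare them; real thumbnailers may filter differently, and blur_profile is a hypothetical helper rather than part of any imaging library.

```python
from fractions import Fraction
import math


def blur_profile(src_size, dst_size):
    """For each destination pixel along one axis, the distance from its centre
    to the nearest source pixel centre (0 = aligned, 1/2 = exactly midway)."""
    scale = Fraction(src_size, dst_size)
    profile = []
    for i in range(dst_size):
        # Centre of destination pixel i, expressed in source pixel coordinates.
        x = Fraction(2 * i + 1, 2) * scale - Fraction(1, 2)
        frac = x - math.floor(x)
        profile.append(min(frac, 1 - frac))
    return profile


# Shrinking 10 pixels to 9: the offset grows towards the middle and shrinks again.
print(blur_profile(10, 9))
```

For a small reduction such as 10 pixels to 9, the offset rises from 1/18 at the edges to 1/2 in the middle, so the result alternates between crisp and blurry regions. Applied in both axes, that gives the patchy pattern described above. An exact 2:1 reduction gives the same offset at every pixel, so no such pattern appears.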

Answers.com double-click

October 16th, 2009

A few weeks ago I mentioned word-selection by double-click.

I have discovered that Answers.com improves on this with a rather nifty hidden feature: if you double click on any word on the page it will immediately look that word up in Answers.com using AJAX!

This looks very innovative to me. Using the rich Javascript API to augment the browser's existing functionality is very pleasant, but here the product is a dictionary/reference site that is totally cross-linked! Poor old Wikipedia seems rather limited by comparison (though, to be fair, there are massive advantages to conventional links, and this technique is not a replacement for them).