Perhaps the oddest thing about Wolfram|Alpha is that the text in its query results is not text at all, but a set of dynamically generated GIFs. The FAQ's explanation:
“All output content is rendered as images, for consistency.”
Of course, a sentiment like that would make any web designer jump off a bridge. Considering all the other nice UI nuances that Wolfram|Alpha has, I call bullshit. It’s not about visual consistency.
I think it’s an attempt at preventing what is quickly becoming the bane of any informative website’s existence: screen scraping.
Screen scraping means using a script or bot to extract data from a page that was built for human eyes. A web screen scraper digs the data it needs out of the HTML source and reformats it. The technique is used to sidestep APIs, feeds, etc. when these “legally provided” methods of access don’t give you what you need. It’s also a great way to really piss off the people who run the website, because scraping is typically much harder to throttle and control than an API or feed, which can be monitored and cached.
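To make that concrete, here is a minimal sketch of a scraper in Python, using only the standard library. The markup, the `CellScraper` class, and the `/render/abc123.gif` path are all made up for illustration; none of it is copied from Wolfram|Alpha's actual pages.

```python
from html.parser import HTMLParser


class CellScraper(HTMLParser):
    """Collect the text and image URLs found inside each <td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []          # one dict per <td> encountered
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append({"text": "", "images": []})
        elif tag == "img" and self._in_cell:
            src = dict(attrs).get("src")
            if src:
                self.cells[-1]["images"].append(src)

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1]["text"] += data


def scrape_cells(html):
    """Parse an HTML string and return the contents of its table cells."""
    parser = CellScraper()
    parser.feed(html)
    for cell in parser.cells:
        cell["text"] = cell["text"].strip()
    return parser.cells


# Hypothetical markup: the same kind of result served as text, then as an image.
TEXT_PAGE = "<table><tr><td>Population</td><td>8.4 million</td></tr></table>"
IMAGE_PAGE = '<table><tr><td><img src="/render/abc123.gif"></td></tr></table>'

print(scrape_cells(TEXT_PAGE))
print(scrape_cells(IMAGE_PAGE))
```

Against the text page the scraper walks away with the data. Against the image page all it gets is an empty string and a GIF URL, which is presumably the whole point.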
And notably, it appears that this kind of control is of the utmost interest to Wolfram|Alpha, as it’s part of their “Step 3: Profit!” plan (also from the FAQ):
Subscriptions will be available in the near future with enhanced features for large-scale and commercial use.
That said, I can’t imagine it would be very difficult to write an OCR-like re-texter for data scraped from Wolfram|Alpha.
When that happens, will they have to change it so all the text looks like a smear of CAPTCHA images?
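As for that re-texter, here is a toy sketch in Python of why it seems easy: machine-rendered text draws every character identically each time, so plain template matching may be enough. Everything here is an assumption for illustration. The 3x5 `GLYPHS` font is invented, `render` merely stands in for the site's image generator, and bitmaps are lists of strings rather than decoded GIFs. A real attempt would need an image library to read the pixels and probably an off-the-shelf OCR engine.

```python
# Hypothetical 3x5 font; a real scraper would build templates from sample images.
GLYPHS = {
    "1": [".#.",
          "##.",
          ".#.",
          ".#.",
          "###"],
    "2": ["###",
          "..#",
          "###",
          "#..",
          "###"],
    "4": ["#.#",
          "#.#",
          "###",
          "..#",
          "..#"],
}
HEIGHT = 5


def render(text):
    """Stand-in for the site's renderer: draw text as rows of '#' and '.'."""
    rows = ["" for _ in range(HEIGHT)]
    for ch in text:
        for i in range(HEIGHT):
            rows[i] += GLYPHS[ch][i] + "."   # one blank column after each glyph
    return rows


def split_glyphs(rows):
    """Cut the bitmap into glyphs at columns that contain no ink."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        # Treat the position past the right edge as a blank column.
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x                         # a glyph begins here
        elif blank and start is not None:
            glyphs.append(tuple(row[start:x] for row in rows))
            start = None                      # the glyph ended
    return glyphs


def retext(rows):
    """Map each segmented glyph back to its character; '?' if unknown."""
    lookup = {tuple(bitmap): ch for ch, bitmap in GLYPHS.items()}
    return "".join(lookup.get(glyph, "?") for glyph in split_glyphs(rows))


print(retext(render("42")))
```

The exact-match lookup only works because the renderer is perfectly consistent. The moment the images get noise or distortion, you are in CAPTCHA territory and need a real OCR engine.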
What do you think about this solution to screen scraping?