Screencast: Building an online profile of distributed data with YQL
Distributing your information all over the web has become a common practice over the last few years and it makes a lot of sense. By covering lots of distribution channels you can reach various audiences and get comments and feedback from them.
You also make yourself independent of a single online resource – if your server is unavailable your data is still around. I could go on with the benefits of distribution (after all I’ve written a book on the subject) but let’s take a look at the flipside: by spreading your data all over the web you also spread yourself thin, and you want a single resource to act as your main URL.
People have been telling me for a while that they don’t have time to find all the things I leave across the web and that they are wondering if there’s a single entry point. One solution is FriendFeed, but you want to be able to style your “online profile” more than that allows.
This is where YQL comes into the equation. Using YQL, a YUI CSS grid, a few dozen lines of PHP and a bit of CSS I managed to pull together “my online portfolio”:http://icant.co.uk and you can do this just as easily. The following screencast shows you how it is done:
You can also “download a readable version of the screencast for iPods”:http://us.dl1.yimg.com/download.yahoo.com/dl/ydn/yqlscreencast2.m4v.
Since I put together the screencast (which was a bit hurried as I needed to catch a flight) I’ve updated the idea with yet another script that scrapes the resulting HTML document to create an “RSS feed of all my data on the web”:http://icant.co.uk/feed.php.
- “Check out the source of index.php”:http://icant.co.uk/usethesourceluke.php?file=index.php
- “Check out the source of feed.php”:http://icant.co.uk/usethesourceluke.php?file=feed.php
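The idea behind the feed script, turning an already rendered HTML page into an RSS document, can be sketched in a few lines. This is not the code behind feed.php (that is PHP and linked above); it is a minimal Python illustration that assumes every link in the page should become a feed item. The names `LinkCollector` and `html_to_rss` are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (title, href) pairs for every <a href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read
        self._text = []     # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            # Fall back to the URL when the link has no text.
            self.links.append((title or self._href, self._href))
            self._href = None


def html_to_rss(html, feed_title, feed_link):
    """Return an RSS 2.0 string with one <item> per link found in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for title, href in collector.links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")
```

A real page would need a narrower selection (for example only links inside the content sections), otherwise navigation links end up in the feed as well.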
Using YQL has a few benefits over reading all the different sources yourself and mixing them up: the results are cached for you, YQL’s connection to the web is very likely to be faster than yours, which speeds up the fetching process, and you have full control over what’s happening, as the YQL output gives you diagnostics information.
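To make the diagnostics point concrete, here is a minimal Python sketch of how a query URL can be put together and how the answer can be read defensively. The endpoint and the `q`, `format` and `diagnostics` parameters are those of the public YQL web service as it was offered at the time (it has since been retired); the helper names `build_yql_url` and `extract_results` are invented for this example, and the sample query is only a placeholder.

```python
import json
from urllib.parse import urlencode

# Public YQL endpoint as offered at the time of writing (now retired).
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, diagnostics=True):
    """Build the request URL for a YQL statement, asking for JSON output."""
    params = {
        "q": query,
        "format": "json",
        "diagnostics": "true" if diagnostics else "false",
    }
    return YQL_ENDPOINT + "?" + urlencode(params)


def extract_results(raw_json):
    """Return (results, diagnostics) from a YQL-style JSON response.

    Both values fall back to an empty dict, so looping over them is
    safe even when a source returned nothing ("results": null).
    """
    data = json.loads(raw_json)
    query = data.get("query") or {}
    results = query.get("results") or {}
    diagnostics = query.get("diagnostics") or {}
    return results, diagnostics
```

The empty-dict fallback matters in practice: when one of the sources is down, the results come back empty, and code that loops over them without checking will fail.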
I’ll talk more about YQL in various talks in the near future, and there are even more interesting changes to the system itself around the corner. Stay alert for awesome updates.
Tags: api, data distribution, scrapi, scraping, screencast, webapi, ydn, yql


April 15th, 2009 at 3:18 pm
I was thinking along these lines a few weeks ago (actually, in relation to evolt.org) – why content manage & host everything yourself, when there are free(beer) services available to do that for you? All you need to do then is have a site that is just a shell to integrate it all and overlay your branding.
Of course the risk is lock-in, and the potential for those hosting/CM services to die…
April 15th, 2009 at 3:25 pm
The other benefit of doing it this way is what we’ve talked about for evolt.org – serving the content from multiple platforms, avoiding the “I can’t contribute because I don’t know Drupal/whatever” and letting people code a front end in whatever they’re most used to.
Having the content stored neutrally is the key to this, allowing interchangeable front-ends to deliver it, served round-robin from the multiple platforms, transparently to the user.
April 15th, 2009 at 11:04 pm
Seeing this on your start page: “Warning: Invalid argument supplied for foreach() in /nfs/c01/h02/mnt/4450/domains/icant.co.uk/html/index.php on line 45”
Displaying errors on a production server… You just lost one guru point!
April 15th, 2009 at 11:30 pm
@Lars interesting, doesn’t have an error here. Yeah, the server is not set up by me, that is mediatemple’s default. Should have a safeguard now though.