YQL is so the bomb to get web data as XML or JSON
Yesterday I wrote a blog post on YDN about opening the web, covering curl, Pipes and YQL, and today I did a more detailed deep-dive on Ajaxian about how YQL can help you convert the web to JSON.
Suffice it to say, I like YQL a lot: it is the command line interface to the web (and a text version of Yahoo Pipes). Go and play with it yourself.
As explained in the Ajaxian article, all the web services that don't require authentication can be accessed through a public REST API. Simply append your URL-encoded YQL statement to http://query.yahooapis.com/v1/public/yql?q=, then add a format=json parameter and a callback parameter with the name of your callback function, and you are set.
This would, for example, allow you to search for rabbit images on the web and display them quick and dirty with a few lines of JavaScript:
<div id="photos"></div>
<script type="text/javascript" charset="utf-8">
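// JSONP callback: YQL calls this function with the query response
function photos(o){
  var out = document.getElementById('photos');
  // Guard against a response without results (for example, no matches)
  var results = (o.query && o.query.results) ? o.query.results.result : [];
  var html = '';
  for(var i=0;i<results.length;i++){
    var cur = results[i];
    html += '<img src="'+cur.thumbnail_url+'" alt="'+cur.abstract+'">';
  }
  out.innerHTML = html;
}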
</script>
<script type="text/javascript" src="http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20search.images%20where%20query%3D%22rabbit%22%20and%20mimetype%20like%20%22%25jpeg%25%22&format=json&callback=photos"></script>
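If you'd rather not encode the statement by hand, the URL can be assembled with a few lines of JavaScript. This is only a minimal sketch: buildYqlUrl is a hypothetical helper name of mine, not part of YQL, and it assumes the public endpoint and the format and callback parameters described above.

```javascript
// Sketch: build a public YQL REST URL from a YQL statement.
// buildYqlUrl is a hypothetical helper, not something YQL provides.
function buildYqlUrl(statement, callbackName) {
  var base = 'http://query.yahooapis.com/v1/public/yql';
  // encodeURIComponent escapes spaces, quotes, = and % in the statement
  var url = base + '?q=' + encodeURIComponent(statement) + '&format=json';
  // The callback parameter is only needed for JSONP-style script includes
  if (callbackName) {
    url += '&callback=' + encodeURIComponent(callbackName);
  }
  return url;
}

var rabbitUrl = buildYqlUrl(
  'select * from search.images where query="rabbit" and mimetype like "%jpeg%"',
  'photos'
);
```

For the rabbit query this yields the same URL as the src attribute of the script tag above, so in a browser you could create a script element, set its src to rabbitUrl and append it to the document to trigger the photos callback.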
YQL allows you to access any freely available data service and even scrape HTML. How cool is that?



December 13th, 2008 at 12:51 pm
Why does the xpath query work only on html and not on xml or rss?
Pretty cool btw.
December 13th, 2008 at 3:33 pm
@Brian you don't need XPath there; all you need to do is use the SQL-style syntax. For example, the following only gets the title elements from the RSS feed of Ajaxian:
select title from rss where url="http://feeds.feedburner.com/ajaxian" limit 3
December 14th, 2008 at 1:01 pm
Yeah, I know that, I was just assuming that I could use XPath on any XML-based document. Though in most cases non-HTML documents are much like database tables, so you're right, using SQL on them makes much more sense.
One more question:
I have an Atom feed with a tag <atom:link rel="next" href="" /> that points to the next page of the feed. How could I tell YQL to include the next few pages too?
December 15th, 2008 at 9:08 pm
Hey Brian, paging with atom:link isn't supported yet, but it does sound like a good idea. You'd need some scripting to do this.
First, get all the links, which is possible by running the following query:
select href from xml where url="http://…/atomlink.xml" and itemPath="feed.link" and rel="next"
and then create a subselect with all the links in it, i.e.
select * from rss where url in ('http://result1', 'http://result2' …)
Thanks for the feedback on YQL
– Nagesh
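The scripting step suggested above could look roughly like the following. It is only a sketch: buildPagedQuery is a hypothetical helper name, and it assumes the href values from the first query have already been collected into a plain array of URL strings.

```javascript
// Sketch: turn a list of "next" links into the second query from the comment above.
// buildPagedQuery is a hypothetical helper; urls is assumed to hold the href values
// already extracted from the response of the first query.
function buildPagedQuery(urls) {
  if (!urls || urls.length === 0) {
    return null; // nothing to page through
  }
  var quoted = [];
  for (var i = 0; i < urls.length; i++) {
    // Percent-encode single quotes so a URL cannot end the quoted string early
    quoted.push("'" + String(urls[i]).replace(/'/g, '%27') + "'");
  }
  return 'select * from rss where url in (' + quoted.join(', ') + ')';
}

// The URLs are the placeholders used in the comment above, not real feeds
var pagedQuery = buildPagedQuery(['http://result1', 'http://result2']);
```

The resulting statement still has to be URL-encoded and sent to the public endpoint like any other YQL query.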
December 17th, 2008 at 7:40 pm
Hi! Thanks for the response, I think this will do it!
December 17th, 2008 at 10:13 pm
Can you SELECT COUNT… with this?