I tried out YQL.
YQL is a service that lets you fetch content on the Web with an SQL-like language. Pretty handy.
With a YQL query like this,
select * from html where url='http://movapic.com/fn7' and xpath = '//td[@class="image"]'
it scrapes the page for you and returns the result.
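For reference, here is a minimal sketch of how that query could be turned into a request URL in Python. It assumes the endpoint `http://query.yahooapis.com/v1/yql`, which is taken from the `yahoo:uri` attribute of the response; the function name `build_yql_url` is just made up for illustration, and the sketch only builds the URL without sending any request.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the yahoo:uri attribute of the response (assumption:
# this is the URL the query is sent to).
YQL_ENDPOINT = "http://query.yahooapis.com/v1/yql"


def build_yql_url(yql):
    """Return the endpoint URL with the YQL statement as the 'q' parameter."""
    # safe="*" leaves the asterisk unescaped; spaces become '+'.
    return YQL_ENDPOINT + "?" + urlencode({"q": yql}, safe="*")


query = (
    "select * from html where url='http://movapic.com/fn7' "
    "and xpath = '//td[@class=\"image\"]'"
)
print(build_yql_url(query))
```

The whole statement travels as a single URL-encoded `q` parameter, so the quotes and the XPath brackets have to be escaped rather than pasted in raw.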
Here is what comes back when you ask for XML.
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="10" yahoo:created="2009-12-01T12:09:37Z" yahoo:lang="en-US" yahoo:updated="2009-12-01T12:09:37Z" yahoo:uri="http://query.yahooapis.com/v1/yql?q=select+*+from+html+where+url%3D%27http%3A%2F%2Fmovapic.com%2Ffn7%27+and+xpath+%3D+%27%2F%2Ftd%5B%40class%3D%22image%22%5D%27">
  <diagnostics>
    <publiclyCallable>true</publiclyCallable>
    <url execution-time="3185" proxy="DEFAULT"><![CDATA[http://movapic.com/fn7]]></url>
    <user-time>3190</user-time>
    <service-time>3185</service-time>
    <build-version>3805</build-version>
  </diagnostics>
  <results>
    <td class="image" width="420px">
      <a href="/fn7/pic/915153">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281634014b10d26992285.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/914234">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281432114b10b5dbb78a0.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/913974">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281401304b10aeaa3806b.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/913872">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281348134b10ab8dac6f6.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/909355">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911271648264b0f844a6ae66.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/908490">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911271255154b0f4da313a7e.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/906882">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911262206244b0e7d5020d74.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/906233">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911262018514b0e641bc49ce.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/903066">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911252236134b0d32cd74280.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/889236">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911222143304b0931f269adc.jpeg"/>
      </a>
    </td>
  </results>
</query>
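Once you have that XML, pulling out the picture links is straightforward. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the `SAMPLE` string is a shortened copy of the response (two of the ten `<td>` entries, diagnostics and most `yahoo:` attributes dropped), and the function name `extract_images` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response: only two <td> results are kept.
SAMPLE = """<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng">
  <results>
    <td class="image" width="420px">
      <a href="/fn7/pic/915153">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281634014b10d26992285.jpeg"/>
      </a>
    </td>
    <td class="image" width="420px">
      <a href="/fn7/pic/914234">
        <img class="thumnail" src="http://image.movapic.com/pic/s_200911281432114b10b5dbb78a0.jpeg"/>
      </a>
    </td>
  </results>
</query>"""


def extract_images(xml_text):
    """Return (page link, thumbnail URL) pairs from a YQL html-table response."""
    root = ET.fromstring(xml_text)
    pairs = []
    # <results> and <td> carry no namespace, so plain paths are enough.
    for td in root.findall("./results/td"):
        link = td.find("a")
        img = td.find("a/img")
        if link is not None and img is not None:
            pairs.append((link.get("href"), img.get("src")))
    return pairs


for href, src in extract_images(SAMPLE):
    print(href, src)
```

Note that the `href` values are relative to `http://movapic.com`, while the `src` values are already absolute.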
I should definitely play with this more.