I tried out YQL.

YQL is a service that lets you fetch content from the web using an SQL-like language. Pretty handy.

Given a YQL query like this:

select * from html where url='http://movapic.com/fn7' and xpath = '//td[@class="image"]'

YQL scrapes the page for you and returns the matching elements.
Below is the result in XML format.

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="10" yahoo:created="2009-12-01T12:09:37Z" yahoo:lang="en-US" yahoo:updated="2009-12-01T12:09:37Z" yahoo:uri="http://query.yahooapis.com/v1/yql?q=select+*+from+html+where+url%3D%27http%3A%2F%2Fmovapic.com%2Ffn7%27+and+xpath+%3D+%27%2F%2Ftd%5B%40class%3D%22image%22%5D%27">
    <diagnostics>
        <publiclyCallable>true</publiclyCallable>
        <url execution-time="3185" proxy="DEFAULT"><![CDATA[http://movapic.com/fn7]]></url>
        <user-time>3190</user-time>
        <service-time>3185</service-time>
        <build-version>3805</build-version>
    </diagnostics>
    <results>
        <td class="image" width="420px">
            <a href="/fn7/pic/915153">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281634014b10d26992285.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/914234">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281432114b10b5dbb78a0.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/913974">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281401304b10aeaa3806b.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/913872">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281348134b10ab8dac6f6.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/909355">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911271648264b0f844a6ae66.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/908490">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911271255154b0f4da313a7e.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/906882">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911262206244b0e7d5020d74.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/906233">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911262018514b0e641bc49ce.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/903066">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911252236134b0d32cd74280.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/889236">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911222143304b0931f269adc.jpeg"/>
            </a>
        </td>
    </results>
</query>
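
As a rough sketch of how this could be used from a script, the Python below form-encodes the query into a request URL and pulls the link and thumbnail URL out of each td in a response. The endpoint is the one shown in the yahoo:uri attribute above. The names build_yql_url, extract_images, and SAMPLE are made up for illustration, and SAMPLE is a hand-trimmed copy of the response above (two entries, no diagnostics). The code only builds the URL and parses a string; it does not send a request.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

# Endpoint as it appears in the yahoo:uri attribute of the response above.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/yql"
PAGE_URL = "http://movapic.com/fn7"

QUERY = (
    "select * from html where url='http://movapic.com/fn7' "
    "and xpath = '//td[@class=\"image\"]'"
)

# Hand-trimmed copy of the response above (hypothetical sample data for this sketch).
SAMPLE = """<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="2">
    <results>
        <td class="image" width="420px">
            <a href="/fn7/pic/915153">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281634014b10d26992285.jpeg"/>
            </a>
        </td>
        <td class="image" width="420px">
            <a href="/fn7/pic/914234">
                <img class="thumnail" src="http://image.movapic.com/pic/s_200911281432114b10b5dbb78a0.jpeg"/>
            </a>
        </td>
    </results>
</query>"""


def build_yql_url(query):
    """Return the endpoint URL with the query form-encoded in the q parameter."""
    return YQL_ENDPOINT + "?" + urlencode({"q": query})


def extract_images(xml_text, base_url=PAGE_URL):
    """Return (absolute page link, thumbnail URL) pairs from a YQL XML response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for td in root.findall("./results/td"):
        link = td.find("a")
        img = td.find("a/img")
        if link is None or img is None:
            continue  # skip cells that do not have the expected <a><img/></a> shape
        # hrefs in the response are site-relative, so resolve them against the page URL
        pairs.append((urljoin(base_url, link.get("href")), img.get("src")))
    return pairs


if __name__ == "__main__":
    print(build_yql_url(QUERY))
    for page, thumb in extract_images(SAMPLE):
        print(page, thumb)
```

The hrefs in the results are relative to movapic.com, which is why the sketch resolves them against the page URL before returning them.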

I should definitely be using this more.