[img]http://bd7lx.iteye.com/upload/attachment/pic/2420/58acaa6a-bd8a-4661-bb32-46932f1f4e4b-thumb.jpg[/img]
:D
[img]http://bd7lx.iteye.com/upload/attachment/pic/2421/d7f401d5-5a9a-48a3-beb8-d37d901f89dc-thumb.jpg[/img]
What I actually want to talk about is soup: the beautiful Rubyful Soup, and Hpricot, both HTML parsers for Ruby.
http://www.crummy.com/software/BeautifulSoup/
Rubyful Soup 1.0.4 released February 1, 2006
http://www.crummy.com/software/RubyfulSoup/
http://code.whytheluckystiff.net/hpricot/
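Next I will explain how to use HTML parsing tools to scrape the content you want off a website. Stay tuned.
In the meantime, have a look at these related discussions: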
http://www.railscn.com/viewtopic.php?t=473
http://www.railscn.com/viewtopic.php?t=1038
http://www.rubyrailways.com/data-extraction-for-web-20-screen-scraping-in-rubyrails/
WWW::Mechanize, a handy web-browsing Ruby object, is also used for HTML parsing.
http://rubyforge.org/projects/mechanize/
[img]http://code.whytheluckystiff.net/hpricot/chrome/site/images/hpricot-small.png[/img]
Hpricot is fast at processing HTML, and it parses XML quite quickly too.
http://www.rubyinside.com/parse-xml-quickly-and-easily-with-hpricot-166.html
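To give a rough idea of what this kind of scraping looks like, here is a minimal sketch. It is not Hpricot code: Hpricot and Rubyful Soup both need a gem install, so this uses REXML, which ships with Ruby, as a stand-in. The sample page, the `news` id, and the helper names `scrape_links` and `page_title` are all made up for illustration. REXML also only accepts well-formed markup, whereas the parsers above are meant to cope with messy real-world HTML.

```ruby
require 'rexml/document'

# Hypothetical, well-formed sample page. A real page would be fetched over
# HTTP and would probably need a forgiving parser such as Hpricot.
SAMPLE_PAGE = <<~HTML
  <html>
    <head><title>Soup of the day</title></head>
    <body>
      <ul id="news">
        <li><a href="/posts/1">First post</a></li>
        <li><a href="/posts/2">Second post</a></li>
      </ul>
      <a href="/about">About</a>
    </body>
  </html>
HTML

# Returns [text, href] pairs for every link inside the element
# that carries the given id (illustrative helper, not a library API).
def scrape_links(markup, container_id)
  doc = REXML::Document.new(markup)
  links = []
  REXML::XPath.each(doc, "//*[@id='#{container_id}']//a") do |a|
    links << [a.text, a.attributes['href']]
  end
  links
end

# Returns the text of the first <title> element.
def page_title(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.first(doc, '//title').text
end

scrape_links(SAMPLE_PAGE, 'news').each do |text, href|
  puts "#{text} -> #{href}"
end
```

The pattern is the same whichever parser you pick: load the markup into a document object, select the elements you care about with a path or selector, then read their text and attributes.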
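Stealing gets addictive because it is just too easy. Today's latest news post:
Preliminary verdict:
Technical depth: one star. Amount of code: five stars. Article length: six stars.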
THE Unbelievably Easy Way to Steal Other Web Sites: Addictively Amazing!
http://web2withrubyonrails.gauldong.net/2006/11/02/the-unbelievably-easy-way-to-steal-other-web-sites-addictively-amazing/