
February 03, 2009

Comments

Marc

For Nokogiri:
sudo aptitude install libxml2
sudo aptitude install libxslt-dev


Feedzirra itself:
sudo gem install pauldix-feedzirra

After that I still got several errors, so:
sudo gem install curb
sudo gem install curl-multi


require "rubygems"
require "feedzirra"
feed = Feedzirra::Feed.fetch_and_parse("http://www.tvnzb.com/tvnzb_new.rss")

--> NameError: uninitialized constant Curl::Multi
from /var/lib/gems/1.8/gems/pauldix-feedzirra-0.0.1/lib/feedzirra/feed.rb:57:in `fetch_and_parse'

After that error I thought it might be a good idea to require "curl-multi":
require "curl-multi"
/var/lib/gems/1.8/gems/curl-multi-0.2/lib/curl-multi.rb: In function 'perform':
/var/lib/gems/1.8/gems/curl-multi-0.2/lib/curl-multi.rb:347: warning: call to '_curl_easy_getinfo_err_string' declared with attribute warning: curl_easy_getinfo expects a pointer to char * for this info
/var/lib/gems/1.8/gems/curl-multi-0.2/lib/curl-multi.rb:350: warning: call to '_curl_easy_getinfo_err_long' declared with attribute warning: curl_easy_getinfo expects a pointer to long for this info
/var/lib/gems/1.8/gems/curl-multi-0.2/lib/curl-multi.rb: In function 'add_to_curl':
/var/lib/gems/1.8/gems/curl-multi-0.2/lib/curl-multi.rb:248: warning: call to '_curl_easy_setopt_err_write_callback' declared with attribute warning: curl_easy_setopt expects a curl_write_callback argument for this option
=> true

But still:
feed = Feedzirra::Feed.fetch_and_parse("http://www.tvnzb.com/tvnzb_new.rss")
NoMethodError: undefined method `on_success' for #

Elad

Is it only a feed reader library? You can't build a feed with it?

Paul Dix

Marc,
You need to have libcurl installed. Also, it doesn't use curb or curl-multi. You must have the taf2-curb fork of curb. What were the errors thrown when you did a gem install pauldix-feedzirra?

Elad,
It's only a fetcher and parser. Generating feeds depends on what you're generating from. To generate you really only need a few lines of builder code. It's not something I'd even consider using a library for (other than builder or something like that).
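
As a rough illustration of how little code feed generation takes, here is a minimal sketch that emits an RSS 2.0 document. It uses plain string building rather than the builder gem so that it runs without dependencies. The `Post` struct, the `xml_escape` and `build_rss` helpers, and the sample data are all hypothetical.

```ruby
require "time"

# Hypothetical record type standing in for whatever you generate the feed from.
Post = Struct.new(:title, :url, :published_at, :summary)

# Escape the five XML special characters in element content.
def xml_escape(text)
  text.to_s.gsub(/[&<>"']/,
                 "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
                 '"' => "&quot;", "'" => "&apos;")
end

# Build a bare-bones RSS 2.0 document from a list of posts.
def build_rss(title, link, description, posts)
  lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "<channel>",
    "<title>#{xml_escape(title)}</title>",
    "<link>#{xml_escape(link)}</link>",
    "<description>#{xml_escape(description)}</description>"
  ]
  posts.each do |post|
    lines << "<item>"
    lines << "<title>#{xml_escape(post.title)}</title>"
    lines << "<link>#{xml_escape(post.url)}</link>"
    lines << "<pubDate>#{post.published_at.rfc2822}</pubDate>"
    lines << "<description>#{xml_escape(post.summary)}</description>"
    lines << "</item>"
  end
  lines << "</channel>" << "</rss>"
  lines.join("\n")
end

posts = [
  Post.new("Tom & Jerry", "http://example.com/1", Time.utc(2009, 2, 3), "First post")
]
xml = build_rss("Example feed", "http://example.com/", "A sample feed", posts)
puts xml
```

With the builder gem the same structure would be expressed as nested blocks, and escaping would be handled for you.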

Daniel Higginbotham

This looks great! I have a couple sites that need something like this.

Bryan Helmkamp

Congrats on the release, Paul. Those are some hot benchmarks.

julbouln

Looks great, but I can't get it working:

/Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `require'
from /Library/Ruby/Gems/1.8/gems/taf2-curb-0.2.3/ext/curb.rb:5
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `require'
from /Library/Ruby/Gems/1.8/gems/pauldix-feedzirra-0.0.1/lib/feedzirra.rb:5
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:36:in `gem_original_require'
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:36:in `require'
from test_feedzirra.rb:2

Is something missing in the gem?

Paul Dix

julbouln,
That error is probably due to Mac Ports. I just added a note to the installation instructions on the readme. If you have Mac Ports and you have curl installed through there, you need to remove it. If you're on Leopard then you're ready to go. Otherwise, download the latest curl and build from source.

If you're not using Mac Ports, then my guess is still that you have an older version of curl. Clean it out and get the latest.

julbouln

Thanks for the reply.
I have Mac Ports installed, but I'm using the Leopard /usr/bin/curl version.
I tested on a Linux box and got the same error:

usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)

julbouln

It seems that installing taf2-curb with gem doesn't actually compile the lib.
Doing it manually in the taf2-curb folder with

ruby ext/extconf.rb;make;make install

gets everything working fine!

Sean

Paul, this looks really promising. Back in July I set out to rewrite hordes of FeedTools to use libxml and generally have a cleaner implementation, but I quickly realized that was going to be a full-time job. Well done!

Paul

If you want to remove that dependency on curb, which in my experience causes nothing but trouble (as the comments here seem to indicate as well), might I humbly suggest our http client library, Resourceful. http://resourceful.rubyforge.org/

It will handle all the other things for you; the conditional get, redirects, proxies, etc... Take a look, I'm trying to drum up some more interest in it.
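
For readers unfamiliar with the term, a conditional GET sends validators remembered from an earlier response so the server can answer 304 Not Modified instead of resending the feed. The sketch below only builds such a request with Ruby's standard library; it is not Resourceful's or Feedzirra's API, and the `conditional_get_request` helper, URL, and ETag value are hypothetical.

```ruby
require "net/http"
require "time"
require "uri"

# Build a GET request that lets the server skip the body if nothing changed.
# `etag` and `last_modified` are values saved from a previous response.
def conditional_get_request(url, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(URI(url))
  request["If-None-Match"] = etag if etag
  request["If-Modified-Since"] = last_modified.httpdate if last_modified
  request
end

request = conditional_get_request(
  "http://example.com/feed.xml",
  etag: '"abc123"',
  last_modified: Time.utc(2009, 2, 3)
)

# Nothing is sent over the network here; we only inspect the headers.
puts request["If-None-Match"]
puts request["If-Modified-Since"]
```

When such a request is actually sent and the feed is unchanged, the server may reply with a 304 status and an empty body, which saves both bandwidth and parsing time.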

Josh Kim

You, sir, are awesome. I was just about to start writing my own because of the lack of choices out there, but now I'm following you on github. I'm sure I'll contact you again... hopefully with more praises.

Todd Fisher

Hi Paul,
Nice work on the feed library. I'm still working on porting curb to Ruby 1.9.1. Assuming I get some free time between work this week, I'm hoping to make a new release early next week.

-taf2

Todd Fisher

To everyone reporting issues installing curb, can you send me the results of running ruby ext/extconf.rb, inside of the failed gem build dir... You can send me a message on github.com
thanks,
taf2

Paul Dix

Hi Other Paul,
I would only remove the dependency on curb if I found something that was easier to install yet still kept the speed. I took a look at the Resourceful source on github and I see that it's using net/http. That's a deal breaker for me since I've written about net/http being too slow for my needs. Blocking IO and no keep-alive ruin the performance of any library built on top of it.

Hi Todd,
That's awesome that you're working on 1.9.1 compatibility. What about the issue of the gem not compiling on gem install? Is that some other issue on people's machines or just a quick fix to the gemspec? I'll put a note in the installation instructions to also let you know about curb problems. Let me know if I can help in any way.

Chris

Great stuff, I'm really excited! Only problem, I can't install it :( I get as far as installing curb:

$ sudo gem install taf2-curb

...

Makefile:137: warning: overriding commands for target `/usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext'
Makefile:135: warning: ignoring old commands for target `/usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext'
/usr/bin/install -c -m 644 ./curb.rb /usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext
/usr/bin/install: `./curb.rb' and `/usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext/curb.rb' are the same file
make: *** [/usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext/curb.rb] Error 1

Any ideas?

Paul Dix

Hi Chris,
You can try going into /usr/lib/ruby/gems/1.8/gems/taf2-curb-0.2.4/ext and running make. Otherwise, uninstall that gem and do a git clone git://github.com/taf2/curb.git, then run rake gem in the curb directory, then sudo gem install pkg/curb-0.2.4.0.gem

Please let me know if that works.

Chris

I successfully installed taf2-curb 0.2.6.1 by cloning and building locally. It doesn't help much though, since the feedzirra installation still fails when trying to build taf2-curb 0.2.4. Does it depend on 0.2.4 specifically?

Paul Dix

Hi Chris,
I'm looking at the taf2-curb gemspec and it says version 0.2.4. Even so, the Feedzirra gemspec requires taf2-curb >= 0.2.4 so I would expect higher versions to work. Can you paste in the exact error?
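
A quick way to check how a `>= 0.2.4` requirement treats a given version is to ask RubyGems directly. This is a small sketch using `Gem::Requirement` and `Gem::Version`; the list of versions is just the ones mentioned in this thread plus one older value for contrast.

```ruby
require "rubygems"

requirement = Gem::Requirement.new(">= 0.2.4")

results = %w[0.2.3 0.2.4 0.2.6.1].map do |version|
  [version, requirement.satisfied_by?(Gem::Version.new(version))]
end

results.each { |version, ok| puts "#{version}: #{ok}" }
# prints:
# 0.2.3: false
# 0.2.4: true
# 0.2.6.1: true
```

Note that this only compares version numbers. A requirement is also tied to a gem name, so a gem installed under a different name would not satisfy it regardless of version.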

Thibaut

Hi,

Your library is really great, and it came exactly when I needed it, thanks a lot!

I have just one suggestion. On an Atom feed, the last_modified attribute gets the value of the field whereas there is a field which seems more appropriate (at least it is on the Atom feed I have to parse frequently, because the lastBuildDate field changes every 5 minutes, yet there are no new items since the time referenced by the pubDate field).

Thibaut

Chris

Finally, I've got it installed :) I think the reason it failed the first time I built curb locally was that feedzirra wanted taf2-curb, but when I installed it locally it was called curb only. github gems isn't always so hot.

Will be playing around with feedzirra now, thanks for your help!

Sean Porter

I'm curious about this in the readme:

This thing needs to hammer on many different feeds in the wild. I’m sure there will be bugs. I want to find them and crush them. I didn’t bother using the test suite for feedparser. i wanted to start fresh.

Given that rfeedparser is just a port of feedparser, doesn't it make sense to start with that suite as it's uncovered all of the well-documented nastiness with RSS?

http://diveintomark.org/archives/2004/02/04/incompatible-rss

Paul Dix

Sean,
I'm actually not opposed to converting those tests to go against Feedzirra. I just didn't want to take the time to convert all those little edge cases. I thought I would get further by just hitting the main cases.

The other thing is that Feedzirra isn't trying to be exhaustive on the elements in each of the feed types it parses. Exactly the opposite, actually. Feedzirra only wants elements that are common to all of the feed types. The funny thing is that even that is a little loose since some feeds claim to be RSS version whatever, but leave out elements like pubDate (I'm looking at you, RubyForge release feed.)

Ultimately, if someone wants to convert those tests to run against Feedzirra, I'd definitely do a git pull.

Paul Dix

Hi Thibaut,
Last modified actually gets either the last published entry date or (if available) the last-modified from the response header. Some Atom feeds update the last modified (in the Atom response) when someone posts a new comment. I'm not quite sure how I should handle this. For now, I'm just keeping the single published date and updating the last_modified attribute on the Feed object with the response header. If the server doesn't include it then it's always just the last published entry date.
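
The selection rule described above can be sketched in a few lines. This is an illustration of the logic only, not Feedzirra's actual code: the `pick_last_modified` helper and its arguments (a plain headers hash and an array of entry publish times) are hypothetical.

```ruby
require "time"

# Prefer the Last-Modified response header when the server sends one;
# otherwise fall back to the most recent published entry date.
def pick_last_modified(response_headers, entry_dates)
  header = response_headers["Last-Modified"]
  if header
    begin
      return Time.httpdate(header)
    rescue ArgumentError
      # Malformed header value: fall through to the entry dates.
    end
  end
  entry_dates.compact.max
end

entries = [Time.utc(2009, 2, 1), Time.utc(2009, 2, 2), nil]

with_header = pick_last_modified(
  { "Last-Modified" => "Tue, 03 Feb 2009 12:00:00 GMT" }, entries
)
without_header = pick_last_modified({}, entries)

puts with_header.utc
puts without_header.utc
```

Entries without a publish date are skipped via `compact`, and a feed with neither a header nor any dated entries yields `nil`.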

A Nobody!

Paul, thank you for this gem. I was able to get it and all of its dependencies installed after some struggles with a missing package on my local machine.

To people having errors, you might want to review this post: http://ruby.zigzo.com/2009/02/15/feedzirra-installation-errors/

The comments to this entry are closed.
