
January 29, 2009

Comments

hgs

Did the servers you tested against support any forms of compression, and which one(s) were used at the time? If no compression was used, it seems fair to repeat the experiments with the various choices the library supports, because that is the point of them. Which version(s) of Ruby did you test with?

Paul Dix

I was just testing against my site (hosted by TypePad). My guess is that they support compression. However, I didn't specify that compression was OK in any of my headers. I may rerun the test with compression, but I doubt it will make a difference. I expect that the libcurl multi and EventMachine libraries will far outperform rfuzz and Net::HTTP when it comes to making multiple requests. That's more a function of their deferred pattern than of support for compression (see the reactor pattern).

I performed the tests on Ruby 1.8.7. Based on the post I linked to, my guess is that the differences would have been even more pronounced if I had been running 1.8.6. Not sure about 1.9. Even if it were slightly better, I think using libcurl is a better option. It's widely used and widely tested. People put it to good use in many languages.

hgs

I think proving that compression works has value because if many people use your reader then any advantage will be multiplied. See for example http://griffin.oobleyboo.com/archive/ruby-net-http-and-content-encoding-http_encoding_helper/ and http://www.codinghorror.com/blog/archives/000807.html although there are lots of others discussing the scalability of RSS. And this is optimisation, so getting the right benchmarks is part of making the right decision.

Paul Dix

I'll most certainly test out compression for my library. However, that will be a test of taf2-curb with compression and without compression. This test was more about selecting an HTTP client.

Decompression will happen outside the client, after the download has finished, via Zlib or something similar. The test I'll be running will determine whether the extra CPU used by decompression makes getting feeds faster or slower as a whole (my guess is compression=faster).
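To make the "decompress outside the client" step concrete, here is a minimal sketch using Ruby's standard Zlib. It is not Feedzirra's code: the feed body is made up, and `gunzip_body` is a hypothetical helper name. It assumes the server sent a gzip-encoded body; here the compression is done locally as a stand-in for the download.

```ruby
require "zlib"
require "stringio"

# Made-up feed body; a real test would use bytes downloaded from a server.
feed_xml = "<rss><channel>" + ("<item><title>Example</title></item>" * 200) + "</channel></rss>"

# Stand-in for what a server would send with Content-Encoding: gzip.
compressed = Zlib.gzip(feed_xml)

# Hypothetical helper: inflate a gzip-encoded response body after download.
def gunzip_body(body)
  Zlib::GzipReader.new(StringIO.new(body)).read
end

restored = gunzip_body(compressed)

puts "bytes on the wire: #{compressed.bytesize}, after inflating: #{restored.bytesize}"
```

Whether this is a net win depends on how the CPU time spent inflating compares with the transfer time saved, which is what the planned test would measure.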

John Nunemaker

Could you test against a local server to get more accurate results? Maybe something like an Apache serving files of different sizes on your own computer or on a spare one? Then you wouldn't have to worry about internet network differences. I don't know, just a thought.

Paul Dix

Testing against a local server would yield more accurate results for that type of benchmark (against a server that has a very quick response time). However, a realistic test would have to simulate latency and servers that have variable response times, which is a much more realistic scenario. Further, because of the deferred processing method of the EventMachine and libcurl multi methods, variable conditions are the ones in which they'll have an even bigger advantage over Net::HTTP and rfuzz. I realize that last statement is speculation since I don't have actual numbers, but I'm fairly comfortable making it because of the huge difference in performance on the test this post is about.

I think the differences between the libraries on a single request, or the differences between the EventMachine and libcurl multi options on many requests, could be attributed to variable network conditions. Actually, even the single-request performance characteristics would be greatly changed if the response is particularly large. The post I linked at the beginning details those problems.
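A rough illustration of the point about variable response times, with loud assumptions: the delays are made-up numbers, `fake_fetch` is a hypothetical stand-in that just sleeps, and plain threads are used in place of a reactor (EventMachine) or libcurl multi. It only shows the general effect that overlapping the waits costs roughly the slowest response, while fetching one at a time costs the sum.

```ruby
# Made-up response times in seconds, standing in for slow and fast servers.
delays = [0.05, 0.20, 0.10]

# Hypothetical stand-in for an HTTP fetch: just waits for the given time.
def fake_fetch(delay)
  sleep(delay)
  delay
end

# Wall-clock seconds taken by the given block.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# One request at a time: total is about the sum of the delays.
sequential = timed { delays.each { |d| fake_fetch(d) } }

# All requests in flight at once: total is about the longest delay.
concurrent = timed { delays.map { |d| Thread.new { fake_fetch(d) } }.each(&:join) }

puts format("sequential: %.2fs, concurrent: %.2fs", sequential, concurrent)
```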

Aman

Try http://github.com/igrigorik/em-http-request. It combines the power of EM with the robust HTTP parser bundled with mongrel.

Aaron Patterson

This test is pretty biased against net/http since libcurl uses keep-alive requests by default and net/http does not. :-(

I've submitted patches to Ruby 1.9 to make use of non-blocking requests. I need to fix some tests so that it gets rolled into the next release. I think that with non-blocking socket calls and keep-alive requests, net/http should be nearly as fast as curb.
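For anyone who wants to see the keep-alive difference in isolation, here is a small sketch. It is not the original benchmark: the server is a throwaway local TCP stand-in written only to count connections. It compares opening a new `Net::HTTP` session per request with reusing one session inside a single `Net::HTTP.start` block.

```ruby
require "net/http"
require "socket"

# Throwaway local HTTP/1.1 server that counts accepted TCP connections.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
connections = 0

Thread.new do
  loop do
    client = server.accept
    connections += 1
    Thread.new(client) do |sock|
      # Keep answering requests on this socket until the client closes it.
      while sock.gets # request line
        loop do       # skip headers up to the blank line
          header = sock.gets
          break if header.nil? || header == "\r\n"
        end
        sock.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: keep-alive\r\n\r\nok")
      end
      sock.close
    end
  end
end

host = "127.0.0.1"

# New session (and TCP connection) for every request.
# The nil third argument skips any proxy configured in the environment.
before = connections
3.times { Net::HTTP.start(host, port, nil) { |http| http.get("/") } }
separate = connections - before

# One session reused for all requests (keep-alive).
before = connections
Net::HTTP.start(host, port, nil) { |http| 3.times { http.get("/") } }
reused = connections - before

puts "connections opened: per-request=#{separate}, keep-alive=#{reused}"
```

On a real network each extra connection also pays for a TCP handshake, which is where the bias in a many-request benchmark would come from.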

Max Lapshin

Paul, curb is good, but EventMachine makes it possible to use several different pieces of functionality in one process, and that seems to be impossible with curb =(

I need to rewrite feeds (cache data locally), so I need to use thin + some feed parser. It seems that I'll have to rewrite your library to use it with EM. Perhaps a pluggable backend?

