
March 05, 2009

Comments

Jeff

This looks interesting. One of the things that concerns me about using all these libraries built on curl is that there doesn't seem to be a fakeweb ( http://github.com/chrisk/fakeweb/tree/master ) equivalent to make testing easier. So you're left with setting up local servers, etc.

Nevertheless this looks useful. I think some benchmarks of this against equivalent HTTParty code would be helpful both for showing differences in code implementation for wrapping services and also would highlight the speed boost that the curl libraries provide.

I'm still hoping for a curb-compatible fakeweb, but I think implementing it in the short-term is a bit over my head :/
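The fakeweb idea — register a canned response for a URI so tests never touch the network — can be sketched in plain Ruby. The names below (`StubRegistry`, `fetch`) are hypothetical illustrations, not part of fakeweb or curb:

```ruby
# Minimal sketch of the fakeweb idea: a registry of canned responses
# keyed by (HTTP method, URI). A curb-compatible version would hook
# the same lookup in front of the actual libcurl call.
class StubRegistry
  @responses = {}

  class << self
    # Register a canned body for an HTTP method + URI.
    def register_uri(method, uri, body)
      @responses[[method, uri]] = body
    end

    # Look up a canned response; nil means "hit the real network".
    def response_for(method, uri)
      @responses[[method, uri]]
    end
  end
end

# A client wrapper consults the registry before touching the network:
def fetch(uri)
  StubRegistry.response_for(:get, uri) or
    raise "no stub registered and real HTTP disabled in tests"
end

StubRegistry.register_uri(:get, "http://example.com/feed", "<rss/>")
puts fetch("http://example.com/feed")  # => "<rss/>"
```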

Paul Dix

That's a good point. I'll have to include a good testing framework as part of this whole thing.

Todd Fisher

Hi Paul,
This was exactly the use case I wrote evdispatch to resolve. Check it out; there's definitely room to improve it. http://evdispatch.rubyforge.org/

The simple idea is to have a per-process background POSIX thread running a libev loop, waiting for work to be signaled. Once a request comes into the queue, the libcurl multi interface is used to send it. This lets Ruby dispatch or queue work for the background POSIX thread to fetch without blocking the Ruby interpreter. At a later point, Ruby can block (or time out) waiting for all the concurrent requests to complete.
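That dispatch-then-collect shape can be sketched in plain Ruby with a background thread and two queues. This is only an illustration of the pattern — the real evdispatch uses a POSIX thread running libev plus libcurl's multi interface, and the `Dispatcher` name here is made up:

```ruby
require "thread"

# Sketch of the evdispatch pattern: a single background thread drains
# a work queue while the caller keeps running; the caller blocks only
# when it actually needs the results.
class Dispatcher
  def initialize
    @work    = Queue.new
    @results = Queue.new
    @worker  = Thread.new do
      while (job = @work.pop)
        @results << job.call  # stand-in for a libcurl fetch
      end
    end
  end

  # Non-blocking: enqueue work and return immediately.
  def request(&blk)
    @work << blk
  end

  # Blocking: wait until n results have come back.
  def wait_for(n)
    Array.new(n) { @results.pop }
  end
end

d = Dispatcher.new
3.times { |i| d.request { "response #{i}" } }  # returns immediately
puts d.wait_for(3).sort.inspect
```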

Todd Fisher

Sorry to double post, but I took a quick look at your HTTPMachine and see you're already using my curb fork. In that case, it's probably better to stick with curb... evdispatch was an experiment I developed before extending curb. evdispatch would in theory get better throughput, but in practice, unless you're making 1000s of service requests, curb will definitely work better... also, evdispatch has bugs that I haven't gone back to resolve...

Paul Dix

Yeah, I think curb & the libcurl multi interface is the best way to go. The one thing this doesn't do yet is perform POST, PUT, and DELETE in parallel. For the time being I'm ok with that, but if I find I need it later I might have to fork Curb to write that support in. In the meantime, thanks for all the great work on Curb!
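The multi interface fans several requests out concurrently and collects them as each finishes. A rough stdlib analogue of that behavior uses one thread per request (real curb multiplexes many transfers on a single thread via libcurl, so this only mimics the interface's shape, not its mechanics, and `simulated_get` is a made-up stand-in so the sketch runs without a network):

```ruby
require "thread"

# Stand-in for an HTTP GET so the sketch runs offline.
def simulated_get(url)
  "body of #{url}"
end

# Fire all requests concurrently, then gather the responses,
# returning a url => body hash.
def parallel_fetch(urls)
  urls.map { |u| Thread.new { [u, simulated_get(u)] } }
      .map(&:value)
      .to_h
end

responses = parallel_fetch(%w[http://a.example http://b.example])
puts responses["http://a.example"]  # => "body of http://a.example"
```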

Lourens

Paul,

Not sure if this makes sense, but I'm more in favor of ESI and/or Nginx SSI, as the ESI spec supports backend timeouts, expiry, etc.

Todd's the author of mongrel-esi. Then again, this isn't SOA, but a variation thereof that's perhaps more feasible with Ruby's typical multi-process (rather than multi-threaded) deployment model.

- Lourens

Josh Knowles

I understand what you're going for, but why HTTP? Seems like Thrift, Jabber, AMQP, etc. would be more fitting.

Pat Maddox

SOA def presents some special technical challenges in the Ruby world as you point out. You mentioned message queues, but there's a lot more to be said about that topic. One strategy is to subscribe to events from the different systems and then take the relevant data from them and stick it in your db. Much much faster since you're hitting a local db instead of an external service on each request. You also don't have to handle failure scenarios in your app code. Finally, it provides a natural seam for sanitizing and translating the data if necessary. Take a look at the anti-corruption layer section of Domain-Driven Design.
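The subscribe-and-localize strategy above can be sketched in a few lines: an event handler translates an external system's payload into the app's own vocabulary (the anti-corruption layer) before writing it to a local store, so request-time reads never touch the remote service. All the names here are hypothetical:

```ruby
LOCAL_DB = {}  # stand-in for the app's own database

# Anti-corruption layer: translate the external event's vocabulary
# (and sanitize its data) into our own record shape.
def translate(event)
  { id: event["uid"], name: event["display_name"].strip }
end

# Subscriber callback: runs when the other system publishes an event.
def handle_event(event)
  record = translate(event)       # sanitize at the seam
  LOCAL_DB[record[:id]] = record  # request-time reads hit this, not the service
end

handle_event("uid" => 42, "display_name" => "  Pat  ")
puts LOCAL_DB[42][:name]  # => "Pat"
```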


Paul Dix

One of the points of SOA is to not have everything going into a single DB. You partition out functionality early so you don't have the classic DB scaling problems later. On the idea of not hitting an external service: that's exactly what a DB is. It doesn't matter whether you hit only that one service or 1,000, as long as they all return within an acceptable amount of time to render the user's request.

Josh, on the issue of AMQP, the kind of SOA I'm describing is meant to be synchronous — things the user is waiting on. Think of a comment service, or a newsfeed service on your site. The user needs to see these on the pages they're viewing, and that's not something that gets queued. The writing to the data store can be queued, but pulling back the data to render the request needs to happen in real time. As for using Thrift or Jabber to do this kind of thing, it's definitely worth looking into.

