
July 05, 2007

Comments

Yan

Paul,

All storage supplied with your EC2 instance is transient, so if you run a database on it, you have to do something like ship write-ahead logs (in Postgres) to S3. Another thing I've heard of is this: http://www.openfount.com/blog/s3dfs-for-ec2 which lets you mount an S3 store as a filesystem under EC2, so presumably you could then use it transparently. I have no experience with either of these methods directly, but I have been reading up on them in an attempt to evaluate EC2 hosting. Hope that helps! See also: http://del.icio.us/skwp/ec2

Paul Dix

Yan, keeping the database on a mounted S3 store is exactly what I want to avoid. That's like keeping your database on NFS, which I would guess kills performance. The three options I have heard of are:
1. Run periodic backups of the DB to S3
2. Write the log files to S3 so you can restore
3. Create another EC2 node as a slave

I'm not worried about being able to persist the DB. Really, I'm just worried about whether the transient storage provided by EC2 is actual, directly connected hard disk storage. My understanding is that you can even reboot an EC2 node without losing the data in transient storage; you just can't kill the instance. At least, this is what I've read.
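
For illustration, the first option above (periodic backups of the DB to S3) might be sketched roughly as follows. This is only a sketch under assumptions: it assumes a Postgres database dumped with `pg_dump`, and the names `backup_key`, `dump_command`, `run_backup`, the `backups/` key prefix, and the `upload` callable are all hypothetical. The actual S3 upload is deliberately left as a function you pass in, since the thread does not say which S3 client would be used.

```python
import datetime
import subprocess
from pathlib import Path
from typing import Callable, List, Optional


def backup_key(db_name: str, now: datetime.datetime) -> str:
    """Build a timestamped object key (hypothetical naming scheme)."""
    return f"backups/{db_name}/{now.strftime('%Y%m%dT%H%M%SZ')}.sql"


def dump_command(db_name: str, out_path: Path) -> List[str]:
    """Return the pg_dump command that writes a plain SQL dump to out_path."""
    return ["pg_dump", f"--file={out_path}", db_name]


def run_backup(
    db_name: str,
    work_dir: str,
    upload: Callable[[Path, str], None],
    run: Callable[..., object] = subprocess.run,
    now: Optional[datetime.datetime] = None,
) -> str:
    """Dump the database to local (transient) disk, then hand the file to upload.

    upload(path, key) is an assumption: any function that copies the file
    to durable storage such as S3 under the given key.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    out_path = Path(work_dir) / f"{db_name}.sql"
    # check=True makes subprocess.run raise if pg_dump exits with an error,
    # so a failed dump is never uploaded.
    run(dump_command(db_name, out_path), check=True)
    key = backup_key(db_name, now)
    upload(out_path, key)
    return key
```

Something like this could be run from cron at whatever interval matches how much data you can afford to lose. Anything written between the last backup and the instance being killed would still be gone, which is why options 2 and 3 exist.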

