January 11, 2008

Comments

Jason

Since maintaining Thrift type declaration files is cumbersome, notably for dynamic languages and for evolving, versionable doc types, yet you still want to maintain cross-platform capabilities, why not use a structured text format for document representation? Something readily parseable and lightweight, such as JSON or YAML?

This would keep Thrift where it belongs, at the communication interface perimeter, while still keeping the persisted doc's representation cross-platform. Using Thrift to store temporary execution objects, as Jake has done with ThruQueue, is one thing, but Thrift-encoding long-lived, and presumably business-critical, entities pushes Thrift far too deep into the infrastructure, coupling your documents to it in perpetuity. It would be like persisting the docs in their serialized CORBA representation because CORBA is what's used for IPC at the outer edge.
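
To make the suggestion concrete, here is a minimal sketch of what persisting a doc as versioned JSON might look like. It is written in Python purely for illustration; the envelope layout, the `version` key, the `tags` default, and the function names `dump_doc` and `load_doc` are all assumptions of mine, not anything from the post or from Thrift.

```python
import json


def dump_doc(doc, version=1):
    """Serialize a document to JSON text inside a small versioned envelope.

    The envelope shape ({"version": ..., "doc": ...}) is a hypothetical
    convention chosen for this sketch.
    """
    envelope = {"version": version, "doc": doc}
    return json.dumps(envelope, sort_keys=True)


def load_doc(text):
    """Parse stored JSON text and upgrade older versions in place."""
    envelope = json.loads(text)
    doc = envelope["doc"]
    # Hypothetical schema evolution: version 2 introduced a "tags" field,
    # so older documents get an empty default instead of failing to load.
    if envelope["version"] < 2:
        doc.setdefault("tags", [])
    return doc


# Round trip: what gets persisted is plain text any platform can parse.
stored = dump_doc({"title": "Hello", "body": "World"})
restored = load_doc(stored)
```

The point of the sketch is that the stored form depends only on a JSON parser, so new fields can be added without regenerating type declarations, and the wire protocol can change without touching the persisted docs.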

