I will be speaking at the Gotham Ruby Conference (aka GoRuCo). With a great lineup of speakers, it's sure to be an informative event. I was about to suggest everyone grab a ticket, but I just noticed it's sold out!
Here's a brief description of my talk, Categorizing Documents in Ruby:
Text classification is the task of assigning a class or category to a document or block of text. The canonical example is using a Naive Bayes classifier to separate spam from non-spam email. Classifiers can also be used for language identification, for categorizing news articles or blog posts, and for detecting trackback, comment, and wiki spam, among other things. In my talk I will cover the basics of document classification, focusing on the tools available in Ruby for each aspect of classification.
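To make the spam vs. non-spam example a bit more concrete, here is a minimal sketch of a Naive Bayes classifier in plain Ruby. It is not taken from the talk or from any particular library: the class name `NaiveBayesSketch`, its `train` and `classify` methods, the crude regex tokenizer, and the toy training strings are all assumptions made for illustration. It uses add-one (Laplace) smoothing and sums log probabilities to avoid underflow.

```ruby
# Minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing.
# Every name here (NaiveBayesSketch, train, classify) is hypothetical.
class NaiveBayesSketch
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # category => word => count
    @doc_counts  = Hash.new(0)                             # category => documents seen
    @vocabulary  = {}                                      # every word seen in training
  end

  # Crude tokenizer (an assumption): lowercase, keep runs of letters and apostrophes.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end

  # Record one labeled document.
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each do |word|
      @word_counts[category][word] += 1
      @vocabulary[word] = true
    end
  end

  # Return the trained category with the highest log-posterior score.
  # Assumes at least one document has been trained.
  def classify(text)
    total_docs = @doc_counts.values.inject(0) { |sum, n| sum + n }
    words = tokenize(text)
    scores = @doc_counts.keys.map do |category|
      total_words = @word_counts[category].values.inject(0) { |sum, n| sum + n }
      # Start from the log prior: the share of training documents in this category.
      score = Math.log(@doc_counts[category].to_f / total_docs)
      words.each do |word|
        count = @word_counts[category][word]
        # Smoothed likelihood: unseen words get a small non-zero probability.
        score += Math.log((count + 1).to_f / (total_words + @vocabulary.size))
      end
      [category, score]
    end
    scores.max_by { |_, score| score }.first
  end
end

classifier = NaiveBayesSketch.new
classifier.train(:spam, "buy cheap pills now")
classifier.train(:spam, "cheap watches buy now")
classifier.train(:ham,  "meeting notes for the ruby conference")
classifier.train(:ham,  "ruby talk slides and notes")

puts classifier.classify("cheap pills")           # prints "spam"
puts classifier.classify("ruby conference notes") # prints "ham"
```

The tokenizer is the weakest part of a sketch like this. Real classifiers usually add stemming, stop-word removal, or n-grams, and need far more training data than four toy strings.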
I'm really excited to be given the chance to present on this topic. Many thanks to the GoRuCo organizers!
Neat! You should check out the squish classifier I wrote the other day. Melt-your-brain slow, but kinda cool because it avoids some of the problems with tokenization-based classifiers like Naive Bayes.
Posted by: Bob Aman | April 04, 2007 at 10:14 PM