On Wed, 07 Feb 2007 04:05:35 +0900, Jason Frankovitz wrote:

> Giles Bowkett wrote:
>>> First of all Giles and Ken, thanks for your answers. It sounds like a
>>> Bayesian approach won't work for what I want to do. This same gem has
>>> another classifier inside it, called Classifier::LSI which does latent
>>> semantic indexing. I don't know much about it yet other than it's not as
>>> fast or as small as a Bayesian classifier. However, would it be more
>>> suited to supporting a "none of the above" feature?
>>>
>>> Or would you recommend something entirely different?
>> 
>> Well, a latent semantic indexer is a whole different thing. I know of
>> a company that built a search engine with latent semantic analysis. If
>> you search it for naked pictures of Britney Spears -- just as a stupid
>> example -- it'll also ask you if you want to hear her music or if
>> you're interested in naked pictures of Lindsay Lohan as well. Latent
>> semantic indexers are a very smart technology but I think they require
>> **extremely** large data sets to be useful. They compare patterns of
>> linkage to identify things which must have some latent semantic
>> connection, that is to say, words that are different but mean similar
>> things. There are very few problems for which latent semantic analysis
>> **isn't** overkill.
>> 
> 
> Well, within the not-too-distant future, we'll be handling a sizable 
> dataset so LSI might make sense after all. This would be for a system 
> we're building that's doing something quite cool but I can't shout all 
> the details from the rooftops just yet :) Would it be all right for me 
> to give you specifics via email? I'd be happy to edit the Ruby-germane 
> portions of our offline conversation and post them back onto the forum. 
> My email is jason at seethroo dot us.

I suggest learning about machine learning techniques in general before you
try to do *anything* quite cool that you can't shout from the rooftops
just yet.

I recommend "Machine Learning" by Tom Mitchell[1].

--Ken 
[1] http://www.cs.cmu.edu/~tom/mlbook.html

-- 
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/