On Sat, 5 Jan 2008 08:09:22 +0900, Philip Hallstrom <ruby / philip.pjkh.com> wrote:
>> Can this get any more off topic?
>
> Yes.  Unless we make it the next ruby quiz to query imdb.com :)

Well, it shouldn't be too hard in principle -- they have complete pages of
just movie titles broken up by year and initial letter.

http://www.imdb.com/TitlesByYear?year=#{year}&start=#{initial}&nav=/Sections/Years/#{year}/include-titles

(Where initial is one of 'A'..'Z' or '*')

The main difficulty is normalizing titles where initial articles have been
moved to the end, and doing so in a language-insensitive way.

However, I think the nice people at imdb.com would frown on someone (let
alone lots of someones) mining hundreds of thousands of movie titles this
way, so I was wondering if there was a reasonably large corpus of titles
precompiled somewhere which we could use instead.

Of course, we could always just write the scripts anyway and pretend we had
a more accessible database of titles to work from.

-mental
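
P.S. For the "write the scripts anyway" route, here is a rough Ruby sketch of the two pieces mentioned above: filling in the URL template, and moving a trailing article back to the front of a title. The names titles_url, normalize_title, ARTICLES and ARTICLE_RE are made up for illustration, the "Title, The (1999)" shape is an assumption about how the listings present titles, and the article list is deliberately incomplete. It does no fetching at all, so it only shows the string handling.

```ruby
# Hypothetical helpers; none of these names come from imdb.com or any library.
INITIALS = ('A'..'Z').to_a + ['*']

# Trailing articles to move back to the front. Incomplete on purpose, and
# ambiguous across languages ("Die" is a German article but an English verb),
# which is exactly the language-independence problem.
ARTICLES = %w[The A An Le La Les L' Der Die Das El Los Las Il Lo Gli Un Une]

# Fill in the per-year, per-initial URL template.
def titles_url(year, initial)
  raise ArgumentError, "bad initial: #{initial.inspect}" unless INITIALS.include?(initial)
  "http://www.imdb.com/TitlesByYear?year=#{year}&start=#{initial}" \
    "&nav=/Sections/Years/#{year}/include-titles"
end

# Matches "Body, Article" with an optional trailing parenthesized part,
# e.g. "Matrix, The (1999)". The greedy body picks the last ", Article".
ARTICLE_RE = /\A(?<body>.+), (?<article>#{Regexp.union(ARTICLES).source})(?<rest>\s*\(.*\))?\z/

def normalize_title(title)
  title = title.strip
  m = ARTICLE_RE.match(title)
  return title unless m
  # Elided articles such as "L'" attach directly to the next word.
  sep = m[:article].end_with?("'") ? '' : ' '
  "#{m[:article]}#{sep}#{m[:body]}#{m[:rest]}"
end
```

So normalize_title("Matrix, The (1999)") gives "The Matrix (1999)", and a title with no trailing article comes back unchanged. A fixed article list like this will misfire on titles that merely end in a word that happens to be an article in some other language, so a real solution would probably need a per-title language hint.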