On Wed, 2005-06-01 at 19:00, M. Eteum wrote:
> Dear Ruby Guru:
>       Is there a way to identify a document from its header? I have a
> bunch of documents collected over the years from multi-platform systems
> (Mac, Windows, and various Unix/Linux variants), and some of the documents
> do not have a file extension. Is there a list that tells us what header
> to expect for certain documents, e.g. txt, rtf, pdf, jpg, mpg,
> word, excel, visio, etc.?
> 
> Thanks

Hello,

If you have the shared-mime-info database installed
( http://freedesktop.org/wiki/Software_2fshared_2dmime_2dinfo ),
you can use this: http://www.code-monkey.de/projects/mimeInfoRb.html
or my extended version: http://dark.fhtr.org/mime_info_rb.tar.gz

From the README:

 MimeInfo class provides an interface to query freedesktop.org's
 shared-mime-info database. It can be used to guess a filename's
 Mimetype and to get the description for the Mimetype.

   require 'mime_info'

   info = MimeInfo.get('foo.xml') #=> Mimetype['text/xml']
   info.description               
   #=> "eXtensible Markup Language document"
   info.description("de")         #=> "XML-Dokument"
   
   info2 = MimeInfo.get('foo.rb')     #=> Mimetype['application/x-ruby']
   info2.description                  #=> "Ruby script"
   info2.is_a? Mimetype['text/plain'] #=> true

   t = Mimetype['audio/x-mp3'] #=> Mimetype['audio/x-mp3']
   t.description               #=> "MP3 audio"
   t.description('cy')         #=> "Sain MP3"
   t.descriptions['fr']        #=> "audio MP3"
   t == Mimetype['audio']['x-mp3'] #=> true
   t.is_a? Mimetype['audio']       #=> true
   t.ancestors #=> [Mimetype['audio/x-mp3'], Mimetype['audio'], 
               #    Mimetype['application/octet-stream'], Mimetype, 
               #    Module, Object, Kernel]
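
The README examples above all pass a filename that has an extension; whether
MimeInfo.get also looks at file contents is not stated here. For files with
no extension, one option is to compare the first few bytes against well-known
signatures ("magic numbers"). Below is a minimal sketch of that idea in plain
Ruby. The names MAGIC_SIGNATURES, sniff_mime and sniff_file are hypothetical
and not part of mime_info, and the table covers only a handful of formats.

```ruby
# Hypothetical helper, NOT part of mime_info: guess a MIME type from the
# leading bytes of a file. Only a few well-known signatures are listed.
MAGIC_SIGNATURES = [
  ["%PDF-".b,                            "application/pdf"],
  ["\xFF\xD8\xFF".b,                     "image/jpeg"],
  ["\x89PNG\r\n\x1A\n".b,                "image/png"],
  ["GIF8".b,                             "image/gif"],
  ["{\\rtf".b,                           "application/rtf"],
  ["PK\x03\x04".b,                       "application/zip"],
  # OLE2 compound file: the container used by legacy Word, Excel and Visio.
  ["\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b, "application/x-ole-storage"],
].freeze

# Returns the MIME type whose signature prefixes +header+, or
# "application/octet-stream" when nothing matches.
def sniff_mime(header)
  bytes = header.b # compare as raw bytes, whatever the string's encoding
  _, type = MAGIC_SIGNATURES.find { |magic, _| bytes.start_with?(magic) }
  type || "application/octet-stream"
end

# Reads the first 16 bytes of the file at +path+ and sniffs them.
# IO#read(n) returns nil for an empty file, hence the fallback to "".
def sniff_file(path)
  File.open(path, "rb") { |f| sniff_mime(f.read(16) || "") }
end
```

Usage would be something like sniff_file('some_file_without_extension').
Note that legacy Word, Excel and Visio files all share the same OLE2 header,
so the header alone cannot tell them apart, and plain text has no signature
at all; it simply falls through to the default here.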


HTH,

Ilmari