url: http://dark.fhtr.org/repos/metadata
tarball: http://dark.fhtr.org/repos/metadata/metadata-0.1.tar.gz

Description
-----------

  The `Metadata' package contains a library called `metadata' and
  a small command-line program called `mdh'.

  The library probes files for their metadata (e.g. jpeg dimensions
  and camera make, mp3 artist, pdf word count) and returns the metadata
  as a Hash.

  The `mdh' program can print out file metadata as YAML and package
  the metadata with the file.

  This package has many dependencies since there is no single universal
  metadata header format that all files use. Blame resource forks, filename
  extensions, bags of bytes and mimetypes.

  The keys of the metadata hash mostly follow the naming used in the
  freedesktop.org shared-filemetadata-spec:
  http://wiki.freedesktop.org/wiki/Specifications/shared-filemetadata-spec

Usage
-----

  # print out metadata header
  mdh -p myfile.jpg

  # create myfile.jpg.mdh, which consists of metadata header + myfile.jpg
  mdh myfile.jpg

  # print out metadata header from mdh file
  mdh -e -p myfile.jpg.mdh

  # strip out metadata header from mdh file and save it to myfile.jpg
  mdh -e myfile.jpg.mdh

  # use the library from Ruby (irb session)
  irb> Metadata.extract('myfile.jpg')
  irb> Metadata.extract_text('myfile.pdf')
  irb> Pathname.new('myfile.jpg').metadata
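
  Since the library returns metadata as a Hash, the result can be read
  with ordinary Hash methods and dumped as YAML. The following is a
  minimal sketch, not output from the library: the hash is a stand-in
  for what Metadata.extract might return, and the key names, values and
  the `field' helper are hypothetical.

```ruby
require 'yaml'

# Stand-in for the Hash returned by Metadata.extract('myfile.jpg').
# The keys and values below are hypothetical examples, not real output.
metadata = {
  'File.Format'  => 'image/jpeg',
  'Image.Width'  => 1600,
  'Image.Height' => 1200,
  'Image.Make'   => 'ExampleCam'
}

# Look up a value with a fallback, since not every file has every field.
# (Hypothetical helper, not part of the metadata library.)
def field(metadata, key, default = 'unknown')
  metadata.fetch(key, default)
end

puts "#{field(metadata, 'Image.Width')}x#{field(metadata, 'Image.Height')}"
puts field(metadata, 'Audio.Artist')   # key is absent, prints the fallback

# Dump the whole hash as YAML, similar in spirit to `mdh -p`.
puts metadata.to_yaml
```

  Using fetch with a default avoids nil checks for fields that a given
  file type does not provide.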


Requirements
------------

  * Ruby 1.8

  * Many external metadata extraction programs.
    The Debian packages that provide them are:
      dcraw
      libimlib2-ruby
      extract
      libimage-exiftool-perl
      poppler-utils
      mplayer
      html2text
      imagemagick
      unhtml
      pstotext
      antiword
      catdoc
      shared-mime-info

  * Install the latest versions of dcraw and shared-mime-info
    to be able to handle camera raw images.
    http://cybercom.net/~dcoffin/dcraw/
    http://freedesktop.org/wiki/Software/shared-mime-info

  * Python + chardet library
    http://chardet.feedparser.org/
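
  With this many external programs, it can help to check which ones are
  installed before using the library. The sketch below searches PATH for
  a list of command names. The command names are an assumption about
  which executables the packages above provide; the library may call
  different ones. The `find_executable' helper is hypothetical and not
  part of this package.

```ruby
# Command names assumed to be provided by the packages listed above.
# This list is an assumption; adjust it to what the library really calls.
COMMANDS = %w[dcraw extract exiftool pdftotext mplayer html2text
              identify unhtml pstotext antiword catdoc python]

# Return the full path of `name` if an executable file with that name
# exists in one of the directories of `search_path`, otherwise nil.
def find_executable(name, search_path = ENV['PATH'].to_s)
  search_path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Collect the commands that could not be found and report them.
missing = COMMANDS.reject { |cmd| find_executable(cmd) }
if missing.empty?
  puts 'All checked commands were found.'
else
  puts "Not found on PATH: #{missing.join(', ')}"
end
```

  A missing command most likely means that metadata for the matching
  file types cannot be extracted.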

License
-------

  Same as Ruby's license.


Ilmari Heikkinen <ilmari.heikkinen gmail com>