Daniel Berger wrote:
> Brian Buckley wrote:
>> Hello all,
>>
>> Using Memoize gem 1.2.0, memoizing TO a file appears to be working
>> for me but subsequently reading that file (say, by rerunning the
>> same script) appears NOT to be working  (the fib(n) calls are being
>> run again). Inspecting the Memoize module I changed the line
>>
>> cache = Hash.new.update(Marshal.load(File.read(file))) rescue { }
>> to
>> cache = Hash.new.update(Marshal.load(File.read(file)))
>>
>> and it instead of silently failing I now see the error message: "in
>> `load': marshal data too short (ArgumentError)"
>>
>> My questions:
>> 1  What is causing this error? (possibly Windows related?)
>
> That is odd.  I've run it on Windows with no trouble in the past.  Is
> it possible you ran this program using 1.8.2, downloaded 1.8.4, then
> re-ran the same code using the same cache?  It would fail with that
> error if such is the case, since Marshal is not compatible between
> versions of Ruby - not even minor versions.
>
>> 2  What is the purpose of the rescue{} suppressing the error info in
>> the first place?
>
> The assumption (whoops!) was that if Hash.new.update failed it was
> because there was no cache (i.e. first run), so just return an empty
> hash.
>
>> 3  Instead of using Marshall would using yaml be a reasonable
>> alternative? (I am thinking of readability of the cache file and
>> also capability to pre-populate it)
>
> It will be slower, but it would work.

As you and others have pointed out this is lilely a problem caused by not
opening the file in binary mode.  IMHO lib code that uses Marshal should
ensure to open files in binary mode (regardless of platform).  Advantages
are twofold: we won't see these kind of erros (i.e. it's cross platform)
and documentation (you know from reading the code that the file is
expected to contain binary data).

Also, the line looks a bit strange to me.  Creating a new hash and
updating it with a hash read from disk seems superfluous.  I'd rather do
something like this:

cache = File.open(file, "rb") {|io| Marshal.load(io)} rescue {}

Marshal.load and Marshal.dump can actually read from and write to an IO
object.  This seems most efficient because the file contents do not have
read into mem before demarshalling and it's fail safe the same way as the
old impl.

Kind regards

    robert