Issue #8795 has been updated by mml (McClain Looney).


nobu (Nobuyoshi Nakada) wrote:
> How was this data generated?
> Using 3rd party's library or something?

The inputs are mostly very badly malformed XML, though there's some malformed CSV and JSON in there as well :)  Other than that, the marshal strings were generated in the traditional way (Marshal.dump(obj)).  A few other objects triggered this issue as well; I'll verify whether all of them were Time objects.
----------------------------------------
Bug #8795: "Null byte in string error" on Marshal.load
https://bugs.ruby-lang.org/issues/8795#change-41222

Author: mml (McClain Looney)
Status: Closed
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: core
Target version: 
ruby -v: ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


I have about 2M serialized Ruby objects in a database (don't ask). The objects are serialized via Marshal.dump, then zipped, then base64 encoded before being saved.  After upgrading to 2.0 (built from source), a tiny minority (3-4) of the objects stored this way fail to Marshal.load, with "ArgumentError: Null byte in string".
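
For reference, the storage pipeline described above can be sketched roughly as follows. This is only an illustration, not the actual code: the helper names encode_record and decode_record are made up, and it assumes "zipped" means a plain Zlib deflate stream, which may not match the real setup.

```ruby
require "zlib"

# Hypothetical helper: Marshal.dump, then deflate, then base64.
# Array#pack("m0") emits strict base64 (no line breaks).
def encode_record(obj)
  [Zlib::Deflate.deflate(Marshal.dump(obj))].pack("m0")
end

# Hypothetical helper: the reverse path, ending in Marshal.load.
def decode_record(text)
  Marshal.load(Zlib::Inflate.inflate(text.unpack1("m0")))
end

# Made-up sample record containing a Time, since Time objects
# are what the thread suspects.
sample   = { "name" => "example", "created_at" => Time.at(0).utc }
stored   = encode_record(sample)
restored = decode_record(stored)
```

A record that survives this round trip on one Ruby version but fails in decode_record on another would point at Marshal rather than at the compression or encoding steps.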

Given that the other 1.9M objects load just fine, that the issue never manifested in 1.8.7 MRI, and that the zip CRCs and such were not corrupted, I suspect there may be some subtle bug in the Marshal.dump code.

Please see the attached file for a sample.  I'd be happy if there were any way to "fix" the broken string, or even any insight into what might be going on, but I'm more worried that there may be a Marshal bug lurking.
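
To narrow down which stored records are affected, something like the sketch below could be run over the already decoded and unzipped byte strings. The helpers try_load and marshal_version are hypothetical names, and the truncated string only stands in for a failing record; it does not reproduce the "Null byte in string" error itself.

```ruby
# Hypothetical helper: attempt Marshal.load and report the outcome
# instead of raising, so a batch scan can continue past bad records.
def try_load(bytes)
  [:ok, Marshal.load(bytes)]
rescue ArgumentError, TypeError => e
  [:error, "#{e.class}: #{e.message}"]
end

# The first two bytes of a dump are the format's major and minor version.
def marshal_version(bytes)
  bytes.unpack("C2")
end

good = Marshal.dump([1, 2, 3])
bad  = good[0...-2]   # truncated on purpose to force a load failure

p marshal_version(good)
p try_load(good)
p try_load(bad).first
```

Collecting the error message together with the record's id would make it easy to check whether every failing record contains a Time object.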


-- 
http://bugs.ruby-lang.org/