Brian Candler wrote:
> Greg Willits wrote:
>>> I don't think anyone has answered this question directly but on POSIX-
>>> like file systems a seek past the end of the file and a subsequent
>>> write will cause the intervening bytes (which have never been written)
>>> to read as zeros. Whether those 'holes' occupy disk space or not is
>>> implementation dependent.
>> 
>> 
>> If indeed this is a fact (and it's consistent with my observation), 
>> then I'd say it's worth taking advantage of. I can't find a definitive 
>> reference to cite though (Pickaxe, The Ruby Way).
> 
> Well, those aren't POSIX references. But from "Advanced Programming in 
> the UNIX Environment" by the late great Richard Stevens, pub. 
> Addison-Wesley, p53:
> 
> "`lseek` only records the current file offset within the kernel - it 
> does not cause any I/O to take place. This offset is then used by the 
> next read or write operation.
> 
> The file's offset can get greater than the file's current size, in which 
> case the next `write` to the file will extend the file. This is referred 
> to as creating a hole in a file and is allowed. Any bytes in a file that 
> have not been written are read back as 0."


I see, you guys are saying it's an OS-level detail, not a Ruby-specific 
detail.

It seems though that any hole in the file must be written to. Otherwise 
the file format itself must keep track of every byte that it has written 
to or not in order to have a write-nothing / read-as-zero capability. 
This would seem to be very inefficient overhead.
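
One way to probe this from Ruby, as a rough sketch: on filesystems that 
support sparse files, it is the filesystem (not the file format) that tracks 
which blocks are allocated, and File::Stat#blocks reports that count where 
the platform provides it. The helper name hole_stats and the gap size are 
made up for illustration, and whether the hole actually stays unallocated is 
implementation dependent.

```ruby
require "tempfile"

# Hypothetical helper: create a file with a hole by seeking past the end
# and writing a short payload. Returns [logical_size, allocated_blocks].
# allocated_blocks may be nil on platforms that do not report it.
def hole_stats(gap = 1_000_000, payload = "end")
  Tempfile.create("hole") do |f|
    f.binmode
    f.seek(gap, IO::SEEK_SET)  # only moves the offset, no I/O happens yet
    f.write(payload)           # this write is what extends the file
    f.flush
    st = File.stat(f.path)
    [st.size, st.blocks]
  end
end

size, blocks = hole_stats
puts "logical size:     #{size} bytes"
puts "allocated blocks: #{blocks.inspect}"
```

If the block count comes out far smaller than the logical size would need, 
the hole was not physically written; if not, the filesystem filled it in.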

Hmm... duh, I can bust out the hex editor and have a look.

<pause>

OK, well, the empty bytes created by extending the file size of a new file 
are 0.chr, not an ASCII zero character (at least according to the 
hex editor app). That could simply be the absence of data from virgin 
disk space. I suppose that absence of data could be interpreted however 
the app wants, so the hex editor says it is 0.chr and the POSIX code 
says it is 48.chr.
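
For what it's worth, the "0" in the Stevens quote appears to mean the byte 
value zero rather than the character "0", so a read should agree with the 
hex editor. A minimal sketch to check that, where read_hole is a made-up 
helper name and the 16-byte gap is arbitrary:

```ruby
require "tempfile"

# Hypothetical helper: write one byte at offset `gap` in an empty file,
# then read back the `gap` bytes in front of it (the hole).
def read_hole(gap = 16)
  Tempfile.create("hole") do |f|
    f.binmode
    f.seek(gap, IO::SEEK_SET)
    f.write("X")
    f.rewind
    f.read(gap)
  end
end

hole = read_hole
p hole.bytes.uniq                     # distinct byte values found in the hole
p hole.bytes.all? { |b| b == 0 }      # true if every byte is 0.chr (NUL)
p hole.bytes.any? { |b| b == 48 }     # true if any byte is 48.chr ("0")
```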

Still though, since the file isn't being filled with the data that is 
provided by the read-back, that still confuses me. How does the read 
know to convert those particular NULL values into ASCII zeros vs a NULL 
byte I write on purpose? And it still doesn't really confirm what would 
happen when non-virgin disk space is being written to.
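
On the "how does the read know" part: as far as I can tell it doesn't need 
to, because a NUL written on purpose and a byte from a hole read back 
identically. The POSIX description of lseek also says reads of the gap 
return bytes with the value 0 until data is actually written there, so old 
disk contents should not show through. A small sketch (explicit_nul_vs_hole 
is a made-up name) comparing the two cases:

```ruby
require "tempfile"

# Hypothetical helper: byte 0 is a NUL written on purpose, bytes 1..7 are
# never written (a hole), byte 8 is "Z".
# Returns [explicit_byte, hole_byte, total_size].
def explicit_nul_vs_hole
  Tempfile.create("nul") do |f|
    f.binmode
    f.write("\0")              # deliberate NUL at offset 0
    f.seek(8, IO::SEEK_SET)    # skip offsets 1..7 without writing them
    f.write("Z")
    f.rewind
    data = f.read
    [data.getbyte(0), data.getbyte(1), data.bytesize]
  end
end

explicit, from_hole, size = explicit_nul_vs_hole
puts "explicit NUL: #{explicit}, hole byte: #{from_hole}, size: #{size}"
```

If an application has to tell "never written" apart from "written as NUL", 
it would need to record that itself, since the read does not.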

Hrrmmm. :-\

Thanks for the discussion so far.

-- gw

-- 
Posted via http://www.ruby-forum.com/.