Brian Candler wrote:

>> It seems though that any hole in the file must be written to. Otherwise 
>> the file format itself must keep track of every byte that it has written 
>> to or not in order to have a write-nothing / read-as-zero capability. 

>Unless you seek over entire blocks, in which case the filesystem can 
>create a "sparse" file with entirely missing blocks (i.e. the disk usage 
>reported by du can be much less than the file size)

>When you read any of these blocks, you will see all zero bytes.

OK. But the file system doesn't keep track of anything smaller than a 
block, right? So it's not keeping track of the individual sub-block 
holes created each time I seek past the end of the file (?).
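
A minimal sketch of how the block-level behaviour could be observed 
from Ruby, assuming a filesystem that supports sparse files and a 
platform where File::Stat#blocks is available (it returns nil where it 
is not). The helper name sparse_stats and the 64 MB gap are arbitrary 
choices for illustration:

```ruby
require "tempfile"

# Hypothetical helper: seek `hole_bytes` past the start of an empty file,
# write a single byte, and report apparent size versus allocated blocks.
def sparse_stats(hole_bytes)
  Tempfile.create("sparse") do |f|
    f.binmode
    f.seek(hole_bytes, IO::SEEK_SET)  # jump over the region without writing it
    f.write("x")
    f.flush
    st = File.stat(f.path)
    { size: st.size, blocks: st.blocks }
  end
end

stats = sparse_stats(64 * 1024 * 1024)
puts "apparent size: #{stats[:size]} bytes"
puts "allocated:     #{stats[:blocks].inspect} blocks (st_blocks, commonly 512-byte units)"
```

If the filesystem skipped the untouched blocks, the allocated figure 
should come out far smaller than the apparent size, which is the same 
gap that du versus ls -l would show.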


> No, POSIX says it is a zero byte (character \0, \x00, byte value 0, 
> binary 00000000, ASCII NUL, however you want to think of it)

Doh! My zeros are coming from a step in my process which includes 
converting this particular data chunk to integers which I was 
forgetting. And nil.to_i will generate a zero. So, my bad; that detail 
is cleared up.
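
For the record, the conversion that fooled me boils down to something 
like this (the sample values are made up for illustration, not my 
actual data):

```ruby
# A missing field shows up as nil, and nil.to_i quietly becomes 0,
# which looks exactly like a real zero read from the file.
values = ["12", nil, "7"]
ints = values.map(&:to_i)
p ints
```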

The only thing I'm still not real clear on is....

 - file X gets written to disk block 999 -- the data is a stream of 200 
contiguous "A" characters

 - file X gets deleted (which AFAIK only deletes the directory entry, 
and does not null-out the file data unless the OS has been told to do 
just that with a "secure delete" operation)

 - file Y gets written to disk block 999 -- the data has holes in it 
from extending the seek position

Generally, I wouldn't read in the holes, but I have this one little step 
that does end up with some holes, and I know it. What I don't know is 
what to expect in those holes: null values, or garbage "A" characters 
left over from file X?

Logically I would expect garbage data, but a literal reading of the 
paragraphs quoted earlier from the Unix book indicates I should expect 
null values. I can't think of any tools I have that would enable me to 
test this.
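
That said, a rough check of the hole contents could probably be done 
from Ruby itself. This is only a sketch: the helper name read_hole and 
the 10,000-byte gap are arbitrary, and it does not reproduce the 
"reused block 999" scenario, since there is no way from here to force 
the filesystem to hand back a previously used block. It only shows what 
a read of the unwritten region returns on the machine it runs on:

```ruby
require "tempfile"

# Hypothetical helper: write one byte, seek `gap` bytes past it without
# writing, write another byte, then return the bytes in between.
def read_hole(gap)
  Tempfile.create("hole_test") do |f|
    f.binmode
    f.write("A")
    f.seek(gap, IO::SEEK_CUR)  # skip `gap` bytes; nothing is written here
    f.write("B")
    f.flush
    f.rewind
    data = f.read
    data[1, gap]               # the region that was never written
  end
end

hole = read_hole(10_000)
puts "hole length: #{hole.bytesize}"
puts "all zero bytes? #{hole.bytes.all?(&:zero?)}"
```

If the POSIX statement quoted above holds, the second line should 
report true no matter what used to live in those disk blocks.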

Because I don't know, I've gone ahead and packed the holes with a known 
character. However, I'd like to avoid that if I can, because it costs 
extra time on large files, though it's not super critical.

At this point I'm more curious than anything. I appreciate the dialog.

-- gw


-- 
Posted via http://www.ruby-forum.com/.