On 06.07.2009 00:13, Greg Willits wrote:

> Generally, I wouldn't read in the holes, but I have this one little step 
> that does end up with some holes, and I know it. What I don't know is 
> what to expect in those holes: null values, or garbage "A" characters 
> left over from file X.
> 
> Logically I would expect garbage data, but the literal impact of 
> paragraphs quoted earlier from the Unix book above indicates I should 
> expect null values. I can't think of any tools I have that would enable 
> me to test this.

I would not rely on any particular content in those bytes, for the simple 
reason that doing so reduces the portability of your program.  If anything, 
the whole discussion has shown that there are (or were) different 
approaches to handling this (including returning old data, which should 
not happen any more nowadays).
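
If you just want to see what your own system does, here is a minimal 
sketch.  It writes one byte, seeks past the end of the file, writes another 
byte and reads back the gap.  The method name read_hole and the gap size 
are made up for illustration.  Whatever it prints only tells you about this 
OS and file system, so it is not something to build a portable program on.

```ruby
require "tempfile"

# Hypothetical helper: creates a file with a gap of `gap` bytes between
# two written bytes and returns whatever the OS hands back for that gap.
def read_hole(gap = 4096)
  Tempfile.create("hole") do |f|
    f.binmode
    f.write("A")
    f.seek(gap, IO::SEEK_CUR)  # skip ahead without writing anything
    f.write("Z")
    f.flush
    f.seek(1, IO::SEEK_SET)    # back to the first byte of the gap
    f.read(gap)
  end
end

hole = read_hole
# Show the distinct byte values found in the gap on this system.
puts hole.bytes.uniq.inspect
```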

> Because I don't know, I've gone ahead and packed the holes with a known 
> character. However, if I can avoid that I want to because it sucks up 
> some time I'd like to avoid in large files, but it's not super critical.
> 
> At this point I'm more curious than anything. I appreciate the dialog.

I stick to the point I made earlier: if you need particular data to be 
present in the slack of your records, you need to make sure it is there. 
Since your IO is done block-wise, and you have probably aligned your 
offsets with block boundaries anyway, there should not be a noticeable 
difference in IO.  You probably need a bit more CPU time to generate that 
data, but that is probably negligible in light of the disk IO overhead.
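
A rough sketch of what I mean by making sure the data is there: pad every 
record to the full slot size with a known byte before writing it, so each 
write covers the whole slot.  RECORD_SIZE, PAD_BYTE and pad_record are 
names I made up for this example; pick whatever fits your format.

```ruby
require "tempfile"

RECORD_SIZE = 64     # assumed slot size, for illustration only
PAD_BYTE    = "\0".b # the known filler byte

# Returns the record as a binary string of exactly `size` bytes,
# filled up with `pad`.  Raises if the data does not fit.
def pad_record(data, size = RECORD_SIZE, pad = PAD_BYTE)
  bytes = data.b
  if bytes.bytesize > size
    raise ArgumentError, "record too long (#{bytes.bytesize} > #{size})"
  end
  bytes + pad * (size - bytes.bytesize)
end

Tempfile.create("records") do |f|
  f.binmode
  %w[alpha beta gamma].each { |rec| f.write(pad_record(rec)) }
  f.flush
  puts f.size  # three full slots, no unwritten bytes in between
end
```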

If you want to save yourself that effort you should probably make sure 
that your record format allows for easy separation of the data and slack 
area.  There are various well-established practices, for example 
preceding the data area with a length indicator or terminating data with 
a special marker byte.
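
To illustrate the length indicator variant, here is a small sketch.  The 
slot size, the two-byte big-endian header and the names encode_slot and 
decode_slot are assumptions for the example, not anything from your code. 
The reader only looks at the bytes the header says are valid, so it does 
not matter what the slack contains.

```ruby
SLOT_SIZE     = 64   # assumed slot size, for illustration only
LENGTH_FORMAT = "n"  # 16-bit unsigned integer, big-endian
HEADER_SIZE   = 2    # bytes occupied by the length indicator

# Returns the length header followed by the data.  The slack behind it
# is simply not written.
def encode_slot(data, slot_size = SLOT_SIZE)
  bytes = data.b
  max = slot_size - HEADER_SIZE
  if bytes.bytesize > max
    raise ArgumentError, "data too long (#{bytes.bytesize} > #{max})"
  end
  [bytes.bytesize].pack(LENGTH_FORMAT) + bytes
end

# Extracts the data from a slot, ignoring anything after it.
def decode_slot(slot)
  length = slot.unpack1(LENGTH_FORMAT)
  slot.byteslice(HEADER_SIZE, length)
end

# Simulate a slot whose slack holds leftovers from older data.
slot = encode_slot("hello") + "A" * 20
puts decode_slot(slot)
```

With a terminating marker byte the idea is the same, except that you 
have to pick a byte that can never occur in the data itself.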

My 0.02 EUR.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/