Robert Klemme wrote:
>>     # Assume offsets is a pre-computed array of non-negative
>>     # integer positions into the String originalstr.
>>     
> Care to unveil a bit of the nature of the computation that yields those
> indexes?
>   
It's not really relevant to the question I was asking - but I've no 
problem saying more.  I have a domain-specific (order-preserving and 
extensible) 'type-system' which is imposed over otherwise opaque data 
structures.  Given an instance of a 'type-signature' and a pointer, it 
is possible to determine the number of bytes which represent each 'typed 
value' - and (significantly) list construction drops out as being the 
concatenation of the value representations and type-signatures.  The 
type-signatures range in complexity from the simplest constant 'N bytes 
interpreted as a natural number', through sentinel encodings 
(null-terminated strings on steroids), to (in principle - if not 
frequently in practice) arbitrary computation ranging over named integer 
values occurring 'earlier' in the list.

At the moment I'm toying with the idea that I can memory-map the values 
(using a C-implemented module) and do the computations on the mapped 
values in Ruby - having presented the opaque values and 'type-signatures' 
to Ruby as String objects.  I expect that typical computations may 
involve matching regular expressions, doing arithmetic, computing 
various hashes and summations, etc.  For now I'm concentrating on 
establishing whether Ruby is a suitable tool for the task at hand.
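
To make the idea concrete, here is a minimal sketch of how offsets might 
be derived from a signature.  The signature encoding (:fixed and 
:sentinel pairs) and the helper name field_offsets are invented purely 
for illustration; they are not the real format, and the 'arbitrary 
computation' case is left out entirely.

```ruby
# Hypothetical signature elements, invented for this sketch:
#   [:fixed, n]         -> the value occupies exactly n bytes
#   [:sentinel, marker] -> the value runs up to and including marker
def field_offsets(data, signature)
  offsets = [0]
  signature.each do |kind, arg|
    start = offsets.last
    case kind
    when :fixed
      offsets << start + arg
    when :sentinel
      stop = data.index(arg, start)
      raise ArgumentError, "missing sentinel" if stop.nil?
      offsets << stop + arg.bytesize
    else
      raise ArgumentError, "unknown signature element #{kind.inspect}"
    end
  end
  offsets
end

# A 2-byte natural number, a null-terminated string, then a 1-byte number.
# String#b gives a binary copy so that indexes count bytes, not characters.
data      = "\x00\x2Ahello\x00\x01".b
signature = [[:fixed, 2], [:sentinel, "\x00".b], [:fixed, 1]]

offsets = field_offsets(data, signature)
number  = data[0, 2].unpack("n").first  # big-endian 16-bit natural number
```

The resulting offsets array starts at 0 and ends at data.bytesize, which 
is the shape assumed by the field-splitting loop quoted below.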
>>     # with offsets[0]==0 and offsets[-1]==@originalstr.size
>>     @fields=Array.new(offsets.size-1)
>>     for i in 1...offsets.size do
>>       # I assume this next line is what is meant by a Ruby sub-string?
>>       # (exclusive range, so adjacent fields do not overlap)
>>       @fields[i-1]=@originalstr[offsets[i-1]...offsets[i]]
>>     end
>>
>> .. and, provided that @fields is exposed only as a read-only
>> attribute, can I assume that the memory it consumes is independent
>> of the length of originalstr and dependent only upon numfields?
>>     
> You can help keep this read-only by freezing all strings involved.
>   
Yes - that sounds a good idea to me.
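
A small sketch of what that might look like, assuming the fields are cut 
out with exclusive ranges.  The sample string and offsets are made up 
for illustration only.

```ruby
original = "alpha,beta,gamma".freeze
offsets  = [0, 6, 11, 16]

# Freeze each field and the containing Array, so that neither the
# sub-strings nor the list of fields can be modified afterwards.
fields = offsets.each_cons(2).map { |from, to| original[from...to].freeze }.freeze

begin
  fields[0] << "!"        # attempt an in-place modification
  mutated = true
rescue RuntimeError       # FrozenError is a RuntimeError on newer Rubies
  mutated = false
end
```

Freezing only guards against in-place modification; it says nothing 
about how much memory the sub-strings occupy.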
>> While I've no reason to doubt this confirmed answer, by any chance can
>> someone suggest a good way to demonstrate that this is the case
>> without resorting to either using very large strings and looking at
>> VM usage of the interpreter process... or resorting to reviewing the
>> source to Ruby's implementation?
>>     
> The only additional method of verification that comes to mind is to ask
> Matz. :-)
>   
Hmmm - a lack of profiling tools might prove something of a stumbling 
block... I'll need to have a careful think about that.  I'm not looking 
to check up on fellow Rubyists; I want to check periodically that I've 
made no invalid assumptions as I work forwards from this basis towards 
an implementation.  I don't want to find out, only after I think I've 
finished, that a resource leak or extravagant resource demands will 
require a rewrite before the software can be used against real data.
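
One possible spot-check, assuming an interpreter that ships the objspace 
extension (not every Ruby version or implementation does): 
ObjectSpace.memsize_of reports roughly how many bytes an object 
occupies.  Its result is documented as a hint only, and whether a given 
sub-string shares its parent's buffer is an implementation detail, so 
the sketch below just prints the figures rather than assuming any 
particular outcome.  The sizes and offsets are arbitrary.

```ruby
require 'objspace'

original = ("x" * 1_000_000).freeze
offsets  = [0, 250_000, 500_000, 1_000_000]
fields   = offsets.each_cons(2).map { |from, to| original[from...to].freeze }

# Reported sizes are implementation-specific hints, not exact accounting.
original_size = ObjectSpace.memsize_of(original)
field_sizes   = fields.map { |f| ObjectSpace.memsize_of(f) }

puts "original: #{original_size} bytes reported"
fields.each_with_index do |f, i|
  puts "field #{i}: length #{f.length}, #{field_sizes[i]} bytes reported"
end
```

Repeating this with different lengths for original, while keeping the 
number of fields fixed, would show whether the reported field sizes 
track the length of the parent string on a given interpreter.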

Steve