In article <op.upklh9q19245dp / kool>,
  "Michael Selig" <michael.selig / fs.com.au> writes:

> In more detail: I have a legacy system that uses fixed length fields. Yes,
> a name is variable length, but some old systems use a fixed length field,
> say 40 chars, which is space filled on the right (or truncated). In my
> case, the data input is by a form, and each field is fixed width. I am
> changing the system so that the SAME forms can be used, but extended to
> use UTF-8 not just ASCII. So this means that the number of characters is
> still fixed, but the number of bytes is no longer fixed. I do *not* want
> to change the format of the file (though it probably should be, but that
> would be a lot more work), because I want the application to be backward
> compatible (when using ASCII data).

This is what I'd like to hear. Thank you for the explanation.

It seems the number, 40, means "big enough for names".

Why don't you use a 40-byte data format with both Ruby 1.8 and 1.9?
Do you think 40 bytes is not big enough for names in some countries?

If the data format uses 40 bytes instead of 40 chars, it is easy to
read in Ruby 1.8, even if it contains UTF-8 chars.

Also, in-place update and seek may be possible, although I know you
don't need them.

> Also I think there are other cases when applications which used to use
> IO#read to read a fixed length ASCII string will need to be changed to
> instead read the same fixed length but in chars. Currently the only way to
> do this in Ruby is to use a loop I believe.

I'd like to hear an actual example.

> Also it seems to me that the current usage of the "limit" parameter of
> IO#gets is not intuitive in 1.9. It is "maximum number of bytes, but don't
> split a character", and I think it should be changed to mean "maximum
> number of chars". That would be much more obvious, more useful (IMHO), and
> still be backward compatible with 1.8.

It was introduced for security reasons. Bytes are more stable than
characters.
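A rough sketch of the 40-byte idea, assuming Ruby 1.9 or later and UTF-8
data. The names FIELD_BYTES, pack_field and read_field are hypothetical
helpers made up for this illustration, not part of Ruby or of the legacy
system. Truncating at a character boundary is also an assumption about
what the application would want.

```ruby
require "stringio"

FIELD_BYTES = 40  # field width in bytes (the "40" from the thread)

# Hypothetical helper: space-pad a string on the right to exactly
# `width` bytes, truncating at a character boundary if it is too long.
def pack_field(str, width = FIELD_BYTES)
  out = String.new.force_encoding("UTF-8")
  str.encode("UTF-8").each_char do |ch|
    break if out.bytesize + ch.bytesize > width
    out << ch
  end
  out << " " * (width - out.bytesize)
end

# Hypothetical helper: read one fixed-width field (counted in bytes)
# and strip the padding on the right. Returns nil at end of file.
def read_field(io, width = FIELD_BYTES)
  raw = io.read(width)           # IO#read(n) counts bytes, not chars
  return nil if raw.nil?
  raw.force_encoding("UTF-8").rstrip
end

# Two records, one of them non-ASCII; each occupies 40 bytes.
io = StringIO.new(pack_field("Jos\u00e9") + pack_field("Akira"))
first  = read_field(io)          # "José"
second = read_field(io)          # "Akira"

# Because every field has the same byte width, seek works too.
io.seek(1 * FIELD_BYTES)
again = read_field(io)           # "Akira"
```

With pure ASCII data the file is byte-for-byte the same as the old
format, so backward compatibility is kept. The cost is that a name made
of multibyte characters holds fewer than 40 characters.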
-- Tanaka Akira