On 8/7/09, Vít Ondruch <v.ondruch / tiscali.cz> wrote:
> file, but I would like to see something in following manner:
>
> String.new 'zufällige_žluťouký', Encoding.CP852

You seem to be asking for the ability to have individual string
literals have encoding different from that of the program as a whole.
Why not this:

#encoding: ascii-8bit
'zufällige_žluťouký'.force_encoding 'cp852'
'some utf8 data'.force_encoding 'utf-8'
'some sjis data'.force_encoding 'sjis'

I am far from an expert on encodings, but in my (admittedly minimalist
and perhaps inadequate) testing, this seems to basically work.
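
For what it's worth, here is a minimal sketch of what force_encoding
does to such a string. The bytes are written as escapes (an assumption
made so the snippet stays plain ASCII and needs no magic comment); in
the real file they would simply sit between the quotes. The variable
names are made up for illustration.

```ruby
# UTF-8 bytes of a Czech word with diacritics, held as a binary string,
# much as a literal would be under "#encoding: ascii-8bit".
raw = "\xC5\xBElu\xC5\xA5ouk\xC3\xBD".b

raw.encoding   # Encoding::BINARY, also known as ASCII-8BIT
raw.length     # 11, because every byte counts as one character

# force_encoding only changes the label; the bytes are left alone.
utf8 = raw.dup.force_encoding('utf-8')
utf8.bytes == raw.bytes   # true
utf8.valid_encoding?      # true
utf8.length               # 8, now counted as UTF-8 characters

# Nothing is validated at relabel time, so a wrong label goes unnoticed
# until you ask.
wrong = raw.dup.force_encoding('us-ascii')
wrong.valid_encoding?     # false
```

So the relabelling itself never fails; calling valid_encoding?
afterwards is a cheap way to catch a literal tagged with the wrong
encoding.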

There are going to be holes in this; data in non-ASCII-compatible
encodings in particular may give trouble. However, if the string data
does not contain the bytes 0x27 (ASCII ') or 0x5C (ASCII \), there will
be no problem. Whether that holds for a particular encoding and the
data you need to represent in it can't be known in general, but it is
surely very often the case. If it's the single quote character that
causes the problem, you can switch to a different delimiter using the
%q[] quote syntax. In extremis, a single-quoted here document may be
called for:

  <<-'end'
    lotsa ' and \ here, but ruby don't care
  end

This form of string has the advantage of having no special characters
at all, and you can choose the sequence of bytes that makes up the
string terminator to be anything you want. (but you do end up with an
extra (ascii) newline at the end...)
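
To make the quoting options above a bit more concrete, here is a rough
sketch. plain_single_quotable? is a hypothetical helper, not anything
built into Ruby, and the sample data is my own: the Shift_JIS katakana
"so" is encoded as 0x83 0x5C, so it trips over the backslash byte.

```ruby
# Hypothetical helper: true when the raw bytes could be pasted between
# single quotes as-is, i.e. they contain neither 0x27 (') nor 0x5C (\).
def plain_single_quotable?(data)
  data.each_byte.none? { |byte| byte == 0x27 || byte == 0x5C }
end

ascii_word = 'plain ascii'.b
sjis_so    = "\x83\x5C".b   # Shift_JIS katakana "so"; second byte is 0x5C

plain_single_quotable?(ascii_word)   # true
plain_single_quotable?(sjis_so)      # false

# %q[] frees up the single quote; the brackets and \ are what is left
# to watch for.
quoted = %q[it's fine here]

# Single-quoted here document; the terminator (EOS here) is arbitrary.
doc = <<-'EOS'
    lotsa ' and \ here
  EOS

doc.end_with?("\n")   # true: the extra newline
doc.chomp             # the same text with that newline removed
```

Note that <<- only lets the terminator be indented; the leading
whitespace on the content lines stays in the string.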

Another challenge will be editing this file. I doubt there's an editor
out there that could actually display this kind of thing correctly; you'll
have to become proficient at editing it as binary, or at least find an
editor that can tolerate arbitrary binary chars in its ascii.