On 6/27/06, Austin Ziegler <halostatue / gmail.com> wrote:
>
> With an m17n String, he will need to have something else that isn't
> compatible with Java Strings, which hurts JRuby's use as a Java glue
> language. I think that there are ways around this. Maybe make the JRuby
> String class have an internal something like:
>
>   class JRuby.String
>   {
>         private Java.Lang.String        unicode;
>         private ByteVector                      m17n;
>         private Java.Lang.String        encoding;
>         private bool                            isUnicode;
>   }
>

This would certainly be an option once matz has solved all the hard problems
of an encoding-free String. Some minimal testing of a byte[]-based UTF-8
replacement for Java's String has shown very few general performance issues
from reimplementing strings on a different data structure (a testament to
Java's JIT, since most Java code runs faster without native bits). When
there's something concrete in the m17n plan, we shouldn't have much
difficulty supporting it. We could also run with pure Unicode internally,
for folks who don't need any Unicode-incompatible encodings. Without the
m17n code ready for general consumption, it's hard to say which path will be
best.
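
To make the byte[]-based idea concrete, here is a minimal sketch of what such
a string might look like. The class name Utf8String and its methods are
hypothetical, made up for illustration; this is not JRuby's code or the
replacement that was tested, and it assumes a JDK that provides
java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: an immutable string stored as UTF-8 bytes
// instead of Java's UTF-16 char[].
final class Utf8String {
    private final byte[] bytes;

    Utf8String(String s) {
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    // Storage size of the text in bytes.
    int byteLength() {
        return bytes.length;
    }

    // Number of Unicode code points: count every byte that is not a
    // UTF-8 continuation byte (continuation bytes look like 10xxxxxx).
    int codePointLength() {
        int n = 0;
        for (byte b : bytes) {
            if ((b & 0xC0) != 0x80) {
                n++;
            }
        }
        return n;
    }

    // Bridge back to a Java String for use as glue with Java APIs.
    String toJavaString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The trade-off shows up in codePointLength(): character-oriented operations
have to scan the bytes, while byte-oriented operations such as IO get the
data as-is.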

The other advantage of a byte[]- or ByteVector-based JRuby string is for IO.
Currently we use Java's StringBuffer for handling mutable string operations.
This works well, but StringBuffer maintains a char[] internally, and a Java
char is 16 bits, so for every byte of IO we waste a byte. We're considering
various options to improve that, and the end result may be closer to the
UberString than to Java's own.
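
The wasted byte can be illustrated with a small sketch. IoBufferCost and its
two methods are hypothetical names used only here. The sketch counts payload
storage as 2 bytes per char, which matches the char[]-backed StringBuffer
described above; newer JVMs may store Latin-1-only text more compactly, so
treat the numbers as a model rather than a measurement.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch comparing payload storage for IO data held in a
// char-based buffer versus a byte-based buffer.
final class IoBufferCost {
    // Payload bytes when each IO byte is widened to a 16-bit char.
    static int charBackedBytes(byte[] io) {
        StringBuffer sb = new StringBuffer();
        for (byte b : io) {
            sb.append((char) (b & 0xFF)); // one char per input byte
        }
        return sb.length() * Character.BYTES; // Character.BYTES is 2
    }

    // Payload bytes when the IO bytes are kept as bytes.
    static int byteBackedBytes(byte[] io) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(io, 0, io.length);
        return out.size();
    }
}
```

For 11 bytes of ASCII input, the char-based count is 22 and the byte-based
count is 11, which is the one wasted byte per byte of IO.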

So yes, there's some ulterior motive in my support for pure Unicode and
ByteArray, but any path taken will be implementable in JRuby. I support
those because I feel they simplify rather than complicate, not because they
might be easier to implement in Java.

-- 
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com
