Matz,

Is Mojikyo a superset of Unicode? If not, how hard is the translation to
UCS-4?

I designed the UCS-4 string class we use here in C++, with a UTF-8 storage
format (up to 31-bit with a six-byte UTF-8 sequence). The string class
remembers which character you last accessed and at what byte offset it
started, so that when you ask for another character, it can decide whether
heuristically to search forward from the start, forward or backward from
the remembered point (most common), or if it has ever counted the
characters, backward from the end. This minimises the search cost since
most string processing is largely sequential.

With the "remembered point" feature, I think UTF-8 has been a good tradeoff,
so much so that although I implemented the class using a pure interface and
a factory to allow alternate formats, we haven't needed to do it.

BTW, re "style", I like the definition I heard from a fashion figure:
"quirkiness with confidence". I guess the definition doesn't hold so well
for software though :-).

--
Clifford Heath