On 9/1/05, Morgan <taria / the-arc.net> wrote:
> Jacob Fugal wrote:
> >A solution, similar to that employed by ncurses and many other UI
> >systems, is to use the concept of an extended character. Each
> >character in the string is flagged with applicable attributes.
> >Translating marked up ASCII to a list of extended characters is easy
> >enough: maintain a bitmask of attributes and turn them on/off as you
> >encounter tags; apply the current bitmask to each character
> >encountered.
> 
> Seems reasonable, but I do wonder how efficient the process would be.
> Something like that on each character makes me nervous.

Hence my disclaimer. :) I'm just demonstrating that it's possible;
making it efficient will be a big project.
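
For concreteness, here is a rough sketch of the decode step described
in the quoted text above. The names ATTRS and ExtChar, the bit values,
and the tag-stack handling of </C> are my assumptions for illustration,
not a finished design, and it makes no attempt at being fast:

```ruby
# Hypothetical attribute bits; names and values are illustrative only.
ATTRS = { "red" => 0b01, "blue" => 0b10 }.freeze

# An "extended character": one character plus its attribute bitmask.
ExtChar = Struct.new(:char, :flags)

# Decode marked-up ASCII into an array of ExtChar objects.
# Opening tags turn a bit on; </C> restores the mask that was in
# effect before the matching opening tag (assumed behavior).
def decode(markup)
  flags = 0
  stack = []
  out = []
  markup.scan(/<C (\w+)>|(<\/C>)|(.)/m) do |open_name, close_tag, ch|
    if open_name
      stack.push(flags)
      flags |= ATTRS.fetch(open_name)
    elsif close_tag
      flags = stack.pop || 0
    else
      out << ExtChar.new(ch, flags)
    end
  end
  out
end

chars = decode("A <C red>red</C> and <C blue>blue</C> baseball bat.")
plain = chars.map(&:char).join   # the text a regex would be run against
```

Matching against plain gives offsets that index directly into chars,
since there is exactly one extended character per plain character.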

> >As an example application, your string would decode as follows:
> >
> >something = decode("A <C red>red</C> and <C blue>blue</C> baseball bat.")
> ># => A, ' ', r|red, e|red, d|red, ' ', a, n, d, ' ', b|blue, l|blue,
> >u|blue, e|blue, ' ', b, a, s,  ...
> >
> >The regex /red and blue/ would match this substring
> ># r|red, e|red, d|red, ' ', a, n, d, ' ', b|blue, l|blue, u|blue, e|blue
> >
> >That substring is replaced with the substring (since it wasn't encoded):
> ># o, s, t, r, i, c, h
> 
> And then you'd get a bug report about how your replace is dropping color...
> 
> At least, the behavior I would expect is to preserve the existing coloration.
> (Of course, then I specifically picked an example where there's no obvious
> sensible way to do that. Still, I think keeping the color would generally be
> expected.)

Well, one possible workaround for this is to pass initial state when
decoding raw ASCII. However, as you state, this example is ambiguous.
Assume we replaced the shorter substring "d and bl" with the raw
"ostrich", using this technique. We'd end up with something like:

"A <C red>reostrich</C><C blue>ue</C> baseball bat."

This is because the raw string "ostrich" was passed char.flags as its
initial flags (where char is the extended char object for the 'd' in
red, the first character in the replaced substring) rather than an
empty mask.
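
A minimal sketch of that workaround, under the same caveats as before:
ATTRS, ExtChar, encode and replace_raw are hypothetical names of mine,
and decode here takes an optional initial mask as its second argument.

```ruby
# Hypothetical attribute bits; names and values are illustrative only.
ATTRS = { "red" => 0b01, "blue" => 0b10 }.freeze
ExtChar = Struct.new(:char, :flags)

# Decode marked-up ASCII, starting from initial_flags instead of an
# empty mask.
def decode(markup, initial_flags = 0)
  flags = initial_flags
  stack = []
  out = []
  markup.scan(/<C (\w+)>|(<\/C>)|(.)/m) do |open_name, close_tag, ch|
    if open_name
      stack.push(flags)
      flags |= ATTRS.fetch(open_name)
    elsif close_tag
      flags = stack.pop || initial_flags
    else
      out << ExtChar.new(ch, flags)
    end
  end
  out
end

# Turn extended characters back into markup, one tag per run of
# characters that share the same mask.
def encode(chars)
  chars.chunk_while { |a, b| a.flags == b.flags }.map do |run|
    text = run.map(&:char).join
    names = ATTRS.select { |_, bit| (run.first.flags & bit) != 0 }.keys
    names.reduce(text) { |inner, name| "<C #{name}>#{inner}</C>" }
  end.join
end

# Replace the first match of pattern with raw text, decoding the raw
# text with the flags of the first replaced character.
def replace_raw(chars, pattern, raw)
  m = pattern.match(chars.map(&:char).join)
  return chars unless m
  start, stop = m.begin(0), m.end(0)
  inherited = chars[start] ? chars[start].flags : 0
  chars[0...start] + decode(raw, inherited) + chars[stop..-1]
end

chars = decode("A <C red>red</C> and <C blue>blue</C> baseball bat.")
puts encode(replace_raw(chars, /d and bl/, "ostrich"))
```

Taking the flags from the first replaced character is only one policy;
the last replaced character, or an empty mask, would be just as easy
to plug in, which is really the ambiguity you pointed out.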

Jacob Fugal