On 11/13/06, Miquel Oliete <ktalanet / yahoo.es> wrote:
> Hi all
>
> How can I convert from UTF-8 to HTML ampersand entities and from HTML
> ampersand entities to UTF-8? (I've searched a lot but I can't find it.)
>
> Thanks in advance

UTF-8 to HTML conversion is trivial.
HTML to UTF-8 is almost trivial; you just need to decide which set of
&-entities to support. This example handles only &#xHEX; and &amp;,
but it should be obvious how to extend it to other entities (if you
want to do so).

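class String
    # Escape "&" and replace every non-ASCII character with &#xHEX;.
    def utf8_to_html
        gsub(/([^\000-\177])|(&)/u) {
            if $2
                "&amp;"
            else
                # $1 is a single UTF-8 character; unpack("U") gives its code point.
                sprintf("&#x%x;", $1.unpack("U")[0])
            end
        }
    end
    # Decode &#xHEX; and &amp; only; anything else is left untouched.
    def html_to_utf8
        gsub(/&(?:#x([0-9a-fA-F]+)|(amp));/) {
            if $2
                "&"
            else
                # Turn the hex code point back into a UTF-8 character.
                [$1.hex].pack("U")
            end
        }
    end
end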
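
In case it is not so obvious, here is a rough sketch of one way the
decoding side might be extended to decimal references and a few named
entities. The names decode_entities and NAMED_ENTITIES are made up for
this example, the table is deliberately tiny, and there is no range
check on the code points, so treat it as a starting point only.

```ruby
# Hypothetical lookup table; add whatever named entities you need.
NAMED_ENTITIES = {
  "amp"  => "&",
  "lt"   => "<",
  "gt"   => ">",
  "quot" => '"',
  "nbsp" => [0xA0].pack("U"),
}

# Decode &#xHEX;, &#DEC; and the names listed in NAMED_ENTITIES.
# Unknown names are returned unchanged.
def decode_entities(str)
  str.gsub(/&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));/) do
    whole, hex, dec, name = $&, $1, $2, $3
    if hex
      [hex.hex].pack("U")
    elsif dec
      [dec.to_i].pack("U")
    else
      NAMED_ENTITIES.fetch(name, whole)
    end
  end
end
```

Everything is decoded in a single pass, so "&amp;lt;" comes out as
"&lt;" and is not decoded a second time.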

-- 
Tomasz Wegrzanowski [ http://t-a-w.blogspot.com/ ]