From: "HarryO" <harryo / zipworld.com.au>
>
> I want to do something similar to what I've seen on slashdot, where the
> user is allowed to enter HTML for responses to articles, but the set of
> tags that are acceptable is only a subset of full HTML.
> 
> Now, rather than trying to do something like edit the string to remove
> any matches of UNacceptable tags, I'd prefer to be able to define which
> ones ARE acceptable and then edit the string to remove anything else
> that looks like a tag but doesn't match the ones I've decided are OK.

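Here's one approach: keep a hash of the allowed tag names, then use
gsub with a block to delete any tag whose name isn't in the hash.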


Regards,

Bill

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/usr/local/bin/ruby -w

require 'runit/testcase'
require 'runit/cui/testrunner'

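# Extend this list as needed.  (A comment can't go inside %w{ },
# since '#' and the words after it would become list elements.)
AllowedTagsList = %w{
    a b body em font head i p pre table td tr
}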

AllowedTags = { }
AllowedTagsList.each {|e| AllowedTags[e] = true }


def zap_disallowed_tags(str)
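    # Match opening and closing tags, capturing the tag name in $1.
    # Tags whose lowercased name is in AllowedTags are kept verbatim
    # (attributes included); all other tags are replaced with "".
    # note: this regexp doesn't handle possible '>' literals embedded
    # in quoted attribute values . . .
    str.gsub(/<\/?(\w+)[^>]*>/) {
        AllowedTags[$1.downcase] ? $& : ""
    }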
end


class TestZap < RUNIT::TestCase
    def testZap
        {
            'helloworld' => 'helloworld',
            'hello<q href="spammy">world</q>' => 'helloworld',
            'one <em>two</em> <buckle>shoe</buckle>' => 'one <em>two</em> shoe'
        }.each {|k,v| assert zap_disallowed_tags(k) == v }
    end
end

RUNIT::CUI::TestRunner.run(TestZap.suite)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
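
The comment in zap_disallowed_tags notes that a '>' inside a quoted
attribute value will end the match early. One possible way around that,
offered as a rough sketch rather than a tested replacement, is to let
the attribute part of the regexp consume whole quoted strings. The names
ALLOWED_TAGS, TAG_RE and zap_disallowed_tags_quoted are made up for this
example, and the tag list is only a sample.

```ruby
require 'set'

# Hypothetical sample allowlist; not the full list from the script above.
ALLOWED_TAGS = Set.new(%w[a b em i p pre])

# An opening or closing tag.  After the tag name, the attribute part is
# any run of: a double-quoted string, a single-quoted string, or a single
# character that is neither a quote nor '>'.  A '>' inside quotes is
# therefore consumed as part of the tag instead of ending it.
TAG_RE = %r{</?(\w+)(?:"[^"]*"|'[^']*'|[^'">])*>}

def zap_disallowed_tags_quoted(str)
  str.gsub(TAG_RE) do |tag|
    # Compare case-insensitively, since <EM> and <em> are the same tag.
    ALLOWED_TAGS.include?($1.downcase) ? tag : ""
  end
end

puts zap_disallowed_tags_quoted('x<q title="a>b">y</q>z')
puts zap_disallowed_tags_quoted('<a href="x>y">link</a> <EM>hi</EM>')
```

The first call prints xyz, and the second prints its input unchanged,
because both a and em are in the sample list. A tag with an unbalanced
quote does not match TAG_RE at all, so it is left in place; real input
would need a decision about what to do with that case.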