On Fri, Apr 1, 2011 at 9:20 AM, Ruby Ruby <ruby / ayni.com> wrote:
> Hi every
> installation here ruby-1.8.6.420-2.fc13.x86_64
>
> i wanted to highlight some text fields (change background color) in an
> xhtml page. To do that, i had to skip all xhtml markup, 'cause it is not
> advisable to put markup into markup.
>
> i created the following function:
>
> <snip>
> def highlight(what, crix)
>  if crix.empty? then return what end
>   / tix.puts "WHAT: "+what
>  ccir = Regexp.new("(?:\<[^\>]+\>)|(?:("+crix.join("|")+"))", "i")

You should rather use Regexp.union(crix) to ensure proper escaping.
Note that you can use string interpolation in a regexp literal, e.g.

irb(main):007:0> s=123
=> 123
irb(main):008:0> /foo#{s}/
=> /foo123/
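
Putting both hints together, a minimal sketch of how the pattern could be
built. The terms in crix are made up for illustration, and the splat form
is an assumption for older Rubies such as the 1.8.6 mentioned above, where
Regexp.union may not accept an array:

```ruby
# Hypothetical search terms containing regexp metacharacters.
crix = ["a.b", "c+d"]

# Regexp.union escapes each string, so "." and "+" match literally.
terms = Regexp.union(*crix)

# Interpolate the pattern source rather than the Regexp itself:
# Regexp#to_s embeds its own flags, which would switch off the outer /i.
ccir = /<[^>]+>|(#{terms.source})/i

p terms                       # the escaped alternation
p ccir.match("xA.By")[1]      # group 1 holds the term that was found
p ccir.match("aXb")           # nil, because "." no longer matches any char
```

Group 1 is only set when a search term matched. When the first
alternative matches a tag, group 1 is nil.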

Also, I don't see why you use \<: inside a double-quoted string the
backslash simply disappears, and < needs no escaping in a regexp:

irb(main):022:0> puts "\<"
<
=> nil
irb(main):023:0> puts "<"
<
=> nil


>  return what.gsub(ccir) {|s|
>      / tix.puts "COLL: "+s
>    if ! s.empty? && ! s.match(/^\</)
>       "<span style='background-color:"+CCOL+";'>"+s+"</span>"
>    else s
>    end
>  }
> end  # highlight
> </snip>
>
> the main part of the function is the ccir regular expression.
> my thinking was, that with the first group in the regular expression i
> would skip all markup, and with the second group in the regular
> expression i would collect the fields designated in the crix array.
> When the first group of the regular expression matches, i was awaiting
> an empty string to be returned to the block, otherwise the element to be
> highlighted would be returned to the block.
> NOPE.
> when the first group of the regular expression matched, the test file
> showed me, that an entire markup sequence was returned to the block,
> even if no collector was active in the first group. when the second
> group of the regular expression matched, it returned the expected string
> to the block.
> this is why i had to avoid the markup-in-markup by checking again in the
> block if the string returned to the block started with "<", i.e. if it
> was markup.
>
> so be warned, if you use groups in regular expressions, as they may not
> return what you expected.

Yes, groups can be tricky, but I believe yours is rather a case of a
malformed regexp.  I agree with what Brian said: write a small
program that demonstrates the effect, and describe the desired output.
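
As a starting point, here is a minimal sketch of such a program. It
assumes the goal is to wrap every search term outside of tags in a span.
The CCOL value and the sample input are invented for the example. Note
that gsub always yields the entire matched text to the block, whatever
groups the pattern contains; the group itself is available as $1:

```ruby
CCOL = "yellow"  # hypothetical colour, not taken from the original post

def highlight(what, crix)
  return what if crix.empty?
  # First alternative consumes a whole tag, second captures a search term.
  ccir = /<[^>]+>|(#{Regexp.union(*crix).source})/i
  what.gsub(ccir) do |match|
    # match is the complete match; $1 is nil when a tag was matched.
    if $1
      "<span style='background-color:#{CCOL};'>#{match}</span>"
    else
      match
    end
  end
end

puts highlight("<p class='foo'>foo bar</p>", ["foo"])
```

Testing $1 instead of looking at the first character of the match keeps
the decision tied to the alternative that actually matched.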

Otherwise I recommend "Mastering Regular Expressions" (O'Reilly).

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/