On Apr 2, 5:13 pm, Jens Wille <jens.wi... / uni-koeln.de> wrote:
> Thomas Wieczorek [2008-04-02 22:59]:
> > On Wed, Apr 2, 2008 at 10:55 PM, Yossef Mendelssohn
> > <ymen... / pobox.com> wrote:
> >> On Apr 2, 3:35 pm, "Thomas Wieczorek"
> >> <wieczo... / googlemail.com> wrote:
> >>> .* matches NO and ALL characters, so gsub() substitutes
> >>> ''(empty)(=>'x') and 'test'(=>'x') with x, so you get
> >>> 'xx'
> >> That sounds like an explanation why ''.gsub(/.*/, 'x') is 'x'
> >> more than why 'test'.gsub(/.*/, 'x') is 'xx'. It seems to me
> >> that the .* should match [empty string]test[empty string] just once.
> > Yeah, it is confusing me, but I agreed on that explanation with
> > myself, when I read it once here. I'd also expect 'x' instead of 'xx'
>
> can't explain it either, i'm afraid. but you can see what it does
> like so:
>
> irb> 'test'.gsub(/.*/) { |m| p m; 'x'}
> "test"
> ""
> =>"xx"

That seems like a bug to me. The entire string is matched/consumed
by .*, so why try matching again? Or, if you are going to continue,
why stop with just one additional match? Is there code in gsub to
"only match one time after the string is consumed"?

irb(main):001:0> 'test' =~ /(.*)(.*)(.*)/
=> 0
irb(main):002:0> $1
=> "test"
irb(main):003:0> $2
=> ""
irb(main):004:0> $3
=> ""
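
For anyone who wants to poke at this further, here is a rough sketch that records where each gsub match starts. The helper name match_offsets is made up for illustration and is not part of Ruby. As far as I can tell, the second match begins at offset 4, i.e. at the very end of the string, where .* can still match the empty string; after that there is nowhere left to advance to, which would explain why it stops at exactly one extra match.

```ruby
# Hypothetical helper: collect [matched_text, start_offset] for every
# match that gsub would replace. $~ is the MatchData of the current match.
def match_offsets(str, re)
  found = []
  str.gsub(re) { found << [$~[0], $~.begin(0)]; 'x' }
  found
end

p match_offsets('test', /.*/)   # [["test", 0], ["", 4]]

# Requiring at least one character, or replacing only once, gives 'x':
p 'test'.gsub(/.+/, 'x')        # "x"
p 'test'.sub(/.*/, 'x')         # "x"

# In the single =~ match, the empty groups also sit at the end of the string:
m = 'test'.match(/(.*)(.*)(.*)/)
p m.offset(1)                   # [0, 4]
p m.offset(2)                   # [4, 4]
```

So if the goal is a single replacement, /.+/ or plain sub looks like the simpler way around it.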