Hi --

On Thu, 21 Jun 2007, Stephen Ball wrote:

> On 6/20/07, Daniel DeLorme <dan-ml / dan42.com> wrote:
>> That doesn't really explain why the regexp finds an extra empty string.
>> I know that zero occurrences is one match but after a greedy match that
>> matches everything, there should be (logically?) no other match. I am no
>> stranger to regexps and the result is counter-intuitive to me; I would
>> consider it a bug. Or at least a very very peculiar behavior.
>>
>> Daniel
>>
>
> It's because the pattern /.*/ matches everything, including the
> absence of everything. Yes, with the proper regexs you can indeed have
> tea and no tea at the same time. Certainly peculiar, but occasionally
> useful.
>
> So: since * matches "zero or more" characters when it starts the
> search for .* it matches the absence (the 'zero') and then matches the
> string (the 'or more').

It's the other way around, though: it matches "hello" *first*, and
then "". Once the greedy match has consumed "hello", the search resumes
at the end of the string, where .* can still match zero characters. So
the zero-length match (which, I admit, I'm among those who find
unexpected) happens at the end.

> To prevent this you need to indicate to your regular expression that
> you only want the subset of 'everything' that is actually something.
> Here are a couple ways to do this:
>
> /.+/ will match 1 or more of something, so doesn't return the absence
>
> /^.*/ will start the search at the start of the pattern, in a way
> bypassing the match of zero (the pattern /^.*$/ makes this more
> clear).

Here, again, "hello" comes first: /^.*/ matches it, but there is no
second match, because the empty string at the end of "hello" is not at
the start of a line, so ^ cannot match there.


David

--
* Books:
   RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
   RUBY FOR RAILS (http://www.manning.com/black)
* Ruby/Rails training
     & consulting:  Ruby Power and Light, LLC (http://www.rubypal.com)
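
For reference, here is a minimal sketch of the behaviour discussed in this thread. It assumes the example under discussion was a scan over the string "hello"; the message above only mentions "hello" and the extra empty string, so the use of String#scan and the variable names are illustrative assumptions.

```ruby
s = "hello"

# Greedy .* first consumes "hello"; the scan then resumes at the end of
# the string, where .* matches once more with zero characters.
star = s.scan(/.*/)

# .+ needs at least one character, so nothing can match at the end.
plus = s.scan(/.+/)

# ^ only matches at the start of a line, so the empty match at the end
# of "hello" is ruled out.
anchored = s.scan(/^.*/)

p star      # ["hello", ""]
p plus      # ["hello"]
p anchored  # ["hello"]
```

The order of the elements in the first result is the point of the correction above: the non-empty match comes first and the empty one last.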