"Simon Strandgaard" <neoneye / adslhome.dk> wrote in message
news:20040531140451.3abb4fb2.neoneye / adslhome.dk...
> "Robert Klemme" <bob.news / gmx.net> wrote:
> > "Simon Strandgaard" <neoneye / adslhome.dk> wrote in message
> > news:20040531104155.074a42b0.neoneye / adslhome.dk...
> > > Simon Strandgaard wrote:
> > > > While extending my own regexp-engine with a split method,
> > > > I discovered something odd about Ruby's split.
> > > >
> > > > irb(main):001:0> 'ab1ab'.split(/\D+/)
> > > > => ["", "1"]
> > > >
> > > > It's asymmetric: it has a special case for eliminating
> > > > the last empty string, but apparently not the first empty string.
> > > >
> > > > I would have expected the above to be symmetric, and output:
> > > > => ["1"]
> > > >
> > >
> > > [10 minutes of experimenting later]
> > > I wasn't aware that Ruby inserts subcaptures this way.
> > >
> > > irb(main):001:0> "ab2cd3".split(/(\D+)/, 2)
> > > => ["", "ab", "2cd3"]
> > >
> > > Because of subcapture insertion, it makes sense to keep the
> > > first empty string.
> > >
> > > I withdraw this bug-report.
> >
> > But what about:
> >
> > >> 'ab'.split(/\D+/)
> > => []
> >
> > You would at least expect one empty string in the result since there is at
> > least one separator.  This strikes me as odd.
> >
>
> Guy Decoux very recently explained that to me.
>
> When split has no limit, it wipes trailing empty strings.
>
> In your case you would have expected it to output [""], but
> because it's an empty string in the tail, it gets wiped.
>
> def split(pattern, limit=0)
>   ...
>   if limit == 0  # no limit given: wipe trailing elements which are empty
>     result.pop while result.size > 0 and result.last.empty?
>   end
>   result
> end
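
To make the quoted outline concrete, here is a minimal runnable sketch. It is not Ruby's actual implementation: `sketch_split` is a hypothetical name, the loop standing in for the elided part is my own assumption, and zero-width matches are deliberately not handled.

```ruby
# Hypothetical helper illustrating the behaviour discussed in this thread.
# Not Ruby's real String#split; zero-width matches are not supported.
def sketch_split(str, pattern, limit = 0)
  result = []
  pos    = 0
  fields = 0  # number of split fields emitted so far (captures do not count)

  while (limit <= 0 || fields < limit - 1) && (m = pattern.match(str, pos))
    break if m.end(0) == m.begin(0)   # give up on zero-width matches
    result << str[pos...m.begin(0)]   # text before the separator
    fields += 1
    result.concat(m.captures)         # subcaptures are inserted as extra elements
    pos = m.end(0)
  end
  result << str[pos..-1]              # remainder (may be empty)

  if limit == 0  # no limit given: wipe trailing elements which are empty
    result.pop while result.size > 0 and result.last.empty?
  end
  result
end

p sketch_split('ab1ab', /\D+/)        # => ["", "1"]
p sketch_split('ab', /\D+/)           # => []
p sketch_split('ab2cd3', /(\D+)/, 2)  # => ["", "ab", "2cd3"]
```

In this sketch the 'ab' case first collects ["", ""], and the trailing wipe then removes both, since once the last element is gone the remaining one is itself the last.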

But I thought it would strip only trailing empty strings. What about the
leading empty string in my example?  I'd expect that to be preserved.
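
One way to look at it, as a small check rather than an authoritative explanation: passing a negative limit to the built-in String#split keeps every field, which shows what is there before any trailing empties are removed. The variable names below are only for illustration.

```ruby
s = 'ab1ab'

default_split = s.split(/\D+/)         # no limit: trailing empty fields dropped
full_split    = s.split(/\D+/, -1)     # negative limit: every field kept
all_sep       = 'ab'.split(/\D+/, -1)  # string that consists only of a separator

p default_split  # => ["", "1"]
p full_split     # => ["", "1", ""]
p all_sep        # => ["", ""]
```

With 'ab' every field is empty, so the "leading" empty string is also a trailing one by the time the wipe reaches it, which would explain the [] result.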

Hm...

    robert