Pavel Smerk wrote:
> Justin Collins wrote:
>> Pavel Smerk wrote:
>>
>>> And once more question:
>
> one more :)
>
>>> In Czech, c followed by h is considered (for sorting etc.) as one
>>> character/grapheme ch. I need to split string to single characters
>>> with respect to this absurd manner.
>>>
>>> In Perl I can write
>>>
>>> split /(?<=(?![Cc][Hh]).)/, $string
>>>
>>> and it works fine.
>>>
>>> Unfortunately, Ruby does not implement/support this "zero-width
>>> positive look-behind assertion", so the question is how can one
>>> efficiently split the string in Ruby?
>
> Stupid question. :-) One should not insist on word-for-word
> translation when rewriting some code from Perl to Ruby. :-)
>
> The solution can be e.g. scan(/[cC][hH]|./)
>
> irb(main):001:0> "cHeck czeCh".scan(/[cC][hH]|./)
> => ["cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]
>
>> Does this work?
>>
>> irb(main):001:0> "czech".split(/([Cc][Hh])|/)
>> => ["c", "z", "e", "ch"]
>> irb(main):002:0> "check czech".split(/([Cc][Hh])|/)
>> => ["", "ch", "e", "c", "k", " ", "c", "z", "e", "ch"]
>> irb(main):003:0> "cHeck czeCh".split(/([Cc][Hh])|/)
>> => ["", "cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]
>
> Scan version is slightly better as it never returns the empty string.
> Of course, thanks anyway.
>
> But where can one find this feature of the split in the documentation?
> http://www.rubycentral.com/ref/ref_c_string.html#split does not
> mention split returns not only delimited substrings, but also
> successful groups from the match of the regexp.
>
> Regards,
>
> P.
>

As far as I can see, it's not in the documentation. I found it by accident.
But, yes, the scan method is better. :)

-Justin
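
For anyone reusing the scan approach, here is a minimal sketch of it wrapped in a helper. The method name czech_chars is hypothetical and does not come from the thread. The /i and /m flags are an assumption on my part: /i replaces the explicit [cC][hH] classes, and /m makes "." match newlines too, which the pattern in the thread does not do.

```ruby
# Hypothetical helper (not from the thread): split a string into characters,
# keeping the Czech digraph "ch" (in any letter case) as a single unit.
def czech_chars(string)
  # Alternatives are tried left to right, so "ch" is matched before a lone "c".
  # /i makes the match case-insensitive; /m lets "." also match newlines.
  string.scan(/ch|./im)
end

p czech_chars("cHeck czeCh")
# => ["cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]
```

Because scan returns only what the pattern matches, there is no leading empty string to filter out, unlike the split variant.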