Justin Collins wrote:
> Pavel Smerk wrote:
> 
>> And once more question:

one more :)

>> In Czech, c followed by h is considered (for sorting etc.) as one 
>> character/grapheme ch. I need to split a string into single characters 
>> while respecting this absurd convention.
>>
>> In Perl I can write
>>
>> split /(?<=(?![Cc][Hh]).)/, $string
>>
>> and it works fine.
>>
>> Unfortunately, Ruby does not implement/support this "zero-width 
>> positive look-behind assertion", so the question is how can one 
>> efficiently split the string in Ruby?

Stupid question. :-) One should not insist on word-for-word translation 
when rewriting some code from Perl to Ruby. :-)

One solution is String#scan, e.g. scan(/[cC][hH]|./)

irb(main):001:0> "cHeck czeCh".scan(/[cC][hH]|./)
=> ["cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]
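
The scan approach can be wrapped in a small helper. This is only a sketch: the method name czech_chars is made up for illustration, and the /m flag is an addition that is not in the one-liner above. Without it, "." does not match a newline, so scan would silently drop any "\n" in the input.

```ruby
# Hypothetical helper: split a string into Czech "characters",
# treating c followed by h (in any letter case) as a single unit.
def czech_chars(string)
  # Try the digraph first; otherwise take any single character.
  # /m lets "." match "\n" too, so no input character is dropped.
  string.scan(/[cC][hH]|./m)
end

p czech_chars("cHeck czeCh")
# => ["cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]
```

The order of the alternatives matters: with "." first, the digraph branch would never be tried.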

> Does this work?
> 
> irb(main):001:0> "czech".split(/([Cc][Hh])|/)
> => ["c", "z", "e", "ch"]
> irb(main):002:0> "check czech".split(/([Cc][Hh])|/)
> => ["", "ch", "e", "c", "k", " ", "c", "z", "e", "ch"]
> irb(main):003:0> "cHeck czeCh".split(/([Cc][Hh])|/)
> => ["", "cH", "e", "c", "k", " ", "c", "z", "e", "Ch"]

The scan version is slightly better, as it never returns an empty string. 
Thanks anyway, of course.
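
If the split form is preferred anyway, the empty strings can be filtered out afterwards. A small sketch comparing the two approaches on the strings from this thread (the variable names are only illustrative):

```ruby
# split emits "" when a "ch" delimiter sits at the very start of the
# string or directly after another "ch"; reject drops those entries.
inputs = ["czech", "check czech", "cHeck czeCh"]

inputs.each do |s|
  via_split = s.split(/([Cc][Hh])|/).reject { |piece| piece.empty? }
  via_scan  = s.scan(/[cC][hH]|./)
  puts "#{s.inspect}: same result? #{via_split == via_scan}"
end
```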

But where is this feature of split documented? 
http://www.rubycentral.com/ref/ref_c_string.html#split does not mention 
that split returns not only the delimited substrings but also the text 
captured by any groups in the regexp.
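
For what it's worth, the behaviour is easy to confirm in isolation: when the pattern contains a capturing group, split also puts the captured text into the result, while a non-capturing group (?:...) does not. A minimal sketch with a made-up input string:

```ruby
s = "a,b;c"   # made-up example input

p s.split(/[,;]/)      # => ["a", "b", "c"]
p s.split(/([,;])/)    # => ["a", ",", "b", ";", "c"]
p s.split(/(?:[,;])/)  # => ["a", "b", "c"]
```

As far as I know, the current Ruby documentation for String#split does describe this.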

Regards,

P.