Hello,

On Tue, 09 Jul 2002 05:20:12 GMT, Yukihiro Matsumoto <matz> wrote:
> Insights? It's inherited from Perl. Try:
>
> % perl -le 'print join(":", split(".b", "abcabc"))'
> :c:c

I guess Perl inherited it from awk. awk is a fairly simple language, but I think it is at least consistent. Let me explain why it works this way in awk.

awk doesn't have types. Not only do variables lack types; values lack them too. So it's impossible to have a variable whose value is a regexp. That's why in some situations strings are interpreted as regular expressions: you write gsub("c+",...) and it's the same as writing gsub(/c+/,...). In these situations a regular expression is expected, and if a string is found, it is converted to a regexp.

But in some situations, like the field separator parameter to split(), it's impossible to convert _all_ strings to regular expressions, since traditionally one-character separators were used. So one-char strings have to retain their original meaning but, OTOH, there is no way to specify a real regular expression value in awk. That's why, in these situations, a new rule was introduced: a one-char string means a one-char field separator, longer strings mean regex field separators.

> But if it turns out to be a bad inheritance (and I admit I'm starting
> to feeling so), I'm open to a new RCR.

Well, so what I'm talking about is "indirect inheritance from awk", or "consistency with awk".

1) gsub() ... the parameter has to be a regex, so I see no reason for accepting (and automatically converting) strings. Just as one cannot write "abc1bc".gsub(1,"X") and has to use at least "abc1bc".gsub(1.to_s,"X"), I'd propose that "abcabc".gsub("a","X") simply won't work, requiring the programmer to use this:

  "abcabc".gsub(Regexp.new("a"),"X")

This also encourages writing more efficient programs, since it encourages storing a compiled Regexp. Another plus: compiling regexps from strings is often a source of errors when the regexp contains backslashes.
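
To illustrate the backslash problem, here is a small sketch. The sample text and the variable names are made up for illustration, and the behaviour shown is that of current Ruby versions:

```ruby
text = "order 66 shipped, order 7 pending"

# Double-quoted string: "\d" is not a string escape, so the backslash is
# dropped and the regexp source ends up as "d+" (one or more letters d).
from_string = Regexp.new("\d+")

# Escaping the backslash by hand works, but is easy to forget.
from_escaped = Regexp.new("\\d+")

# A regexp literal needs no extra escaping, and keeping it in a variable
# lets the same compiled object be reused.
digits = /\d+/

wrong         = text.gsub(from_string, "N")
right_escaped = text.gsub(from_escaped, "N")
right_literal = text.gsub(digits, "N")

puts wrong          # the letters "d" are replaced, the numbers are untouched
puts right_literal  # the numbers are replaced
```

The first pattern compiles without any complaint, so the mistake only shows up in the output.
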
Thus encouraging the use of /regexp/ instead of "regexp" is a good thing. Is this possible, or will it break too many old programs? Matz will decide. :-)

2) split() ... one-character strings and regular expressions are absolutely necessary. Automatic conversion of anything to a regexp obfuscates split(), I think. So I'd suggest either interpreting long strings as strings or forbidding them completely.

No doubt the current situation with split() is confusing. But if you change split() to interpret longer strings literally and leave gsub/sub as they are, the situation will be confusing again, I'm afraid: gsub translates strings to regexps, while split doesn't.

Thus I think either split() should be changed in a fairly restrictive manner (accept only one-char strings or a Regexp) or gsub should not automatically convert strings to regexps. I vote for the latter alternative.

Looking forward to comments,
Stepan