2009/7/9 David A. Black <dblack / rubypal.com>:
> On Fri, 10 Jul 2009, Robert Klemme wrote:
>> 2009/7/9 David A. Black <dblack / rubypal.com>:
>>> On Thu, 9 Jul 2009, Sarawut Poaitwinyu wrote:
>>>
>>> words = string.split
>>>
>>> When you call split with no argument, it splits on whitespace
>>> (including more than one character).
>>
>> I am more like the "positive" guy - meaning explicitly defining what I
>> want returned.  I would do
>>
>> words = string.scan /\w+/
>>
>> That way dot, question mark and other signs won't hurt.  May not
>> make a difference but it's probably good to see different approaches.
>
> string.split does explicitly define what I want back; it's just
> something different from what you want back :-)

That's true.  I just wanted to make the point that there are these two
major approaches: define positively what you want in your result, or
define it ex negativo, i.e. state what you want to use as separator.

The whole point is that both approaches may behave identically with the
original set of test data but will exhibit different behavior as soon
as the input changes.  If you use #split, you might get something you
did not want in the first place.  With #scan you won't notice - which
could be bad as well.

The super safe variant would be to first do a match on the whole string
to ensure it contains only the expected data, and fail if not.  After
that it does not matter any more which extraction method one uses.

> It depends exactly how
> you define "word".  I was assuming it was /\S+/ but it may indeed be
> /\w+/ (or maybe /[^\W\d_]+/ or something).

Absolutely.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
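
To make the difference between the two approaches concrete, here is a minimal sketch. The sample string and the helper name extract_words are made up for illustration and do not come from the thread, and the validation pattern is only one possible definition of "expected data": words made of \w characters, separated by whitespace, with no leading or trailing whitespace.

```ruby
# Hypothetical sample input, not taken from the original thread.
string = "Hello, world! Is this 1 test?"

# Negative definition: state the separator (whitespace by default).
# Punctuation stays attached to the words.
by_split = string.split
# => ["Hello,", "world!", "Is", "this", "1", "test?"]

# Positive definition: state what a word looks like.
# Anything that is not \w is silently dropped.
by_scan = string.scan(/\w+/)
# => ["Hello", "world", "Is", "this", "1", "test"]

# "Super safe" variant: validate the whole string first, then extract.
# extract_words is a hypothetical helper; the pattern is an assumption
# about what counts as valid input.
def extract_words(text)
  unless text =~ /\A\w+(?:\s+\w+)*\z/
    raise ArgumentError, "unexpected input: #{text.inspect}"
  end
  # After validation, split and scan(/\w+/) return the same list.
  text.split
end

p by_split
p by_scan
p extract_words("one two three")  # => ["one", "two", "three"]
```

With clean input such as "one two three", all three variants agree. With the sample string, #split keeps the punctuation, #scan drops it without complaint, and extract_words raises ArgumentError instead of guessing.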