On 30.04.2008 23:40, David A. Black wrote:
> Hi --
>
> On Thu, 1 May 2008, Robert Klemme wrote:
>
>> On 30.04.2008 16:18, Jens Wille wrote:
>>> Phillip Gawlowski [2008-04-30 16:09]:
>>>> John Butler wrote:
>>>> | Hi,
>>>> |
>>>> | I have a sentence "This is my test sentence" and an array["is", "the",
>>>> | "my"] and what i need to do is find the occurence of any of thearray
>>>> | words in the sentence.
>>>> |
>>>> | I have this working in a loop but i was wondering is there a way to do
>>>> | it using one of rubys string methods.
>>>> |
>>>> | Its sililar to the include method but searching for multiple words not
>>>> | just one.
>>>> |
>>>> | "This is my test sentence".include?("This") returns true
>>>> |
>>>> | but i want something like
>>>> |
>>>> | "This is my test sentence".include?("This", "is", "my")
>>>> |
>>>> | anyone got a nice way to do this? I only need to find if one of the
>>>> | words occure and then i exit.
>>>> |
>>>> | JB
>>>>
>>>> How about '["is", "the", "my"].each'?
>>>>
>>>> I.e.:
>>>>
>>>> ["is", "the", "my"].each do |word|
>>>>   break if "the test sentence".include? word
>>>> end
>>> i'd prefer Enumerable#any?:
>>>
>>> sentence, words = "This is my test sentence", ["This", "is", "my"]
>>> words.any? { |word| sentence.include?(word) }
>> I'd rather do it the other way round, i.e. iterate over the sentence and test
>> words since the sentence is potentially longer:
>>
>> irb(main):001:0> require 'enumerator'
>> => true
>> irb(main):002:0> require 'set'
>> => true
>> irb(main):003:0> words = %w{This is my}.to_set
>> => #<Set: {"my", "This", "is"}>
>> irb(main):004:0> "This is my test sentence".to_enum(:scan,/\w+/).any? {|w|
>> words.include? w}
>> => true
>> irb(main):005:0>
>
> Is there any reason not to just do:
>
> "This is my test sentence".scan(/\w+/).any? {|w| words.include? w }

Yes. I used to_enum(:scan,/\w+/) because in this class of problems the text (sentence) tends to be large.
The approach using to_enum does the test while traversing the text and can stop at the first match. The plain scan approach first converts the whole text into an array of words and only then applies the test. It thus iterates twice (once over the text, once over the resulting words), does more conversions (to words), and needs more temporary memory for the whole sequence of words (although the overhead might be small because of internal String memory sharing).

The Set approach scales better for larger sets of words because a Set lookup is O(1) on average while an Array-based lookup is O(n).

I am not saying that my approach is faster under all circumstances. But it surely scales better.

Kind regards

robert
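
PS: A rough sketch of the two variants side by side, in case it helps to see the difference. The method names any_word_lazy? and any_word_eager? are made up for illustration, and the counter is only there to show how many words get tested before any? returns. It assumes a Ruby where to_enum works without require 'enumerator'.

```ruby
require 'set'

# Hypothetical helper: test each word while scan is still walking the text.
# any? ends the traversal as soon as the block returns true.
def any_word_lazy?(text, words)
  text.to_enum(:scan, /\w+/).any? { |w| words.include?(w) }
end

# Hypothetical helper: scan first builds the full array of words,
# then any? walks that array.
def any_word_eager?(text, words)
  text.scan(/\w+/).any? { |w| words.include?(w) }
end

words    = %w{This is my}.to_set
sentence = "This is my test sentence"

# Count how many words the lazy variant actually tests.
visited = 0
sentence.to_enum(:scan, /\w+/).any? do |w|
  visited += 1
  words.include?(w)
end
```

Both helpers give the same answer. With the sentence above, the lazy variant tests only the first word, since "This" is already in the set. How much that saves in practice depends on the size of the text and on where the first match occurs, so it is worth benchmarking with realistic data.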