On Oct 17, 2004, at 1:12 PM, Jamis Buck wrote:

> So, according to my calculations, 48+ hours have elapsed.
>
> Thus, here's my solution to Regexp.build(). I assumed the following:

My solution is pretty different and admittedly only so-so in functionality.

My main idea was to treat all passed parameters as character data. This solves the leading zeros problem by letting you pass things like (1..60, "01".."09"). In addition, this approach also allows you to pass non-numerical data, though that wasn't part of the quiz.

The other main point of my implementation was to not anchor at all. This may make built Regexps less convenient to use on their own, but because you can embed them in other patterns, it greatly increases usability. For example, if you would like to allow for arbitrary leading zeros, you just embed the result of build() in another Regexp object with a leading "0*". You can use embedding to provide whatever anchoring you need, set up your own captures, or even combine several built Regexp objects.

Well, all that is how I intended this to work. It even gets close at times. <laughs>

Unfortunately, my character collapsing system (to regex character classes) is dog slow and only works correctly on numerical data. Put simply, my library makes the quiz's (1..1_000_000) example impractical in build time. If I had it to do over, I would approach this part of the problem from a completely different angle. This is the one I built to throw away, as the saying goes.

I'll post my library below, and then my unit tests, which probably better convey what I was aiming for.
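
Before the library itself, here is a minimal sketch of the two ideas described above: treating every argument as character data, and leaving the result unanchored so it can be embedded. The helper `build_alternation` is a hypothetical, simplified stand-in and not the library's build(); it skips character-class collapsing entirely and just joins the strings into an alternation.

```ruby
# Hypothetical stand-in for Regexp.build(): every argument is treated as
# character data and joined into an unanchored, non-capturing alternation.
# No character-class collapsing is attempted here.
def build_alternation(*items)
  # Expand ranges, then turn everything into strings.
  strings = items.flat_map { |e| Array(e) }.map { |e| String(e) }
  # Longest strings first, so "12" is tried before "1".
  strings = strings.uniq.sort_by { |s| [-s.length, s] }
  /(?:#{strings.map { |s| Regexp.escape(s) }.join("|")})/
end

month = build_alternation("01".."09", 1..12)
day   = build_alternation("01".."09", 1..31)

# Embedding supplies the anchoring and any extras, such as leading zeros.
padded = /\A0*#{build_alternation(1..12)}\z/

# Several built patterns can be combined inside one larger pattern.
date = /\b#{month}\/#{day}\b/

puts "padded matches 007"     if padded =~ "007"
puts "date found in sentence" if date =~ "Today is 09/15"
```

Because the helper adds no anchors of its own, the same `month` pattern works both inside `date` and inside a stricter, fully anchored pattern.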
James Edward Gray II

#!/usr/bin/env ruby

class Regexp
  def self.build( *nums )
    # Treat every argument as character data: expand ranges, then stringify.
    nums = nums.map { |e| Array(e) }.flatten.map { |e| String(e) }
    # Sort longest first, then alphabetically.
    nums = nums.sort_by { |e| [-e.length, e] }

    # Process the strings one equal-length group at a time.
    patterns = [ ]
    while nums.size > 0
      eq, nums = nums.partition { |e| e.length == nums[0].length }
      patterns.push(*build_char_classes( eq ))
    end

    # No anchors: the result is a non-capturing group meant for embedding.
    /(?:#{patterns.join("|")})/
  end

  private

  # Merges strings that differ in a single character position into one
  # pattern with a character class at that position.
  def self.build_char_classes( eq_len_strs )
    results = [ ]

    while eq_len_strs.size > 1
      first = eq_len_strs.shift
      if md = /^([^\[]*)([^\[])(.*)$/.match(first)
        chars = md[2]
        matches, eq_len_strs = eq_len_strs.partition do |e|
          e =~ /^#{md[1]}(.)#{Regexp.escape md[3]}$/ and chars << $1
        end
        if matches.size == 0
          results << first
          next
        end

        chars = build_short_class(chars.squeeze)

        eq_len_strs << "#{md[1]}[#{chars}]#{md[3]}"
      else
        results << first
      end
    end
    results << eq_len_strs[0] if eq_len_strs.size == 1

    results
  end

  # Shortens a character class by collapsing runs of consecutive
  # characters into ranges (for example, "0123" becomes "0-3").
  def self.build_short_class( char_class )
    while md = /[^\-\0]{3,}/.match(char_class)
      short = md[0][1..-1].split("").inject(md[0][0, 1]) do |mem, c|
        if (mem.length == 1 or mem[-2] != ?-) and mem[-1, 1].succ == c
          mem + "-" + c
        elsif mem[-2, 2] =~ /-(.)/ and $1.succ == c
          mem[0..-2] + c
        else
          mem + c
        end
      end
      char_class.sub!(md[0], short.split("").join("\0"))
    end

    char_class.tr!("\0", "")
    char_class.gsub!(/([^\-])-([^\-])/) do |m|
      if $1.succ == $2 then $1 + $2 else m end
    end

    char_class
  end
end

=== Unit Tests ===

#!/usr/bin/env ruby

# Usage: ruby -r regexp_build_lib $0

require "test/unit"

class TestRegexpBuild < Test::Unit::TestCase
  def test_integers
    lucky = /^#{Regexp.build(3, 7)}$/
    assert_match(lucky, "7")
    assert_no_match(lucky, "13")
    assert_match(lucky, "3")

    month = /^#{Regexp.build(1..12)}$/
    assert_no_match(month, "0")
    assert_match(month, "1")
    assert_match(month, "12")

    day = /^#{Regexp.build(1..31)}$/
    assert_match(day, "6")
    assert_match(day, "16")
    assert_no_match(day, "Tues")

    year = /^#{Regexp.build(98, 99, 2000..2005)}$/
    assert_no_match(year, "04")
    assert_match(year, "2004")
    assert_match(year, "99")

    num = /^#{Regexp.build(1..1_000)}$/
    assert_no_match(num, "-1")
    (-10_000..10_000).each do |i|
      if i < 1 or i > 1_000
        assert_no_match(num, i.to_s)
      else
        assert_match(num, i.to_s)
      end
    end
  end

  def test_embed
    month = Regexp.build("01".."09", 1..12)
    day = Regexp.build("01".."09", 1..31)
    year = Regexp.build(95..99, "00".."05")
    date = /\b#{month}\/#{day}\/(?:19|20)?#{year}\b/

    assert_match(date, "6/16/2000")
    assert_match(date, "12/3/04")
    assert_match(date, "Today is 09/15/2004")
    assert_no_match(date, "Fri Oct 15")
    assert_no_match(date, "13/3/04")
    assert_no_match(date, "There's no date hiding in here: 00/00/00!")

    md = /^(#{Regexp.build(1..12)})$/.match("11")
    assert_not_nil(md)
    assert_equal(md[1], "11")
  end

  def test_words
    animal = /^#{Regexp.build("cat", "bat", "rat", "dog")}$/
    assert_match(animal, "cat")
    assert_match(animal, "dog")
    assert_no_match(animal, "Wombat")
  end
end
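
As a point of comparison for the range-collapsing step in build_short_class, here is a minimal sketch of turning the characters for a single position into a compact character-class body. It is not the posted method, and it is not necessarily the "different angle" mentioned earlier. `collapse_chars` is a hypothetical helper that assumes its input is an array of single-character strings, and it relies on Enumerable#slice_when (Ruby 2.2 or later).

```ruby
# Hypothetical helper: collapse single-character strings into the body of
# a regex character class, using ranges for runs of three or more
# consecutive characters.
def collapse_chars(chars)
  # Split the sorted, de-duplicated characters into consecutive runs.
  runs = chars.uniq.sort.slice_when { |a, b| a.ord + 1 != b.ord }

  runs.map do |run|
    if run.size > 2
      # A run such as 1, 2, 3 is written as the range "1-3".
      "#{Regexp.escape(run.first)}-#{Regexp.escape(run.last)}"
    else
      # Shorter runs are listed directly; a range would not save anything.
      run.map { |c| Regexp.escape(c) }.join
    end
  end.join
end

digits = collapse_chars(("0".."9").to_a)
puts "digit class matches 5" if /\A[#{digits}]\z/ =~ "5"
```

Sorting once and slicing into runs avoids the repeated string rewriting of the original, but this sketch only covers one character position; merging whole numbers into patterns would still need a separate step.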