```On Mon, 18 Oct 2004 17:02:30 +0900
Mark Hubbart <discordantus / gmail.com> wrote:

| I defined the basic algorithm I was working on in my other email like
| this:
|
| > 1. break the range up into regexp friendly sections, like this:
| >   (23..1024) =>
| >     23..29,
| >     30..99,
| >     100..999,
| >     1000..1019,
| >     1020..1024
| >
| > 2. convert each range into a string regexp:
| >   23..29 => "2[3-9]"
| >   30..99 => "[3-9]\\d"
| >   100..999 => "[1-9]\\d\\d"
| >   1000..1019 => "10[01]\\d"
| >   1020..1024 => "102[0-4]"
| >
| > 3. join them all together
| >   /^0*(?:2[3-9]|[3-9]\d|[1-9]\d\d|10[01]\d|102[0-4])\$/
|

I have had the same idea once my simple solution (using each integer in a range) failed with large ranges.

I assume the following:

- leading zeros are not accepted
- nothing is captured
- must match the whole line (anchored to start and end of line)

Output for this file on my machine:

[penguin 73] ~/work/rubyquiz > ruby 4_20041018_regexp.rb
Started
.....
Finished in 0.021428 seconds.

5 tests, 108 assertions, 0 failures, 0 errors
[penguin 74] ~/work/rubyquiz >

So, better late than never, here is my code!

Thomas

ps. I tried to shorten the new Range methods as good as I could, can anything else be done??

------------------------------------------------------------------------
require 'test/unit/ui/console/testrunner'

class Integer

def to_rstr
"#{self}"
end

end

class Range

def get_regexps( a, b, negative = false )
arr = [a]

af = (a == 0 ? 1.0 : a.to_f)
bf = (b == 0 ? 1.0 : b.to_f)
1.upto( b.to_s.length-1 ) do |i|
pot = 10**i
num = (af/pot).ceil*(pot) # next higher number with i zeros
arr.insert( i,  num ) if num < b
num = (bf/pot).floor*(pot) # next lower number with i zeros
arr.insert( -i, num )
end
arr.uniq!
arr.push( b+1 ) # +1 -> to handle it in the same way as the other elements

result = []
0.upto( arr.length - 2 ) do |i|
first = arr[i].to_s
second = (arr[i+1] - 1).to_s
str = ''
0.upto( first.length-1 ) do |j|
if first[j] == second[j]
str << first[j]
else
str << "[#{first[j].chr}-#{second[j].chr}]"
end
end
result << str
end

result = result.join('|')
result = "-(?:#{result})" if negative
result
end

def to_rstr
if first < 0 && last < 0
get_regexps( -last, -first, true )
elsif first < 0
get_regexps( 1, -first, true ) + "|" + get_regexps( 0, last )
else
get_regexps( first, last )
end
end

end

class Regexp

def self.build( *args )
Regexp.new("^(?:" + args.collect {|a| a.to_rstr}.flatten.uniq.join('|') + ")\$" )
end

end

class RegexpTest < Test::Unit::TestCase

def rangeTest( first, last )
r = Regexp.build( first..last )
assert_match( r, "#{first}" )
assert_match( r, "#{(first + last)/2}" )
assert_match( r, "#{last}" )
assert_no_match( r, "#{first-1}" )
assert_no_match( r, "#{last+1}" )
end

def testBuild
lucky = Regexp.build( 3, 7 )
assert_match(lucky, "7")
assert_no_match(lucky, "13")
assert_match(lucky, "3")

rangeTest( 1, 12 )
month = Regexp.build( 1..12 )
assert_no_match(month, "0")
assert_match(month, "1")
assert_match(month, "12")

rangeTest( 1, 31 )
day = Regexp.build( 1..31 )
assert_match(day, "6")
assert_match(day, "16")
assert_no_match(day, "Tues")

rangeTest( 2000, 2005 )
year = Regexp.build( 98, 99, 2000..2005 )
assert_no_match(year, "04")
assert_match(year, "2004")
assert_match(year, "99")

rangeTest( 0, 1000000 )
num = Regexp.build( 0..1_000_000 )
assert_no_match(num, "-1")
end

def testPositive
rangeTest( 2, 10 )
rangeTest( 23432, 12312123 )
end

def testNegative
rangeTest( -10, -2 )
rangeTest( -15, 4 )
rangeTest( -100342, -343 )
end

def testOther
rangeTest( 5, 16 )
rangeTest( 10, 100 )
rangeTest( 11, 99 )
rangeTest( 1, 123456789 )
rangeTest( 10, 10 )
rangeTest( 5, 5 )
rangeTest( 0, 5 )
rangeTest( 0, 10 )
rangeTest( 1, 5 )
end

def testIllegal
num = Regexp.build( 1..12 )
assert_no_match( num, "012" )
assert_no_match( num, "A12" )
assert_no_match( num, "120" )
assert_no_match( num, "12A" )
assert_no_match( num, "3125" )
end

end

```