On 9/14/07, Ruby Quiz <james / grayproductions.net> wrote: > This week's Ruby Quiz is to write a simple utility. Your program should accept > an IP address as a command-line argument and print out the two letter code for > the country that IP is assigned in. You can find a database for the matching > at: > > http://software77.net/cgi-bin/ip-country/geo-ip.pl > > To keep the problem interesting though, let's write our programs with a focus on > speed and memory efficiency. Hi, This is my second submission to Ruby Quiz, any tip will be greatly appreciated. First I downloaded the file and removed all comments, then tried a straightforward approach. Here it is: require 'faster_csv' ip = ARGV[0].split(/\./) ip = ip[0].to_i * 16777216 + ip[1].to_i * 65536 + ip[2].to_i * 256 + ip[3].to_i file = ARGV[1] || 'ipdb.csv' FasterCSV.foreach(file) do |line| if (line[0].to_i <= ip && line[1].to_i >= ip) puts line[4] exit end end It allows to pass the IP as the first parameter, then transforms it to the format in the file (the "base" 256). The file can be passed as the second argument, by default I used my modified file without the comments. The method uses FasterCSV to parse the file line by line, checking ip against the range. The result for the IP in the example was: time ruby quiz139a.rb 68.97.89.187 US real 0m0.355s user 0m0.328s sys 0m0.024s So, it was a little bit more than the one presented with the problem. Unfortunately, things don't go so good when you search for an IP that matches entries at the end of the file: time ruby quiz139a.rb 255.0.0.1 ZZ real 0m5.337s user 0m5.036s sys 0m0.292s More than 5 seconds :-( So, I went on to build a second version. This time the approach I took, as the file is ordered is a binary search directly on the file. Here it is: require 'ftools' ip = ARGV[0].split(/\./) ip = ip[0].to_i * 16777216 + ip[1].to_i * 65536 + ip[2].to_i * 256 + ip[3].to_i file = ARGV[1] || 'ipdb.csv' File.open(file) do |f| low = 0 high = f.stat.size f.seek(high / 2) while true while ((a = f.getc) != 10) f.seek(-2, IO::SEEK_CUR) end pos = f.pos line = f.readline.split(",") low_range = line[0][1..-2].to_i high_range = line[1][1..-2].to_i if (low_range > ip) high = pos offset = (f.pos - pos) + ((high - low) / 2) f.seek(-offset, IO::SEEK_CUR) elsif (high_range < ip) low = f.pos f.seek((high-low) / 2, IO::SEEK_CUR) else puts line[4][1..-2] exit end end end What the method does is a binary search in the file. It starts a low and high variables to manage the partition, then positions the cursor in the file to the middle. Every time, the method reads back until it finds a 10 (\n), then reads that line to check the IP range. If the lower range is higher than the IP, the method moves the cursor in the file, down to half of the current low-high range. If the higher range is lower than the IP, then it moves up half of the range. The timings here are much better: time ruby quiz139b.rb 68.97.89.187 US real 0m0.077s user 0m0.048s sys 0m0.004s time ruby quiz139b.rb 255.0.0.1 ZZ real 0m0.069s user 0m0.060s sys 0m0.008s Apart from the general strategy of the binary search, I haven't had the time to give too much thought to the details, like how to extract the values after the readline. Any tips regarding that would be greatly appreciated. Also I don't know if there's a better way of moving back in the file to the previous line break (the while (a = f.getc) != 10 part). Tips? Thanks for this quiz. I hope I had time to work on every quiz. Regards, Jesus.