On Wed, 26 Sep 2007 16:49:09 +0900, 7stud -- wrote:

> I'm not sure why data like this:
> 
>      database 1 is on server 3
>      database 8 is on server 3
> 
> should produce:
> 
>      server 3 handles databases 1 to 8
> 
> since server 3 only handles db's 1 and 8.

Sorry, I was being a lazy typist.  (In my defense, I only said the input
was "in the form!")  Assume that, for the example I gave, the output is
correct and the input has more data than I showed.

> 
> Or, what to do if there is only one db for the server, say:
> 
>     database 1 is on server 3
> 
> Should the output be:
> 
>     server 3 handles database 1 to 1

Yep, exactly.

> 
> Can there be duplicate lines like this:
> 
> database 1 is on server 3
> database 8 is on server 7
> database 1 is on server 3

Nope, never.  The actual input is akin to a list of symbolic NFS links,
something like:

/prod_dir/db1.file -> /mounts/server3/db1.file

So db1.file can only ever point to one place, and can only ever show up in
the input once.
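
For illustration, here is a minimal sketch of pulling the database and server numbers out of a line shaped like the link above. The LINK pattern and the parse_link helper are hypothetical names, and the regex assumes the real names always look like dbN.file and serverN, which the one example suggests but does not guarantee.

```ruby
# Hypothetical pattern: assumes names of the form dbN.file and serverN,
# as in the single example link.
LINK = %r{/db(\d+)\.file -> /mounts/server(\d+)/}

# Returns [database number, server number], or nil if the line doesn't match.
def parse_link(line)
  m = LINK.match(line)
  return nil unless m
  [m[1].to_i, m[2].to_i]
end

p parse_link("/prod_dir/db1.file -> /mounts/server3/db1.file")  # prints [1, 3]
```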

For that same reason, the input doesn't start out in numeric order - it's
*mostly* sorted, but alphanumerically (db1, db11, db12, ... db199, db2,
db21, etc.)
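
A small sketch of getting from that alphanumeric order to numeric order, assuming each name carries exactly one run of digits (the sample names here are made up to match the pattern above):

```ruby
# Names in the alphanumeric order described above.
names = %w[db1 db11 db12 db199 db2 db21]

# Sort on the integer value of the first run of digits in each name.
sorted = names.sort_by { |name| name[/\d+/].to_i }

p sorted  # prints ["db1", "db2", "db11", "db12", "db21", "db199"]
```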

> Also, is the output supposed to be in ascending order by server?

Nope, not necessary.  Ascending by DB number probably makes it easier to
manually edit or check, though.

> In any case,  here is the input I used:
> 
> database 8 is on server 7
> database 10 is on server 7
> database 10 is on server 7
> database 5 is on server 9
> database 1 is on server 3
> database 2 is on server 3
> database 133 is on server 3
> database 4 is on server 144
> 
> and here is the output:
> 
> server 3 handles databases 1 to 133
> server 7 handles databases 8 to 10
> server 9 handles database 5
> server 144 handles database 4

Hmm, looks like it reports the whole min-to-max range even when elements
in between are missing.  (I guess I didn't specify that behavior, did I?)
If a database isn't listed in the input, it shouldn't be included in
ranges in the output.

So the expected output from the above input should be (in any order):

server 3 handles databases 1 to 2
server 3 handles databases 133 to 133
server 7 handles databases 8 to 8
server 7 handles databases 10 to 10
server 9 handles databases 5 to 5
server 144 handles databases 4 to 4

I -think- the "missing database" problem is endemic to the Set approach,
right?
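
As far as I can tell, the Set itself still holds every database number; the gaps only vanish when a range is built from min and max alone. Here is a rough sketch of one way to keep them, by splitting each server's sorted numbers wherever two neighbors aren't consecutive. The consecutive_runs helper is a hypothetical name, the input is inlined instead of read from data.txt, and this assumes a Ruby new enough to have Enumerable#slice_when (2.2+) and squiggly heredocs (2.3+).

```ruby
require "set"

# Split a collection of integers into runs of consecutive values.
# Returns an array of [first, last] pairs, one per run.
def consecutive_runs(numbers)
  numbers.sort
         .slice_when { |a, b| b != a + 1 }  # break wherever there is a gap
         .map { |run| [run.first, run.last] }
end

servers = Hash.new { |hash, key| hash[key] = Set.new }

# Same sample input as in the quoted message, inlined for the sketch.
input = <<~DATA
  database 8 is on server 7
  database 10 is on server 7
  database 10 is on server 7
  database 5 is on server 9
  database 1 is on server 3
  database 2 is on server 3
  database 133 is on server 3
  database 4 is on server 144
DATA

input.each_line do |line|
  db, server = line.scan(/\d+/).map(&:to_i)
  servers[server].add(db)
end

servers.sort.each do |server, dbs|
  consecutive_runs(dbs).each do |first, last|
    puts "server #{server} handles databases #{first} to #{last}"
  end
end
```

With the sample input this prints one line per unbroken run, so server 3 gets "1 to 2" and "133 to 133" rather than "1 to 133".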

Jay


> 
> require "set"
> 
> def output_data(arr)
>   arr.each do |elmt|
>     min = elmt[1].min
>     max = elmt[1].max
> 
>     if min == max
>       puts "server #{elmt[0]} handles database #{min}"
>     else
>       puts "server #{elmt[0]} handles databases #{min} to #{max}"
>     end
>   end
> end
> 
> servers = Hash.new{|hash, key| hash[key] = Set.new}
> 
> File.open("data.txt") do |file|
>   file.each_line do |line|
>     nums = line.scan(/\d+/)
>     servers[nums[1].to_i].add(nums[0].to_i)
>   end
> end
> 
> data = servers.sort
> output_data(data)


-- 
Jay Levitt                |
Boston, MA                | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm         | - Kristoffer