On 11.08.2007 06:19, Ryan Davis wrote:
> 
> On Aug 10, 2007, at 13:54 , William James wrote:
> 
>> On Aug 10, 1:29 pm, Frank Meyer <lolz.ll... / gmail.com> wrote:
>>> I've written a little ruby program which can sort logfiles with the
>>> following format:
>>>
>>> 4.text text text
>>> 1.text text text
>>> 2.text text text
>>> 10.text text text
>>> 2.text2 text2 text2
>> ...
>> File.open( ARGV.first, "r+" ){|file|
>>   array = file.readlines
>>   file.rewind
>>   file.truncate(0)
>>   file.puts array.sort_by{|s| s[/^\d+/].to_i }
>> }
> 
> your version takes a lot of memory, is slow, and doesn't properly sort 
> the content of the line, just the number. swap the two "2." lines and 
> you'll see what I mean. Using the right tool for the job (`sort`) does 
> wonders:
> 
> % ruby -e 'n = 1_000_000; File.open("blah.txt", "w") { |f| n.times { m = 
> rand 5; f.puts "#{rand n}. file#{m} file#{m} file#{m}" } }'
> % cp blah.txt blah2.txt
> % time ruby -e 'File.open( ARGV.first, "r+" ) { |file| array = 
> file.readlines; file.rewind; file.truncate(0); file.puts 
> array.sort_by{|s| s[/^\d+/].to_i } }' blah.txt
> real    0m8.182s ...
> % time ruby -e 'path = ARGV.shift; system %(sort -n "#{path}" > 
> "#{path}.tmp"); File.rename "#{path}.tmp", path' blah2.txt
> real    0m3.175s ...
> % cmp blah.txt blah2.txt
> blah.txt blah2.txt differ: char 50, line 3
> % head blah.txt blah2.txt
> ==> blah.txt <==
> 3. file4 file4 file4
> 4. file4 file4 file4
> 6. file3 file3 file3
> 6. file1 file1 file1
> 6. file0 file0 file0
> 7. file0 file0 file0
> 7. file4 file4 file4
> 8. file1 file1 file1
> 8. file3 file3 file3
> 8. file3 file3 file3
> 
> ==> blah2.txt <==
> 3. file4 file4 file4
> 4. file4 file4 file4
> 6. file0 file0 file0
> 6. file1 file1 file1
> 6. file3 file3 file3
> 7. file0 file0 file0
> 7. file4 file4 file4
> 8. file1 file1 file1
> 8. file3 file3 file3
> 8. file3 file3 file3
> 532 %

It's a one-liner:

ruby -i.bak -e 'puts ARGF.readlines.sort_by {|l| l[/^\d+/].to_i}' file

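Less memory usage, since sort! sorts the array in place instead of building sort_by's temporary array of keys (at the cost of re-extracting the number on every comparison):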

ruby -i.bak -e 'puts ARGF.readlines.sort! {|a,b| a[/^\d+/].to_i <=> b[/^\d+/].to_i}' file
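
Regarding the point about lines with equal numbers (the two "2." lines): a minimal sketch, assuming it is acceptable to break ties on the full line text. The helper name sort_log_lines is made up for illustration and does not appear anywhere in the thread; the sample lines are the ones from the original post.

```ruby
# Hypothetical helper (not from the thread): sort by the leading number
# first, then by the whole line, so lines with equal numbers come out in
# a fixed order regardless of their order in the input.
def sort_log_lines(lines)
  lines.sort_by { |line| [line[/^\d+/].to_i, line] }
end

# Sample data taken from the original post, with the two "2." lines swapped.
lines = [
  "4.text text text\n",
  "1.text text text\n",
  "2.text2 text2 text2\n",
  "10.text text text\n",
  "2.text text text\n"
]

puts sort_log_lines(lines)
```

The same key should also work in the one-liner form, i.e. sort_by {|l| [l[/^\d+/].to_i, l]}, at the price of one small extra array per line.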

Kind regards

	robert