Hi --

On Thu, 13 Oct 2005, rubyhacker / gmail.com wrote:

> This is one of those things that "anyone can do" and it doesn't
> take that long. But it's always fun/educational to see how
> different people would do it.
>
> Given: A text is in two languages (say, English and French) --
> assume separate files or whatever is convenient. They're
> formatted properly, so that paragraphs correspond to each other
> predictably. We define a "paragraph" as simply a group of non-blank
> lines followed by one or more blank lines or end of file. (Thus
> even a simple title or heading would count.) Assume a page length
> N (lines per page).
>
> Reformat both texts such that:
>
> 1. Corresponding paragraphs start on corresponding lines of the
> page.
>
> 2. If either paragraph is shorter than the other, it will be padded
> with blank lines so that the next paragraphs coincide.
>
> 3. Preserve any "extra" blank lines that were already there
> between paragraphs.
>
> 4. Neither text will allow a page break in the middle of a paragraph.
> If it won't fit in either case, do a page break for both.
>
> 5. If you want to simplify output, represent a page break as "----"
> or the equivalent.
>
>
> I'll be playing at this in my spare minutes.
>
> Let the games begin.

Quite the brute force approach, and probably full of holes, but anyway:

   PARAGRAPH_RE = /.*?\n(?:\n+|\z)/m

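   def parallelize(a,b)
     a,b = a.dup,b.dup
     # String#to_a is gone in Ruby 1.9+, so count lines with String#lines.
     short,long = [a,b].sort_by {|text| text.lines.size }
     short << "\n" until short.lines.size == long.lines.size
     # Return in the original order (a,b), not (short,long), so the
     # two languages never get swapped.
     return a,b
   end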

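   def pagify(text,n)
     paragraphs = text.scan(PARAGRAPH_RE)
     line = 0   # lines used so far on the current page
     paragraphs.each do |para|
       size = para.lines.size   # count lines, not characters
       if line > 0 && line + size > n
         para.replace("----\n#{para}")
         line = 0
       end
       line += size
     end
     paragraphs.join
   end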

   # Sample usage

   english = File.read....
   french = File.read....

   eng_final = ""
   fr_final = ""

   english.scan(PARAGRAPH_RE).zip(french.scan(PARAGRAPH_RE)).each do |e,f|
     ep,fp = parallelize(e,f)
     eng_final << ep
     fr_final << fp
   end

   puts pagify(eng_final,60), pagify(fr_final,60)
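
For anyone who wants to try the idea without setting up files, here is a minimal self-contained sketch of the same approach run on two tiny inline texts. The helper names (pad_pair, paginate, align_pages), the sample strings, and the page length of 4 are all made up for illustration, and it assumes a Ruby recent enough to have String#lines return an array. It is still brute force: a paragraph longer than a page simply overflows, and the two texts are assumed to have the same number of paragraphs.

```ruby
PARA_RE = /.*?\n(?:\n+|\z)/m   # a paragraph plus its trailing blank lines

# Pad the shorter of two paragraphs with blank lines; keep argument order.
def pad_pair(a, b)
  a, b = a.dup, b.dup
  short, long = [a, b].sort_by { |t| t.lines.size }
  short << "\n" until short.lines.size >= long.lines.size
  [a, b]
end

# Put "----" before any paragraph that would overflow a page of n lines.
def paginate(text, n)
  used = 0   # lines already placed on the current page
  text.scan(PARA_RE).map { |para|
    size = para.lines.size
    if used > 0 && used + size > n
      used = 0
      para = "----\n" + para
    end
    used += size
    para
  }.join
end

# Hypothetical driver: pad paragraph pairs, then paginate both texts.
def align_pages(left, right, n)
  l_out, r_out = +"", +""
  left.scan(PARA_RE).zip(right.scan(PARA_RE)).each do |l, r|
    lp, rp = pad_pair(l, r)
    l_out << lp
    r_out << rp
  end
  [paginate(l_out, n), paginate(r_out, n)]
end

english = "Title\n\nOne line.\nTwo lines.\n\nEnd.\n"
french  = "Titre\n\nUne ligne.\n\nFin.\n"

eng_pages, fr_pages = align_pages(english, french, 4)
puts eng_pages, "=====", fr_pages
```

Because every pair of paragraphs ends up with the same number of lines, running the page-break pass separately on each text puts the "----" markers on the same lines in both.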


David

-- 
David A. Black
dblack / wobblini.net