William James wrote:

> Tom Cloyd wrote:
> 
> > I'm baffled by this strange outcome - I cannot reduce multiple
> > spaces from a text file. This isn't just a regex problem, somehow.
> > I'm failing to grasp something essential, but don't know what it
> > is. All help appreciated, as usual!
> > 
> > Here is a demo of my problem, in which I try two different ways,
> > and both fail:
> > 
> > === code ===
> > # h2t.rb
> > 
> > def main
> >   # conversion table spec
> >   conv = [
> >   [ '<h1>', 'h1. ' ], [ '<h2>', 'h2. ' ], [ '<h3>', 'h3. ' ],
> >   [ '<h4>', 'h4. ' ], [ '<h5>', 'h5. ' ], [ '<h6>', 'h6. ' ],
> >   [ /<\/h\d>/, '' ],
> >   [ " +", ' ' ]]  # <= this last array element should do the trick,
> >                   # but doesn't
> > 
> >   data = open( 'h2t-in2.txt', 'r' ) { |f| ( f.readlines( data )).to_s }
> >   conv.each do |i|
> >     data.gsub!( i[0], i[1] )
> >   end
> >   data.squeeze(' ')  # <= putting this here was sheer desperation,
> >                      # but even THIS fails
> > 
> >   open( "h2t-out.txt", "w" ) { |f| f.write( data ) }
> > 
> > end
> > 
> > %w(rubygems ruby-debug readline strscan logger fileutils).each{
> > |lib| require lib }
> > 
> > main
> > 
> > === input file ===
> > 
> > <h1>Library    catalog  listing           </h1>x
> > 
> > <h3>Library    catalog  listing           </h3>x
> > 
> > <h2>Library    catalog  listing           </h2>x
> > 
> > p(subtitle).   A    complete listing of all material in the Library
> > 
> > 
> > === output file ===
> > 
> > 
> > h1. Library    catalog  listing           x
> > 
> > h3. Library    catalog  listing           x
> > 
> > h2. Library    catalog  listing           x
> > 
> > p(subtitle).   A    complete listing of all material in the Library
> > 
> > ==============
> > 
> > The "x"s in the input file are to show that while the end tags are
> > being removed the space before them is NOT.
> > 
> > t.
> 
> puts IO.readlines("data2").map{|line|
>   line.sub( /<(h\d)>/, '\1. ' ).sub( /<\/h\d>/, "").
>   squeeze " " }
> 
> --- output ---
> 
> h1. Library catalog listing x
> 
> h3. Library catalog listing x
> 
> h2. Library catalog listing x
> 
> p(subtitle). A complete listing of all material in the Library

puts IO.read("data2").gsub( /<(h\d)>/, '\1. ' ).gsub( /<\/h\d>/, "").
  squeeze " "
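
As for why the original attempts fail, here is a minimal sketch of my reading of it. When gsub! is given a String as the pattern, the string is matched literally, so " +" looks for a space followed by a plus sign rather than a run of spaces. And squeeze returns a new string without touching the receiver, so its result is thrown away unless it is assigned or squeeze! is used. The sample line is taken from the input file above; the variable names are only for illustration.

```ruby
# One line from the input file in the original post.
data = "<h1>Library    catalog  listing           </h1>x\n"

# A String pattern is matched literally: " +" means "space, then plus sign".
# Nothing in the data matches, so the result equals the input.
literal = data.gsub(" +", " ")

# A Regexp pattern collapses each run of spaces into a single space.
collapsed = data.gsub(/ +/, " ")

# squeeze returns a new string; the receiver is left unchanged.
copy = data.dup
copy.squeeze(" ")
unchanged = (copy == data)

# squeeze! modifies the receiver in place.
copy.squeeze!(" ")

puts literal
puts collapsed
puts unchanged
puts copy
```

So in the conversion table, replacing the " +" entry with the regexp / +/ should be enough, and the separate squeeze call then becomes unnecessary.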