In message <2B92A3D0D399D311BA7E00A0C90F8FDD88BF21 / mailgate.snellingcorp.com>
ChrisM / SNELLINGCORP.COM writes:

> I'm in need of a word wrap method -- anyone know of an existing one
> (preferably in Ruby) I could use?

I have some code inspired by filladapt.el.  It's neither beautiful nor
efficient, but at least it works for typical input :-)

The script follows below.  It includes many extra parts, such as EUC-JP
handling and an incomplete implementation of traditional Japanese
formatting.  The part that actually implements word wrap is
String#fold!, roughly:

  str = ""
  bol = 0
  while bol < size
    index(/(?:.*?)?.{1,#{w}}#{EOLHangingChars}*(?:\s|\Z|(?=[\xa1-\xfe]{2}+(?:[ -~\s]|\Z)))/n, bol)
    ln = $&
    bol += ln.size
    str << ln+"\n"
  end
  str

Ehm, if the target strings use only a codeset such as ASCII or Latin1,
which takes just 1 byte for each character, the Regexp passed to index
can be much simpler....


Hmm, something like

  ruby -e 'puts $<.read.gsub(/\n/, " ").gsub(/.{1,65}(?:\s|\Z)/){$&+"\n"}' foo

for 65 columns.
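
The same one-liner can be packaged as a small method.  This is only a
minimal sketch for single-byte text: the name wrap and its width
parameter are made up for illustration and are not part of the script
below.  Each output line keeps the whitespace it was cut at, and a word
longer than the width is not split.

```ruby
# Hypothetical helper, not part of the original script.
# Assumes one byte (one column) per character, e.g. ASCII or Latin1.
def wrap(text, width = 65)
  # Join everything into one long line, then cut it at whitespace so
  # that each piece holds at most `width` characters before the break.
  text.gsub(/\n/, " ").gsub(/.{1,#{width}}(?:\s|\Z)/) { $& + "\n" }
end

puts wrap("The quick brown fox jumps over the lazy dog.", 20)
```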


-- 
kjana / os.xaxon.ne.jp                               February 9, 2001
Never put off till tomorrow what you can do today.

#!/usr/local/bin/ruby -Ke

class String
  BOLInhibitChars = %q=(?:\xa1[\xb3-\xb9\xbc-\xbe\xc5\xc6\xeb-\xed])=
  EOLInhibitChars = %q=(?:[({[`]|\xa1[\xc6\xc8\xca\xcc\xce\xd0\xd2\xd4\xd6\xd8\xda])=
  EOLHangingChars = %q=(?:[]}),.:;!?'"]|\xa1[\xa2-\xad\xc7\xc9\xcb\xcd\xcf\xd1\xd3\xd5\xd7\xd9\xdb\xeb\xec\xed])=

  def width
    str = self.dup
    str.expand!
    str.size
  end

  def expand!(tw = 8)
    true while sub!(/(^|\n)([^\t\n]*)\t/) { $1+$2+" "*(tw-$2.size%tw) }
  end

  def join!
    gsub!(/\s*\n(?:\s*|\Z)/, "\n")
    true while sub!(/(.)\n(.)/) { ($1+$2).size != 4 ? $1+" "+$2 : $1+$2 }
    chomp!
    self
  end

  #def fold!(w = 70)
  #  gsub!(/([\xa1-\xfe]{2})([0-9a-zA-Z])/no, "\\1 \\2")
  #  gsub!(/([0-9a-zA-Z])([\xa1-\xfe]{2})/no, "\\1 \\2")
  #  gsub!(/.{1,#{w}}(?:\s|\Z|(?=[\xa1-\xfe]{2}+(?:[ -~\s]|\Z)))/no) { $&+"\n" }
  #  self
  #end

  def fold!(w = 70)
    gsub!(/([\xa1-\xfe]{2})([0-9a-zA-Z])/no, "\\1 \\2")
    gsub!(/([0-9a-zA-Z])([\xa1-\xfe]{2})/no, "\\1 \\2")
    bol = 0
    str = ""
    ln = tmp = nil              # for effectiveness
    while bol < size
      index(/(?:.*?)?.{1,#{w}}#{EOLHangingChars}*(?:\s|\Z|(?=[\xa1-\xfe]{2}+(?:[ -~\s]|\Z)))/n, bol)
      ln = $&
      tmp = ln.dup
      ln.sub!(/(?!#{BOLInhibitChars})#{BOLInhibitChars}*$/no, "") if index(/#{BOLInhibitChars}/no, bol+ln.size) == bol+ln.size
      ln.sub!(/#{EOLInhibitChars}+$/no, "")
      ln = tmp if ln.empty?
      bol += ln.size
      str << ln+"\n"
    end
    self.replace str
    self
  end
end

if $0 == __FILE__
  w = 70
  if ARGV[0] =~ /-w(\d*)/
    ARGV.shift
    if $1 and not $1.empty?
      w = $1.to_i
    else
      w = ARGV.shift.to_i
    end
  end

  DotPref = [
    '\d+\.\s+',                 # 1. foo
    '\(?\d+\)\s+',              # 1) foo or (1) foo
    '[a-zA-Z]\.\s+',            # a. foo
    '\(?[a-zA-Z]\)\s+',         # a) foo or (a) foo
    '\d+[a-zA-Z].?\s+',         # 1a. foo
    '\(?\d+[a-zA-Z]\)\s+',      # 1a) foo or (1a) foo
    '[+-=*o]\s+',               # single char bullets
  ]
  DupPref = [
    '[#>;%]+\s*',               # single char prefixes
    '\w+>\s*',                  # FOO> foo
    '[ \t]+',
  ]
  PrefExp = Regexp.new('^(?:'+DotPref.join('|')+'|'+DupPref.join('|')+')')
  DotPrefExp = Regexp.new('^(?:'+DotPref.join('|')+')')
  DupPrefExp = Regexp.new('^(?:'+DupPref.join('|')+')')

  def format(para, w)
    return "" if para.empty?
    preforig = ""
    prefnext = ""
    para[0].each do |tok|
      case tok[0]
      when :dot
        preforig << tok[1]
        prefnext << " "*tok[1].width
      when :dup
        preforig << tok[1]
        prefnext << tok[1]
      end
    end
    str = para.collect { |e| e.pop.pop }.join("\n").join!.fold!(w-preforig.width)
    str.gsub!(/^/, prefnext)
    str.sub!(/^#{prefnext}/, preforig)
    str
  end

  ARGV.each do |fn|
    File.open(fn) do |inf|
      para = []
      inf.each do |ln|
        pref = []
        if ln =~ /^\s*$/
          ln = [[:empty, ln]]
        else
          while ln =~ PrefExp
            tok = $&
            ln = $'
            pref << [(tok =~ DupPrefExp ? :dup : :dot), tok]
          end
          ln = pref+[[:body, ln]]
        end
        if br = ln.find { |e| e[0] == :empty or e[0] == :dot }
          puts format(para, w) if not para.empty?
          case br[0]
          when :dot
            para.replace [ln]
          when :empty
            puts
            para.clear
          end
          next
        end
        para << ln
      end
      puts format(para, w) if not para.empty?
    end
  end
end
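
The idea behind the repeated prefixes (DupPref) in the script above is
that a prefix such as "> " is removed before filling and put back in
front of every output line.  The following is only a rough, simplified
sketch of that idea for single-byte text.  QUOTE_PREFIX and wrap_quoted
are hypothetical names, and the sketch ignores the bullet-style
prefixes (DotPref) and all of the Japanese handling.

```ruby
# Hypothetical sketch, not the original script: re-wrap one quoted paragraph.
# The pattern is modelled on the DupPref entries (">", "FOO>", indentation).
QUOTE_PREFIX = /\A(?:[#>;%]+\s*|\w+>\s*|[ \t]+)+/

def wrap_quoted(lines, width = 70)
  # Take the prefix from the first line; assume the others share it.
  prefix = lines.first[QUOTE_PREFIX] || ""
  # Strip the prefix from every line and join the rest into one string.
  body = lines.map { |l| l.sub(QUOTE_PREFIX, "").strip }.join(" ")
  # Columns left for the text itself once the prefix is accounted for.
  room = [width - prefix.size, 1].max
  body.gsub(/.{1,#{room}}(?:\s+|\Z)/) { prefix + $&.rstrip + "\n" }
end

quoted = [
  "> I'm in need of a word wrap method -- anyone know of an existing one",
  "> (preferably in Ruby) I could use?",
]
puts wrap_quoted(quoted, 30)
```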