<--- On Mar 12, William James wrote --->
> Aditya Mahajan wrote:
>> I am trying to convert a plain text file into a tex file with
>> some markup. I have the following piece of code that does most of the
>> work. It works fine but does not "look good". Can someone suggest how
>> I can improve this code?
>>
>> ---------------[snip]----------------------
>> file = File.new(filename, 'r')
>> texfile = File.open(basename + ".tex", 'w')
>>
>> CHAPTER = Regexp.new("CHAPTER")
>> SPACES = Regexp.new("\s\s\s\s")
>> BLANK = Regexp.new(/^\s*$/)
>>
>> chapter = true
>> verse = false
>> prev_line = ""
>>
>> file.each_line do |line|
>>   if chapter && !BLANK.match(line)
>>     chapter = false
>>     texfile.puts "\\chapter{#{line.chomp}}"
>>   elsif CHAPTER.match(line)
>>     chapter = true
>>   elsif verse && !SPACES.match(line)
>>     texfile.puts '\stoplines\stopnarrower'
>>     texfile.puts line.chomp
>>     verse = false
>>   elsif !verse && BLANK.match(prev_line) && SPACES.match(line)
>>     texfile.puts '\startnarrower\startlines'
>>     texfile.puts line.chomp
>>     verse = true
>>   else
>>     texfile.puts line.chomp
>>   end
>>   prev_line = line
>> end
>> ------------------[snip]--------------------
>>
> ruby -p01e'gsub(/(\n\s*\n)((?:^\s{4}.*\n)+)/,
> "\\1\\startnarrower\\startlines\n\\2\\stoplines\\stopnarrower\n");
> gsub(/(CHAPTER\s+)(.*)\n/,"\\chapter{\\2}\n")' in >out
>

Thank you for the regex. The chapter part of your regex is better than
what I was doing, but it does not correctly identify the narrower
region. I think that I can tweak it a little to make it work correctly.

What I really want to know is whether there is a better way to do this
in Ruby. I am not too comfortable coding with a regex, as it can be
very difficult to maintain: each time, I have to look into the
expression and work out what it does all over again.

Basically, the logic of the program depends on the "state", which I am
keeping track of using flags. Your code gets rid of the flags by using
a two-pass algorithm. What are the pros and cons of each approach?

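To make the comparison concrete: one way to keep the line-by-line
approach while tidying the flags is to fold them into a single state
variable. The following is only a minimal sketch under my own
assumptions. The method name to_tex, the state labels :title, :prose
and :verse, and the INDENTED pattern (four spaces anchored at the start
of the line) are illustrative and not part of the script quoted above.
Unlike that script, the sketch also closes an open verse block when it
meets a CHAPTER line or the end of the input. It uses Regexp#match?, so
it assumes Ruby 2.4 or later.

```ruby
# Illustrative sketch only: names and state labels are hypothetical.
CHAPTER  = /CHAPTER/
BLANK    = /\A\s*\z/
INDENTED = /\A {4}/   # assumption: verse lines begin with four spaces

# Takes anything that yields lines via #each (an Array, or the
# Enumerator returned by File.foreach) and returns the output lines.
def to_tex(lines)
  out        = []
  state      = :title   # the first non-blank line is a chapter title
  prev_blank = false

  lines.each do |raw|
    line  = raw.chomp
    blank = BLANK.match?(line)

    case state
    when :title
      if blank
        out << line
      else
        out << "\\chapter{#{line}}"
        state = :prose
      end
    when :prose
      if CHAPTER.match?(line)
        state = :title                      # the marker line is dropped
      elsif prev_blank && INDENTED.match?(line)
        out << '\startnarrower\startlines' << line
        state = :verse
      else
        out << line
      end
    when :verse
      if INDENTED.match?(line)
        out << line
      else
        out << '\stoplines\stopnarrower'    # leaving the verse block
        if CHAPTER.match?(line)
          state = :title
        else
          out << line
          state = :prose
        end
      end
    end

    prev_blank = blank
  end

  out << '\stoplines\stopnarrower' if state == :verse  # close at end of input
  out
end

# Possible usage (file names are placeholders):
#   File.write("out.tex", to_tex(File.foreach("in.txt")).join("\n") + "\n")
```

Each branch of the case statement only has to consider the transitions
that are possible from one state, which is what I find easier to
re-read later than a chain of flag tests or a single large regex.
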
When using gsub this way, the program needs to read the entire file
before it can make any changes. I thought that this would be memory
inefficient, but for a ~160 kB file it is almost instantaneous. Even
for a 3 MB file it takes less than a second. At what file sizes should
one read the file line by line rather than the entire file in a single
shot?

Thanks

-- 
Aditya Mahajan, EECS Systems, University of Michigan
http://www.eecs.umich.edu/~adityam || Ph: 7342624008