I'm sure there must be a more idiomatic and efficient way to do this,
but I can't figure it out. Any suggestions?
Also, I'm not sure all of the tests are necessary. Many of them were
added to avoid "Nil class does not implement..." messages. Is there a
better approach?
# parse1 separates a chunk into the non-word stuff before it, the word
# stuff, and the non-word stuff after it
# word stuff runs from the first to the last letter, digit, or hyphen;
# periods and other characters can appear inside it
def parse1(chunk)
  pb = /^([^-A-Za-z0-9]*)/   # leading run of non-word characters
  pe = /([^-A-Za-z0-9]*)$/   # trailing run of non-word characters
  mtch = pb.match(chunk)
  a = mtch[0]
  mtch = pe.match(mtch.post_match)
  b = mtch.pre_match
  c = mtch[0]
  #print " parse1:a #{a.inspect} " if a and a.length > 0
  yield a if a and a.length > 0
  #print " parse1:b #{b.inspect} " if b and b.length > 0
  yield b if b and b.length > 0
  #print " parse1:c #{c.inspect} " if c and c.length > 0
  yield c if c and c.length > 0
end
# parse2 takes a hunk of word stuff, and possibly separates it at a
# double-hyphen
def parse2(chunk)
  # The capture group makes split keep each "--" separator in its result,
  # so no separator has to be re-added (or removed) afterwards.
  # Empty pieces, e.g. the one before a leading "--", are skipped.
  chunk.split(/(--)/).each { |v| yield v unless v.empty? }
end
# parse3 takes a hunk of word stuff, and possibly separates it at an
# ellipsis (triple '.')
def parse3(chunk)
  # The capture group makes split keep each "..." separator in its result,
  # so no separator has to be re-added (or removed) afterwards.
  # Empty pieces, e.g. the one before a leading "...", are skipped.
  chunk.split(/(\.\.\.)/).each { |v| yield v unless v.empty? }
end
def wrds(chunk)
  return "" unless chunk.respond_to?("[]")
  parse1(chunk) do |p1|
    #print " wrds:p1 = #{p1.inspect} "
    parse2(p1) do |p2|
      #print " wrds:p2 = #{p2.inspect} "
      parse3(p2) do |p3|
        #print " wrds:p3 = #{p3.inspect} "
        yield p3
      end
    end
  end
end
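
For comparison, here is a minimal sketch of how the same splitting might be done in one pass. It is not the code above: `each_token` is a hypothetical name, and the sketch assumes each chunk is a single string with no embedded newlines. One regex separates the leading and trailing non-word runs from the core, and a `split` with a capture group keeps the "--" and "..." separators as pieces of their own.

```ruby
# Hypothetical one-pass variant of wrds (illustrative name, not from the post).
# Assumes chunk is a single string without embedded newlines.
def each_token(chunk)
  # Without a block, return an Enumerator so callers can use to_a, map, etc.
  return enum_for(:each_token, chunk) unless block_given?
  # Anything that is not a String (nil, for example) yields nothing.
  return unless chunk.is_a?(String)

  # lead:  leading run of characters that are not letters, digits, or hyphens
  # core:  everything from the first to the last word character
  # trail: trailing run of non-word characters
  # Every part may be empty, so this pattern matches any string.
  lead, core, trail =
    chunk.match(/\A([^-A-Za-z0-9]*)(.*?)([^-A-Za-z0-9]*)\z/m).captures

  [lead, core, trail].each do |part|
    # The capture group keeps each "--" or "..." separator in the result.
    part.split(/(--|\.\.\.)/).each do |piece|
      yield piece unless piece.empty?
    end
  end
end

p each_token("(foo--bar...baz)!").to_a
# prints ["(", "foo", "--", "bar", "...", "baz", ")!"]
```

Because the regex always matches and `captures` always returns strings, the only guard needed in this sketch is the `is_a?(String)` check on the argument.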