hi Cee -
copying the text you posted above into the file "0text.txt" and
running this:
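f = "0text.txt"
file = File.open(f)
buffer = []
bufferindex = 0
# read the file one ">"-terminated chunk at a time
file.each_line(sep=">"){|line|
buffer[bufferindex] = line.chomp
bufferindex += 1 # was "bufferkey", which is never defined
}
file.close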
p buffer[0]
p buffer[1]
p buffer[2]
p buffer[3]
i get this as output:
#=> ">"
#=> "gi|329295464|ref|NM_2005745.3Acc1| Def1 zgc:65895 (zgc:65895),
mRNA\\n\nAGCTCGGGGGCTCTAGCGATTTAAGGAGCGATGCGATCGAGCTGACCGTCGCG\\n\n\\n\n>"
#=> "gi|456299107|ref|NM_2342343.3Acc2| Def2 zgc:65895 (zgc:65895),
mRNA\\n\nGTCGCTGGGTCGAAAAGTGGTGCTATATCGCGGCTCGCGTCGATGTCGCGATG\\n\nCGTGCGCGCGAGAGCGCGCTATGATGAAAGGATGAGAGAG\\n\n\\n\n>"
#=> "gi|3542945647|ref|NM_7453343.5Acc3| Def3 zgc:65895 (zgc:65895),
mRNA\\n\nCGTGCGGGGABCCGTACGTGCCGTGGGGGTTTAATAGCGCGCCATCTGAGCAG\\n\nTTAGTCGCTGACGCATGCACG\\n\n\\n"
does this work for you? you could easily write ways to deal with,
dump, and reset the buffers when they fill up. you can of course also
clean up all the "\n"'s...
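
here's a rough sketch of what that buffer handling could look like. it is only one way to do it: the helper name each_record_batch and the batch size are made up for illustration, and it assumes the first line of each chunk is the header and the rest is sequence. it also strips the literal backslash-n pairs that show up in the output above.

```ruby
require "stringio"

# hypothetical helper: yields arrays of up to batch_size [header, sequence] pairs
def each_record_batch(io, batch_size = 2)
  buffer = []
  io.each_line(">") do |chunk|
    # drop the trailing ">" and any literal backslash-n pairs (assumption
    # based on the posted output)
    text = chunk.chomp(">").gsub("\\n", "")
    lines = text.split("\n").map(&:strip).reject(&:empty?)
    next if lines.empty?            # skips the empty chunk before the first ">"
    header, *seq = lines            # assumption: first line is the header
    buffer << [header, seq.join]
    if buffer.size >= batch_size
      yield buffer                  # dump the full buffer to the caller
      buffer = []                   # reset: start a fresh buffer
    end
  end
  yield buffer unless buffer.empty? # dump whatever is left at the end
end

sample = StringIO.new(">seq1 first\nAGCT\nGGCC\n>seq2 second\nTTAA\n>seq3 third\nCCGG\n")
each_record_batch(sample, 2) { |batch| p batch }
```

passing an open File instead of the StringIO should work the same way, since both respond to each_line.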
i agree with 7stud that using #pos and #gets seems like the long way
around. i'm pretty green myself, and there are probably
better ways to iterate through the file, but #each_line(sep=">") works
just fine. it only reads one record at a time, so it doesn't eat up
memory as long as you empty the buffer now and then.
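
to illustrate the memory point, here's a minimal sketch that walks a file record by record without keeping anything in an array. count_records is a made-up helper name, and the temp file just stands in for "0text.txt".

```ruby
require "tempfile"

# hypothetical helper: counts records without storing them
def count_records(path)
  count = 0
  File.open(path) do |file|         # block form closes the file automatically
    file.each_line(">") do |chunk|
      # only the current chunk is held in memory here
      count += 1 unless chunk.chomp(">").strip.empty?
    end
  end
  count
end

Tempfile.create("fasta") do |tmp|
  tmp.write(">seq1\nAGCT\n>seq2\nTTAA\n")
  tmp.flush
  puts count_records(tmp.path)
end
```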
- j