On Tue, Sep 13, 2011 at 3:01 PM, Cyril J. <cyril.varghese.jose / gmail.com> wrote:
>> What exactly are "header lines" in your case? Without knowing the
>> format and your parsing / processing requirements it's difficult to
>> come up with suggestions.
>
> A header line starts with ">"
>
> First Entry, first file:
> >1_4_138_F5-P2
> 234234234234234
>
> First Entry, second file:
> >1_4_138_F3
> 234234234234234
>
> I have two large files (several gigs) that I need to read, each with a
> bunch of "entries" like the ones shown above. I need to match the header
> lines: in this case I need to make sure "1_4_138_" is the same in both
> entries and, if it is, write the entries with matching headers to new
> separate files.

Do you have guaranteed ordering for the header lines in each file?
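
For illustration only, here is a minimal Python sketch of one way to do the matching that does not depend on the answer to the ordering question. It makes two passes: first it collects the header keys of each file, then it streams each file again and writes only the entries whose key occurs in both. The names `header_key`, `read_entries`, `collect_keys` and `write_matching` are made up for this example, and the key rule (everything after ">" up to and including the last "_") is an assumption based on the single "1_4_138_" sample above. Only the keys are held in memory, not the entries, but with several gigs of input even the key sets may be large; if ordering turns out to be guaranteed, a single lockstep pass would need far less memory.

```python
import io
from typing import Iterator, List, Set, TextIO, Tuple


def header_key(header: str) -> str:
    """Assumed key rule: text after '>' up to and including the last '_'.

    Headers without any '_' all map to the empty key under this rule.
    """
    name = header[1:].strip()
    return name[: name.rfind("_") + 1]


def read_entries(stream: TextIO) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header, body_lines) one entry at a time, streaming the input."""
    header = None
    body: List[str] = []
    for line in stream:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line, []
        elif header is not None:
            body.append(line)
    if header is not None:
        yield header, body


def collect_keys(stream: TextIO) -> Set[str]:
    """First pass: gather the set of header keys present in one file."""
    return {header_key(header) for header, _ in read_entries(stream)}


def write_matching(src: TextIO, dst: TextIO, keys: Set[str]) -> int:
    """Second pass: copy entries whose key is in `keys`; return how many."""
    written = 0
    for header, body in read_entries(src):
        if header_key(header) in keys:
            dst.write(header + "\n")
            for line in body:
                dst.write(line + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two large files (hypothetical data).
    first = ">1_4_138_F5-P2\n234234234234234\n>9_9_9_F5-P2\n111\n"
    second = ">1_4_138_F3\n234234234234234\n>7_7_7_F3\n222\n"

    common = collect_keys(io.StringIO(first)) & collect_keys(io.StringIO(second))

    out_first, out_second = io.StringIO(), io.StringIO()
    write_matching(io.StringIO(first), out_first, common)
    write_matching(io.StringIO(second), out_second, common)
    print(out_first.getvalue(), end="")
    print(out_second.getvalue(), end="")
```

With real files you would open each path twice with `open()`, once for `collect_keys` and once for `write_matching`, instead of using `io.StringIO`.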