2007/9/18, Peter Bailey <pbailey / bna.com>:
> Robert Klemme wrote:
> > 2007/9/18, Peter Bailey <pbailey / bna.com>:
> >>
> >> Thanks, William. I tried your regex, but, I'm still getting the first
> >> entry as one that's 300 lines deep into the file. In fact, the results
> >> look exactly the same to me.
> >
> > Still William's regexp is significantly better than the original one.
> > You seem to be processing XML files. It may be that there is some
> > white space between <issueList> and <issue> that you are not prepared
> > for. You can handle that by replacing \n with \s*.
> >
> > A completely different approach is to use REXML or another XML tool
> > and use XPath search. This is way less error prone - but usually also
> > slower. If you just want to extract these codes then a SAX parser
> > approach might still be pretty fast.
> >
> > Kind regards
> >
> > robert
>
> Same old output. I'll look into REXML. I downloaded it.

It's part of the standard distribution.

> But, it's enough
> for me to just learn Ruby. I don't know if I can handle yet another
> scripting language. Anyway, thanks a lot.

Well, as William said: can you show a piece of the document you are
trying to match?

Kind regards

robert
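
For illustration, here is a rough sketch of the two suggestions discussed in this thread: tolerating white space with \s* in the regexp, and an XPath search with REXML. The real document was never posted, so the sample XML, the "code" attribute, and the method names first_issue_code and issue_codes are all assumptions made up for this example; only the <issueList> and <issue> element names come from the thread.

```ruby
require 'rexml/document'

# Hypothetical sample: blank lines and indentation sit between the tags.
# The "code" attribute is an assumption, not taken from the real file.
SAMPLE_XML = <<XML
<issueList>

    <issue code="A1"/>
  <issue code="B2"/>
</issueList>
XML

# Regexp approach: \s* (instead of a literal \n) accepts any run of
# white space, including none, between <issueList> and the first <issue>.
def first_issue_code(xml)
  xml[/<issueList>\s*<issue code="([^"]*)"/, 1]
end

# XPath approach: let REXML parse the document, then select every
# <issue> element under <issueList> and read its "code" attribute.
def issue_codes(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//issueList/issue').map { |el| el.attributes['code'] }
end

puts first_issue_code(SAMPLE_XML)
puts issue_codes(SAMPLE_XML).inspect
```

The regexp version only finds the first entry after <issueList>, while the XPath version collects every <issue> regardless of layout. If the real file stores its codes as element text rather than as an attribute, both methods would need adjusting.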