On 30.03.2007 17:34, Jon wrote:
> I'm trying to translate a strange derivative of xml into valid xml. Here
> is an example line:
>
> <SUBEVENTSTATUS 1:2><OPERATIONNAME></OPERATIONNAME>gofast<OPERATIONSTATUS>stopped</OPERATIONSTATUS><TARGETOBJECTNAME>name</TARGETOBJECTNAME><TARGETOBJECTVALUE>val</TARGETOBJECTVALUE></SUBEVENTSTATUS 1:1><SUBEVENTSTATUS 2:2><......and on
>
> REXML pukes on the <SUBEVENTSTATUS 1:2> tag... which it should. There
> should be some kind of attribute declaration instead. I want to
> translate it to something like this: <SUBEVENTSTATUS no="1" of="2">
>
> I'm trying to make a regex to detect the funny tags. Here is what I have
> so far:
>
> xml_fix=/<(\S+)\s+(\d+):(\d+)>/
>
> This is great, but it will match this:
>
> <Request><code_set_list 1:2>
>
> instead of just this:
>
> <code_set_list 1:2>
>
> ..because there is no guaranteed whitespace between tags. Basically, I
> need to stop matching if a ">" is found. I've never had to deal with
> anything quite like this in my regex experience. Any help or thoughts of
> a better way to do things is much appreciated!

I can think of several solutions. The simplest is to keep the tag name from running past a ">":

/<([^>\s]+)\s+(\d+):(\d+)>/

Or even a two-phase approach. First match each complete tag:

/<[^>]+>/

and then test that match for the trailing numbers:

/(\d+):(\d+)>\z/

HTH

robert
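
For illustration, here is a rough sketch of how both suggestions could be plugged into gsub to do the actual rewrite. The method names are made up for this example, and the no/of attribute names are taken from the question. One assumption goes beyond the original post: since the sample also contains closing tags like </SUBEVENTSTATUS 1:1>, the sketch simply drops the numbers from those, because a closing tag cannot carry attributes in valid XML.

```ruby
# Single-pass variant: [^>\s]+ cannot run past a ">" into the next tag.
# The optional leading slash is an addition so closing tags can be told apart.
FUNNY_TAG = /<(\/?)([^>\s]+)\s+(\d+):(\d+)>/

# Hypothetical helper name, not from the original thread.
def fix_funny_tags(text)
  text.gsub(FUNNY_TAG) do
    slash, name, no, of = $1, $2, $3, $4
    if slash.empty?
      %(<#{name} no="#{no}" of="#{of}">)
    else
      "</#{name}>" # closing tag: drop the numbers
    end
  end
end

# Two-phase variant: grab each whole tag first, then look at its tail.
def fix_funny_tags_two_phase(text)
  text.gsub(/<[^>]+>/) do |tag|
    tag.sub(/\s+(\d+):(\d+)>\z/) do
      tag.start_with?("</") ? ">" : %( no="#{$1}" of="#{$2}">)
    end
  end
end

sample = '<Request><code_set_list 1:2>x</code_set_list 1:2></Request>'
puts fix_funny_tags(sample)
puts fix_funny_tags_two_phase(sample)
```

Both calls should leave <Request> and </Request> alone and only touch the tags that end in N:M. Whether the result then parses cleanly in REXML still depends on the rest of the input being well formed.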