On 6/10/07, Logan Capaldo <logancapaldo / gmail.com> wrote:
> On 6/10/07, Robert Dober <robert.dober / gmail.com> wrote:
> >
> > On 6/10/07, Trochalakis Christos <yatiohi / ideopolis.gr> wrote:
> > > Hello!
> > >
> > > I want to parse a tagged string like this:
> > > "<i>this is</i><i>my string</i>"
> > >
> > > I am doing:
> > >
> > > >> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> > > => [["this is</i><i>my string"]]
> > >
> > > What I want is a regex that will return the *first* segment that
> > > matches.
> > > In the above case -> [["this is", "my string"]]
> > >
> > > Is there any way to do this?
> > >
> > > Thanks!
> > >
> >
> > This is a FAQ, and yes I will give the solution ;)
> > Regexps are greedy by default: they consume as many chars as
> > possible. There are some possibilities (not tested):
> >
> > (1) use non-greedy matches
> > "<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
> > (2) use less general expressions
> > "<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*)<\/i>/)
> > (3) Combine both ;)
> > "<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)
> >
>
> Unless you want to match strings like <i><foo</i>, it would be simpler to
> just use [^<]*, and not .[^<]*. .[^<]* will also not match <i></i>. If the
> intent was to make the regexp not match that, a better regexp would be [^<]+

Thanks for correcting my typos.

Robert

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw
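
For anyone finding this thread later, here is a small sketch of the variants discussed above, run against the string from the original question. The constant names (SAMPLE, GREEDY, LAZY, NEGATED, NONEMPTY, DOTTED) are made up for this illustration and do not come from the thread. Note that scan with a single capture group returns one one-element array per match, so flatten is used to get the flat list the original poster was after.

```ruby
# Illustrative constant names; they are not part of the original thread.
SAMPLE = "<i>this is</i><i>my string</i>"

GREEDY   = /<i>(.*)<\/i>/      # .* runs on to the LAST </i>
LAZY     = /<i>(.*?)<\/i>/     # .*? stops at the FIRST </i>
NEGATED  = /<i>([^<]*)<\/i>/   # [^<]* cannot cross a "<"
NONEMPTY = /<i>([^<]+)<\/i>/   # same, but requires at least one character
DOTTED   = /<i>(.[^<]*)<\/i>/  # option (2) as originally posted

p SAMPLE.scan(GREEDY)          # => [["this is</i><i>my string"]]
p SAMPLE.scan(LAZY).flatten    # => ["this is", "my string"]
p SAMPLE.scan(NEGATED).flatten # => ["this is", "my string"]

# Edge cases raised in the follow-up:
p "<i></i>".scan(NEGATED)      # => [[""]]
p "<i></i>".scan(NONEMPTY)     # => []
p "<i></i>".scan(DOTTED)       # => []
p "<i><foo</i>".scan(DOTTED)   # => [["<foo"]]
p "<i><foo</i>".scan(NEGATED)  # => []
```

The lazy and negated-class forms agree on this input. They differ only when the tag body itself contains a "<", in which case the negated class refuses to match while the lazy form still stops at the first closing tag.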