Harry Ohlsen wrote:
> A colleague of mine just asked me whether it was possible to invert
> an arbitrary regular expression.  Ie, create a new regular expression
> that matches whatever the original *didn't*.
>
> It seems like quite a difficult problem to me, but maybe I'm just not
> looking at it the right way.
>
> Obviously simple cases are easy. Eg
>
>    re = /[a-z]/
>    inverse = /[^a-z]/
>
> However, I think it would be much harder for arbitrary cases.  Eg,
> how would one automatically invert something like the following
>
>    /<[^<>]*id="!?(region8-).*?>/
>
> Note that I'm not talking about finding lines that don't contain the
> pattern (a la "grep -v"); I'm talking about finding all the
> occurrences within a single line that don't match it.

How about splitting on the regex instead of scanning for it? split
returns the pieces of the string that lie between matches, i.e.
everything that didn't match. Note that it drops trailing empty strings
unless you pass a negative limit.

irb(main):002:0> s = 'DELIMtexttextDELIMtextDELIM'
=> "DELIMtexttextDELIMtextDELIM"
irb(main):003:0> s.scan /DELIM/
=> ["DELIM", "DELIM", "DELIM"]
irb(main):004:0> s.split /DELIM/
=> ["", "texttext", "text"]
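One caveat with split, in case it helps: if the regex contains capture
groups (yours has one, "(region8-)"), split also puts the captured text
into the result. Below is a rough sketch of a workaround that walks the
matches with scan and collects the text between them. The method name
unmatched_parts is made up for illustration, it is not a built-in, and I
have only tried it on toy strings like these.

```ruby
# Hypothetical helper (not part of Ruby): return the pieces of +str+
# that lie between successive matches of +re+, ignoring capture groups.
def unmatched_parts(str, re)
  parts = []
  pos = 0
  str.scan(re) do
    m = Regexp.last_match          # MatchData for the current match
    parts << str[pos...m.begin(0)] # text between previous match and this one
    pos = m.end(0)                 # continue after the end of this match
  end
  parts << str[pos..-1]            # whatever follows the last match
  parts
end

s = 'DELIMtexttextDELIMtextDELIM'
p unmatched_parts(s, /DELIM/)      # => ["", "texttext", "text", ""]

# split includes captured groups in its result; the helper does not.
p 'aXbXc'.split(/(X)/)             # => ["a", "X", "b", "X", "c"]
p unmatched_parts('aXbXc', /(X)/)  # => ["a", "b", "c"]
```

Unlike a plain split, this keeps the trailing empty string, so reject
the empty entries if you don't want them.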