On Saturday 31 May 2003 1:28 am, nobu.nokada / softhome.net wrote:
> Hi,
>
> At Sat, 31 May 2003 08:59:45 +0900,
> Wesley J Landaker wrote:
> > Mine also does the + and | operators, as well, though; I'm not sure
> > if that's universally useful.
>
> As for +, is it right to just concatenate them?  Regexp#| is
> provided in lib/eregex.rb.  And you can see also
> <http://member.nifty.ne.jp/nokada/archive/reop.rb>.

Well, + meaning concatenation makes sense to me. What else would it
mean? Notice that I put the regexps in (?:) groups so that there is no
ambiguity if you do something like:

/foo|bar/ + /.*/ # => /(?:foo|bar)(?:.*)/u

(vs. getting /foo|bar.*/, which I think would not be what you expected,
especially if the regexps were extremely complex)

I wasn't aware that there were so many other regexp-operator packages.
It must be a good idea if several different people have thought of
it. ;)

One thing that's missing from the packages you point at is that the
object you get back isn't completely usable as a regexp. They could be
extended to have the missing methods, of course, but they don't
currently support them. And if you've added or modified any methods in
Regexp, these objects are of a different type (and aren't descendants
of Regexp), so they won't pick up the changes (say, if I redefine to_s
or source or something like that). For example:

irb(main):001:0> require 'eregex'
=> true
irb(main):002:0> x = /foo/ | /bar/
=> #<RegOr:0x401c0ba4 @re2=/bar/, @re1=/foo/>
irb(main):003:0> /test/.methods - x.methods
=> ["casefold?", "|", "source", "&", "~", "match", "kcode"]

Anyway, it looks like eregex's & is pretty handy, and your reop.rb
looks even better, but for me, I think mine is a lot more useful in
that it is totally transparent: when you do an operation on regexps,
you get a regexp back.
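To make the "you get a regexp back" point concrete, here is a minimal sketch of grouped + and | operators. It is not the RegexpOps.rb code from this thread, and it assumes a current Ruby, where Regexp#to_s already wraps the pattern in a (?-mix:...) group that plays the role of the explicit (?:) grouping described above.

```ruby
# Sketch only: these definitions are an assumption for illustration,
# not the operator package discussed in this thread.
class Regexp
  # Concatenation. Regexp#to_s returns the pattern wrapped in a
  # non-capturing group such as "(?-mix:foo|bar)", so an alternation
  # in one operand cannot leak into the other.
  def +(other)
    Regexp.new("#{self}#{other}")
  end

  # Alternation, grouped the same way.
  def |(other)
    Regexp.new("#{self}|#{other}")
  end
end

combined = /foo|bar/ + /.*/
naive    = /foo|bar.*/   # what ungrouped concatenation would produce

puts combined.class                  # Regexp
puts combined.match("foo123")[0]     # foo123
puts naive.match("foo123")[0]        # foo

# The result is an ordinary Regexp, so no Regexp methods are missing.
puts((/test/.methods - combined.methods).inspect)   # []
```

Because the result is built with Regexp.new, any method added to or redefined on Regexp applies to it as well, which is the transparency argument made above.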
Mine doesn't create an object hierarchy the way the other two you cited
do; I toyed with that idea, but I didn't like it because I got objects
back that behaved differently from regexps and couldn't easily be
redefined without some intimate knowledge of the operator package.

BTW, I never wrote '&' because I didn't really need it, but it could be
done with something like this:

In RegexpOps.rb:

# the other code I posted goes here
class Regexp
  def &(other)
    /(?=#{self})#{other}/u
  end
end

Then:

irb(main):001:0> require 'RegexpOps'
=> true
irb(main):002:0> /foo/ & /bar/
=> /(?=(?:foo))(?:bar)/u

Of course, that regexp will never match anything (the lookahead wants
"foo" at the very position where "bar" has to start), but you get the
idea. ;)

> > Looks like 1.8 still doesn't catch encoding flag in this case;
> > there doesn't appear to be any '(?' prefix that changes encodings,
> > though, which would be a prerequisite. (Personally, I'm happy with
> > UTF-8. ;)
>
> Current regexp engine (and perhaps Oniguruma too) can not mix
> encodings.  Well, would it be better to preserve it and raise
> an exception when it doesn't match?

For me, the encodings are not a problem, as I only use UTF-8; I do a
lot of multilingual stuff, and UTF-8 is the only way I can support
English, French, Spanish, German, and Japanese (strange mix, but those
are the languages I work with!) simultaneously in Ruby.

In general, though, it seems like a good idea to catch attempts at
mixing encodings and raise an exception if they are incompatible. I
might add that to mine.

-- 
Wesley J. Landaker - wjl / icecavern.net
OpenPGP FP: C99E DF40 54F6 B625 FD48 B509 A3DE 8D79 541F F830
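P.S. The encoding check mentioned above could look roughly like the sketch below. It assumes a current Ruby, where Regexp#encoding and Regexp#fixed_encoding? are available (they were not in 1.8); the helper name concat_same_encoding and the choice of ArgumentError are hypothetical, chosen only for illustration.

```ruby
# Hypothetical helper: the name and the exception class are assumptions.
# It refuses to combine two regexps that are each pinned to a different
# encoding, and otherwise concatenates them with grouping.
def concat_same_encoding(a, b)
  if a.fixed_encoding? && b.fixed_encoding? && a.encoding != b.encoding
    raise ArgumentError,
          "cannot mix regexp encodings: #{a.encoding} and #{b.encoding}"
  end
  # Regexp#to_s wraps each pattern in a non-capturing group.
  Regexp.new("#{a}#{b}")
end

utf8 = /foo/u   # pinned to UTF-8
euc  = /bar/e   # pinned to EUC-JP

# Same encoding on both sides: the concatenation goes through.
p concat_same_encoding(utf8, /baz/u).match?("foobaz")

# Different fixed encodings: the helper raises instead of combining.
begin
  concat_same_encoding(utf8, euc)
rescue ArgumentError => e
  puts e.message
end
```

A regexp without a fixed encoding is treated as compatible with anything here; whether that is the right policy is a design choice, not something settled in this thread.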