Ok, I'm going to go out on a limb here and say HOLY GOD THIS IS AWESOME.

Sorry for the shouting.

vikkous wrote:
> I would like to announce the first version, 0.4.0, of Reg, the Ruby
> Extended Grammar. Reg is a library for pattern matching in ruby data
> structures. Reg provides Regexp-like match and match-and-replace for
> all data structures (particularly Arrays, Objects, and Hashes), not
> just Strings.
> 
> The Reg RubyForge project:   http://rubyforge.org/projects/reg/
> 
> The Reg Tarball:
> http://rubyforge.org/frs/download.php/4199/reg-0.4.0.tar.bz2
> 
> Reg is best thought of in analogy to regular expressions; Regexps are
> special data structures for matching Strings; Regs are special data
> structures for matching ANY type of ruby data (Strings included, using
> Regexps).
> 
> This table compares syntax of reg and regexp for various constructs.
> Keep in mind that all Regs are ordinary ruby expressions. The special
> syntax is achieved by overriding ruby operators.
> These abbreviations are used:
> re,re1,re2 represent arbitrary regexp subexpressions,
> r,r1,r2 represent arbitrary reg subexpressions
> s,t represent any single character (perhaps appropriately escaped, if
> the char is magical)
> 
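On the point about overriding ruby operators: here is a minimal sketch of how a `+[...]` literal can turn into a matcher object at all. This is a toy of my own, not Reg's actual implementation; the `SeqMatcher` class and its behaviour are assumptions made up for illustration, and it handles only fixed-length sequences.

```ruby
# Hypothetical toy, NOT Reg's real code: it only shows how unary + on an
# Array literal can build a matcher object.
class SeqMatcher
  def initialize(items)
    @items = items
  end

  # Match when the candidate is an Array of the same length and every
  # pattern item accepts the corresponding element via ===.
  def ===(other)
    other.is_a?(Array) &&
      other.size == @items.size &&
      @items.zip(other).all? { |pat, elem| pat === elem }
  end
end

class Array
  # Array has no unary plus of its own, so +[...] dispatches here.
  def +@
    SeqMatcher.new(self)
  end
end

pattern = +[Array, Integer]
puts pattern === [[1, 2], 3]   # true
puts pattern === [3, [1, 2]]   # false
```

Because the pattern is just an object with `===`, it also works in a `case`/`when`, and nesting falls out for free since the inner `+[...]` is itself a matcher.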
> 
> reg           regexp               #description
> 
> +[r1,r2,r3]   /re1re2re3/          #sequence
> -[r1,r2]      (re1re2)             #subsequence
> r.lit         \re                  #escaping a magical
> regproc{r}    #{re}                #dynamic inclusion
> r1|r2 or :OR  (re1|re2) or [st]    #alternation
> ~r            [^s]                 #negation (for scalar r and s)
> r+0           re*                  #zero or more matches
> r+1           re+                  #one or more matches
> r-1           re?                  #zero or one matches
> r*n           re{n}                #exactly n matches
> r*(n..m)      re{n,m}              #at least n, at most m matches
> r+n           re{n,}               #at least n matches
> r-m           re{,m}               #at most m matches
> OB            .                    #a single item
> OBS           .*                   #zero or more items
> BR[1,2]       \1,\2                #backreference   ***
> r>>x or sub   sub,gsub             #search and replace   ***
> 
> 
> here are features of reg that don't have an equivalent in regexp
> r.la                  #lookahead ***
> ~-[]                  #subsequence negation w/lookahead ***
> & or :AND             #all alternatives match
> ^ or :XOR             #exactly one of alternatives matches
> +{r1=>r2}             #hash matcher
> -{name=>r}            #object matcher
> obj.reg               #turn any ruby object into a reg that matches if
>                       #obj.=== succeeds
> /re/.sym              #a symbol regex
> proceq(klass){rcode}  #a proc{} that responds to === by invoking the
>                       #proc's call
> OBS as un-anchor      #opposite of ^ and $ when placed at edges of a
>                       #reg array (kinda cheesy)
> name=r                #named subexpressions
> 
> recursive matches via regvariables&regconstants  ***
> 
> *** = not implemented yet.
> 
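Since several of these features lean on `===`, here is a quick plain-Ruby reminder of what that operator already does for ordinary objects. Nothing below uses Reg; it only shows the built-in behaviour that "matches if obj.=== succeeds" presumably refers to. Note that in current Ruby versions `Proc#===` invokes the proc, which may not have been the case when the announcement was written.

```ruby
# Plain Ruby, no Reg required: each of these objects already acts as a
# one-item matcher through its own === method.
matchers = {
  Integer => 42,        # Class#=== tests is_a?
  /baz/   => "foobaz",  # Regexp#=== tests for a match
  (1..5)  => 3,         # Range#=== tests inclusion
  :sym    => :sym       # the default === falls back to equality
}

matchers.each do |matcher, value|
  puts "#{matcher.inspect} === #{value.inspect} -> #{matcher === value}"
end

# Proc#=== calls the proc with the candidate as its argument, so a
# lambda can serve as a matcher as well.
even = ->(n) { n.is_a?(Integer) && n.even? }
puts even === 10   # true
puts even === 7    # false
```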
> 
> Reg is kind of hard to wrap your mind around, so here are some
> examples:
> 
> Matches array containing exactly 2 elements; 1st is another array, 2nd
> is integer:
> +[Array,Integer]
> 
> Like above, but 1st is an array of arrays of symbols:
> +[+[+[Symbol.reg+0]+0],Integer]
> 
> Matches array of at least 3 consecutive symbols and nothing else:
> +[Symbol.reg+3]
> 
> Matches array with at least 3 symbols in it somewhere:
> +[OBS, Symbol.reg+3, OBS]
> 
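For comparison, this is roughly the hand-written check I would otherwise end up with for "at least 3 symbols in a row somewhere in the array". It is a plain-Ruby sketch of my reading of the pattern above, with a helper name I made up; it does not use Reg.

```ruby
# Hypothetical helper: true when arr is an Array holding a run of at
# least 3 consecutive Symbols anywhere inside it.
def three_symbols_in_a_row?(arr)
  arr.is_a?(Array) &&
    arr.each_cons(3).any? { |window| window.all? { |x| Symbol === x } }
end

puts three_symbols_in_a_row?([1, :a, :b, :c, "x"])  # true
puts three_symbols_in_a_row?([:a, 1, :b, 2, :c])    # false
```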
> Matches array of at most 6 strings starting with 'g':
> +[/^g/-6]    #no .reg necessary for regexp
> 
> Matches array of between 5 and 9 hashes containing a key :k pointing to
> something non-nil:
> +[ +{:k=>~nil.reg}*(5..9) ]
> 
> Matches an object with Integer instance variable @k and property (ie
> method) foobar that returns a string with 'baz' somewhere in it:
> -{:@k=>Integer, :foobar=>/baz/}
> 
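To check my understanding of the object matcher, here is what I take it to mean, spelled out in plain Ruby. The `object_matches?` helper and the `Widget` class are hypothetical, and the rule that names starting with "@" are instance variables while everything else is a method call is my assumption about the semantics, not something taken from Reg's source.

```ruby
# Hypothetical helper, not part of Reg: names starting with "@" are read
# as instance variables, anything else is called as a public method.
def object_matches?(obj, spec)
  spec.all? do |name, matcher|
    if name.to_s.start_with?("@")
      obj.instance_variable_defined?(name) &&
        matcher === obj.instance_variable_get(name)
    else
      obj.respond_to?(name) && matcher === obj.public_send(name)
    end
  end
end

# Made-up class with an instance variable @k and a method foobar.
class Widget
  def initialize(k)
    @k = k
  end

  def foobar
    "xx baz xx"
  end
end

spec = { :@k => Integer, :foobar => /baz/ }
puts object_matches?(Widget.new(7), spec)        # true
puts object_matches?(Widget.new("seven"), spec)  # false
```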
> Matches array of 6 hashes with 6 as a value of at least one key,
> followed by 18 objects with an attribute @s which is a String:
> +[ +{OB=>6}*6, -{:@s=>String}*18 ]
> 
> 
> Status:
> Some highly nested vector reg constructions still don't work quite
> right.  (For examples, search on eat_unworking in regtest.rb.) A number
> of features are unimplemented at this point, most notably
> backreferences and substitutions.
> 