"Brian Candler" <B.Candler / pobox.com> wrote in message
news:20030522114658.A83352 / linnet.org...
> There was a discussion a few weeks back about Ruby's handling of ^ and $ in
> regexps, and I have realised what makes me so uncomfortable with it. I'm used
> to matching strings on /^...$/ to mean "match exactly this", and it doesn't
> work. In fact it could lead to very nasty security holes. Consider this
> example:
>
>        str = cgi['unsafe_item']
>        str.untaint if str =~ /^[a-z0-9]+$/
>
> Looks perfectly safe, doesn't it? Errm, no.
>
>        str = "rm -rf /*\nabcde\ndrop table master_db;"
>        puts "oops!" if str =~ /^[a-z0-9]+$/   #>> "oops!"
>
> For this to be safe, you actually have to write:
>
>       str.untaint if str =~ /\A[a-z0-9]+\z/
>
> The asymmetry between \A and \z is annoying (I have to keep looking it up to
> remember which one is capital and which is lower-case), and it leaves
> regular expressions looking a lot less readable.

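I always use the uppercase pair, \A and \Z. That's a reasonable choice if
you process lines from a file, because \Z still matches just before the
trailing newline that gets leaves on each line, as in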

while (line = gets)
  case line
  when /\Abegin\Z/
    # ...
  end
end

\A and \Z might even be more mnemonic than ^ and $ if you think about it
for a moment - but then, we're used to cryptic symbols. :-)

> I guess this is fixed in concrete now, but I thought it was worth pointing
> this out as potentially a very important "gotcha"

Yes, it really is.  But I would not blame regexp syntax.  Applications
that do potentially dangerous things with input from the outside world
should be crafted carefully anyway.

Regards

    robert