"Brian Candler" <B.Candler / pobox.com> schrieb im Newsbeitrag news:20030522114658.A83352 / linnet.org... > There was a discussion a few weeks back about Ruby's handling of ^ and $ in > regexps, and I have realised what may me so uncomfortable with it. I'm used > to matching strings on /^...$/ to mean "match exactly this", and it doesn't > work. In fact it could lead to very nasty security holes. Consider this > example: > > str = cgi['unsafe_item'] > str.untaint if str =~ /^[a-z0-9]+$/ > > Looks perfectly safe, doesn't it? Errm, no. > > str = "rf -rf /*\nabcde\ndrop table master_db;" > puts "oops!" if str =~ /^[a-z0-9]+$/ #>> "oops!" > > For this to be safe, you actually have to write: > > str.untaint if str =~ /\A[a-z0-9]+\z/ > > The asymmetry between \A and \z is annoying (I have to keep looking it up to > remember which one is capital and which is lower-case), and it leaves > regular expressions looking a lot less readable. I always use uppercase, because that's a reasonable choice if you process lines from a file like in while ( line = gets ) do case line when /\Abegin\Z/ ... end end \A and \Z might be even more mnemonic than ^ and $ if you think a moment about it - but then, we're used to cryptic symbols. :-) > I guess this is fixed in concrete now, but I thought it was pointing this > out as potentially a very important "gotcha" Yes, it really is. But I would not blame regexp syntax. Designing applications that do potentially dangerous things with input from the outside world should be crafted carefully anyway. Regards robert