On Jan 24, 2008, at 9:23 AM, Dan Cuddeford wrote:
> So it seems using the two together
>
> require 'uri'
>
> uri = URI.parse("http://www.ruBy-lang.org/ARSE")
> can = uri.normalize
> p can
> p can.host
> p can.path
>
> means the path keeps its case sensitivity but the host is normalized.
>
> I think that's it - however, try it with ruby-lang..org and
>
> /usr/lib/ruby/1.8/uri/generic.rb:195:in `initialize': the scheme http
> does not accept registry part: www.ruBy-lang..org (or bad hostname?)
> (URI::InvalidURIError)
>         from /usr/lib/ruby/1.8/uri/http.rb:78:in `initialize'
>         from /usr/lib/ruby/1.8/uri/common.rb:488:in `new'
>         from /usr/lib/ruby/1.8/uri/common.rb:488:in `parse'
>         from canon.rb:3
>
> So I guess it needs a bit of error checking beforehand.

require 'uri'

def canonicalize(uri)
  # Accept either a URI object or anything that responds to to_s
  u = uri.kind_of?(URI) ? uri : URI.parse(uri.to_s)
  # Lowercases the scheme and host; the path keeps its case
  u.normalize!
  newpath = u.path
  # Repeatedly drop "segment/../" pairs, leaving a leading "../.." alone
  while newpath.gsub!(%r{([^/]+)/\.\./?}) { |match|
      $1 == '..' ? match : ''
    } do end
  # Collapse "/./" and a trailing "/."
  newpath = newpath.gsub(%r{/\./}, '/').sub(%r{/\.\z}, '/')
  u.path = newpath
  u.to_s
end

canonicalize('http://www.Ruby-Lang.ORG/ARSE/done/../../rear/./end/.')
=> "http://www.ruby-lang.org/rear/end/"

-Rob

Rob Biedenharn
http://agileconsultingllc.com
Rob / AgileConsultingLLC.com
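
On the "error checking" point, here is a rough sketch of one way to wrap the idea so that a bad URI returns nil instead of raising. The names safe_canonicalize and remove_dot_segments are made up for this example and are not part of the uri library. It walks the path segment by segment rather than using the gsub! loop, since as far as I can tell a path that begins with "../.." keeps matching in that loop (gsub! returns non-nil whenever the pattern matched, even if the block puts the match back). It also assumes that a host containing ".." should be rejected by hand, in case the URI parser in use accepts it.

```ruby
require 'uri'

# Hypothetical helper: collapse "." and ".." path segments one at a time.
def remove_dot_segments(path)
  out = []
  segments = path.split('/', -1)        # -1 keeps trailing empty strings
  segments.each_with_index do |seg, i|
    last = (i == segments.length - 1)
    case seg
    when '.'
      out << '' if last                 # "/a/." becomes "/a/"
    when '..'
      out.pop if out.length > 1         # never pop the leading empty segment
      out << '' if last                 # "/a/b/.." becomes "/a/"
    else
      out << seg
    end
  end
  out.join('/')
end

# Hypothetical wrapper: returns the canonical string, or nil for a bad URI.
def safe_canonicalize(uri)
  u = URI.parse(uri.to_s).normalize
  # Assumption: an empty host label ("..") counts as invalid even if the
  # parser let it through; URIs without a host are rejected as well.
  return nil if u.host.nil? || u.host.include?('..')
  u.path = remove_dot_segments(u.path)
  u.to_s
rescue URI::InvalidURIError
  nil
end
```

With this, the example URL from above still gives "http://www.ruby-lang.org/rear/end/", while the ruby-lang..org case comes back as nil. Whether nil or an exception is the better signal depends on the caller.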