On 04.01.2009 22:58, Rob Biedenharn wrote:
> On Jan 4, 2009, at 4:44 PM, Robert Klemme wrote:
>
>> On 04.01.2009 21:46, Rob Biedenharn wrote:
>>> On Jan 4, 2009, at 1:04 PM, Robert Klemme wrote:
>>>> On 04.01.2009 17:29, Sam Fent wrote:
>>>>> This is just a basic parsing question, really. I'm trying to work out
>>>>> how I would process a URL such as
>>>>> "http://www.example.com/x/y/z/myfile.txt" and get back the filename
>>>>> "myfile". Basically the pattern is to get the last part of the string
>>>>> after the final /, and then strip off the filetype.
>>>> IMHO it is not a good idea to use a File method for URLs because
>>>> File.basename has different criteria
>>>>
>>>> irb(main):003:0> File.basename 'http://test.com/aaa\\bbb.txt'
>>>> => "bbb.txt"
>>>>
>>>> Although I am not sure whether a backslash is allowed there, this
>>>> is what I'd do:
>>>>
>>>> irb(main):001:0> url = 'http://www.example.com/x/y/z/myfile.txt'
>>>> => "http://www.example.com/x/y/z/myfile.txt"
>>>> irb(main):002:0> name = url[%r{[^/]+\z}]
>>>> => "myfile.txt"
>>> Rather than jump to a Regexp, just use the right tool for the job.
>>> irb> require 'uri'
>>> => true
>>> irb> u=URI.parse 'http://www.example.com/x/y/z/myfile.txt'
>>> => #<URI::HTTP:0x1cac14 URL:http://www.example.com/x/y/z/myfile.txt>
>>> irb> u.path
>>> => "/x/y/z/myfile.txt"
>>> irb> File.basename u.path, '.txt'
>>> => "myfile"
>> I considered URI as well but what makes your code the "right tool
>> for the job"? Basically you use URI only to extract the path and
>> then use File.basename to get the last bit of the path. But: while
>> the URI path consists of elements separated by "/", File.basename
>> also considers "\\" as delimiter. So IMHO it is by no means "the
>> right tool" - at least not more than using a regular expression
>> which extracts exactly the part needed from the string at hand (and
>> is likely faster as well).
>>
>> The situation would be different if URI provided a method which
>> returns the last path element but as far as I can see this does not
>> exist.
>>
>> Kind regards
>>
>> robert
>
>
> I guess it depends on what your url might look like. For example, if
> it contains a query string:
>
> irb> str = 'http://a.b.c/root/sub/dir/file?param=a'
> => "http://a.b.c/root/sub/dir/file?param=a"
> irb> File.basename str
> => "file?param=a"
>
> Oops! File.basename just doesn't fit.
>
> irb> require 'uri'
> => true
> irb> url = URI.parse(str)
> => #<URI::HTTP:0x1c8446 URL:http://a.b.c/root/sub/dir/file?param=a>
> irb> url.path
> => "/root/sub/dir/file"
> irb> File.basename url.path
> => "file"
>
> The OP will have to make the final tool selection, but there may be
> lurkers that have similar problems who find URI a better fit than File.

Certainly. I do have to say that I get the impression we are talking a
bit past each other. I wasn't advocating to use File.basename at all -
not alone and not in combination with URI! For the URL with query part
I would still rather do

name = URI.parse(str).path[%r{[^/]+\z}]

Kind regards

    robert

-- 
remember.guy do |as, often| as.you_can - without end
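
For lurkers who also need the file type stripped, as the original question asked, the URI-plus-regexp expression above could be wrapped roughly as follows. This is only a sketch: the method name url_basename is made up for illustration and appears nowhere in the thread, and it assumes that only the final ".ext" should be removed and that the input is something URI.parse accepts.

```ruby
require 'uri'

# Hypothetical helper (the name is an assumption, not from the thread).
# Returns the last path segment of a URL without its final extension,
# or nil when the path has no last segment (e.g. it ends in "/").
def url_basename(str)
  path = URI.parse(str).path        # query string and fragment are not part of the path
  segment = path[%r{[^/]+\z}]       # last run of non-slash characters, anchored at the end
  return nil if segment.nil?

  # Drop one trailing ".ext"; the lookbehind keeps a leading dot
  # (as in ".profile") from being treated as an extension.
  segment.sub(/(?<=.)\.[^.]*\z/, '')
end

puts url_basename('http://www.example.com/x/y/z/myfile.txt')
puts url_basename('http://a.b.c/root/sub/dir/file?param=a')
```

Note that URI.parse raises URI::InvalidURIError for strings it cannot parse, so callers with untrusted input may want to rescue that.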