Issue #8352 has been updated by knu (Akinori MUSHA).


I've also checked the `url` module of node.js, and it doesn't squeeze slashes either.  [Their test cases](https://github.com/nodejs/node/blob/78545039d65fa24841454f161c3711ce4b5226bc/test/parallel/test-url-relative.js) do not include explicit examples of how to deal with sequences of slashes in a path, but double slashes are retained in some of the expected results of relative path resolution, which means that double slashes are not subject to squeezing.

Looking into the [WHATWG URL spec](https://url.spec.whatwg.org/), there's no indication that a sequence of slashes in a URL path should be treated specially.  A path is simply a "list" of "items" separated by the slash (/, U+002F), and any item can naturally be an empty string.  Even when resolving a "double-dot segment" and consequently "removing" a path "item", you are never told to "remove" extra items that are empty.
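To illustrate that reading, here is a minimal sketch that treats a path as a list of items and resolves dot segments without ever dropping empty items.  The helper name `resolve_dot_segments` is hypothetical; it is neither the spec's algorithm verbatim nor anything in Ruby's uri library, and it assumes an absolute path.

```ruby
# Hypothetical helper for illustration only; not part of Ruby's uri library.
# Assumes an absolute path, so the first item is always the empty string
# that precedes the leading slash.
def resolve_dot_segments(path)
  items = path.split('/', -1)   # limit -1 keeps trailing empty items
  out = []
  items.each_with_index do |item, i|
    last = (i == items.length - 1)
    case item
    when '.'
      out << '' if last         # a trailing "." leaves a trailing slash
    when '..'
      out.pop if out.length > 1 # drop one item, but never the root
      out << '' if last         # a trailing ".." leaves a trailing slash
    else
      out << item               # empty items are kept like any other item
    end
  end
  out.join('/')
end

puts resolve_dot_segments('/foo//bar/.')
puts resolve_dot_segments('/foo//bar/..')
```

Under these assumptions the empty item between `foo` and `bar` survives in both calls, because nothing in the procedure singles out empty items for removal.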

So, as you can see, Ruby and Python3 are the only exceptions.  No specification indicates that a sequence of slashes in a URL path should be treated specially, and the majority of library implementations in other languages agree with that.  I presume few programmers rely on the current behavior.

----------------------------------------
Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't
https://bugs.ruby-lang.org/issues/8352#change-67823

* Author: knu (Akinori MUSHA)
* Status: Open
* Priority: Normal
* Assignee: akira (akira yamada)
* Target version: 
* ruby -v: ruby 2.1.0dev (2013-05-01 trunk 40540) [x86_64-freebsd9]
* Backport: 
----------------------------------------
Neither RFC 2396 (on which the library is currently based) nor RFC 3986 says anything about a sequence of slashes in the path part, except for the parsing rules that apply when a URI (path) starts with two slashes.

It should be perfectly valid to have one slash right after another, and there is no reason to "normalize" a sequence of slashes into a single slash, which uri actually does when merging paths:

~~~
URI.parse('http://example.com/foo//bar/')+'.'
=> #<URI::HTTP:0x0000080303d2b0 URL:http://example.com/foo/bar/>
~~~

Fixing this may be as easy as changing the regexp in URI::Generic#split_path from %r{/+} to %r{/}, but I wonder how large the impact of the resulting incompatibility would be.
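The difference between the two regexps can be seen with plain `String#split`, independently of the library.  This is only a sketch: the `-1` limit and the variable names are assumptions for the demonstration, not a copy of what `split_path` does internally.

```ruby
path = '/foo//bar/'

# %r{/+} treats a run of slashes as one separator, so the empty item
# between "foo" and "bar" is lost.
squeezed = path.split(%r{/+}, -1)

# %r{/} splits at every single slash, so the empty item is kept.
preserved = path.split(%r{/}, -1)

p squeezed             # => ["", "foo", "bar", ""]
p preserved            # => ["", "foo", "", "bar", ""]
p squeezed.join('/')   # => "/foo/bar/"
p preserved.join('/')  # => "/foo//bar/"
```

Rejoining the items shows that only the `%r{/}` split round-trips the original path unchanged.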


