Issue #10740 has been updated by Yusuke Endoh.

File urlsafe_base64.patch added

Tony Arcieri wrote:
> My interpretation of RFC4648 would suggest this behavior:
>
> Base64.urlsafe_encode64(bin) should produce padded output like it does today
> Base64.urlsafe_decode64(str) should work on both padded and unpadded inputs,

Thank you, sounds reasonable.  I like the behavior of Java's Base64.Decoder:

https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Decoder.html

> The Base64 padding character '=' is accepted and interpreted as the end of the encoded byte data, but is not required. So if the final unit of the encoded byte data only has two or three Base64 characters (without the corresponding padding character(s) padded), they are decoded as if followed by padding character(s). If there is a padding character present in the final unit, the correct number of padding character(s) must be present, otherwise IllegalArgumentException ( IOException when reading from a Base64 stream) is thrown during decoding.


How about this?

       # This method complies with ``Base 64 Encoding with URL and Filename Safe
       # Alphabet'' in RFC 4648.
       # The alphabet uses '-' instead of '+' and '_' instead of '/'.
    +  # Note that the result can still contain '='.
    +  # You can remove the padding by setting "padding" as false.
    +  def urlsafe_encode64(bin, padding: true)
    +    str = strict_encode64(bin).tr("+/", "-_")
    +    str = str.delete("=") unless padding
    +    str
       end
 
       # Returns the Base64-decoded version of +str+.
       # This method complies with ``Base 64 Encoding with URL and Filename Safe
       # Alphabet'' in RFC 4648.
       # The alphabet uses '-' instead of '+' and '_' instead of '/'.
    +  #
    +  # The padding characters are optional.
    +  # This method accepts both correctly-padded and unpadded input.
    +  # Note that it still rejects incorrectly-padded input.
    +  def urlsafe_decode64(str)
    +    str = str.tr("-_", "+/")
    +    if !str.end_with?("=") && str.length % 4 != 0
    +      str = str.ljust((str.length + 3) & ~3, "=")
    +    end
    +    strict_decode64(str)
       end
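
To make the intended behavior concrete, here is a minimal standalone sketch of the two methods above. The module name UrlsafeSketch is hypothetical, and the sketch calls Array#pack("m0") and String#unpack1("m0") directly, on the assumption that they behave like strict_encode64 and strict_decode64; it is an illustration, not the attached patch.

```ruby
# Hypothetical module, for illustration only (not part of the patch).
module UrlsafeSketch
  module_function

  # Encode with the URL-safe alphabet; drop '=' when padding is false.
  def urlsafe_encode64(bin, padding: true)
    str = [bin].pack("m0").tr("+/", "-_")
    str = str.delete("=") unless padding
    str
  end

  # Decode padded or unpadded input; incorrectly padded input still raises.
  def urlsafe_decode64(str)
    str = str.tr("-_", "+/")
    # Pad only when the input carries no padding at all.
    if !str.end_with?("=") && str.length % 4 != 0
      str = str.ljust((str.length + 3) & ~3, "=")
    end
    str.unpack1("m0") # strict decoding: raises ArgumentError on bad input
  end
end

bin = "\xFF\xFE".b
padded   = UrlsafeSketch.urlsafe_encode64(bin)                  # "__4="
unpadded = UrlsafeSketch.urlsafe_encode64(bin, padding: false)  # "__4"

# Both forms decode back to the original bytes.
UrlsafeSketch.urlsafe_decode64(padded)   == bin
UrlsafeSketch.urlsafe_decode64(unpadded) == bin

# One '=' too many is neither correctly padded nor unpadded, so it is rejected.
begin
  UrlsafeSketch.urlsafe_decode64("__4==")
rescue ArgumentError => e
  warn "rejected: #{e.message}"
end
```

The encoder keeps padding by default, so existing callers would see no change; only callers that pass padding: false get the shorter form.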



Off topic:

> because RFC4648 allows other RFCs that implement RFC4648-compliant base64url encoding to explicitly stipulate that there is no padding.

RFC 4648 says that the encoder MUST NOT add line feeds, unless the referring specification explicitly directs it to:

> Implementations MUST NOT add line feeds to base-encoded data unless
> the specification referring to this document explicitly directs base
> encoders to add line feeds after a specific number of characters.

Also, it says that the decoder MUST reject input containing line feeds, unless the referring specification explicitly states otherwise:

> Implementations MUST reject the encoded data if it contains
> characters outside the base alphabet when interpreting base-encoded
> data, unless the specification referring to this document explicitly
> states otherwise.

An RFC 4648-compliant encoder WITH the exemption emits data containing line feeds, and an RFC 4648-compliant decoder WITHOUT the exemption rejects that data.  Which is broken?  IMO, RFC 4648 is broken ;-)
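
Ruby's own pack directives can reproduce this mismatch. The sketch below assumes that pack("m") wraps its output with line feeds (as Base64.encode64 does) and that unpack1("m0") is strict (as Base64.strict_decode64 is); the variable names are only for illustration.

```ruby
data = "x" * 60

# Encoder "with the exemption": output is wrapped with line feeds.
wrapped = [data].pack("m")

# Decoder "without the exemption": strict mode rejects the line feeds.
strict_accepts = begin
  wrapped.unpack1("m0")
  true
rescue ArgumentError
  false
end

# A lenient decoder skips the line feeds and recovers the data.
lenient_result = wrapped.unpack1("m")

puts "contains line feeds: #{wrapped.include?("\n")}"
puts "strict decoder accepts it: #{strict_accepts}"
puts "lenient decoder round-trips: #{lenient_result == data}"
```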

-- 
Yusuke Endoh <mame / ruby-lang.org>

----------------------------------------
Feature #10740: Base64 urlsafe methods are not urlsafe
https://bugs.ruby-lang.org/issues/10740#change-51051

* Author: Scott Blum
* Status: Feedback
* Priority: Normal
* Assignee: Yusuke Endoh
----------------------------------------
Base64.urlsafe_decode64 is not to spec, because it currently REQUIRES appropriate trailing '=' characters.
Base64.urlsafe_encode64 produces trailing '=' characters.

'=' is not web safe, and is not recommended for base64url.  Some specs even disallow it.

Suggested fix:

~~~
  # Returns the Base64-encoded version of +bin+.
  # This method complies with ``Base 64 Encoding with URL and Filename Safe
  # Alphabet'' in RFC 4648.
  # The alphabet uses '-' instead of '+' and '_' instead of '/'
  # and has no trailing pad characters.
  def urlsafe_encode64(bin)
    strict_encode64(bin).tr("+/", "-_").tr('=', '')
  end

  # Returns the Base64-decoded version of +str+.
  # This method complies with ``Base 64 Encoding with URL and Filename Safe
  # Alphabet'' in RFC 4648.
  # The alphabet uses '-' instead of '+' and '_' instead of '/'.
  # Trailing pad characters are optional.
  def urlsafe_decode64(str)
    str = str.tr("-_", "+/")
    str = str.ljust((str.length + 3) & ~3, '=')
    strict_decode64(str)
  end
~~~
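
One consequence of padding unconditionally, as the suggested decoder does, is that an input with too few '=' characters gets silently completed and accepted. The sketch below contrasts that with a variant that pads only when the input has no '=' at all. The method names are hypothetical, and unpack1("m0") is assumed to stand in for strict_decode64.

```ruby
# Decoder from the suggested fix, under a hypothetical name: always pads.
def decode_always_pad(str)
  str = str.tr("-_", "+/")
  str = str.ljust((str.length + 3) & ~3, "=")
  str.unpack1("m0")
end

# Variant that pads only input carrying no padding at all.
def decode_pad_if_unpadded(str)
  str = str.tr("-_", "+/")
  if !str.end_with?("=") && str.length % 4 != 0
    str = str.ljust((str.length + 3) & ~3, "=")
  end
  str.unpack1("m0")
end

# Helper (hypothetical): true if the named decoder accepts the input.
def accepts?(decoder, str)
  send(decoder, str)
  true
rescue ArgumentError
  false
end

# "_w==" is correctly padded, "_w" is unpadded, "_w=" is incorrectly padded.
%w[_w== _w _w=].each do |input|
  puts format("%-5s always_pad=%-5s pad_if_unpadded=%s",
              input,
              accepts?(:decode_always_pad, input),
              accepts?(:decode_pad_if_unpadded, input))
end
```

Both decoders accept the correctly padded and the unpadded forms; they differ only on "_w=", which the first accepts and the second rejects.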


---Files--------------------------------
base64-urlsafe-encode64-search-result.txt (19.9 KB)
urlsafe_base64.patch (2.97 KB)

