Issue #7780 has been updated by marcandre (Marc-Andre Lafortune).


mame (Yusuke Endoh) wrote:
> I think that this issue is not a bug, but a new feature.

I would rather not argue about this.

> We should keep {YAML|Marshal}.load "as is" (i.e., dangerous), and that we will introduce {YAML|Marshal}.safe_load in the next minor.

This is worth arguing over.

What downside do you see to my proposition?
What upsides do you see to yours?
Do you believe that the typical use is to call `safe_load` or `unsafe_load`?
Why should the shortest and default way not be the safe one?

charliesome (Charlie Somerville) wrote:
> However I think YAML.load should be safe, since most people using YAML only use it for primitive types and are not aware that it is able to deserialize into any class.

I'm glad to have support on this.
Another source that supports this point of view: http://nedbatchelder.com/blog/201302/war_is_peace.html
It discusses PyYAML which decided to have `load` (unsafe) and `safe_load`. It doesn't come bundled with python but is still used; a google search will point to different pull requests for python libraries to use `safe_load` instead of `load`, e.g. https://bugs.launchpad.net/cloud-init/+bug/1015818

This could all be avoided with `load` being safe!

I hope that Charliesome, myself and others can convince Matz / tenderlove that YAML.load should be safe by default.

----------------------------------------
Bug #7780: Marshal & YAML should deserialize only basic types by default.
https://bugs.ruby-lang.org/issues/7780#change-35820

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: 
Target version: 2.0.0
ruby -v: r39035


YAML is a wonderful, powerful and expressive format to serialize data in a human readable way.

It can be used, for example, to read and write nice configuration files, to store strings, numbers, dates & times in a hash.

YAML.load will, by default, instantiate any user class and set instance variables directly.

On the other hand, this can make apparently innocent code lead to major vulnerabilities, as was clearly illustrated by different exploits recently.

I feel YAML.load should, by default, be safe to use, for example by instantiating only known core classes.

The same can be said for Marshal, even though it would more rarely be used as a public interface to exchange data.

Maybe the following transition path could be taken:
1) Have {YAML|Marshal}.load  issue a warning (once) that next minor will only deserialize basic types.
2) Create {YAML|Marshal}.unsafe_load, which does the same thing as current `load`, without a warning of course.
As these changes are compatible and extremely minor, I would like them to be considered for Ruby 2.0.0. They also make for a 

"Secure by default" is not a new concept.
Rails 3.0 has XSS protection by default, for example. The fact that one needs to do extra processing like calling `raw` when that security needs to be bypassed makes XSS attacks less likely.
I believe the typical use of Yaml.load is to load basic types.
We should expect users to use the easiest solution, so that should be the safe way.
If a tool makes the safe way of doing things the default, and makes it easy to do more complex deserializing (e.g. whitelisting some user classes), this can only lead to less vulnerabilities.

I hope nobody will take offence that I've tagged this issue as a "bug". The current behavior is as speced, but it can be argued that a design that is inherently insecure is a defect.


-- 
http://bugs.ruby-lang.org/