Issue #7780 has been updated by trans (Thomas Sawyer).


After refactoring Psych to handle Tag Schema, I have to concur with @marcandre. I don't think people realize the extent to which Psych maps tags to classes. It's nowhere near the YAML spec (albeit the spec is nice enough to let you shoot yourself in the foot, and the head, if you want), and it goes well beyond anything I had realized, even though I already knew it was doing some of this. Add a single domain tag and something like 13 actual tags will match it. But most people never notice such things because YAML is mostly used to load simple configurations.

It would be prudent to make loading stick to the YAML failsafe and JSON schemas, and *maybe* Ruby's core classes. A `:schema` option can be used for anything else. For instance, I've added an `OBJECT_SCHEMA` which can be used to load any Ruby object using the `!ruby/object:` notation. This won't break most programs, and for those it would break, it's a quick fix via a schema.

I am about to add a formal issue for my work, but you can see it now at: https://github.com/trans/psych/tree/isotag

----------------------------------------
Bug #7780: Marshal & YAML should deserialize only basic types by default.
https://bugs.ruby-lang.org/issues/7780#change-35864

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: 
Category: 
Target version: next minor
ruby -v: r39035


YAML is a wonderful, powerful and expressive format to serialize data in a human readable way.

It can be used, for example, to read and write nice configuration files, to store strings, numbers, dates & times in a hash.

YAML.load will, by default, instantiate any user class and set instance variables directly.

On the other hand, this can make apparently innocent code lead to major vulnerabilities, as several recent exploits have clearly illustrated.

I feel YAML.load should, by default, be safe to use, for example by instantiating only known core classes.
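
To make the behaviour described above concrete, here is a minimal sketch. The `Account` class and the `permissive_load` helper are hypothetical names used only for illustration. The helper exists because newer Psych releases expose the permissive loader as `unsafe_load`, while older ones do the same thing in plain `load`. The point is that the tagged document builds an `Account` without ever running its constructor:

```ruby
require "yaml"

# Hypothetical user class, for illustration only.
class Account
  attr_reader :name, :admin

  def initialize(name)
    @name  = name
    @admin = false # the constructor never grants admin rights
  end
end

# A document that names the class through a !ruby/object: tag.
doc = <<~EOS
  --- !ruby/object:Account
  name: mallory
  admin: true
EOS

# Hypothetical helper: pick whichever permissive loader this Psych offers.
def permissive_load(source)
  if YAML.respond_to?(:unsafe_load)
    YAML.unsafe_load(source)
  else
    YAML.load(source)
  end
end

account = permissive_load(doc)

puts account.class # the user class was instantiated from the tag
puts account.admin # the instance variable was set directly, bypassing initialize
```

Nothing in `Account` ever assigns `true` to `@admin`, yet the loaded object has it, because the loader allocates the object and writes the instance variables itself.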

The same can be said for Marshal, even though it would more rarely be used as a public interface to exchange data.

Maybe the following transition path could be taken:
1) Have {YAML|Marshal}.load issue a warning (once) that the next minor will only deserialize basic types.
2) Create {YAML|Marshal}.unsafe_load, which does the same thing as current `load`, without a warning of course.
As these changes are compatible and extremely minor, I would like them to be considered for Ruby 2.0.0. They also make for a 
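
The two steps could be prototyped outside of core roughly as follows. This is only a sketch under stated assumptions: `TransitionalYAML` is a hypothetical wrapper module, not part of Psych or of this proposal, the warning text is made up, and only the YAML half is covered (Marshal would need the same treatment).

```ruby
require "yaml"

# Hypothetical wrapper illustrating the proposed transition; not a real Psych API.
module TransitionalYAML
  @warned = false

  class << self
    # Step 1: same result as today, plus a deprecation warning issued once.
    def load(source)
      unless @warned
        warn "TransitionalYAML.load: the next minor will only deserialize " \
             "basic types; use unsafe_load to keep the current behaviour"
        @warned = true
      end
      unsafe_load(source)
    end

    # Step 2: the current permissive behaviour, without any warning.
    # Newer Psych releases name the permissive loader unsafe_load;
    # older ones only have load.
    def unsafe_load(source)
      if YAML.respond_to?(:unsafe_load)
        YAML.unsafe_load(source)
      else
        YAML.load(source)
      end
    end
  end
end

settings = TransitionalYAML.unsafe_load("retries: 3\n")
puts settings["retries"]
```

Existing callers keep working unchanged during the transition, and the ones that really need arbitrary objects can switch to `unsafe_load` before the default changes.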

"Secure by default" is not a new concept.
Rails 3.0 has XSS protection by default, for example. The fact that one needs to do extra processing like calling `raw` when that security needs to be bypassed makes XSS attacks less likely.
I believe the typical use of YAML.load is to load basic types.
We should expect users to use the easiest solution, so that should be the safe way.
If a tool makes the safe way of doing things the default, and makes it easy to do more complex deserializing (e.g. whitelisting some user classes), this can only lead to fewer vulnerabilities.
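
As an illustration of what such a default plus whitelist can look like, here is a sketch using `YAML.safe_load` and its `permitted_classes:` keyword. That API was added to Psych after this issue was filed, so it is shown as one possible shape of the idea rather than as what is being proposed here. `Point` is a hypothetical user class.

```ruby
require "yaml"

# Hypothetical user class, for illustration only.
class Point
  attr_reader :x, :y
end

plain  = "host: example.org\nport: 8080\n"
tagged = "--- !ruby/object:Point\nx: 1\ny: 2\n"

# Basic types load with no extra ceremony.
config = YAML.safe_load(plain)

# A tagged user class is rejected unless it is explicitly allowed.
rejected =
  begin
    YAML.safe_load(tagged)
    false
  rescue Psych::DisallowedClass
    true
  end

# Opting in to one specific class is a one-argument change.
point = YAML.safe_load(tagged, permitted_classes: [Point])

puts config["port"]
puts rejected
puts point.x
```

The easiest call is the safe one, and the extra argument makes the riskier case visible at the call site.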

I hope nobody will take offence that I've tagged this issue as a "bug". The current behavior is as specced, but it can be argued that a design that is inherently insecure is a defect.


-- 
http://bugs.ruby-lang.org/