Hi, everybody.
In the #ruby-core design meeting, during the discussion about MVM,
there was some mention of the sandbox API. I thought it would be
worthwhile to write up an RCR. I mean: although there has been
some talk about the sandbox extension for Ruby 1.8 on this list,
there hasn't been any talk about the API itself.
Considering that $SAFE has fallen out of use and there is a renewed
interest in managing many namespaces/environments on a single VM,
I figured hey.
ABSTRACT
Ruby of yore has only had one interpreter environment. The sandbox
API gives that central environment a means of creating other
in-process environments for executing code, be they restricted
sandboxes for running unsafe code or fully-featured sandboxes to
offer a clean namespace.
PROS & CONS
The benefits of this particular API:
* Rather simple (yeah?)
* Basic (albeit unstable) extensions exist for Ruby 1.8 and JRuby.
* Patterned after other successful sandboxes (such as Firefox's
XPCNativeWrapper[1] and Io's Sandbox[2])
* Generic enough to work in other Ruby impls.
The drawbacks are:
* Not fully proven on Ruby 1.8.
* My extension does rely on Thread#kill! to stop a Sandbox,
which is taboo. (Same problem timeout.rb has.)
* Haven't worked out how tainting could play out.
* Could be more closely coupled with threading to offer concurrent
interps in separate threads.
THE API
All classes and methods are enclosed in the Sandbox module.
The primary classes are Sandbox::Full and Sandbox::Safe.
Sandbox::Safe is descended from Sandbox::Full.
Methods for these two classes are:
* self.new(opts = {})
    Returns a newly created sandbox.
    Available options: :init, :ref
* eval(str, opts = {}) => obj
    Evaluates +str+ as Ruby code inside the sandbox
    and returns the result.
    Available options: :timeout
* load(io, opts = {}) => nil
    At heart, just an alias for: eval(IO.read(io), opts)
* ref(klass) => nil
    Adds a boxed reference to +klass+ in the sandbox.
    (Ex.: @box.ref(YAML) would create a YAML class in the
    sandbox which is derived from Sandbox::BoxedClass, a
    proxy to the YAML class on the outside.)
* require(str)
    Requires a file into the Sandbox, using the $LOAD_PATH and
    file permissions of the current sandbox.
The Sandbox module itself has a few methods:
* Sandbox.safe(opts = {})
    An alias for Sandbox::Safe.new(opts)
* Sandbox.new(opts = {})
    An alias for Sandbox::Full.new(opts)
* Sandbox.current
    Returns an object representing the current Sandbox.
* Sandbox.screen(obj) => true or Sandbox::ScreenException
    Traverses an object and its related symbols to be sure
    it is entirely composed of objects from the current
    sandbox. Purely for testing.
As for the `opts` hash in the above methods, here's a brief
description of those:
* init: The portions of Ruby core to initialize.
    :load - $:, $-I, $LOAD_PATH, $", $LOADED_FEATURES,
            load, require, autoload, autoload?
    :io   - IOError, EOFError, IO, FileTest, File, Dir,
            File::Constants, test, File::Stat
    :env  - syscall, open, printf, print, putc, puts,
            gets, readline, getc, select, readlines,
            p, display, STDIN, STDOUT, STDERR
    :real - abort, at_exit, caller, exit, trace_var,
            untrace_var, set_trace_func, warn, ThreadError,
            Thread, Continuation, ThreadGroup, trap,
            exec, fork, exit!, system, `, sleep, Process,
            Process::Status, Process::Sys, GC,
            ObjectSpace, hash, __id__, object_id
    :all  - the whole enchilada
    (Sandbox::Full assumes :init => :all and Sandbox::Safe
    assumes :init => nil.)
* ref: Classes to create boxed references for.
    (Ex.: :ref => [RedCloth, BlueCloth])
* timeout: Maximum number of seconds the sandbox may run.
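To make the call shape of eval(str, :timeout => n) a bit more concrete, here is a rough sketch in plain Ruby. ToyBox is a made-up name, not the proposed Sandbox class, and it isolates nothing: it evaluates the string against a throwaway object and leans on the standard timeout library for the time limit. Treat it as an illustration of the option, nothing more.

```ruby
require 'timeout'

# ToyBox is a hypothetical stand-in, NOT the proposed Sandbox class.
# It provides no isolation; it only mimics eval(str, opts) with :timeout.
class ToyBox
  def initialize(opts = {})
    # Fresh receiver so instance variables set by the evaluated code
    # do not land on the caller's objects.
    @scope = Object.new
  end

  # Evaluates +str+ and returns the result. With :timeout => seconds,
  # raises Timeout::Error if the code runs longer than that.
  def eval(str, opts = {})
    limit = opts[:timeout]
    return @scope.instance_eval(str) unless limit
    Timeout.timeout(limit) { @scope.instance_eval(str) }
  end
end

box = ToyBox.new
box.eval('1 + 2')                       # => 3
begin
  box.eval('sleep 5', :timeout => 0.1)  # gives up after roughly 0.1s
rescue Timeout::Error
  # the time limit was hit
end
```

Since timeout.rb interrupts the running thread from outside, this sketch inherits the same concern listed under the drawbacks.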
BOXED CLASSES
Inside each Sandbox, a BoxedClass constant is defined. This class
has two methods: method_missing and const_missing.
So, let's say you're running a web app in the sandbox. And you
want it to speak to Mongrel in the main interp. Imagine a
MongrelConnector class that acts as a medium between the two.
-- master.rb --
require 'mongrel'

class MongrelConnector
  # Yields an empty string for the sandboxed code to fill in.
  def self.each
    str = yield ""
    # send str to mongrel
  end
end

box = Sandbox.safe
box.load 'web.rb'           # defines `start` inside the sandbox
box.ref MongrelConnector    # boxed reference, reachable from web.rb
box.eval 'start'

-- web.rb --
def start
  MongrelConnector.each do |cgi|
    cgi << "hallo!"
  end
end
Inside the sandbox (where web.rb is running), the MongrelConnector
class is a BoxedClass. When `each` is called, method_missing
switches sandboxes and runs the method on the class outside the box.
When method_missing gets an answer back, it switches back inside the
sandbox and returns an answer.
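The forwarding described above can be sketched in plain Ruby. Everything named Toy* here (and the toy_ref helper) is hypothetical; the real BoxedClass would switch sandboxes around each call, which this sketch does not attempt. It only shows method_missing and const_missing handing work to a class on the outside.

```ruby
# Hypothetical stand-in for the proposed Sandbox::BoxedClass.
# No sandbox switching happens here; calls are simply forwarded.
class ToyBoxedClass
  class << self
    attr_accessor :outside   # the real class this proxy points at

    # Unknown class methods are run on the outside class.
    def method_missing(name, *args, &block)
      if outside && outside.respond_to?(name)
        outside.send(name, *args, &block)
      else
        super
      end
    end

    def respond_to_missing?(name, include_private = false)
      (outside && outside.respond_to?(name, include_private)) || super
    end

    # Unknown constants are looked up on the outside class.
    def const_missing(name)
      outside ? outside.const_get(name) : super
    end
  end
end

# Hypothetical helper: builds a proxy class for +klass+.
def toy_ref(klass)
  proxy = Class.new(ToyBoxedClass)
  proxy.outside = klass
  proxy
end

class MongrelConnector
  GREETING = "hallo!"
  def self.each
    yield String.new   # mutable empty string for the block to fill in
  end
end

boxed = toy_ref(MongrelConnector)
boxed.each { |cgi| cgi << "hallo!" }   # => "hallo!"
boxed::GREETING                        # => "hallo!"
```

Note that methods every class already has (new, name, and so on) never reach method_missing, so a proxy built this way answers those itself.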
For primitive data, such as numbers and strings which have no
instance variables, the data is marshalled. For other
objects, a Sandbox::Ref is received. Both inside and outside the
sandbox, a Sandbox::Ref points to data not inside the current
sandbox. This ref also has a method_missing, which works just like
BoxedClass' method_missing.
It is not allowed to pass a Sandbox::Ref for an object whose class
is not referred to in the receiving sandbox. So, if, for some
reason, a method call tries to return an IO object to a sandbox and
no IO class is defined (and properly ref'd), a
Sandbox::TransferException is raised.
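Here is one way the transfer rules above might look if written out in plain Ruby. ToyRef, ToyTransferException, TOY_PRIMITIVES and toy_transfer are all invented for this sketch, and the list of "primitive" classes is my assumption; the actual extension may draw that line elsewhere.

```ruby
# All names below are hypothetical stand-ins for the proposed classes.
class ToyTransferException < StandardError; end

# Stands in for Sandbox::Ref: forwards calls to an object living elsewhere.
class ToyRef
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)
    @target.send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Assumed set of classes that are copied rather than referenced.
TOY_PRIMITIVES = [Integer, Float, String, Symbol,
                  NilClass, TrueClass, FalseClass]

# Decides how +obj+ crosses into a sandbox that has +reffed_classes+ ref'd.
def toy_transfer(obj, reffed_classes)
  plain = TOY_PRIMITIVES.any? { |k| obj.instance_of?(k) } &&
          obj.instance_variables.empty?
  if plain
    Marshal.load(Marshal.dump(obj))   # copied by value
  elsif reffed_classes.any? { |k| obj.is_a?(k) }
    ToyRef.new(obj)                   # passed by reference
  else
    raise ToyTransferException,
          "#{obj.class} is not ref'd in the receiving sandbox"
  end
end

toy_transfer("hallo", [])        # a marshalled copy of the string
toy_transfer([1, 2], [Array])    # a ToyRef wrapping the array
begin
  toy_transfer($stdout, [])      # IO is not ref'd here
rescue ToyTransferException
  # refused, as described above
end
```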
THE PRELUDE
Beyond the API, it is also required that the Sandbox run versions of
common methods which are not exploitable. For example, the freaky
freaky sandbox has a lib/sandbox/prelude.rb which includes a pure
Ruby version of the `**` method since very large exponents can lock
the interpreter up in C.
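To give a feel for the idea, here is a small sketch of a guarded integer power written in Ruby. This is not the contents of lib/sandbox/prelude.rb; toy_pow and TOY_MAX_BITS are names I made up, and the size cap is an arbitrary assumption. The point is only that a loop written in Ruby can check its own progress and bail out, where a single long-running C call cannot.

```ruby
# Hypothetical sketch, not the actual prelude code.
TOY_MAX_BITS = 4096   # assumed cap on the size of intermediate values

# Integer exponentiation by squaring that refuses to build huge numbers.
def toy_pow(base, exp)
  unless base.is_a?(Integer) && exp.is_a?(Integer) && exp >= 0
    raise ArgumentError, "integer base and non-negative integer exponent only"
  end
  result = 1
  while exp > 0
    result *= base if exp.odd?
    exp >>= 1
    base *= base if exp > 0   # square only if another round will use it
    if result.bit_length > TOY_MAX_BITS || base.bit_length > TOY_MAX_BITS
      raise RangeError, "result too large"
    end
  end
  result
end

toy_pow(2, 10)   # => 1024
begin
  toy_pow(2, 10**9)
rescue RangeError
  # stopped after a handful of squarings instead of grinding away
end
```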
AND DONE
That's it for now. I'm not an extreme zealot of this API, so I'd be
glad to alter it or scrap it. But it has evolved through trial and
error, based on xp points awarded during Try Ruby and the
sandboxed wiki[3].
Thank you for your generous attentions.
[1] http://developer.mozilla.org/en/docs/XPCNativeWrapper
[2] http://iolanguage.com/scm/git/checkout/Io/docs/IoReference.html
[3] http://redhanded.hobix.com/inspect/howToLetAnyoneElseFinishYourWiki.html