This is a multi-part message in MIME format.
--------------B31AB18F0E265FCD09174948
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


I don't think there's any *general* way to marshal objects that have a C
struct attached to them (i.e., created with DATA_MAKE_STRUCT or
DATA_WRAP_STRUCT) and which contain arbitrary data (including references
to Ruby objects), so here's a patch that cleanly adds that
functionality.

This patch is conservative in the sense that it only affects execution
paths that resulted in exceptions before. It should not break existing
code or formatted data. The patch is based on 1.6.6.


USAGE

The following methods should be implemented in a class used to wrap C
data if some of that C data needs to be serialized:

Instance methods:

_dump_data
  
  Returns an Object that encapsulates the data stored in the struct.
  
_load_data Object

  Called on the object after it is allocated, but before instance vars
  are restored. (Unlike the _load method used with Marshal, this is an
  instance method.) Used to populate the struct with data from the
  argument object.

These two instance methods can be written in Ruby if there are accessor
methods available for reading and writing all the persistent C data. (It
may be advantageous to write them in C, though.)

Class method:
  
_alloc

  Invoke DATA_MAKE_STRUCT, and return the resulting object.
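
As a rough illustration of the three methods, here is a minimal sketch in
plain Ruby. The Point class and its x/y accessors are hypothetical. In a
real extension the values would live in a C struct and _alloc would call
DATA_MAKE_STRUCT; here a plain allocate stands in for it, and the
dump/load sequence is driven by hand, since an ordinary Ruby object does
not take the T_DATA path.

```ruby
# Hypothetical class for illustration only. In a real extension, x and y
# would be fields of a C struct wrapped with DATA_MAKE_STRUCT.
class Point
  attr_accessor :x, :y

  # Class method: stand-in for a C function that calls DATA_MAKE_STRUCT.
  def self._alloc
    allocate
  end

  # Return a plain Ruby object that captures the struct's contents.
  def _dump_data
    [x, y]
  end

  # Populate the freshly allocated object from the dumped data.
  def _load_data(data)
    self.x, self.y = data
  end
end

# Roughly the sequence the patch follows for a T_DATA object, done by
# hand here because Point is an ordinary Ruby object.
original = Point._alloc
original.x = 1.5
original.y = -2.0

payload  = Marshal.dump(original._dump_data)  # dump side
restored = Point._alloc                       # load side
restored._load_data(Marshal.load(payload))
```

Because _dump_data returns an Object rather than a string, whatever it
returns goes through the same w_object call as everything else in the
dump.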


Notes:

(1) I originally thought of using just a string argument in place of the
Object, for consistency with _dump/_load, but that made life hard within
the _dump_data/_load_data implementation. A string also cannot connect
to Marshal's hash of objects that it has already seen, so your object
graph gets turned into a tree with duplicates.

(2) There is no limit parameter to _dump_data because the limit is
applied automatically in the w_object call in the T_DATA case.

(3) The reason I didn't just use _dump/_load or redefine
Marshal.dump/load is that my C data is very general. It may refer to
other Ruby objects. To maintain referential integrity, I need to
continue with the same Marshal.dump/load call without starting the
process from scratch with a new arg->data hash.

(4) The _dump_data/_load_data methods are called only in the case of a
T_DATA object. They are used in conjunction with w_object/r_object.

(5) I have tested this code in my CGen/CShadow library to make sure that
both Ruby attrs and C data are marshalled, and that referential
integrity is preserved (including cycles).

(6) I've also tested that the proc argument to Marshal.load is called.
When this proc is called, instance vars are all nil, but that's the way
it works in other cases now, anyway.
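
The graph-versus-tree problem from notes (1) and (3) can be seen in plain
Ruby, without the patch. This is only a sketch: StringDumped and Plain are
hypothetical classes. StringDumped serializes its reference with a nested
Marshal.dump inside _dump, which starts over with a fresh table of seen
objects. Plain lets the outer dump call write the reference.

```ruby
# Hypothetical wrapper that serializes its reference through _dump/_load.
class StringDumped
  attr_reader :ref

  def initialize(ref)
    @ref = ref
  end

  # The nested Marshal.dump knows nothing about objects already seen
  # by the outer dump call.
  def _dump(_limit)
    Marshal.dump(@ref)
  end

  def self._load(str)
    new(Marshal.load(str))
  end
end

# Ordinary object: its reference is written by the same dump call.
class Plain
  attr_reader :ref

  def initialize(ref)
    @ref = ref
  end
end

shared = "shared"

tree  = Marshal.load(Marshal.dump([StringDumped.new(shared), shared]))
graph = Marshal.load(Marshal.dump([Plain.new(shared), shared]))

tree_shared  = tree[0].ref.equal?(tree[1])    # false: a duplicate was made
graph_shared = graph[0].ref.equal?(graph[1])  # true: identity preserved
```

Passing an Object back from _dump_data keeps T_DATA objects on the second
path, so references held in the C struct stay shared after loading.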

<plug>
This new functionality supports CGen/CShadow, which can now manage the
marshalling of C attributes for you. I'll upload the latest version in a
day or two.
</plug>

--
Joel VanderWerf                          California PATH, UC Berkeley
mailto:vjoel / path.berkeley.edu                     Ph. (510) 231-9446
http://www.path.berkeley.edu                       FAX (510) 231-9512
--------------B31AB18F0E265FCD09174948
Content-Type: text/plain; charset=us-ascii;
 name="marshal.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="marshal.patch"

--- marshal.c.orig	Mon Feb 18 21:54:34 2002
+++ marshal.c	Tue Feb 19 18:23:56 2002
@@ -62,4 +62,5 @@
 #define TYPE_UCLASS	'C'
 #define TYPE_OBJECT	'o'
+#define TYPE_DATA       'd'
 #define TYPE_USERDEF	'u'
 #define TYPE_FLOAT	'f'
@@ -82,4 +83,5 @@
 
 static ID s_dump, s_load;
+static ID s_dump_data, s_load_data, s_alloc;
 
 struct dump_arg {
@@ -484,4 +486,32 @@
 	    break;
 
+          case T_DATA:
+            w_byte(TYPE_DATA, arg);
+            {
+                VALUE klass = CLASS_OF(obj);
+                char *path;
+
+                if (FL_TEST(klass, FL_SINGLETON)) {
+                    if (RCLASS(klass)->m_tbl->num_entries > 0 ||
+                        RCLASS(klass)->iv_tbl->num_entries > 1) {
+                        rb_raise(rb_eTypeError, "singleton can't be dumped");
+                    }
+                }
+                path = rb_class2name(klass);
+                w_unique(path, arg);
+            }
+            {
+                VALUE v;
+
+                if (!rb_respond_to(obj, s_dump_data)) {
+                    rb_raise(rb_eTypeError,
+                             "class %s needs to have instance method `_dump_data'",
+                             rb_class2name(CLASS_OF(obj)));
+                }
+                v = rb_funcall(obj, s_dump_data, 0);
+                w_object(v, arg, limit);
+            }
+            break;
+
 	  default:
 	    rb_raise(rb_eTypeError, "can't dump %s",
@@ -992,4 +1022,29 @@
 	break;
 
+      case TYPE_DATA:
+        {
+            VALUE klass;
+
+            klass = rb_path2class(r_unique(arg));
+            if (!rb_respond_to(klass, s_alloc)) {
+                rb_raise(rb_eTypeError,
+                         "class %s needs to have class method `_alloc'",
+                         rb_class2name(klass));
+            }
+            v = rb_funcall(klass, s_alloc, 0);
+            if (TYPE(v) != T_DATA) {
+                rb_raise(rb_eArgError, "dump format error");
+            }
+            r_regist(v, arg);
+            if (!rb_respond_to(v, s_load_data)) {
+                rb_raise(rb_eTypeError,
+                         "class %s needs to have instance method `_load_data'",
+                         rb_class2name(klass));
+            }
+            rb_funcall(v, s_load_data, 1, r_object(arg));
+            return v;
+        }
+        break;
+
       case TYPE_MODULE_OLD:
         {
@@ -1114,4 +1169,8 @@
     s_dump = rb_intern("_dump");
     s_load = rb_intern("_load");
+    s_dump_data = rb_intern("_dump_data");
+    s_load_data = rb_intern("_load_data");
+    s_alloc = rb_intern("_alloc");
+
     rb_define_module_function(rb_mMarshal, "dump", marshal_dump, -1);
     rb_define_module_function(rb_mMarshal, "load", marshal_load, -1);

--------------B31AB18F0E265FCD09174948--