On 1/4/10, Benjie Chen <benjie / lablife.org> wrote:
> On Mon, Jan 4, 2010 at 7:23 PM, Caleb Clausen <vikkous / gmail.com> wrote:
>> Or maybe the token.text is being freed between when the token is
>> created and when it's assigned to a ruby variable? There are no
>> ruby-land references to it or the token which refers to it during that
>> brief time, so if the garbage collector happens to be invoked there...
>> but presumably there is a reference to token somewhere on the c call
>> stack, so that shouldn't be an issue.
>
> Hmm, can you elaborate on where the reference in C is created to
> prevent GC? If you look at the code, it has
>
>    RToken *token = ALLOC(RToken);
>    token->text = rb_str_new2(tk->text);
>    return Data_Wrap_Struct(cToken, &frt_token_mark, &frt_token_free, token);

So, Data_Wrap_Struct returns a VALUE, that is, something that ruby's
GC is responsible for freeing when it dies. That VALUE contains a
c-level pointer to your token struct. So, when the wrapping VALUE
dies, that token struct also needs to be freed. Which is what
frt_token_free does. So far, so good.

But, when the GC is scanning thru memory looking for objects to mark
(as being not eligible to free, because they're already referenced
somewhere), if it sees that the wrapping VALUE is live, it calls
frt_token_mark. Which lets the GC know about the c-level reference to
another VALUE, the token.text. The GC would otherwise miss this
reference, since it does not scan or manage objects created via ALLOC.
The mark callback passed to Data_Wrap_Struct helps the GC know about
such references. And frt_token_mark is doing the right thing; manually
marking token.text for the gc.

Ah-hah, but, what happens if the GC happens to get invoked during the
call to Data_Wrap_Struct? There's no ruby-level reference to the token
struct; that's ok, the GC won't try to free it. There's no ruby-level
reference to the VALUE wrapping the token struct, that's what
Data_Wrap_Struct is trying to create. Still ok so far, but...

!!  There's no reference to the token.text  !!
!!  Which is a VALUE, managed by the GC  !!
!!  So, GC thinks its ok to free token.text  !!

Normally, there should be a reference to that token.text VALUE
somewhere on the c call stack, (which is part of the roots used by the
GC), but in this case its being assigned directly into a struct
member, which is not stored on the stack. And GC does not know to scan
the token struct, because it doesn't follow pointers to c objects, and
doesn't yet have a wrapper VALUE that tells it what to do.

Try rewriting those 3 lines as:

    RToken *token = ALLOC(RToken);
    volatile VALUE text = rb_str_new2(tk->text);
    token->text = text;
    return Data_Wrap_Struct(cToken, &frt_token_mark, &frt_token_free, token);

That should ensure that the reference to text remains on the c call
stack. The volatile may be overkill, but you can't be too careful....

I think this may be your problem. I've had similar issues with ferret,
which I never managed to track down. (I wonder how many other places
in ferret have the same problem...?)

> My question is, where is the reference to the Ruby token object
> created to keep GC from reaping it?

Hopefully, the explanation above will be informative to you. If not,
ask again and I'll try to clarify.

> 1. token is a ptr to an allocated memory from ALLOC. Does ALLOC create
> memory pts that keeps reference? I don't believe so, because RToken is
> just a struct, and there's no fancy C++ copy constructors to bump
> references on assignment.

There's no reference counting in ruby; it's a mark-sweep garbage collector.

> 2. Does Data_Wrap_Struct automatically creates a Ruby object with 1
> reference? Or a new, un-assigned Ruby object w/ reference 0 and
> waiting to be assigned to a Ruby variable or a temporary? In the
> latter case, then we'd have an issue, because if GC runs before any
> assignment, GC would get rid of the token and token->text memory.

The reference to the VALUE returned by Data_Wrap_Struct kept on the c
call stack should keep it from being garbage collected.

> Note that I am assuming in Ruby, if I do t = token, then a reference
> to token is created on the assignment and that keeps token->text alive
> as well.

Yes.


> Actually, I believe a better solution to the problem I described is
> before returning a VALUE from a C proc to Ruby land, keeps a reference
> of it somewhere in Ruby. So the code looks like
>
> VALUE v;
> ...
> v = Data_Wrap_Struct (...);
> rb_ivar_set (..., &v);
> return v;
>
> So that what you returned, v, is not removed by GC before it's
> assigned or referenced in Ruby land.
>
> Does this sound like the right approach?

This is still hacky; you shouldn't need to create any more references
in ruby-level variables. If the diagnosis I gave above is correct this
won't solve the problem for you anyway, since it's the token.text that
seems to be vulnerable to a premature free, not token itself.