How do Native Extensions Manage Memory? Part 1: Ruby (MRI)
One of the most interesting design challenges in a programming language interpreter or VM is what kind of memory management API to offer to native extensions.
This issue is more or less invisible unless you’re writing a native extension. When you are simply writing a program in Ruby, Python, Lua, etc., you don’t usually care how the memory management is implemented. You only care is that you can create objects with reckless abandon and that they will get cleaned up for you, somehow. The people who write the VM get a lot of latitude in how to actually implement this cleanup because the implementation is so hidden from user programs. Different VMs can use completely different strategies, even for the same language! For example, CPython uses reference counting whereas PyPy uses tracing, even though both are VMs for Python.
But when the VM offers a C/C++ API for native extensions, it
usually exposes more implementation details. Memory
management tasks that would be performed automatically if
you were writing directly in Python (for example) need to be
performed manually at the C/C++ level. For example, CPython
knows how to increment and decrement refcounts when you say
x = y
in Python. But at the C/C++ level you must
explicitly increment and decrement the refcounts yourself.
A native extension can create pointers to VM-owned memory
(like a char*
pointer to data from a string object), and
it needs some way of knowing how long this data will stay
alive. In many native extension APIs, you can even get a
pointer directly to a VM object, and you need a way of
ensuring that the target object isn’t deleted while you are
using it!
This is an interesting design problem. What makes it especially interesting is how many different ways there are to solve it. I know of native extension APIs for Ruby (MRI), JavaScript (V8), Lua, PHP, and Python (CPython), and literally no two VMs in this list solve the problem the same way! When I discovered this, the situation seemed ripe for a series of blog articles exploring this problem and all of the solutions.
There is a fair bit of information to cover for each VM, so I’ve split this article into parts. This article will explore Ruby, specifically the C VM for Ruby known as MRI (Matz’s Ruby Interpreter).
Representing Ruby objects in C
Let’s start by painting a quick picture of how we refer to
Ruby objects from a C extension. The ruby.h
header
defines a type called VALUE
which we can use to refer to
any Ruby object from C:
I put the Ruby object in “Ruby VM space,” which means that it’s owned/managed by Ruby, and its C-level representation is private to the Ruby interpreter.
If the Ruby object is of type String
, then it will be
represented internally by Ruby as an RString
, and we can
get a pointer to the string data by writing:
Now we have something that looks like this:
Often when we are writing a C extension, we want to be able
to attach some custom C data (like a struct) to a Ruby
value. Ruby provides APIs for creating Ruby objects that
are just wrappers around some C data. These internally use
Ruby VM object types RData
and RTypedData
(depending on
whether you use the old or new API for this).
As a simple, example, suppose we wanted to implement a “pair” Ruby object in C that points to two Ruby values:
It would look something like this:
Ruby’s Garbage Collector
Ruby uses a tracing garbage collector. The basic idea behind tracing GC is to traverse the graph of live objects, starting from the “root” objects (objects that are being directly used) and following all known references, which will determine which objects are still reachable. Unreachable objects can then be freed. There are lots of complicated variations on this theme that try to avoid scanning the entire heap every time, but the basic principle remains the same.
There are a couple key questions that a tracing garbage collector needs to be able to answer:
-
what are the “root objects” (objects directly in use)?
-
given an object, what other objects does it directly reference (so we can mark them too)?
Let’s start by addressing #2.
Finding References Between Objects
For types that are part of the Ruby VM
(Array
, Hash
, etc) Ruby already knows how to scan them
to find the VALUE
pointers inside. Ruby can trace
references between objects in “Ruby VM space” without our
help.
But what about types we create in a C extension, like
RubyPair
above? If the GC “mark” phase reaches this
object, it needs to know how to follow these two Ruby object
pointers. How will it do this, since it doesn’t know how to
find the VALUE
pointers in our RubyPair
struct?
Ruby’s answer is to put this responsibility directly in the
hands of the extension author. Anyone who implements a Ruby
data type in C that can reference other Ruby objects must
implement a “mark” function that visits them. For example,
here is a mark
function that would work for RubyPair
.
Ruby calls RubyPair_mark
whenever the “mark” phase of
garbage collection reaches one of our RubyPair
objects.
It counts on us to call rb_gc_mark()
for every Ruby object
we reference. It will then free things that can’t be
reached from any mark function. If you make a mistake and
forget to call rb_gc_mark()
on one of your references, it
may get freed even though you are still referencing it. You
will probably SEGV the interpreter next time you try to
access that object!
Finding GC Roots
So how does the VM find GC roots? This is where things get tricky. Ruby already knows about any live references from Ruby-level variables (local, global, or instance variables). That part is easy. But what about Ruby objects that are only referenced from C? For example, you could write something like this in C:
Any of those calls to rb_str_*
functions could trigger a
garbage collection cycle. How will the GC know to avoid
collecting str
itself?
Ruby’s solution to this problem is to scan the C stack and
registers (!!) looking for VALUE
pointers. This is a
highly platform-specific and non-portable thing to do. Ruby
needs some way of deciding what addresses signify the begin
and end of the stack. And as it scans, it needs to
skip/discard any data that isn’t a VALUE
– this is
somewhat heuristic by its nature. Despite all these
caveats, the scheme must work well enough, since Ruby has
been using it for over 10 years now. The details of how
this is done are beyond the scope of this article. But if
you are interested in knowing more, there is some fun
reading in
gc.c.
If the GC is triggered inside our RubyPair_inspect
function above, Ruby will find the str
local variable and
know to treat it as a GC root, since the C stack references
it. This will keep str
alive for as long as the function
runs. And if we get a pointer to the string’s internal data
with RSTRING_PTR(str)
, we know it will live as long as the
String
object does.
It is also possible for VALUE
pointers to be in global
variables . Ruby provides the functions
rb_gc_register_address(VALUE*)
and
rb_gc_unregister_address(VALUE*)
that you can call with
the address of any variable of type VALUE
. This will
probably be a global variable (while it could technically be
on the heap, I’ve never seen any reason to do this). Ruby
will read the variable and treat it as a root every time the
GC runs its “mark” phase.
We can put this all together to get a full picture of Ruby’s memory management model for native extensions:
This memory model and API are convenient to use from C.
Functions like RubyPair_inspect
are nice to write and are
GC-safe without requiring any manual memory management
bookkeeping.
The trickiest part is making sure you get the mark()
function right. To test this properly you need to make sure
you trigger some GC cycles while your user-defined objects
are alive, and then traverse your objects again to make sure
none of the VALUE
pointers are dangling. You also need to
make sure that every VALUE
you are using from C is
discoverable as a root by the GC.