Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
lookup inside `shape->edges`.
To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.
This increases memory allocations, however we expect that
creating new transitions becomes increasingly rare over time.
```ruby
class A
def initialize(bool)
@a = 1
if bool
@b = 2
else
@c = 3
end
end
def test
@d = 4
end
end
def bench(iterations)
i = iterations
while i > 0
A.new(true).test
A.new(false).test
i -= 1
end
end
if ARGV.first == "ractor"
ractors = 8.times.map do
Ractor.new do
bench(20_000_000 / 8)
end
end
ractors.each(&:take)
else
bench(20_000_000)
end
```
The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.
[Feature #21353]
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.
I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
And get rid of the `obj_to_id_tbl`
It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.
We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.
The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
This opens the door to store more informations in shapes, such
as the `object_id` or object address in case it has been observed
and the object has to be moved.
Ivars will longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.
Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.
`field` encompass anything that can be stored in a VALUE array.
Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
Most of this code use the `type * name` style, while the
overwhemling majority of the rest of ruby use the `type *name`
style.
This is a cosmetic change, but helps with readability.
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap. So that Ruby contains multiple heaps, with
different sized objects.
The term heap as a collection of memory pages is more in memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.
The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.
Since then various changes have happend in Ruby's memory layout:
* The concept of tomb heaps has been replaced by a global free pages list,
with each page having it's slot size reconfigured at the point when it
is resurrected
* the eden heap has been inlined into the size pool itself, so that now
the size pool directly controls the free_pages list, the sweeping
page, the compaction cursor and the other state that was previously
being managed by the eden heap.
Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool
On 32-bit systems, the shape cache size is 1048576 (value of
REDBLACK_CACHE_SIZE), but a 16-bit unsigned integer can only go up to
65536. This means that the redblack_id_t can overflow and lead to a
corrupted red-black tree.
The following script crashes on 32-bit systems:
o = Object.new
1_000_000.times do |i|
o.instance_variable_set(:"@i#{i}", i)
end
That function is a bit too low level to called from multiple
places. It's always used in tandem with `rb_shape_set_too_complex`
and both have to know how the object is laid out to update the
`iv_ptr`.
So instead we can provide two higher level function:
- `rb_obj_copy_ivs_to_hash_table` to prepare a `st_table` from an
arbitrary oject.
- `rb_obj_convert_to_too_complex` to assign the new `st_table`
to the old object, and safely free the old `iv_ptr`.
Unfortunately both can't be combined into one, because `rb_obj_copy_ivar`
need `rb_obj_copy_ivs_to_hash_table` to copy from one object
to another.