Commit graph

107 commits

Author SHA1 Message Date
Jean Boussier
e9fd44dd72 shape.c: Implement a lock-free version of get_next_shape_internal
Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
lookup inside `shape->edges`.

To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.

This increases memory allocations, however we expect that
creating new transitions becomes increasingly rare over time.

```ruby
class A
  def initialize(bool)
    @a = 1
    if bool
      @b = 2
    else
      @c = 3
    end
  end

  def test
    @d = 4
  end
end

def bench(iterations)
  i = iterations
  while i > 0
    A.new(true).test
    A.new(false).test
    i -= 1
  end
end

if ARGV.first == "ractor"
  ractors = 8.times.map do
    Ractor.new do
      bench(20_000_000 / 8)
    end
  end
  ractors.each(&:take)
else
  bench(20_000_000)
end
```

The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-02 17:49:53 +02:00
Jean Boussier
749bda96e5 Refactor attr_index_t caches
Ensure the same helpers are used for packing and unpacking.
2025-05-28 12:39:21 +02:00
Jean Boussier
326c120aa7 Rename rb_shape_id_canonical_p -> rb_shape_canonical_p 2025-05-27 15:34:02 +02:00
Jean Boussier
925dec8d70 Rename rb_shape_set_shape_id in rb_obj_set_shape_id 2025-05-27 15:34:02 +02:00
Jean Boussier
ccf2b7c5b8 Refactor rb_shape_too_complex_p to take a shape_id_t. 2025-05-27 15:34:02 +02:00
Jean Boussier
a1f72d23a9 Refactor rb_shape_has_object_id
Now takes a `shape_id_t` and the version that takes a `rb_shape_t *`
is private.
2025-05-27 15:34:02 +02:00
Jean Boussier
a80a5000ab Refactor rb_obj_shape out.
It still exists but only in `shape.c`.
2025-05-27 15:34:02 +02:00
Jean Boussier
97f44ac54e Get rid of rb_shape_set_shape 2025-05-27 15:34:02 +02:00
Jean Boussier
a59835e1d5 Refactor rb_shape_get_iv_index to take a shape_id_t
Further reduce exposure of `rb_shape_t`.
2025-05-27 15:34:02 +02:00
Jean Boussier
e535f8248b Get rid of rb_shape_id(rb_shape_t *)
We should avoid conversions from `rb_shape_t *` into `shape_id_t`
outside of `shape.c` as the short term goal is to have `shape_id_t`
contain tags.
2025-05-27 12:45:24 +02:00
Jean Boussier
cc48fcdff2 Get rid of rb_shape_canonical_p 2025-05-27 12:45:24 +02:00
Jean Boussier
8b0868cbb1 Refactor rb_shape_rebuild_shape to stop exposing rb_shape_t 2025-05-27 12:45:24 +02:00
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Jean Boussier
52da5f8bbc Refactor rb_shape_transition_remove_ivar
Move the fields management logic in `rb_ivar_delete`, and keep
shape managment logic in `rb_shape_transition_remove_ivar`.
2025-05-23 17:33:17 +02:00
Jean Boussier
60ffb714d2 Ensure shape_id is never used on T_IMEMO
It doesn't make sense to set ivars or anything shape
related on a T_IMEMO.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2025-05-15 16:06:52 +02:00
Jean Boussier
a6435befa7 variable.c: Refactor rb_obj_field_* to take shape_id_t 2025-05-13 10:35:34 +02:00
Jean Boussier
3135eddb4e Refactor FIRST_T_OBJECT_SHAPE_ID to not be used outside shape.c 2025-05-09 20:45:48 +02:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
0b81359b3f Stop exposing rb_shape_frozen_shape_p 2025-05-09 10:22:51 +02:00
Jean Boussier
a970d35de2 Get rid of rb_shape_get_parent. 2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00
Jean Boussier
a007575497 Remove unused rb_shape_object_id_index 2025-05-09 10:22:51 +02:00
Jean Boussier
c9b08882b7 Refactor rb_shape_get_next to return an ID
Also rename it, and change parameters to be consistent with
other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
e0200cfba0 Refactor rb_shape_transition_shape_remove_ivar to not take a shape pointer
It's more consistent with other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
677d075c29 Refactor rb_shape_transition_too_complex to return an ID. 2025-05-09 10:22:51 +02:00
Jean Boussier
f82523f14b Refactor rb_shape_transition_frozen to return a shape_id. 2025-05-09 10:22:51 +02:00
Jean Boussier
31d0a5815c Get rid of useless SHAPE_MASK 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
9966de11fb Refactor rb_shape_get_next_iv_shape to take and return ids. 2025-05-09 10:22:51 +02:00
Jean Boussier
df7d25bb3e Stop exposing rb_shape_get_root_shape 2025-05-09 10:22:51 +02:00
Jean Boussier
62eb2007f6 Remove unused rb_obj_debug_shape 2025-05-09 10:22:51 +02:00
Jean Boussier
e4f97ce387 Refactor rb_shape_depth to take an ID rather than a pointer.
As well as `rb_shape_edges_count` and `rb_shape_memsize`.
2025-05-09 10:22:51 +02:00
Jean Boussier
f8b3fc520f Refactor rb_shape_traverse_from_new_root to not expose rb_shape_t 2025-05-09 10:22:51 +02:00
Jean Boussier
7116b0a7f1 Extract rb_shape_free_all 2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
d34c150547 shape.c: refactor frozen shape to no longer be final
This opens the door to store more informations in shapes, such
as the `object_id` or object address in case it has been observed
and the object has to be moved.
2025-05-08 07:58:05 +02:00
Jean Boussier
6c9b3ac232 Refactor OBJ_TOO_COMPLEX_SHAPE_ID to not be referenced outside shape.h
Also refactor checks for `->type == SHAPE_OBJ_TOO_COMPLEX`.
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Ivars will longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.

`field` encompass anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Jean Boussier
a3af4e905f Make rb_shape.capacity an attr_index_t 2025-05-05 14:44:49 +02:00
Jean Boussier
18dac125cb Improve syntax style consistency in shape.c and shape.h
Most of this code use the `type * name` style, while the
overwhemling majority of the rest of ruby use the `type *name`
style.

This is a cosmetic change, but helps with readability.
2025-04-30 08:10:55 +02:00
Matt Valentine-House
8e7df4b7c6 Rename size_pool -> heap
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap. So that Ruby contains multiple heaps, with
different sized objects.

The term heap as a collection of memory pages is more in memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.

The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.

Since then various changes have happend in Ruby's memory layout:

* The concept of tomb heaps has been replaced by a global free pages list,
  with each page having it's slot size reconfigured at the point when it
  is resurrected
* the eden heap has been inlined into the size pool itself, so that now
  the size pool directly controls the free_pages list, the sweeping
  page, the compaction cursor and the other state that was previously
  being managed by the eden heap.

Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool
2024-10-03 21:20:09 +01:00
Jean Boussier
f7b53a75b6 Do not emit shape transition warnings when YJIT is compiling
[Bug #20522]

If `Warning.warn` is redefined in Ruby, emitting a warning would invoke
Ruby code, which can't safely be done when YJIT is compiling.
2024-06-04 19:21:01 +02:00
Peter Zhu
3896f9940e Make special const and too complex shapes before T_OBJECT shapes 2024-03-13 09:55:52 -04:00
Peter Zhu
6b0434c0f7 Don't create per size pool shapes for non-T_OBJECT 2024-03-13 09:55:52 -04:00
Peter Zhu
df5b8ea4db Remove unneeded RUBY_FUNC_EXPORTED 2024-02-23 10:24:21 -05:00
Peter Zhu
71babe5536 Use 32-bit integers for redblack_id_t
On 32-bit systems, the shape cache size is 1048576 (value of
REDBLACK_CACHE_SIZE), but a 16-bit unsigned integer can only go up to
65536. This means that the redblack_id_t can overflow and lead to a
corrupted red-black tree.

The following script crashes on 32-bit systems:

    o = Object.new
    1_000_000.times do |i|
      o.instance_variable_set(:"@i#{i}", i)
    end
2023-12-04 13:57:12 -05:00
Aaron Patterson
6fce8c7980 Don't try compacting ivars on Classes that are "too complex"
Too complex classes use a hash table to store ivs, and should always pin
their IVs.  We shouldn't touch those classes in compaction.
2023-11-20 16:09:48 -08:00
Jean Boussier
94c9f16663 Refactor rb_obj_evacuate_ivs_to_hash_table
That function is a bit too low level to called from multiple
places. It's always used in tandem with `rb_shape_set_too_complex`
and both have to know how the object is laid out to update the
`iv_ptr`.

So instead we can provide two higher level function:

  - `rb_obj_copy_ivs_to_hash_table` to prepare a `st_table` from an
    arbitrary oject.
  - `rb_obj_convert_to_too_complex` to assign the new `st_table`
    to the old object, and safely free the old `iv_ptr`.

Unfortunately both can't be combined into one, because `rb_obj_copy_ivar`
need `rb_obj_copy_ivs_to_hash_table` to copy from one object
to another.
2023-11-17 09:19:21 +01:00
Peter Zhu
68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e3.
2023-11-13 18:26:36 -05:00