Commit graph

796 commits

Author SHA1 Message Date
Jean Boussier
edbd9ed468 variable.c: avoid out of bound write in generic_field_set
[Bug #21445]
2025-06-21 14:40:40 +01:00
John Hawthorn
912edb4716 Fix missing write barrier on class fields
Found by wbcheck

    klass = Class.new
    200.times do |iv|
      klass.instance_variable_set("@_iv_#{iv}", Object.new)
    end
2025-06-18 20:54:01 -07:00
Jean Boussier
8faa32327b Add missing write barriers in rb_imemo_fields_clone. 2025-06-17 15:28:05 +02:00
Jean Boussier
cd9f447be2 Refactor generic fields to use T_IMEMO/fields objects.
Followup: https://github.com/ruby/ruby/pull/13589

This simplify a lot of things, as we no longer need to manually
manage the memory, we can use the Read-Copy-Update pattern and
avoid numerous race conditions.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-17 15:28:05 +02:00
Jean Boussier
164486a954 Refactor rb_imemo_fields_new to not assume T_CLASS 2025-06-17 15:28:05 +02:00
Jean Boussier
fb68721f63 Rename imemo_class_fields -> imemo_fields 2025-06-17 15:28:05 +02:00
Nobuyoshi Nakada
8992689118
Adjust indent [ci] 2025-06-17 16:30:40 +09:00
John Hawthorn
055fef00a1 Free after insert in generic_ivar_set_shape_fields
Previously we were performing a realloc and then inserting the new value
into the table. If the table was flagged as requiring a rebuild, this
could trigger GC work and marking within that GC could access the fields
freed by realloc.
2025-06-17 08:01:23 +02:00
John Hawthorn
39697ffd01 Remove fields_tbl in gen_fields_lookup_ensure_size 2025-06-13 23:29:41 -07:00
John Hawthorn
5342d9130b Fix generic_ivar_set_shape_field for table rebuild
[Bug #21438]

Previously GC could trigger a table rebuild of the generic fields
st_table in the middle of calling the st_update callback. This could
cause entries to be reallocated or rearranged and the update to be for
the wrong entry.

This commit adds an assertion to make that case easier to detect, and
replaces the st_update with a separate st_lookup and st_insert.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-06-13 23:29:41 -07:00
Jean Boussier
15084fbc3c Get rid of FL_EXIVAR
Now that the shape_id gives us all the same information, it's no
longer needed.
2025-06-13 23:50:30 +02:00
Jean Boussier
6dbe24fe56 Use the shape_id rather than FL_EXIVAR
We still keep setting `FL_EXIVAR` so that `rb_shape_verify_consistency`
can detect discrepancies.
2025-06-13 23:50:30 +02:00
Jean Boussier
b51078f82e Enforce consistency between shape_id and FL_EXIVAR
The FL_EXIVAR is a bit redundant with the shape_id.
Now that the `shape_id` is embedded in all objects on all archs,
we can cheaply check if an object has any fields with a simple
bitmask.
2025-06-13 23:50:30 +02:00
Jean Boussier
a74c385208 Make setting and accessing class ivars lock-free
Now that class fields have been deletated to a T_IMEMO/class_fields
when we're in multi-ractor mode, we can read and write class instance
variable in an atomic way using Read-Copy-Update (RCU).

Note when in multi-ractor mode, we always use RCU. In theory
we don't need to, instead if we ensured the field is written
before the shape is updated it would be safe.

Benchmark:

```ruby
Warning[:experimental] = false

class Foo
  @foo = 1
  @bar = 2
  @baz = 3
  @egg = 4
  @spam = 5

  class << self
    attr_reader :foo, :bar, :baz, :egg, :spam
  end
end

ractors = 8.times.map do
  Ractor.new do
    1_000_000.times do
      Foo.bar + Foo.baz * Foo.egg - Foo.spam
    end
  end
end

if Ractor.method_defined?(:value)
  ractors.each(&:value)
else
  ractors.each(&:take)
end
```

This branch vs Ruby 3.4:

```bash
$ hyperfine -w 1 'ruby --disable-all ../test.rb' './miniruby ../test.rb'

Benchmark 1: ruby --disable-all ../test.rb
  Time (mean ± σ):      3.162 s ±  0.071 s    [User: 2.783 s, System: 10.809 s]
  Range (min … max):    3.093 s …  3.337 s    10 runs

Benchmark 2: ./miniruby ../test.rb
  Time (mean ± σ):     208.7 ms ±   4.6 ms    [User: 889.7 ms, System: 6.9 ms]
  Range (min … max):   202.8 ms … 222.0 ms    14 runs

Summary
  ./miniruby ../test.rb ran
   15.15 ± 0.47 times faster than ruby --disable-all ../test.rb
```
2025-06-12 14:55:13 +02:00
Jean Boussier
8b5ac5abf2 Fix class instance variable inside namespaces
Now that classes fields are delegated to an object with its own
shape_id, we no longer need to mark all classes as TOO_COMPLEX.
2025-06-12 13:43:29 +02:00
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behave almost exactly as a T_OBJECT, the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namspaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.

Second, it open the door to later make class instance variable
writes atomics, hence be able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespace we should likely mark classes that have
    multiple namespace with a specific shape flag.
2025-06-12 07:58:16 +02:00
Jean Boussier
95201299fd Refactor the last references to rb_shape_t
The type isn't opaque because Ruby isn't often compiled with LTO,
so for optimization purpose it's better to allow as much inlining
as possible.

However ideally only `shape.c` and `shape.h` should deal with
the actual struct, and everything else should just deal with opaque
`shape_id_t`.
2025-06-11 16:38:38 +02:00
Jean Boussier
f9966b9b76 Get rid of gen_fields_tbl.fields_count
This data is redundant because the shape already contains both the
length and capacity of the object's fields.

So it both waste space and create the possibility of a desync between
the two.

We also do not need to initialize everything to Qundef, this seem
to be a left-over from pre-shape instance variables.
2025-06-09 16:38:29 +02:00
Jean Boussier
6eb0cd8df7 Get rid of SHAPE_T_OBJECT
Now that we have the `heap_index` in shape flags we no longer
need `T_OBJECT` shapes.
2025-06-07 18:30:44 +02:00
Jean Boussier
8c4e368dcf shape.c: ensure heap_index is consistent for complex shapes 2025-06-07 18:30:44 +02:00
Jean Boussier
4e39580992 Refactor raw accesses to rb_shape_t.capacity 2025-06-05 22:06:15 +02:00
Nobuyoshi Nakada
edaa27ce45
Suppress warnings by gcc-13 with -Og 2025-06-05 22:33:02 +09:00
Jean Boussier
772fc1f187 Get rid of rb_shape_t.flags
Now all flags are only in the `shape_id_t`, and can all be checked
without needing to dereference a pointer.
2025-06-05 07:44:44 +02:00
Jean Boussier
675f33508c Get rid of TOO_COMPLEX shape type
Instead it's now a `shape_id` flag.

This allows to check if an object is complex without having
to chase the `rb_shape_t` pointer.
2025-06-04 13:13:50 +02:00
Jean Boussier
206110a2a8 Add missing lock in rb_ivar_defined
If called on a class, we should acquire the lock.
2025-06-04 08:26:20 +02:00
Jean Boussier
625d6a9cbb Get rid of frozen shapes.
Instead `shape_id_t` higher bits contain flags, and the first one
tells whether the shape is frozen.

This has multiple benefits:
  - Can check if a shape is frozen with a single bit check instead of
    dereferencing a pointer.
  - Guarantees it is always possible to transition to frozen.
  - This allow reclaiming `FL_FREEZE` (not done yet).

The downside is you have to be careful to preserve these flags
when transitioning.
2025-06-04 07:59:20 +02:00
Jean Boussier
326c120aa7 Rename rb_shape_id_canonical_p -> rb_shape_canonical_p 2025-05-27 15:34:02 +02:00
Jean Boussier
925dec8d70 Rename rb_shape_set_shape_id in rb_obj_set_shape_id 2025-05-27 15:34:02 +02:00
Jean Boussier
ccf2b7c5b8 Refactor rb_shape_too_complex_p to take a shape_id_t. 2025-05-27 15:34:02 +02:00
Jean Boussier
a80a5000ab Refactor rb_obj_shape out.
It still exists but only in `shape.c`.
2025-05-27 15:34:02 +02:00
Jean Boussier
a59835e1d5 Refactor rb_shape_get_iv_index to take a shape_id_t
Further reduce exposure of `rb_shape_t`.
2025-05-27 15:34:02 +02:00
Jean Boussier
cc48fcdff2 Get rid of rb_shape_canonical_p 2025-05-27 12:45:24 +02:00
Jean Boussier
8b0868cbb1 Refactor rb_shape_rebuild_shape to stop exposing rb_shape_t 2025-05-27 12:45:24 +02:00
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Nobuyoshi Nakada
aad9fa2853
Use RB_VM_LOCKING 2025-05-25 15:22:43 +09:00
Jean Boussier
1435ea7f44 Add missing lock for Module#remove_instance_variable
We must take a lock to ensure another ractor isn't reading the ivars
while we're moving them.
2025-05-23 17:33:17 +02:00
Jean Boussier
52da5f8bbc Refactor rb_shape_transition_remove_ivar
Move the fields management logic in `rb_ivar_delete`, and keep
shape managment logic in `rb_shape_transition_remove_ivar`.
2025-05-23 17:33:17 +02:00
Luke Gruber
966fcb77e4 lock vm around rb_free_generic_ivar
Currently, this can be reproduced by:

r = Ractor.new do
    a = [1, 2, 3]
    a.object_id
    a.dup # this frees the generic ivar for `object_id` on the copied object
    :done
end
r.take

In debug builds, this hits an assertion failure without this fix.
2025-05-23 18:20:35 +09:00
Aaron Patterson
ef935705cf Use shape_id for determining "too complex"
Using `rb_shape_obj_too_complex_p` looks up the shape, but we already
have the shape id. This avoids looking up the shape twice.
2025-05-21 10:27:14 -07:00
Jean Boussier
22c09135a8 rb_copy_generic_ivar: reset shape_id when no ivar are present 2025-05-16 20:16:52 +02:00
Jean Boussier
a6435befa7 variable.c: Refactor rb_obj_field_* to take shape_id_t 2025-05-13 10:35:34 +02:00
Jean Boussier
574127b0db Add missing gc_mark in autoload_const_mark
If we're referencing the namespace object we have to mark it.
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
c2c5b05423 Unexpectedly miss-fixed macro during rebase 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
294b52fb9b Follow the code style about else 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jean Boussier
6dd7a7bb41 Refactor rb_obj_field_get to handle complex shapes
This allow to get or set fields without having to worry about the
shape type.
2025-05-10 16:55:22 +02:00
Jean Boussier
d9502a8386 Rename rb_field_get -> rb_obj_field_get
To be consistent with `rb_obj_field_set`.
2025-05-10 15:39:33 +02:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
a970d35de2 Get rid of rb_shape_get_parent. 2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00