Commit graph

604 commits

Author SHA1 Message Date
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Nobuyoshi Nakada
aad9fa2853
Use RB_VM_LOCKING 2025-05-25 15:22:43 +09:00
John Hawthorn
4f9f2243e9 Stricter assert for RCLASS_ALLOCATOR
I'd like to make this only valid to T_CLASS also, but currently it is
called in some places for T_ICLASS and expected to return 0.
2025-05-23 10:33:48 -07:00
John Hawthorn
05cdcfcefd Only call RCLASS_SET_ALLOCATOR on T_CLASS objects
It's invalid to set an allocator on a T_ICLASS or T_MODULE, as those use
the other fields from the union.
2025-05-23 10:33:48 -07:00
John Hawthorn
11ad7f5f47 Don't use namespaced classext for superclasses
Superclasses can't be modified by user code, so do not need namespace
indirection. For example Object.superclass is always BasicObject, no
matter what modules are included onto it.
2025-05-23 10:22:24 -07:00
Samuel Williams
73c9d6ccaa
Allow IO#close to interrupt IO operations on fibers using fiber_interrupt hook. (#12839) 2025-05-23 14:55:05 +09:00
Samuel Williams
87261c2d95
Ensure that forked process do not see invalid blocking operations. (#13343) 2025-05-15 15:50:15 +09:00
Jean Boussier
9400119702 Fix object_id for classes and modules in namespace context
Given classes and modules have a different set of fields in every
namespace, we can't store the object_id in fields for them.

Given that some space was freed in `RClass` we can store it there
instead.
2025-05-14 10:26:48 +02:00
Jean Boussier
130d6aaef2 Reclaim one VALUE from rb_classext_t by shrinking super_classdepth
By making `super_classdepth` `uint16_t`, classes and modules can
now fit in 160B slots again.

The downside of course is that before `super_classdepth` was large
enough we never had to care about overflow, as you couldn't
realistically create enough classes to ever go over it.

With this change, while it is stupid, you could realistically
create an ancestor chain containing 65k classes and modules.
2025-05-14 10:17:03 +02:00
Jean Boussier
2ca8769443 Reclaim one VALUE from rb_classext_t
The `includer` field is only used for `T_ICLASS`, so by moving
it into the existing union we can save one `VALUE` per class
and module.
2025-05-13 14:55:39 +02:00
Samuel Williams
425fa0aeb5
Make waiting_fd behaviour per-IO. (#13127)
- `rb_thread_fd_close` is deprecated and now a no-op.
- IO operations (including close) no longer take a vm-wide lock.
2025-05-13 19:02:03 +09:00
Jean Boussier
a6435befa7 variable.c: Refactor rb_obj_field_* to take shape_id_t 2025-05-13 10:35:34 +02:00
Satoshi Tagomori
f0b41ef669 Describe the basic documents of Namespace 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
8199e6e1a6 Show experimental warning when namespace is enabled 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
8ecc04dc04 Delete code for debugging namespace 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
90e5ce6132 Rename RCLASS_EXT() macro to RCLASS_EXT_PRIME() to prevent using it wrongly
The macro RCLASS_EXT() accesses the prime classext directly, but it can be
valid only in a limited situation when namespace is enabled.
So, to prevent using RCLASS_EXT() in the wrong way, rename the macro and
let the developer check it is ok to access the prime classext or not.
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
ff790c759e Compact prime classext readable/writable flags
To make RClass size smaller, move flags of prime classext readable/writable to:
 readable - use ns_classext_tbl is NULL or not (if NULL, it's readable)
 writable - use FL_USER2 of RBasic flags
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
5ee1ec313a initialize method tables before any GC chance 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
f24ba27d6d avoid calling ZALLOC after NEWOBJ_OF for RClass: need to return RClass not promoted 2025-05-11 23:32:50 +09:00
Yusuke Endoh
8d3cd4301a Remove unnecessary prototype declarations
```
internal/class.h:158:20: warning: ‘RCLASS_SET_CLASSEXT_TABLE’ declared ‘static’ but never defined [-Wunused-function]
  158 | static inline void RCLASS_SET_CLASSEXT_TABLE(VALUE obj, st_table *tbl);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~
internal/class.h:271:20: warning: ‘RCLASS_WRITE_SUBCLASSES’ declared ‘static’ but never defined [-Wunused-function]
  271 | static inline void RCLASS_WRITE_SUBCLASSES(VALUE klass, rb_subclass_anchor_t *anchor);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~
```
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jean Boussier
d9502a8386 Rename rb_field_get -> rb_obj_field_get
To be consistent with `rb_obj_field_set`.
2025-05-10 15:39:33 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Ivars will longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.

`field` encompass anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Jeremy Evans
926411171d
Support Marshal.{dump,load} for core Set
This was missed when adding core Set, because it's handled
implicitly for T_OBJECT.

Keep marshal compatibility between core Set and stdlib Set,
so you can unmarshal core Set with stdlib Set and vice versa.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2025-04-28 08:38:35 -07:00
Jean Boussier
c0417bd094 Use set_table to track const caches
Now that we have a `set_table` implementation, we can
use it to track const caches and save some memory.

We could even save some more memory if `numtable` didn't
store a copy of the `hash` and instead recomputed it every
time, but this is a quick win.
2025-04-26 12:10:32 +02:00
Jeremy Evans
e4f85bfc31 Implement Set as a core class
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`.
  than `s/st_/set_/`, the main difference is that the stored records
  do not have values, making them 1/3 smaller. `st_table_entry` stores
  `hash`, `key`, and `record` (value), while `set_table_entry` only
  stores `hash` and `key`.  This results in large sets using ~33% less
  memory compared to stdlib Set.  For small sets, core Set uses 12% more
  memory (160 byte object slot and 64 malloc bytes, while stdlib set
  uses 40 for Set and 160 for Hash).  More memory is used because
  the set_table is embedded and 72 bytes in the object slot are
  currently wasted. Hopefully we can make this more efficient and have
  it stored in an 80 byte object slot in the future.

* All methods are implemented as cfuncs, except the pretty_print
  methods, which were moved to `lib/pp.rb` (which is where the
  pretty_print methods for other core classes are defined).  As is
  typical for core classes, internal calls call C functions and
  not Ruby methods.  For example, to check if something is a Set,
  `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
  related object.

* Almost all methods use the same algorithm that the pure-Ruby
  implementation used.  The exception is when calling `Set#divide` with a
  block with 2-arity.  The pure-Ruby method used tsort to implement this.
  I developed an algorithm that only allocates a single intermediate
  hash and does not need tsort.

* The `flatten_merge` protected method is no longer necessary, so it
  is not implemented (it could be).

* Similar to Hash/Array, subclasses of Set are no longer reflected in
  `inspect` output.

* RDoc from stdlib Set was moved to core Set, with minor updates.

This includes a comprehensive benchmark suite for all public Set
methods.  As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call.  I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).

The rbs and repl_type_completor bundled gems will need updates to
support core Set.  The pull request marks them as allowed failures.

This passes all set tests with no changes.  The following specs
needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
  object as both the first and second argument (this seems like an issue
  with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` return
  `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
   function).
* `Set#flatten_merge` protected method is not implemented.

Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`.  This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting that switch to a core class.  This does not move the spec
files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.

For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table.  To do that, include
internal/set_table.h.  To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>
2025-04-26 10:31:11 +09:00
Samuel Williams
d6d4e6877c
Tidy up rb_io_fptr_finalize. (#13136) 2025-04-19 19:16:54 +09:00
Samuel Williams
20a1c1dc6b
Ensure struct rb_io is passed through to thread.c. (#13134) 2025-04-19 09:55:16 +09:00
Takashi Kokubun
0252ce1cd4 Implement Options as part of ZJITState 2025-04-18 21:52:57 +09:00
Takashi Kokubun
d993307d4c Add --zjit option 2025-04-18 21:52:55 +09:00
John Hawthorn
57b6a7503f Lock-free hash set for fstrings [Feature #21268]
This implements a hash set which is wait-free for lookup and lock-free
for insert (unless resizing) to use for fstring de-duplication.

As highlighted in https://bugs.ruby-lang.org/issues/19288, heavy use of
fstrings (frozen interned strings) can significantly reduce the
parallelism of Ractors.

I tried a few other approaches first: using an RWLock, striping a series
of RWlocks (partitioning the hash N-ways to reduce lock contention), and
putting a cache in front of it. All of these improved the situation, but
were unsatisfying as all still required locks for writes (and granular
locks are awkward, since we run the risk of needing to reach a vm
barrier) and this table is somewhat write-heavy.

My main reference for this was Cliff Click's talk on a lock free
hash-table for java https://www.youtube.com/watch?v=HJ-719EGIts. It
turns out this lock-free hash set is made easier to implement by a few
properties:

 * We only need a hash set rather than a hash table (we only need keys,
   not values), and so the full entry can be written as a single VALUE
 * As a set we only need lookup/insert/delete, no update
 * Delete is only run inside GC so does not need to be atomic (It could
   be made concurrent)
 * I use rb_vm_barrier for the (rare) table rebuilds (It could be made
   concurrent) We VM lock (but don't require other threads to stop) for
   table rebuilds, as those are rare
 * The conservative garbage collector makes deferred replication easy,
   using a T_DATA object

Another benefits of having a table specific to fstrings is that we
compare by value on lookup/insert, but by identity on delete, as we only
want to remove the exact string which is being freed. This is faster and
provides a second way to avoid the race condition in
https://bugs.ruby-lang.org/issues/21172.

This is a pretty standard open-addressing hash table with quadratic
probing. Similar to our existing st_table or id_table. Deletes (which
happen on GC) replace existing keys with a tombstone, which is the only
type of update which can occur. Tombstones are only cleared out on
resize.

Unlike st_table, the VALUEs are stored in the hash table itself
(st_table's bins) rather than as a compact index. This avoids an extra
pointer dereference and is possible because we don't need to preserve
insertion order. The table targets a load factor of 2 (it is enlarged
once it is half full).
2025-04-18 13:03:54 +09:00
John Hawthorn
89199a47db Extract rb_gc_free_fstring to string.c
This allows more flexibility in how we deal with the fstring table
2025-04-18 13:03:54 +09:00
Samuel Williams
8d21f666b8
Introduce enum rb_io_mode. (#7894) 2025-04-16 07:50:37 +00:00
Xavier Noria
c5c0bb5afc Restore the original order of const_added and inherited callbacks
Originally, if a class was defined with the class keyword, the cref had a
const_added callback, and the superclass an inherited callback, const_added was
called first, and inherited second.

This was discussed in

    https://bugs.ruby-lang.org/issues/21143

and an attempt at changing this order was made.

While both constant assignment and inheritance have happened before these
callbacks are invoked, it was deemed nice to have the same order as in

    C = Class.new

This was mostly for alignment: In that last use case things happen at different
times and therefore the order of execution is kind of obvious, whereas when the
class keyword is involved, the order is opaque to the user and it is up to the
interpreter.

However, soon in

    https://bugs.ruby-lang.org/issues/21193

Matz decided to play safe and keep the existing order.

This reverts commits:

    de097fbe5f
    de48e47ddf
2025-04-10 10:20:31 +02:00
Jean Boussier
085cc6e434 Ractor: revert to moving object bytes, but size pool aware
Using `rb_obj_clone` introduce other problems, such as `initialize_*`
callbacks invocation in the context of the parent ractor.

So we can revert back to copy the content of the object slots,
but in a way that is aware of size pools.
2025-04-04 16:26:29 +02:00
Jean Boussier
0350290262 Ractor: Fix moving embedded objects
[Bug #20271]
[Bug #20267]
[Bug #20255]

`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.

Instead we can use the same logic than ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.

We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:

```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but wasn't.
```

If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
2025-03-31 12:01:55 +02:00
Étienne Barrié
6ecfe643b5 Freeze $/ and make it ractor safe
[Feature #21109]

By always freezing when setting the global rb_rs variable, we can ensure
it is not modified and can be accessed from a ractor.

We're also making sure it's an instance of String and does not have any
instance variables.

Of course, if $/ is changed at runtime, it may cause surprising behavior
but doing so is deprecated already anyway.

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2025-03-27 17:54:56 +01:00
Jean Boussier
de097fbe5f Trigger inherited and const_set callbacks after const has been defined
[Misc #21143]
[Bug #21193]

The previous change caused a backward compatibility issue with code
that called `Object.const_source_location` from the `inherited` callback.

To fix this, the order is now:

- Define the constant
- Invoke `inherited`
- Invoke `const_set`
2025-03-20 18:18:11 +01:00
Nobuyoshi Nakada
453f88f7f1 Make ASAN default option string built-in libruby
The content depends on ruby internal, not responsibility of the
caller.  Revive `RUBY_GLOBAL_SETUP` macro to define the hook function.
2025-03-16 17:33:58 +09:00
Jean Boussier
87f9c3c65e String#gsub! Elide MatchData allocation when we know it can't escape
In gsub is used with a string replacement or a map that doesn't
have a default proc, we know for sure no code can cause the MatchData
to escape the `gsub` call.

In such case, we still have to allocate a new MatchData because we
don't know what is the lifetime of the backref, but for any subsequent
match we can re-use the MatchData we allocated ourselves, reducing
allocations significantly.

This partially fixes [Misc #20652], except when a block is used,
and partially reduce the performance impact of
abc0304cb2 / [Bug #17507]

```
compare-ruby: ruby 3.5.0dev (2025-02-24T09:44:57Z master 5cf146399f) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-02-24T10:58:27Z gsub-elude-match da966636e9) +PRISM [arm64-darwin24]
warming up....

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|escape           |      3.577k|    3.697k|
|                 |           -|     1.03x|
|escape_bin       |      5.869k|    6.743k|
|                 |           -|     1.15x|
|escape_utf8      |      3.448k|    3.738k|
|                 |           -|     1.08x|
|escape_utf8_bin  |      6.361k|    7.267k|
|                 |           -|     1.14x|
```

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-02-24 18:32:46 +01:00
Peter Zhu
7b6e07ea93 Add rb_gc_object_metadata API
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
2025-02-19 09:47:28 -05:00
Nobuyoshi Nakada
3f07bc76ff [Bug #21144] Win32: Use Windows time zone ID if TZ is not set
If the TZ environment variable is not set, the time zone names
retrieved from the system are localized for UI display and may vary
across editions and language packs for the same time zone.
Use the time zone IDs that are invariant across environments instead.
2025-02-19 18:27:32 +09:00
Aaron Patterson
8cafa5b8ce Only count VM instructions in YJIT stats builds
The instruction counter is slowing multi-Ractor applications.  I had
changed it to use a thread local, but using a thread local is slowing
single threaded applications.  This commit only enables the instruction
counter in YJIT stats builds until we can figure out a way to gather the
information with lower overhead.

Co-authored-by: Randy Stauner <randy.stauner@shopify.com>
2025-02-14 14:39:35 -05:00
Nobuyoshi Nakada
4a67ef09cc
[Feature #21116] Extract RJIT as a third-party gem 2025-02-13 18:01:03 +09:00
Jean Boussier
f32d5071b7 Elide string allocation when using String#gsub in MAP mode
If the provided Hash doesn't have a default proc, we know for
sure that we'll never call into user provided code, hence the
string we allocate to access the Hash can't possibly escape.

So we don't actually have to allocate it, we can use a fake_str,
AKA a stack allocated string.

```
compare-ruby: ruby 3.5.0dev (2025-02-10T13:47:44Z master 3fb455adab) +PRISM [arm64-darwin23]
built-ruby: ruby 3.5.0dev (2025-02-10T17:09:52Z opt-gsub-alloc ea5c28958f) +PRISM [arm64-darwin23]
warming up....

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|escape           |      3.374k|    3.722k|
|                 |           -|     1.10x|
|escape_bin       |      5.469k|    6.587k|
|                 |           -|     1.20x|
|escape_utf8      |      3.465k|    3.734k|
|                 |           -|     1.08x|
|escape_utf8_bin  |      5.752k|    7.283k|
|                 |           -|     1.27x|
```
2025-02-12 10:23:50 +01:00
Peter Zhu
3fb455adab Move global symbol reference updating to rb_sym_global_symbols_update_references 2025-02-10 08:47:44 -05:00
Peter Zhu
8d0416ae0b Make ruby_global_symbols movable
The `ids` array and `dsymbol_fstr_hash` were pinned because they were
kept alive by rb_vm_register_global_object. This prevented the GC from
moving them even though there were reference updating code.

This commit changes it to be marked movable by marking it as a root object.
2025-02-10 08:47:44 -05:00