Commit graph

1114 commits

Author SHA1 Message Date
Alan Wu
38558dd95e YJIT: Fix defined?(yield) and block_given? at top level
Previously, YJIT returned truthy for the block given query at the top
level. That's incorrect because the top level script never receives a
block, and `yield` is a syntax error there.

Inside methods, the number of hops to get from `iseq` to
`iseq->body->local_iseq` is the same as the number of
`VM_ENV_PREV_EP(ep)` hops to get to an environment with
`VM_ENV_FLAG_LOCAL`. YJIT and the interpreter both rely on this as can
be seen in get_lvar_level(). However, this identity does not hold for
the top level frame because of vm_set_eval_stack(), which sets up
`TOPLEVEL_BINDING`.

Since only methods can take a block that `yield` goes to, have ISEQs
that are the child of a non-method ISEQ return falsy for the block given
query. This fixes the issue for the top level script and is an
optimization for non-method contexts such as inside `ISEQ_TYPE_CLASS`.
2025-08-14 17:01:11 -04:00
Alan Wu
6e3790b17f YJIT: Fix mismatched_lifetime_syntaxes, new in Rust 1.89.0 2025-08-11 15:49:14 -04:00
Stan Lo
4a70f946a7
ZJIT: Implement SingleRactorMode invalidation (#14121)
* ZJIT: Implement SingleRactorMode invalidation

* ZJIT: Add macro for compiling jumps

* ZJIT: Fix typo in comment

* YJIT: Fix typo in comment

* ZJIT: Avoid using unexported types in zjit.h

`enum ruby_vminsn_type` is declared in `insns.inc` and is not exported.
Using it in `zjit.h` would cause build errors when the file including it
doesn't include `insns.inc`.
2025-08-06 13:51:41 -07:00
Takashi Kokubun
12306c0c6f
ZJIT: Stub JIT-to-JIT calls (#14052) 2025-07-31 12:57:59 -07:00
Takashi Kokubun
2cd10de330
ZJIT: Prepare for sharing JIT hooks with ZJIT (#14044) 2025-07-30 10:11:10 -07:00
Jean Boussier
7ee127d2d1 Get rid of imemo_ast
It has been marked as obsolete for a while and I see no reason
to keep it.
2025-07-29 13:05:12 +02:00
Takashi Kokubun
f1acf47ca2
YJIT: Call YJIT hooks before enabling YJIT (#14032) 2025-07-28 15:17:45 -07:00
Kunshan Wang
5ef20b3a27 YJIT: Use raw memory write to update pointers in code
Because we have set all code memory to writable before the reference
updating phase, we can use raw memory writes directly.
2025-07-24 11:37:44 -04:00
Peter Zhu
f186f2cb70 Remove unused imemo_parser_strterm 2025-07-24 09:49:13 -04:00
Alan Wu
1a20765074 DRY up CARGO_VERBOSE for JITs 2025-07-16 19:50:30 -04:00
Takashi Kokubun
571a8d2753
YJIT: Side-exit on String#dup when it's not leaf (#13921)
* YJIT: Side-exit on String#dup when it's not leaf

* Use an enum instead of a macro for bindgen
2025-07-16 22:59:32 +00:00
Kunshan Wang
3a47f4eacf YJIT: Move RefCell one level down
This is the second part of making YJIT work with parallel GC.

During GC, `rb_yjit_iseq_mark` and `rb_yjit_iseq_update_references` need
to resolve offsets in `Block::gc_obj_offsets` into absolute addresses
before reading or updating the fields.  This needs the base address
stored in `VirtualMemory::region_start` which was previously behind a
`RefCell`.  When multiple GC threads scan multiple iseq simultaneously
(which is possible for some GC modules such as MMTk), it will panic
because the `RefCell` is already borrowed.

We notice that some fields of `VirtualMemory`, such as `region_start`,
are never modified once `VirtualMemory` is constructed.  We change the
type of the field `CodeBlock::mem_block` from `Rc<RefCell<T>>` to
`Rc<T>`, and push the `RefCell` into `VirtualMemory`.  We extract
mutable fields of `VirtualMemory` into a dedicated struct
`VirtualMemoryMut`, and store them in a field `VirtualMemory::mutable`
which is a `RefCell<VirtualMemoryMut>`.  After this change, methods that
access immutable fields in `VirtualMemory`, particularly `base_ptr()`
which reads `region_start`, will no longer need to borrow any `RefCell`.
Methods that access mutable fields will need to borrow
`VirtualMemory::mutable`, but the number of borrowing operations becomes
strictly fewer than before because borrowing operations previously done
in callers (such as `CodeBlock::write_mem`) are moved into methods of
`VirtualMemory` (such as `VirtualMemory::write_bytes`).
2025-07-14 16:21:55 -04:00
Kunshan Wang
51a3ea5ade YJIT: Set code mem permissions in bulk
Some GC modules, notably MMTk, support parallel GC, i.e. multiple GC
threads work in parallel during a GC.  Currently, when two GC threads
scan two iseq objects simultaneously when YJIT is enabled, both threads
will attempt to borrow `CodeBlock::mem_block`, which will result in
panic.

This commit makes one part of the change.

We now set the YJIT code memory to writable in bulk before the
reference-updating phase, and reset it to executable in bulk after the
reference-updating phase.  Previously, YJIT lazily sets memory pages
writable while updating object references embedded in JIT-compiled
machine code, and sets the memory back to executable by calling
`mark_all_executable`.  This approach is inherently unfriendly to
parallel GC because (1) it borrows `CodeBlock::mem_block`, and (2) it
sets the whole `CodeBlock` as executable which races with other GC
threads that are updating other iseq objects.  It also has performance
overhead due to the frequent invocation of system calls.  We now set the
permission of all the code memory in bulk before and after the reference
updating phase.  Multiple GC threads can now perform raw memory writes
in parallel.  We should also see performance improvement during moving
GC because of the reduced number of `mprotect` system calls.
2025-07-14 16:21:55 -04:00
Takashi Kokubun
f5085c70f2
ZJIT: Mark profiled objects when marking ISEQ (#13784) 2025-07-09 16:03:23 -07:00
Alan Wu
0828dff3f8 ZJIT: Codegen for defined?(yield)
Lots of stdlib methods such as Integer#times and Kernel#then use this,
so at least this will make writing tests slightly easier.
2025-06-28 00:03:42 +09:00
Max Bernstein
3a9bf4a2ae
ZJIT: Optimize frozen array aref (#13666)
If we have a frozen array `[..., a, ...]` and a compile-time fixnum index `i`,
we can do the array load at compile-time.
2025-06-23 17:41:49 -05:00
Jean Boussier
fb68721f63 Rename imemo_class_fields -> imemo_fields 2025-06-17 15:28:05 +02:00
Jean Boussier
15084fbc3c Get rid of FL_EXIVAR
Now that the shape_id gives us all the same information, it's no
longer needed.
2025-06-13 23:50:30 +02:00
Jean Boussier
a99d941cac Add SHAPE_ID_HAS_IVAR_MASK for quick ivar check
This allow checking if an object has ivars with just a shape_id
mask.
2025-06-13 19:46:29 +02:00
Jean Boussier
e070d93573 Get rid of rb_shape_lookup 2025-06-12 17:08:22 +02:00
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behave almost exactly as a T_OBJECT, the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namspaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.

Second, it open the door to later make class instance variable
writes atomics, hence be able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespace we should likely mark classes that have
    multiple namespace with a specific shape flag.
2025-06-12 07:58:16 +02:00
Alan Wu
e5c7f1695e YJIT: x86: Fix panic writing 32-bit number with top bit set
Previously, `asm.mov(m32, imm32)` panicked when `imm32 > 0x80000000`. It
attempted to split imm32 into a register before doing the store, but
then the register size didn't match the destination size.

Instead of splitting, use the `MOV r/m32, imm32` form which works for
all 32-bit values. Adjust asserts that assumed that all forms undergo
sign extension, which is not true for this case.

See: 54edc930f9
2025-06-11 19:49:49 +09:00
Jean Boussier
191f6e3b87 Get rid of rb_shape_t.heap_id 2025-06-07 18:30:44 +02:00
Jean Boussier
4e39580992 Refactor raw accesses to rb_shape_t.capacity 2025-06-05 22:06:15 +02:00
Jean Boussier
772fc1f187 Get rid of rb_shape_t.flags
Now all flags are only in the `shape_id_t`, and can all be checked
without needing to dereference a pointer.
2025-06-05 07:44:44 +02:00
Jean Boussier
675f33508c Get rid of TOO_COMPLEX shape type
Instead it's now a `shape_id` flag.

This allows to check if an object is complex without having
to chase the `rb_shape_t` pointer.
2025-06-04 13:13:50 +02:00
Jean Boussier
e27404af9e Use all 32bits of shape_id_t on all platforms
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353]

Even thought `shape_id_t` has been make 32bits, we were still limited
to use only the lower 16 bits because they had to fit alongside `attr_index_t`
inside a `uintptr_t` in inline caches.

By enlarging inline caches we can unlock the full 32bits on all
platforms, allowing to use these extra bits for tagging.
2025-06-03 21:15:41 +02:00
Jean Boussier
e9fd44dd72 shape.c: Implement a lock-free version of get_next_shape_internal
Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
lookup inside `shape->edges`.

To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.

This increases memory allocations, however we expect that
creating new transitions becomes increasingly rare over time.

```ruby
class A
  def initialize(bool)
    @a = 1
    if bool
      @b = 2
    else
      @c = 3
    end
  end

  def test
    @d = 4
  end
end

def bench(iterations)
  i = iterations
  while i > 0
    A.new(true).test
    A.new(false).test
    i -= 1
  end
end

if ARGV.first == "ractor"
  ractors = 8.times.map do
    Ractor.new do
      bench(20_000_000 / 8)
    end
  end
  ractors.each(&:take)
else
  bench(20_000_000)
end
```

The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-02 17:49:53 +02:00
John Hawthorn
d1343e12d2 Use flag for RCLASS_IS_INITIALIZED
Previously we used a flag to set whether a module was uninitialized.
When checked whether a class was initialized, we first had to check that
it had a non-zero superclass, as well as that it wasn't BasicObject.

With the advent of namespaces, RCLASS_SUPER is now an expensive
operation, and though we could just check for the prime superclass, we
might as well take this opportunity to use a flag so that we can perform
the initialized check with as few instructions as possible.

It's possible in the future that we could prevent uninitialized classes
from being available to the user, but currently there are a few ways to
do that.
2025-05-28 11:44:07 -04:00
Jean Boussier
ccf2b7c5b8 Refactor rb_shape_too_complex_p to take a shape_id_t. 2025-05-27 15:34:02 +02:00
Jean Boussier
a59835e1d5 Refactor rb_shape_get_iv_index to take a shape_id_t
Further reduce exposure of `rb_shape_t`.
2025-05-27 15:34:02 +02:00
Jean Boussier
e535f8248b Get rid of rb_shape_id(rb_shape_t *)
We should avoid conversions from `rb_shape_t *` into `shape_id_t`
outside of `shape.c` as the short term goal is to have `shape_id_t`
contain tags.
2025-05-27 12:45:24 +02:00
Jean Boussier
186e60cb68 YJIT: handle opt_aset_with
```
 # frozen_string_ltieral: true
hash["literal"] = value
```
2025-05-15 11:56:24 +02:00
Alan Wu
92b218fbc3 YJIT: ZJIT: Allow both JITs in the same build
This commit allows building YJIT and ZJIT simultaneously, a "combo
build". Previously, `./configure --enable-yjit --enable-zjit` failed. At
runtime, though, only one of the two can be enabled at a time.

Add a root Cargo workspace that contains both the yjit and zjit crate.
The common Rust build integration mechanisms are factored out into
defs/jit.mk.

Combo YJIT+ZJIT dev builds are supported; if either JIT uses
`--enable-*=dev`, both of them are built in dev mode.

The combo build requires Cargo, but building one JIT at a time with only
rustc in release build remains supported.
2025-05-15 00:39:03 +09:00
Takashi Kokubun
53a27f114a
YJIT: Split the block on optimized getlocal/setlocal (#13282) 2025-05-12 09:03:46 -07:00
Satoshi Tagomori
e81d50207b Add yjit/zjit bindings for adding namespace 2025-05-11 23:32:50 +09:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00
Jean Boussier
c9b08882b7 Refactor rb_shape_get_next to return an ID
Also rename it, and change parameters to be consistent with
other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
6c9b3ac232 Refactor OBJ_TOO_COMPLEX_SHAPE_ID to not be referenced outside shape.h
Also refactor checks for `->type == SHAPE_OBJ_TOO_COMPLEX`.
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Ivars will longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.

`field` encompass anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Takashi Kokubun
cbf9c088f8
YJIT: End the block after OPTIMIZE_METHOD_TYPE_CALL (#13245) 2025-05-05 13:35:28 -07:00
Jean Boussier
a3af4e905f Make rb_shape.capacity an attr_index_t 2025-05-05 14:44:49 +02:00
Alan Wu
33909a1c69 YJIT: ZJIT: Share identical glue functions
Working towards having YJIT and ZJIT in the same build, we need to
deduplicate some glue code that would otherwise cause name collision.
Add jit.c for this and build it for YJIT and ZJIT builds. Update bindgen
to look at jit.c; some shuffling of functions in the output, but the set
of functions shouldn't have changed.
2025-05-02 23:47:57 +09:00
Takashi Kokubun
0f3d6ee578
ZJIT: Disable ZJIT instructions when USE_ZJIT is 0 (#13199)
* ZJIT: Disable ZJIT instructions when USE_ZJIT is 0

* Test the order of ZJIT instructions

* Add more jobs that disable JITs

* Show instruction names in the message
2025-04-29 11:03:13 -07:00
Takashi Kokubun
58e3aa0224
ZJIT: Drop trace_zjit_* instructions (#13189) 2025-04-28 09:25:56 -07:00
Rian McGuire
80a1a1bb8a
YJIT: Fix potential infinite loop when OOM (GH-13186)
Avoid generating an infinite loop in the case where:
1. Block `first` is adjacent to block `second`, and the branch from `first` to
   `second` is a fallthrough, and
2. Block `second` immediately exits to the interpreter, and
3. Block `second` is invalidated and YJIT is OOM

While pondering how to fix this, I think I've stumbled on another related edge case:
1. Block `incoming_one` and `incoming_two` both branch to block `second`. Block
   `incoming_one` has a fallthrough
2. Block `second` immediately exits to the interpreter (so it starts with its exit)
3. When Block `second` is invalidated, the incoming fallthrough branch from
   `incoming_one` might be rewritten first, which overwrites the start of block
   `second` with a jump to a new branch stub.
4. YJIT runs of out memory
5. The incoming branch from `incoming_two` is then rewritten, but because we're
   OOM we can't generate a new stub, so we use `second`'s exit as the branch
   target. However `second`'s exit was already overwritten with a jump to the
   branch stub for `incoming_one`, so `incoming_two` will end up jumping to
   `incoming_one`'s branch stub.

Fixes [Bug #21257]
2025-04-28 21:50:29 +09:00