Commit graph

1114 commits

Author SHA1 Message Date
Takashi Kokubun
31bdffb5b9
YJIT: Specialize String#dup (#12090) 2024-11-14 18:15:39 -05:00
Takashi Kokubun
7e2f9eaccd
YJIT: Specialize Integer#pred (#12082) 2024-11-14 12:04:48 -05:00
Takashi Kokubun
30e1d6b5a8
YJIT: Add inline_block_count stat (#12081) 2024-11-13 16:17:29 -05:00
Randy Stauner
beafae9750
YJIT: Specialize String#[] (String#slice) with fixnum arguments (#12069)
* YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments

String#[] is in the top few C calls of several YJIT benchmarks:
liquid-compile rubocop mail sudoku

This speeds up these benchmarks by 1-2%.

* YJIT: Try harder to get type info for `String#[]`

In the large generated code of the mail gem the context doesn't have
the type info.  In that case if we peek at the stack and add a guard
we can still apply the specialization
and it speeds up the mail benchmark by 5%.
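
For illustration, these are the kinds of `String#[]` call shapes the message refers to, i.e. calls whose arguments are small integers (fixnums). The sample string and variable names are made up. The sketch only shows the Ruby-level behavior and says nothing about the code YJIT generates for it.

```ruby
# Hypothetical sample data; only the call shapes matter here.
str = "hello world"

one_char     = str[4]           # single fixnum index
substr       = str[6, 5]        # fixnum start and fixnum length
aliased      = str.slice(0, 5)  # String#slice is an alias of String#[]
out_of_range = str[100]         # an index past the end returns nil

puts one_char     # prints "o"
puts substr       # prints "world"
puts aliased      # prints "hello"
p out_of_range    # prints nil
```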

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
2024-11-13 12:25:09 -05:00
Jean byroot Boussier
6deeec5d45
Mark strings returned by Symbol#to_s as chilled (#12065)
* Use FL_USER0 for ELTS_SHARED

This makes space in RString for two bits for chilled strings.

* Mark strings returned by `Symbol#to_s` as chilled

[Feature #20350]

`STR_CHILLED` now spans two user flags. One bit marks a chilled
string literal; the other marks a chilled string returned by
`Symbol#to_s`.

Since it's not possible, and doesn't make much sense, to include
debug info when `--debug-frozen-string-literal` is set, we can't
include allocation source, but we can safely include the symbol
name in the warning message, making it much easier to find the source
of the issue.
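
A minimal sketch of what this looks like from Ruby code, assuming nothing beyond core `Symbol` and `String` methods. The symbol `:status` and the variable names are made up. Whether a warning is printed depends on the Ruby version and on whether deprecation warnings are enabled, so none is shown here.

```ruby
name = :status.to_s

# A chilled string still reports as not frozen, so existing checks keep working.
p name.frozen?   # prints false

# Mutation still succeeds. On Ruby versions that include this change it may
# additionally emit a warning naming the symbol.
name << "_code"
puts name        # prints "status_code"

# Taking an explicit copy first yields an ordinary mutable string.
safe = :status.to_s.dup
safe << "_ok"
puts safe        # prints "status_ok"
```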

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>

---------

Co-authored-by: Étienne Barrié <etienne.barrie@gmail.com>
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-11-13 09:20:00 -05:00
Peter Zhu
1d1c80e644
Fix false-positive memory leak using Valgrind in YJIT (#12057)
When we run with RUBY_FREE_AT_EXIT, there's a false-positive memory leak
reported in YJIT because the METHOD_CODEGEN_TABLE is never freed. This
commit adds rb_yjit_free_at_exit that is called at shutdown when
RUBY_FREE_AT_EXIT is set.

Reported memory leak:

    ==699816== 1,104 bytes in 1 blocks are possibly lost in loss record 1 of 1
    ==699816==    at 0x484680F: malloc (vg_replace_malloc.c:446)
    ==699816==    by 0x155B3E: UnknownInlinedFun (unix.rs:14)
    ==699816==    by 0x155B3E: UnknownInlinedFun (stats.rs:36)
    ==699816==    by 0x155B3E: UnknownInlinedFun (stats.rs:27)
    ==699816==    by 0x155B3E: alloc (alloc.rs:98)
    ==699816==    by 0x155B3E: alloc_impl (alloc.rs:181)
    ==699816==    by 0x155B3E: allocate (alloc.rs:241)
    ==699816==    by 0x155B3E: do_alloc<alloc::alloc::Global> (alloc.rs:15)
    ==699816==    by 0x155B3E: new_uninitialized<alloc::alloc::Global> (mod.rs:1750)
    ==699816==    by 0x155B3E: fallible_with_capacity<alloc::alloc::Global> (mod.rs:1788)
    ==699816==    by 0x155B3E: prepare_resize<alloc::alloc::Global> (mod.rs:2864)
    ==699816==    by 0x155B3E: resize_inner<alloc::alloc::Global> (mod.rs:3060)
    ==699816==    by 0x155B3E: reserve_rehash_inner<alloc::alloc::Global> (mod.rs:2950)
    ==699816==    by 0x155B3E: hashbrown::raw::RawTable<T,A>::reserve_rehash (mod.rs:1231)
    ==699816==    by 0x5BC39F: UnknownInlinedFun (mod.rs:1179)
    ==699816==    by 0x5BC39F: find_or_find_insert_slot<(usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool), alloc::alloc::Global, hashbrown::map::equivalent_key::{closure_env#0}<usize, usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool>, hashbrown::map::make_hasher::{closure_env#0}<usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool, std::hash::random::RandomState>> (mod.rs:1413)
    ==699816==    by 0x5BC39F: hashbrown::map::HashMap<K,V,S,A>::insert (map.rs:1754)
    ==699816==    by 0x57C5C6: insert<usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool, std::hash::random::RandomState> (map.rs:1104)
    ==699816==    by 0x57C5C6: yjit::codegen::reg_method_codegen (codegen.rs:10521)
    ==699816==    by 0x57C295: yjit::codegen::yjit_reg_method_codegen_fns (codegen.rs:10464)
    ==699816==    by 0x5C6B07: rb_yjit_init (yjit.rs:40)
    ==699816==    by 0x393723: ruby_opt_init (ruby.c:1820)
    ==699816==    by 0x393723: ruby_opt_init (ruby.c:1767)
    ==699816==    by 0x3957D4: prism_script (ruby.c:2215)
    ==699816==    by 0x3957D4: process_options (ruby.c:2538)
    ==699816==    by 0x396065: ruby_process_options (ruby.c:3166)
    ==699816==    by 0x236E56: ruby_options (eval.c:117)
    ==699816==    by 0x15BAED: rb_main (main.c:43)
    ==699816==    by 0x15BAED: main (main.c:62)

After this patch, there are no more memory leaks reported when running
RUBY_FREE_AT_EXIT with Valgrind on an empty Ruby script:

    $ RUBY_FREE_AT_EXIT=1 valgrind --leak-check=full ruby -e ""
    ...
    ==700357== HEAP SUMMARY:
    ==700357==     in use at exit: 0 bytes in 0 blocks
    ==700357==   total heap usage: 36,559 allocs, 36,559 frees, 6,064,783 bytes allocated
    ==700357==
    ==700357== All heap blocks were freed -- no leaks are possible
2024-11-11 20:45:11 +00:00
Alan Wu
dccfab0c53
YJIT: Always abandon the block when gen_branch() or defer_compilation() fails
In [1], we started checking for gen_branch failures, but I made two
crucial mistakes. One, defer_compilation() had the same issue as
gen_branch() but wasn't checked. Two, returning None from a codegen
function does not throw away the block. Checking how gen_single_block()
handles codegen functions, you can see that None terminates the block
with an exit, but does not overall return an Err. This handling is fine
for unimplemented instructions, for example, but incorrect in case
gen_branch() fails. The missing branch essentially corrupts the
block; adding more code after a missing branch doesn't correct the code.

Always abandon the block when defer_compilation() or gen_branch() fails.

[1]: cb661d7d82
Fixup: [1]
2024-11-08 14:09:55 -05:00
Alan Wu
d1969474e9 YJIT: Pass panic message to rb_bug()
So that the Rust panic message is forwarded to the RUBY_CRASH_REPORT
system, instead of only the static "YJIT panicked" message that was sent
previously. This helps with triaging crashes since it's easier than
trying to parse stderr output.

Sample:

    <internal:yjit_hook>:2: [BUG] YJIT: panicked at src/codegen.rs:1197:5:
    explicit panic
    ...
2024-11-08 10:06:47 -05:00
Nobuyoshi Nakada
c690ca03f3 Ignore return value of into_raw_fd
Fix as the compiler orders:
```
warning: unused return value of `into_raw_fd` that must be used
   --> ../src/yjit/src/disasm.rs:123:21
    |
123 |                     file.into_raw_fd(); // keep the fd open
    |                     ^^^^^^^^^^^^^^^^^^
    |
    = note: losing the raw file descriptor may leak resources
    = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
    |
123 |                     let _ = file.into_raw_fd(); // keep the fd open
    |                     +++++++

warning: unused return value of `into_raw_fd` that must be used
  --> ../src/yjit/src/log.rs:84:21
   |
84 |                     file.into_raw_fd(); // keep the fd open
   |                     ^^^^^^^^^^^^^^^^^^
   |
   = note: losing the raw file descriptor may leak resources
help: use `let _ = ...` to ignore the resulting value
   |
84 |                     let _ = file.into_raw_fd(); // keep the fd open
   |                     +++++++
```
2024-11-06 12:37:13 +09:00
Takashi Kokubun
478e0fc710
YJIT: Replace Array#each only when YJIT is enabled (#11955)
* YJIT: Replace Array#each only when YJIT is enabled

* Add comments about BUILTIN_ATTR_C_TRACE

* Make Ruby Array#each available with --yjit as well

* Fix all paths that expect a C location

* Use method_basic_definition_p to detect patches

* Copy a comment about C_TRACE flag to compilers

* Rephrase a comment about add_yjit_hook

* Give METHOD_ENTRY_BASIC flag to Array#each

* Add --yjit-c-builtin option

* Allow inconsistent source_location in test-spec

* Refactor a check of BUILTIN_ATTR_C_TRACE

* Set METHOD_ENTRY_BASIC without touching vm->running
2024-11-04 11:14:28 -05:00
Alan Wu
8e509380a2 YJIT: Make PendingBranch::set_target must_use [ci skip] 2024-10-23 10:20:44 -04:00
Alan Wu
cb661d7d82
YJIT: Check when gen_branch() fails
We got some core dumps in the wild where a PendingBranch had everything
as None, leading to a panic unwrapping in PendingBranch::into_branch().
This happened while compiling a `branchif`.

It seems that the only way this can happen is when core::gen_branch()
fails, but not due to OOM. We wouldn't have reached into_branch() when
OOM, and the only way to not leave markers that would've set the
branch's start_addr to some value in gen_branch() is for set_target() to
fail, causing an early return.

Unfortunately, it's hard to tell the exact sequence of events that led
to this situation, but regardless, the dumps show us that we should
check for errors in gen_branch().

Because gen_branch() is used deep in the stack during compilation (e.g.
guard_known_class() -> jit_chain_guard() -> gen_branch()), it'd be bad
for compile speed to propagate the error everywhere, not to mention the
massive patch required. Opt for a flag checked near the end of
compilation.
2024-10-23 10:17:08 -04:00
Alan Wu
1e59fa2bae YJIT: Count compiled_branch_count when branch is finalized [ci skip] 2024-10-23 09:53:44 -04:00
Takashi Kokubun
0f3723c644
Rewrite Numeric#dup and Numeric#+@ in Ruby (#11933) 2024-10-22 11:01:29 -07:00
Alan Wu
b41c65b577 YJIT: Implement specialization for no-op {Kernel,Numeric}#dup
Type information in the context for no additional work!

This is the `if (special_object_p(obj)) return obj;` path in
rb_obj_dup() and for Numeric#dup, it's always the identity function.
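
As a small Ruby-level illustration of why this path is a no-op: for special constants and numerics, `dup` hands back the receiver itself, whereas an ordinary heap object gets a real copy. The values below are arbitrary examples.

```ruby
# Special constants have nothing to copy, so dup returns the receiver.
p 42.dup.equal?(42)        # prints true
p :sym.dup.equal?(:sym)    # prints true
p nil.dup.equal?(nil)      # prints true

# Numeric#dup is the identity function as well.
f = 1.5
p f.dup.equal?(f)          # prints true

# An ordinary heap object gets a fresh copy instead.
s = String.new("text")
p s.dup.equal?(s)          # prints false
```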
2024-10-22 11:30:35 -04:00
Alan Wu
20c5a3e133
YJIT: Rename method substitution functions and improve docs (+1) (#11919)
* YJIT: Fill in commented-out assertion

* YJIT: Rename yjit_reg_method() and add links in docs
2024-10-21 12:12:24 -04:00
John Hawthorn
7be9a333ca
YJIT: Allow shareable consts in multi-ractor mode (#11917)
* Update yjit-bindgen deps

* YJIT: Allow shareable consts in multi-ractor mode

* Update yjit/src/codegen.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-10-18 15:01:45 -04:00
Alan Wu
cb39283cbf YJIT: In stats, group by resolved C method name
Previously, in the "Top-N most frequent C calls"
section of --yjit-stats output, we printed the class
name of the receiver, not the method owner. This meant
that calls on subclass instances that land on the same
method showed up as different entries.

Similarly, methods called using an alias showed up as
different entries from other aliases.

Group by the resolved method instead.

Test program:

    1.itself; [].itself; true.inspect; true.to_s

Before:

    Top-4 most frequent C calls (80.0% of C calls):
      1 (20.0%): Integer#itself
      1 (20.0%): TrueClass#to_s
      1 (20.0%): TrueClass#inspect
      1 (20.0%): Array#itself

After:

    Top-2 most frequent C calls (80.0% of C calls):
      2 (40.0%): Kernel#itself
      2 (40.0%): TrueClass#to_s
2024-10-17 17:59:27 -04:00
Kevin Menard
158b8cb52e
YJIT: Add compilation log (#11818)
* YJIT: Add `--yjit-compilation-log` flag to print out the compilation log at exit.

* YJIT: Add an option to enable the compilation log at runtime.

* YJIT: Fix a typo in the `IseqPayload` docs.

* YJIT: Add stubs for getting the YJIT compilation log in memory.

* YJIT: Add a compilation log based on a circular buffer to cap the log size.

* YJIT: Allow specifying either a file or directory name for the YJIT compilation log.

The compilation log will be populated as compilation events occur. If a directory is supplied, then a filename based on the PID will be used as the write target. If a file name is supplied instead, the log will be written to that file.

* YJIT: Add JIT compilation of C function substitutions to the compilation log.

* YJIT: Add compilation events to the circular buffer even if output is sent to a file.

Previously, the two modes were treated as being exclusive of one another. However, it could be beneficial to log all events to a file while also allowing for direct access of the last N events via `RubyVM::YJIT.compilation_log`.

* YJIT: Make timestamps the first element in the YJIT compilation log tuple.

* YJIT: Stream log to stderr if `--yjit-compilation-log` is supplied without an argument.

* YJIT: Eagerly compute compilation log messages to avoid hanging on to references that may GC.

* YJIT: Log all compiled blocks, not just the method entry points.

* YJIT: Remove all compilation events other than block compilation to slim down the log.

* YJIT: Replace circular buffer iterator with a consuming loop.

* YJIT: Support `--yjit-compilation-log=quiet` as a way to activate the in-memory log without printing it.

Co-authored-by: Randy Stauner <randy.stauner@shopify.com>

* YJIT: Promote the compilation log to being the one YJIT log.

Co-authored-by: Randy Stauner <randy.stauner@shopify.com>

* Update doc/yjit/yjit.md

* Update doc/yjit/yjit.md
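
The circular-buffer idea mentioned in the list above can be sketched as follows. This is a generic Ruby illustration of capping a log at the last N entries with the timestamp stored first, not YJIT's Rust implementation. `BoundedLog` and its methods are hypothetical names.

```ruby
# Hypothetical bounded log: keeps only the most recent `capacity` entries.
class BoundedLog
  def initialize(capacity)
    raise ArgumentError, "capacity must be positive" unless capacity.positive?

    @capacity = capacity
    @slots = Array.new(capacity)
    @next = 0   # index where the next entry will be written
    @size = 0   # number of valid entries, at most @capacity
  end

  # Store [timestamp, message], overwriting the oldest entry once full.
  def record(message, timestamp = Time.now.to_f)
    @slots[@next] = [timestamp, message]
    @next = (@next + 1) % @capacity
    @size = [@size + 1, @capacity].min
  end

  # Entries ordered from oldest to newest.
  def entries
    start = (@next - @size) % @capacity
    Array.new(@size) { |i| @slots[(start + i) % @capacity] }
  end
end

log = BoundedLog.new(3)
%w[a b c d e].each_with_index { |msg, i| log.record(msg, i.to_f) }
p log.entries.map(&:last)   # prints ["c", "d", "e"]
```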

---------

Co-authored-by: Randy Stauner <randy.stauner@shopify.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2024-10-17 21:36:43 +00:00
John Bampton
5e799cc182
Fix spelling 2024-10-11 15:16:05 +00:00
Alan Wu
ded078c2c4
YJIT: Fastpath for Module#name (#11819)
Module#name shows up as a top C method callee in lobsters so probably
common enough. It's also easy to substitute thanks to rb_mod_name()
already having no GC yield points.

    klass = BasicObject
    50_000_000.times { klass.name }

    Benchmark 1: /.rubies/post/bin/ruby --yjit mod_name.rb
      Time (mean ± σ):      1.433 s ±  0.010 s    [User: 1.410 s, System: 0.010 s]
      Range (min … max):    1.421 s …  1.449 s    10 runs

    Benchmark 2: /.rubies/mstr/bin/ruby --yjit mod_name.rb
      Time (mean ± σ):      1.491 s ±  0.012 s    [User: 1.468 s, System: 0.010 s]
      Range (min … max):    1.470 s …  1.511 s    10 runs

    Summary
      /.rubies/post/bin/ruby --yjit mod_name.rb ran
        1.04 ± 0.01 times faster than /.rubies/mstr/bin/ruby --yjit mod_name.rb
2024-10-08 11:44:59 -04:00
Takashi Kokubun
35711903f2
YJIT: Add --yjit-mem-size option (#11810)
* YJIT: Add --yjit-mem-size option

* Improve --help

* s/the region/this virtual memory region/

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>

---------

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-10-07 13:07:23 -04:00
Matt Valentine-House
8e7df4b7c6 Rename size_pool -> heap
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap, so that Ruby contains multiple heaps, with
different sized objects.

The term heap as a collection of memory pages is more in line with memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.

The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.

Since then various changes have happened in Ruby's memory layout:

* The concept of tomb heaps has been replaced by a global free pages list,
  with each page having its slot size reconfigured at the point when it
  is resurrected
* The eden heap has been inlined into the size pool itself, so that now
  the size pool directly controls the free_pages list, the sweeping
  page, the compaction cursor and the other state that was previously
  being managed by the eden heap.

Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool
2024-10-03 21:20:09 +01:00
Alan Wu
2f5ab4c4b8 YJIT: Merge impl VALUE blocks [ci skip]
Reported by Kevin Menard.
2024-10-02 13:47:35 -04:00
whtsht
af63b4f8b7
Return an Iterator Instead of a Vector in addrs_to_pages Method (#11725)
* Returning an iterator instead of a vec

* Avoid changing the meaning of end_page

---------

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2024-09-30 16:00:54 -07:00
Takashi Kokubun
505206b8ac
YJIT: Cache Context decoding (#11680) 2024-09-25 12:18:13 -04:00
Takashi Kokubun
48b3386f6a Fix a typo 2024-09-23 16:40:20 -07:00
Randy Stauner
7c4b028435
YJIT: Accept key for runtime_stats to return only that stat (#11536) 2024-09-17 20:06:27 -04:00
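
A hedged usage sketch of the keyed form described in the title: the helper name `yjit_stat` is made up, and `:compile_time_ns` is used only because it is mentioned elsewhere in this log as a `runtime_stats` entry. The guards are there because YJIT may be absent or disabled, and older Rubies may not accept a key argument.

```ruby
# Hypothetical helper: returns a single YJIT stat, or nil when it cannot be read.
def yjit_stat(key)
  # YJIT may not be compiled in, or may not be enabled for this process.
  return nil unless defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?

  RubyVM::YJIT.runtime_stats(key)
rescue ArgumentError
  # Assumption: Rubies that predate this change reject the key argument.
  nil
end

p yjit_stat(:compile_time_ns)
```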
Takashi Kokubun
f250296efa
YJIT: Speed up block_assumptions_free (#11556) 2024-09-05 12:39:57 -07:00
Étienne Barrié
bf9879791a Optimized instruction for Hash#freeze
If a Hash which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_hash_freeze`.

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Étienne Barrié
a99707cd9c Optimized instruction for Array#freeze
If an Array which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_ary_freeze`.
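
One way to check whether a given Ruby build emits these instructions is to disassemble a small snippet, as sketched below. The helper name `emits?` is made up. The result depends on the Ruby version in use, so no expected output is shown.

```ruby
# Hypothetical helper: true if compiling `source` produces the named instruction.
def emits?(source, insn)
  # RubyVM is specific to CRuby; report false on other implementations.
  return false unless defined?(RubyVM::InstructionSequence)

  RubyVM::InstructionSequence.compile(source).disasm.include?(insn)
end

p emits?("x = [].freeze", "opt_ary_freeze")
p emits?("x = {}.freeze", "opt_hash_freeze")
```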

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Randy Stauner
942317ebf8
YJIT: Encode doubles to VALUE objects and move stat generation to rust (#11388)
* YJIT: Encode doubles to VALUE objects and move stat generation to rust

Stats that can now be generated from rust have been moved there.

* Move object_shape_count call for runtime_stats to rust

This reduces the ruby method to a single primitive.

* Change hash_aset_usize from macro to function
2024-08-27 22:24:17 -04:00
Takashi Kokubun
5b129c899a
YJIT: Pass method arguments using registers (#11280)
* YJIT: Pass method arguments using registers

* s/at_current_insn/at_compile_target/

* Implement register shuffle
2024-08-27 17:04:43 -07:00
Alan Wu
525008cd78
Delete newarraykwsplat
The pushtoarraykwsplat instruction was designed to replace newarraykwsplat,
and we now meet the condition for deletion mentioned in
77c1233f79.
2024-08-13 20:56:35 +00:00
Takashi Kokubun
77ffdfe79f
YJIT: Allow tracing fallback counters (#11347)
* YJIT: Allow tracing fallback counters

* Update yjit.md about --yjit-trace-exits=counter
2024-08-08 16:13:16 -07:00
Peter Zhu
0bff07644b
Make YJIT a GC root rather than an object (#11343)
YJIT currently uses the YJIT root object to mark objects during GC and
update references during compaction. This object otherwise serves no
purpose.

This commit changes YJIT to be a step when marking the GC roots. This
saves some memory from being allocated from the system and the GC.
2024-08-08 12:19:35 -04:00
Kevin Menard
04a6165ac0
YJIT: Enhance the String#<< method substitution to handle integer codepoint values. (#11032)
* Document why we need to explicitly spill registers.

* Simplify passing a byte value to `str_buf_cat`.

* YJIT: Enhance the `String#<<` method substitution to handle integer codepoint values.

* YJIT: Move runtime type check into YJIT.

Performing the check in YJIT means we can make assumptions about the type. It also improves correctness of stack traces in cases where the codepoint argument is not a String or a Fixnum.
2024-08-02 15:45:22 -04:00
Takashi Kokubun
70b4f45d9f
YJIT: Decouple Context from encoding details (#11283) 2024-07-31 10:51:40 -04:00
Randy Stauner
acbb8d4fb5 Expand opt_newarray_send to support Array#pack with buffer keyword arg
Use an enum for the method arg instead of needing to add an id
that doesn't map to an actual method name.

$ ruby --dump=insns -e 'b = "x"; [v].pack("E*", buffer: b)'

before:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 newarray                               1
0009 putchilledstring                       "E*"
0011 getlocal_WC_0                          b@0
0013 opt_send_without_block                 <calldata!mid:pack, argc:2, kw:[#<Symbol:0x000000000023110c>], KWARG>
0015 leave
```

after:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 putchilledstring                       "E*"
0009 getlocal                               b@0, 0
0012 opt_newarray_send                      3, 5
0015 leave
```
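
For reference, the Ruby-level behavior of the call being optimized is sketched below, with a concrete float in place of the undefined `v` from the command above. The buffer is created with `String.new` so it is explicitly mutable. Variable names are arbitrary.

```ruby
buffer = String.new("x")   # explicitly mutable, unlike a chilled literal

# "E" packs a double in little-endian byte order; the packed bytes are
# appended to the string passed as `buffer:`.
[1.5].pack("E*", buffer: buffer)

p buffer.bytesize                        # prints 9 (1 existing byte + 8 packed)
p buffer.byteslice(1, 8).unpack1("E")    # prints 1.5
```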
2024-07-29 16:26:58 -04:00
Takashi Kokubun
8df74deab1 YJIT: Tweak a comment a little [ci skip] 2024-07-18 13:03:17 -07:00
Takashi Kokubun
2de8b5b805
YJIT: Allow dev_nodebug to disasm release-mode code (#11198)
* YJIT: Allow dev_nodebug to disasm release-mode code

* Revert "YJIT: Squash canary before falling back"

This reverts commit f05ad373d8.
The stray canary issue should have been solved by
def7023ee4, alleviating this codegen
accommodation.

* s/runtime_assertions/runtime_checks/

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-07-18 13:01:47 -07:00
Maxime Chevalier-Boisvert
d989bc54e2
YJIT: split chain_depth and flag booleans in context (#11169)
Split these values to avoid using a bit mask in the context
Use variable length encoding to save a few bits on chain depth
2024-07-15 14:45:18 -04:00
Takashi Kokubun
ec773e15f4
YJIT: Local variable register allocation (#11157)
* YJIT: Local variable register allocation

* locals are not stack temps

* Rename RegTemps to RegMappings

* Rename RegMapping to RegOpnd

* Rename local_size to num_locals

* s/stack value/operand/

* Rename spill_temps() to spill_regs()

* Clarify when num_locals becomes None

* Mention that InsnOut uses different registers

* Rename get_reg_mapping to get_reg_opnd

* Resurrect --yjit-temp-regs capability

* Use MAX_CTX_TEMPS and MAX_CTX_LOCALS
2024-07-15 10:56:57 -04:00
Maxime Chevalier-Boisvert
3fbf9df39a
YJIT: increase context cache size to 1024 redux (#11140)
* YJIT: increase context cache size to 1024 redux

* Move context hashing code outside of unsafe block

* Avoid allocating large table on the stack, which would cause a stack overflow

Co-authored by Alan Wu @XrXr
2024-07-11 19:01:05 +00:00
Maxime Chevalier-Boisvert
48e7112baa
YJIT: increase context cache size to 1024 (#10983)
* YJIT: increase context cache size to 1024

The other day I ran into a mysterious bug while increasing the
cache size to 1024. I was not able to reproduce this locally.
Opening this PR for testing/debugging.

* Add extra debug assertions

* Add more comments to context code

* Update yjit/src/core.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

* Update yjit/src/core.rs

* Comment out potentially problematic assertion

* Revert cache size to 512 so we can merge other changes

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-07-10 19:45:23 +00:00
Alan Wu
3be9ce3cf6
YJIT: dump-disasm: Print comments and bytes in release builds
This change implements a fallback mode for the `--yjit-dump-disasm`
development command-line option to make it usable in release builds.
Previously, using the option with release builds of YJIT yielded only
a warning asking the user to build with `--enable-yjit=dev`.

While builds that use the `disasm` feature still give the best output,
just having the comments is useful enough for many kinds of debugging.
Having it usable in release builds is nice for new hackers, too, since
this allows for tinkering without having to learn how to build YJIT in
development mode.

Sample output on A64:

```
  # regenerate_branch
  # Insn: 0001 opt_send_without_block (stack_size: 1)
  # guard known object with singleton class
  0x11f7e0034: 4b 00 00 58 03 00 00 14 08 ce 9c 04 01 00 00
  0x11f7e0043: 00 3f 00 0b eb 81 06 01 54 1f 20 03 d5
  # RUBY_VM_CHECK_INTS(ec)
  0x11f7e0050: 8b 02 42 b8 cb 07 01 35
  # stack overflow check
  0x11f7e0058: ab 62 02 91 7f 02 0b eb 69 07 01 54
  # save PC to CFP
  0x11f7e0064: 0b 3b 9a d2 2b 2f a0 f2 0b 00 cc f2 6b 02 00
  0x11f7e0073: f8 ab 82 00 91
```

To ensure this feature doesn't incur too much cost when running without
the `--yjit-dump-disasm` option, I checked that there is no significant
impact to compile time and memory usage with the `compile_time_ns` and
`yjit_alloc_size` entries in `RubyVM::YJIT.runtime_stats`. For each
sample, I ran 3 iterations of the `lobsters` YJIT benchmark. The
statistics summary was done with the `summary` function in R.

Compile time, sample size of 60, lower is better:

```
       Before              After
 Min.   :2.054e+09   Min.   :2.028e+09
 1st Qu.:2.069e+09   1st Qu.:2.044e+09
 Median :2.081e+09   Median :2.060e+09
 Mean   :2.089e+09   Mean   :2.066e+09
 3rd Qu.:2.109e+09   3rd Qu.:2.085e+09
 Max.   :2.146e+09   Max.   :2.144e+09
```

Allocation size, sample size of 20, lower is better:

```
       Before             After
 Min.   :21804742   Min.   :21794082
 1st Qu.:21826682   1st Qu.:21816282
 Median :21844042   Median :21826814
 Mean   :21960664   Mean   :22026291
 3rd Qu.:21861228   3rd Qu.:22040439
 Max.   :22587426   Max.   :22930614
```

The `yjit_alloc_size` samples are noisy, but since the average increased
by only 0.3%, and the median is lower, I feel safe saying that there is
no significant change.
2024-07-08 20:02:30 +00:00
Alan Wu
b160a78d6b YJIT: Remove done TODO, fix indent
Type check now done in rb_iseqw_to_iseq().
2024-07-03 19:10:57 -04:00
Kevin Menard
3407565d2f
YJIT: Use a special breakpoint address if one isn't explicitly supplied in order to support natural line stepping. (#11083)
Use a special breakpoint address if one isn't explicitly supplied in order to support natural line stepping.

ARM64 will not increment the program counter (PC) upon hitting a breakpoint instruction. Consequently, stepping through code with a debugger ends up looping back to the breakpoint instruction. LLDB has a special breakpoint address of 0xf000 that will increment the PC and allow the debugger to work as expected. This change makes it possible to debug YJIT generated code on ARM64.

More details at: https://discourse.llvm.org/t/stepping-over-a-brk-instruction-on-arm64/69766/8

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2024-07-02 15:55:17 -04:00
Gabriel Lacroix
4d94d28a4a
YJIT: Inline simple ISEQs with unused keyword parameters
This commit expands inlining for simple ISeqs to accept
callees that have unused keyword parameters and callers
that specify unused keywords. The following shows 2 new
callsites that will be inlined:

```ruby
def let(a, checked: true) = a

let(1)
let(1, checked: false)
```

Co-authored-by: Kaan Ozkan <kaan.ozkan@shopify.com>
2024-07-02 18:34:48 +00:00
Aaron Patterson
a2c27bae96 [YJIT] Don't expand kwargs on forwarding
Similarly to splat arrays, we shouldn't expand splat kwargs.

[ruby-core:118401]
2024-06-29 11:25:59 -06:00