1240 commits

Author SHA1 Message Date
Nobuyoshi Nakada
aad9fa2853
Use RB_VM_LOCKING 2025-05-25 15:22:43 +09:00
Jean Boussier
186e60cb68 YJIT: handle opt_aset_with
```
# frozen_string_literal: true
hash["literal"] = value
```
2025-05-15 11:56:24 +02:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
9966de11fb Refactor rb_shape_get_next_iv_shape to take and return ids. 2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
6c9b3ac232 Refactor OBJ_TOO_COMPLEX_SHAPE_ID to not be referenced outside shape.h
Also refactor checks for `->type == SHAPE_OBJ_TOO_COMPLEX`.
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Instance variables will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

`field` encompasses anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Jean Boussier
3ec7bfff2e Use a set_table for rb_vm_struct.unused_block_warning_table
Now that we have a hash-set implementation we can use that
instead of a hash-table with a static value.
2025-04-27 11:59:28 +02:00
Jean Boussier
c0417bd094 Use set_table to track const caches
Now that we have a `set_table` implementation, we can
use it to track const caches and save some memory.

We could even save some more memory if `numtable` didn't
store a copy of the `hash` and instead recomputed it every
time, but this is a quick win.
2025-04-26 12:10:32 +02:00
Nobuyoshi Nakada
5dc155351a
Do not allocate new objects at machine stack overflow 2025-04-24 17:28:18 +09:00
Nobuyoshi Nakada
c218862d3c
Fix style [ci skip] 2025-04-19 22:02:10 +09:00
Xavier Noria
c5c0bb5afc Restore the original order of const_added and inherited callbacks
Originally, if a class was defined with the class keyword, the cref had a
const_added callback, and the superclass an inherited callback, const_added was
called first, and inherited second.

This was discussed in

    https://bugs.ruby-lang.org/issues/21143

and an attempt at changing this order was made.

While both constant assignment and inheritance have happened before these
callbacks are invoked, it was deemed nice to have the same order as in

    C = Class.new

This was mostly for alignment: In that last use case things happen at different
times and therefore the order of execution is kind of obvious, whereas when the
class keyword is involved, the order is opaque to the user and it is up to the
interpreter.

However, soon in

    https://bugs.ruby-lang.org/issues/21193

Matz decided to play safe and keep the existing order.
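
The order in question can be observed from Ruby itself. The following is a minimal sketch, assuming a Ruby that supports `const_added` (3.2 or later); `EVENTS`, `Base`, `Namespace`, and `Child` are hypothetical names used only for illustration, and the recorded order depends on which of the commits discussed here the interpreter includes.

```ruby
EVENTS = []

class Base
  # Called on the superclass when a subclass is created.
  def self.inherited(subclass)
    EVENTS << [:inherited, subclass.name]
    super
  end
end

module Namespace
  # Called on the namespace (the cref) when a constant is added to it.
  def self.const_added(name)
    EVENTS << [:const_added, name]
    super
  end

  # Defining a class with the `class` keyword triggers both callbacks.
  class Child < Base
  end
end

# Prints the callbacks in the order the interpreter invoked them.
p EVENTS
```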

This reverts commits:

    de097fbe5f
    de48e47ddf
2025-04-10 10:20:31 +02:00
Jeremy Evans
67d1dd2ebd Avoid array allocation for *nil, by not calling nil.to_a
A method call such as:

```ruby
a(*nil)
```

previously allocated an array, because it calls `nil.to_a`, but I have
determined this array allocation is unnecessary.  The instructions in
this case are:

```
0000 putself                                                          (   1)[Li]
0001 putnil
0002 splatarray                             false
0004 opt_send_without_block                 <calldata!mid:a, argc:1, ARGS_SPLAT|FCALL>
0006 leave
```

The method call uses `ARGS_SPLAT` without `ARGS_SPLAT_MUT`, so the
returned array doesn't need to be mutable.  I believe all cases where
`splatarray false` are used allow the returned object to be frozen,
since the `false` means to not duplicate the array.  The optimization
in this case is to have `splatarray false` push a shared empty frozen
array, instead of calling `nil.to_a` to return a newly allocated array.

There is a slight backwards incompatibility with this optimization,
in that `nil.to_a` is not called.  However, I believe the new behavior
of `*nil` not calling `nil.to_a` is more consistent with how `**nil`
does not call `nil.to_hash`.  Also, so much Ruby code would break if
`nil.to_a` returned something different from the empty array, that it's
difficult to imagine anyone actually doing that in real code, though
we have a few tests/specs for that.

I think it would be bad for consistency if `*nil` called `nil.to_a`
in some cases and not others, so this changes other cases to not
call `nil.to_a`:

For `[*nil]`, this uses `splatarray true`, which now allocates a
new array for a `nil` argument without calling `nil.to_a`.

For `[1, *nil]`, this uses `concattoarray`, which now returns
the first array if the second array is `nil`.

This updates the allocation tests to check that the array allocations
are avoided where possible.
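
The observable results of the three forms discussed above stay the same; only the allocations (and the `nil.to_a` call) change. A small sketch, where `a` is a hypothetical method that simply returns its rest arguments:

```ruby
# Hypothetical method used only to make the splat result visible.
def a(*args)
  args
end

p a(*nil)     # => []
p [*nil]      # => []
p [1, *nil]   # => [1]
```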

Implements [Feature #21047]
2025-03-27 11:17:40 -07:00
Jean Boussier
de097fbe5f Trigger inherited and const_set callbacks after const has been defined
[Misc #21143]
[Bug #21193]

The previous change caused a backward compatibility issue with code
that called `Object.const_source_location` from the `inherited` callback.

To fix this, the order is now:

- Define the constant
- Invoke `inherited`
- Invoke `const_set`
2025-03-20 18:18:11 +01:00
Jean Boussier
de48e47ddf Invoke inherited callbacks before const_added
[Misc #21143]

Conceptually this makes sense and is more consistent with using
the `Name = Class.new(Superclass)` alternative method.

However the new class is still named before `inherited` is called.
2025-03-14 09:51:57 +01:00
Alan Wu
08b3a45bc9 Push a real iseq in rb_vm_push_frame_fname()
Previously, vm_make_env_each() (used during proc
creation and for the debug inspector C API) picked up the
non-GC-allocated iseq that rb_vm_push_frame_fname() creates,
which led to a SEGV when the GC tried to mark the non GC object.

Put a real iseq imemo instead. Speed should be about the same since
the old code also did an imemo allocation and a malloc allocation.

Real iseq allows ironing out the special-casing of dummy frames in
rb_execution_context_mark() and rb_execution_context_update(). A check
is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy
iseqs.

[Bug #21180]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-03-12 15:00:26 -04:00
Nobuyoshi Nakada
4a67ef09cc
[Feature #21116] Extract RJIT as a third-party gem 2025-02-13 18:01:03 +09:00
Peter Zhu
f65a6c090c Fix use-after-free in constant cache
[Bug #20921]

When we create a cache entry for a constant, the following sequence of
events could happen:

- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID
  of the constant. Assume the ST table exists because another iseq also
  holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which
  could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for
  this ID.
- In remove_from_constant_cache, it will appear that the ST table is now
  empty because there are no more iseq with cache entries for this ID, so
  we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has
  been freed so we now have a use-after-free.

This issue is very hard to reproduce, because it requires that the GC runs
at a very specific time. However, we can make it show up by applying this
patch which runs GC right before the st_insert to mimic the st_insert
triggering a GC:

    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f06f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
            rb_id_table_insert(const_cache, id, (VALUE)ics);
        }

    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
        st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
    }

And if we run this script:

    Object.const_set("MyConstant", "Hello!")

    my_proc = eval("-> { MyConstant }")
    my_proc.call

    my_proc = eval("-> { MyConstant }")
    my_proc.call

We can see that ASAN outputs a use-after-free error:

    ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
    READ of size 8 at 0x606000049528 thread T0
        #0 0x102f3cea8 in do_hash st.c:321
        #1 0x102f3ddd0 in rb_st_insert st.c:1132
        #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
        #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #5 0x1030bc1e0 in vm_exec_core insns.def:263
        #6 0x1030b55fc in rb_vm_exec vm.c:2585
        #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #8 0x102a82588 in rb_ec_exec_node eval.c:281
        #9 0x102a81fe0 in ruby_run_node eval.c:319
        #10 0x1027f3db4 in rb_main main.c:43
        #11 0x1027f3bd4 in main main.c:68
        #12 0x183900270  (<unknown module>)

    0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
    freed by thread T0 here:
        #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
        #1 0x102ada89c in rb_gc_impl_free default.c:8183
        #2 0x102ada7dc in ruby_sized_xfree gc.c:4507
        #3 0x102ac4d34 in ruby_xfree gc.c:4518
        #4 0x102f3cb34 in rb_st_free_table st.c:663
        #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
        #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
        #7 0x102bbd2a0 in rb_iseq_free iseq.c:166
        #8 0x102b32ed0 in rb_imemo_free imemo.c:564
        #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
        #10 0x102af4290 in gc_sweep_plane default.c:3546
        #11 0x102af3bdc in gc_sweep_page default.c:3634
        #12 0x102aeb140 in gc_sweep_step default.c:3906
        #13 0x102aeadf0 in gc_sweep_rest default.c:3978
        #14 0x102ae4714 in gc_sweep default.c:4155
        #15 0x102af8474 in gc_start default.c:6484
        #16 0x102afbe30 in garbage_collect default.c:6363
        #17 0x102ad37f0 in rb_gc_impl_start default.c:6816
        #18 0x102ad3634 in rb_gc gc.c:3624
        #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
        #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #22 0x1030bc1e0 in vm_exec_core insns.def:263
        #23 0x1030b55fc in rb_vm_exec vm.c:2585
        #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #25 0x102a82588 in rb_ec_exec_node eval.c:281
        #26 0x102a81fe0 in ruby_run_node eval.c:319
        #27 0x1027f3db4 in rb_main main.c:43
        #28 0x1027f3bd4 in main main.c:68
        #29 0x183900270  (<unknown module>)

    previously allocated by thread T0 here:
        #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
        #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
        #2 0x102acee44 in ruby_xmalloc gc.c:4438
        #3 0x102f3c85c in rb_st_init_table_with_size st.c:571
        #4 0x102f3c900 in rb_st_init_table st.c:600
        #5 0x102f3c920 in rb_st_init_numtable st.c:608
        #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
        #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #9 0x1030bc1e0 in vm_exec_core insns.def:263
        #10 0x1030b55fc in rb_vm_exec vm.c:2585
        #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #12 0x102a82588 in rb_ec_exec_node eval.c:281
        #13 0x102a81fe0 in ruby_run_node eval.c:319
        #14 0x1027f3db4 in rb_main main.c:43
        #15 0x1027f3bd4 in main main.c:68
        #16 0x183900270  (<unknown module>)

This commit fixes this bug by adding an inserting_constant_cache_id field
to the VM, which stores the ID that is currently being inserted. In
remove_from_constant_cache, we don't free the ST table whose ID is equal
to this one.
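
The guard can be modelled outside the VM. What follows is only a toy Ruby sketch of the idea, not the C implementation: `ConstCacheTracker`, `track`, `remove`, and `tracked?` are hypothetical names, and the block passed to `track` stands in for a GC that runs in the middle of the insertion.

```ruby
require "set"

# Toy model: one Set of cache entries per constant ID.
class ConstCacheTracker
  def initialize
    @tables = {}          # id => Set of cache entries
    @inserting_id = nil   # ID whose table is currently being inserted into
  end

  def track(id, entry)
    table = (@tables[id] ||= Set.new)
    @inserting_id = id
    yield if block_given? # stand-in for a GC triggered during the insert
    table << entry
  ensure
    @inserting_id = nil
  end

  def remove(id, entry)
    table = @tables[id]
    return unless table
    table.delete(entry)
    # The guard: never discard the table that an insert is still using.
    @tables.delete(id) if table.empty? && id != @inserting_id
  end

  def tracked?(id, entry)
    table = @tables[id]
    !table.nil? && table.include?(entry)
  end
end

tracker = ConstCacheTracker.new
tracker.track(:MyConstant, :ic1)
# The "GC" removes the only other entry while :ic2 is being inserted.
tracker.track(:MyConstant, :ic2) { tracker.remove(:MyConstant, :ic1) }
p tracker.tracked?(:MyConstant, :ic2) # => true
```

Without the `id != @inserting_id` check, the table would be dropped from `@tables` during the simulated GC and the new entry would land in a table nobody references, which is the analogue of the use-after-free described above.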

Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
2024-11-29 10:46:43 -05:00
Randy Stauner
1dd40ec18a
Optimize instructions when creating an array just to call include? (#12123)
* Add opt_duparray_send insn to skip the allocation on `#include?`

If the method isn't going to modify the array we don't need to copy it.
This avoids the allocation / array copy for things like `[:a, :b].include?(x)`.

This adds a BOP for include? and tracks redefinition for it on Array.

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>

* YJIT: Implement opt_duparray_send include_p

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>

* Update opt_newarray_send to support simple forms of include?(arg)

Similar to opt_duparray_send but for non-static arrays.

* YJIT: Implement opt_newarray_send include_p
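
Whether a given interpreter applies these optimizations can be checked by disassembling a snippet. This is a rough sketch that assumes CRuby (for `RubyVM::InstructionSequence`); `insn_used?` is a hypothetical helper, and the output depends on the Ruby version, so no expected result is shown.

```ruby
# CRuby-only: compile a snippet and look for an instruction name in the disassembly.
def insn_used?(source, insn_name)
  RubyVM::InstructionSequence.compile(source).disasm.include?(insn_name)
end

literal_array = "x = :a; [:a, :b].include?(x)"     # static array literal
dynamic_array = "a = 1; b = 2; [a, b].include?(1)" # array built from locals

puts "opt_duparray_send used: #{insn_used?(literal_array, 'opt_duparray_send')}"
puts "opt_newarray_send used: #{insn_used?(dynamic_array, 'opt_newarray_send')}"
```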

---------

Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>
2024-11-26 14:31:08 -05:00
Maximillian Polhill
56fbf64a53 Fix vm_objtostring optimization for Symbol
Co-authored-by: John Hawthorn <john@hawthorn.email>
2024-11-25 17:29:58 -08:00
Koichi Sasada
ab7ab9e450 Warning[:strict_unused_block]
to show unused block warning strictly.

```ruby
class C
  def f = nil
end

class D
  def f = yield
end

[C.new, D.new].each{|obj| obj.f{}}
```

In this case, `D#f` accepts a block, but `C#f` doesn't.
Some code passes a block with `obj.f{}` where `obj` may be
a `C` or a `D`. To avoid warnings in such cases, the
"unused block warning" is emitted only if no method with the
same name accepts a block.
In the above example, `C.new.f{}` doesn't show any warning
because there is a same-named method, `D#f`, which accepts a block.

We call this default behavior "relax mode".

The new `strict_unused_block` warning category switches from
"relax mode" to "strict mode": same-named methods are not
checked, and `C.new.f{}` produces a warning.
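
A minimal sketch of switching to strict mode at run time, assuming the category is set through `Warning[]=` as the commit title suggests. Rubies that lack the category raise `ArgumentError` for an unknown warning category, so the sketch rescues that case; `strict_enabled` is just an illustrative variable name.

```ruby
# Try to enable the category; older Rubies reject unknown categories.
strict_enabled =
  begin
    Warning[:strict_unused_block] = true
    true
  rescue ArgumentError
    false
  end

puts "strict_unused_block enabled: #{strict_enabled}"
```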

[Feature #15554]
2024-11-06 11:06:18 +09:00
Nobuyoshi Nakada
abfefd8e0c
Define VM_ASSERT_TYPE macros 2024-10-31 22:12:16 +09:00
John Hawthorn
7be9a333ca
YJIT: Allow shareable consts in multi-ractor mode (#11917)
* Update yjit-bindgen deps

* YJIT: Allow shareable consts in multi-ractor mode

* Update yjit/src/codegen.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

---------

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-10-18 15:01:45 -04:00
Nobuyoshi Nakada
6f6735898a
Cast via uintptr_t function pointer between object pointer 2024-10-10 11:29:57 +09:00
Nobuyoshi Nakada
49ccc31d90 Add a macro to initialize union cfunc_type
```
vm_insnhelper.c:2430:49: error: ISO C prohibits argument conversion to union type [-Wpedantic]
 2430 |     if (!vm_method_cfunc_is(cd_owner, cd, recv, rb_obj_equal)) {
      |                                                 ^~~~~~~~~~~~
vm_insnhelper.c:2448:42: error: ISO C prohibits argument conversion to union type [-Wpedantic]
 2448 |     if (cc && check_cfunc(vm_cc_cme(cc), rb_obj_equal)) {
      |                                          ^~~~~~~~~~~~
```
and so on.
2024-10-08 23:29:49 +09:00
Nobuyoshi Nakada
9a90cd2284 Cast via uintptr_t function pointer between object pointer
- ISO C forbids conversion of function pointer to object pointer type
- ISO C forbids conversion of object pointer to function pointer type
2024-10-08 23:29:49 +09:00
Peter Zhu
a6cf132475 Revert "Add debugging code to vm_objtostring in ASAN"
This reverts commit c32fd1b5ed.

The bug seems to have been fixed with 6acf03618a.
2024-10-07 10:47:30 -04:00
Peter Zhu
b77772496a Don't check poisoned for immediates 2024-09-25 11:14:14 -04:00
Peter Zhu
c32fd1b5ed Add debugging code to vm_objtostring in ASAN
To debug this issue on CI:
http://ci.rvm.jp/logfiles/brlog.trunk_asan.20240922-002945
2024-09-25 11:00:04 -04:00
Étienne Barrié
bf9879791a Optimized instruction for Hash#freeze
If a Hash which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_hash_freeze`.

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Étienne Barrié
a99707cd9c Optimized instruction for Array#freeze
If an Array which is empty or only using literals is frozen, we detect
this as a peephole optimization and change the instructions to be
`opt_ary_freeze`.

[Feature #20684]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-09-05 12:46:02 +02:00
Jeremy Evans
ea7ceff82c
Avoid hash allocation for certain proc calls
Previously, proc calls such as:

```ruby
proc{|| }.(**empty_hash)
proc{|b: 1| }.(**r2k_array_with_empty_hash)
```

both allocated hashes unnecessarily, due to two separate code paths.

The first call goes through CALLER_SETUP_ARG/vm_caller_setup_keyword_hash,
and is simple to fix by not duping an empty keyword hash that will be
dropped.

The second case is more involved, in setup_parameters_complex, but is
fixed the exact same way as when the ruby2_keywords hash is not empty,
by flattening the rest array to the VM stack, ignoring the last
element (the empty keyword splat).  Add a flatten_rest_array static
function to handle this case.

Update test_allocation.rb to automatically convert the method call
allocation tests to proc allocation tests, at least for the calls
that can be converted.  With the code changes, all proc call
allocation tests pass, showing that proc calls and method calls
now allocate the same number of objects.
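
Allocation counts like the ones these tests check can be approximated from plain Ruby. This is only a rough sketch, not the helper used in test_allocation.rb: `allocations` is a hypothetical method built on `GC.stat(:total_allocated_objects)`, and the number it reports depends on the Ruby version.

```ruby
# Count objects allocated while the block runs (GC disabled to reduce noise).
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

empty_hash = {}
blk = proc { || }

blk.(**empty_hash) # warm up so one-time setup is not counted
count = allocations { blk.(**empty_hash) }
puts "objects allocated by the proc call: #{count}"
```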

I've audited the allocation tests, and I believe that all of the low
hanging fruit has been collected.  All remaining allocations are
either caller side:

* Positional splat + post argument
* Multiple positional splats
* Literal keywords + keyword splat
* Multiple keyword splats

Or callee side:

* Positional splat parameter
* Keyword splat parameter
* Keyword to positional argument conversion for methods that don't accept keywords
* ruby2_keywords method called with keywords

Reapplies abc04e898b, which was reverted at
d56470a27c, with the addition of a bug fix and
test.

Fixes [Bug #20679]
2024-08-19 19:00:37 -07:00
Jeremy Evans
d56470a27c Revert "Avoid hash allocation for certain proc calls"
This reverts commit abc04e898b.

This caused problems in a Rails test.
2024-08-16 17:59:05 -07:00
Nobuyoshi Nakada
21dfe34aae
Stringize VM_ASSERT expression before expansion 2024-08-16 16:55:51 +09:00
Jeremy Evans
abc04e898b Avoid hash allocation for certain proc calls
Previously, proc calls such as:

```ruby
proc{|| }.(**empty_hash)
proc{|b: 1| }.(**r2k_array_with_empty_hash)
```

both allocated hashes unnecessarily, due to two separate code paths.

The first call goes through CALLER_SETUP_ARG/vm_caller_setup_keyword_hash,
and is simple to fix by not duping an empty keyword hash that will be
dropped.

The second case is more involved, in setup_parameters_complex, but is
fixed the exact same way as when the ruby2_keywords hash is not empty,
by flattening the rest array to the VM stack, ignoring the last
element (the empty keyword splat).  Add a flatten_rest_array static
function to handle this case.

Update test_allocation.rb to automatically convert the method call
allocation tests to proc allocation tests, at least for the calls
that can be converted.  With the code changes, all proc call
allocation tests pass, showing that proc calls and method calls
now allocate the same number of objects.

I've audited the allocation tests, and I believe that all of the low
hanging fruit has been collected.  All remaining allocations are
either caller side:

* Positional splat + post argument
* Multiple positional splats
* Literal keywords + keyword splat
* Multiple keyword splats

Or callee side:

* Positional splat parameter
* Keyword splat parameter
* Keyword to positional argument conversion for methods that don't accept keywords
* ruby2_keywords method called with keywords
2024-08-15 13:00:09 -07:00
Koichi Sasada
d5afa2cc95 do not show unused block on send
In some cases it is difficult to know whether the called method uses a
block when it is invoked with `send` in a general framework. So this
patch stops showing the unused block warning on `send`.

example with test/unit:

```ruby
require 'test/unit'

class T < Test::Unit::TestCase
  def setup
  end

  def test_foo = nil
end
```

=> /home/ko1/ruby/install/master/lib/ruby/gems/3.4.0+0/gems/test-unit-3.6.2/lib/test/unit/fixture.rb:284: warning: the block passed to 'priority_setup' defined at /home/ko1/ruby/install/master/lib/ruby/gems/3.4.0+0/gems/test-unit-3.6.2/lib/test/unit/priority.rb:183 may be ignored

because test/unit can call any setup method (`priority_setup` in this case) with a block.

Maybe we can show the warning again when we provide a way to recognize
whether the called method uses a block or not.
2024-08-13 12:17:56 +09:00
Jean Boussier
6ee9a08d32 rb_setup_fake_ary: use precomputed flags
Setting up the fake array is a bit more expensive than would be
expected because `rb_ary_freeze` does a lot of checks and looks up
a shape transition.

If we assume fake arrays will always be frozen, we can precompute
the flags state and just assign it.
2024-08-10 10:09:14 +02:00
Alan Wu
057c53f771 Make rb_vm_invoke_bmethod() static 2024-08-07 19:17:31 -04:00
Your Name
34715bdd91 Tune codegen for rb_yield() calls landing in ISeqs
Unlike in older revisions in the year, GCC 11 isn't inlining the call
to vm_push_frame() inside invoke_iseq_block_from_c() anymore. We do
want it to be inlined since rb_yield() speed is fairly important.
Logs from -fopt-info-optimized-inline reveal that GCC was blowing its
code size budget inlining invoke_block_from_c_bh() into its various
callers, leaving suboptimal code for its body.

Take away some uses of the `inline` keyword and merge a common tail
call to vm_exec() for overall better code.

This tweak gives a speedup of about 18% on a micro benchmark and 1% on
the chunky-png benchmark from yjit-bench. I tested on a Skylake server.

```
$ cat c-to-ruby-call.yml
benchmark:
  - 0.upto(10_000_000) {}

$ benchmark-driver --chruby '+patch;master' c-to-ruby-call.yml
Warming up --------------------------------------
0.upto(10_000_000) {}      2.299 i/s -       3.000 times in 1.304689s (434.90ms/i)
Calculating -------------------------------------
                          +patch      master
0.upto(10_000_000) {}      2.299       1.943 i/s -       6.000 times in 2.609393s 3.088353s

Comparison:
             0.upto(10_000_000) {}
               +patch:         2.3 i/s
               master:         1.9 i/s - 1.18x  slower

$ ruby run_benchmarks.rb --chruby 'master;+patch' chunky-png
<snip>

----------  -----------  ----------  -----------  ----------  --------------  -------------
bench       master (ms)  stddev (%)  +patch (ms)  stddev (%)  +patch 1st itr  master/+patch
chunky-png  1156.1       0.1         1142.2       0.2         1.01            1.01
----------  -----------  ----------  -----------  ----------  --------------  -------------
```
2024-08-07 18:49:20 -04:00
Alan Wu
e5fb851fe7 Delete unused declaration 2024-08-02 21:53:38 -04:00
Randy Stauner
acbb8d4fb5 Expand opt_newarray_send to support Array#pack with buffer keyword arg
Use an enum for the method arg instead of needing to add an id
that doesn't map to an actual method name.

$ ruby --dump=insns -e 'b = "x"; [v].pack("E*", buffer: b)'

before:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 newarray                               1
0009 putchilledstring                       "E*"
0011 getlocal_WC_0                          b@0
0013 opt_send_without_block                 <calldata!mid:pack, argc:2, kw:[#<Symbol:0x000000000023110c>], KWARG>
0015 leave
```

after:

```
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] b@0
0000 putchilledstring                       "x"                       (   1)[Li]
0002 setlocal_WC_0                          b@0
0004 putself
0005 opt_send_without_block                 <calldata!mid:v, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 putchilledstring                       "E*"
0009 getlocal                               b@0, 0
0012 opt_newarray_send                      3, 5
0015 leave
```
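
For reference, this is what the optimized call computes at the Ruby level. A short sketch with made-up values (`1.5` and the one-byte buffer are illustrative only): `pack` with `buffer:` appends the packed bytes to the given string.

```ruby
buffer = +"x"                                # unfrozen string to append into
result = [1.5].pack("E*", buffer: buffer)    # "E" packs a little-endian double

p result.bytesize                            # => 9 (1 original byte + 8 for the double)
p result.byteslice(1, 8).unpack1("E")        # => 1.5
```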
2024-07-29 16:26:58 -04:00
kimuraw (Wataru Kimura)
7472fff7f1
[Bug #20633] Fix the condition for atomic_signal_fence
`AC_CHECK_DECLS` defines `HAVE_DECL_SYMBOL` to 1 if declared, 0
otherwise, not undefined.
2024-07-14 10:36:35 +09:00
Koichi Sasada
43aee3393d fix defined?(@ivar) with Ractors
`defined?(@ivar)` on a non-main Ractor has two issues:

1. raising an exception

```ruby
class C
  @iv1 = []
  def self.defined_iv1 = defined?(@iv1)
end

Ractor.new{
  p C.defined_iv1
  #=> can not get unshareable values from instance variables of classes/modules from non-main Ractors (Ractor::IsolationError)
}.take
```

-> Do not raise an exception but return `"instance-variable"` because
it is defined.

2. returning `"instance-variable"` even if it is not defined.

```
class C
  # @iv2 is not defined
  def self.defined_iv2 = defined?(@iv2)
end

Ractor.new{
  p C.defined_iv2 #=> "instance-variable"
}.take
```

-> returns `nil`
2024-07-12 04:43:14 +09:00
Peter Zhu
51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
Ivo Anjo
64fef3b870 Add explicit compiler fence when pushing frames to ensure safe profiling
**What does this PR do?**

This PR tweaks the `vm_push_frame` function to add an explicit compiler
fence (`atomic_signal_fence`) to ensure profilers that use signals
to interrupt applications (stackprof, vernier, pf2, Datadog profiler)
can safely sample from the signal handler.

**Motivation:**

The `vm_push_frame` function was specifically tweaked in
https://github.com/ruby/ruby/pull/3296 to initialize a frame
before updating the `cfp` pointer.

But since there's nothing stopping the compiler from reordering
the initialization of a frame (`*cfp =`) with the update of the cfp
pointer (`ec->cfp = cfp`) we've been hesitant to rely on this on
the Datadog profiler.

In practice, after some experimentation + talking to folks, this
reordering does not seem to happen.

But since modern compilers have a way for us to exactly tell them
not to do the reordering (`atomic_signal_fence`), this seems even
better.

I've actually extracted `vm_push_frame` into the "Compiler Explorer"
website, which you can use to see the assembly output of this function
across many compilers and architectures: https://godbolt.org/z/3oxd1446K

On that link you can observe two things across many compilers:
1. The compilers are not reordering the writes
2. The barrier does not change the generated assembly output
   (== has no cost in practice)

**Additional Notes:**

The checks added in `configure.ac` define two new macros:
* `HAVE_STDATOMIC_H`
* `HAVE_DECL_ATOMIC_SIGNAL_FENCE`

Since Ruby generates an arch-specific `config.h` header with
these macros upon installation, this can be used by profilers
and other libraries to test if Ruby was compiled with the fence enabled.

**How to test the change?**

As I mentioned above, you can check https://godbolt.org/z/3oxd1446K
to confirm the compiled output of `vm_push_frame` does not change
in most compilers (at least all that I've checked on that site).
2024-07-03 18:08:57 +09:00
eileencodes
b2b8306b46 Fix forwarding for optimized send
Always treat forwarding as a complex call.
2024-07-02 11:48:43 -07:00
eileencodes
cc8c4a60b7 Calling into a C func shouldn't fast path when forwarding
When we forward calls to C functions if the callsite is a forwarding
site it might not always be a splat, so we can't use the fast path.

Fixes:

[ruby-core:118418]
2024-07-02 11:48:43 -07:00
Koichi Sasada
b182f2a045 fix sendfwd with send and method_missing
The combination of the (optimized) `send` method or `method_missing`
with a forwarding send (`...`) needs to respect the given
`rb_forwarding_call_data`. Otherwise it causes a critical error
such as a SEGV.
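
The two call shapes involved look roughly like this at the Ruby level. This is an illustrative sketch, not the regression test from the commit; `Target`, `via_send`, and `via_method_missing` are hypothetical names.

```ruby
class Target
  def greet(name, punctuation: "!")
    "hello #{name}#{punctuation}"
  end

  # Catch-all used to exercise the method_missing path.
  def method_missing(name, *args, **opts)
    [name, args, opts]
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

# Forward all arguments through the optimized `send`.
def via_send(obj, ...)
  obj.send(:greet, ...)
end

# Forward all arguments to a method that does not exist.
def via_method_missing(obj, ...)
  obj.no_such_method(...)
end

p via_send(Target.new, "world", punctuation: "?")   # => "hello world?"

forwarded = via_method_missing(Target.new, 1, key: 2)
puts forwarded == [:no_such_method, [1], { key: 2 }] # prints "true"
```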
2024-06-21 00:43:48 +09:00