* Skip assertion when cc->klass is Qundef
* Invalidate CCs when cme is invalidated in marking
* Add additional assertions that CC references stay valid
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
Previously we were issuing write barriers for each cc, but were missing
the cme.
We also need to ensure GC can't run after we've copied the values into
the allocated array but before they're visible in the object.
`rb_class_cc_entries` is little more than a `len` and `capa`.
Hence embedding the entries doesn't cost much extra copying and
saves a bit of memory and some pointer chasing.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
In multi-ractor mode, the `cc_tbl` mutations use the RCU pattern,
which allows lock-less reads.
Based on the assumption that invalidations and misses should be
increasingly rare as the process ages, locking on modification
isn't a big concern.
For now this doesn't change anything, but now that the table
is managed by the GC, it opens the door to using RCU in multi-ractor
mode, allowing unsynchronized reads.
One of the biggest remaining contention points is `RClass.cc_table`.
The logical solution would be to turn it into a managed object, so
we can use an RCU strategy, given it's read-heavy.
However, that's not currently possible because the table can't
be freed before the owning class, given the class free function
MUST go over all the CC entries to invalidate them.
However, if the `CC->klass` reference is weakly marked, the GC
will take care of setting the reference to `Qundef`.
ZJIT: Support invalidating method redefinition
This commit adds support for invalidating the MethodRedefined invariant
when a method is redefined.
Changes:
- Added CME pointer to the MethodRedefined invariant in HIR
- Updated all places where MethodRedefined invariants are created to
include the CME pointer
- Added handling for MethodRedefined invariants in gen_patch_point to
call track_cme_assumption, which registers the patch point for
invalidation when rb_zjit_cme_invalidate is called
This ensures that when a method is redefined, all JIT code that
depends on that method will be properly invalidated.
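As a rough illustration (plain Ruby, no ZJIT internals shown), this is the situation the patch point guards against:
```ruby
def greet = "hello"

# If ZJIT has specialized this call against the current `greet` CME,
# a MethodRedefined patch point is registered for it.
def call_greet = greet

call_greet # => "hello"

# Redefining `greet` invalidates the CME; if the patch point didn't fire,
# stale JIT code could keep returning "hello".
def greet = "goodbye"

call_greet # => "goodbye"
```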
[DOC] Make protected documentation more explicit about differences
Protected is a common source of confusion for developers
coming to Ruby from other languages. This
commit makes the documentation more explicit about
the differences, so that the use case for
protected is clearer.
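For example, a short sketch of the distinction (hypothetical `Account` class): a protected method can be called with an explicit receiver, but only from inside an instance of the same class (or a subclass).
```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # Allowed: protected methods may be called on another instance
  # of the same class from within an instance method.
  def richer_than?(other)
    balance > other.balance
  end

  protected

  attr_reader :balance
end

a = Account.new(100)
b = Account.new(50)
a.richer_than?(b) # => true
a.balance         # NoMethodError: protected method `balance' called
```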
These functions conflict with the planned C-API functions. Since they
deal with the underlying set_table pointers and not Set instances,
this seems like a more accurate name as well.
Fixes [Bug #21201]
This change addresses a performance regression where defining methods
inside `refine` blocks caused severe slowdowns. The issue was due to
`rb_clear_all_refinement_method_cache()` triggering a full object
space scan via `rb_objspace_each_objects` to find and invalidate
affected callcaches, which is very inefficient.
To fix this, I introduce `vm->cc_refinement_table` to track
callcaches related to refinements. This allows us to invalidate
only the necessary callcaches without scanning the entire heap,
resulting in significant performance improvement.
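For reference, a minimal sketch (illustrative names) of the kind of code that hit the slow path, since every method definition inside a `refine` block clears refinement call caches:
```ruby
module StringExt
  refine String do
    # Each definition here used to trigger
    # rb_clear_all_refinement_method_cache(), which walked the entire
    # object space; with vm->cc_refinement_table only the tracked
    # refinement call caches are invalidated.
    def shout  = upcase
    def quiet  = downcase
    def pretty = capitalize
  end
end
```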
Now that we have a `set_table` implementation, we can
use it to track const caches and save some memory.
We could even save some more memory if `numtable` didn't
store a copy of the `hash` and instead recomputed it every
time, but this is a quick win.
This changes reference_count on rb_method_definition_struct into an
atomic.
Ractors can create additional references as part of `bind_call` or
(presumably) similar operations. Because this can happen inside Ractors, we
should use a lock or atomics so that concurrent updates don't race and lose an increment.
Co-authored-by: wanabe <s.wanabe@gmail.com>
It used to quote only part of the method name because a NUL byte in
the method name terminates the C string:
```
(irb)> "abcdef".encode("UTF-16LE").bytes
=> [97, 0, 98, 0, 99, 0, 100, 0, 101, 0, 102, 0]
```
```
expected: /abcdef/
actual: warning: Skipping set of ruby2_keywords flag for a (method not defined in Ruby)\n".
```
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
The hash value of a Method must remain constant after a compaction, otherwise
it may not work as the key in a hash table.
For example:
```ruby
def a; end

# Need this method here because otherwise the iseq may be on the C stack
# which would get pinned and not move during compaction
def get_hash
  method(:a).hash
end

puts get_hash # => 2993401401091578131

GC.verify_compaction_references(expand_heap: true, toward: :empty)

puts get_hash # => -2162775864511574135
```
* YJIT: Replace Array#each only when YJIT is enabled
* Add comments about BUILTIN_ATTR_C_TRACE
* Make Ruby Array#each available with --yjit as well
* Fix all paths that expect a C location
* Use method_basic_definition_p to detect patches
* Copy a comment about C_TRACE flag to compilers
* Rephrase a comment about add_yjit_hook
* Give METHOD_ENTRY_BASIC flag to Array#each
* Add --yjit-c-builtin option
* Allow inconsistent source_location in test-spec
* Refactor a check of BUILTIN_ATTR_C_TRACE
* Set METHOD_ENTRY_BASIC without touching vm->running
This commit ensures warnings about `object_id` and `__send__` method
redefinitions are emitted for other method types such as aliases, procs,
and attr readers—anything except C functions.
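A rough sketch (class names are illustrative) of the kinds of redefinitions now covered:
```ruby
class Widget
  # Each of these now emits the "may cause serious problems" warning,
  # just like a plain `def object_id; end` redefinition would.
  alias_method :__send__, :public_send   # alias
  define_method(:object_id) { 42 }       # proc-backed method
end

class Gadget
  attr_reader :object_id                 # attr reader
end
```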
`me->defined_class` will be used for ancestor searching,
so it should be T_CLASS or T_ICLASS.
This patch will resolve https://github.com/ruby/ruby/pull/11715
This patch optimizes forwarding callers and callees. It only optimizes methods that take `...` as their sole parameter, and then pass `...` on to other calls.
Calls it optimizes look like this:
```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```
```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```
```ruby
def bar(*a) = a
def foo(...)
list = [1, 2]
bar(*list, ...) # optimized
end
foo(123)
```
All variants of the above but using `super` are also optimized, including a bare super like this:
```ruby
def foo(...)
super
end
```
This patch eliminates intermediate allocations made when calling methods that accept `...`.
We can observe allocation elimination like this:
```ruby
def m
x = GC.stat(:total_allocated_objects)
yield
GC.stat(:total_allocated_objects) - x
end
def bar(a) = a
def foo(...) = bar(...)
def test
m { foo(123) }
end
test
p test # allocates 1 object on master, but 0 objects with this patch
```
```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)
def test
m { foo(1, b: 2) }
end
test
p test # allocates 2 objects on master, but 0 objects with this patch
```
How does it work?
-----------------
This patch works by using a dynamic stack size when passing forwarded parameters to callees.
The caller's info object (known as the "CI") contains the stack size of the
parameters, so we pass the CI object itself as a parameter to the callee.
When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee.
The CI at the forwarded call site is adjusted using information from the caller's CI.
I think this description is kind of confusing, so let's walk through an example with code.
```ruby
def delegatee(a, b) = a + b
def delegator(...)
delegatee(...) # CI2 (FORWARDING)
end
def caller
delegator(1, 2) # CI1 (argc: 2)
end
```
Before we call the delegator method, the stack looks like this:
```
Executing Line | Code | Stack
---------------+---------------------------------------+--------
1| def delegatee(a, b) = a + b | self
2| | 1
3| def delegator(...) | 2
4| # |
5| delegatee(...) # CI2 (FORWARDING) |
6| end |
7| |
8| def caller |
-> 9| delegator(1, 2) # CI1 (argc: 2) |
10| end |
```
The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls into
`delegator`, it writes `CI1` onto the stack as a local variable for the
`delegator` method. The `delegator` method has a special local called `...`
that holds the caller's CI object.
Here is the ISeq disasm for `delegator`:
```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself ( 1)[LiCa]
0001 getlocal_WC_0 "..."@0
0003 send <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave [Re]
```
The local called `...` will contain the caller's CI: CI1.
Here is the stack when we enter `delegator`:
```
Executing Line | Code | Stack
---------------+---------------------------------------+--------
1| def delegatee(a, b) = a + b | self
2| | 1
3| def delegator(...) | 2
-> 4| # | CI1 (argc: 2)
5| delegatee(...) # CI2 (FORWARDING) | cref_or_me
6| end | specval
7| | type
8| def caller |
9| delegator(1, 2) # CI1 (argc: 2) |
10| end |
```
The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to
memcopy the caller's stack before calling `delegatee`. In this case, it will
memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much
memory to copy from the caller because `CI1` contains stack size information
(argc: 2).
Before executing the `send` instruction, we push `...` on the stack. The
`send` instruction pops `...`, and because it is tagged with `FORWARDING`, it
knows to memcopy (using the information in the CI it just popped):
```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself ( 1)[LiCa]
0001 getlocal_WC_0 "..."@0
0003 send <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave [Re]
```
Instruction 0001 puts the caller's CI on the stack. `send` is tagged with
FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:
```
Executing Line | Code | Stack
---------------+---------------------------------------+--------
1| def delegatee(a, b) = a + b | self
2| | 1
3| def delegator(...) | 2
4| # | CI1 (argc: 2)
-> 5| delegatee(...) # CI2 (FORWARDING) | cref_or_me
6| end | specval
7| | type
8| def caller | self
9| delegator(1, 2) # CI1 (argc: 2) | 1
10| end | 2
```
The "FORWARDING" call site combines information from CI1 with CI2 in order
to support passing other values in addition to the `...` value, as well as
perfectly forward splat args, kwargs, etc.
Since we're able to copy the stack from `caller` in to `delegator`'s stack, we
can avoid allocating objects.
I want to do this to eliminate object allocations for delegate methods.
My long term goal is to implement `Class#new` in Ruby and it uses `...`.
I was able to implement `Class#new` in Ruby
[here](https://github.com/ruby/ruby/pull/9289).
If we adopt the technique in this patch, then we can optimize the allocation
of objects whose `initialize` takes keyword parameters.
For example, this code will allocate 2 objects: one for `SomeObject`, and one
for the kwargs:
```ruby
SomeObject.new(foo: 1)
```
If we combine this technique, plus implement `Class#new` in Ruby, then we can
reduce allocations for this common operation.
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
Currently, redefining a method doesn't emit one warning but two:
one at the location of the new method, and one at the location of
the old method.
I believe this is not ideal because when collecting warnings via
a custom `Warning.warn`, the information has to be pieced back together.
It's even more tricky because the second part may or may not be emitted
depending on whether the original method has an associated ISeq.
I think it's much better to emit a single warning with all the
information in one go.
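A quick sketch of the behavior being consolidated (redefinition warnings only appear in verbose mode):
```ruby
$VERBOSE = true

def foo = 1
def foo = 2
# Before: two separate warnings, one at the new definition
# ("method redefined; discarding old foo") and one at the old
# definition ("previous definition of foo was here").
# After: a single warning carrying both locations.
```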
When we're searching for CCs, compare the CI's argc and flags rather
than comparing CI pointers. This means we don't need to store a reference
to the CI, and it also naturally "de-duplicates" CC objects.
We can observe the effect with the following code:
```ruby
require "objspace"
hash = {}
p ObjectSpace.memsize_of(Hash)
eval ("a".."zzz").map { |key|
"hash.merge(:#{key} => 1)"
}.join("; ")
p ObjectSpace.memsize_of(Hash)
```
On master:
```
$ ruby -v test.rb
ruby 3.4.0dev (2024-04-15T16:21:41Z master d019b3baec) [arm64-darwin23]
test.rb:3: warning: assigned but unused variable - hash
3424
527736
```
On this branch:
```
$ make runruby
compiling vm.c
linking miniruby
builtin_binary.inc updated
compiling builtin.c
linking static-library libruby.3.4-static.a
ln -sf ../../rbconfig.rb .ext/arm64-darwin23/rbconfig.rb
linking ruby
ld: warning: ignoring duplicate libraries: '-ldl', '-lobjc', '-lpthread'
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb
2240
2368
```
Co-authored-by: John Hawthorn <jhawthorn@github.com>
In cfd7729ce7 we started using inline
caches for refinements. However, we weren't clearing inline caches when
methods were defined on a reopened refinement module.
Fixes [Bug #20246]
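A minimal sketch of the scenario (module names are illustrative):
```ruby
module Ext
  refine String do
    def greeting = "hello"
  end
end

using Ext
"".greeting # => "hello" (fills the inline cache)

module Ext
  refine String do
    # Defined on the reopened refinement module; without clearing the
    # inline cache, the stale method could still be called.
    def greeting = "bonjour"
  end
end

"".greeting # => "bonjour"
```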
This frees FL_USER0 on both T_MODULE and T_CLASS.
Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
Rather than exposing that an imemo has a flag and four fields, this
changes the implementation to only expose one field (the klass) and
fill the rest with 0. Each type will have to fill in the remaining values itself.
Previously, every call to vm_ci_new (when the CI was not packable) would
result in a different callinfo being returned; this meant that every
kwarg callsite had its own CI.
When calling, different CIs result in different CCs. These CIs and CCs
both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop
this resulted in a memory leak of both types of object. This also likely
resulted in extra memory used, and extra time searching, in non-eval
cases.
For simplicity in this commit I always allocate a CI object inside
rb_vm_ci_lookup, but ideally we would lazily allocate it only when
needed. I hope to do that as a follow up in the future.
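For example, a rough sketch of the kind of eval loop that leaked, since the kwarg CI is not packable and each eval produced a fresh one:
```ruby
def foo(bar:) = bar

1_000.times do
  eval("foo(bar: 1)") # each eval built a new kwarg CI, which in turn
                      # created a new CC persisted in Object's cc_tbl
end
```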