Commit graph

74 commits

Author SHA1 Message Date
John Hawthorn
fccd96cc6c Add stricter assertions on CC access 2025-08-06 15:57:13 -07:00
Jean Boussier
bc789ca804 Convert rb_class_cc_entries.entries into a flexible array member
`rb_class_cc_entries` is little more than a `len` and `capa`.
Hence embedding the entries doesn't cost much extra copying and
saves a bit of memory and some pointer chasing.
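A minimal C sketch of the layout change described here; `table_ptr` and `table_flex` are hypothetical names standing in for the idea, not the real `rb_class_cc_entries` definition:

```c
/* Illustrative-only sketch: `table_ptr` / `table_flex` stand in for the idea,
 * not the real rb_class_cc_entries definition. */
#include <stdlib.h>

struct entry { int key; };

/* Before: entries live in a second allocation reached through a pointer. */
struct table_ptr {
    int len;
    int capa;
    struct entry *entries;   /* extra allocation + pointer chase per access */
};

/* After: entries are a C99 flexible array member in the same allocation. */
struct table_flex {
    int len;
    int capa;
    struct entry entries[];
};

struct table_flex *table_flex_new(int capa)
{
    struct table_flex *t = malloc(sizeof(*t) + (size_t)capa * sizeof(struct entry));
    if (t) { t->len = 0; t->capa = capa; }
    return t;
}
```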

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-08-01 19:49:38 +02:00
Jean Boussier
547f111b5b Refactor vm_lookup_cc to allow lock-free lookups in RClass.cc_tbl
In multi-ractor mode, `cc_tbl` mutations use the RCU pattern,
which allows lock-free reads.

Based on the assumption that invalidations and misses should be
increasingly rare as the process ages, locking on modification
isn't a big concern.
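A rough C11 sketch of the read/write split described above, with hypothetical names (`cc_table_read`, `cc_table_insert`); it is not the actual `cc_tbl` code, just the RCU-style pattern of lock-free reads and locked, copy-then-publish writes:

```c
/* RCU-flavoured sketch in C11, not the actual cc_tbl code; hypothetical names.
 * Readers load the current table pointer with no lock; writers take a lock,
 * build a modified copy, and publish it with a single atomic store. Published
 * tables are never mutated in place, which is what makes the read side safe. */
#include <stdatomic.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct cc_table {
    size_t len;
    void *entries[8];        /* illustrative fixed-size payload */
};

static _Atomic(struct cc_table *) current_tbl;
static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lock-free read path. */
struct cc_table *cc_table_read(void)
{
    return atomic_load_explicit(&current_tbl, memory_order_acquire);
}

/* Locked write path: copy, modify the copy, publish. Reclaiming the old copy
 * is the hard part; in CRuby this is addressed by making the table a
 * GC-managed object. */
void cc_table_insert(void *entry)
{
    pthread_mutex_lock(&write_lock);
    struct cc_table *old = atomic_load_explicit(&current_tbl, memory_order_relaxed);
    struct cc_table *copy = calloc(1, sizeof(*copy));
    if (!copy) { pthread_mutex_unlock(&write_lock); return; }
    if (old) memcpy(copy, old, sizeof(*copy));
    if (copy->len < 8) copy->entries[copy->len++] = entry;
    atomic_store_explicit(&current_tbl, copy, memory_order_release);
    pthread_mutex_unlock(&write_lock);
}
```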
2025-08-01 10:42:04 +02:00
Jean Boussier
f2a7e48dea Make RClass.cc_table a managed object
For now this doesn't change anything, but now that the table
is managed by the GC, it opens the door to using RCU in multi-ractor
mode, hence allowing unsynchronized reads.
2025-08-01 10:42:04 +02:00
Jean Boussier
fc5e1541e4 Use rb_gc_mark_weak for cc->klass.
One of the biggest remaining contention points is `RClass.cc_table`.
The logical solution would be to turn it into a managed object, so
we can use an RCU strategy, given it's read-heavy.

However, that's not currently possible because the table can't
be freed before the owning class, given the class free function
MUST go over all the CC entries to invalidate them.

However, if the `CC->klass` reference is weakly marked, the
GC will take care of setting the reference to `Qundef`.
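A generic weak-marking sketch, with made-up names (`gc_mark_weak`, `gc_update_weak_references`) standing in for CRuby's GC internals: the weakly held field does not keep its referent alive, and after marking the collector overwrites dead references with a sentinel (in CRuby, `Qundef`):

```c
/* Generic weak-marking sketch with made-up names; it is not CRuby's GC. A weak
 * mark registers the slot instead of keeping the referent alive, and after
 * marking, slots whose referent died are overwritten with a sentinel
 * (CRuby writes Qundef into cc->klass). */
#include <stdbool.h>
#include <stddef.h>

typedef void *value_t;
#define UNDEF_SENTINEL ((value_t)0x1)

static value_t *weak_slots[1024];
static size_t weak_count;

/* Called from an object's mark function for a weakly held field. */
void gc_mark_weak(value_t *slot)
{
    if (weak_count < 1024) weak_slots[weak_count++] = slot;
}

/* Called after marking: clear weak fields whose referent was not marked. */
void gc_update_weak_references(bool (*is_live)(value_t))
{
    for (size_t i = 0; i < weak_count; i++) {
        value_t *slot = weak_slots[i];
        if (*slot != UNDEF_SENTINEL && !is_live(*slot)) *slot = UNDEF_SENTINEL;
    }
    weak_count = 0;
}
```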
2025-08-01 10:42:04 +02:00
Jean Boussier
dd4c5acc0f vm_callinfo.h: Stick to using user flags
For some unclear reason, VM_CALLCACHE_UNMARKABLE
and VM_CALLCACHE_ON_STACK used global flags rather than the
available IMEMO_FL_USER flags.
2025-06-13 10:08:42 +02:00
alpaca-tc
c8ddc0a843 Optimize callcache invalidation for refinements
Fixes [Bug #21201]

This change addresses a performance regression where defining methods
inside `refine` blocks caused severe slowdowns. The issue was due to
`rb_clear_all_refinement_method_cache()` triggering a full object
space scan via `rb_objspace_each_objects` to find and invalidate
affected callcaches, which is very inefficient.

To fix this, I introduce `vm->cc_refinement_table` to track
callcaches related to refinements. This allows us to invalidate
only the necessary callcaches without scanning the entire heap,
resulting in significant performance improvement.
2025-06-09 12:33:35 +09:00
Nobuyoshi Nakada
0ecb689617 Correct comments for packed shape and index [ci skip] 2025-06-04 11:38:45 +09:00
Jean Boussier
e27404af9e Use all 32bits of shape_id_t on all platforms
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353]

Even though `shape_id_t` has been made 32 bits, we were still limited
to using only the lower 16 bits because they had to fit alongside `attr_index_t`
inside a `uintptr_t` in inline caches.

By enlarging inline caches we can unlock the full 32 bits on all
platforms, allowing these extra bits to be used for tagging.
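A small C sketch of the kind of packing an enlarged, 64-bit cache word allows; the helper names and bit layout are illustrative for this sketch, not CRuby's actual macros:

```c
/* Illustrative packing helpers for a 64-bit inline-cache word; the names and
 * bit layout are made up for this sketch, not CRuby's actual macros. */
#include <assert.h>
#include <stdint.h>

typedef uint32_t shape_id_t;
typedef uint16_t attr_index_t;

static inline uint64_t ic_pack(shape_id_t shape_id, attr_index_t attr_index)
{
    /* Full 32-bit shape id in the upper half, attr index in the low 16 bits;
     * the remaining bits are free, e.g. for tagging. */
    return ((uint64_t)shape_id << 32) | attr_index;
}

static inline shape_id_t ic_shape_id(uint64_t ic)
{
    return (shape_id_t)(ic >> 32);
}

static inline attr_index_t ic_attr_index(uint64_t ic)
{
    return (attr_index_t)(ic & 0xFFFF);
}

int main(void)
{
    uint64_t ic = ic_pack(0xDEADBEEF, 42);
    assert(ic_shape_id(ic) == 0xDEADBEEF);
    assert(ic_attr_index(ic) == 42);
    return 0;
}
```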
2025-06-03 21:15:41 +02:00
Jean Boussier
749bda96e5 Refactor attr_index_t caches
Ensure the same helpers are used for packing and unpacking.
2025-05-28 12:39:21 +02:00
Jean Boussier
658fcbe91a Get rid of vm_cc_attr_index and vm_cc_attr_index_dest_shape_id 2025-05-28 12:29:25 +02:00
Jean Boussier
a0e9af0146 Get rid of unused vm_ic_attr_index_dest_shape_id 2025-05-28 12:20:37 +02:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Peter Zhu
dd863714bf Remove dead vm_cc_valid_p 2025-01-24 11:07:20 -05:00
Aaron Patterson
bbeebc9258 Only set shape id for CCs on attr_set + ivar
Only ivar reader and writer methods should need the shape ID set on the
inline cache.  We shouldn't accidentally overwrite parts of the CC union
when it's not necessary.
2024-07-31 16:23:28 -07:00
Gabriel Lacroix
1652c194c8 Fix comment for VM_CALL_ARGS_SIMPLE (#11067)
* Set VM_CALL_KWARG flag first and reuse it to avoid checking kw_arg twice

* Fix comment for VM_CALL_ARGS_SIMPLE

* Make VM_CALL_ARGS_SIMPLE set-site match its comment
2024-06-28 10:11:35 -04:00
Aaron Patterson
cdf33ed5f3 Optimized forwarding callers and callees
This patch optimizes forwarding callers and callees. It only optimizes methods that take `...` as their sole parameter and then pass `...` on to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a

def foo(...)
  list = [1, 2]
  bar(*list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`.
We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees.
The caller's info object (known as the "CI") contains the stack size of the
parameters, so we pass the CI object itself as a parameter to the callee.
When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee.
The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in
to `delegator`, it writes `CI1` onto the stack as a local variable for the
`delegator` method.  The `delegator` method has a special local called `...`
that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to
memcopy the caller's stack before calling `delegatee`.  In this case, it will
memcopy self, 1, and 2 to the stack before calling `delegatee`.  It knows how much
memory to copy from the caller because `CI1` contains stack size information
(argc: 2).

Before executing the `send` instruction, we push `...` on the stack.  The
`send` instruction pops `...`, and because it is tagged with `FORWARDING`, it
knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack.  `send` is tagged with
FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order
to support passing other values in addition to the `...` value, as well as
perfectly forward splat args, kwargs, etc.

Since we're able to copy the stack from `caller` into `delegator`'s stack, we
can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods.
My long-term goal is to implement `Class#new` in Ruby, and it uses `...`.

I was able to implement `Class#new` in Ruby
[here](https://github.com/ruby/ruby/pull/9289).
If we adopt the technique in this patch, then we can optimize allocating
objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one
for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique with implementing `Class#new` in Ruby, then we can
reduce allocations for this common operation.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2024-06-18 09:28:25 -07:00
Aaron Patterson
0434dfb76b We don't need to check if the ci is markable anymore
It doesn't matter whether CIs are stack allocated or not.
2024-04-24 15:09:06 -07:00
Aaron Patterson
147ca9585e Implement equality for CI comparison when CC searching
When we're searching for CCs, compare the argc and flags for CI rather
than comparing pointers.  This means we don't need to store a reference
to the CI, and it also naturally "de-duplicates" CC objects.
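A hedged sketch of the comparison change, using an illustrative `call_info` struct rather than CRuby's real `vm_callinfo` layout: matching on call shape (flags and argc) instead of pointer identity lets structurally identical call sites share one CC.

```c
/* Sketch with an illustrative struct, not CRuby's vm_callinfo layout: two call
 * sites with the same shape compare equal by content even though they are
 * distinct objects, so they can share one call cache. */
#include <stdbool.h>
#include <stdint.h>

struct call_info {
    uint32_t flag;   /* call flags (FCALL, KWARG, ...) */
    uint32_t argc;   /* argument count */
};

/* Pointer identity treats structurally identical call sites as different. */
bool ci_same_object(const struct call_info *a, const struct call_info *b)
{
    return a == b;
}

/* Content comparison: the method id is already fixed by the surrounding
 * lookup, so only the call shape (flags and argc) has to match. */
bool ci_compatible(const struct call_info *a, const struct call_info *b)
{
    return a->flag == b->flag && a->argc == b->argc;
}
```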

We can observe the effect with the following code:

```ruby
require "objspace"

hash = {}

p ObjectSpace.memsize_of(Hash)

eval ("a".."zzz").map { |key|
  "hash.merge(:#{key} => 1)"
}.join("; ")

p ObjectSpace.memsize_of(Hash)
```

On master:

```
$ ruby -v test.rb
ruby 3.4.0dev (2024-04-15T16:21:41Z master d019b3baec) [arm64-darwin23]
test.rb:3: warning: assigned but unused variable - hash
3424
527736
```

On this branch:

```
$ make runruby
compiling vm.c
linking miniruby
builtin_binary.inc updated
compiling builtin.c
linking static-library libruby.3.4-static.a
ln -sf ../../rbconfig.rb .ext/arm64-darwin23/rbconfig.rb
linking ruby
ld: warning: ignoring duplicate libraries: '-ldl', '-lobjc', '-lpthread'
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems  ./test.rb
2240
2368
```

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2024-04-18 09:06:33 -07:00
Peter Zhu
330830dd1a Add IMEMO_NEW
Rather than exposing that an imemo has a flag and four fields, this
changes the implementation to expose only one field (the klass) and
fills the rest with 0. Each type will have to fill in the values itself.
2024-02-21 11:33:05 -05:00
John Hawthorn
1c97abaaba De-dup identical callinfo objects
Previously, every call to vm_ci_new (when the CI was not packable) would
result in a different callinfo being returned. This meant that every
kwarg callsite had its own CI.

When calling, different CIs result in different CCs. These CIs and CCs
both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop
this resulted in a memory leak of both types of object. This also likely
resulted in extra memory used, and extra time searching, in non-eval
cases.

For simplicity in this commit I always allocate a CI object inside
rb_vm_ci_lookup, but ideally we would lazily allocate it only when
needed. I hope to do that as a follow up in the future.
2024-02-20 18:55:00 -08:00
Jeremy Evans
22e488464a Add VM_CALL_ARGS_SPLAT_MUT callinfo flag
This flag is set when the caller has already created a new array to
handle a splat, such as for `f(*a, b)` and `f(*a, *b)`.  Previously,
if `f` was defined as `def f(*a)`, these calls would create an extra
array on the callee side, instead of using the new array created
by the caller.

This modifies `setup_args_core` to set the flag whenever it would add
a `splatarray true` instruction.  However, when `splatarray true` is
changed to `splatarray false` in the peephole optimizer, to avoid
unnecessary allocations on the caller side, the flag must be removed.
Add `optimize_args_splat_no_copy` and have the peephole optimizer call
that.  This significantly simplifies the related peephole optimizer
code.

On the callee side, in `setup_parameters_complex`, set
`args->rest_dupped` to true if the flag is set.

This takes a similar approach for optimizing regular splats to the one
previously used for keyword splats in
d2c41b1bff (via VM_CALL_KW_SPLAT_MUT).
2024-01-24 18:25:55 -08:00
Jeremy Evans
3081c83169 Support tracing of struct member accessor methods
This follows the same approach used for attr_reader/attr_writer in
2d98593bf5, skipping the checking for
tracing after the first call using the call cache, and clearing the
call cache when tracing is turned on/off.

Fixes [Bug #18886]
2023-12-07 10:29:33 -08:00
HParker
c74dc8b4af Use reference counting to avoid memory leak in kwargs
Tracks other callinfo that references the same kwargs and frees them when all references are cleared.

[bug #19906]

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2023-10-01 10:55:19 -04:00
Koichi Sasada
cfd7729ce7 use inline cache for refinements
Since Ruby 3.0, refined method invocations have been slow because
resolved methods are not stored in the inline cache, due to a
conservative strategy. However, `using` clears all caches,
so it seems safe to cache resolved method entries.

This patch caches resolved method entries in the inline cache
and clears all inline method caches when `using` is called.

fix [Bug #18572]

```ruby
 # without refinements

class C
  def foo = :C
end

N = 1_000_000

obj = C.new
require 'benchmark'
Benchmark.bm{|x|
  x.report{N.times{
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
  }}
}

__END__
              user     system      total        real
master    0.362859   0.002544   0.365403 (  0.365424)
modified  0.357251   0.000000   0.357251 (  0.357258)
```

```ruby
 # with refinment but without using

class C
  def foo = :C
end

module R
  refine C do
    def foo = :R
  end
end

N = 1_000_000

obj = C.new
require 'benchmark'
Benchmark.bm{|x|
  x.report{N.times{
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
  }}
}
__END__
               user     system      total        real
master     0.957182   0.000000   0.957182 (  0.957212)
modified   0.359228   0.000000   0.359228 (  0.359238)
```

```ruby
 # with using

class C
  def foo = :C
end

module R
  refine C do
    def foo = :R
  end
end

N = 1_000_000

using R

obj = C.new
require 'benchmark'
Benchmark.bm{|x|
  x.report{N.times{
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
    obj.foo; obj.foo; obj.foo; obj.foo; obj.foo;
  }}
}
2023-07-31 17:13:43 +09:00
Koichi Sasada
36023d5cb7 mark cc->cme_ if it is for super
`vm_search_super_method()` makes orphan CCs (they are not connected
from ccs), so `cc->cme_` could previously be collected because nothing marked it.
2023-07-31 14:04:31 +09:00
Ruby
6391132c03 fix typo (CACH_ -> CACHE_) 2023-07-28 18:01:42 +09:00
Nobuyoshi Nakada
469e644c93 Compile code for non-embedded CI always 2023-06-30 23:59:04 +09:00
Takashi Kokubun
df1b007fbd Remove unused VM_CALL_BLOCKISEQ flag 2023-04-01 10:22:47 -07:00
Takashi Kokubun
175538e433 Improve explanation of FCALL and VCALL 2023-04-01 10:17:59 -07:00
Koichi Sasada
c9fd81b860 vm_call_single_noarg_inline_builtin
If the iseq only contains an `opt_invokebuiltin_delegate_leave` insn and
the builtin function (bf) is inlinable, the caller doesn't need to
build a method frame.

`vm_call_single_noarg_inline_builtin` is a fast path for such cases.
2023-03-23 14:03:12 +09:00
Takashi Kokubun
2e875549a9 s/MJIT/RJIT/ 2023-03-06 23:44:01 -08:00
Yusuke Endoh
2cc3963a00 Prevent wrong integer expansion
`(attr_index + 1)` leads to wrong integer expansion on 32-bit machines
(including Solaris 10 CI) because `attr_index_t` is uint16_t.
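A standalone C illustration of the promotion rule at play; the shift amount and layout here are made up, only the `uint16_t` promotion matters: `(attr_index + 1)` is computed in `int` arithmetic, so it must be widened before being shifted into a wider word.

```c
/* Standalone illustration of the promotion rule; the shift amount and layout
 * are made up, only the uint16_t promotion matters. */
#include <stdint.h>
#include <stdio.h>

typedef uint16_t attr_index_t;

int main(void)
{
    attr_index_t attr_index = 7;

    /* (attr_index + 1) is computed as `int` (usually 32 bits), so shifting it
     * toward the top of a wider cache word overflows (or is undefined) before
     * the result is ever widened:
     *
     *     uint64_t bad = (attr_index + 1) << 48;   // wrong
     */

    /* Widen first, then shift. */
    uint64_t good = ((uint64_t)attr_index + 1) << 48;

    printf("%llx\n", (unsigned long long)good);
    return 0;
}
```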

20221013T080004Z.fail.html.gz
```
  1) Failure:
TestRDocClassModule#test_marshal_load_version_2 [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_class_module.rb:493]:
<[doc: [doc (file.rb): [para: "this is a comment"]]]> expected but was
<[doc: [doc (file.rb): [para: "this is a comment"]]]>.

  2) Failure:
TestRDocStats#test_report_method_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:460]:
Expected /\#\ in\ file\ file\.rb:4/ to match "The following items are not documented:\n" +
"\n" +
"  class C # is documented\n" +
"\n" +
"    # in file file.rb\n" +
"    def m1; end\n" +
"\n" +
"  end\n" +
"\n" +
"\n".

  3) Failure:
TestRDocStats#test_report_attr_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:91]:
Expected /\#\ in\ file\ file\.rb:3/ to match "The following items are not documented:\n" +
"\n" +
"  class C # is documented\n" +
"\n" +
"    attr_accessor :a # in file file.rb\n" +
"\n" +
"  end\n" +
"\n" +
"\n".
```
2022-10-13 08:14:04 -07:00
Nobuyoshi Nakada
b55e3b842a Initialize shape attr index also in non-markable CC 2022-10-12 09:14:55 -07:00
Yusuke Endoh
7a9f865a1d Do not read cached_id from callcache on stack
The inline cache is initialized by vm_cc_attr_index_set only when
vm_cc_markable(cc). However, vm_getivar attempted to read the cache
even if the cc is not vm_cc_markable.

This caused a condition that depends on an uninitialized value.
Here is valgrind's output:

```
==10483== Conditional jump or move depends on uninitialised value(s)
==10483==    at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171)
==10483==    by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257)
==10483==    by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481)
==10483==    by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
==10483==    by 0x4C62B2: vm_exec_core (insns.def:820)
==10483==    by 0x4DD519: rb_vm_exec (vm.c:0)
==10483==    by 0x4F00B3: invoke_block (vm.c:1417)
==10483==    by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473)
==10483==    by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491)
==10483==    by 0x4D42B6: rb_yield (vm_eval.c:0)
==10483==    by 0x259128: rb_ary_each (array.c:2733)
==10483==    by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227)
==10483==    by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
==10483==    by 0x4C6254: vm_exec_core (insns.def:801)
==10483==    by 0x4DD519: rb_vm_exec (vm.c:0)
==10483==
```

In fact, the CI on FreeBSD 12 started failing with ad63b668e2.

```
gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby'
/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError)
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!'
	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!'
	from ./ext/extmk.rb:359:in `parse_args'
	from ./ext/extmk.rb:396:in `<main>'
```

This change adds a guard to read the cache only when vm_cc_markable(cc).
It might be better to initialize the cache as INVALID_SHAPE_ID when the
cc is not vm_cc_markable.
2022-10-12 20:28:24 +09:00
Jemma Issroff
913979bede Make inline cache reads / writes atomic with object shapes
Prior to this commit, we were reading and writing ivar index and
shape ID in inline caches in two separate instructions when
getting and setting ivars. This meant there was a race condition
with ractors and these caches where one ractor could change
a value in the cache while another was still reading from it.

This commit instead reads and writes shape ID and ivar index to
inline caches atomically so there is no longer a race condition.
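A C11 sketch of the idea, with hypothetical names and not the CRuby implementation: shape ID and ivar index are packed into one atomic 64-bit word, so readers and writers always see both values from the same update rather than a mix of two.

```c
/* C11 sketch with hypothetical names: shape id and ivar index are packed into
 * one atomic 64-bit word, so a concurrent reader never observes a
 * half-updated pair. Not the CRuby implementation. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t shape_id_t;
typedef uint16_t attr_index_t;

static _Atomic uint64_t ivar_cache;

void ivar_cache_store(shape_id_t shape_id, attr_index_t index)
{
    uint64_t packed = ((uint64_t)shape_id << 32) | index;
    atomic_store_explicit(&ivar_cache, packed, memory_order_release);
}

bool ivar_cache_load(shape_id_t expected_shape, attr_index_t *index_out)
{
    uint64_t packed = atomic_load_explicit(&ivar_cache, memory_order_acquire);
    if ((shape_id_t)(packed >> 32) != expected_shape) return false; /* miss */
    *index_out = (attr_index_t)(packed & 0xFFFF);
    return true;
}
```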

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-10-11 08:40:56 -07:00
Jemma Issroff
ad63b668e2 Revert "Revert "This commit implements the Object Shapes technique in CRuby.""
This reverts commit 9a6803c90b.
2022-10-11 08:40:56 -07:00
Aaron Patterson
9a6803c90b Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 16:01:50 -07:00
Jemma Issroff
d594a5a8bd This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-28 08:26:21 -07:00
Aaron Patterson
06abfa5be6 Revert this until we can figure out WB issues or remove shapes from GC
Revert "* expand tabs. [ci skip]"

This reverts commit 830b5b5c35.

Revert "This commit implements the Object Shapes technique in CRuby."

This reverts commit 9ddfd2ca00.
2022-09-26 16:10:11 -07:00
Jemma Issroff
9ddfd2ca00 This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26 09:21:30 -07:00
Jemma Issroff
ecff334995 Extract vm_ic_entry API to mimic vm_cc behavior 2022-07-18 12:44:01 -07:00
Nobuyoshi Nakada
90a8b1c543 Remove a typo hash [ci skip] 2022-01-29 15:33:13 +09:00
Jemma Issroff
1a180b7e18 Streamline cached attr reader / writer indexes
This commit removes the need to increment and decrement the indexes
used by vm_cc_attr_index getters and setters. It also introduces a
vm_cc_attr_index_p predicate function and a vm_cc_attr_index_initialize
function.
2022-01-26 09:02:59 -08:00
Koichi Sasada
df48db987d mandatory_only_cme should not be in def
`def` (`rb_method_definition_t`) is shared by multiple callable
method entries (cme, `rb_callable_method_entry_t`).

There are two issues:

* old -> young reference: `cme1->def->mandatory_only_cme = monly_cme`
  If `cme1` is young and `monly_cme` is young, there is no problem.
  However, another old `cme2` can refer to the same `def`; in this case, the
  old `cme2` points to the young `monly_cme`, which violates the gengc assumption.
* cmes can have different `defined_class` values, but `monly_cme` only has
  one `defined_class`. This does not make sense, and `monly_cme`
  should be created per cme (not per `def`).

To solve these issues, this patch allocates `monly_cme` per `cme`.
`cme` does not have room to store a pointer to the `monly_cme`,
so this patch introduces `overloaded_cme_table`, which is a weak-key map
`[cme] -> [monly_cme]`.

`def::body::iseqptr::monly_cme` is deleted.

The first issue is reported by Alan Wu.
2021-12-21 11:03:09 +09:00
Koichi Sasada
ca21eed6eb fix assertion on gc_cc_cme()
`cc->cme_` can be NULL when it has not been initialized yet.
This can be observed when running with `GC.stress == true`.
2021-11-25 13:57:49 +09:00
Koichi Sasada
7ec1fc37f4 add VM_CALLCACHE_ON_STACK
Check whether an iseq refers to an on-stack CC (it shouldn't).
2021-11-17 22:21:42 +09:00
Koichi Sasada
8d7116552d assert cc->cme_ != NULL
when `vm_cc_markable(cc)`.
2021-11-17 22:21:42 +09:00
Koichi Sasada
b2255153cf vm_empty_cc_for_super
As with `vm_empty_cc`, introduce a global variable which has
`.call_ = vm_call_super_method`. Use it if `cme == NULL` in
`vm_search_super_method`.
2021-11-17 22:21:42 +09:00
Koichi Sasada
84aba25031 assert cc->call_ != NULL 2021-11-17 22:21:42 +09:00