ruby/internal
Aaron Patterson 50c2c4bdde Make rb_vm_insns_count a thread local variable
`rb_vm_insns_count` is a global variable used for reporting YJIT
statistics. It is a counter that tallies the number of interpreter
instructions that have been executed so that we can approximate how
much time we're spending in YJIT compared to the interpreter.

Unfortunately, keeping this statistic means that every instruction
executed in the interpreter loop must increment the counter. Normally
this isn't a problem, but in multi-threaded situations (when Ractors are
used), incrementing this counter can become quite costly due to CPU
cache contention on the shared variable.

Additionally, since there is no locking when incrementing this global,
the count can't really make sense in a multi-threaded environment.

This commit changes `rb_vm_insns_count` to a thread local. That way each
Ractor has its own copy of the counter and incrementing the counter
becomes quite cheap. Of course this means that in multi-threaded
situations, the value doesn't really make sense (but it didn't make
sense before because of the lack of locking).
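
As a rough illustration of the idea (not the actual patch), the sketch
below gives each thread a private counter using C11 `_Thread_local`.
The names `insns_count`, `worker`, and `run_workers` are hypothetical
and do not appear in the Ruby source; the real counter lives in the VM
and is incremented by the interpreter loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rb_vm_insns_count: one copy per thread,
 * so increments never touch memory shared with another thread. */
static _Thread_local uint64_t insns_count;

struct worker_arg {
    uint64_t iterations; /* in: number of simulated instructions */
    uint64_t result;     /* out: this thread's final private count */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (uint64_t i = 0; i < arg->iterations; i++) {
        insns_count++; /* plain increment, no lock and no atomics */
    }
    arg->result = insns_count; /* publish the private count on exit */
    return NULL;
}

/* Start nthreads workers and return the sum of their private counts. */
static uint64_t run_workers(size_t nthreads, uint64_t iterations)
{
    enum { MAX_THREADS = 64 };
    pthread_t threads[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    uint64_t total = 0;

    assert(nthreads <= MAX_THREADS);
    for (size_t i = 0; i < nthreads; i++) {
        args[i].iterations = iterations;
        args[i].result = 0;
        int rc = pthread_create(&threads[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (size_t i = 0; i < nthreads; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].result;
    }
    return total;
}
```

In this sketch a meaningful total only exists if every thread reports
its own count, which is the trade-off described above: reading the
variable from any single thread shows only that thread's share.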

The counter is used for YJIT statistics, and since YJIT is basically
disabled when Ractors are in use, I don't think we care about
inaccuracies (for the time being). We can revisit this counter when we
give YJIT multi-threading support, but for the time being this commit
restores multi-threaded performance.

To test this, I used the benchmark in [Bug #20489].

Here is the performance on Ruby 3.2:

```
$ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8
ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in    2.53 secs    fish           external
   usr time   19.86 secs  370.00 micros   19.86 secs
   sys time    0.02 secs  320.00 micros    0.02 secs
```

We can see the regression in performance on the master branch (roughly
10x slower in wall-clock time):

```
$ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8
ruby 3.5.0dev (2025-01-10T16:22:26Z master 4a2702dafb) +PRISM [x86_64-linux]
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in   24.87 secs    fish           external
   usr time  195.55 secs    0.00 micros  195.55 secs
   sys time    0.00 secs  716.00 micros    0.00 secs
```

Here are the stats after this commit:

```
$ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8
ruby 3.5.0dev (2025-01-10T20:37:06Z tl 3ef0432779) +PRISM [x86_64-linux]
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in    2.46 secs    fish           external
   usr time   19.34 secs  381.00 micros   19.34 secs
   sys time    0.01 secs  321.00 micros    0.01 secs
```

[Bug #20489]
2025-01-10 13:39:21 -08:00
array.h Optimized instruction for Array#freeze 2024-09-05 12:46:02 +02:00
basic_operators.h Optimize instructions when creating an array just to call include? (#12123) 2024-11-26 14:31:08 -05:00
bignum.h Stop exporting symbols for MJIT 2023-03-06 21:59:23 -08:00
bits.h Add integer overflow check macros for add/sub as well as mul 2024-11-09 00:08:03 +09:00
class.h Rename size_pool -> heap 2024-10-03 21:20:09 +01:00
cmdlineopt.h [Feature #19790] Rename BUGREPORT_PATH as CRASH_REPORT 2023-09-25 22:57:28 +09:00
compar.h Introduce BOP_CMP for optimized comparison 2022-12-06 12:37:23 -08:00
compile.h Move the PC regardless of the leaf flag (#8232) 2023-08-16 20:28:33 -07:00
compilers.h
complex.h
cont.h Free everything at shutdown 2023-12-07 15:52:35 -05:00
dir.h
enc.h
encoding.h string.c: Directly create strings with the correct encoding 2024-11-13 13:32:32 +01:00
enum.h
enumerator.h
error.h Implement rb_bug_without_die 2024-12-12 14:07:56 -05:00
eval.h [Bug #20342] Consider wrapped load in main methods 2024-04-05 01:33:08 +09:00
file.h Revert "reuse open(2) from rb_file_load_ok on POSIX-like system" 2023-02-27 09:24:45 -08:00
fixnum.h rb_fix_mul_fix needs internal/bits.h for MUL_OVERFLOW_FIXNUM_P 2024-10-08 23:29:49 +09:00
gc.h Remove stale declaration for modular GC 2025-01-11 01:22:26 +09:00
hash.h Optimized instruction for Hash#freeze 2024-09-05 12:46:02 +02:00
imemo.h Pass allocation size to rb_imemo_new 2025-01-08 09:11:59 -05:00
inits.h Merge rb_objspace_alloc and Init_heap. 2024-04-04 15:00:57 +01:00
io.h Introduce Fiber::Scheduler#blocking_operation_wait. (#12016) 2024-11-20 19:40:17 +13:00
load.h
loadpath.h
math.h
missing.h Free environ when RUBY_FREE_AT_EXIT 2024-01-11 10:09:53 -05:00
numeric.h Faster Integer.sqrt for large bignum 2024-03-18 13:52:27 +09:00
object.h Move rb_class_allocate_instance from gc.c to object.c 2024-02-14 13:43:02 -05:00
parse.h [Bug #20989] Ripper: Pass compile_error 2024-12-28 11:25:57 +09:00
proc.h Fix use-after-free in ep in Proc#dup for ifunc procs 2024-12-13 10:10:03 -05:00
process.h Put rb_fork back into process.c 2023-05-21 23:00:27 +09:00
ractor.h Fix shared GC with -DRUBY_DEBUG 2024-10-24 16:08:46 +01:00
random.h Free everything at shutdown 2023-12-07 15:52:35 -05:00
range.h Implement Struct on VWA 2023-06-05 15:47:16 -04:00
rational.h Don't redefine RB_OBJ_WRITE 2023-01-18 08:49:32 -05:00
re.h Stop allocating unused backref strings at defined? 2023-06-27 23:14:10 +09:00
ruby_parser.h Change return value of gets function to be rb_parser_string_t * instead of VALUE 2024-05-04 11:59:10 +09:00
sanitizers.h Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
serial.h
signal.h Revert "hijack SIGCHLD handler for internal use" 2024-04-04 21:48:14 +09:00
st.h Move internal ST functions to internal/st.h 2023-12-25 10:41:12 -05:00
static_assert.h
string.h YJIT: Specialize String#[] (String#slice) with fixnum arguments (#12069) 2024-11-13 12:25:09 -05:00
struct.h Remove dead function rb_struct_const_heap_ptr 2025-01-03 17:02:50 -05:00
symbol.h Free everything at shutdown 2023-12-07 15:52:35 -05:00
thread.h introduce rb_ec_check_ints() 2024-11-08 18:02:46 +09:00
time.h
transcode.h Free everything at shutdown 2023-12-07 15:52:35 -05:00
util.h
variable.h Fix compaction for generic ivars 2023-11-24 13:29:04 -05:00
vm.h Make rb_vm_insns_count a thread local variable 2025-01-10 13:39:21 -08:00
warnings.h