archive/ruby - Eplg Git: Free And Private Git Hosting

mirror of https://github.com/ruby/ruby.git synced 2025-08-15 13:39:04 +02:00

Author	SHA1	Message	Date
Aaron Patterson	8ac8225c50	Inline Class#new. This commit inlines instructions for Class#new. To make this work, we added a new YARV instruction, `opt_new`. `opt_new` checks whether or not the `new` method is the default allocator method. If it is, it allocates the object, and pushes the instance on the stack. If not, the instruction jumps to the "slow path" method call instructions. Old instructions: ``` > ruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0004 leave ``` New instructions: ``` > ./miniruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 putnil 0003 swap 0004 opt_new <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11 0007 opt_send_without_block <calldata!mid:initialize, argc:0, FCALL|ARGS_SIMPLE> 0009 jump 14 0011 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0013 swap 0014 pop 0015 leave ``` This commit speeds up basic object allocation (`Foo.new`) by 60%, but classes that take keyword parameters see an even bigger benefit because no hash is allocated when instantiating the object (3x to 6x faster). Here is an example that uses `Hash.new(capacity: 0)`: ``` > hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 1.082 s ± 0.004 s [User: 1.074 s, System: 0.008 s] Range (min … max): 1.076 s … 1.088 s 10 runs Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 627.9 ms ± 3.5 ms [User: 622.7 ms, System: 4.8 ms] Range (min … max): 622.7 ms … 633.2 ms 10 runs Summary ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran 1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ``` This commit changes the backtrace for `initialize`: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb class Foo def initialize puts caller end end def hello Foo.new end hello aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] test.rb:8:in 'Class#new' test.rb:8:in 'Object#hello' test.rb:11:in '<main>' aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24] test.rb:8:in 'Object#hello' test.rb:11:in '<main>' ``` It also increases memory usage for calls to `new` by 112 bytes: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb require "objspace" class Foo def initialize puts caller end end def hello Foo.new end puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello))) aaron@tc ~/g/ruby (inline-new)> make runruby RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb 656 aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] 544 ``` Thanks to @ko1 for coming up with this idea! Co-Authored-By: John Hawthorn <john@hawthorn.email>	2025-04-25 13:46:05 -07:00
Takashi Kokubun	ae3d6a321b	Fix yjit-bindgen	2025-04-18 21:53:01 +09:00
Takashi Kokubun	ae17323a65	Move a couple of bindgen targets to ZJIT bindgen We filed https://github.com/Shopify/zjit/pull/65 and https://github.com/Shopify/zjit/pull/64 concurrently.	2025-04-18 21:53:00 +09:00
Alan Wu	19e8e45f69	Rust tests: Load builtins (core library written in Ruby) The key here is calling rb_call_builtin_inits(), which, to stick to public API for robustness, is done by calling ruby_options(). Fixes: https://github.com/Shopify/zjit/issues/61	2025-04-18 21:53:00 +09:00
Max Bernstein	97f022b5e7	Print Ruby exception in test utils	2025-04-18 21:53:00 +09:00
Max Bernstein	ec41dffd05	Add compact Type lattice This will be used for local type inference and potentially SCCP.	2025-04-18 21:52:59 +09:00
Takashi Kokubun	0a543daf15	Add zjit_* instructions to profile the interpreter (https://github.com/Shopify/zjit/pull/16) * Add zjit_* instructions to profile the interpreter * Rename FixnumPlus to FixnumAdd * Update a comment about Invalidate * Rename Guard to GuardType * Rename Invalidate to PatchPoint * Drop unneeded debug!() * Plan on profiling the types * Use the output of GuardType as type refined outputs	2025-04-18 21:52:59 +09:00
Alan Wu	e24be0b8d5	Upgrade bindgen, so it generates `unsafe extern` as 2024 expects	2025-04-18 21:52:59 +09:00
Alan Wu	4326b0cece	boot_vm boots and runs	2025-04-18 21:52:57 +09:00
Alan Wu	14a4edaea6	bindgen works in --enable-zjit=dev mode.	2025-04-18 21:52:56 +09:00
Alan Wu	106b328117	make zjit-bindgen runs, but doesn't graft the right things yet	2025-04-18 21:52:56 +09:00
Takashi Kokubun	809b63c804	Fix bindgen	2025-04-18 21:52:56 +09:00
Takashi Kokubun	e6ffc141b1	Define ZJIT libs for non-gmake	2025-04-18 21:52:55 +09:00
Alan Wu	98790faae3	YJIT: Add Counter::invalidate_everything When YJIT is forced to discard all the code, that's bad for performance, so there should be an easy way to know about it.	2025-03-07 20:23:32 -05:00
Takashi Kokubun	bb91c303ba	YJIT: Rename get_temp_regs2() back to get_temp_regs() (#12866)	2025-03-06 10:52:49 -05:00
annichai-stripe	5085ec3ed9	Allow YJIT `mem-size` and `call-threshold` to be set at runtime via `YJIT.enable()` (#12505) * first commit * yjit.rb change * revert formatting * rename mem-size to exec-mem-size for correctness * wip, move setting into rb_yjit_enable directly * remove unused helper functions * add in call threshold * input validation with extensive eprintln * delete test script * exec-mem-size -> mem-size * handle input validation with asserts * add test cases related to input validation * modify test cases * move validation out of rs, into rb * add comments * remove trailing spaces * remove logging Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> * remove helper fn * Update test/ruby/test_yjit.rb Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> * trailing white space --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>	2025-03-03 15:45:39 -05:00
Aaron Patterson	6b3a97d74b	Remove undefined function from bindgen `rb_get_iseq_body_total_calls` was removed in `cd8d20cd1f`, but it's still in the YJIT bindgen file. This commit just removes it from bindgen	2025-02-16 16:37:36 -05:00
Aaron Patterson	8cafa5b8ce	Only count VM instructions in YJIT stats builds The instruction counter is slowing multi-Ractor applications. I had changed it to use a thread local, but using a thread local is slowing single threaded applications. This commit only enables the instruction counter in YJIT stats builds until we can figure out a way to gather the information with lower overhead. Co-authored-by: Randy Stauner <randy.stauner@shopify.com>	2025-02-14 14:39:35 -05:00
Alan Wu	41251fdd30	YJIT: Fix linker warnings on macOS for Cargo (development) builds	2025-02-13 17:27:28 -05:00
Peter Zhu	16f41eca53	Remove dead iv_index_tbl field in RObject	2025-02-12 14:03:07 -05:00
dependabot[bot]	afb47a1f10	Bump capstone from 0.12.0 to 0.13.0 in /yjit Bumps [capstone](https://github.com/capstone-rust/capstone-rs) from 0.12.0 to 0.13.0. - [Release notes](https://github.com/capstone-rust/capstone-rs/releases) - [Changelog](https://github.com/capstone-rust/capstone-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/capstone-rust/capstone-rs/compare/capstone-v0.12.0...capstone-v0.13.0) --- updated-dependencies: - dependency-name: capstone dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-02-05 11:37:34 +09:00
Alan Wu	9497820bcf	YJIT: Remove comments that refer to the removed "stats" feature The Cargo feature was removed in `2de8b5b805` and it's available in all build configs now. [ci skip]	2025-01-30 18:00:53 -05:00
Alan Wu	95bf359087	YJIT: Turn on dead code lint for the stats module	2025-01-30 18:00:53 -05:00
Alan Wu	7e733ca551	YJIT: Explicitly specify C ABI to fix a nightly Rust warning	2025-01-30 18:00:53 -05:00
Alan Wu	5a7089fc03	YJIT: A64: Remove assert that trips when OOM at page boundary With a well-timed OOM around a page switch in the backend, it can return RetryOnNextPage twice and crash due to the assert. (More places can signal OOM now since VirtualMem tracks Rust malloc heap size for --yjit-mem-size.) Return error in these cases instead of crashing. Fixes: https://github.com/Shopify/ruby/issues/566	2025-01-29 19:09:39 -05:00
Alan Wu	58ccce60cf	YJIT: Initialize locals in ISeqs defined with `...` (#12660) * YJIT: Fix indentation [ci skip] Fixes: `cdf33ed5f3` * YJIT: Initialize locals in ISeqs defined with `...` Previously, callers of forwardable ISeqs moved the stack pointer up without writing to the stack. If there happens to be a stale value in the area skipped over, it could crash due to "try to mark T_NONE". Also, the uninitialized local variables were observable through `binding`. Initialize the locals to nil. [Bug #21021]	2025-01-28 23:54:38 -05:00
Alan Wu	4d8eaa9e45	YJIT: Rename send_iseq_forwarding->send_forwarding It's in gen_send_general(), so nothing specifically to do with iseqs.	2025-01-10 18:03:31 -05:00
Aaron Patterson	50c2c4bdde	Make rb_vm_insns_count a thread local variable `rb_vm_insns_count` is a global variable used for reporting YJIT statistics. It is a counter that tallies the number of interpreter instructions that have been executed; this way we can approximate how much time we're spending in YJIT compared to the interpreter. Unfortunately keeping this statistic means that every instruction executed in the interpreter loop must increment the counter. Normally this isn't a problem, but in multi-threaded situations (when Ractors are used), incrementing this counter can become quite costly due to page caching issues. Additionally, since there is no locking when incrementing this global, the count can't really make sense in a multi-threaded environment. This commit changes `rb_vm_insns_count` to a thread local. That way each Ractor has its own copy of the counter and incrementing the counter becomes quite cheap. Of course this means that in multi-threaded situations, the value doesn't really make sense (but it didn't make sense before because of the lack of locking). The counter is used for YJIT statistics, and since YJIT is basically disabled when Ractors are in use, I don't think we care about inaccuracies (for the time being). We can revisit this counter when we give YJIT multi-threading support, but for the time being this commit restores multi-threaded performance. To test this, I used the benchmark in [Bug #20489]. Here is the performance on Ruby 3.2: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.53 secs fish external usr time 19.86 secs 370.00 micros 19.86 secs sys time 0.02 secs 320.00 micros 0.02 secs ``` We can see the regression in performance on the master branch: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T16:22:26Z master 4a2702dafb) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 24.87 secs fish external usr time 195.55 secs 0.00 micros 195.55 secs sys time 0.00 secs 716.00 micros 0.00 secs ``` Here are the stats after this commit: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T20:37:06Z tl 3ef0432779) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.46 secs fish external usr time 19.34 secs 381.00 micros 19.34 secs sys time 0.01 secs 321.00 micros 0.01 secs ``` [Bug #20489]	2025-01-10 13:39:21 -08:00
Alan Wu	dd80d9b089	YJIT: Filter `&` calls from specialized C method codegen Evident with the crash reported in [Bug #20997], the C replacement codegen functions aren't authored to handle block arguments (nor should they because the extra code from the complexity defeats optimization). Filter sites with VM_CALL_ARGS_BLOCKARG.	2025-01-08 19:47:39 -05:00
Alan Wu	c71addc522	YJIT: Fix crash when yielding keyword arguments Previously, the code for dropping surplus arguments when yielding into blocks erroneously attempted to drop keyword arguments when there are in fact no surplus arguments. Fix the condition and test that supplying the exact number of keyword arguments as required compiles without fallback.	2025-01-04 12:53:20 -05:00
Takashi Kokubun	527cc73282	YJIT: Return None if entry block compilation fails (#12445)	2024-12-23 22:12:08 +00:00
Takashi Kokubun	6bf7a1765f	YJIT: Load registers on JIT entry to reuse blocks (#12355)	2024-12-17 12:32:42 -05:00
Alan Wu	f3a117605c	YJIT: Speculate block arg for `c_func_method(&nil)` calls (#12326) A good amount of call sites always pass nil as block argument, but the nil doesn't show up in the context. Put a runtime guard for those cases to handle it. Particularly relevant for the `ruby-lsp` benchmark in `yjit-bench`. Up to a 2% speedup across headline benchmarks. Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Kevin Menard <kevin@nirvdrum.com> Co-authored-by: Randy Stauner <randy.stauner@shopify.com>	2024-12-13 10:41:04 -05:00
Alan Wu	e7ee7d43f3	YJIT: Allow then-unknown `static_mut_refs` on older Rusts [ci skip]	2024-12-12 18:52:51 -05:00
Alan Wu	d53e4545f4	YJIT: Fix unread field lint in release builds ``` warning: fields `blue_begin` and `blue_end` are never read ```	2024-12-11 17:44:43 -05:00
Alan Wu	9fe06cc035	YJIT: Disable static_mut_refs for now	2024-12-11 17:44:43 -05:00
Alan Wu	6cb75564f9	YJIT: Use the correct size constant	2024-12-11 17:44:43 -05:00
Takashi Kokubun	14e0a40cd0	YJIT: Add a comment about a lazy frame call jit_prepare_lazy_frame_call is a complicated trick and comes with memory overhead. Every use of the function should come with justification.	2024-12-09 10:09:40 -08:00
Takashi Kokubun	cff031253f	YJIT: Spill/load argument registers to reuse blocks (#12287) * YJIT: Spill/load argument registers to reuse blocks * Mention the immediate function name * Explain the context behind spill/load operations	2024-12-09 10:02:40 -08:00
Max Bernstein	8010d79bb4	YJIT: Only enable disassembly colors for tty (#12283) * YJIT: Use fully-qualified name for OPTIONS in get_options! * YJIT: Only enable disassembly colors for tty	2024-12-09 10:36:17 -05:00
Maximillian Polhill	1c4dbb133e	YJIT: Generate specialized code for Symbol for objtostring (#12247) * YJIT: Generate specialized code for Symbol for objtostring Co-authored-by: John Hawthorn <john@hawthorn.email> * Update yjit/src/codegen.rs --------- Co-authored-by: John Hawthorn <john@hawthorn.email> Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>	2024-12-04 21:34:16 +00:00
Maxime Chevalier-Boisvert	4b4d52ef50	YJIT: track time since initialization (#12263)	2024-12-04 21:24:36 +00:00
Alan Wu	2fc357c16d	YJIT: Avoid std::ffi::CString with rb_intern2() during boot Fewer allocations on boot, too. Suggested-by: https://github.com/ruby/ruby/pull/12217	2024-11-29 16:45:22 -05:00
John Hawthorn	a5119a3f27	YJIT: Add missing prepare before calling str_dup	2024-11-28 15:04:12 -05:00
Randy Stauner	8f9b9aecd0	YJIT: Implement opt_reverse insn (#12175)	2024-11-26 16:49:24 -05:00
Randy Stauner	1dd40ec18a	Optimize instructions when creating an array just to call `include?` (#12123) * Add opt_duparray_send insn to skip the allocation on `#include?` If the method isn't going to modify the array we don't need to copy it. This avoids the allocation / array copy for things like `[:a, :b].include?(x)`. This adds a BOP for include? and tracks redefinition for it on Array. Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com> * YJIT: Implement opt_duparray_send include_p Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com> * Update opt_newarray_send to support simple forms of include?(arg) Similar to opt_duparray_send but for non-static arrays. * YJIT: Implement opt_newarray_send include_p --------- Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>	2024-11-26 14:31:08 -05:00
Maxime Chevalier-Boisvert	081bdc5125	YJIT: fix small typo in command line options help (#12167)	2024-11-25 19:32:19 +00:00
Alan Wu	bf718cef59	YJIT: Make compilation_failure a default stat (#12128) It's good to monitor compilation failures.	2024-11-20 17:13:31 -05:00
Alan Wu	350b544468	YJIT: Refactor to forward jump_to_next_insn() return value It's more concise this way and since `return Some(EndBlock)` is the only correct answer, no point repeating it everywhere.	2024-11-20 10:06:14 -05:00
Alan Wu	199877d258	YJIT: Abandon block when gen_outlined_exit() fails When CodeBlock::set_page fails (part of next_page(), see their docs for exact conditions), it can cause gen_outlined_exit() to fail while there is still plenty of memory available. Previously, this could leave YJIT running incomplete code due to taking the early return in end_block_with_jump(), which manifested as crashes with SIGILL. Add and use a wrapper with error handling.	2024-11-20 10:06:14 -05:00
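
The fast-path/slow-path split described in the `opt_new` commit (8ac8225c50) can be modeled in plain Ruby. This is only a rough sketch under stated assumptions: `model_opt_new`, `Point`, and `Cached` are hypothetical names, and checking `Method#owner` is a stand-in for whatever check the VM instruction actually performs internally.

```ruby
# Hypothetical Ruby-level model of the opt_new fast/slow path split.
# The real instruction does its check inside the VM; using Method#owner
# here is an assumption made purely for illustration.
def model_opt_new(klass, *args, **kwargs, &block)
  if klass.method(:new).owner == Class
    # Fast path: `new` is still Class#new, so allocate directly and
    # call initialize on the fresh instance.
    obj = klass.allocate
    obj.send(:initialize, *args, **kwargs, &block)
    obj
  else
    # Slow path: `new` was redefined, so fall back to a normal call.
    klass.new(*args, **kwargs, &block)
  end
end

# A class that keeps the default `new` and takes a keyword parameter.
class Point
  attr_reader :x, :y

  def initialize(x, y: 0)
    @x = x
    @y = y
  end
end

# A class that overrides `new`, which forces the slow path.
class Cached
  def self.new(*)
    :custom_new
  end
end

point = model_opt_new(Point, 1, y: 2)
custom = model_opt_new(Cached)
```

In this model `Point` takes the fast path and `Cached` takes the slow path. It does not reproduce the backtrace or memory changes the commit message reports; those come from the instruction sequence itself.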


1114 commits
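
For reference, the code shapes that commit 1dd40ec18a describes (an array built only to call `include?` on it) look roughly like the following. The method names are hypothetical, and whether the optimized instructions are actually emitted depends on the Ruby build in use.

```ruby
# Static literal array used only as the receiver of include?: the case
# the commit message associates with opt_duparray_send.
def allowed_role?(role)
  [:admin, :owner].include?(role)
end

# Array with non-static elements: the commit message says
# opt_newarray_send covers simple forms of this case.
def one_of?(value, a, b)
  [a, b].include?(value)
end

results = [allowed_role?(:admin), allowed_role?(:guest), one_of?(3, 1, 3)]
```

The behavior is the same with or without the optimization; running the file with `ruby --dump=insns`, as in the `opt_new` commit message, is one way to inspect which instructions a given build generates.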