457 commits

Author SHA1 Message Date
Alan Wu
f901b934fd YJIT: Make Block::start_addr non-optional
We set the block address as soon as we make the block, so there is no
point in making it `Option<CodePtr>`. No memory saving, unfortunately,
as `mem::size_of::<Block>() = 176` before and after this change. Still
a simplification for the logic, though.
2023-02-03 14:58:01 -05:00
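The simplification described in f901b934fd can be sketched as follows. This is a minimal illustration only: `CodePtr`, `BlockBefore`, and `BlockAfter` are hypothetical stand-ins, not YJIT's actual definitions, and the sketch makes no claim about struct sizes.

```rust
/// Hypothetical stand-in for a pointer into generated code.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CodePtr(pub usize);

/// Before: the address is optional, so every reader must handle `None`.
pub struct BlockBefore {
    pub start_addr: Option<CodePtr>,
}

impl BlockBefore {
    pub fn start_addr(&self) -> CodePtr {
        // Panics if the address was never set.
        self.start_addr.expect("block address should be set")
    }
}

/// After: the address is supplied when the block is created.
pub struct BlockAfter {
    pub start_addr: CodePtr,
}

impl BlockAfter {
    pub fn new(start_addr: CodePtr) -> Self {
        Self { start_addr }
    }

    pub fn start_addr(&self) -> CodePtr {
        // No unwrapping needed; the value always exists.
        self.start_addr
    }
}
```

With the non-optional field, callers no longer need an `unwrap()` or `expect()` at each use site.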
Takashi Kokubun
08c529be90
YJIT: Support ifunc on invokeblock (#7233) 2023-02-03 10:14:42 -05:00
Maxime Chevalier-Boisvert
73674cac2b
YJIT: log the names of methods we call to in disasm (#7231)
* YJIT: log the names of methods we call to in disasm

* Assert that pointer is not null

* Handle case where UTF8 conversion not possible
2023-02-02 16:54:16 -05:00
Alan Wu
92ac5f686b Fix typos in YJIT [ci skip] 2023-02-02 16:16:45 -05:00
Alan Wu
3b83b265f1 YJIT: Crash with rb_bug() when panicking
Helps with getting good bug reports in the wild. Intended to be
backported to the 3.2.x series.
2023-02-02 15:16:09 -05:00
Alan Wu
188688a53e YJIT: ARM64: Fix long jumps to labels
Previously, with Code GC, YJIT panicked while trying to emit a B.cond
instruction with an offset that is not encodable in 19 bits. This only
happens when the code in an assembler instance straddles two pages.

To fix this, when we detect that a jump to a label can land on a
different page, we switch to a fresh new page and regenerate all the
code in the assembler there. We still assume that no one assembler has
so much code that it wouldn't fit inside a fresh new page.

[Bug #19385]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2023-02-02 10:05:00 -05:00
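The 19-bit limit mentioned in 188688a53e can be checked with a small helper. The sketch below is an assumption-laden illustration, not YJIT's code: the function name is hypothetical, and it relies only on the A64 rule that B.cond stores a signed 19-bit offset counted in 4-byte instructions.

```rust
/// Width of the signed immediate field in an A64 `B.cond` instruction.
const BCOND_IMM_BITS: u32 = 19;

/// Returns true if `byte_offset` can be encoded as a `B.cond` target.
/// (Hypothetical helper name; the offset is relative to the branch itself.)
pub fn bcond_offset_fits(byte_offset: i64) -> bool {
    // Targets must be instruction-aligned.
    if byte_offset % 4 != 0 {
        return false;
    }
    // The immediate counts 4-byte instructions, not bytes.
    let insn_offset = byte_offset / 4;
    let min = -(1i64 << (BCOND_IMM_BITS - 1));
    let max = (1i64 << (BCOND_IMM_BITS - 1)) - 1;
    (min..=max).contains(&insn_offset)
}
```

This gives a reach of roughly 1 MiB in either direction, which is why a label on a different code page can fall out of range.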
Alan Wu
905e12a30d YJIT: ARM64: Move functions out of arm64_emit() 2023-02-02 10:05:00 -05:00
Alan Wu
a690db390d YJIT: other_cb is None in tests
Since the other cb is in CodegenGlobals, and we want Rust tests to be
self-contained.
2023-02-02 10:05:00 -05:00
Alan Wu
81b7f86f47 YJIT: Move CodegenGlobals::freed_pages into an Rc
This allows for supplying a freed_pages vec in Rust tests. We need it so we
can test scenarios that occur after code GC.
2023-02-02 10:05:00 -05:00
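Why an `Rc` helps with testing in 81b7f86f47 can be sketched as below. The `CodeBlock` type, its fields, and `writable_pages` are hypothetical simplifications; the only idea taken from the commit is that the freed-pages list is passed in rather than read from global state.

```rust
use std::rc::Rc;

/// Hypothetical, simplified code block; not YJIT's actual `CodeBlock`.
pub struct CodeBlock {
    page_count: usize,
    /// Shared list of page indexes freed by code GC.
    /// Assumed to be `None` until a code GC has run.
    freed_pages: Rc<Option<Vec<usize>>>,
}

impl CodeBlock {
    /// The caller supplies the list, so a test can construct any scenario.
    pub fn new(page_count: usize, freed_pages: Rc<Option<Vec<usize>>>) -> Self {
        Self { page_count, freed_pages }
    }

    /// Pages that may receive new code in this toy model:
    /// every page before any GC, only the freed ones afterwards.
    pub fn writable_pages(&self) -> Vec<usize> {
        match &*self.freed_pages {
            Some(pages) => pages.clone(),
            None => (0..self.page_count).collect(),
        }
    }
}
```

A test can then pass `Rc::new(Some(vec![1]))` to simulate the state after a code GC without touching any globals.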
Koichi Sasada
0a82bfe5e1
use correct svar (#7225)
* use correct svar

Without this patch, the svar location used is that of the "nearest Ruby frame".
This is almost always correct, but it is wrong when the `each` method
is written in Ruby.

```ruby
class C
  include Enumerable
  def each
    %w(bar baz).each{|e| yield e}
  end
end

C.new.grep(/(b.)/){|e| p [$1, e]}
```

This patch fixes the issue by traversing the ifunc's cfp.

Note that if cfp doesn't specify this Thread's cfp stack, reserved
svar location (`ec->root_svar`) is used.

* make yjit-bindgen

---------

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2023-02-01 16:13:19 -08:00
Maxime Chevalier-Boisvert
2675f2c864 Remove whitespace 2023-02-01 16:05:22 -05:00
Jimmy Miller
1148fab7ae
YJIT: Handle splat with opt more fully (#7209)
* YJIT: Handle splat with opt more fully

* Update yjit/src/codegen.rs

---------

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2023-01-31 16:18:56 -05:00
Takashi Kokubun
e11067ebbf
YJIT: Fix BorrowMutError on BOP invalidation (#7212) 2023-01-31 15:26:56 -05:00
Alan Wu
eac5ae22e2 YJIT: Group unimplemented method types together
Grouping these together helps with finding all of the unimplemented
method types. They were previously interleaved with other match arms,
both long and short.
2023-01-31 14:29:18 -05:00
Takashi Kokubun
2a0bf269c9
YJIT: Implement codegen for Kernel#block_given? (#7202) 2023-01-31 10:11:10 -05:00
Nobuyoshi Nakada
be81495c16
Silence dozens of useless warnings from nm on macOS 2023-01-31 19:42:01 +09:00
Jimmy Miller
07d1b3ddc3
YJIT: Add splat optimized_send (#7167) 2023-01-30 15:54:09 -05:00
Jimmy Miller
b32e1169c9
YJIT: Initial implementation of splat with optional params (#7166) 2023-01-30 15:51:55 -05:00
Takashi Kokubun
2e0f3b5546
YJIT: Fix BorrowMutError on GC.compact (#7176)
YJIT: Fix BorrowMutError
2023-01-30 11:16:33 -08:00
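The commit message for 2e0f3b5546 does not describe the fix, so the following only sketches the general failure mode behind a `BorrowMutError`: asking a `RefCell` for a second mutable borrow while the first is still live. The function names are hypothetical.

```rust
use std::cell::RefCell;

/// Returns true if a second mutable borrow is rejected while one is live.
pub fn nested_mut_borrow_is_rejected(cell: &RefCell<Vec<u32>>) -> bool {
    let _outer = cell.borrow_mut();
    // `try_borrow_mut` reports the conflict as `Err(BorrowMutError)`
    // instead of panicking like `borrow_mut` would.
    cell.try_borrow_mut().is_err()
}

/// Ending the first borrow before taking the next one avoids the error.
pub fn sequential_mut_borrows(cell: &RefCell<Vec<u32>>) -> usize {
    {
        let mut first = cell.borrow_mut();
        first.push(1);
    } // `first` is dropped here, releasing the borrow.
    let mut second = cell.borrow_mut();
    second.push(2);
    second.len()
}
```

Re-entrant paths, such as a callback that runs while a borrow is held, are a common way to reach the first situation.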
Takashi Kokubun
bc0dc9d40e
YJIT: Skip defer_compilation for fixnums if possible (#7168)
* YJIT: Skip defer_compilation for fixnums if possible

* YJIT: It should be Some(false)

* YJIT: Define two_fixnums_on_stack on Context
2023-01-30 13:55:00 -05:00
Alan Wu
e1ffafb285
YJIT: Inline return address callback (#7198)
This makes it so that the generator and the output code read in the same
order. I think it reads better this way.
2023-01-30 12:50:08 -05:00
Alan Wu
7d4395cb69 YJIT: Fix shared/static library symbol leaks
Rust 1.58.0 unfortunately doesn't provide facilities to control symbol
visibility/presence, but we care about controlling the list of
symbols exported from libruby-static.a and libruby.so.

This commit uses `ld -r` to make a single object out of rustc's
staticlib output, libyjit.a. This moves libyjit.a out of MAINLIBS and adds
libyjit.o into COMMONOBJS, which obviates the code for merging libyjit.a
into libruby-static.a. The odd appearance of libyjit.a in SOLIBS is also
gone.

To filter out symbols we do not want to export on ELF platforms, we use
objcopy after the partial link. On darwin, we supply a symbol list to
the linker which takes care of hiding unprefixed symbols.

[Bug #19255]

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-01-27 12:28:09 -05:00
Takashi Kokubun
887d21613c
YJIT: Avoid BorrowError on GC.compact (#7164) 2023-01-20 13:07:03 -08:00
Jimmy Miller
36fa4f13ca
YJIT: get rid of unneeded .into() 2023-01-20 10:57:41 -05:00
Jimmy Miller
bf3940a306 YJIT: Refactor side_exits 2023-01-19 16:10:58 -05:00
Takashi Kokubun
5ce0c13f18
YJIT: Remove duplicated information in BranchTarget (#7151)
Note: On the new code of yjit/src/core.rs:2178, we no longer leave the state `.block=None` but `.address=Some...`, which might be important.

We assume it's actually not needed and take a risk here to minimize heap allocations, but in case it turns out to be necessary, we could signal/resurrect that state by introducing a new BranchTarget (or BranchShape) variant dedicated to it.
2023-01-19 12:02:25 -08:00
Jimmy Miller
762a3d80f7
Implement splat for cfuncs. Split exit cases to better capture where we are exiting (#6929)
YJIT: Implement splat for cfuncs. Split exit cases

This also implements a new check for ruby2keywords as the last
argument of a splat. This does mean that we generate more code, but in
actual benchmarks where we gained speed from this (binarytrees) I
don't see any significant slow down. I did have to struggle here with
the register allocator to find code that didn't allocate too many
registers. It's a bit hard when everything is implicit. But I think I
got to the minimal amount of copying and stuff given our current
allocation strategy.
2023-01-19 13:42:49 -05:00
Alan Wu
4b42392f8e YJIT: Use .as_side_exit() for jumps to counted exits
Fewer cycles running nops when these jumps are not taken. Fixing all
these so when they get copy pasted in the future we save on padding.
2023-01-18 20:52:19 -05:00
Maxime Chevalier-Boisvert
6bb576fe75
YJIT: implement codegen for String#empty? (#7148)
2023-01-18 15:41:28 -05:00
Maxime Chevalier-Boisvert
cd97976328
Add stats so we can keep track of x86 rel32 vs register calls (#7142)
* Add stats so we can keep track of x86 rel32 vs register calls

To know if we get that "prime real estate" as Alan put it.

* Fix bug pointed out by Alan
2023-01-18 11:08:55 -05:00
Alan Wu
14fe7a081a YJIT: Use ThinLTO for Rust parts in release mode
This reduces the code size of libyjit.a by a lot. On darwin it went from
23 MiB to 12 MiB for me. I chose ThinLTO over fat LTO for the relatively
fast build time; in case we need to debug release-build-only problems
it won't be painful.
2023-01-16 17:32:15 -05:00
Alan Wu
b4cdde468b YJIT: Use SIZEOF_VALUE_I32 instead of ... as i32
Shorter, and easier to parse without parentheses.
2023-01-13 15:32:28 -05:00
Alan Wu
84b1f48891 YJIT: Factor out VALUE_BITS = (8 * SIZE_OF_VALUE as u8)
Using a constant shows intention better and is less noisy. It always
took me a second to parse the long expression.
2023-01-13 15:32:28 -05:00
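The two constants introduced in b4cdde468b and 84b1f48891 can be sketched as below. `Value` is a stand-in assumed to be pointer-sized, and `slot_offset` is a hypothetical helper added only to show the constant in use.

```rust
use std::mem::size_of;

/// Stand-in for Ruby's VALUE, assumed here to be pointer-sized.
pub type Value = usize;

/// Size of a VALUE in bytes.
pub const SIZEOF_VALUE: usize = size_of::<Value>();

/// Same size, pre-cast for use in i32 offset arithmetic.
pub const SIZEOF_VALUE_I32: i32 = SIZEOF_VALUE as i32;

/// Number of bits in a VALUE. `as` binds tighter than `*`,
/// so this is `8 * (SIZEOF_VALUE as u8)`.
pub const VALUE_BITS: u8 = 8 * SIZEOF_VALUE as u8;

/// Byte offset of the `n`-th VALUE-sized slot (hypothetical helper).
pub fn slot_offset(n: i32) -> i32 {
    n * SIZEOF_VALUE_I32
}
```

Compared with writing `(SIZEOF_VALUE as i32)` inline, the named constant removes a pair of parentheses and a cast from each call site.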
Ian Ker-Seymer
8d3ff66389
Enable clippy checks for yjit in CI (#7093)
* Add job to check clippy lints in CI

* Address all remaining clippy lints

* Check lints on arm64 as well

* Apply latest clippy lints

* Do not exit 0 on clippy warnings
2023-01-12 10:14:17 -05:00
Nobuyoshi Nakada
cc15963aa3
Strip trailing spaces [ci skip] 2023-01-12 09:29:56 +09:00
Takashi Kokubun
3642006872
YJIT: Add a few asm comments (#7105)
* YJIT: Add a few asm comments

* YJIT: Clarify exiting insns

* YJIT: Fix cargo test
2023-01-11 11:12:15 -08:00
Aaron Patterson
5bf7218b01
Differentiate T_ARRAY and array subclasses (#7091)
* Differentiate T_ARRAY and array subclasses

This commit teaches the YJIT context the difference between Arrays
(objects with type T_ARRAY and class rb_cArray) vs Array subclasses
(objects with type T_ARRAY but _not_ class rb_cArray).  It uses this
information to reduce the number of guards emitted when using
`jit_guard_known_klass` with rb_cArray, notably opt_aref

* Update yjit/src/core.rs

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2023-01-10 13:54:07 -05:00
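The guard reduction described in 5bf7218b01 can be pictured with a small decision function. Everything named here (`KnownType`, `ArrayGuard`, `guard_for_exact_array`) is hypothetical, and the exact checks YJIT emits may differ; the sketch only shows how more precise type knowledge can translate into fewer guards.

```rust
/// Hypothetical subset of the type information a JIT context might track.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum KnownType {
    /// Nothing is known about the value.
    Unknown,
    /// Object type is T_ARRAY; the class may be Array or a subclass.
    TArray,
    /// Object type is T_ARRAY and the class is exactly rb_cArray.
    CArray,
}

/// Checks that would be emitted when code requires exactly rb_cArray.
#[derive(Debug, PartialEq)]
pub enum ArrayGuard {
    HeapAndClassCheck,
    ClassCheck,
    NoGuard,
}

pub fn guard_for_exact_array(known: KnownType) -> ArrayGuard {
    match known {
        // Must verify both that it is a heap object and that its class matches.
        KnownType::Unknown => ArrayGuard::HeapAndClassCheck,
        // Already known to be a heap T_ARRAY; only the class is in doubt.
        KnownType::TArray => ArrayGuard::ClassCheck,
        // Fully known; nothing to check.
        KnownType::CArray => ArrayGuard::NoGuard,
    }
}
```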
Alan Wu
aeddc19340
YJIT: Save PC and SP before calling leaf builtins (#7090)
Previously, we did not update `cfp->sp` before calling the C function of
ISEQs marked with `Primitive.attr! "inline"` (leaf builtins). This
caused the GC to miss temporary values on the stack in case the function
allocates and triggers a GC run. Right now, there are only a few leaf
builtins in numeric.rb on Integer methods such as `Integer#~`. Since
these methods only allocate when operating on big numbers, we missed
this issue.

Fix by saving PC and SP before calling the functions -- our usual
protocol for calling C functions that may allocate on the GC heap.

[Bug #19316]
2023-01-10 11:11:10 -05:00
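The bug in aeddc19340 can be modelled with a toy frame in which the collector only scans slots below the saved stack pointer. `Frame`, `gc_roots`, and `save_sp` are hypothetical names, and object ids stand in for real VALUEs.

```rust
/// Toy control frame; `saved_sp` stands in for `cfp->sp`.
pub struct Frame {
    /// VM value stack; each slot holds an object id.
    pub stack: Vec<u32>,
    /// Stack pointer as last written back to memory, which is what the GC sees.
    pub saved_sp: usize,
}

/// Toy GC root scan: only slots below the saved stack pointer are treated
/// as live. Panics if `saved_sp` exceeds the stack length.
pub fn gc_roots(frame: &Frame) -> &[u32] {
    &frame.stack[..frame.saved_sp]
}

/// What generated code needs to do before calling a C function that may
/// allocate: write the up-to-date stack pointer back to the frame.
pub fn save_sp(frame: &mut Frame, jit_sp: usize) {
    frame.saved_sp = jit_sp;
}
```

If the generated code has pushed temporaries but `saved_sp` is stale, `gc_roots` omits them, which mirrors how the real GC could miss temporary values on the stack.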
Takashi Kokubun
6a585dbd5a
YJIT: Fix a compilation warning with release build (#7092)
warning: unused variable: `start_addr`
   --> ../yjit/src/asm/mod.rs:359:39
    |
359 |     pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
    |                                       ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_start_addr`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `end_addr`
   --> ../yjit/src/asm/mod.rs:359:60
    |
359 |     pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
    |
2023-01-10 11:00:25 -05:00
Takashi Kokubun
a7fbdc35a2
YJIT: Remove old comments for regenerated branches (#7083) 2023-01-09 11:29:41 -05:00
Takashi Kokubun
00d58afb5d
YJIT: Make iseq_get_location consistent with iseq.c (#7074)
* YJIT: Make iseq_get_location consistent with iseq.c

* YJIT: Call it "YJIT entry point"

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2023-01-06 11:49:59 -08:00
Takashi Kokubun
311ce91733
YJIT: Colorize outlined code differently on --yjit-dump-disasm (#7073)
* YJIT: Colorize outlined code differently

on --yjit-dump-disasm

* YJIT: Reduce the number of escape sequences
2023-01-06 11:49:45 -08:00
Aaron Patterson
6c618cb789 Use a different name for megamorphic setivar exits
We should differentiate between set and get for megamorphic exits.  This
patch fixes the megamorphic exit name in gen_setinstancevariable so that
we can tell the difference between megamorphic get / set sites
2023-01-05 17:49:30 -08:00
Alan Wu
c240a18968 YJIT: Dump spill error to stderr [ci skip]
Since the panic message is in stderr, better to use the same stream in
case stdout and stderr are not synced due to IO redirection.
2023-01-03 16:33:47 -05:00
Alan Wu
43ff0c2c48 YJIT: Fix yield into block with >=30 locals on ARM
It's a register spill issue. Fix by moving the Qnil fill snippet to
after registers are released.

[Bug #19299]
2023-01-03 16:17:50 -05:00
Takashi Kokubun
1d3bfd804c
MJIT: Export fewer shape functions (#7007) 2022-12-23 10:18:57 -08:00
John Hawthorn
fbaa5db44a Use a BOP for Hash#default
On a hash miss we need to call default if it is redefined in order to
return the default value to be used. Previously we checked this with
rb_method_basic_definition_p, which avoids the method call but requires
a method lookup.

This commit replaces the previous check with BASIC_OP_UNREDEFINED_P and
a new BOP_DEFAULT. We still need to fall back to
rb_method_basic_definition_p when called on a subclass of Hash.

    |                |compare-ruby|built-ruby|
    |:---------------|-----------:|---------:|
    |hash_aref_miss  |       2.692|     3.531|
    |                |           -|     1.31x|

Co-authored-by: Daniel Colson <danieljamescolson@gmail.com>
Co-authored-by: "Ian C. Anderson" <ian@iancanderson.com>
Co-authored-by: Jack McCracken <me@jackmc.xyz>
2022-12-17 14:51:49 -08:00
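The difference between a method lookup and a BOP check in fbaa5db44a comes down to testing one bit in a table. The sketch below is a toy model: the `Vm` struct, the index `BOP_DEFAULT = 0`, and the flag value are hypothetical and do not match the real constants in the Ruby VM.

```rust
/// Hypothetical index for the "Hash#default" basic operation.
pub const BOP_DEFAULT: usize = 0;
/// Number of basic operations in this toy model.
pub const BOP_COUNT: usize = 1;
/// Hypothetical bit identifying the Hash class in a redefinition mask.
pub const HASH_REDEFINED_OP_FLAG: u16 = 1 << 0;

/// Toy VM state: one bitmask per basic operation.
pub struct Vm {
    redefined_flags: [u16; BOP_COUNT],
}

impl Vm {
    pub fn new() -> Self {
        Self { redefined_flags: [0; BOP_COUNT] }
    }

    /// Called when a program redefines the method behind `op` on a class.
    pub fn mark_redefined(&mut self, op: usize, klass_flag: u16) {
        self.redefined_flags[op] |= klass_flag;
    }

    /// Fast path: a single load and mask instead of a method lookup.
    pub fn basic_op_unredefined(&self, op: usize, klass_flag: u16) -> bool {
        (self.redefined_flags[op] & klass_flag) == 0
    }
}
```

On a hash miss, the fast path applies only while `basic_op_unredefined` returns true; as the commit notes, subclasses of Hash still need the slower check.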
Alan Wu
14158f1f8c
YJIT: Fix obj.send(:call)
All the method call types need to handle argument shifting in case they're
called by `.send`, and we weren't handling that in `OPTIMIZED_METHOD_TYPE_CALL`.

Lack of shifting caused the stack size assertion in gen_leave() to fail.

Discovered by Rails CI: https://buildkite.com/rails/rails/builds/91705#018516c4-f8f8-469e-bc2d-ddeb25ca8317/1920-2067
Diagnosed with help from `@eileencodes` and `@k0kubun`.
2022-12-15 18:10:28 -05:00
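The argument shifting that 14158f1f8c refers to can be sketched on a plain vector. `shift_send_args` is a hypothetical helper, and the stack layout (receiver first, then arguments) is an assumption made for illustration.

```rust
/// Removes the method-name argument that `send` receives and returns the
/// new argument count. Returns `None` if there is no such argument.
///
/// Assumed layout: `[..., recv, name, arg1, ..., argN]` with
/// `argc == N + 1` counting the name itself.
pub fn shift_send_args<T>(stack: &mut Vec<T>, argc: usize) -> Option<usize> {
    if argc == 0 || argc > stack.len() {
        return None;
    }
    // The name is the first of the `argc` topmost stack entries.
    let name_idx = stack.len() - argc;
    stack.remove(name_idx);
    Some(argc - 1)
}
```

For `obj.send(:call, a, b)` the stack goes from `[recv, :call, a, b]` with three arguments to `[recv, a, b]` with two; skipping this step leaves one extra value on the stack, consistent with the stack size assertion failure the commit mentions.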
Peter Zhu
c505448cdb Move definition of SIZE_POOL_COUNT back to gc.h
SIZE_POOL_COUNT is a GC macro, it should belong in gc.h and not shape.h.
SIZE_POOL_COUNT doesn't depend on shape.h so we can have shape.h depend
on gc.h.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2022-12-15 16:33:46 -05:00
Alan Wu
5fa608ed79
YJIT: Fix code GC freeing stubs with a trampoline (#6937)
Stubs we generate for invalidation don't necessarily co-locate with the
code that jumps to the stub. Since we rely on co-location to keep stubs
alive as they are in the outlined code block, it used to be possible for
code GC inside branch_stub_hit() to free the stub that's its direct
caller, leading us to return to freed code after.

Stubs used to look like:

```
mov arg0, branch_ptr
mov arg1, target_idx
mov arg2, ec
call branch_stub_hit
jmp return_reg
```

Since the call and the jump after the call are the same for all stubs, we
can extract them and use a static trampoline for them. That makes
branch_stub_hit() always return to static code. Stubs now look like:

```
mov arg0, branch_ptr
mov arg1, target_idx
jmp trampoline
```

Where the trampoline is:

```
mov arg2, ec
call branch_stub_hit
jmp return_reg
```

Code GC can now free stubs without problems since we'll always return
to the trampoline, which we generate once on boot and which lives forever.

This might save a small bit of memory due to factoring out the static
part of stubs, but it's probably minor.

[Bug #19234]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2022-12-15 15:10:14 -05:00