Commit graph

1109 commits

Koichi Sasada
b1b73936c1 Primitive.mandatory_only? for fast path
Compared with C methods, a built-in method written in Ruby is
slower when only mandatory parameters are given because it needs to
check the arguments and fill in default values for optional and keyword
parameters (C methods can check the number of parameters with `argc`,
so there is no overhead). Passing only mandatory arguments is common
(optional arguments are the exception in many cases), so it is important
to provide a fast path for such common cases.

`Primitive.mandatory_only?` is a special builtin function used in an
`if` expression like this:

```ruby
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    if Primitive.mandatory_only?
      Primitive.time_s_at1(time)
    else
      Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
    end
  end
```

and it makes two ISeqs:

```
  def self.at(time, subsec = false, unit = :microsecond, in: nil)
    Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in))
  end

  def self.at(time)
    Primitive.time_s_at1(time)
  end
```

and (2), the mandatory-only ISeq (the second definition above), is
pointed to from (1), the full ISeq. Note that `Primitive.mandatory_only?`
should be used only in the condition of an `if` statement, and the
`if` statement must be the entire method body (you cannot put any
expression before or after the `if` statement).

A method entry with `mandatory_only?` (`Time.at` in the above case)
is marked as `iseq_overload`. When the method is dispatched with only
mandatory arguments (`Time.at(0)`, for example), another method entry
with ISeq (2) is created as the mandatory-only method entry and it
will be cached in the inline method cache.
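
As a hedged illustration of this dispatch (these call sites are examples, not taken from the commit):

```ruby
Time.at(0)                  # only the mandatory argument: hits the cached mandatory-only entry, ISeq (2)
Time.at(0, 5, :millisecond) # optional arguments given: dispatches to the full entry, ISeq (1)
```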

The idea is similar to the one discussed in https://bugs.ruby-lang.org/issues/16254,
but it only checks whether just the mandatory parameters were given,
because in many cases only mandatory parameters are passed. If we find
other cases (optional or keyword parameters used frequently enough that
it hurts performance), we can extend the feature.
2021-11-15 15:58:56 +09:00
Peter Zhu
84202963c5 [Bug #18329] Fix crash when calling non-existent super method
The cme is NULL when a method does not exist, so check it before
accessing the callcache.
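
A minimal sketch of the kind of call involved (class names are illustrative; the exact reproduction is in the ticket): calling `super` where no superclass method exists should raise NoMethodError rather than crash.

```ruby
class Parent
end

class Child < Parent
  def greet
    super # no superclass #greet exists, so the resolved cme is NULL internally
  end
end

begin
  Child.new.greet
rescue NoMethodError => e
  puts e.message
end
```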
2021-11-11 14:08:38 -05:00
Yusuke Endoh
c1228f833c vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine
This caused a bus error on 32-bit Solaris
2021-10-29 10:57:46 +09:00
Satoshi Moris Tagomori
489e5e3a82 the core problem is that the Proc is not shareable 2021-10-27 16:13:43 +09:00
Alan Wu
b74d6563a6 Extract yjit_force_iv_index and make it work when object is frozen
In an effort to simplify the logic YJIT generates for accessing instance
variables, YJIT ensures that a given name-to-index mapping exists at
compile time. In the case that the mapping doesn't exist, it was created
by calling rb_ivar_set() with Qundef on the sample object we see at
compile time. This hack doesn't work if the sample object happens to be
frozen, in which case YJIT would raise a FrozenError unexpectedly.

To deal with this, make a new function that only reserves the mapping
but doesn't touch the object: rb_obj_ensure_iv_index_mapping().
This new function supersedes the functionality of rb_iv_index_tbl_lookup(),
so the latter was removed.
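
A hedged sketch of the user-visible constraint (class and method names are illustrative): reading an ivar that may have no index mapping yet, on a frozen object, is legal and must not raise FrozenError while YJIT warms the call site.

```ruby
class Point
  def initialize(x)
    @x = x
  end

  def y
    @y # never written anywhere, so no name-to-index mapping may exist yet
  end
end

point = Point.new(1).freeze
100.times { point.y } # compiling this read with a frozen sample object must not raise FrozenError
```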

Reported by and includes a test case from John Hawthorn <john@hawthorn.email>

Fixes: GH-282
2021-10-20 18:19:43 -04:00
Alan Wu
5906a5a732 Add comments about special runtime routines YJIT calls
When YJIT makes calls to routines without reconstructing interpreter
state through jit_prepare_routine_call(), it relies on the routine to
never allocate, raise, or push/pop control frames. Add comments about
this on the routines that YJIT calls.

This is probably something we should dynamically verify on debug builds.
It's hard to statically verify this as it requires verifying all
functions in the call tree. Maybe something to look at in the future.
2021-10-20 18:19:43 -04:00
Alan Wu
7c08538aa3 Cleanup diff against upstream. Add comments
I did a `git diff --stat` against upstream and looked at all the files
that are outside of YJIT to come up with these minor changes.
2021-10-20 18:19:42 -04:00
Alan Wu
f6da559d5b Put YJIT into a single compilation unit
For upstreaming, we want functions we export either prefixed with "rb_"
or made static. Historically we haven't been following this rule, so we
were "leaking" a lot of symbols as `make leak-globals` would tell us.

This change unifies everything YJIT into a single compilation unit,
yjit.o, and makes everything unprefixed static to pass `make leak-globals`.
This manual "unified build" setup is similar to that of vm.o.

Having everything in one compilation unit allows static functions to
be visible across YJIT files and removes the need for declarations in
headers in some cases. Unnecessary declarations were removed.

Other changes of note:
  - switched to MJIT_SYMBOL_EXPORT_BEGIN, which marks things as being
    off limits to native extensions
  - the first include of each YJIT file is changed to be "internal.h"
  - undefined MAP_STACK before explicitly redefining it since it
    collides with a definition in system headers. Consider renaming?
2021-10-20 18:19:42 -04:00
eileencodes
c8e157bb5c Implement getclassvariable in yjit
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-10-20 18:19:42 -04:00
Alan Wu
78b5e95e41 Add a slowpath for opt_getinlinecache
Before this change, when we encounter a constant cache that is specific
to a lexical scope, we unconditionally exit. This change falls back to
the interpreter's cache in this situation.

This should help constant expressions in `class << self`, which is popular
at Shopify due to the style guide.
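
A hedged sketch of the pattern (names are illustrative): a constant referenced inside a `class << self` body uses an inline cache tied to that lexical scope, which previously forced an unconditional exit.

```ruby
class Config
  DEFAULT_TIMEOUT = 5

  class << self
    def timeout
      DEFAULT_TIMEOUT # constant lookup whose cache is specific to this lexical scope
    end
  end
end

Config.timeout # => 5
```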

For simplicity, this change relies on the cache being warm at compile
time to detect whether the lexical scope needs to be checked.
2021-10-20 18:19:41 -04:00
John Hawthorn
5e37f280d1 Remove vm_opt_aset 2021-10-20 18:19:41 -04:00
Aaron Patterson
bf8557f487 Add comments for new function 2021-10-20 18:19:40 -04:00
Aaron Patterson
5bc0343261 Refactor attrset to use a function
This new function will do the write barrier / resize the object / check
frozen for us
2021-10-20 18:19:40 -04:00
John Hawthorn
10f1d808d5 Remove rb_opt_equality_specialized 2021-10-20 18:19:40 -04:00
Kevin Newton
be648e0940 Implement splatarray 2021-10-20 18:19:37 -04:00
Maxime Chevalier-Boisvert
da30f21ab5 Try to fix MJIT symbol clash with cargo cult 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
54fe43b45c Implement defined bytecode (#39) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
d6412126bc Implement setivar with a plain old function call (#34)
* Implement setivar with a plain old function call

* Remove return
2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
0c3842d154 Implement opt_aset as interpreter handler call 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
c9feb72b65 Implement opt_mod as call to interpreter function (#29) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
e2c1d69331 Implement opt_eq by calling interpreter function (#28) 2021-10-20 18:19:35 -04:00
Maxime Chevalier-Boisvert
e5f8b41786 Implement send with alias method (#23)
* Implement send with alias method

* Add alias_method tests
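
A hedged illustration of the call shape now handled (class and method names are assumptions):

```ruby
class Greeter
  def hello
    "hello"
  end
  alias_method :hi, :hello
end

Greeter.new.hi # the send resolves through the aliased method entry
```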
2021-10-20 18:19:34 -04:00
Alan Wu
36134f7d29 Implement calls to methods with simple optional params
* Implement calls to methods with simple optional params

* Remove unnecessary MJIT_STATIC

See comment for MJIT_STATIC. I added it not knowing whether it's
required because the function next to it has it. Don't use it and wait
for problems to come up instead.

* Better naming, some comments

* Count bailing on kw only iseqs

On railsbench:
```
opt_send_without_block exit reasons:
                  bmethod      59729 (27.7%)
         optimized_method      59137 (27.5%)
      iseq_complex_callee      41362 (19.2%)
             alias_method      33346 (15.5%)
      callsite_not_simple      19170 ( 8.9%)
       iseq_only_keywords       1300 ( 0.6%)
                 kw_splat       1299 ( 0.6%)
    cfunc_ruby_array_varg         18 ( 0.0%)
```
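
A hedged sketch of the call shapes this covers (the method is illustrative, not from the commit):

```ruby
def greet(name, greeting = "hello")
  "#{greeting}, #{name}"
end

greet("world")          # only the mandatory argument; the default fills `greeting`
greet("world", "howdy") # the simple optional parameter supplied explicitly
```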
2021-10-20 18:19:34 -04:00
Alan Wu
b626dd7211 YJIT: Fancier opt_getinlinecache
Make sure `opt_getinlinecache` is in a block all on its own, and
invalidate it from the interpreter when `opt_setinlinecache` runs.
It will recompile with a filled cache the second time around.
This lets YJIT run well when the inline cache for a constant is cold.
2021-10-20 18:19:33 -04:00
Alan Wu
5d834bcf9f YJIT: lazy polymorphic getinstancevariable
Lazily compile out a chain of checks for different known classes and
whether `self` embeds its ivars or not.
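
A hedged sketch of such a polymorphic site (module and class names are illustrative): one `getinstancevariable` in a shared module method sees receivers of different classes with different ivar layouts.

```ruby
module HasValue
  def value
    @value # a single getinstancevariable site shared by every includer
  end
end

class A
  include HasValue
  def initialize
    @value = 1
  end
end

class B
  include HasValue
  def initialize
    @padding = 0
    @value   = 2 # different ivar index and possibly different embedding than A
  end
end

[A.new, B.new].each { |obj| p obj.value } # YJIT lazily chains per-class checks here
```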

* Remove trailing whitespaces

* Get proper address in Capstone disassembly

* Lowercase address in Capstone disassembly

Capstone uses lowercase for jump targets in generated listings. Let's
match it.

* Use the same successor in getivar guard chains

Cuts down on duplication

* Address reviews

* Fix copypasta error

* Add a comment
2021-10-20 18:19:31 -04:00
Maxime Chevalier-Boisvert
9d8cc01b75 WIP JIT-to-JIT returns 2021-10-20 18:19:28 -04:00
Nobuyoshi Nakada
8d6dbecc80 Remove useless casts 2021-10-19 17:09:32 +09:00
Nobuyoshi Nakada
ec021e469d Get rid of type-punning cast 2021-10-19 17:08:25 +09:00
S.H
dc9112cf10 Using NIL_P macro instead of == Qnil 2021-10-03 22:34:45 +09:00
Jeremy Evans
9b04909a85 Introduce rb_vm_call_with_refinements to DRY up a few calls 2021-10-01 08:12:46 -09:00
Jeremy Evans
1f5f8a187a Make Array#min/max optimization respect refined methods
Pass in ec to vm_opt_newarray_{max,min}. Avoids having to
call GET_EC inside the functions, for better performance.

While here, add a test for Array#min/max being redefined to
test_optimization.rb.
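
A hedged sketch of the semantics being preserved (module name is illustrative): a refined `Array#min` must be honored even on the specialized literal-array path.

```ruby
module LoudMin
  refine Array do
    def min
      "refined min"
    end
  end
end

using LoudMin

p [3, 1, 2].min # => "refined min", not 1
```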

Fixes [Bug #18180]
2021-09-30 15:18:14 -07:00
Nobuyoshi Nakada
70624ae43d Extract hook macro for attributes 2021-09-19 16:27:47 +09:00
Nobuyoshi Nakada
cd829bb078 Remove printf family from the mjit header
Linking printf family functions makes mjit objects link
unnecessary code.
2021-09-11 08:41:32 +09:00
Jeremy Evans
2d98593bf5 Support tracing of attr_reader and attr_writer
In vm_call_method_each_type, check for c_call and c_return events before
dispatching to vm_call_ivar and vm_call_attrset.  With this approach, the
call cache will still dispatch directly to those functions, so this
change will only decrease performance for the first (uncached) call, and
even then, the performance decrease is very minimal.

This approach requires that we clear the call caches when tracing is
enabled or disabled.  The approach currently switches all vm_call_ivar
and vm_call_attrset call caches to vm_call_general any time tracing is
enabled or disabled. So it could theoretically result in a slowdown for
code that constantly enables or disables tracing.

This approach does not handle targeted tracepoints, but from my testing,
c_call and c_return events are not supported for targeted tracepoints,
so that shouldn't matter.

This includes a benchmark showing the performance decrease is minimal
if detectable at all.
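
A hedged sketch of what becomes observable (class name is illustrative), assuming c_call/c_return now fire for attr_reader-generated methods:

```ruby
class Widget
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

trace = TracePoint.new(:c_call, :c_return) do |tp|
  puts "#{tp.event} #{tp.method_id}" if tp.method_id == :name
end

trace.enable { Widget.new("gear").name }
```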

Fixes [Bug #16383]
Fixes [Bug #10470]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2021-08-29 07:23:39 -07:00
Nobuyoshi Nakada
a0a8f2abf5 Get rid of type-punning pointer casts [Bug #18062] 2021-08-11 12:07:44 +09:00
S.H
378e8cdad6 Using RBOOL macro 2021-08-02 12:06:44 +09:00
Nobuyoshi Nakada
0700ee0e94 Refactor class variable cache functions
Extracted repeated code as update_classvariable_cache.  When the cvc
table is not set in getclassvariable, an empty table was created,
but it had no id and would cause [BUG], so the code was made the
same as in setclassvariable.
2021-06-23 10:55:22 +09:00
eileencodes
b91b3bc771 Add a cache for class variables
Redo of 34a2acdac7 and
931138b006 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree, the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows that with the cache this
branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```
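
A hedged sketch of the shape of such a lookup (names are illustrative; the real benchmark uses 26 modules):

```ruby
module M1; end
module M2; end
# ...imagine ~26 of these

class Config
  include M1
  include M2

  @@mode = :fast

  def self.mode
    @@mode # previously every read walked the ancestor chain; now the hit is cached
  end
end

Config.mode # => :fast
```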

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase; i.e., if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
Aaron Patterson
07f055bb13 Revert "Filling cache values on cvar write"
This reverts commit 08de37f9fa.
This reverts commit e8ae922b62.
2021-05-11 13:31:00 -07:00
eileencodes
08de37f9fa Filling cache values on cvar write
Fill the cache on write instead of on read. Once the value is in the
inline cache we never have to fill it again. We want to eventually put
the value into the cache, and the best opportunity to do that is when
you write the value.
2021-05-11 12:04:27 -07:00
eileencodes
e8ae922b62 Add a cache for class variables
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree, the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows that with the cache this
branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase; i.e., if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-05-11 12:04:27 -07:00
Nobuyoshi Nakada
0bbab1e515 Protoized old pre-ANSI K&R style declarations and definitions 2021-05-07 00:04:36 +09:00
Alan Wu
dee58d7ae7 Add back checks for empty kw splat with tests (#4405)
This reverts commit a224ce8150.
Turns out the checks are needed to handle splatting an array with an
empty ruby2 keywords hash.
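
A hedged sketch of the edge case (it uses `Hash.ruby2_keywords_hash` to build the flagged hash; the construction is an assumption, not taken from the commit):

```ruby
def takes_one(a)
  a
end

flagged = Hash.ruby2_keywords_hash({}) # empty hash marked as a keywords hash
args = [1, flagged]

takes_one(*args) # => 1; the empty flagged hash must be dropped, not passed as a second argument
```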
2021-04-23 22:17:20 -04:00
Jeremy Evans
1f2b5c6dfe Remove part of comment that is no longer accurate
In Ruby 2.7, empty keyword splats could be added back for backwards
compatibility.  However, that stopped in Ruby 3.0.
2021-04-23 16:44:01 -07:00
Alan Wu
a224ce8150 Remove unnecessary checks for empty kw splat
These two checks are surrounded by an if that ensures the
call site is not a kw splat call site.
2021-04-23 19:37:03 -04:00
Koichi Sasada
ecfa8dcdba fix return from orphan Proc in lambda
A "return" statement in a Proc in a lambda like:
  `lambda{ proc{ return }.call }`
should return outer lambda block. However, the inner Proc can become
orphan Proc from the lambda block. This "return" escape outer-scope
like method, but this behavior was decieded as a bug.
[Bug #17105]

This patch raises LocalJumpError by checking the proc is orphan or
not from lambda blocks before escaping by "return".
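
A hedged sketch of the orphan case that now raises (variable names are illustrative):

```ruby
# The lambda returns the inner proc without calling it, so by the time the proc
# runs, the lambda frame its `return` targets is gone: the proc is orphaned.
orphan = lambda { proc { return } }.call

begin
  orphan.call
rescue LocalJumpError => e
  puts "rejected: #{e.message}"
end
```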

Most of the tests were written by Jeremy Evans
https://github.com/ruby/ruby/pull/4223
2021-04-02 09:25:33 +09:00
Aaron Patterson
04a814931a return bool instead of VALUE 2021-03-17 10:55:37 -07:00
Aaron Patterson
ea817c60fc Refactor vm_defined to return a boolean
We just need this function to return whether or not the thing we're
looking for is defined.  If it's defined, return something true,
otherwise false.
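
For reference, a hedged Ruby-level reminder of what `defined?` reports (the refactor changes only the internal C return type, not this behavior):

```ruby
x = 1
defined?(x)        # => "local-variable"
defined?(@missing) # => nil; a yes/no distinction is all vm_defined needs internally
```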
2021-03-17 10:55:37 -07:00
Aaron Patterson
c3971bea33 Stop calling rb_iseq_defined_string in vm_defined
We already have access to the string from the iseqs, so we can stop
calling this function.
2021-03-17 10:55:37 -07:00
John Hawthorn
9d0ae387c8 Remove DEFINED_IVAR2 from enum
This version of defined? doesn't seem to be possible to emit anymore.
2021-03-10 09:38:20 -08:00