Commit graph

1228 commits

Author SHA1 Message Date
S-H-GAMELINKS
83a5e2bb5c Using RB_FLOAT_TYPE_P macro 2021-09-12 11:16:31 +09:00
S-H-GAMELINKS
56065f0686 Using SYMBOL_P macro 2021-09-11 08:48:56 +09:00
Nobuyoshi Nakada
cfbf2bde40
Remove unused argument 2021-09-10 21:26:16 +09:00
卜部昌平
dddc618d30 suppress GCC's -Wsuggest-attribute=format
I was not aware of this because I use clang these days.
2021-09-10 20:00:06 +09:00
S-H-GAMELINKS
bdd6d8746f Replace RBOOL macro 2021-09-05 23:01:27 +09:00
Nobuyoshi Nakada
cb3df3d87b
Extract compile_attrasgn from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
aac2b0fc6b
Extract compile_kw_arg from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
cbf841e3ed
Extract compile_errinfo from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
d7bba95eba
Extract compile_dots from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
d58143f3b5
Extract compile_colon3 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
70c8155d8b
Extract compile_colon2 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
270a674a79
Extract compile_match from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
a92fdc90da
Extract compile_yield from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
996489d7e0
Extract compile_super from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
6cf9f17191
Extract compile_op_log from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
d045d5f860
Extract compile_op_cdecl from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
0c7ff37540
Extract compile_op_asgn2 from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
0b87b75ae9
Extract compile_op_asgn1 from iseq_compile_each0 2021-09-01 15:19:07 +09:00
Nobuyoshi Nakada
f781e537b5
Remove no longer used variable line_node 2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada
d23264d359
Extract compile_block from iseq_compile_each0
And constify `node` argument of `iseq_compile_each0`.
2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada
181207e830
Constify line_node in iseq_compile_each0 2021-08-31 10:21:22 +09:00
Jeremy Evans
48c8df9e0e
Allow tracing of optimized methods
This updates the trace instructions to directly dispatch to
opt_send_without_block.  So this should cause no slowdown in
non-trace mode.

To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL
and RUBY_EVENT_C_RETURN are added as events to the specialized
instructions.

Fixes [Bug #14870]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2021-08-21 10:15:01 -07:00
Kazuki Tsujimoto
4568ba0711
Show verbose error messages when single pattern match fails
[0] => [0, *, a]
    #=> [0] length mismatch (given 1, expected 2+) (NoMatchingPatternError)

Ignore test failures of typeprof caused by this change for now.
2021-08-15 09:38:24 +09:00
Alan Wu
cbecf9c7ba
Fix use-after-free on -DUSE_EMBED_CI=0
On -DUSE_EMBED_CI=0, there are more GC allocations and the old code
didn't keep old_operands[0] reachable while allocating. On a Debian
based system, I get a crash requiring erb under GC stress mode. On
macOS, tool/transcode-tblgen.rb runs incorrectly if I put GC.stress=true
as the first line.
2021-07-29 12:04:36 -04:00
Jeremy Evans
fa87f72e1e Add pattern matching pin support for instance/class/global variables
Pin matching for local variables and constants is already supported,
and it is fairly simple to add support for these variable types.

Note that pin matching for method calls is still not supported
without wrapping in parentheses (pin expressions).  I think that's
for the best as method calls are far more complex (arguments/blocks).

Implements [Feature #17724]
2021-07-15 09:56:02 -07:00
Aaron Patterson
2599d1a8df Store the dup'd CDHASH in the object list during IBF load
Since b2fc592c30 nothing was holding a reference to the dup'd CDHASH
during IBF loading.  If a GC happened to run during IBF load then the
copied hash wouldn't have anything to keep it alive.  We don't really
want to keep the originally loaded CDHASH hash, so this patch just
overwrites the original hash with the copied / modified hash.

[Bug #17984] [ruby-core:104259]
2021-07-06 17:48:40 -07:00
eileencodes
31f4d26273 Check type of instruction - can be INSN or ADJUST
If the type is ADJUST we don't want to treat it like an INSN so we have
to check the type before reading from `insn_info.events`.

[Bug #18001] [ruby-core:104371]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-23 11:34:37 -07:00
eileencodes
b91b3bc771 Add a cache for class variables
Redo of 34a2acdac7 and
931138b006 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
Yusuke Endoh
0a36cab1b5 Enable USE_ISEQ_NODE_ID by default
... which is formally called EXPERIMENTAL_ISEQ_NODE_ID.

See also ff69ef27b0.

https://bugs.ruby-lang.org/issues/17930
2021-06-18 03:35:38 +09:00
Yusuke Endoh
dfba87cd62 Make it possible to get AST::Node from Thread::Backtrace::Location
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.

This change also includes iseq's dump/load support of node_ids for each
instructions.
2021-06-18 03:35:38 +09:00
Yusuke Endoh
fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh
acae5f363d ast.rb: RubyVM::AST.parse and .of accepts save_script_lines: true
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Nobuyoshi Nakada
e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada
9f3888d6a3 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_REGEXP
2021-06-03 15:11:18 +09:00
Nobuyoshi Nakada
37eb5e7439 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
2021-06-03 15:11:18 +09:00
Takashi Kokubun
070caf54d2
Refactor rb_vm_insn_addr2insn calls
It's been a way too much amount of ifdefs.
2021-06-02 01:16:50 -07:00
Alan Wu
5ada23ac12 compile.c: Emit send for === calls in when statements
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls
=== without a call cache. Emit a send instruction to make the call
instead. It includes a call cache.

The call cache improves throughput of using when statements to check the
class of a given object. This is useful for say, JSON serialization.

Use of a regular send instead of checkmatch also avoids taking the VM
lock every time, which is good for multi-ractor workloads.

    Calculating -------------------------------------
                             master        post
         vm_case_classes    11.013M     16.172M i/s -      6.000M times in 0.544795s 0.371009s
             vm_case_lit      2.296       2.263 i/s -       1.000 times in 0.435606s 0.441826s
                 vm_case    74.098M     64.338M i/s -      6.000M times in 0.080974s 0.093257s

    Comparison:
                      vm_case_classes
                    post:  16172114.4 i/s
                  master:  11013316.9 i/s - 1.47x  slower

                          vm_case_lit
                  master:         2.3 i/s
                    post:         2.3 i/s - 1.01x  slower

                              vm_case
                  master:  74097858.6 i/s
                    post:  64338333.9 i/s - 1.15x  slower

The vm_case benchmark is a bit slower post patch, possibily due to the
larger instruction sequence. The benchmark dispatches using
opt_case_dispatch so was not running checkmatch and does not make the
=== call post patch.
2021-05-28 12:34:03 -04:00
Alan Wu
788d30a8b3 Make range literal peephole optimization target "newrange"
It looks for "checkmatch", when it could be applied to anything that has
"newrange".

Making the optimization target more ranges might only be fair play when
all ranges are frozen. So I'm putting a reference to the ticket that
froze all ranges.

[Feature #15504]
2021-05-28 12:34:03 -04:00
Alan Wu
b2fc592c30
Build CDHASH properly when loading iseq from binary
Before this change, CDHASH operands were built as plain hashes when
loaded from binary. Without setting up the hash with the correct
st_table type, the hash can sometimes be an ar_table. When the hash is
an ar_table, lookups can call the `eql?` method on keys of the hash,
which makes the `opt_case_dispatch` instruction not "leaf" as it
implicitly declares.

The following script trips the stack canary for checking the leaf
attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by
default with RUBY_DEBUG).

    rb_vm_iseq = RubyVM::InstructionSequence

    iseq = rb_vm_iseq.compile(<<-EOF)
      case Class.new(String).new("foo")
      when "foo"
        42
      end
    EOF

    puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval

This commit changes the binary loading logic to build CDHASH with the
right st_table type. The dumping logic and the dump format stays the
same
2021-05-21 12:13:55 -04:00
Koichi Sasada
817764bd82 simple rescue+while+break should not use throw
609de71f04 fixes the issue by using
`throw` insn if `ensure` is used. However, that patch introduce
additional `throw` even if it is not needed. This patch solves
the issue.

This issue is pointed by @mame.
2021-05-21 18:12:14 +09:00
Yusuke Endoh
5026f9a5d5 compile.c: stop the jump-jump optimization if the second has any event
Fixes [Bug #17868]
2021-05-20 19:13:39 +09:00
Jeremy Evans
9ce29c94d8 Avoid improper optimization of case statements mixed integer/rational/complex
Fixes [Bug #17857]
2021-05-12 19:30:05 -07:00
卜部昌平
0ab0b86c84 cdhash_cmp: should use ||
cf: https://github.com/ruby/ruby/pull/4469#discussion_r628386707
2021-05-12 10:30:46 +09:00
卜部昌平
e1eff837cf cdhash_cmp: recursively apply
For instance a rational's numerator can be a bignum.  Comparison using
C's == can be insufficient.
2021-05-12 10:30:46 +09:00
卜部昌平
cc0dc67bbb cdhash_cmp: can also take complex
There are complex literals `123i`, which can also be a case condition.
2021-05-12 10:30:46 +09:00
卜部昌平
d0e6c6e682 cdhash_cmp: rational literals with fractions
Nobu kindly pointed out that rational literals can have fractions.
2021-05-12 10:30:46 +09:00
卜部昌平
2bc293e899 cdhash_cmp: can take rational literals
Rational literals are those integers suffixed with `r`.  They tend to
be a part of more complex expressions like `123/456r`, but in theory
they can live alone.  When such "bare" rational literals are passed to
case-when branch, we have to take care of them.  Fixes [Bug #17854]
2021-05-12 10:30:46 +09:00
Aaron Patterson
07f055bb13
Revert "Filling cache values on cvar write"
This reverts commit 08de37f9fa.
This reverts commit e8ae922b62.
2021-05-11 13:31:00 -07:00
eileencodes
08de37f9fa Filling cache values on cvar write
Instead of on read. Once it's in the inline cache we never have to make
one again. We want to eventually put the value into the cache, and the
best opportunity to do that is when you write the value.
2021-05-11 12:04:27 -07:00
eileencodes
e8ae922b62 Add a cache for class variables
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up th ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the more clear the speed increase. IE if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache /  Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-05-11 12:04:27 -07:00