1539 commits

Author SHA1 Message Date
Alan Wu
8bda11f690 MicroJIT: compile after ten calls 2021-10-20 18:19:25 -04:00
Alan Wu
e8c914c250 Implement the --disable-ujit command line option 2021-10-20 18:19:24 -04:00
Alan Wu
265c5ca8b1 Avoid triggering GC while translating threaded code 2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert
038f5d964f Avoid recompiling overlapping instruction sequences in ujit 2021-10-20 18:19:23 -04:00
Alan Wu
4929ba0a5c Generate multiple copies of native code for pop
Insert generated addresses into st_table for mapping native code
addresses back to info about VM instructions. Export `encoded_insn_data`
to do this. Also some style fixes.
2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert
1c8fb90f6b Add new files, ujit_compile.c, ujit_compile.h 2021-10-20 18:19:23 -04:00
Maxime Chevalier-Boisvert
566d4abee5 Added shift instructions 2021-10-20 18:19:23 -04:00
Alan Wu
16c5ce863c Yeah, this actually works! 2021-10-20 18:19:22 -04:00
Nobuyoshi Nakada
768ceb4ead
Cast to void pointer for %p in commented out code [ci skip] 2021-10-20 11:22:33 +09:00
Aaron Patterson
217df51f0e Dump outer variables tables when dumping an iseq to binary
This commit dumps the outer variables table when dumping an iseq to
binary.  This fixes a case where Ractors aren't able to tell what outer
variables belong to a lambda after the lambda is loaded via ISeq.load_from_binary.

[Bug #18232] [ruby-core:105504]
2021-10-07 15:39:47 -07:00
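For readers unfamiliar with the binary format this commit touches, here is a minimal sketch of a dump/load round trip. It uses only `RubyVM::InstructionSequence.compile`, `#to_binary`, and `.load_from_binary`. The source string and variable names are illustrative assumptions, not code from the commit, and the sketch does not reproduce the Ractor case described above.

```ruby
# Illustrative program: a lambda that reads the outer variable `x`.
source = "x = 20; add = ->(y) { x + y }; add.call(22)"

# Compile to an instruction sequence, dump it to the binary format,
# then load the binary back and evaluate it.
iseq = RubyVM::InstructionSequence.compile(source)
binary = iseq.to_binary
loaded = RubyVM::InstructionSequence.load_from_binary(binary)
result = loaded.eval

puts result
```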
S.H
dc9112cf10
Using NIL_P macro instead of == Qnil 2021-10-03 22:34:45 +09:00
S-H-GAMELINKS
83a5e2bb5c Using RB_FLOAT_TYPE_P macro 2021-09-12 11:16:31 +09:00
S-H-GAMELINKS
56065f0686 Using SYMBOL_P macro 2021-09-11 08:48:56 +09:00
Nobuyoshi Nakada
cfbf2bde40
Remove unused argument 2021-09-10 21:26:16 +09:00
卜部昌平
dddc618d30 suppress GCC's -Wsuggest-attribute=format
I was not aware of this because I use clang these days.
2021-09-10 20:00:06 +09:00
S-H-GAMELINKS
bdd6d8746f Replace RBOOL macro 2021-09-05 23:01:27 +09:00
Nobuyoshi Nakada
cb3df3d87b
Extract compile_attrasgn from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
aac2b0fc6b
Extract compile_kw_arg from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
cbf841e3ed
Extract compile_errinfo from iseq_compile_each0 2021-09-01 15:19:11 +09:00
Nobuyoshi Nakada
d7bba95eba
Extract compile_dots from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
d58143f3b5
Extract compile_colon3 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
70c8155d8b
Extract compile_colon2 from iseq_compile_each0 2021-09-01 15:19:10 +09:00
Nobuyoshi Nakada
270a674a79
Extract compile_match from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
a92fdc90da
Extract compile_yield from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
996489d7e0
Extract compile_super from iseq_compile_each0 2021-09-01 15:19:09 +09:00
Nobuyoshi Nakada
6cf9f17191
Extract compile_op_log from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
d045d5f860
Extract compile_op_cdecl from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
0c7ff37540
Extract compile_op_asgn2 from iseq_compile_each0 2021-09-01 15:19:08 +09:00
Nobuyoshi Nakada
0b87b75ae9
Extract compile_op_asgn1 from iseq_compile_each0 2021-09-01 15:19:07 +09:00
Nobuyoshi Nakada
f781e537b5
Remove no longer used variable line_node 2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada
d23264d359
Extract compile_block from iseq_compile_each0
And constify `node` argument of `iseq_compile_each0`.
2021-08-31 15:27:02 +09:00
Nobuyoshi Nakada
181207e830
Constify line_node in iseq_compile_each0 2021-08-31 10:21:22 +09:00
Jeremy Evans
48c8df9e0e
Allow tracing of optimized methods
This updates the trace instructions to directly dispatch to
opt_send_without_block.  So this should cause no slowdown in
non-trace mode.

To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL
and RUBY_EVENT_C_RETURN are added as events to the specialized
instructions.

Fixes [Bug #14870]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2021-08-21 10:15:01 -07:00
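A small sketch of how the effect of this change might be observed, assuming a Ruby build that includes it. `Integer#+` is normally handled by a specialized instruction, so the expectation is that `:+` shows up among the recorded `c_call` events once tracing is enabled. The variable names are illustrative.

```ruby
# Record the method id of every C method call seen while tracing.
events = []
trace = TracePoint.new(:c_call) { |tp| events << tp.method_id }

# Tracing is active only for the duration of the block.
trace.enable { 1 + 2 }

p events
```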
Kazuki Tsujimoto
4568ba0711
Show verbose error messages when single pattern match fails
[0] => [0, *, a]
    #=> [0] length mismatch (given 1, expected 2+) (NoMatchingPatternError)

Ignore test failures of typeprof caused by this change for now.
2021-08-15 09:38:24 +09:00
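The failing match from the commit message can be tried as follows. This is a sketch only: the exact message text depends on the Ruby version, so the code captures it rather than assuming its wording.

```ruby
# Attempt the single pattern match from the commit message and capture
# the error message instead of letting the exception propagate.
message =
  begin
    [0] => [0, *, a]
    nil
  rescue NoMatchingPatternError => e
    e.message
  end

puts message
```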
Alan Wu
cbecf9c7ba
Fix use-after-free on -DUSE_EMBED_CI=0
On -DUSE_EMBED_CI=0, there are more GC allocations and the old code
didn't keep old_operands[0] reachable while allocating. On a Debian
based system, I get a crash requiring erb under GC stress mode. On
macOS, tool/transcode-tblgen.rb runs incorrectly if I put GC.stress=true
as the first line.
2021-07-29 12:04:36 -04:00
Jeremy Evans
fa87f72e1e Add pattern matching pin support for instance/class/global variables
Pin matching for local variables and constants is already supported,
and it is fairly simple to add support for these variable types.

Note that pin matching for method calls is still not supported
without wrapping in parentheses (pin expressions).  I think that's
for the best as method calls are far more complex (arguments/blocks).

Implements [Feature #17724]
2021-07-15 09:56:02 -07:00
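A minimal sketch of the syntax this feature enables, assuming a Ruby version that includes it (3.1 or later). The `Threshold` class, its variables, and `$global_target` are hypothetical names chosen for illustration.

```ruby
$global_target = 99

class Threshold
  @@limit = 10

  def initialize(expected)
    @expected = expected
  end

  # Pin an instance, class, and global variable directly in patterns.
  def check(value)
    case value
    in ^@expected then :matches_ivar
    in ^@@limit then :matches_cvar
    in ^$global_target then :matches_gvar
    else :no_match
    end
  end
end

checker = Threshold.new(1)
p [checker.check(1), checker.check(10), checker.check(99), checker.check(5)]
```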
Aaron Patterson
2599d1a8df Store the dup'd CDHASH in the object list during IBF load
Since b2fc592c30 nothing was holding a reference to the dup'd CDHASH
during IBF loading.  If a GC happened to run during IBF load then the
copied hash wouldn't have anything to keep it alive.  We don't really
want to keep the originally loaded CDHASH hash, so this patch just
overwrites the original hash with the copied / modified hash.

[Bug #17984] [ruby-core:104259]
2021-07-06 17:48:40 -07:00
eileencodes
31f4d26273 Check type of instruction - can be INSN or ADJUST
If the type is ADJUST we don't want to treat it like an INSN so we have
to check the type before reading from `insn_info.events`.

[Bug #18001] [ruby-core:104371]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-23 11:34:37 -07:00
eileencodes
b91b3bc771 Add a cache for class variables
Redo of 34a2acdac7 and
931138b006 which were reverted.

GitHub PR #4340.

This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.

The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows with the cache, this branch
is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase, i.e., if Base had
only one ancestor we'd see no improvement. This benchmark is run on a
vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-06-18 10:02:44 -07:00
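To make the lookup cost described in the commit above concrete, here is a small sketch of a class variable that is defined in an included module and read from the including class. The module and class names are hypothetical. The sketch only shows the ancestor-based lookup; it does not measure or demonstrate the cache itself.

```ruby
# The class variable lives in a module, not in the class that reads it.
module Configurable
  @@logger = :default_logger
end

module Auditable
end

class Base
  include Auditable
  include Configurable

  # Reading @@logger here has to find it through Base's ancestors.
  def self.logger
    @@logger
  end
end

p Base.ancestors.take(3)
p Base.logger
```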
Yusuke Endoh
0a36cab1b5 Enable USE_ISEQ_NODE_ID by default
... which was formerly called EXPERIMENTAL_ISEQ_NODE_ID.

See also ff69ef27b0.

https://bugs.ruby-lang.org/issues/17930
2021-06-18 03:35:38 +09:00
Yusuke Endoh
dfba87cd62 Make it possible to get AST::Node from Thread::Backtrace::Location
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.

This change also includes iseq's dump/load support of node_ids for each
instruction.
2021-06-18 03:35:38 +09:00
Yusuke Endoh
fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh
acae5f363d ast.rb: RubyVM::AST.parse and .of accept save_script_lines: true
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Nobuyoshi Nakada
e4f891ce8d
Adjust styles [ci skip]
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-17 10:13:40 +09:00
Nobuyoshi Nakada
9f3888d6a3 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_REGEXP
2021-06-03 15:11:18 +09:00
Nobuyoshi Nakada
37eb5e7439 Warn more duplicate literal hash keys
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
2021-06-03 15:11:18 +09:00
Takashi Kokubun
070caf54d2
Refactor rb_vm_insn_addr2insn calls
There were far too many ifdefs.
2021-06-02 01:16:50 -07:00
Alan Wu
5ada23ac12 compile.c: Emit send for === calls in when statements
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls
=== without a call cache. Emit a send instruction to make the call
instead. It includes a call cache.

The call cache improves throughput of using when statements to check the
class of a given object. This is useful for say, JSON serialization.

Use of a regular send instead of checkmatch also avoids taking the VM
lock every time, which is good for multi-ractor workloads.

    Calculating -------------------------------------
                             master        post
         vm_case_classes    11.013M     16.172M i/s -      6.000M times in 0.544795s 0.371009s
             vm_case_lit      2.296       2.263 i/s -       1.000 times in 0.435606s 0.441826s
                 vm_case    74.098M     64.338M i/s -      6.000M times in 0.080974s 0.093257s

    Comparison:
                      vm_case_classes
                    post:  16172114.4 i/s
                  master:  11013316.9 i/s - 1.47x  slower

                          vm_case_lit
                  master:         2.3 i/s
                    post:         2.3 i/s - 1.01x  slower

                              vm_case
                  master:  74097858.6 i/s
                    post:  64338333.9 i/s - 1.15x  slower

The vm_case benchmark is a bit slower post patch, possibly due to the
larger instruction sequence. The benchmark dispatches using
opt_case_dispatch so was not running checkmatch and does not make the
=== call post patch.
2021-05-28 12:34:03 -04:00
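The kind of code that the commit above targets can be sketched as a `case`/`when` that dispatches on the class of an object, where each `when` clause calls `===` on the listed class. The `serialize` method is a hypothetical, deliberately simplified example and not a real JSON serializer.

```ruby
# Each `when SomeClass` clause evaluates `SomeClass === value`.
def serialize(value)
  case value
  when Integer, Float then value.to_s
  when String then value.inspect
  when NilClass then "null"
  when Array then "[" + value.map { |item| serialize(item) }.join(",") + "]"
  else raise ArgumentError, "unsupported type: #{value.class}"
  end
end

puts serialize([1, "a", nil])
```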
Alan Wu
788d30a8b3 Make range literal peephole optimization target "newrange"
It looks for "checkmatch", when it could be applied to anything that has
"newrange".

Making the optimization target more ranges might only be fair play when
all ranges are frozen. So I'm putting a reference to the ticket that
froze all ranges.

[Feature #15504]
2021-05-28 12:34:03 -04:00
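The frozen-range behavior that this commit relies on can be checked directly. This sketch assumes Ruby 3.0 or later, where Range objects are frozen; the variable names are illustrative.

```ruby
# Ranges written as literals and ranges built with Range.new are both frozen.
literal = (1..3)
built = Range.new("a", "z")

literal_frozen = literal.frozen?
built_frozen = built.frozen?

p [literal_frozen, built_frozen]
```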
Alan Wu
b2fc592c30
Build CDHASH properly when loading iseq from binary
Before this change, CDHASH operands were built as plain hashes when
loaded from binary. Without setting up the hash with the correct
st_table type, the hash can sometimes be an ar_table. When the hash is
an ar_table, lookups can call the `eql?` method on keys of the hash,
which makes the `opt_case_dispatch` instruction not "leaf" as it
implicitly declares.

The following script trips the stack canary for checking the leaf
attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by
default with RUBY_DEBUG).

    rb_vm_iseq = RubyVM::InstructionSequence

    iseq = rb_vm_iseq.compile(<<-EOF)
      case Class.new(String).new("foo")
      when "foo"
        42
      end
    EOF

    puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval

This commit changes the binary loading logic to build CDHASH with the
right st_table type. The dumping logic and the dump format stay the
same.
2021-05-21 12:13:55 -04:00