2691 commits

Author SHA1 Message Date
ydah
4e6091ce09 Implement WHILE and UNTIL NODE locations 2024-09-11 09:28:55 +09:00
ydah
d52e599538 Implement WHEN NODE locations 2024-09-09 10:34:02 +09:00
ydah
32680f543c Implement AND/OR NODE operator locations 2024-09-05 13:03:28 +09:00
ydah
ab18b1b4f5 Implement VALIAS NODE keyword locations 2024-09-04 14:36:35 +09:00
ydah
a2243ee48b Implement ALIAS NODE keyword locations 2024-09-03 22:09:08 +09:00
ydah
af143d8a74 Implement UNDEF NODE keyword locations 2024-09-03 21:15:12 +09:00
yui-knk
c93d07ed74 [Bug #20695] Do not create needless string object in parser
`set_parser_s_value` does nothing in the parser, so there is no need to
create a string object in the parser's `set_yylval_node`.

# Object allocation

Run `ruby benchmarks/lobsters/benchmark.rb` with the following patch applied:

```diff
diff --git a/benchmarks/lobsters/benchmark.rb b/benchmarks/lobsters/benchmark.rb
index 240c50c..6cdd0ac 100644
--- a/benchmarks/lobsters/benchmark.rb
+++ b/benchmarks/lobsters/benchmark.rb
@@ -7,6 +7,8 @@ Dir.chdir __dir__
 use_gemfile

 require_relative 'config/environment'
+printf "allocated_after_load=%d\n", GC.stat(:total_allocated_objects)
+exit
 require_relative "route_generator"

 # For an in-mem DB, we need to load all data on every boot
```

## Before

```
ruby 3.4.0dev (2024-08-31T18:30:25Z master d6fc8f3d57) [arm64-darwin21]
...
allocated_after_load=2143519
```

## After

```
ruby 3.4.0dev (2024-09-01T00:40:04Z fix_bugs_20695 d1bae52f75) [arm64-darwin21]
...
allocated_after_load=1579662
```

## Ruby 3.3.0 for reference

```
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin21]
...
allocated_after_load=1732702
```
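
The same counter can be sampled around any block of code. A minimal sketch, assuming a hypothetical helper named `allocated_objects` (it is not part of the patch or of the benchmark):

```ruby
# Hypothetical helper (not from the commit): returns how many objects were
# allocated while the given block ran, using the same GC.stat counter as the
# benchmark patch above.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Allocate a known number of objects and report the measured delta.
count = allocated_objects { 1_000.times { Object.new } }
puts "allocated=#{count}"
```

The delta may be slightly larger than the number of objects created explicitly, because the interpreter can allocate internal objects while the block runs.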
2024-09-03 08:40:07 +09:00
Nobuyoshi Nakada
620ce3807b [Bug #20680] ensure block is always void context 2024-08-25 08:16:54 +09:00
Nobuyoshi Nakada
995b4c329b Make same structures same 2024-08-20 12:26:02 +09:00
Peter Zhu
584559d86a Fix leak of token_info when Ripper#warn jumps
For example, the following code leaks:

    class MyRipper < Ripper
      def initialize(src, &blk)
        super(src)
        @blk = blk
      end

      def warn(msg, *args) = @blk.call(msg)
    end

    $VERBOSE = true
    def call_parse = MyRipper.new("if true\n  end\n") { |msg| return msg }.parse

    10.times do
      500_000.times do
        call_parse
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    37536
    53744
    70064
    86448
    102576
    119120
    135248
    151216
    167744
    183824

After:

    19280
    19696
    19728
    20336
    20448
    21408
    21616
    21616
    21824
    21840
2024-08-07 09:14:14 -04:00
Peter Zhu
ced35800d4 Fix leak in warning of duplicate keys when Ripper#warn jumps
For example, the following code leaks:

    class MyRipper < Ripper
      def initialize(src, &blk)
        super(src)
        @blk = blk
      end

      def warn(msg, *args) = @blk.call(msg)
    end

    $VERBOSE = true
    def call_parse = MyRipper.new("if true\n  end\n") { |msg| return msg }.parse

    10.times do
      500_000.times do
        call_parse
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    34832
    51952
    69760
    88048
    105344
    123040
    141152
    159152
    176656
    194272

After:

    18400
    20256
    20272
    20272
    20272
    20304
    20368
    20368
    20368
    20400
2024-08-06 10:19:50 -04:00
yui-knk
66cbafc603 Refactor to use tokenize_ident instead of TOK_INTERN and set_yylval_name 2024-08-02 11:37:10 +09:00
Peter Zhu
6358397490 Fix leak of AST when Ripper#compile_error jumps
For example, the following script leaks:

    class MyRipper < Ripper
      def initialize(src, &blk)
        super(src)
        @blk = blk
      end

      def compile_error(msg) = @blk.call(msg)
    end

    def call_parse = MyRipper.new("/") { |msg| return msg }.parse

    10.times do
      100_000.times do
        call_parse
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    93952
    169040
    244224
    318784
    394432
    468224
    544048
    618560
    693776
    768384

After:

    19776
    19776
    20352
    20880
    20912
    21408
    21328
    21152
    21472
    20944
2024-07-31 14:47:44 -04:00
yui-knk
f2728c3393 Change RESBODY Node structure
Extract the exception variable into the `nd_exc_var` field
to keep the original grammar structure.

For example:

```
begin
rescue Error => e1
end
```

Before:

```
@ NODE_RESBODY (id: 8, line: 2, location: (2,0)-(2,18))
+- nd_args:
|   @ NODE_LIST (id: 2, line: 2, location: (2,7)-(2,12))
|   +- as.nd_alen: 1
|   +- nd_head:
|   |   @ NODE_CONST (id: 1, line: 2, location: (2,7)-(2,12))
|   |   +- nd_vid: :Error
|   +- nd_next:
|       (null node)
+- nd_body:
|   @ NODE_BLOCK (id: 6, line: 2, location: (2,13)-(2,18))
|   +- nd_head (1):
|   |   @ NODE_LASGN (id: 3, line: 2, location: (2,13)-(2,18))
|   |   +- nd_vid: :e1
|   |   +- nd_value:
|   |       @ NODE_ERRINFO (id: 5, line: 2, location: (2,13)-(2,18))
|   +- nd_head (2):
|       @ NODE_BEGIN (id: 4, line: 2, location: (2,18)-(2,18))
|       +- nd_body:
|           (null node)
+- nd_next:
    (null node)
```

After:

```
@ NODE_RESBODY (id: 6, line: 2, location: (2,0)-(2,18))
+- nd_args:
|   @ NODE_LIST (id: 2, line: 2, location: (2,7)-(2,12))
|   +- as.nd_alen: 1
|   +- nd_head:
|   |   @ NODE_CONST (id: 1, line: 2, location: (2,7)-(2,12))
|   |   +- nd_vid: :Error
|   +- nd_next:
|       (null node)
+- nd_exc_var:
|   @ NODE_LASGN (id: 3, line: 2, location: (2,13)-(2,18))
|   +- nd_vid: :e1
|   +- nd_value:
|       @ NODE_ERRINFO (id: 5, line: 2, location: (2,13)-(2,18))
+- nd_body:
|   @ NODE_BEGIN (id: 4, line: 2, location: (2,18)-(2,18))
|   +- nd_body:
|       (null node)
+- nd_next:
    (null node)
```
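
A dump like the ones above can be explored from Ruby as well. The following is a sketch only, assuming CRuby (where `RubyVM::AbstractSyntaxTree` is available); `find_node` is a hypothetical helper, and the exact layout of `children` depends on the Ruby version, as this commit itself shows:

```ruby
# Assumes CRuby, where RubyVM::AbstractSyntaxTree is available.
src = "begin\nrescue Error => e1\nend\n"

# Hypothetical helper: depth-first search for the first node of a given type.
def find_node(node, type)
  return nil unless node.is_a?(RubyVM::AbstractSyntaxTree::Node)
  return node if node.type == type

  node.children.each do |child|
    found = find_node(child, type)
    return found if found
  end
  nil
end

root = RubyVM::AbstractSyntaxTree.parse(src)
resbody = find_node(root, :RESBODY)

puts resbody.type
# The children layout is version dependent (nd_exc_var only exists after
# this change), so it is printed rather than destructured.
pp resbody.children
```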
2024-07-26 07:29:32 +09:00
Nobuyoshi Nakada
e642ddf7ae [Bug #20647] Disallow return directly within a singleton class 2024-07-24 14:44:32 +09:00
Peter Zhu
f0d8a0a2bf Fix memory leak in parser when loading non-ASCII file
When loading a non-ASCII compatible file, an error is raised, which
causes a memory leak.

For example:

    require "tempfile"

    Tempfile.create do |f|
      f.write("# -*- coding: UTF-16BE -*-")
      f.flush

      10.times do
        20_000.times do
          begin
            load(f.path)
          rescue
          end
        end

        puts `ps -o rss= -p #{$$}`
      end
    end

Before:

    33904
    49072
    64528
    79216
    94576
    109504
    124768
    139536
    154928
    170256

After:

    19568
    21296
    21664
    21728
    22192
    22256
    22416
    22272
    22272
    22272
2024-07-23 08:50:53 -04:00
yui-knk
57b11be15a Implement UNLESS NODE keyword locations 2024-07-23 14:35:23 +09:00
Nobuyoshi Nakada
3c4dc3e7ac Remove unneeded local variable
`$5` (`brace_block`) is no longer assigned in this action.
2024-07-21 12:10:33 +09:00
yui-knk
11e5ebaba7 Fix SEGV on method call with empty args and brace block for do block command call 2024-07-21 11:02:38 +09:00
yui-knk
84680dc255 Include undef keyword into UNDEF NODE location
For example:

```
undef a, b
```

Before:

```
@ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,10))*
```

After:

```
@ NODE_UNDEF (id: 1, line: 1, location: (1,0)-(1,10))*
```
2024-07-20 13:04:48 +09:00
yui-knk
6be539aab5 Change UNDEF Node structure
Change the UNDEF node to hold its items, keeping the original grammar
structure.

For example:

```
undef a, b
```

Before:

```
@ NODE_BLOCK (id: 4, line: 1, location: (1,6)-(1,10))*
+- nd_head (1):
|   @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,7))
|   +- nd_undef:
|       @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7))
|       +- string: :a
+- nd_head (2):
    @ NODE_UNDEF (id: 3, line: 1, location: (1,9)-(1,10))
    +- nd_undef:
        @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10))
        +- string: :b
```

After:

```
@ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,10))*
+- nd_undefs:
    +- length: 2
    +- element (0):
    |   @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7))
    |   +- string: :a
    +- element (1):
        @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10))
        +- string: :b
```
2024-07-20 11:25:26 +09:00
yui-knk
231a9acc15 Free data of struct rb_parser_ary in rb_parser_ary_free
For example:

    10.times do
      100_000.times do
        RubyVM::AbstractSyntaxTree.parse("x = 1 + 2 +", keep_tokens: true)
      rescue SyntaxError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    28944
    44816
    60720
    76496
    92336
    108160
    123968
    139808
    155648
    171408

After:

    11984
    12704
    12816
    12832
    13072
    13088
    13088
    13136
    13136
    13152
2024-07-18 19:19:27 +09:00
yui-knk
4fb7e1b6d0 Change enum rb_parser_ary_data_type default value to 1 for easy debug
We have recently been hitting `[BUG] unexpected rb_parser_ary_data_type (0) for script lines`
on the master branch.
This commit changes `enum rb_parser_ary_data_type` to start at `1`,
leaving `0` as an invalid value, which makes it clear that
`rb_parser_ary_data_type (0)` is not intentional.
2024-06-26 07:48:43 +09:00
Nobuyoshi Nakada
250fc1223c [Bug #20457] Do not remove final return node
This was an optimization for versions prior to 1.9 that traverse the
AST at runtime.
2024-06-25 11:07:58 +09:00
Nobuyoshi Nakada
22f98bb7ca Parenthesize nd_fl_newline macro expressions 2024-06-25 11:07:58 +09:00
Aaron Patterson
cdf33ed5f3 Optimized forwarding callers and callees
This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a

def foo(...)
  list = [1, 2]
  bar(*list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`.
We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees.
The caller's info object (known as the "CI") contains the stack size of the
parameters, so we pass the CI object itself as a parameter to the callee.
When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee.
The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls
into `delegator`, it writes `CI1` onto the stack as a local variable for the
`delegator` method.  The `delegator` method has a special local called `...`
that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to
memcopy the caller's stack before calling `delegatee`.  In this case, it will
memcopy self, 1, and 2 to the stack before calling `delegatee`.  It knows how much
memory to copy from the caller because `CI1` contains stack size information
(argc: 2).

Before executing the `send` instruction, we push `...` on the stack.  The
`send` instruction pops `...`, and because it is tagged with `FORWARDING`, it
knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack.  `send` is tagged with
FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order
to support passing other values in addition to the `...` value, as well as
to perfectly forward splat args, kwargs, etc.
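
The forwarding behavior described here can be observed from plain Ruby. This is a sketch of the language-level semantics only (it does not show the optimization itself); `delegatee` and `delegator` are redefined here as hypothetical illustration methods, and a Ruby version that supports leading arguments with `...` (3.0 or later) is assumed:

```ruby
# Hypothetical methods for illustration; they are not part of the patch.
def delegatee(*args, **kwargs, &block)
  [args, kwargs, block&.call]
end

# A leading argument is passed in addition to the forwarded `...`.
def delegator(...)
  delegatee(0, ...)
end

# Splat arguments, keyword arguments, and the block all reach delegatee.
list = [1, 2]
result = delegator(*list, k: 3) { :from_block }
p result
```

Here `result` holds the leading `0` followed by the splatted arguments, then the keyword hash, then the value returned by the forwarded block.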

Since we're able to copy the stack from `caller` into `delegator`'s stack, we
can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods.
My long-term goal is to implement `Class#new` in Ruby, and that implementation uses `...`.

I was able to implement `Class#new` in Ruby
[here](https://github.com/ruby/ruby/pull/9289).
If we adopt the technique in this patch, then we can optimize allocating
objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one
for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique with implementing `Class#new` in Ruby, then we can
reduce allocations for this common operation.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2024-06-18 09:28:25 -07:00
Nobuyoshi Nakada
a1f72a563b [Bug #20579] ripper: Dispatch spaces at END-OF-INPUT without newline 2024-06-14 17:54:02 +09:00
Nobuyoshi Nakada
7f47469105 Include __LINE__ in add_delayed_token macro 2024-06-14 17:54:02 +09:00
Nobuyoshi Nakada
2e59cf00cc [Bug #20578] ripper: Fix dispatching part at invalid escapes 2024-06-14 15:02:15 +09:00
S-H-GAMELINKS
1fc0763724 Introduce ident_or_const inline rule 2024-06-12 15:36:55 +09:00
Nobuyoshi Nakada
206465e84d ripper: Unify dispatch_end 2024-06-12 11:49:33 +09:00
Nobuyoshi Nakada
906a86e4de Use dllexport as RUBY_FUNC_EXPORTED on Windows 2024-06-09 16:55:27 +09:00
Nobuyoshi Nakada
7612e45306 ripper: Unify formal argument error handling 2024-06-08 15:00:18 +09:00
Nobuyoshi Nakada
9bee49e0e1 ripper: Unify backref error handling 2024-06-08 13:25:44 +09:00
Nobuyoshi Nakada
18fcec23bf ripper: Introduce RIPPER_ID macro instead of ripper_id_ macros 2024-06-08 13:20:46 +09:00
Nobuyoshi Nakada
9e28354705 ripper: Fix excess compile_error at simple backref op_asgn
Fix up 89cfc15207.
2024-06-07 11:28:38 +09:00
Kevin Newton
cbc83c4a92 Remove circular parameter syntax error
https://bugs.ruby-lang.org/issues/20478
2024-06-06 16:29:50 -04:00
Nobuyoshi Nakada
27321290d9 [Bug #20521] ripper: Clean up strterm 2024-06-06 20:43:56 +09:00
Nobuyoshi Nakada
ae203984ff Ditto for NODE_DOT2 and NODE_DOT3 2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada
2889ed1bcb Use RNode_DREGX variable for debuggers
At least LLDB needs an actual variable not only casts to access the
type in debugger sessions.
2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada
cedc7737b6 Make interchangeable NODE types aliases 2024-06-02 09:43:33 +09:00
Nobuyoshi Nakada
fd74614059 Get rid of type-punning pointer casts 2024-06-01 21:51:27 +09:00
Nobuyoshi Nakada
05553cf22d [Bug #20517] Make a multibyte character one token at meta escape 2024-06-01 19:33:12 +09:00
Jeremy Evans
89486c79bb Make error messages clear blocks/keywords are disallowed in index assignment
Blocks and keywords are allowed in regular index.

Also update NEWS to make this more clear.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-05-31 08:22:40 -07:00
Yusuke Endoh
a15e4d405b Revert 528c4501f4
Recently, `TestRubyLiteral#test_float` has been failing randomly.

```
  1) Error:
TestRubyLiteral#test_float:
ArgumentError: SyntaxError#path changed: "(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"->"(eval at /home/chkbuild/chkbuild/tmp/build/20240527T050036Z/ruby/test/ruby/test_literal.rb:642)"
```
20240527T050036Z.fail.html.gz

According to Launchable, the first failure was on Apr 30.
This is exactly when 528c4501f4 was
committed. I don't know if the change is really the cause, but I want to
revert it for now to see if the random failure disappears.
2024-05-31 18:24:43 +09:00
Kevin Newton
47f0965269 Update duplicated when clause warning message 2024-05-24 12:36:54 -04:00
Nobuyoshi Nakada
a99d79dd31 Remove dead code
Since 140512d222, `else` without
`rescue` has been a syntax error.
2024-05-23 19:28:02 +09:00
Yusuke Endoh
1471a160ba Add RB_GC_GUARD for rb_str_to_parser_string
I think this fixes the following random test failure that could not be
fixed for a long time:

```
  1) Failure:
TestSymbol#test_inspect_under_gc_compact_stress [/home/chkbuild/chkbuild/tmp/build/20240522T003003Z/ruby/test/ruby/test_symbol.rb:126]:
<":testing"> expected but was
<":\"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\"">.
```

The value passed to this function is the return value of `rb_id2str`, so
it is never collected.  However, if auto_compact is enabled, the string
may move and `RSTRING_PTR(str)` becomes invalid.

This change uses `RB_GC_GUARD` to prevent the string from being moved.
2024-05-23 19:26:45 +09:00
Nobuyoshi Nakada
c773453c77 ripper: Splat find patterns 2024-05-21 13:52:30 +09:00
Nobuyoshi Nakada
501dbf2bca ripper: Splat hash patterns 2024-05-21 13:52:30 +09:00