2232 commits

Author SHA1 Message Date
nagachika
483ad38c69 merge revision(s) b15e88e0fc: [Backport #19619]
[Bug #19619] Preserve numbered parameters context

	Preserve numbered parameters context across method definitions
2024-07-21 12:25:04 +09:00
nagachika
96a82418b2 merge revision(s) 05553cf22d: [Backport #20517]
[Bug #20517] Make a multibyte character one token at meta escape
2024-07-15 22:01:57 +09:00
nagachika
a804d5514c merge revision(s) d503e1b95a: [Backport #20030]
[Bug #20030] dispatch invalid escaped character without ignoring it
2024-07-15 21:51:03 +09:00
nagachika
1b9ff146e3 Revert "merge revision(s) c8d162c889: [Backport #19973]"
This reverts commit 24dd529750.
2024-02-25 14:03:05 +09:00
nagachika
24dd529750 merge revision(s) c8d162c889: [Backport #19973]
[Bug #19973] Warn duplicated keyword arguments after keyword splat

	---
	 parse.y                  | 11 +++++++----
	 test/ruby/test_syntax.rb |  6 ++++++
	 2 files changed, 13 insertions(+), 4 deletions(-)
2024-02-25 11:52:05 +09:00
nagachika
482e1f573e merge revision(s) 17b0643392: [Backport #19924]
[Bug #19924] Source code should be unsigned char stream

	Use `peekc` or `nextc` to fetch the next character, instead of reading
	from `lex.pcur` directly, for compilers where plain char is signed.
	---
	 parse.y                 | 10 +++++-----
	 test/ruby/test_parse.rb |  2 ++
	 2 files changed, 7 insertions(+), 5 deletions(-)
2023-11-10 11:41:57 +09:00
nagachika
bb877e5b4f merge revision(s) 382678d411: [Backport #19788]
[Bug #19788] Use the result of `tCOLON2` event

	---
	 parse.y                            | 16 ++++++++--------
	 test/ripper/test_parser_events.rb  | 17 +++++++++++++++++
	 test/ripper/test_scanner_events.rb |  5 +++++
	 3 files changed, 30 insertions(+), 8 deletions(-)
2023-09-09 19:33:29 +09:00
nagachika
0c908fa681 merge revision(s) 0b8f15575a: [Backport #19836]
Fix memory leak for incomplete lambdas

	[Bug #19836]

	The parser does not free the chain of `struct vtable`, which causes
	memory leaks.

	The following script reproduces this issue:

	```
	10.times do
	  100_000.times do
	    Ripper.parse("-> {")
	  end

	  puts `ps -o rss= -p #{$$}`
	end
	```
	---
	 parse.y                    | 24 ++++++++++++++----------
	 test/ripper/test_ripper.rb |  7 +++++++
	 2 files changed, 21 insertions(+), 10 deletions(-)
2023-08-13 13:35:25 +09:00
nagachika
6898389a0f merge revision(s) 5bc8fceca8: [Backport #19835]
Fix memory leak in parser for incomplete tokens

	[Bug #19835]

	The parser does not free the `tbl` of the `struct vtable` when there are
	leftover `lvtbl` in the parser. This causes a memory leak.

	The following script reproduces this issue:

	```
	10.times do
	  100_000.times do
	    Ripper.parse("class Foo")
	  end

	  puts `ps -o rss= -p #{$$}`
	end
	```
	---
	 parse.y                    | 42 ++++++++++++++++++++++++++++--------------
	 test/ripper/test_ripper.rb |  7 +++++++
	 2 files changed, 35 insertions(+), 14 deletions(-)
2023-08-13 13:21:30 +09:00
nagachika
465eb7418d merge revision(s) 91c004885f: [Backport #19025]
[Bug #19025] Numbered parameter names are always local variables

	---
	 parse.y                  | 2 +-
	 test/ruby/test_syntax.rb | 1 +
	 2 files changed, 2 insertions(+), 1 deletion(-)
2023-07-22 11:55:49 +09:00
nagachika
3f6187a947 merge revision(s) 1bc8838d60: [Backport #19750]
Handle unterminated unicode escapes in regexps

	This fixes an infinite loop possible after ec3542229b.
	For \u{} escapes in regexps, skip validation in the parser, and rely on the regexp
	code to handle validation. This is necessary so that invalid unicode escapes in
	comments in extended regexps are allowed.

	Fixes [Bug #19750]

	Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
	---
	 parse.y                 | 97 ++++++++++++++++++++++++++++++++-----------------
	 test/ruby/test_parse.rb | 16 ++++++++
	 2 files changed, 79 insertions(+), 34 deletions(-)
2023-07-17 18:04:41 +09:00
Shugo Maeda
2581de112c Disallow mixed usage of ... and */**
[Feature #19134]
2022-12-15 18:56:24 +09:00
Nobuyoshi Nakada
764da87ab0 [Bug #19195] Allow optional newlines before closing parenthesis 2022-12-13 18:06:11 +09:00
Shugo Maeda
04311008b6
Use loc to fix a failure of test_ast.rb
```
    1) Failure:
  TestAst#test_ranges:test/fiber/scheduler.rb [/home/runner/work/ruby/ruby/src/test/ruby/test_ast.rb:122]:
  <[]> expected but was
  <[{:type=>:max_validation_error,
    :max=>
     #<RubyVM::AbstractSyntaxTree::Node::CodePosition:0x00007f80d630b598
      @column=20,
      @lineno=203>,
    :end_pos=>
     #<RubyVM::AbstractSyntaxTree::Node::CodePosition:0x00007f80d630b778
      @column=19,
      @lineno=203>,
    :node=>
     (BLOCK_PASS@203:15-203:19
        (ARGSPUSH@203:15-203:20 (SPLAT@203:16-203:19 (LVAR@203:16-203:19 :*))
           (HASH@203:16-203:19
              (LIST@203:16-203:19 nil (LVAR@203:16-203:19 :**) nil)))
        (LVAR@203:16-203:19 :&))}]>.
```
2022-12-05 15:54:21 +09:00
Shugo Maeda
2649055c98
Should use argsloc for the last argument for arg_append() 2022-12-05 15:10:15 +09:00
S-H-GAMELINKS
1a64d45c67 Introduce encoding check macro 2022-12-02 01:31:27 +09:00
yui-knk
8be62f06c8 Remove ruby2_keywords related to args forwarding
This was introduced by b609bdeb53 to suppress warnings.
However, these warnings were deleted by beae6cbf0f,
so this code is no longer needed.
2022-11-29 15:39:56 +09:00
Shugo Maeda
a0e4dc52b0 Use idFWD_* instead of ANON_*_ID 2022-11-29 11:22:09 +09:00
Shugo Maeda
4fc668a4f3 Allow ** in def foo(...)
[Feature #19134]
2022-11-29 11:22:09 +09:00
Jeremy Evans
f5d73da806 Fix the position of rescue clause without exc_list
If the rescue clause has only exc_var and not exc_list, use the
exc_var position instead of the rescue body position.

This issue appears to have been introduced in
688169fd83 when "opt_list" was split
into "exc_list exc_var".

Fixes [Bug #18974]
2022-11-24 14:26:08 -08:00
yui-knk
854312eede Refactor to use has_delayed_token macro 2022-11-21 16:32:13 +09:00
yui-knk
d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementations of the Language Server Protocol (LSP) sometimes need token information.
For example, `m(1)` and `m(1, )` have the same AST structure apart from node locations,
so it is impossible to detect the presence of `,` from the AST alone. In the latter case, however,
it might be better to suggest a list of variables for the second argument.
Token information is important in such cases.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens`, which returns the tokens for the node, including tokens for its descendant nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens`, which returns all tokens for the input script regardless of the receiver node.

[Feature #19070]

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
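The motivating case in the d8601621ed message, telling `m(1)` apart from `m(1, )`, can be sketched with the token list. This is a minimal sketch, assuming a Ruby that accepts `keep_tokens` (3.2 or later); the helper name `comma_token?` is hypothetical, and each token is assumed to be an array whose third element is the token text, as shown in the `RubyVM::AbstractSyntaxTree` documentation.

```ruby
# Hypothetical helper: reports whether the source contains a "," token.
# Returns nil when this Ruby does not accept the keep_tokens option.
def comma_token?(src)
  root = RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
  # Each token is assumed to be [id, type, text, location]; compare the text.
  root.all_tokens.any? { |token| token[2] == "," }
rescue ArgumentError
  nil
end

p comma_token?("m(1)")
p comma_token?("m(1, )")
```

Comparing the token text rather than the token type keeps the sketch independent of the parser's internal token names.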
yui-knk
f0ce118662 Make anonymous rest arg (*) and block arg (&) accessible from ARGS node 2022-11-18 18:25:42 +09:00
Shugo Maeda
ddd62fadaf Allow anonymous keyword rest parameter with other keyword parameters
Fixes [Bug #19132]
2022-11-18 18:23:06 +09:00
S-H-GAMELINKS
1f4f6c9832 Using UNDEF_P macro 2022-11-16 18:58:33 +09:00
Nobuyoshi Nakada
230267d1a8 Now bison 3.0 or later is required 2022-11-09 21:34:02 +09:00
yui-knk
f7db1affd1 Set default %printer for NODE nterms
Before:

```
Reducing stack by rule 639 (line 5062):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: )
```

After:

```
Reducing stack by rule 641 (line 5078):
   $1 = token "integer literal" (1.0-1.1: 1)
-> $$ = nterm simple_numeric (1.0-1.1: NODE_LIT)
```

`"<*>"` is supported by Bison 2.3b (2008-05-27) or later.
https://git.savannah.gnu.org/cgit/bison.git/commit/?id=12e3584054c16ab255672c07af0ffc7bb220e8bc

Therefore, developers whose Bison is older need to install Bison 2.3b+ to build Ruby from
source.

Minimum version requirement for Bison is changed to 3.0.

See: https://bugs.ruby-lang.org/issues/19068 [Feature #19068]
2022-11-08 12:30:03 +09:00
Nobuyoshi Nakada
546566d34b
Do not set $! to SyntaxError when error tolerant 2022-10-09 19:07:21 +09:00
yui-knk
8483737bbf Fix typos 2022-10-08 23:29:36 +09:00
yui-knk
50f5223236 Fix SEGV of dump parsetree
Assign internal_id to the semantic value so that the dump parsetree option
can render the tree for the following code without a SEGV.

* `def m(&); end`
* `def m(*); end`
* `def m(**); end`
2022-10-08 22:30:50 +09:00
yui-knk
3531086095 "expr_value" can be error
So that "IF" node is kept in the case below

```
def m
  if
end
```

[Feature #19013]
2022-10-08 17:59:11 +09:00
yui-knk
4bfdf6d06d Move error from top_stmts and top_stmt to stmt
With this change, syntax errors are recovered in smaller units.
In the case below, "DEFN :bar" is now at the same level as
"CLASS :Foo".

```
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```

[Feature #19013]
2022-10-08 17:59:11 +09:00
yui-knk
4f24f3ea94 Treat "end" as reserved word with consideration of indent
"end" after "." or "::" is treated as a local variable or method;
see `EXPR_DOT_bit` for details.
However, this "changes" where the `bar` method is defined. In the example
below it is defined not in module Z but in class Foo.

```
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```

[Feature #19013]
2022-10-08 17:59:11 +09:00
yui-knk
342d4c16d9 Generate "end" tokens if the parser hits end of input
while "end" tokens are still needed for a valid program.

[Feature #19013]
2022-10-08 17:59:11 +09:00
yui-knk
fbbdbdd891 Add error_tolerant option to RubyVM::AST
If this option is enabled, SyntaxError is not raised and a Node is
returned even if the passed script is broken.

[Feature #19013]
2022-10-08 17:59:11 +09:00
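The behavior described in fbbdbdd891 can be sketched by parsing the same broken script in both modes. This is a minimal sketch, assuming a Ruby that accepts the `error_tolerant` keyword; the helper names `tolerant_parse` and `strict_parse_error` are hypothetical, and the broken input is the one used in the `RubyVM::AbstractSyntaxTree.parse` documentation.

```ruby
# Broken input: the parenthesis opened by `p(` is never closed.
BROKEN_SRC = "x = 1; p(x; y=2"

# Hypothetical helper: parse in error-tolerant mode.
# Returns nil when this Ruby does not accept the error_tolerant option.
def tolerant_parse(src)
  RubyVM::AbstractSyntaxTree.parse(src, error_tolerant: true)
rescue ArgumentError
  nil
end

# Hypothetical helper: parse in the default mode and return the
# SyntaxError that was raised, or nil if parsing succeeded.
def strict_parse_error(src)
  RubyVM::AbstractSyntaxTree.parse(src)
  nil
rescue SyntaxError => e
  e
end

p strict_parse_error(BROKEN_SRC).class
p tolerant_parse(BROKEN_SRC)
```

Returning a node for broken input is what lets editor tooling keep working while the user is still typing.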
Shugo Maeda
a8ad22d926
Suppress a warning on clang
The following warning appears without this fix:

```
parse.y:78:1: warning: unknown warning group '-Wpsabi', ignored
      [-Wunknown-warning-option]
RBIMPL_WARNING_IGNORED(-Wpsabi)
^
./include/ruby/internal/warning_push.h:103:39: note: expanded from macro
      'RBIMPL_WARNING_IGNORED'
                                      ^
./include/ruby/internal/warning_push.h:99:39: note: expanded from macro
      'RBIMPL_WARNING_PRAGMA2'
                                      ^
./include/ruby/internal/warning_push.h:98:39: note: expanded from macro
      'RBIMPL_WARNING_PRAGMA1'
                                      ^
./include/ruby/internal/warning_push.h:97:39: note: expanded from macro
      'RBIMPL_WARNING_PRAGMA0'
                                      ^
<scratch space>:49:27: note: expanded from here
 clang diagnostic ignored "-Wpsabi"
                          ^
1 warning generated.
```
2022-09-26 14:44:54 +09:00
S.H
960db13c47
Reuse opt_arg_append function 2022-09-14 23:10:21 +09:00
Kazuki Tsujimoto
db0e0dad11
Fix unexpected "duplicated key name" error in paren-less one line pattern matching
[Bug #18990]
2022-09-09 14:00:27 +09:00
Nobuyoshi Nakada
ace2eee544
[Bug #18963] Separate string contents by here document terminator 2022-08-28 09:29:24 +09:00
S.H
13d31331c8
Reuse nonlocal_var patterns 2022-08-22 18:52:36 +09:00
S-H-GAMELINKS
3541f32951 Reuse opt_nl rule 2022-08-19 09:51:06 +09:00
S-H-GAMELINKS
f095361758 Replace with NIL_P macro 2022-08-19 09:47:43 +09:00
Nobuyoshi Nakada
844a0edbae [Bug #18962] Do not read again once reached EOF
`Ripper::Lexer#parse` re-parses the source code with syntax errors
when `raise_errors: false`.

Co-Authored-By: tompng <tomoyapenguin@gmail.com>
2022-08-12 15:58:18 +09:00
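The code path touched by 844a0edbae can be exercised by lexing incomplete source with `raise_errors: false`. This is a minimal sketch, assuming a Ruby whose `Ripper.lex` accepts that keyword; the sample source is made up for illustration and is not the reproduction case from the bug report.

```ruby
require "ripper"

# Incomplete source: the parameter list is never closed.
src = "def m("

# With raise_errors: false the lexer returns the tokens it scanned
# instead of raising SyntaxError.
tokens = Ripper.lex(src, raise_errors: false)

# Each entry is [[line, column], event, text, state].
tokens.each do |(line, col), event, text|
  puts "#{line}:#{col} #{event} #{text.inspect}"
end
```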
Kevin Backhouse
8c1808151f
Fix some UBSAN false positives (#6115)
* Fix some UBSAN false positives.
* ruby tool/update-deps --fix
2022-07-12 11:48:10 -07:00
Nobuyoshi Nakada
eaeb130b11 [Bug #18890] newline should be insignificant after pattern label 2022-07-06 08:32:36 +09:00
Nobuyoshi Nakada
982cda9a3e [Bug #18877] Let lex_ctxt not to eat escaped whitespace 2022-06-30 16:31:51 +09:00
Nobuyoshi Nakada
685efac059
[Bug #18884] class cannot be just followed by modifiers 2022-06-29 14:13:15 +09:00
Nobuyoshi Nakada
961543945f
Suppress notes for old gcc 2022-06-23 22:52:45 +09:00
S-H-GAMELINKS
420f3ced4d Using is_ascii_string to check encoding 2022-06-17 12:02:50 +09:00
Nobuyoshi Nakada
cd5cafa4a3 Respect the encoding of the source [Bug #18827]
Do not override the input string encoding at the time of preparation;
the source encoding is not determined from the input yet.
2022-06-17 01:48:52 +09:00