20692 commits

Author SHA1 Message Date
Earlopain
e5e160475b [ruby/prism] Warn when the parser translator receives an incompatible builder class
In https://github.com/ruby/prism/pull/3494 I added a bit of code
so that using the new builder doesn't break stuff.
This code can be dropped when it is enforced that builder
is _always_ the correct subclass (and makes future issues like that unlikely).

193d4b806d
2025-03-19 21:03:17 +00:00
Nobuyoshi Nakada
00c84f4d49
A comment for TestRubyOptions::ExpectedStderrList [ci skip] 2025-03-19 15:19:06 +09:00
Nobuyoshi Nakada
6c7f721f1e
Source path may or may not exist 2025-03-19 15:08:20 +09:00
Yusuke Endoh
3eb802fb56 Loosen SEGV message testing
Since `rb_bug` does not always take Ruby frame info during SEGV, the
source file path may not be output.

```
  1) Failure:
TestRubyOptions#test_crash_report_script [/tmp/ruby/src/trunk_gcc11/test/ruby/test_rubyoptions.rb:907]:
Expected /
        bug\.rb:(?:1:)?\s\[BUG\]\sSegmentation\sfault.*\n
      /x
to match
  "[BUG] Segmentation fault at 0x000003e900328766\n"+
```
http://ci.rvm.jp/results/trunk_gcc11@ruby-sp2-noble-docker/5663880
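
A minimal sketch of what loosening such a pattern can look like, assuming the source-path prefix is simply made optional. The constant names and the `LOOSENED` regexp are illustrative and are not taken from the commit.

```ruby
# Hypothetical patterns, for illustration only.
# STRICT requires the "bug.rb:" prefix; LOOSENED makes it optional.
STRICT   = /bug\.rb:(?:1:)?\s\[BUG\]\sSegmentation\sfault.*\n/x
LOOSENED = /(?:bug\.rb:(?:1:)?\s)?\[BUG\]\sSegmentation\sfault.*\n/x

with_path    = "bug.rb:1: [BUG] Segmentation fault at 0x000003e900328766\n"
without_path = "[BUG] Segmentation fault at 0x000003e900328766\n"

puts STRICT.match?(with_path)      # the strict pattern accepts the prefixed form
puts STRICT.match?(without_path)   # but rejects output without the source path
puts LOOSENED.match?(with_path)    # the loosened pattern accepts both forms
puts LOOSENED.match?(without_path)
```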
2025-03-19 14:57:15 +09:00
Kevin Newton
adaaa7878e Handle void expressions in defined?
[Bug #21029]
2025-03-18 14:44:28 -04:00
Kevin Newton
33aaa069a4 [ruby/prism] Update truffleruby version
2afe89f8ce
2025-03-18 13:36:53 -04:00
Kevin Newton
dc48c1aca3 [ruby/prism] Add a multiple statements flag to parentheses
This can get triggered even if the list of statements only contains
a single statement. This is necessary to properly support compiling

```ruby
defined? (;a)
defined? (a;)
```

as "expression". Previously these were parsed as statements lists
with single statements in them.

b63b5d67a9
2025-03-18 13:36:53 -04:00
Earlopain
e3c8464630 [ruby/prism] Only unnest parser mlhs nodes when no rest argument is provided
```
(a,), = []

PARSER====================
s(:masgn,
  s(:mlhs,
    s(:mlhs,
      s(:lvasgn, :a))),
  s(:array))
PRISM====================
s(:masgn,
  s(:mlhs,
    s(:lvasgn, :a)),
  s(:array))
```

8aa1f4690e
2025-03-18 13:36:53 -04:00
Earlopain
94e12ffa39 [ruby/prism] Fix parser translator multiline interpolated symbols
In 2637007929 I added tests but didn't modify them correctly

de021e74de
2025-03-18 13:36:53 -04:00
Earlopain
a8adf5e006 [ruby/prism] Further refine string handling in the parser translator
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `:'foo'` and `%s[foo]` to not be interpolation

4edfe9d981
2025-03-18 13:36:53 -04:00
Earlopain
fc14d3ac7d [ruby/prism] Allow to test a custom fixtures path during testing
Of course, these won't really be fixtures, but it allows testing against whole codebases
without copying them, symlinking them, or something like that.

For example, I can tell that over the whole RuboCop codebase, there are only 8 files that produce mismatched ast.
Telling what the problem is is a different problem. The ast for real files can and will be huge so I haven't checked yet
(maybe parser bug) but it's nice for discoverability regardless

2184d82ba6
2025-03-18 13:36:53 -04:00
Kevin Newton
0b4604d5a0 [ruby/prism] Use Set.new over to_set
422d5c4c64
2025-03-18 13:36:53 -04:00
Earlopain
d5503444fd [ruby/prism] Fix parser translator crash for certain octal escapes
`Integer#chr` performs some validation that we don't want/need. Octal escapes can go above 255, where it will then raise trying to convert.

`append_as_bytes` actually allows passing a number, so we can just skip that call.
Although, on older rubies of course we still need to handle this in the polyfill.
I don't really like using `pack` but don't know of another way to do so.

For the utf-8 escapes, this is not an issue. Invalid utf-8 in these is simply a syntax error.
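
The difference between `Integer#chr` and `pack` described above can be sketched as follows. `octal_escape_byte` is a hypothetical helper name used only for this illustration; it is not part of prism.

```ruby
# Convert the digits of an octal escape (e.g. "777" from "\777") into one byte.
# Hypothetical helper, for illustration only.
def octal_escape_byte(digits)
  value = digits.to_i(8)
  # Integer#chr raises RangeError for values above 255, so pack the value
  # as an unsigned 8-bit integer instead; "C" keeps only the low byte.
  [value].pack("C")
end

p octal_escape_byte("101").bytes # a value that fits into one byte
p octal_escape_byte("777").bytes # 0o777 is 511, which is above 255

begin
  0o777.chr
rescue RangeError => error
  puts "Integer#chr refused the value: #{error.message}"
end
```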

161c606b1f
2025-03-18 13:36:53 -04:00
Earlopain
fd7a10cf4a [ruby/prism] Further refine string handling in the parser translator
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `:'foo'` and `%s[foo]` to not be interpolation

4edfe9d981
2025-03-18 13:36:53 -04:00
Earlopain
5d138f2b43 [ruby/prism] Better handle regexp in the parser translator
Turns out, it was already almost correct. If you disregard \c and \M style escapes, only a single character is allowed to be escaped in a regex so most tests passed already.

There was also a mistake where the wrong value was constructed for the ast, this is now fixed.
One test fails because of this, but I'm fairly sure it is because of a parser bug. For `/\“/`, the backslash is supposed to be removed because it is a multibyte character. But tbh,
I don't entirely understand all the rules.

Fixes more than half of the remaining ast differences for rubocop tests

e1c75f304b
2025-03-18 13:36:53 -04:00
Earlopain
177adf6fa5 [ruby/prism] Fix parser translator tokens for %-arrays with whitespace escapes
Also fixes a token incompatibility for the word separator. parser only considers whitespace until the first newline

bd3dd2b62a
2025-03-18 13:36:53 -04:00
Earlopain
ac728389e2 [ruby/prism] Fix parser translator edge-case when multiline string ends with \n
When the line contains no real newline but contains unescaped ones, then there will be one less entry

4ef093b600
2025-03-18 13:36:53 -04:00
Earlopain
0fcb7fc21d [ruby/prism] Better handle all kinds of multiline strings in the parser translator
This is a followup to #3373, where the implementation
was extracted

2637007929
2025-03-18 13:36:53 -04:00
Earlopain
acf404e20e [ruby/prism] Fix an incompatibility with the parser translator
The offset cache contains an entry for each byte so it can't be accessed via the string length.

Adds tests for all variants except for this:
```
"fo
o" "ba
’"
```

For some reason, this still has the wrong offset.

a651126458
2025-03-18 13:36:53 -04:00
Earlopain
f49a0114e3 [ruby/prism] Fix parser translator rescue location with semicolon body
There are a few other locations that should be included in that check.
I think the end location must always be present but I left it in to be safe (maybe implicit begin somehow?)

545d07ddc3
2025-03-18 13:36:53 -04:00
Earlopain
bc506295a3 [ruby/prism] Further refine string handling in the parser translator
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `:'foo'` and `%s[foo]` to not be interpolation

4edfe9d981
2025-03-18 13:36:53 -04:00
Kevin Newton
fcd6e53693 Remove incorrectly committed snapshots 2025-03-18 13:36:53 -04:00
Earlopain
705bd6fadb [ruby/prism] Fix parser translator when unescaping invalid utf8
1. The string starts out as binary
2. `ち` is appended, forcing it back into utf-8
3. An attempt is made to append some invalid byte sequences

> incompatible character encodings: UTF-8 and BINARY (ASCII-8BIT)

This makes use of my wish to use `append_as_bytes`. Unfortunately, that method is rather new,
so it needs a fallback
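
The three steps above, and one possible shape for a fallback, can be sketched like this. `append_bytes` is a hypothetical helper name, it only handles String arguments, and it is not the polyfill that prism actually ships.

```ruby
# Append raw bytes to `buffer` without an encoding compatibility check.
# Hypothetical helper, for illustration only.
def append_bytes(buffer, bytes)
  if buffer.respond_to?(:append_as_bytes)
    buffer.append_as_bytes(bytes) # available on newer Rubies
  else
    # Fallback: append while the buffer is tagged as binary, then restore
    # the original encoding tag without touching the bytes.
    original = buffer.encoding
    buffer.force_encoding(Encoding::BINARY)
    buffer << bytes.b
    buffer.force_encoding(original)
  end
  buffer
end

buffer = "".b       # 1. the string starts out as binary
buffer << "\u3061"  # 2. appending the character switches the empty buffer to UTF-8

begin
  buffer << "\xFF".b # 3. a plain append of an invalid byte raises
rescue Encoding::CompatibilityError => error
  puts error.message
end

append_bytes(buffer, "\xFF".b)
p buffer.bytes
```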

e31e94a775
2025-03-18 13:36:53 -04:00
Kevin Newton
3d6fc29169 [ruby/prism] Make xstrings concat syntax error
f734350499
2025-03-18 16:00:03 +00:00
Nobuyoshi Nakada
f69ad0e810
[Bug #21094] Update nested module names when setting temporary name 2025-03-18 23:47:20 +09:00
Mari Imaizumi
e63c516046 [Feature #19908] Update Unicode headers to 15.1.0 2025-03-18 21:18:12 +09:00
Mari Imaizumi
75844889eb Fix case folding in single byte encoding 2025-03-18 21:04:02 +09:00
Nobuyoshi Nakada
c7f31c88ae
[Feature #20702] Tests for Array#fetch_values 2025-03-18 17:55:46 +09:00
Nobuyoshi Nakada
1acfb29015 [Bug #21186] multibyte char literal should be a single letter word 2025-03-17 23:55:11 +09:00
Nobuyoshi Nakada
8d6f153fba Manage skipping instance variable IDs in one place 2025-03-17 23:42:16 +09:00
Nobuyoshi Nakada
8f19f0aad5 [ruby/optparse] Fix completion of key-value pairs array
The enum array may be a list of key-value pairs.  Check whether only the
key is completable, not the pair.

Fix https://github.com/ruby/optparse/pull/93
Fix https://github.com/ruby/optparse/pull/94

a8d0ba8dac
2025-03-17 10:18:49 +00:00
Jérôme Parent-Lévesque
b5cdbadeed [Bug #21185] Fix Range#overlap? with infinite range
Infinite ranges, i.e. unbounded ranges, should overlap with any other range,
which wasn't the case in the following example: `(0..3).overlap?(nil..nil)`
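
A pure-Ruby sketch of the intended semantics, assuming non-empty ranges and treating a `nil` bound as unbounded on that side. `ranges_overlap?` and `starts_before_end?` are hypothetical helpers and do not mirror the C implementation.

```ruby
# True when `left` begins no later than `right` ends.
# A nil bound means "unbounded", so it can never rule out an overlap.
def starts_before_end?(left, right)
  return true if left.begin.nil? || right.end.nil?

  cmp = left.begin <=> right.end
  return false if cmp.nil? # incomparable bounds never overlap

  right.exclude_end? ? cmp < 0 : cmp <= 0
end

# Two non-empty ranges overlap when each one starts before the other ends.
def ranges_overlap?(a, b)
  starts_before_end?(a, b) && starts_before_end?(b, a)
end

p ranges_overlap?((0..3), (nil..nil)) # fully unbounded range
p ranges_overlap?((0..3), (4..5))     # disjoint ranges
p ranges_overlap?((0...3), (3..5))    # excluded end touches the next begin
p ranges_overlap?((0..3), (3..nil))   # endless range sharing one point
```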
2025-03-17 16:26:23 +09:00
Nobuyoshi Nakada
35920f7a44
Refine TestSocket_TCPSocket#test_initialize_failure
* Use `assert_raise_kind_of` instead of `rescue` and `flunk`.
* Use `assert_include` for the pattern that may contain regexp meta
  characters.
2025-03-15 16:02:19 +09:00
Nobuyoshi Nakada
29c0ca58c2 Test for the crash 2025-03-15 15:50:46 +09:00
Jean Boussier
de48e47ddf Invoke inherited callbacks before const_added
[Misc #21143]

Conceptually this makes sense and is more consistent with using
the `Name = Class.new(Superclass)` alternative method.

However the new class is still named before `inherited` is called.
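
One way to observe the callback order on a given Ruby is to record both hooks. This is only a sketch: `Namespace`, `Base`, `Child` and `EVENTS` are made-up names, `const_added` requires Ruby 3.2 or later, and the recorded order depends on whether the running Ruby includes this change.

```ruby
EVENTS = []

module Namespace
  # Called when a constant is defined in Namespace (Ruby 3.2+).
  def self.const_added(name)
    EVENTS << [:const_added, name]
    super
  end

  class Base
    # Called whenever a subclass of Base is created.
    def self.inherited(subclass)
      EVENTS << [:inherited, subclass.name]
      super
    end
  end

  class Child < Base
  end
end

# Prints the hooks in the order they fired on this Ruby.
EVENTS.each { |event| p event }
```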
2025-03-14 09:51:57 +01:00
Kevin Newton
af76b7f4d9 [ruby/prism] Revert "Mark extension as Ractor-safe"
56eaf53732
2025-03-12 19:56:22 +00:00
Kevin Newton
242e99eb0f [ruby/prism] Mark extension as Ractor-safe
10e5431b38
2025-03-12 19:15:03 +00:00
Alan Wu
08b3a45bc9 Push a real iseq in rb_vm_push_frame_fname()
Previously, vm_make_env_each() (used during proc
creation and for the debug inspector C API) picked up the
non-GC-allocated iseq that rb_vm_push_frame_fname() creates,
which led to a SEGV when the GC tried to mark the non GC object.

Put a real iseq imemo instead. Speed should be about the same since
the old code also did an imemo allocation and a malloc allocation.

Real iseq allows ironing out the special-casing of dummy frames in
rb_execution_context_mark() and rb_execution_context_update(). A check
is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy
iseqs.

[Bug #21180]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-03-12 15:00:26 -04:00
Alan Wu
9b9661883b Have ast live longer in ISeq.compile_file to fix GC stress crash
Previously, live range of `ast_value` ended on the call right before
rb_ast_dispose(), which led to premature collection and use-after-free.

We observed this crashing on -O3, -DVM_CHECK_MODE, with GCC 11.4.0 on
Ubuntu.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-03-12 15:00:26 -04:00
Peter Zhu
a8d63ecdb8 Fix flaky test_AREF_fstring_key
The code between the two ObjectSpace.count_objects could trigger a GC,
which could free string objects causing this test to fail.
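
A general way to keep a collection from running between two `ObjectSpace.count_objects` snapshots is to pause GC around the measured code. The sketch below illustrates that idea only; `string_slots_added` is a hypothetical helper, and this is not necessarily how the commit fixes the test.

```ruby
# Count how many T_STRING slots a block adds, with GC paused so that a
# collection between the two snapshots cannot skew the result.
# Hypothetical helper, for illustration only.
def string_slots_added
  GC.start                       # settle the heap before measuring
  already_disabled = GC.disable  # returns true if GC was already disabled
  before = ObjectSpace.count_objects[:T_STRING]
  yield
  ObjectSpace.count_objects[:T_STRING] - before
ensure
  GC.enable unless already_disabled
end

kept = []
added = string_slots_added { 100.times { |i| kept << "key-#{i}" } }
puts "string slots added: #{added}"
```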

We can see this failure on CI http://ci.rvm.jp/results/trunk-random2@ruby-sp2-noble-docker/5651016

    TestHashOnly#test_AREF_fstring_key [test/ruby/test_hash.rb:1991]:
    <197483> expected but was
    <129689>.
2025-03-12 13:27:03 -04:00
Nobuyoshi Nakada
9459bedd84
[Bug #19841] Refine error on marshaling recursive USERDEF 2025-03-12 18:42:38 +09:00
Jean Boussier
1d07deb422 [ruby/json] Raise a ParserError on all incomplete unicode escape sequence.
This was the behavior until `2.10.0` inadvertently changed it.

`"\u1"` would raise, but `"\u1zzz"` wouldn't.
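
The rule being restored can be sketched as a standalone check: every `\u` escape must be followed by four hex digits. `incomplete_unicode_escape?` is a hypothetical helper written for illustration; it is not the json gem's parser.

```ruby
# Return true when the body of a JSON string contains a `\u` escape that is
# followed by fewer than four hex digits. Hypothetical helper.
def incomplete_unicode_escape?(body)
  # Match each backslash escape as a unit so that an escaped backslash
  # followed by "u" is not mistaken for a unicode escape.
  body.scan(/\\(?:u\h{0,4}|.)/m).any? do |escape|
    escape.start_with?("\\u") && escape.length != 6
  end
end

p incomplete_unicode_escape?('\u1')     # too few hex digits
p incomplete_unicode_escape?('\u1zzz')  # too few hex digits, then other text
p incomplete_unicode_escape?('\u00e9')  # complete escape
```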

7d0637b9e6
2025-03-12 18:02:09 +09:00
Peter Zhu
1cdec3240b Fix memory leak in rb_reg_search_set_match
https://github.com/ruby/ruby/pull/12801 changed regexp matches to reuse
the backref, which causes memory to leak if the original registers of the
match are not freed.

For example, the following script leaks memory:

    10.times do
      1_000_000.times do
        "aaaaaaaaaaa".gsub(/a/, "")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    774256
    1535152
    2297360
    3059280
    3821296
    4583552
    5160304
    5091456
    5114256
    4980192

After:

    12480
    11440
    11696
    11632
    11632
    11760
    11824
    11824
    11824
    11888
2025-03-11 21:55:03 -04:00
Nobuyoshi Nakada
3278e3b6f3
[Bug #21177] Win32: Allow longer path name 2025-03-12 00:46:05 +09:00
Peter Zhu
e51411ff1f Fix flaky test_latest_gc_info_need_major_by
The test could flake because a major GC could be triggered due to allocation
for caches or other things, which would cause the test to fail.
2025-03-11 11:44:07 -04:00
Peter Zhu
47c3ae6962 Bump tolerance for weak reference test from 1 to 2
The test fails sometimes with:

    TestGc#test_latest_gc_info_weak_references_count [test/ruby/test_gc.rb:421]:
    Expected 2 to be <= 1.
2025-03-10 20:00:47 -04:00
Koichi ITO
6b4453e332 [ruby/prism] Support itblock for Prism::Translation::Parser
## Summary

`itblock` node is added to support the `it` block parameter syntax introduced in Ruby 3.4.

```console
$ ruby -Ilib -rprism -rprism/translation/parser34 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { it }"; \
                                                      p Prism::Translation::Parser34.new.tokenize(buffer)[0]'
s(:itblock,
  s(:send, nil, :proc), :it,
  s(:lvar, :it))
```

This node design is similar to the `numblock` node, which was introduced for the numbered parameter syntax in Ruby 2.7.

```console
$ ruby -Ilib -rprism -rprism/translation/parser34 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { _1 }"; \
                                                      p Prism::Translation::Parser34.new.tokenize(buffer)[0]'
s(:numblock,
  s(:send, nil, :proc), 1,
  s(:lvar, :_1))
```

The difference is that while numbered parameters can have multiple parameters, the `it` block parameter syntax allows only a single parameter.

In Ruby 3.3, the conventional node prior to the `it` block parameter syntax is returned.

```console
$ ruby -Ilib -rprism -rprism/translation/parser33 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { it }"; \
                                                      p Prism::Translation::Parser33.new.tokenize(buffer)[0]'
s(:block,
  s(:send, nil, :proc),
  s(:args),
  s(:send, nil, :it))
```

## Development Note

The Parser gem does not yet support the `it` block parameter syntax. This is the first case where Prism's node design precedes that of the Parser gem.
When implementing https://github.com/whitequark/parser/issues/962, this node design will need to be taken into consideration.

c141e1420a
2025-03-10 16:57:46 +00:00
Koichi ITO
f4c16c57aa [ruby/optparse] Make the result of tty? obtainable with flexible stdout
In mock testing for stdout, `StringIO.new` is sometimes used to redirect the output.
In such cases, the assignment is done with `$stdout = StringIO.new`, not the constant `STDOUT`.
e.g., https://github.com/rubocop/rubocop/blob/v1.71.1/lib/rubocop/rspec/shared_contexts.rb#L154-L164

After assigning `StringIO.new`, `$stdout.tty?` returns `false`,
allowing the standard output destination to be switched during test execution.

```ruby
STDOUT.tty?       # => true
StringIO.new.tty? # => false
```

However, since `STDOUT.tty?` returns `true`, a failure occurred in environments
where the environment variables `RUBY_PAGER` or `PAGER` are set.
e.g., https://github.com/rubocop/rubocop/pull/13784

To address this, `STDOUT` has been updated to `$stdout` so that the result of `tty?` can be flexibly overridden.

A potential concern is that `$stdout`, unlike `STDOUT`,
does not always represent the standard output at the time the Ruby process started.
However, no concrete examples of issues related to this have been identified.

The `STDOUT.tty?` check was introduced to optparse in https://github.com/ruby/optparse/pull/70.

This PR replaces `STDOUT` with `$stdout` throughout, based on the assumption
that `$stdout` is sufficient for use with optparse.
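
The effect of checking `$stdout` instead of `STDOUT` can be sketched as below. `pager_candidate?` and `with_captured_stdout` are hypothetical helpers for illustration and are not optparse's actual code.

```ruby
require "stringio"

# Decide whether paging makes sense for the current output destination.
# Reading $stdout at call time lets tests swap the destination.
def pager_candidate?(io = $stdout)
  io.tty? && !(ENV["RUBY_PAGER"] || ENV["PAGER"]).to_s.empty?
end

# Run a block with $stdout temporarily redirected to a StringIO and
# return whatever the block printed.
def with_captured_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

decision = nil
output = with_captured_stdout do
  decision = pager_candidate? # StringIO#tty? is false, so no pager here
  print "help text"
end
```

Because the default argument is evaluated on each call, the helper sees the redirected stream, while a check against the `STDOUT` constant would still see the original terminal.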

262cf6f9ac
2025-03-10 10:19:58 +00:00
Nobuyoshi Nakada
9e265b583b [ruby/optparse] Add post-check of value
Fix https://github.com/ruby/optparse/pull/80

050a87d029
2025-03-10 09:55:29 +00:00
David Rodríguez
e21e5bc814 [rubygems/rubygems] Fix gem rdoc not working with newer versions of rdoc
369f9b9311
2025-03-10 12:43:36 +09:00