In https://github.com/ruby/prism/pull/3494 I added a bit of code
so that using the new builder doesn't break stuff.
This code can be dropped once it is enforced that the builder
is _always_ the correct subclass (which also makes future issues like that unlikely).
193d4b806d
Since `rb_bug` does not always take Ruby frame info during SEGV, the
source file path may not be output.
```
1) Failure:
TestRubyOptions#test_crash_report_script [/tmp/ruby/src/trunk_gcc11/test/ruby/test_rubyoptions.rb:907]:
Expected /
bug\.rb:(?:1:)?\s\[BUG\]\sSegmentation\sfault.*\n
/x
to match
"[BUG] Segmentation fault at 0x000003e900328766\n"+
```
http://ci.rvm.jp/results/trunk_gcc11@ruby-sp2-noble-docker/5663880
This can get triggered even if the list of statements only contains
a single statement. This is necessary to properly support compiling
```ruby
defined? (;a)
defined? (a;)
```
as "expression". Previously these were parsed as statement lists
with a single statement in them.
b63b5d67a9
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `':foo:` and `%s[foo]` to not be interpolation
4edfe9d981
Of course, these won't really be fixtures, but it allows testing against whole codebases
without copying them, symlinking them, or anything like that.
For example, I can tell that across the whole RuboCop codebase, only 8 files produce a mismatched AST.
Telling what the problem is is a different matter. The AST for real files can and will be huge, so I haven't checked yet
(maybe a parser bug), but it's nice for discoverability regardless.
2184d82ba6
`Integer#chr` performs some validation that we don't want or need. Octal escapes can go above 255, in which case it raises when trying to convert.
`append_as_bytes` actually accepts a number, so we can just skip that call.
On older Rubies, of course, we still need to handle this in the polyfill.
I don't really like using `pack`, but I don't know of another way to do it.
For the utf-8 escapes, this is not an issue. Invalid utf-8 in these is simply a syntax error.
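
A minimal sketch of the difference, for illustration only. `append_escape_byte` is a hypothetical helper, not the actual polyfill, and the `RangeError` assumes `Encoding.default_internal` is unset (the default):

```ruby
# Hypothetical helper: append the byte for an octal escape value to a buffer.
def append_escape_byte(buffer, value)
  if buffer.respond_to?(:append_as_bytes)
    # Newer Rubies: an Integer argument is appended as one byte (low 8 bits).
    buffer.append_as_bytes(value)
  else
    # Older Rubies: pack("C") also keeps only the low 8 bits.
    buffer << [value].pack("C")
  end
end

value = "477".to_i(8) # the octal escape \477 has the value 319, above 255

# Integer#chr validates the range and raises for values above 255.
chr_error =
  begin
    value.chr
    nil
  rescue RangeError => e
    e
  end

buffer = append_escape_byte(String.new, value)
```

Both branches end up appending the single byte `319 & 0xFF`, with no range validation involved.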
161c606b1f
Turns out, it was already almost correct. If you disregard \c and \M style escapes, only a single character is allowed to be escaped in a regex so most tests passed already.
There was also a mistake where the wrong value was constructed for the ast; this is now fixed.
One test fails because of this, but I'm fairly sure it is because of a parser bug. For `/\“/`, the backslash is supposed to be removed because it is a multibyte character. But tbh,
I don't entirely understand all the rules.
Fixes more than half of the remaining ast differences for rubocop tests
e1c75f304b
The offset cache contains an entry for each byte so it can't be accessed via the string length.
Adds tests for all variants except for this:
```
"fo
o" "ba
’"
```
For some reason, this still has the wrong offset.
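
To illustrate why the string length is the wrong index, here is a small sketch of a byte-to-character offset cache. `build_offset_cache` is written for this example only and is an assumption about the general shape of such a cache, not the translator's actual code:

```ruby
# Illustrative only: one entry per byte, mapping a byte offset to the
# corresponding character offset, plus a final entry for the end position.
def build_offset_cache(source)
  cache = []
  source.each_char.with_index do |char, char_offset|
    char.bytesize.times { cache << char_offset }
  end
  cache << source.length
  cache
end

source = "\"ba\n’\""          # 6 characters, 8 bytes (’ takes 3 bytes)
cache = build_offset_cache(source)

end_via_bytesize = cache[source.bytesize] # the real end of the string
end_via_length = cache[source.length]     # lands inside the multibyte ’
```

With any multibyte character in the source, indexing by `length` points at an earlier entry than indexing by `bytesize`.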
a651126458
There are a few other locations that should be included in that check.
I think the end location must always be present but I left it in to be safe (maybe implicit begin somehow?)
545d07ddc3
1. The string starts out as binary
2. `ち` is appended, forcing it back into utf-8
3. An attempt is made to append some invalid byte sequences, which raises:
> incompatible character encodings: UTF-8 and BINARY (ASCII-8BIT)
This lets me use `append_as_bytes`, which I had been wanting to do. Unfortunately, that method is rather new,
so it needs a fallback
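
The three steps can be reproduced with a short script. `append_bytes_compat` is a hypothetical fallback written for this sketch and is not necessarily how the real fallback is implemented:

```ruby
buffer = String.new             # 1. starts out as binary (ASCII-8BIT)
buffer << "ち"                  # 2. appending to the empty buffer makes it UTF-8
encoding_after_append = buffer.encoding

invalid = "\xFF".b              # a byte that is not valid UTF-8, as a binary string

error =
  begin
    buffer << invalid           # 3. UTF-8 and binary content cannot be mixed
    nil
  rescue Encoding::CompatibilityError => e
    e
  end

# Hypothetical fallback: append raw bytes without any encoding check.
def append_bytes_compat(buffer, bytes)
  return buffer.append_as_bytes(bytes) if buffer.respond_to?(:append_as_bytes)

  encoding = buffer.encoding
  buffer.force_encoding(Encoding::BINARY)
  buffer << bytes.b
  buffer.force_encoding(encoding)
end

append_bytes_compat(buffer, invalid)
```

Either branch leaves the buffer tagged as UTF-8 with the invalid byte appended as-is.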
e31e94a775
[Misc #21143]
Conceptually this makes sense and is more consistent with using
the `Name = Class.new(Superclass)` alternative method.
However the new class is still named before `inherited` is called.
Previously, vm_make_env_each() (used during proc
creation and for the debug inspector C API) picked up the
non-GC-allocated iseq that rb_vm_push_frame_fname() creates,
which led to a SEGV when the GC tried to mark the non GC object.
Put a real iseq imemo instead. Speed should be about the same since
the old code also did an imemo allocation and a malloc allocation.
A real iseq allows ironing out the special-casing of dummy frames in
rb_execution_context_mark() and rb_execution_context_update(). A check
is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy
iseqs.
[Bug #21180]
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Previously, the live range of `ast_value` ended on the call right before
rb_ast_dispose(), which led to premature collection and use-after-free.
We observed this crashing on -O3, -DVM_CHECK_MODE, with GCC 11.4.0 on
Ubuntu.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
The code between the two ObjectSpace.count_objects could trigger a GC,
which could free string objects causing this test to fail.
We can see this failure on CI http://ci.rvm.jp/results/trunk-random2@ruby-sp2-noble-docker/5651016
```
TestHashOnly#test_AREF_fstring_key [test/ruby/test_hash.rb:1991]:
<197483> expected but was
<129689>.
```
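
One way to keep such a measurement stable is to disable GC around the two counts. This is only a sketch of the idea and not necessarily the approach taken in the test itself; `count_new_strings` is a hypothetical helper:

```ruby
# Hypothetical helper: count T_STRING objects created by the block while GC
# is disabled, so nothing can be freed between the two measurements.
def count_new_strings
  was_disabled = GC.disable # returns true if GC was already disabled
  before = ObjectSpace.count_objects[:T_STRING]
  yield
  ObjectSpace.count_objects[:T_STRING] - before
ensure
  GC.enable unless was_disabled
end

kept = []
created = count_new_strings do
  100.times { |i| kept << "string #{i}" }
end
```

With GC disabled the count can only grow, so `created` is at least the number of strings the block allocates.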
https://github.com/ruby/ruby/pull/12801 changed regexp matches to reuse
the backref, which causes memory to leak if the original registers of the
match are not freed.
For example, the following script leaks memory:
```ruby
10.times do
  1_000_000.times do
    "aaaaaaaaaaa".gsub(/a/, "")
  end
  puts `ps -o rss= -p #{$$}`
end
```
Before:
```
774256
1535152
2297360
3059280
3821296
4583552
5160304
5091456
5114256
4980192
```
After:
```
12480
11440
11696
11632
11632
11760
11824
11824
11824
11888
```
## Summary
`itblock` node is added to support the `it` block parameter syntax introduced in Ruby 3.4.
```console
$ ruby -Ilib -rprism -rprism/translation/parser34 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { it }"; \
p Prism::Translation::Parser34.new.tokenize(buffer)[0]'
s(:itblock,
s(:send, nil, :proc), :it,
s(:lvar, :it))
```
This node design is similar to the `numblock` node, which was introduced for the numbered parameter syntax in Ruby 2.7.
```console
$ ruby -Ilib -rprism -rprism/translation/parser34 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { _1 }"; \
p Prism::Translation::Parser34.new.tokenize(buffer)[0]'
s(:numblock,
s(:send, nil, :proc), 1,
s(:lvar, :_1))
```
The difference is that while numbered parameters can have multiple parameters, the `it` block parameter syntax allows only a single parameter.
When parsing as Ruby 3.3, the conventional `block` node that predates the `it` block parameter syntax is returned.
```console
$ ruby -Ilib -rprism -rprism/translation/parser33 -e 'buffer = Parser::Source::Buffer.new("path"); buffer.source = "proc { it }"; \
p Prism::Translation::Parser33.new.tokenize(buffer)[0]'
s(:block,
s(:send, nil, :proc),
s(:args),
s(:send, nil, :it))
```
## Development Note
The Parser gem does not yet support the `it` block parameter syntax. This is the first case where Prism's node design precedes that of the Parser gem.
When implementing https://github.com/whitequark/parser/issues/962, this node design will need to be taken into consideration.
c141e1420a
In mock testing for stdout, `StringIO.new` is sometimes used to redirect the output.
In such cases, the assignment is done with `$stdout = StringIO.new`, not the constant `STDOUT`.
e.g., https://github.com/rubocop/rubocop/blob/v1.71.1/lib/rubocop/rspec/shared_contexts.rb#L154-L164
After assigning `StringIO.new`, `$stdout.tty?` returns `false`,
allowing the standard output destination to be switched during test execution.
```ruby
STDOUT.tty? # => true
StringIO.new.tty? # => false
```
However, since `STDOUT.tty?` returns `true`, a failure occurred in environments
where the environment variables `RUBY_PAGER` or `PAGER` are set.
e.g., https://github.com/rubocop/rubocop/pull/13784
To address this, `STDOUT` has been updated to `$stdout` so that the result of `tty?` can be flexibly overridden.
A potential concern is that `$stdout`, unlike `STDOUT`,
does not always represent the standard output at the time the Ruby process started.
However, no concrete examples of issues related to this have been identified.
The `STDOUT.tty?` check is optparse logic that was introduced in https://github.com/ruby/optparse/pull/70.
This PR replaces `STDOUT` with `$stdout` throughout, based on the assumption
that `$stdout` is sufficient for use with optparse.
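
As a sketch of the redirection pattern described above, the following shows how code that consults `$stdout` observes the replacement. `capture_stdout` is a hypothetical helper written for this example, not something taken from optparse or RuboCop:

```ruby
require "stringio"

# Hypothetical helper: temporarily point $stdout at a StringIO and return
# whatever was written to it.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

tty_inside = nil
output = capture_stdout do
  tty_inside = $stdout.tty? # false for StringIO, whatever STDOUT.tty? says
  puts "hello"              # Kernel#puts writes to $stdout
end
```

Code that checks `STDOUT.tty?` would be unaffected by this reassignment, which is why the pager logic still ran under test.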
262cf6f9ac