The purpose of this commit is to fix Bug #21188. We need to detect when
stdin has run in to an EOF case. Unfortunately we can't _call_ the eof
function on IO because it will block.
Here is a short script to demonstrate the issue:
```ruby
x = STDIN.gets
puts x
puts x.eof?
```
If you run the script, then type some characters (but _NOT_ a newline),
then hit Ctrl-D twice, it will print the input string. Unfortunately,
calling `eof?` will try to read from STDIN again causing us to need a
3rd Ctrl-D to exit the program.
Before introducing the EOF callback to Prism, the input loop looked
kind of like this:
```ruby
loop do
str = STDIN.gets
process(str)
if str.nil?
p :DONE
end
end
```
Which required 3 Ctrl-D to exit. If we naively changed it to something
like this:
```ruby
loop do
str = STDIN.gets
process(str)
if STDIN.eof?
p :DONE
end
end
```
It would still require 3 Ctrl-D because `eof?` would block. In this
patch, we're wrapping the IO object, checking the buffer for a newline
and length, and then using that to simulate a non-blocking eof? method.
This commit wraps STDIN and emulates a non-blocking `eof` function.
[Bug #21188]
Strings concatenated with backslash may end up being frozen when they
shouldn't be. This commit fixes the issue. It required a change
upstream in Prism, but also a change to the Prism compiler in CRuby.
https://github.com/ruby/prism/pull/3606
[Bug #21187]
When a script has problem with the magic comment encoding, we only
display that error. However, if there are other syntax errors in the
file, the error linked list could contain multiple items. This lead to
an inconsistency in the "size" field of the linked list, and the actual
items in the linked list. In other words, the linked list had more than
one item, but the size field was one.
The error display routine would only allocate `size` items, but
iterating the linked list would overrun the array. This commit changes
the iterator to compare the current node to the "finish" node in the
linked list, no longer assuming the linked list ends with NULL.
[Bug #21461]
If all nodes in the array are safe, then it is safe to avoid
allocation for the positional splat:
```ruby
m(*a, kw: [:a]) # Safe
m(*a, kw: [meth]) # Unsafe
```
This avoids an unnecessary allocation in a Rails method call.
Details: https://github.com/rails/rails/pull/54949/files#r2052645431
This was handled correctly in parse.y (NODE_COLON2), but not in
prism. This wasn't caught earlier, because I only added tests for
the optimized case and not the unoptimized case. Add tests for
the unoptimized case.
In code terms:
```ruby
m(*a, kw: lvar::X) # Does not require allocation for *a
m(*a, kw: method()::X) # Requires allocation for *a
```
This commit fixes the second case when prism is used.
[Bug #21439] Fix PM_SPLAT_NODE compilation error in for loops
This commit fixes a crash that occurred when using splat nodes (*) as
the index variable in for loops. The error "Unexpected node type for
index in for node: PM_SPLAT_NODE" was thrown because the compiler
didn't know how to handle splat nodes in this context.
The fix allows code like `for *x in [[1,2], [3,4]]` to compile and
execute correctly, where the splat collects each sub-array.
The following is crashing for me:
```shell
ruby --yjit --yjit-call-threshold=1 -e '1.tap { raise rescue p it }'
ruby: YJIT has panicked. More info to follow...
thread '<unnamed>' panicked at ./yjit/src/codegen.rs:2402:14:
...
```
It seems `it` sometimes points to the wrong value:
```shell
ruby -e '1.tap { raise rescue p it }'
false
ruby -e '1.tap { begin; raise; ensure; p it; end } rescue nil'
false
```
But only when `$!` is set:
```shell
ruby -e '1.tap { begin; nil; ensure; p it; end }'
1
ruby -e '1.tap { begin; nil; rescue; ensure; p it; end }'
1
ruby -e '1.tap { begin; raise; rescue; ensure; p it; end }'
1
```
This commit inlines instructions for Class#new. To make this work, we
added a new YARV instructions, `opt_new`. `opt_new` checks whether or
not the `new` method is the default allocator method. If it is, it
allocates the object, and pushes the instance on the stack. If not, the
instruction jumps to the "slow path" method call instructions.
Old instructions:
```
> ruby --dump=insns -e'Object.new'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)>
0000 opt_getconstant_path <ic:0 Object> ( 1)[Li]
0002 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE>
0004 leave
```
New instructions:
```
> ./miniruby --dump=insns -e'Object.new'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)>
0000 opt_getconstant_path <ic:0 Object> ( 1)[Li]
0002 putnil
0003 swap
0004 opt_new <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11
0007 opt_send_without_block <calldata!mid:initialize, argc:0, FCALL|ARGS_SIMPLE>
0009 jump 14
0011 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE>
0013 swap
0014 pop
0015 leave
```
This commit speeds up basic object allocation (`Foo.new`) by 60%, but
classes that take keyword parameters see an even bigger benefit because
no hash is allocated when instantiating the object (3x to 6x faster).
Here is an example that uses `Hash.new(capacity: 0)`:
```
> hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'"
Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
Time (mean ± σ): 1.082 s ± 0.004 s [User: 1.074 s, System: 0.008 s]
Range (min … max): 1.076 s … 1.088 s 10 runs
Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
Time (mean ± σ): 627.9 ms ± 3.5 ms [User: 622.7 ms, System: 4.8 ms]
Range (min … max): 622.7 ms … 633.2 ms 10 runs
Summary
./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran
1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'
```
This commit changes the backtrace for `initialize`:
```
aaron@tc ~/g/ruby (inline-new)> cat test.rb
class Foo
def initialize
puts caller
end
end
def hello
Foo.new
end
hello
aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]
test.rb:8:in 'Class#new'
test.rb:8:in 'Object#hello'
test.rb:11:in '<main>'
aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb
ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24]
test.rb:8:in 'Object#hello'
test.rb:11:in '<main>'
```
It also increases memory usage for calls to `new` by 122 bytes:
```
aaron@tc ~/g/ruby (inline-new)> cat test.rb
require "objspace"
class Foo
def initialize
puts caller
end
end
def hello
Foo.new
end
puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello)))
aaron@tc ~/g/ruby (inline-new)> make runruby
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb
656
aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]
544
```
Thanks to @ko1 for coming up with this idea!
Co-Authored-By: John Hawthorn <john@hawthorn.email>
The iseq location object has a slot for node ids. parse.y was correctly
populating that field but Prism was not. This commit populates the field
with the ast node id for that iseq
[Bug #21014]
Put a pop as needed. This example currently causes [BUG]:
$ ruby --parser=prism -e'1.times{"".freeze;nil}'
-e:1: [BUG] Stack consistency error (sp: 15, bp: 14)
ruby 3.4.0dev (2024-12-20T00:48:01Z master 978df259ca) +PRISM [x86_64-linux]
Before this commit, when a file ended with a newline, the syntax error
message would show an extra line after the file.
For example, the error message would look like this:
```
[aaron@tc-lan-adapter ~/g/ruby (master)]$ echo "foo(" > test.rb
[aaron@tc-lan-adapter ~/g/ruby (master)]$ od -c test.rb
0000000 f o o ( \n
0000005
[aaron@tc-lan-adapter ~/g/ruby (master)]$ wc -l test.rb
1 test.rb
[aaron@tc-lan-adapter ~/g/ruby (master)]$ ./miniruby test.rb
test.rb: test.rb:1: syntax error found (SyntaxError)
> 1 | foo(
| ^ unexpected end-of-input; expected a `)` to close the arguments
2 |
```
This commit fixes the "end of line" book keeping when printing an error
so that there is no extra line output at the end of the file:
```
[aaron@tc-lan-adapter ~/g/ruby (fix-last-line-error)]$ echo "foo(" | ./miniruby
-: -:1: syntax error found (SyntaxError)
> 1 | foo(
| ^ unexpected end-of-input; expected a `)` to close the arguments
[aaron@tc-lan-adapter ~/g/ruby (fix-last-line-error)]$ echo -n "foo(" | ./miniruby
-: -:1: syntax error found (SyntaxError)
> 1 | foo(
| ^ unexpected end-of-input; expected a `)` to close the arguments
```
Notice that in the above example, the error message only displays one
line regardless of whether or not the file ended with a newline.
[Bug #20918]
[ruby-core:120035]
Prism will later free this string via free rather than xfree, so we need
to use malloc rather than xmalloc.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Co-authored-by: Matthew Draper <matthew@trebex.net>
This fixes a failed assertion reported to SimpleCov
https://github.com/simplecov-ruby/simplecov/issues/1113
This can be repro'd as follows:
1. Create a file `test.rb` containing the following code
```
@foo&.(@bar)
```
2. require it with branch coverage enabled
```
ruby -rcoverage -e "Coverage.start(branches: true); require_relative 'test.rb'"
```
The assertion is failing because the Prism compiler is incorrectly
detecting the start and end cursor position of the call site for the
implicit call .()
This patch replicates the parse.y behaviour of setting the default
end_cursor to be the final closing location of the call node.
This behaviour can be verified against `parse.y` by modifying the test
command as follows:
```
ruby --parser=parse.y -rcoverage -e "Coverage.start(branches: true); require_relative 'test.rb'"
```
[Bug #20866]
* Use FL_USER0 for ELTS_SHARED
This makes space in RString for two bits for chilled strings.
* Mark strings returned by `Symbol#to_s` as chilled
[Feature #20350]
`STR_CHILLED` now spans on two user flags. If one bit is set it
marks a chilled string literal, if it's the other it marks a
`Symbol#to_s` chilled string.
Since it's not possible, and doesn't make much sense to include
debug info when `--debug-frozen-string-literal` is set, we can't
include allocation source, but we can safely include the symbol
name in the warning message, making it much easier to find the source
of the issue.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
---------
Co-authored-by: Étienne Barrié <etienne.barrie@gmail.com>
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
If there's a syntax error during iseq compilation then prism would leak
memory because it would not free the pm_parse_result_t.
This commit changes pm_iseq_new_with_opt to have a rb_protect to catch
when an error is raised, and return NULL and set error_state to a value
that can be raised by calling rb_jump_tag after memory has been freed.
For example:
10.times do
10_000.times do
eval("/[/=~s")
rescue SyntaxError
end
puts `ps -o rss= -p #{$$}`
end
Before:
39280
68736
99232
128864
158896
188208
217344
246304
275376
304592
After:
12192
13200
14256
14848
16000
16000
16000
16064
17232
17952
* YJIT: Replace Array#each only when YJIT is enabled
* Add comments about BUILTIN_ATTR_C_TRACE
* Make Ruby Array#each available with --yjit as well
* Fix all paths that expect a C location
* Use method_basic_definition_p to detect patches
* Copy a comment about C_TRACE flag to compilers
* Rephrase a comment about add_yjit_hook
* Give METHOD_ENTRY_BASIC flag to Array#each
* Add --yjit-c-builtin option
* Allow inconsistent source_location in test-spec
* Refactor a check of BUILTIN_ATTR_C_TRACE
* Set METHOD_ENTRY_BASIC without touching vm->running
[Feature #20205]
The warning now suggests running with --debug-frozen-string-literal:
```
test.rb:3: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information)
```
When using --debug-frozen-string-literal, the location where the string
was created is shown:
```
test.rb:3: warning: literal string will be frozen in the future
test.rb:1: info: the string was created here
```
When resurrecting strings and debug mode is not enabled, the overhead is a simple FL_TEST_RAW.
When mutating chilled strings and deprecation warnings are not enabled,
the overhead is a simple warning category enabled check.
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
If there is a syntax error, there could be an ast_node in the result.
This could get leaked if there is a syntax error so parsing could not
complete (parsed is not set to true).
For example, the following script leaks memory:
10.times do
10_000.times do
eval("def foo(...) super(...) {}; end")
rescue SyntaxError
end
puts `ps -o rss= -p #{$$}`
end
Before:
31328
42768
53856
65120
76208
86768
97856
109120
120208
131296
After:
20944
20944
20944
20944
20944
20944
20944
20944
20944
20944
This caused an issue when `defined?` was in the `if` condition. Its
instructions weren't appended to the instruction sequence even though it was compiled
if a compile-time known logical short-circuit happened before the `defined?`. The catch table
entry (`defined?` compilation produces a catch table entry) was still on the iseq even though the
instructions weren't there. This caused faulty exception handling in the method.
The solution is to no add the catch table entry for `defined?` after a compile-time known logical
short circuit.
This shouldn't touch much code, it's only for cases like the following,
which can occur during debugging:
if false && defined?(Some::CONSTANT)
"more code..."
end
Fixes [Bug #20501]