Commit graph

753 commits

Author SHA1 Message Date
Kevin Newton
ac0f6716b1 [PRISM] Respect frozen_string_literal option in RubyVM::InstructionSequence.compile 2024-05-01 19:19:07 -04:00
HASUMI Hitoshi
55a402bb75 Add line_count field to rb_ast_body_t
This patch adds `int line_count` field to `rb_ast_body_t` structure.
With this field in place, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`.
This means we could lift the five-words-restriction of the structure
that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Related refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove the code that assigns -1 to `line_count` in `rb_binding_add_dynavars()`
of vm.c, because I don't think it is necessary.
However, I will open a separate PR for this so that it can be reverted atomically
in case I am wrong (see the comment in the code).
2024-04-27 12:08:26 +09:00
Kevin Newton
94d6295b2d [PRISM] Enable coverage in eval ISEQs 2024-04-26 12:25:45 -04:00
Kevin Newton
49764869af [PRISM] Enable coverage in top and main iseqs 2024-04-26 12:25:45 -04:00
HASUMI Hitoshi
2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use a `TypedData_Make_Struct` VALUE that wraps `rb_ast_t` in `ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE with `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, creates the `VALUE vast` outside `ruby_parser.c`
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of the AST on the machine stack to prevent it from being collected by GC
- `ast.c`
  - Almost all changes replace `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and `script_lines`
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, consider it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Takashi Kokubun
7ab1a608e7
YJIT: Optimize local variables when EP == BP (take 2) (#10607)
* Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)"

This reverts commit c878344195.

* YJIT: Take care of GC references in ISEQ invariants

Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>

---------

Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
2024-04-25 10:04:53 -04:00
eileencodes
6443d690ae Don't mark empty singleton cc's
These cc's aren't managed by the garbage collector so we shouldn't try
to mark and move them.
2024-04-18 14:21:01 -07:00
Koichi Sasada
f9f3018001 ISeq#to_a respects use_block status
```ruby
b = RubyVM::InstructionSequence.compile('def f = yield; def g = nil').to_a
pp b

 #=>
 ...
 {:use_block=>true},
 ...
```
2024-04-17 17:03:46 +09:00
HASUMI Hitoshi
9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. Previously it could only store `rb_parser_ast_token *`; now it can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` is called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
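
The user-visible side of `SCRIPT_LINES__` can be sketched as follows. This is a minimal illustration, not part of the patch: it assumes a CRuby build where a top-level `SCRIPT_LINES__` Hash is filled in when a file is loaded, and the file name `script_lines_demo.rb` is made up for the example.

```ruby
require "tmpdir"

# Define the hook constant. const_set is used so the snippet also works
# when pasted inside a method body.
Object.const_set(:SCRIPT_LINES__, {}) unless Object.const_defined?(:SCRIPT_LINES__)
script_lines = Object.const_get(:SCRIPT_LINES__)

source = "x = 1\ny = x + 1\n"

recorded = Dir.mktmpdir do |dir|
  # Hypothetical file name, used only for this illustration.
  path = File.join(dir, "script_lines_demo.rb")
  File.write(path, source)
  load path

  # Look the entry up by basename so the result does not depend on how
  # the temporary directory path is spelled.
  key = script_lines.keys.find { |k| File.basename(k) == "script_lines_demo.rb" }
  script_lines[key]
end

# Each element is one source line of the loaded file.
p recorded
```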
2024-04-15 20:51:54 +09:00
Koichi Sasada
9180e33ca3 show warning for unused block
With verbose mode (-w), the interpreter shows a warning if
a block is passed to a method which does not use the given block.

Warning on:

* the invoked method is written in C
* the invoked method is not `initialize`
* not invoked with `super`
* the first time on the call-site with the invoked method
  (`obj.foo{}` will be warned once if `foo` is same method)

[Feature #15554]

`Primitive.attr! :use_block` is introduced to declare that primitive
functions (written in C) will use the passed block.

For minitest, the tests need some tweaks, so use
ea9caafc07
for `test-bundled-gems`.
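
A minimal sketch for trying this out: it calls a method that ignores its block while verbose mode is on and captures whatever is written to `$stderr`. The helper names `ignores_block` and `capture_warnings` are made up for this illustration, and whether a warning appears (and its exact text) depends on the Ruby version and warning settings, so no output is asserted here.

```ruby
require "stringio"

# A method that never yields to or references its block.
def ignores_block
  :done
end

# Hypothetical helper: run the given block with $VERBOSE enabled and
# return everything that was written to $stderr meanwhile.
def capture_warnings
  original_stderr = $stderr
  original_verbose = $VERBOSE
  $stderr = StringIO.new
  $VERBOSE = true
  yield
  $stderr.string
ensure
  $stderr = original_stderr
  $VERBOSE = original_verbose
end

warnings = capture_warnings { ignores_block { :never_called } }
puts(warnings.empty? ? "no warning emitted by this Ruby" : warnings)
```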
2024-04-15 12:08:07 +09:00
Peter Zhu
4960a598d6 Reapply "Mark iseq structs with rb_gc_mark_movable"
This reverts commit 16c18eafb5.
2024-04-03 09:47:54 -04:00
Kevin Newton
94f7098d1c [PRISM] Fix ISEQ load 2024-04-02 11:11:53 -04:00
Kevin Newton
f57c7fef6b [PRISM] Have RubyVM::InstructionSequence.compile respect --parser=prism 2024-03-29 12:28:54 -04:00
Kevin Newton
42d1cd8f7f [PRISM] Pass --enable-frozen-string-literal through to evals 2024-03-27 08:34:42 -04:00
Nobuyoshi Nakada
16c18eafb5 Revert "Mark iseq structs with rb_gc_mark_movable"
This reverts commit a31ca3500d which
broke debug inspector API.
2024-03-27 13:26:22 +09:00
Nobuyoshi Nakada
0c114dfcc7 Check existing ISeq wrapper 2024-03-27 13:26:22 +09:00
Gannon McGibbon
a31ca3500d Mark iseq structs with rb_gc_mark_movable
Using rb_gc_mark_movable and a reference update function, we can make
instruction sequences movable in memory, and avoid pinning compiled iseqs.

```ruby
require "objspace"
iseqs = []
GC.disable
50_000.times do
  iseqs << RubyVM::InstructionSequence.compile("")
end
GC.enable
GC.compact
p ObjectSpace.dump_all(output: :string).lines.grep(/"pinned":true/).count
```

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2024-03-25 10:43:12 -04:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective, chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
nor disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.
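
The tri-state can be observed from Ruby by compiling the same literal with the option enabled, disabled, and left unset. This is a sketch rather than a description of the implementation: it assumes CRuby's `RubyVM::InstructionSequence.compile` accepts a compile-options Hash as its fifth argument, and the file name `example.rb` is arbitrary. What the unset case prints depends on the Ruby version, so no output is claimed for it.

```ruby
source = '"literal"'

# Compile the same one-line program with different compile options.
compile_with = lambda do |options|
  RubyVM::InstructionSequence.compile(source, "example.rb", "example.rb", 1, options)
end

explicitly_on  = compile_with.call({ frozen_string_literal: true }).eval
explicitly_off = compile_with.call({ frozen_string_literal: false }).eval
unset_iseq     = compile_with.call(nil) # nil means "use the default options"
unset          = unset_iseq.eval

puts "enabled:  frozen? = #{explicitly_on.frozen?}"
puts "disabled: frozen? = #{explicitly_off.frozen?}"
puts "unset:    frozen? = #{unset.frozen?}"

# Show which instruction pushes the literal in the unset case.
puts "unset instruction: #{unset_iseq.disasm[/putchilledstring|putstring|putobject/]}"
```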

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Jean Boussier
91bf7eb274 Refactor frozen_string_literal check during compilation
In preparation for https://bugs.ruby-lang.org/issues/20205.

The `frozen_string_literal` compilation option will no longer
be a boolean but a tri-state: `on/off/default`.
2024-03-15 15:52:33 +01:00
Kevin Newton
f8355e88d6 [PRISM] Do not load -r until we check if main script can be read 2024-02-28 12:42:57 -05:00
Kevin Newton
742abbf770 Switch {prism: true} to {parser: :prism} in ISeq.to_a 2024-02-28 10:58:04 -05:00
Kevin Newton
0f1ca9492c [PRISM] Provide runtime flag for prism in iseq 2024-02-21 11:44:40 -05:00
Kevin Newton
9933377c34 [PRISM] Correctly hook up line numbers for eval 2024-02-14 15:29:26 -05:00
Matt Valentine-House
fd3f776a05 [PRISM] Use Prism for eval if enabled 2024-02-13 21:19:12 -05:00
Kevin Newton
aed052ce9d [PRISM] Revert incorrect frozen string literal handling 2024-02-07 10:42:23 -05:00
Kevin Newton
ccec209b2c [PRISM] Fix fsl coming from file 2024-02-06 12:36:46 -05:00
Kevin Newton
610636fd6b [PRISM] Mirror iseq APIs
Before this commit, we were mixing a lot of concerns with the prism
compile between RubyVM::InstructionSequence and the general entry
points to the prism parser/compiler.

This commit makes all of the various prism-related APIs mirror
their corresponding APIs in the existing parser/compiler. This means
we now have the correct frame naming, and it's much easier to follow
where the logic actually flows. Furthermore this consolidates a lot
of the prism initialization, making it easier to see where we could
potentially be raising errors.
2024-01-31 13:41:36 -05:00
Jeremy Evans
22e488464a Add VM_CALL_ARGS_SPLAT_MUT callinfo flag
This flag is set when the caller has already created a new array to
handle a splat, such as for `f(*a, b)` and `f(*a, *b)`.  Previously,
if `f` was defined as `def f(*a)`, these calls would create an extra
array on the callee side, instead of using the new array created
by the caller.

This modifies `setup_args_core` to set the flag whenever it would add
a `splatarray true` instruction.  However, when `splatarray true` is
changed to `splatarray false` in the peephole optimizer, to avoid
unnecessary allocations on the caller side, the flag must be removed.
Add `optimize_args_splat_no_copy` and have the peephole optimizer call
that.  This significantly simplifies the related peephole optimizer
code.

On the callee side, in `setup_parameters_complex`, set
`args->rest_dupped` to true if the flag is set.

This takes a similar approach for optimizing regular splats that was
previously used for keyword splats in
d2c41b1bff (via VM_CALL_KW_SPLAT_MUT).
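
The behavior this optimization has to preserve can be sketched in plain Ruby: a callee that mutates its rest array must never affect the caller's arrays, whichever side allocates the new array. The method name `collect` is made up for this illustration, and the disassembly at the end is printed only for inspection because its exact form varies between Ruby versions.

```ruby
# Callee with a rest parameter; it mutates the array it receives.
def collect(*rest)
  rest << :added_by_callee
  rest
end

a = [1, 2]
b = [3]

with_trailing_arg = collect(*a, 4)  # the f(*a, b) shape
with_two_splats   = collect(*a, *b) # the f(*a, *b) shape

p with_trailing_arg
p with_two_splats

# The caller's arrays are untouched by the callee's mutation.
p a
p b

# Inspect how the call site is compiled on this Ruby (not executed).
puts RubyVM::InstructionSequence.compile("collect(*a, 4)").disasm
```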
2024-01-24 18:25:55 -08:00
Takashi Kokubun
c0cabc0a69
Dump annotations on RubyVM::ISeq.disasm (#9667)
Make it easier to check what annotations an ISEQ has. SINGLE_NOARG_LEAF
is added automatically, so it's hard to be sure about the annotation by
just reading code. It's also unclear to me what happens to it with
Primitive.mandatory_only?, but this at least explains that LEAF
annotation is not added to the non-mandatory_only ISEQ.
2024-01-23 22:54:39 +00:00
Matt Valentine-House
d054904cad [Prism] Don't change file after setting it.
This causes the Iseq file names to be wrong, which is affecting
Tracepoint events in certain cases.

Because we're taking a pointer to the string and using it in
`pm_string_mapped_pointer`, we also need to `RB_GC_GUARD` the relevant
Ruby object to ensure it's not moved or swept before the parser has been
freed.
2024-01-22 15:15:32 -08:00
Matt Valentine-House
4592fdc545 [Prism] path and script name are not the same
When loading Ruby from a file, or parsing using
RubyVM::InstructionSequence.
2024-01-22 15:15:32 -08:00
Kevin Newton
6bcbb9a02b Make prism respect dump_without_opt 2024-01-22 10:18:41 -05:00
Peter Zhu
d0b774cfb8 Remove null checks for xfree
xfree can handle null values, so we don't need to check it.
2024-01-19 10:25:02 -05:00
Peter Zhu
c28094d385 [PRISM] Add function to free scope node
pm_scope_node_destroy frees the scope node after we're done using it to
make sure that the index_lookup_table is not leaked.

For example:

    10.times do
      100_000.times do
        RubyVM::InstructionSequence.compile_prism("begin; 1; rescue; 2; end")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    33056
    50304
    67776
    84544
    101520
    118448
    135712
    152352
    169136
    186656

After:

    15264
    15296
    15408
    17040
    17152
    17152
    18320
    18352
    18400
    18608
2024-01-18 16:33:25 -05:00
Peter Zhu
a6e924cf5f [PRISM] Fix crash in compile_prism
If the argument is not a file or a string, it assumes it's a string
which will crash because RSTRING_PTR and RSTRING_LEN assume it's a
string.
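
The expected argument handling can be sketched with the long-standing `RubyVM::InstructionSequence.compile` entry point, which `compile_prism` is meant to mirror: a String or a File is accepted, and anything else should raise a `TypeError` instead of crashing. `compile` is used here on the assumption that `compile_prism` may not exist in every build; the file name `example.rb` is arbitrary.

```ruby
require "tmpdir"

iseq_class = RubyVM::InstructionSequence

# A String source.
from_string = iseq_class.compile("1 + 2")

# A File source: the open File object is passed directly.
from_file = Dir.mktmpdir do |dir|
  path = File.join(dir, "example.rb")
  File.write(path, "1 + 2\n")
  File.open(path) { |file| iseq_class.compile(file) }
end

# Neither a String nor a File: expect an exception, not a crash.
error = begin
  iseq_class.compile(123)
  nil
rescue TypeError => e
  e
end

puts from_string.eval
puts from_file.eval
puts error.class
```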
2024-01-17 15:51:44 -05:00
Peter Zhu
5471f99eea [PRISM] Fix memory leak when compiling file
There is a memory leak when passing a file to
RubyVM::InstructionSequence.compile_prism because it does not free the
mapped file.

For example:

    require "tempfile"

    Tempfile.create(%w"test_iseq .rb") do |f|
      f.puts "name = 'Prism'; puts 'hello'"
      f.close

      10.times do
        1_000.times do
          RubyVM::InstructionSequence.compile_prism(f)
        end

        puts `ps -o rss= -p #{$$}`
      end
    end

Before:

    27968
    44848
    61408
    77872
    94144
    110432
    126640
    142816
    159200
    175584

After:

    11504
    12144
    12592
    13072
    13488
    13664
    14064
    14368
    14704
    15168
2024-01-16 16:19:43 -05:00
Aaron Patterson
475663f039 Only intern constants upon compilation entry
Before this commit the Prism compiler would try to intern constants
every time it re-entered. This pool of constants is "constant" (there is
only one pool per parser instance), so we should do it only once: upon
the top level entry to the compiler.

This change does just that: it populates the interned constants once.

Fixes: https://github.com/ruby/prism/issues/2152
2024-01-12 14:53:14 -08:00
Kevin Newton
44d0c5ae3f [PRISM] Raise syntax errors when found 2024-01-11 14:59:37 -05:00
John Hawthorn
c18edc5b5d Avoid underflow of rb_yjit_live_iseq_count
This value is only incremented when rb_iseq_translate_threaded_code is
called, which doesn't happen for iseqs which result in a syntax error.

This is easy to hit by running a debug build with RUBY_FREE_AT_EXIT=1,
but any build and options could underflow this value by running enough
evals.
2023-12-21 20:43:01 -08:00
Nobuyoshi Nakada
2f595c744e
Adjust styles [ci skip] 2023-12-17 00:21:00 +09:00
HParker
55326a915f Introduce --parser runtime flag
Introduce runtime flag for specifying the parser,

```
ruby --parser=prism
```

also update the description:

```
$ ruby --parser=prism --version
ruby 3.3.0dev (2023-12-08T04:47:14Z add-parser-runtime.. 0616384c9f) +PRISM [x86_64-darwin23]
```

[Bug #20044]
2023-12-15 13:42:19 -05:00
eileencodes
049a9bd62f [PRISM] Fix compile_prism when src is a file
`compile_prism` can take a source and file (and other arguments) or a
file as the source. `compile` checks whether the source is a file and, if it
is, converts it. `compile_prism` now does the same thing.

On the Ruby side `compile` handles a file
[here](https://github.com/ruby/ruby/blob/master/iseq.c#L1159-L1162).

Before:

```
"********* Ruby *************"
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(26,21)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] name@0
0000 putstring                              "Prism"                   (  25)[Li]
0002 setlocal                               name@0, 0
0005 putself                                                          (  26)[Li]
0006 putobject                              "hello, "
0008 getlocal                               name@0, 0
0011 dup
0012 objtostring                            <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
0014 anytostring
0015 concatstrings                          2
0017 send                                   <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
0020 leave
hello, Prism

"********* PRISM *************"
./test.rb:13:in `compile_prism': wrong argument type File (expected String) (TypeError)
	from ./test.rb:13:in `<main>'
make: *** [run] Error 1
```

After:

```
"********* Ruby *************"
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(26,21)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] name@0
0000 putstring                              "Prism"                   (  25)[Li]
0002 setlocal                               name@0, 0
0005 putself                                                          (  26)[Li]
0006 putobject                              "hello, "
0008 getlocal                               name@0, 0
0011 dup
0012 objtostring                            <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
0014 anytostring
0015 concatstrings                          2
0017 send                                   <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
0020 leave

"********* PRISM *************"
== disasm: #<ISeq:<compiled>@test_code.rb:24 (24,0)-(25,21)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] name@0
0000 putstring                              "Prism"                   (  24)[Li]
0002 setlocal                               name@0, 0
0005 putself                                                          (  25)[Li]
0006 putobject                              "hello, "
0008 getlocal                               name@0, 0
0011 dup
0012 objtostring                            <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
0014 anytostring
0015 concatstrings                          2
0017 send                                   <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
0020 leave                                                            (  24)
```

Fixes ruby/prism#1609
2023-12-15 10:27:44 -05:00
Adam Hess
6816e8efcf Free everything at shutdown
When the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2023-12-07 15:52:35 -05:00
Nobuyoshi Nakada
582c202c23
Adjust styles [ci skip] 2023-12-06 15:14:59 +09:00
HParker
b8b319dd1a Revert "allow enabling Prism via flag or env var"
This reverts commit 9b76c7fc89.
2023-12-06 10:21:12 +09:00
Nobuyoshi Nakada
c146da50bd
Adjust styles [ci skip] 2023-12-06 09:43:10 +09:00
HParker
9b76c7fc89 allow enabling Prism via flag or env var
Enable Prism using either --prism

    ruby --prism test.rb

or via env var

    RUBY_PRISM=1 ruby test.rb
2023-12-05 12:17:14 -05:00
Peter Zhu
d1691617d6 Pin instruction storage
The operands in each instruction need to be pinned because if
auto-compaction runs in iseq_set_sequence, then the objects could exist
on the generated_iseq buffer, which would not be reference updated which
can lead to T_MOVED (and subsequently T_NONE) objects on the iseq.
2023-12-02 09:06:03 -05:00
Kevin Newton
323bec6295 RubyVM::InstructionSequence.compile_file_prism
* Provide a new API compile_file_prism which mirrors compile_file
but uses prism to parse/compile.
* Provide the ability to run test-all with RUBY_ISEQ_DUMP_DEBUG set
to "prism". If it is, we'll use the new compile_file_prism API to
load iseqs during the test run.
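
A rough usage sketch of the mirrored API: `compile_file` takes a path and returns an instruction sequence, and `compile_file_prism` is called the same way when it is available. Since `compile_file_prism` may not exist in every Ruby build, the call is guarded with `respond_to?`; the file name `hello.rb` is made up for the example.

```ruby
require "tmpdir"

iseq_class = RubyVM::InstructionSequence

result = Dir.mktmpdir do |dir|
  path = File.join(dir, "hello.rb")
  File.write(path, "[1, 2, 3].sum\n")

  # The existing API: compile a file by path.
  iseq = iseq_class.compile_file(path)

  # The mirrored API, only if this build provides it.
  if iseq_class.respond_to?(:compile_file_prism)
    prism_iseq = iseq_class.compile_file_prism(path)
    puts "compile_file_prism result: #{prism_iseq.eval}"
  else
    puts "compile_file_prism is not available in this build"
  end

  iseq.eval
end

puts "compile_file result: #{result}"
```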
2023-11-20 12:45:29 -08:00
Nobuyoshi Nakada
ad72d96fcc
Escape and quote non-local variable names 2023-11-15 17:52:40 +09:00