Commit graph

275 commits

Author SHA1 Message Date
Nobuyoshi Nakada
4bde1493a7 Make node line macros inline functions
To suppress -Waddress warning and for the debugging purpose.
2025-01-30 18:19:53 +09:00
Anastasia Belova
ff64806ae5 Fix possible dereference of NULL
In some places in compile.c (for example in compile_builtin_arg)
ERROR_ARGS macro is used while node = NULL. This macro uses nd_line
which dereferences node. Add check for NULL to prevent such errors.
2025-01-29 10:35:28 +09:00
Nobuyoshi Nakada
d44a41d814
Add rb_node_get_type for debuggers 2025-01-09 19:24:27 +09:00
HASUMI Hitoshi
ddd8da4b6b [Universal parser] Improve AST structure
This patch moves `ast->node_buffer->config` to `ast->config` aiming to improve readability and maintainability of the source.

## Background

We could not add the `config` field to the `rb_ast_t *` due to the five-word restriction of the IMEMO object.
But it is now doable by merging https://github.com/ruby/ruby/pull/10618

## About assigning `&rb_global_parser_config` to `ast->config` in `ast_alloc()`

The approach of not setting `ast->config` in `ast_alloc()` means that the client, CRuby in this scenario, that directly calls `ast_alloc()` will be responsible for releasing it if a resource that is passed to AST needs to be released.

However, we have put on hold whether we can guarantee the above so far, thus, this patch looks like that.

```
// ruby_parser.c
static VALUE
ast_alloc(void)
{
    rb_ast_t *ast;
    VALUE vast = TypedData_Make_Struct(0, rb_ast_t, &ast_data_type, ast);
#ifdef UNIVERSAL_PARSER
    ast = (rb_ast_t *)DATA_PTR(vast);
    ast->config = &rb_global_parser_config;
#endif
    return vast;
}
```
2024-04-28 12:08:21 +09:00
Kevin Newton
af800bef21 Remove dependency on NODE from coverage structure 2024-04-26 12:25:45 -04:00
HASUMI Hitoshi
9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
yui-knk
cebbe18eed Remove needless check
`nodetype_markable_p` always returns `false` then
`rb_ast_node_type_change` never calls `rb_bug`.
2024-04-05 09:19:57 +09:00
yui-knk
fc8fe78c07 Merge two node_buffer_list_t fields into one
All types of Node are managed by `node_buffer_list_t unmarkable`
therefore merge them into `node_buffer_list_t buffer_list`.
2024-04-05 09:19:57 +09:00
yui-knk
e816ab0b0c Remove rb_imemo_tmpbuf_t from parser
No parser semantic value types are `VALUE` then no need to
use imemo for managing semantic value stack anymore.
2024-04-02 19:37:27 +09:00
HASUMI Hitoshi
40ecad0ad7
[Universal parser] Fix -Wsuggest-attribute=format warnings
Under a configuration including `cppflags=-DUNIVERSAL_PARSER`, warnings listed below show in build time:

```
node.c:396:30: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
  396 |     bug_report_func rb_bug = ast->node_buffer->config->bug;
      |                              ^~~
```

```
ruby_parser.c:655:21: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
  655 |     .compile_warn = rb_compile_warn,
      |                     ^~~~~~~~~~~~~~~
ruby_parser.c:656:24: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
  656 |     .compile_warning = rb_compile_warning,
      |                        ^~~~~~~~~~~~~~~~~~
ruby_parser.c:657:12: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
  657 |     .bug = rb_bug,
      |            ^~~~~~
ruby_parser.c:658:14: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
  658 |     .fatal = rb_fatal,
      |              ^~~~~~~~
```

To fix, this patch suggests adding `__attribute__((format(printf, n, m)))` to those function declarations.
2024-03-15 22:14:57 +09:00
HASUMI Hitoshi
9a19cfd4cd [Universal Parser] Reduce dependence on RArray in parse.y
- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y
  - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *`
  - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t`
  - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c

- Implement `rb_parser_str_escape()`
  - This is a port of the `rb_str_escape()` function in string.c
  - `rb_parser_str_escape()` does not depend on `VALUE` (RString)
  - Instead, it uses `rb_parser_stirng_t *`
  - This function works when --dump=y option passed

- Because WIP of the universal parser, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future

- Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
2024-03-12 17:17:52 +09:00
Peter Zhu
c184aa8740 Use rb_gc_mark_and_move for imemo 2024-02-20 10:39:30 -05:00
yui-knk
89cfc15207 [Feature #20257] Rearchitect Ripper
Introduce another semantic value stack for Ripper so that
Ripper can manage both Node and Ruby Object separately.
This rearchitectutre of Ripper solves these issues.
Therefore adding test cases for them.

* [Bug 10436] https://bugs.ruby-lang.org/issues/10436
* [Bug 18988] https://bugs.ruby-lang.org/issues/18988
* [Bug 20055] https://bugs.ruby-lang.org/issues/20055

Checked the differences of `Ripper.sexp` for files under `/test/ruby`
are only on test_pattern_matching.rb.
The differences comes from the differences between
`new_hash_pattern_tail` functions between parser and Ripper.
Ripper `new_hash_pattern_tail` didn’t call `assignable` then
`kw_rest_arg` wasn’t marked as local variable.
This is also fixed by this commit.

```
--- a/./tmp/before/test_pattern_matching.rb
+++ b/./tmp/after/test_pattern_matching.rb
@@ -3607,7 +3607,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, “a”, [984, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, “a”, [985, 10]]],
+                    [:var_ref, [:@ident, “a”, [985, 10]]],
                     :==,
                     [:hash, nil]]],
                   nil]]],
@@ -3662,7 +3662,7 @@
                  [:in,
                   [:hshptn, nil, [], [:var_field, [:@ident, “a”, [993, 13]]]],
                   [[:binary,
-                    [:vcall, [:@ident, “a”, [994, 10]]],
+                    [:var_ref, [:@ident, “a”, [994, 10]]],
                     :==,
                     [:hash,
                      [:assoclist_from_args,
@@ -3813,7 +3813,7 @@
                    [:command,
                     [:@ident, “raise”, [1022, 10]],
                     [:args_add_block,
-                     [[:vcall, [:@ident, “b”, [1022, 16]]]],
+                     [[:var_ref, [:@ident, “b”, [1022, 16]]]],
                      false]]],
                   [:else, [[:var_ref, [:@kw, “true”, [1024, 10]]]]]]]],
                nil,
@@ -3876,7 +3876,7 @@
                      [:@int, “0”, [1033, 15]]],
                     :“&&“,
                     [:binary,
-                     [:vcall, [:@ident, “b”, [1033, 20]]],
+                     [:var_ref, [:@ident, “b”, [1033, 20]]],
                      :==,
                      [:hash, nil]]]],
                   nil]]],
@@ -3946,7 +3946,7 @@
                      [:@int, “0”, [1042, 15]]],
                     :“&&“,
                     [:binary,
-                     [:vcall, [:@ident, “b”, [1042, 20]]],
+                     [:var_ref, [:@ident, “b”, [1042, 20]]],
                      :==,
                      [:hash,
                       [:assoclist_from_args,
@@ -5206,7 +5206,7 @@
                      [[:assoc_new,
                        [:@label, “c:“, [1352, 22]],
                        [:@int, “0”, [1352, 25]]]]]],
-                   [:vcall, [:@ident, “r”, [1352, 29]]]],
+                   [:var_ref, [:@ident, “r”, [1352, 29]]]],
                   false]]],
                [:binary,
                 [:call,
@@ -5299,7 +5299,7 @@
                       [:assoc_new,
                        [:@label, “c:“, [1367, 34]],
                        [:@int, “0”, [1367, 37]]]]]],
-                   [:vcall, [:@ident, “r”, [1367, 41]]]],
+                   [:var_ref, [:@ident, “r”, [1367, 41]]]],
                   false]]],
                [:binary,
                 [:call,
@@ -5931,7 +5931,7 @@
              [:in,
               [:hshptn, nil, [], [:var_field, [:@ident, “r”, [1533, 11]]]],
               [[:binary,
-                [:vcall, [:@ident, “r”, [1534, 8]]],
+                [:var_ref, [:@ident, “r”, [1534, 8]]],
                 :==,
                 [:hash,
                  [:assoclist_from_args,
```
2024-02-20 17:33:58 +09:00
Nobuyoshi Nakada
0610f555ea
Constify rb_global_parser_config 2024-01-14 17:55:11 +09:00
Nobuyoshi Nakada
a405b28e85 Delete heredoc line mark references 2023-10-14 11:08:43 +09:00
yui-knk
f28d380374 Pass nd_value to NODE_REQUIRED_KEYWORD_P 2023-10-07 17:54:35 +09:00
yui-knk
74c6781153 Change RNode structure from union to struct
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members
for holding different kind of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However they use same
structure definition, need to allocate three union members for NODE_TRUE and
need to separate NODE_OP_ASGN2 into another node.
This change removes the restriction so make it possible to
change data structure by each node type.

2. No compile time check for union member access

It’s developer’s responsibility for using correct member for each node type when it’s union.
This change clarifies which node has which type of fields and enables compile time check.

This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
2023-09-28 11:58:10 +09:00
Nobuyoshi Nakada
22a44735f0
Use ANSI-style prototype declarations instead of the old K&R style 2023-09-21 23:01:02 +09:00
卜部昌平
81620ed9b5 needless duplicated typedef deleted 2023-08-25 17:27:53 +09:00
yui-knk
b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
yui-knk
5f65e8c5d5 Rename rb_node_name to the original name
98637d421d changes the name of
the function. However this function is exported as global,
then change the name to origin one for keeping compatibility.
2023-05-24 20:54:48 +09:00
yui-knk
98637d421d Move ruby_node_name to node.c and rename prefix of the function 2023-05-23 18:05:35 +09:00
Shugo Maeda
2581de112c Disallow mixed usage of ... and */**
[Feature #19134]
2022-12-15 18:56:24 +09:00
yui-knk
8be62f06c8 Remove ruby2_keywords related to args forwarding
This was introduced by b609bdeb53
to suppress warnings. However these warngins were deleted by
beae6cbf0f. Therefore these codes
are not needed anymore.
2022-11-29 15:39:56 +09:00
yui-knk
d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementation for Language Server Protocol (LSP) sometimes needs token information.
For example both `m(1)` and `m(1, )` has same AST structure other than node locations
then it's impossible to check the existence of `,` from AST. However in later case,
it might be better to suggest variables list for the second argument.
Token information is important for such case.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node.

[Feature #19070]

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00
yui-knk
f0ce118662 Make anonymous rest arg (*) and block arg (&) accessible from ARGS node 2022-11-18 18:25:42 +09:00
yui-knk
561b6c4fa0 Remove unused macro
NEW_PREEXE has not been used since 52a5f76e8b
2022-10-09 14:00:52 +09:00
yui-knk
4bfdf6d06d Move error from top_stmts and top_stmt to stmt
By this change, syntax error is recovered smaller units.
In the case below, "DEFN :bar" is same level with "CLASS :Foo"
now.

```
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```

[Feature #19013]
2022-10-08 17:59:11 +09:00
Takashi Kokubun
5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
Nobuyoshi Nakada
54f0e63a8c Remove NODE_DASGN_CURR [Feature #18406]
This `NODE` type was used in pre-YARV implementation, to improve
the performance of assignment to dynamic local variable defined at
the innermost scope.  It has no longer any actual difference with
`NODE_DASGN`, except for the node dump.
2021-12-13 12:53:03 +09:00
Nobuyoshi Nakada
d118e7c025
Turn nd_type_p into an inline function 2021-12-04 10:35:44 +09:00
S.H
ec7f14d9fa
Add nd_type_p macro 2021-12-04 00:01:24 +09:00
Yusuke Endoh
feda058531 Refactor hacky ID tables to struct rb_ast_id_table_t
The implementation of a local variable tables was represented as `ID*`,
but it was very hacky: the first element is not an ID but the size of
the table, and, the last element is (sometimes) a link to the next local
table only when the id tables are a linked list.

This change converts the hacky implementation to a normal struct.
2021-11-21 08:59:24 +09:00
Yusuke Endoh
6764256dc7 node/h: clean node field accessors
This change removes nd_oid, nd_rest, and nd_opt, and adds some comments
for special accessors.
2021-11-17 23:39:34 +09:00
Yusuke Endoh
fb01411ae8 node.h: Reduce struct size to fit with Ruby object size (five VALUEs)
by merging `rb_ast_body_t#line_count` and `#script_lines`.

Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
2021-06-18 02:34:27 +09:00
Yusuke Endoh
acae5f363d ast.rb: RubyVM::AST.parse and .of accepts save_script_lines: true
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
2021-06-18 02:34:27 +09:00
Nobuyoshi Nakada
c060bdc2b4
NODE markability should not change by nd_set_type 2021-01-14 16:12:02 +09:00
Kazuki Tsujimoto
e03e1982bd
Change NODE layout for pattern matching
I prefer pconst to be the first element of NODE.

  Before:

       | ARYPTN | FNDPTN | HSHPTN
    ---+--------+--------+-----------
    u1 | imemo  | imemo  | pkwargs
    u2 | pconst | pconst | pconst
    u3 | apinfo | fpinfo | pkwrestarg

  After:

       | ARYPTN | FNDPTN | HSHPTN
    ---+--------+--------+-----------
    u1 | pconst | pconst | pconst
    u2 | imemo  | imemo  | pkwargs
    u3 | apinfo | fpinfo | pkwrestarg
2020-11-01 16:19:07 +09:00
卜部昌平
cd1d6d9029 include/ruby/backward/2/r_cast.h: deprecate
Remove all usages of RCAST() so that the header file can be excluded
from ruby/ruby.h's dependency.
2020-08-27 15:03:36 +09:00
Aaron Patterson
e8edc34f0a Remove unused struct member
I accidentally added this in 35ba2783fe,
and it's making the size of RVALUE be too big. I'm sorry! orz
2020-08-03 17:01:13 -07:00
Kazuki Tsujimoto
fcdbdff631
rb_{ary,fnd}_pattern_info: Remove imemo member to reduce memory usage
This is a partial revert commit of 8f096226e1.

NODE layout:

  Before:

       | ARYPTN | FNDPTN | HSHPTN
    ---+--------+--------+-----------
    u1 | pconst | pconst | pconst
    u2 | unused | unused | pkwargs
    u3 | apinfo | fpinfo | pkwrestarg

  After:

       | ARYPTN | FNDPTN | HSHPTN
    ---+--------+--------+-----------
    u1 | imemo  | imemo  | pkwargs
    u2 | pconst | pconst | pconst
    u3 | apinfo | fpinfo | pkwrestarg
2020-08-02 01:04:06 +09:00
Aaron Patterson
35ba2783fe Use a linked list to eliminate imemo tmp bufs for managing local tables
This patch changes local table memory to be managed by a linked list
rather than via the garbage collector.  It reduces allocations from the
GC and also fixes a use-after-free bug in the concurrent-with-sweep
compactor I'm working on.
2020-07-27 12:40:01 -07:00
Koichi Sasada
a0f12a0258
Use ID instead of GENTRY for gvars. (#3278)
Use ID instead of GENTRY for gvars.

Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replace this GENTRY to ID and
make the code simple.

We need to search GENTRY from ID every time (st_lookup), so
additional overhead will be introduced.
However, the performance of accessing global variables is not
important now a day and this simplicity helps Ractor development.
2020-07-03 16:56:44 +09:00
Kazuki Tsujimoto
ddded1157a
Introduce find pattern [Feature #16828] 2020-06-14 09:24:36 +09:00
Nobuyoshi Nakada
0a52015da7
Constified code_loc_gen 2020-05-14 17:15:24 +09:00
Kazuki Tsujimoto
6ed7bc83ec
Fix indentation 2020-05-04 13:27:25 +09:00
卜部昌平
4ff3f20540 add #include guard hack
According to MSVC manual (*1), cl.exe can skip including a header file
when that:

- contains #pragma once, or
- starts with #ifndef, or
- starts with #if ! defined.

GCC has a similar trick (*2), but it acts more stricter (e. g. there
must be _no tokens_ outside of #ifndef...#endif).

Sun C lacked #pragma once for a looong time.  Oracle Developer Studio
12.5 finally implemented it, but we cannot assume such recent version.

This changeset modifies header files so that each of them include
strictly one #ifndef...#endif.  I believe this is the most portable way
to trigger compiler optimizations. [Bug #16770]

*1: https://docs.microsoft.com/en-us/cpp/preprocessor/once
*2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
2020-04-13 16:06:00 +09:00
Jeremy Evans
ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平
c9ffe751d1 delete unused functions
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.

There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.

This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
2019-11-14 20:35:48 +09:00
Nobuyoshi Nakada
fb6a489af2
Revert "Method reference operator"
This reverts commit 67c5747369.
[Feature #16275]
2019-11-12 17:24:48 +09:00