2260 commits

Author SHA1 Message Date
Nobuyoshi Nakada
26aef1c736 Use lex_eol_p family 2023-07-26 11:39:29 +09:00
S-H-GAMELINKS
76ea8ecbf3 Suppress warning that variable may be used uninitialized when building ripper 2023-07-20 21:55:44 +09:00
yui-knk
82cd70ef93 Use functions defined by parser_st.c to reduce dependency on st.c 2023-07-15 12:50:40 +09:00
S-H-GAMELINKS
acd9c208d5 Move some macro for universal parser 2023-07-09 15:00:52 +09:00
S-H-GAMELINKS
8b2a0ec8df Move ISASCII definition to parse.y 2023-07-08 15:26:55 +09:00
Nobuyoshi Nakada
8ddfc17720 Use uint_least32_t
The elements of `ruby_global_name_punct_bits` table are 32-bit masks.
2023-07-04 21:30:44 +09:00
S-H-GAMELINKS
3fd1968d6f Introduce script_lines function for refactor script_lines_defined and script_lines_get functions 2023-07-01 23:17:57 +09:00
Jeremy Evans
1bc8838d60
Handle unterminated unicode escapes in regexps
This fixes an infinite loop possible after ec3542229b.
For \u{} escapes in regexps, skip validation in the parser, and rely on the regexp
code to handle validation. This is necessary so that invalid unicode escapes in
comments in extended regexps are allowed.

Fixes [Bug #19750]

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-06-30 19:37:53 -07:00
Peter Zhu
58386814a7 Don't check for null pointer in calls to free
According to the C99 specification section 7.20.3.2 paragraph 2:

> If ptr is a null pointer, no action occurs.

So we do not need to check that the pointer is a null pointer.
2023-06-30 09:13:31 -04:00
Nobuyoshi Nakada
1344de5621
[Bug #19743] All but EOF can be read again after push-back 2023-06-22 20:10:13 +09:00
Nobuyoshi Nakada
6be402e172
[Bug #19736] Recover after unterminated interpolation 2023-06-20 20:10:46 +09:00
yui-knk
4f79c83a6a Remove coverage_enabled from parser_params
`yyparse` never changes the value of `coverage_enabled`.
`coverage_enabled` depends only on the return value of `e_option_supplied`.
Therefore `parser_params` doesn't need to have `coverage_enabled`.
2023-06-18 10:10:52 +09:00
yui-knk
d444f1b1fa Specify int bitfield as signed int bitfield
sunc treats an int bitfield as unsigned int.
This commit fixes the build failure on sunc.

* 20230617T100003Z.fail.html.gz
* 20230617T090011Z.fail.html.gz
2023-06-17 22:02:13 +09:00
yui-knk
19c62b400d Replace parser & node compile_option from Hash to bit field
This commit reduces the dependency on CRuby objects.
2023-06-17 16:41:08 +09:00
Nobuyoshi Nakada
81836c6cb9
Fix duplicate symbol errors when statically linking ripper 2023-06-12 20:22:01 +09:00
yui-knk
b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All CRuby-related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
yui-knk
5f65e8c5d5 Rename rb_node_name to the original name
98637d421d changed the name of
the function. However, this function is exported as a global symbol,
so this commit restores the original name to keep compatibility.
2023-05-24 20:54:48 +09:00
yui-knk
98637d421d Move ruby_node_name to node.c and rename prefix of the function 2023-05-23 18:05:35 +09:00
Nobuyoshi Nakada
91c004885f
[Bug #19025] Numbered parameter names are always local variables 2023-05-14 22:16:15 +09:00
Nobuyoshi Nakada
bdaa491565 Add user argument to some macros used by bison 2023-05-14 15:38:48 +09:00
S-H-GAMELINKS
b632566f7e Introduce anddot_multiple_assignment_check function 2023-05-14 10:32:25 +09:00
Nobuyoshi Nakada
b15e88e0fc
[Bug #19619] Preserve numbered parameters context
Preserve numbered parameters context across method definitions
2023-05-02 17:39:18 +09:00
Nobuyoshi Nakada
b82c06a711
Handle private AREF call in compile.c 2023-04-30 23:21:59 +09:00
Takashi Kokubun
4af9bd52cb Get rid of a breakpoint left in parse.y 2023-04-10 11:22:12 -07:00
Nobuyoshi Nakada
ac8a16237c
[Bug #19563] Yield words separators per lines
So that newlines across a here-doc terminator are yielded as separate
tokens.

Cf. https://github.com/ruby/irb/pull/558
2023-04-07 23:13:56 +09:00
Kazuki Tsujimoto
4ac8d11724
* in an array pattern should not be parsed as nil in ripper
After 6c0925ba70, it was impossible
to distinguish between the presence or absence of `*`.

    # Before 6c0925ba70
    Ripper.sexp('0 in []')[1][0][2][1]  #=> [:aryptn, nil, nil, nil, nil]
    Ripper.sexp('0 in [*]')[1][0][2][1] #=> [:aryptn, nil, nil, [:var_field, nil], nil]

    # After 6c0925ba70
    Ripper.sexp('0 in []')[1][0][2][1]  #=> [:aryptn, nil, nil, nil, nil]
    Ripper.sexp('0 in [*]')[1][0][2][1] #=> [:aryptn, nil, nil, nil, nil]

This commit reverts it.
2023-04-01 16:35:24 +09:00
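The distinction described in 4ac8d11724 can be checked with a small script. This is only a sketch: the helper `find_aryptn` is a hypothetical name introduced here, and it searches the s-expression for the `:aryptn` node instead of relying on the fixed index path used in the commit message. The exact node contents depend on the Ruby version in use, so no output is asserted here.

```ruby
require "ripper"

# Hypothetical helper: depth-first search for the first node tagged :aryptn.
def find_aryptn(sexp)
  return nil unless sexp.is_a?(Array)
  return sexp if sexp.first == :aryptn

  sexp.each do |child|
    found = find_aryptn(child)
    return found if found
  end
  nil
end

empty = find_aryptn(Ripper.sexp("0 in []"))
splat = find_aryptn(Ripper.sexp("0 in [*]"))

p empty
p splat
# With the revert applied, the two patterns should not compare equal.
puts "distinguishable: #{empty != splat}"
```

Searching by node type keeps the check usable even if the surrounding `:case`/`:in` layout changes between releases.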
yui-knk
3488eda41d Fix gc_verify_internal_consistency error for pattern_matching in ripper
`gc_verify_internal_consistency` reports "found internal inconsistency"
for "test_pattern_matching.rb".

http://ci.rvm.jp/results/trunk-gc-asserts@ruby-sp2-docker/4501173

Ruby's parser manages objects in two different ways.

1. For parser

* markable node holds objects
* call `RB_OBJ_WRITTEN` with `p->ast` as parent
* `mark_ast_value` marks objects

2. For ripper

* unmarkable node, NODE_RIPPER/NODE_CDECL, holds objects
* call `rb_ast_add_mark_object`. This function calls `rb_hash_aset` then
  `RB_OBJ_WRITTEN` is called with `mark_hash` as parent
* `mark_hash` marks objects

However, the current pattern_matching implementation does the following:

* markable node holds objects
* call `rb_ast_add_mark_object`

This commit fixes it to follow approach 2.

This was an inconsistency; however, `mark_hash` is always
made young by the `rb_ast_add_mark_object` call, so the objects
were not collected.
2023-03-31 09:38:34 +09:00
Nobuyoshi Nakada
6f122965cf [Bug #19547] Add token for unescaped backslash
This token is exposed only when using `RubyVM::AbstractSyntaxTree` with
the `keep_tokens` option.
2023-03-30 19:47:36 +09:00
Kazuki Tsujimoto
d51529244f
[Bug #19175] p_kw without a sub pattern should be `assignable' 2023-03-26 18:57:34 +09:00
Kazuki Tsujimoto
6c0925ba70
[Bug #19175] p_rest should be `assignable'
It should also check for duplicate names.
2023-03-26 18:56:21 +09:00
Nobuyoshi Nakada
67dd52d59c
[Bug #19539] Match heredoc identifier from end of line
Do not ignore leading spaces in an indented heredoc identifier.
2023-03-19 01:35:21 +09:00
Takashi Kokubun
c5e9af9c9d Expand tabs in parse.y
I used the same script as https://github.com/ruby/ruby/pull/6094 but
for a .y file.
2023-03-09 09:32:11 -08:00
Nobuyoshi Nakada
538c3b9ab7
Suppress -Wunused-but-set-variable warning 2023-02-14 19:26:41 +09:00
Nobuyoshi Nakada
7b343d9c67 Extract body rules from endless method definitions 2023-02-01 16:17:12 +09:00
yui-knk
e82cef1762 Remove not used argument from tokenize_ident
This has not been used since 5e59be3edd
2023-01-25 10:52:37 +09:00
Nobuyoshi Nakada
41fbcc5193
Fix format specifiers for pointer differences 2023-01-07 11:47:50 +09:00
Nobuyoshi Nakada
cee5beab1d [Bug #19312] Return end-of-input at __END__ 2023-01-06 13:13:07 +01:00
Nobuyoshi Nakada
3becc4a105
[Bug #19291] Rewind to the previous line
When rewinding a look-ahead after a newline token, also reset the last
line string, the pointers to it, and the location, not only the line
number.
2023-01-02 16:12:08 +09:00
yui-knk
adc29351f7 EXPR_DOT is set when next token is tANDDOT ("&.") [ci skip] 2022-12-26 17:34:57 +09:00
Shugo Maeda
2581de112c Disallow mixed usage of ... and */**
[Feature #19134]
2022-12-15 18:56:24 +09:00
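A minimal sketch of what 2581de112c rules out, assuming Ruby 3.2 or later. The helper `valid_syntax?` is a hypothetical name introduced here; it only reports whether a snippet compiles. On older Rubies the mixed forms are also rejected, but for a different reason (anonymous `*` and `**` forwarding is not supported there at all).

```ruby
# Hypothetical helper: true when +src+ compiles, false on SyntaxError.
def valid_syntax?(src)
  RubyVM::InstructionSequence.compile(src)
  true
rescue SyntaxError
  false
end

forward_all  = "def foo(...); bar(...); end" # forwards everything
mixed_rest   = "def foo(...); bar(*); end"   # mixes ... with anonymous *
mixed_kwrest = "def foo(...); bar(**); end"  # mixes ... with anonymous **

{ forward_all: forward_all, mixed_rest: mixed_rest, mixed_kwrest: mixed_kwrest }.each do |name, src|
  puts "#{name}: #{valid_syntax?(src) ? 'accepted' : 'rejected'}"
end
```

Compiling without evaluating keeps the check free of side effects: no method is actually defined.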
Nobuyoshi Nakada
764da87ab0 [Bug #19195] Allow optional newlines before closing parenthesis 2022-12-13 18:06:11 +09:00
Shugo Maeda
04311008b6
Use loc to fix a failure of test_ast.rb
```
    1) Failure:
  TestAst#test_ranges:test/fiber/scheduler.rb [/home/runner/work/ruby/ruby/src/test/ruby/test_ast.rb:122]:
  <[]> expected but was
  <[{:type=>:max_validation_error,
    :max=>
     #<RubyVM::AbstractSyntaxTree::Node::CodePosition:0x00007f80d630b598
      @column=20,
      @lineno=203>,
    :end_pos=>
     #<RubyVM::AbstractSyntaxTree::Node::CodePosition:0x00007f80d630b778
      @column=19,
      @lineno=203>,
    :node=>
     (BLOCK_PASS@203:15-203:19
        (ARGSPUSH@203:15-203:20 (SPLAT@203:16-203:19 (LVAR@203:16-203:19 :*))
           (HASH@203:16-203:19
              (LIST@203:16-203:19 nil (LVAR@203:16-203:19 :**) nil)))
        (LVAR@203:16-203:19 :&))}]>.
```
2022-12-05 15:54:21 +09:00
Shugo Maeda
2649055c98
Should use argsloc for the last argument for arg_append() 2022-12-05 15:10:15 +09:00
S-H-GAMELINKS
1a64d45c67 Introduce encoding check macro 2022-12-02 01:31:27 +09:00
yui-knk
8be62f06c8 Remove ruby2_keywords related to args forwarding
This was introduced by b609bdeb53
to suppress warnings. However, these warnings were deleted by
beae6cbf0f. Therefore this code
is not needed anymore.
2022-11-29 15:39:56 +09:00
Shugo Maeda
a0e4dc52b0 Use idFWD_* instead of ANON_*_ID 2022-11-29 11:22:09 +09:00
Shugo Maeda
4fc668a4f3 Allow ** in def foo(...)
[Feature #19134]
2022-11-29 11:22:09 +09:00
Jeremy Evans
f5d73da806 Fix the position of rescue clause without exc_list
If the rescue clause has only exc_var and not exc_list, use the
exc_var position instead of the rescue body position.

This issue appears to have been introduced in
688169fd83 when "opt_list" was split
into "exc_list exc_var".

Fixes [Bug #18974]
2022-11-24 14:26:08 -08:00
yui-knk
854312eede Refactor to use has_delayed_token macro 2022-11-21 16:32:13 +09:00
yui-knk
d8601621ed Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
Implementations of the Language Server Protocol (LSP) sometimes need token information.
For example, `m(1)` and `m(1, )` have the same AST structure apart from node locations,
so it's impossible to check for the existence of `,` from the AST. However, in the latter case,
it might be better to suggest a list of variables for the second argument.
Token information is important in such cases.

This commit adds these methods.

* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node, including tokens for descendant nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless of the receiver node.

[Feature #19070]
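
The `m(1)` versus `m(1, )` case above can be sketched as follows. This is an illustration only, not part of the commit: `comma_token?` is a hypothetical helper, and it assumes each token is an array of id, type, source text and location, as returned by `all_tokens`. It returns `nil` on a Ruby that lacks `keep_tokens` support.

```ruby
# Hypothetical helper: does the token stream of +src+ contain a "," token?
# Returns nil when this Ruby has no keep_tokens support.
def comma_token?(src)
  return nil unless RubyVM::AbstractSyntaxTree::Node.method_defined?(:all_tokens)

  root = RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
  # Assumed token layout: [id, type, text, location]; compare on the text.
  root.all_tokens.any? { |_id, _type, text, _loc| text == "," }
end

p comma_token?("m(1)")
p comma_token?("m(1, )")
```

Matching on the token text avoids depending on how the parser names the token type for `,`.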

Impacts on memory usage and performance are below:

Memory usage:

```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb

# keep_tokens: false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb

# keep_tokens: true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```

Performance:

```
$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
            --output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..

|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
```
2022-11-21 09:01:34 +09:00