Kevin Newton
24f48382bc
[ruby/prism] (parser) Fix up tokens for empty symbol
...
5985ab7687
2024-06-18 21:18:39 -04:00
Koichi ITO
3605d6076d
[ruby/prism] Fix token incompatibility for Prism::Translation::Parser::Lexer
...
This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer` when using backquoted heredoc indetiner:
```ruby
<<-` FOO`
a
b
FOO
```
## Parser gem (Expected)
Returns `tXSTRING_BEG` as the first token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the tokens returned by the Parser gem were different. The escaped backslash does not match in the `tSTRING_BEG` token and
value of `tSTRING_END` token.
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tSTRING_BEG, ["<<\"", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, ["` FOO`", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bunlde exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
308f8d85a1
2024-03-16 17:55:38 +00:00
Koichi ITO
e3a82d79fd
[ruby/prism] Fix token incompatibility for Prism::Translation::Parser::Lexer
...
This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer`
when using escaped backslash in string literal:
```ruby
"\\ foo \\ bar"
```
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the tokens returned by the Parser gem were different. The escaped backslash does not match in the `tSTRING` token:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\\\ foo \\\\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
The reproduction test is based on the following strings.txt and exists:
https://github.com/ruby/prism/blob/v0.24.0/test/prism/fixtures/strings.txt#L79
But, the restoration has not yet been performed due to remaining other issues in strings.txt.
2c44e7e307
2024-03-15 18:08:39 +00:00
Koichi ITO
c9da8d67fd
[ruby/prism] Fix a token incompatibility for Prism::Translation::Parser::Lexer
...
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser`
for the heredocs_leading_whitespace.txt test.
7d45fb1eed
2024-03-15 18:07:59 +00:00
Koichi ITO
c0b8dee95a
[ruby/prism] Fix an AST and token incompatibility for Prism::Translation::Parser
...
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for dstring literal:
```ruby
"foo
#{bar}"
```
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:dstr,
s(:str, "foo\n"),
s(:str, " "),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the AST and tokens returned by the Parser gem were different. In this case, `dstr` node should not be nested:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:dstr,
s(:dstr,
s(:str, "foo\n"),
s(:str, " ")),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:dstr,
s(:str, "foo\n"),
s(:str, " "),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
c1652a9ee7
2024-03-15 12:31:40 +00:00
Koichi ITO
0f076fa520
[ruby/prism] Fix an AST and token incompatibility for Prism::Translation::Parser
...
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for empty xstring literal.
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the AST and tokens returned by the Parser gem were different:
```console
$ bunele exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr, s(:str, "")), [], [[:tBACK_REF2, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
4ac89dcbb5
2024-03-13 16:02:10 +00:00
Koichi ITO
f8cab4ef8e
[ruby/prism] Fix a token incompatibility for Prism::Translation::Parser::Lexer
...
In practice, the `BACKTICK` is mapped either as `:tXSTRING_BEG` or `:tBACK_REF2`.
The former is used in xstrings like `` `foo` ``, while the latter is utilized as
a back reference in contexts like `` A::` ``.
This PR will make corrections to differentiate the use of `BACKTICK`.
This mistake was discovered through the investigation of xstring.txt file.
The PR will run tests from xstring.txt file except for `` `f\oo` ``, which will still fail,
hence it will be separated into xstring_with_backslash.txt file.
This separation will facilitate addressing the correction at a different time.
49ad8df40a
2024-03-12 17:33:14 +00:00
Kevin Newton
1e886c040a
[ruby/prism] Fix some whitequark/parser lexer compatibilities
...
34e521d071
2024-03-12 15:32:25 +00:00
Koichi ITO
c637f53edd
[ruby/prism] Fix a token incompatibility for Prism::Translation::Parser::Lexer
...
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for `tBACK_REF2`:
## Parser gem (Expected)
Returns `tBACK_REF2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tXSTRING_BEG` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tXSTRING_BEG, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
After the update, the parser now returns `tBACK_REF2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
This correction enables the restoration of `constants.txt` as a test case.
7f63b28f98
2024-03-12 15:17:20 +00:00
Koichi ITO
72a57076e2
[ruby/prism] Fix a token incompatibility for Prism::Translation::Parser::Lexer
...
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for
beginless range:
## Parser gem (Expected)
Returns `tBDOT2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tDOT2` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
After the update, the parser now returns `tBDOT2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75
) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
This correction enables the restoration of `endless_range_in_conditional.txt` as a test case.
f624b99ab0
2024-03-12 14:15:34 +00:00
Koichi ITO
2af6bc26c5
[ruby/prism] Fix an AST and token incompatibility for Prism::Translation::Parser
...
Fixes https://github.com/ruby/prism/pull/2515 .
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for string literal with line breaks.
c58466e5bf
2024-03-12 10:45:56 +00:00
Koichi ITO
dd24d88473
[ruby/prism] Fix a token incompatibility for Prism::Translation::Parser
...
Fixes https://github.com/ruby/prism/pull/2512 .
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser`
for `HEREDOC_END` with a newline.
b67d1e0c6f
2024-03-08 19:07:53 +00:00
Koichi ITO
af8a4205bf
Fix an error for Prism::Translation::Parser::Lexer
...
This PR fixes the following error for `Prism::Translation::Parser::Lexer` on the main branch:
```console
$ cat example.rb
'a' # aあ
"
#{x}
"
$ bundle exec rubocop
Parser::Source::Range: end_pos must not be less than begin_pos
/Users/koic/.rbenv/versions/3.0.4/lib/ruby/gems/3.0.0/gems/parser-3.3.0.5/lib/parser/source/range.rb:39:in `initialize'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `new'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `block in to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `map'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:263:in `build_tokens'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:92:in `tokenize'
```
This change was made in https://github.com/ruby/prism/pull/2557 , and it seems there was
an inconsistency in Range due to forgetting to apply `offset_cache` to `start_offset`.
2024-03-08 12:33:02 +00:00
Koichi ITO
0e4bfd08e5
[ruby/prism] Fix an AST and token incompatibility for Prism::Translation::Parser
...
Fixes https://github.com/ruby/prism/pull/2506 .
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for symbols quoted with line breaks.
06ab4df8cd
2024-03-07 14:05:20 +00:00
Kevin Newton
d266b71467
[ruby/prism] Use the diagnostic types in the parser translation layer
...
1a8a0063dc
2024-03-06 21:42:54 -05:00
Ufuk Kayserilioglu
8dfe0c7c28
[ruby/prism] Fix some type-checking errors by using different method calls
...
For example, use `.fetch` or `.dig` instead of `[]`, and use `===` instead of `is_a?` for checking types of objects.
548b54915f
2024-03-06 21:37:52 +00:00
Kevin Newton
5856ea3fd1
[ruby/prism] Fix up some minor parser incompatibilities
...
c6c771d1fa
2024-03-04 14:39:52 +00:00
Max Prokopiev
f012ce0d18
[ruby/prism] Fix lexing of foo!
when it's a first thing to parse
...
7597aca76a
2024-02-16 14:08:56 +00:00
Kevin Newton
f12ebe1188
[ruby/prism] Add parser translation
...
8cdec8070c
2024-01-27 19:59:42 +00:00