This PR fixes the following incorrect range of `Prism::Location` when `PM_ERR_RETURN_INVALID`.
It may be hard to tell from the text, but this Ruby error highlights `return`:
```console
$ ruby -e 'class Foo return end'
-e:1: Invalid return in class/module body
class Foo return end
-e: compile error (SyntaxError)
```
Previously, the error's `Prism::Location` pointed to `end`:
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.parse("class Foo return end").errors'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
[#<Prism::ParseError @type=:return_invalid @message="invalid `return` in a class or module body"
@location=#<Prism::Location @start_offset=17 @length=3 start_line=1> @level=:fatal>]
After this fix, it will indicate `return`.
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.parse("class Foo return end").errors'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
[#<Prism::ParseError @type=:return_invalid @message="invalid `return` in a class or module body"
@location=#<Prism::Location @start_offset=10 @length=6 start_line=1> @level=:fatal>]
```
For reference, here are the before and after of `Prism::Translation::Parser`.
Before:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve 'p Prism::Translation::Parser33.parse("class Foo return end")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:18: error: invalid `return` in a class or module body
(string):1: class Foo return end
(string):1: ^~~
/Users/koic/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/parser-3.3.0.5/lib/parser/diagnostic/engine.rb:72:in `process':
invalid `return` in a class or module body (Parser::SyntaxError)
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:220:in `block in unwrap'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:218:in `each'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:218:in `unwrap'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:49:in `parse'
from /Users/koic/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/parser-3.3.0.5/lib/parser/base.rb:33:in `parse'
from -e:1:in `<main>'
```
After:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve 'p Prism::Translation::Parser33.parse("class Foo return end")'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
(string):1:11: error: invalid `return` in a class or module body
(string):1: class Foo return end
(string):1: ^~~~~~
/Users/koic/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/parser-3.3.0.5/lib/parser/diagnostic/engine.rb:72:in `process':
invalid `return` in a class or module body (Parser::SyntaxError)
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:220:in `block in unwrap'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:218:in `each'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:218:in `unwrap'
from /Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:49:in `parse'
from /Users/koic/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/parser-3.3.0.5/lib/parser/base.rb:33:in `parse'
from -e:1:in `<main>'
```
This PR ensures that the originally intended `return` is highlighted as it should be.
1f9af4d2ad
Fixes https://github.com/ruby/prism/pull/2617.
There was an issue with the lexer as follows.
The following are valid regexp options:
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.lex("/x/io").value.map {|token| token[0].type }'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
[:REGEXP_BEGIN, :STRING_CONTENT, :REGEXP_END, :EOF]
```
The following are invalid regexp options. Unnecessary the `IDENTIFIER` token is appearing:
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.lex("/x/az").value.map {|token| token[0].type }'
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
[:REGEXP_BEGIN, :STRING_CONTENT, :REGEXP_END, :IDENTIFIER, :EOF]
```
As a behavior of Ruby, when given `A` to `Z` and `a` to `z`, they act as invalid regexp options. e.g.,
```console
$ ruby -e '/regexp/az'
-e:1: unknown regexp options - az
/regexp/az
-e: compile error (SyntaxError)
```
Thus, it should probably not be construed as `IDENTIFIER` token.
Therefore, `pm_byte_table` has been adapted to accept those invalid regexp option values.
Whether it is a valid regexp option or not is checked by `pm_regular_expression_flags_create`.
For invalid regexp options, `PM_ERR_REGEXP_UNKNOWN_OPTIONS` is added to diagnostics.
d2a6096fcf
We're not using this anymore, and it doesn't make a lot of sense
outside the context of a compiler anyway, and in anyway it's wrong
when you have local variables written in default values.
5edbd9c25b
Split up the diagnostic levels so that error and warning levels
aren't mixed. Also fix up deconstruct_keys implementation.
bd3eeb308d
Co-authored-by: Benoit Daloze <eregontp@gmail.com>
Ruby sets a Symbol literal's encoding to US-ASCII if the symbols consists only of US ASCII code points. Character escapes can also lead a Symbol to have a different encoding than its source's encoding.
f315660b31
It's possible to repeat parameters in method definitions like so:
```ruby
def foo(_a, _a)
end
```
The compiler needs to know to adjust the local table size to account for
these duplicate names. We'll use the repeated parameter flag to account
for the extra stack space required
b443cb1f60
Co-Authored-By: Kevin Newton <kddnewton@gmail.com>
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
This doesn't actually fix the encodings for symbols and regex,
unfortunately. But I wanted to get this change in because it is
the last AST change we're going to make before 3.3 is released.
So, if consumers want, they can start to check these flags to
determine the encoding, even though it will be wrong. Then once we
actually set them correctly, everything should work.
9b35f7e891