Commit graph

14 commits

Author SHA1 Message Date
Mike Dalessio
cfe1edddbf [ruby/yarp] fix: report syntax error for invalid hex escape
Closes https://github.com/ruby/yarp/pull/1367

b1ab54f526
2023-09-01 17:04:37 +00:00
Mike Dalessio
512f8217cb [ruby/yarp] fix: double-counting of errors in parsing escaped strings
Essentially, this change updates `yp_unescape_calculate_difference` to
not create syntax errors, and we rely entirely on
`yp_unescape_manipulate_string` to report syntax errors.

To do that, this PR adds another (!) parameter to `unescape`:
`yp_list_t *error_list`. When present, `unescape` reports syntax
errors (and otherwise does not).

However, an edge case that needed to be addressed is reporting syntax
errors in this case:

    ?\u{1234 2345}

In a string context, it's possible to have multiple codepoints by
doing something like `"\u{1234 2345}"`; however, in the character
literal context, this is a syntax error -- only a single codepoint is
allowed.

Unfortunately, when `yp_unescape_manipulate_string` is called, there's
nothing to indicate that we are in a "character literal" context and
that only a single codepoint is valid.

To make this work, this PR:

- introduces a new static utility function in yarp.c,
  `yp_char_literal_node_create_and_unescape`, which is called when
  we're parsing `YP_TOKEN_CHARACTER_LITERAL`
- introduces a new (unexported) function,
  `yp_unescape_manipulate_char_literal`, which does the same thing as
  `yp_unescape_manipulate_string` but tells `unescape` that only a
  single codepoint is expected
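
The two ideas above (an optional error list, and a flag saying only one codepoint is allowed) can be sketched in isolation. This is a minimal illustration, not YARP's code: `error_list_t`, `report`, and `count_codepoints` are hypothetical names, and the function only counts the codepoints inside the braces of a `\u{...}` escape.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for yp_list_t: it only counts reported errors. */
typedef struct { size_t count; } error_list_t;

/* Record an error only when the caller supplied a list. */
static void report(error_list_t *errors) {
    if (errors != NULL) errors->count++;
}

/*
 * Count the space-separated hex codepoints in the body of a "\u{...}"
 * escape, e.g. "1234 2345". When single_codepoint is true (the character
 * literal context), more than one codepoint is reported as an error.
 */
static size_t count_codepoints(const char *body, bool single_codepoint, error_list_t *errors) {
    size_t codepoints = 0;
    const char *p = body;

    while (*p != '\0') {
        while (*p == ' ') p++;          /* skip separators */
        if (*p == '\0') break;
        if (!isxdigit((unsigned char) *p)) {
            report(errors);             /* not a hex digit */
            return codepoints;
        }
        while (isxdigit((unsigned char) *p)) p++;
        codepoints++;
    }

    if (single_codepoint && codepoints > 1) report(errors);
    return codepoints;
}
```

Passing `NULL` for `errors` gives the "measure only" behavior, so a caller that just needs lengths never creates duplicate errors.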

f6a65840b5
2023-09-01 17:04:37 +00:00
Mike Dalessio
df4c77608e [ruby/yarp] fix: octal, hex, and unicode strings at the end of a file (https://github.com/ruby/yarp/pull/1371)

* refactor: move EOF check into yp_unescape_calculate_difference

parser_lex is a bit more readable when we can rely on that behavior

* fix: octal and hex digits at the end of a file

Previously this resulted in invalid memory access.

* fix: unicode strings at the end of a file

Previously this resulted in invalid memory access.

* Unterminated curly-bracket unicode is a syntax error
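
As an illustration of the kind of end-of-file check described above, here is a minimal sketch of a bounded digit scan. `scan_hex_digits` is a hypothetical helper, not a function from YARP; it assumes `start <= end` and that `end` points one past the last byte of the source.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count up to `max` hex digits beginning at `start`, checking the
 * bound before every read so that no byte at or past `end` is touched.
 */
static size_t scan_hex_digits(const uint8_t *start, const uint8_t *end, size_t max) {
    size_t length = 0;
    while (length < max && start + length < end && isxdigit(start[length])) {
        length++;
    }
    return length;
}
```

For a source that ends right after `\x4`, the scan stops after one digit instead of reading a second byte beyond the buffer.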

21cf11acb5
2023-08-31 22:40:35 +00:00
Mike Dalessio
6599ca44bb [ruby/yarp] simplify the calling convention for unescape
We don't need to pass in a destination pointer _and_ a write_to_str
boolean flag.
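
One common way to drop such a flag is to let a `NULL` destination mean "do not write". The commit message does not say which convention was chosen, so the sketch below is an assumption, and `unescape_simple` is a hypothetical helper rather than YARP's signature.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * `backslash` points at a two-byte escape such as "\n". Returns the
 * number of source bytes consumed. The unescaped byte is written only
 * when `destination` is non-NULL, so no separate boolean is needed.
 */
static size_t unescape_simple(const uint8_t *backslash, uint8_t *destination) {
    uint8_t value;
    switch (backslash[1]) {
        case 'n': value = '\n'; break;
        case 't': value = '\t'; break;
        default:  value = backslash[1]; break;
    }
    if (destination != NULL) *destination = value;
    return 2;
}
```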

347cb29ebb
2023-08-30 20:51:49 +00:00
Kevin Newton
7be08f3f58 [ruby/yarp] Switch from handling const char * to const uint8_t *
465e7bb0a9
2023-08-30 14:41:23 -04:00
Kevin Newton
432702a427 [ruby/yarp] Encoding-dependent escapes
36a5b801c4
2023-08-24 11:56:16 -04:00
Takashi Kokubun
40002dd7dc Resync YARP
2023-08-17 09:58:56 -07:00
Takashi Kokubun
3873b1eb39 Resync YARP
2023-08-16 17:47:32 -07:00
Benoit Daloze
b6f26c2e4a [ruby/yarp] Use common fields for yp_string_t
* Otherwise it is undefined behavior to access the field of another `.as`.
* Accessing the right `.as` field according to the mode would be extra overhead.

7dc41ee803
2023-08-16 17:47:32 -07:00
Benoit Daloze
2ccaaaa101 [ruby/yarp] Add simpler exported unescape function to librubyparser
* Moves logic from the C extension to librubyparser which can be shared with the Fiddle backend.

aa48d5e444
2023-08-16 17:47:32 -07:00
Kevin Newton
45efbadba5 [ruby/yarp] Enable all of -Wconversion
638163f6c6
2023-08-16 17:47:32 -07:00
Jemma Issroff
bfb933371d Manual YARP resync
2023-07-05 16:58:55 -04:00
Kevin Newton
26b69fd407 [ruby/yarp] Handle bad input for ascii printable
06242aa7a0
2023-06-29 01:23:37 +00:00
Jemma Issroff
cc7f765f2c [Feature #19741] Sync all files in yarp
This commit is the initial sync of all files from ruby/yarp
into ruby/ruby. Notably, it does the following:

* Sync all ruby/yarp/lib/ files to ruby/ruby/lib/yarp
* Sync all ruby/yarp/src/ files to ruby/ruby/yarp/
* Sync all ruby/yarp/test/ files to ruby/ruby/test/yarp
2023-06-21 11:25:39 -07:00