Commit graph

215 commits

Author SHA1 Message Date
Jean Boussier
f664e7eaab [ruby/json] Add tests for the behavior of JSON.generate with base types subclasses
Ref: https://github.com/ruby/json/pull/674
Ref: https://github.com/ruby/json/pull/668

The behavior in such cases is quite unclear; the goal here is to
figure out what the behavior was in the C extension version of `json 2.7.0`
and get all implementations to converge.

We can then decide to make them all behave differently if we so wish.

614921dcef
2024-11-05 18:00:36 +01:00
Jean Boussier
ed22e68379 [ruby/json] JSON::Ext::Parser mark the name cache entries when not on the heap
This is somewhat dead code: unless you are using `JSON::Parser.new`
directly, we never allocate `JSON::Ext::Parser` anymore.

But still, we should mark all its references in case some code out there
uses that.

Followup: #675

8bf74a977b
2024-11-05 18:00:36 +01:00
Jean Boussier
ef5565f5d1 JSON.generate: call to_json on String subclasses
Fix: https://github.com/ruby/json/issues/667

This is yet another behavior on which the various implementations
differed, but the C implementation used to call `to_json` on String
subclasses used as keys.

This was optimized out in e125072130229e54a651f7b11d7d5a782ae7fb65
but there is an Active Support test case for it, so it's best to
make all 3 implementations respect this behavior.
2024-11-01 13:04:24 +09:00
Jean Boussier
3782600f0f [ruby/json] Emit warnings when dumping binary strings
Because of its Ruby 1.8 heritage, the C extension doesn't care
much about string encodings. We should get stricter over time.

42402fc13f
2024-11-01 13:04:24 +09:00
Jean Boussier
f2b8829df0 Deprecate unsafe default options of JSON.load
[Feature #19528]

Ref: https://bugs.ruby-lang.org/issues/19528

`load` is understood as the default method for serializer-style libraries, and
the default options of `JSON.load` have caused many security vulnerabilities over the
years.

The plan is to do like YAML/Psych: deprecate these default options and direct
users toward using `JSON.unsafe_load`, so at least it's obvious it shouldn't be
used against untrusted data.
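
For contrast, a minimal sketch of the safe path: `JSON.parse` does not instantiate custom classes unless asked to, so a document carrying a `json_class` key comes back as a plain `Hash`. The payload and the `parse_untrusted` helper are hypothetical names used only for illustration.

```ruby
require "json"

# Hypothetical payload: "json_class" is the key the additions mechanism
# looks at when deciding which class to instantiate.
UNTRUSTED_PAYLOAD = '{"json_class":"String","raw":[104,105]}'

# JSON.parse leaves additions disabled by default, so nothing is
# instantiated and the caller gets the raw structure back.
def parse_untrusted(source)
  JSON.parse(source)
end

p parse_untrusted(UNTRUSTED_PAYLOAD)
```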
2024-11-01 13:04:24 +09:00
Jean Boussier
cc2e67a138 Elide Generator::State allocation until a to_json method has to be called
Fix: https://github.com/ruby/json/issues/655

For very small documents, the biggest performance gap with alternatives is
that the API requires us to allocate the `State` object. In a real-world app
this doesn't make much of a difference, but when running in a micro-benchmark
this doubles the allocations, causing twice the amount of GC runs, making us
look bad.

However, unless we have to call a `to_json` method, the `State` object isn't
visible, so with some refactoring, we can elide that allocation entirely.

Instead we allocate the State internal struct on the stack, and if we need
to call a `to_json` method, we allocate the `State` and spill the struct on
the heap.

As a result, `JSON.generate` is now as fast as re-using a `State` instance,
as long as only primitives are generated.
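
As a rough illustration of the two call styles being compared, assuming the "json (reuse)" rows below correspond to reusing a single `State` instance (the helper names are hypothetical):

```ruby
require "json"

# A State built once and reused across calls.
REUSED_STATE = JSON::State.new

# Reuses the pre-allocated State for every document.
def generate_with_reuse(object)
  REUSED_STATE.generate(object)
end

# Goes through the regular module-level entry point.
def generate_plain(object)
  JSON.generate(object)
end

document = { "id" => 1, "tags" => ["a", "b"], "ok" => true }
puts generate_with_reuse(document)
puts generate_plain(document)
```

Both calls produce the same document; the difference discussed here is only in what gets allocated along the way.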

Before:
```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   598.654k i/100ms
                json   400.542k i/100ms
                  oj   533.353k i/100ms
Calculating -------------------------------------
        json (reuse)      6.371M (± 8.6%) i/s  (156.96 ns/i) -     31.729M in   5.059195s
                json      4.120M (± 6.6%) i/s  (242.72 ns/i) -     20.828M in   5.090549s
                  oj      5.622M (± 6.4%) i/s  (177.86 ns/i) -     28.268M in   5.061473s

Comparison:
        json (reuse):  6371126.6 i/s
                  oj:  5622452.0 i/s - same-ish: difference falls within error
                json:  4119991.1 i/s - 1.55x  slower

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   248.125k i/100ms
                json   215.255k i/100ms
                  oj   217.531k i/100ms
Calculating -------------------------------------
        json (reuse)      2.628M (± 6.1%) i/s  (380.55 ns/i) -     13.151M in   5.030281s
                json      2.185M (± 6.7%) i/s  (457.74 ns/i) -     10.978M in   5.057655s
                  oj      2.217M (± 6.7%) i/s  (451.10 ns/i) -     11.094M in   5.044844s

Comparison:
        json (reuse):  2627799.4 i/s
                  oj:  2216824.8 i/s - 1.19x  slower
                json:  2184669.5 i/s - 1.20x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   641.334k i/100ms
                json   322.745k i/100ms
                  oj   642.450k i/100ms
Calculating -------------------------------------
        json (reuse)      7.133M (± 6.5%) i/s  (140.19 ns/i) -     35.915M in   5.068201s
                json      4.615M (± 7.0%) i/s  (216.70 ns/i) -     22.915M in   5.003718s
                  oj      6.912M (± 6.4%) i/s  (144.68 ns/i) -     34.692M in   5.047690s

Comparison:
        json (reuse):  7133123.3 i/s
                  oj:  6911977.1 i/s - same-ish: difference falls within error
                json:  4614696.6 i/s - 1.55x  slower
```

After:

```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   572.751k i/100ms
                json   457.741k i/100ms
                  oj   512.247k i/100ms
Calculating -------------------------------------
        json (reuse)      6.324M (± 6.9%) i/s  (158.12 ns/i) -     31.501M in   5.023093s
                json      6.263M (± 6.9%) i/s  (159.66 ns/i) -     31.126M in   5.017086s
                  oj      5.569M (± 6.6%) i/s  (179.56 ns/i) -     27.661M in   5.003739s

Comparison:
        json (reuse):  6324183.5 i/s
                json:  6263204.9 i/s - same-ish: difference falls within error
                  oj:  5569049.2 i/s - same-ish: difference falls within error

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   258.505k i/100ms
                json   242.335k i/100ms
                  oj   220.678k i/100ms
Calculating -------------------------------------
        json (reuse)      2.589M (± 9.6%) i/s  (386.17 ns/i) -     12.925M in   5.071853s
                json      2.594M (± 6.6%) i/s  (385.46 ns/i) -     13.086M in   5.083035s
                  oj      2.250M (± 2.3%) i/s  (444.43 ns/i) -     11.255M in   5.004707s

Comparison:
        json (reuse):  2589499.6 i/s
                json:  2594321.0 i/s - same-ish: difference falls within error
                  oj:  2250064.0 i/s - 1.15x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   656.373k i/100ms
                json   644.135k i/100ms
                  oj   650.283k i/100ms
Calculating -------------------------------------
        json (reuse)      7.202M (± 7.1%) i/s  (138.84 ns/i) -     36.101M in   5.051438s
                json      7.278M (± 1.7%) i/s  (137.40 ns/i) -     36.716M in   5.046300s
                  oj      7.036M (± 1.7%) i/s  (142.12 ns/i) -     35.766M in   5.084729s

Comparison:
        json (reuse):  7202447.9 i/s
                json:  7277883.0 i/s - same-ish: difference falls within error
                  oj:  7036115.2 i/s - same-ish: difference falls within error

```
2024-11-01 13:04:24 +09:00
Benoit Daloze
88b411464d [ruby/json] Skip test failing on JRuby
0f0b16b3f5
2024-11-01 13:04:24 +09:00
Benoit Daloze
eb19156a28 [ruby/json] Add test for parsing broken strings
850bd077c4
2024-11-01 13:04:24 +09:00
Jean Boussier
ebfa178b72 [ruby/json] Setup ruby_memcheck
Hoping it might find the leak reported in https://github.com/ruby/json/issues/460

08635312e5
2024-11-01 13:04:24 +09:00
Jean Boussier
b094ee3f23
Handle all formatting configs potentially being nil.
Fix: https://github.com/ruby/json/issues/653

I don't think this was really fully supported in the past, but
it kinda worked with some of the implementations.
2024-10-29 13:25:01 +09:00
Jean Boussier
a5bd0c638a [ruby/json] Workaround rubygems $LOAD_PATH bug
Ref: https://github.com/ruby/json/issues/647
Ref: https://github.com/rubygems/rubygems/pull/6490

Older rubygems are executing `extconf.rb` with a broken `$LOAD_PATH`
causing the `json` gem native extension to be loaded with the stdlib
version of the `.rb` files.

This fails with

```
json/common.rb:82:in `initialize': wrong number of arguments (given 1, expected 0) (ArgumentError)
```

Since this is just for `extconf.rb` we can probably just accept that
extra argument and ignore it.

The bug was fixed in rubygems 3.4.9 / 2023-03-20

1f5e849fe0
2024-10-26 18:44:15 +09:00
Jean Boussier
3daf16e51f [ruby/json] Cleanup test_helper.rb
49de571dd8
2024-10-26 18:44:15 +09:00
Jean Boussier
7314275548 json_pure: fix ractor compatibility
This actually never worked, because the test was always testing
the ext version from the stdlib, never the pure version nor the
current ext version.
2024-10-26 18:44:15 +09:00
Jean Boussier
b1d417dc7b [ruby/json] Cleaner .encode / .force_encoding
cecf04fdfc
2024-10-26 18:44:15 +09:00
Jean Boussier
1045b9f820 [ruby/json] Modernize heredocs
fb25e94aea
2024-10-26 18:44:15 +09:00
Jean Boussier
bfdf02ea72 pretty_generate: don't apply object_nl / array_nl for empty containers
Fix: https://github.com/ruby/json/issues/437

Before:

```json
{
  "foo": {
  },
  "bar": [
  ]
}
```

After:

```json
{
  "foo": {},
  "bar": []
}
```
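
A small sketch to reproduce the comparison locally. The exact layout printed depends on the installed `json` version, so no output is shown here; `pretty_empty_containers` is a hypothetical helper name.

```ruby
require "json"

# Pretty-prints a document whose values are empty containers.
def pretty_empty_containers
  JSON.pretty_generate({ "foo" => {}, "bar" => [] })
end

puts pretty_empty_containers
```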
2024-10-26 18:44:15 +09:00
Jean Boussier
fc9f0cb8c5 [ruby/json] JSON.dump / String#to_json: raise on invalid encoding
This regressed since 2.7.2.

35407d6635
2024-10-26 18:44:15 +09:00
Benoit Daloze
2c6e3bc71e Raise the correct exception in fast_serialize_string
* Related to https://github.com/ruby/json/issues/344
2024-10-26 18:44:15 +09:00
Jean Boussier
70f554efb4 [ruby/json] raise_parse_error: avoid UB
Fix: https://github.com/ruby/json/pull/625

Declaring the buffer in a sub-block causes bugs on some compilers.

90967c9eb0
2024-10-26 18:44:15 +09:00
Étienne Barrié
44aef5e852 [ruby/json] Drop compatibility for missing Array#permutation (Ruby <= 1.8.6)
b02091ed44

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Étienne Barrié
82f7550f65 Use frozen string literals
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Étienne Barrié
11348c583f Use Encoding constants, String#b
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
18cc663aef [ruby/json] Add test coverage for JSON.load with a Proc
Fix: https://github.com/ruby/json/issues/438

9dd89eaac8
2024-10-26 18:44:15 +09:00
Jean Boussier
9045258c88 [ruby/json] Limit the size of ParserError exception messages
Fix: https://github.com/ruby/json/issues/534

Only include up to 32 bytes of the unparseable source.

f44995cfb6
2024-10-26 18:44:15 +09:00
Stephen Humphries
326a21d441
Relax Pure::Parser's comment regex...
...to allow any character sequence, including "/*", before the end
sequence of a multi-line ANSI C-style comment.
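
A minimal sketch of the idea, using a hypothetical pattern rather than the one actually used by `JSON::Pure::Parser`: a lazy match runs from the first `/*` to the first `*/`, so a nested `/*` is just part of the comment body.

```ruby
# Hypothetical pattern for illustration only. The /m flag lets "." match
# newlines, and the lazy ".*?" stops at the first closing "*/".
C_STYLE_COMMENT = %r{/\*.*?\*/}m

# Naive stripping: it would also remove comment-like text inside string
# literals, which a real parser has to handle separately.
def strip_comments(source)
  source.gsub(C_STYLE_COMMENT, "")
end

puts strip_comments('[1, /* outer /* still a comment */ 2]')
```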
2024-10-18 11:31:42 +09:00
Jean Boussier
8feed977a0 [ruby/json] Assume Encoding is defined
8713aa4812
2024-10-18 11:30:55 +09:00
Takumasa Ochi
20dc1e5c25 [ruby/json] Always dup argument to preserve original encoding for force_encoding
db9a489ca2
2024-10-18 11:30:42 +09:00
YuheiNakasaka
57e1b64c81 [ruby/json] Fix behavior of trying to parse non-string objects
e2e9936047
2024-10-18 11:28:13 +09:00
Hiroshi SHIBATA
1379ef6f7d [ruby/json] Embedded helper.rb into test_helper.rb
f8417ffc69
2024-10-16 04:34:22 +00:00
Hiroshi SHIBATA
3c1b0f21aa [ruby/json] Fixed load path for ext version
c17823688e
2024-10-16 04:34:21 +00:00
Hiroshi SHIBATA
8af6606b22
Restore missing test-case from c5a6d80427
Co-authored-by: "Jean Boussier" <byroot@ruby-lang.org>
2024-10-16 11:24:25 +09:00
Jean Boussier
fdbead76ec
[ruby/json] ractor_test.rb: ignore stderr
When rubygems is double loaded it fails the test.

The warning shouldn't happen in the first place, but this
makes the test more resilient.

513ddeaeb1
2024-10-16 11:24:25 +09:00
Hiroshi SHIBATA
fe33475605 Removed trailing space
2024-10-08 14:10:05 +09:00
Jean Boussier
718c4f7e1e JSONPure: String#to_json should raise on invalid encoding
Fix: #344

This matches the ext behavior.
2024-10-08 14:10:05 +09:00
Jean Boussier
8fdd3d0ed6 JSON::Pure fix strict mode
Followup: https://github.com/flori/json/pull/519
Fix: https://github.com/flori/json/issues/584
2024-10-08 14:10:05 +09:00
Jean Boussier
8c7e291dd8 Update references to flori/json
Now that the repository has been transferred, these links will become
dead in a few months.
2024-10-07 20:12:57 -04:00
Étienne Barrié
f4883e7904 [flori/json] Use the compiled extension in test
148afef84c
2024-09-03 11:51:51 +09:00
Hiroshi SHIBATA
7c8f9603b1 [flori/json] Make OpenStruct support optional
202ffe2335
2024-01-31 14:56:00 +09:00
Hiroshi SHIBATA
84654bfbba [flori/json] cosmetics
39d6c854a4
2023-12-05 12:04:11 +09:00
Hiroshi SHIBATA
abc3d124f7 [flori/json] Modern Ruby uses UTF-8 encoding by default
11b31210ac
2023-12-05 12:04:10 +09:00
tompng
70740deea7 [flori/json] Fix JSON.dump overload combination
41c2712a3b
2023-12-05 12:04:08 +09:00
Takashi Kokubun
e6b35e8a6d [flori/json] Overload kwargs in JSON.dump
936f280f9f
2023-12-05 12:04:08 +09:00
Jean Boussier
a22ed89438 [flori/json] JSON.dump: handle unenclosed hashes regression
Fix: https://github.com/flori/json/issues/553

We can never add keyword arguments to `dump` otherwise
existing code using unenclosed hash will break.
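
A sketch of the underlying Ruby behavior, using hypothetical method signatures rather than the real `JSON.dump`. On Ruby 3.0 and later, a brace-less hash is only treated as a positional argument when the method accepts no keywords:

```ruby
# Hypothetical signatures for illustration; not the real JSON.dump.
def dump_positional(obj, io = nil, limit = nil)
  obj
end

def dump_with_keywords(obj, io = nil, limit = nil, strict: false)
  obj
end

# No keywords accepted: the brace-less hash becomes `obj`.
p dump_positional(name: "json")

# Keywords accepted: `name:` is read as a keyword, so the call fails.
begin
  dump_with_keywords(name: "json")
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```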

8e0076a3f2
2023-12-05 12:04:07 +09:00
Hiroshi SHIBATA
7d142c08cb
lib/helper is only needed in the flori/json repo
2023-12-01 16:47:06 +09:00
Hiroshi SHIBATA
86045fca24
Manually merged from flori/json
> https://github.com/flori/json/pull/525
  > Rename escape_slash to script_safe and also escape U+2028 and U+2029

  Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

  > https://github.com/flori/json/pull/454
  > Remove unnecessary initialization of create_id in JSON.parse()

  Co-authored-by: Watson <watson1978@gmail.com>
2023-12-01 16:47:06 +09:00
Jean Boussier
0dfeb17296
Rename escape_slash to script_safe and also escape U+2028 and U+2029
It is rather common to interpolate a JSON string directly inside
`<script>` tags in HTML to provide configuration or parameters to a
script.

However, this may lead to XSS vulnerabilities. To prevent that, three
characters need to be escaped:

  - `/` (forward slash)
  - `U+2028` (LINE SEPARATOR)
  - `U+2029` (PARAGRAPH SEPARATOR)

The forward slash needs to be escaped to prevent closing the script
tag early, and the other two are valid JSON but invalid JavaScript
and can be used to break JS parsing.

Given that the intent of escaping the forward slash is the same as escaping
U+2028 and U+2029, I chose to rename and repurpose the existing `escape_slash`
option.
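
The escaping rule for the three characters listed above can be sketched in plain Ruby. This is a post-processing approximation written for illustration, not the gem's implementation; `SCRIPT_UNSAFE`, `SCRIPT_ESCAPES` and `script_safe_json` are hypothetical names.

```ruby
require "json"

# The three characters named above, and their escaped JSON forms.
SCRIPT_UNSAFE = /[\/\u2028\u2029]/
SCRIPT_ESCAPES = {
  "/" => "\\/",
  "\u2028" => "\\u2028",
  "\u2029" => "\\u2029",
}.freeze

# Generates JSON, then rewrites the unsafe characters. All three can only
# appear inside string values, and each replacement is a valid JSON escape,
# so the result still parses to the same data.
def script_safe_json(object)
  JSON.generate(object).gsub(SCRIPT_UNSAFE, SCRIPT_ESCAPES)
end

puts script_safe_json({ "html" => "</script>" })
```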
2023-12-01 16:47:06 +09:00
John Hawthorn
4b770527c2
[flori/json] Fix "unexpected token" offset for Infinity
Previously in the JSON::Ext parser, when we encountered an "Infinity"
token (and weren't allowing NaN/Infinity) we would try to display the
"unexpected token" at the character before.

42ac170712
2023-12-01 16:47:06 +09:00
Lucas Kanashiro
854e6559b6
[flori/json] tests/ractor_test.rb: make assert_separately available
Require tests/lib/helper.rb to avoid:

NoMethodError: undefined method `assert_separately'

a81bcc0328
2023-12-01 16:47:06 +09:00
Hiroshi SHIBATA
b17ae88894
[flori/json] skip TruffleRuby
bab704eb49
2023-10-11 15:45:17 +09:00
Hiroshi SHIBATA
e42df781d9
[flori/json] define_method is also private at Ruby 2.3 and 2.4
3804f38bf4
2023-10-11 15:45:17 +09:00