Commit graph

9456 commits

Author SHA1 Message Date
Jean Boussier
b85a7a44fa [ruby/json] Trigger write barrier when setting Generator::State configs
Followup: 6382c231b0

0c797b4a11
2024-11-01 13:04:24 +09:00
Jean Boussier
ef5565f5d1 JSON.generate: call to_json on String subclasses
Fix: https://github.com/ruby/json/issues/667

This is yet another behavior on which the various implementations
differed, but the C implementation used to call `to_json` on String
subclasses used as keys.

This was optimized out in e125072130229e54a651f7b11d7d5a782ae7fb65
but there is an Active Support test case for it, so it's best to
make all 3 implementations respect this behavior.
2024-11-01 13:04:24 +09:00
Jean Boussier
b8b33efd4d [ruby/json] Remove String#-@ check in extconf.rb
Now that older rubies have been dropped, we no longer need to check
for all that.

35cf2b84e0
2024-11-01 13:04:24 +09:00
Jean Boussier
165cc6cf40 [ruby/json] json_string_unescape: assume the string doesn't need escaping
If that assumption holds true, then we don't need to copy the
string into a buffer to unescape it. For small strings it just saves
copying, but for large ones it also saves a malloc/free combo.
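
The idea can be sketched in Ruby, under the assumption that "doesn't need
escaping" means "contains no backslash". The real code is C in parser.rl;
`unescape_sketch` and `SIMPLE_ESCAPES` are hypothetical names used only for
illustration.

```ruby
# Hypothetical sketch of the fast path, not the actual parser code.
SIMPLE_ESCAPES = {
  '"' => '"', '\\' => '\\', '/' => '/', 'b' => "\b",
  'f' => "\f", 'n' => "\n", 'r' => "\r", 't' => "\t"
}.freeze

def unescape_sketch(raw)
  # Fast path: no backslash means nothing to unescape, so the input is
  # returned as-is instead of being copied into a scratch buffer.
  return raw unless raw.include?('\\')

  # Slow path: build a new string. Only the simple escapes are handled;
  # \uXXXX sequences are left out to keep the sketch short.
  raw.gsub(/\\(["\\\/bfnrt])/) { SIMPLE_ESCAPES.fetch($1) }
end
```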

Before:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    52.000 i/100ms
                  oj    61.000 i/100ms
           oj strict    70.000 i/100ms
          Oj::Parser    71.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    510.111 (± 2.9%) i/s    (1.96 ms/i) -      2.548k in   5.000029s
                  oj    610.232 (± 3.1%) i/s    (1.64 ms/i) -      3.050k in   5.003725s
           oj strict    713.231 (± 3.2%) i/s    (1.40 ms/i) -      3.570k in   5.010902s
          Oj::Parser    762.598 (± 3.0%) i/s    (1.31 ms/i) -      3.834k in   5.033130s
           rapidjson    553.029 (± 7.4%) i/s    (1.81 ms/i) -      2.750k in   5.022630s

Comparison:
                json:      510.1 i/s
          Oj::Parser:      762.6 i/s - 1.49x  faster
           oj strict:      713.2 i/s - 1.40x  faster
                  oj:      610.2 i/s - 1.20x  faster
           rapidjson:      553.0 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    28.000 i/100ms
                  oj    33.000 i/100ms
           oj strict    37.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    303.853 (± 3.6%) i/s    (3.29 ms/i) -      1.540k in   5.076079s
                  oj    348.009 (± 2.0%) i/s    (2.87 ms/i) -      1.749k in   5.027738s
           oj strict    396.679 (± 3.3%) i/s    (2.52 ms/i) -      1.998k in   5.042271s
          Oj::Parser    406.699 (± 2.2%) i/s    (2.46 ms/i) -      2.064k in   5.077587s
           rapidjson    393.463 (± 3.3%) i/s    (2.54 ms/i) -      1.976k in   5.028501s

Comparison:
                json:      303.9 i/s
          Oj::Parser:      406.7 i/s - 1.34x  faster
           oj strict:      396.7 i/s - 1.31x  faster
           rapidjson:      393.5 i/s - 1.29x  faster
                  oj:      348.0 i/s - 1.15x  faster
```

After:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    62.000 i/100ms
           oj strict    72.000 i/100ms
          Oj::Parser    77.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    568.025 (± 2.1%) i/s    (1.76 ms/i) -      2.856k in   5.030272s
                  oj    630.936 (± 1.4%) i/s    (1.58 ms/i) -      3.162k in   5.012630s
           oj strict    705.784 (±11.2%) i/s    (1.42 ms/i) -      3.456k in   5.006706s
          Oj::Parser    783.989 (± 1.7%) i/s    (1.28 ms/i) -      3.927k in   5.010343s
           rapidjson    557.630 (± 2.0%) i/s    (1.79 ms/i) -      2.805k in   5.032388s

Comparison:
                json:      568.0 i/s
          Oj::Parser:      784.0 i/s - 1.38x  faster
           oj strict:      705.8 i/s - 1.24x  faster
                  oj:      630.9 i/s - 1.11x  faster
           rapidjson:      557.6 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    29.000 i/100ms
                  oj    33.000 i/100ms
           oj strict    38.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    37.000 i/100ms
Calculating -------------------------------------
                json    319.271 (± 3.1%) i/s    (3.13 ms/i) -      1.595k in   5.001128s
                  oj    347.946 (± 1.7%) i/s    (2.87 ms/i) -      1.749k in   5.028395s
           oj strict    396.914 (± 3.0%) i/s    (2.52 ms/i) -      2.014k in   5.079645s
          Oj::Parser    409.311 (± 2.7%) i/s    (2.44 ms/i) -      2.064k in   5.046626s
           rapidjson    394.752 (± 1.5%) i/s    (2.53 ms/i) -      1.998k in   5.062776s

Comparison:
                json:      319.3 i/s
          Oj::Parser:      409.3 i/s - 1.28x  faster
           oj strict:      396.9 i/s - 1.24x  faster
           rapidjson:      394.8 i/s - 1.24x  faster
                  oj:      347.9 i/s - 1.09x  faster
```

7e0f66546a
2024-11-01 13:04:24 +09:00
Jean Boussier
081689b9e2 [ruby/json] parser.rl: extract build_string
7e557ee291
2024-11-01 13:04:24 +09:00
Benoit Daloze
6412e6f6c3 [ruby/json] Use String#encode instead of rb_str_conv_enc()
* rb_str_conv_enc() returns the source string unmodified
  if the conversion did not work. But we should be consistent with
  the generator here and only accept BINARY or convertible to UTF-8.
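
A rough Ruby sketch of that rule, assuming BINARY input is simply
reinterpreted as UTF-8 and everything else goes through `String#encode`,
which raises when the conversion fails. `convert_source_sketch` is a
hypothetical name, not a function from the gem.

```ruby
# Hypothetical helper illustrating "BINARY or convertible to UTF-8".
def convert_source_sketch(source)
  if source.encoding == Encoding::BINARY
    # Binary input carries no encoding information: relabel a copy as UTF-8.
    source.dup.force_encoding(Encoding::UTF_8)
  else
    # String#encode raises an EncodingError subclass when the conversion is
    # not possible, rather than handing back the source unchanged.
    source.encode(Encoding::UTF_8)
  end
end
```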

1344ad6f66
2024-11-01 13:04:24 +09:00
Jean Boussier
3782600f0f [ruby/json] Emit warnings when dumping binary strings
Because of its Ruby 1.8 heritage, the C extension doesn't care
much about string encodings. We should get stricter over time.

42402fc13f
2024-11-01 13:04:24 +09:00
Jean Boussier
f2b8829df0 Deprecate unsafe default options of JSON.load
[Feature #19528]

Ref: https://bugs.ruby-lang.org/issues/19528

`load` is understood as the default method in serializer-style libraries, and
the default options of `JSON.load` have caused many security vulnerabilities over the
years.

The plan is to do like YAML/Psych: deprecate these default options and direct
users toward `JSON.unsafe_load`, so that at least it's obvious it should not be
used against untrusted data.
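
For untrusted input, a minimal illustration is to stick with `JSON.parse`,
which does not enable `create_additions` by default. The payload below is
made up for the example.

```ruby
require 'json'

# Made-up payload that tries to use the "json_class" addition mechanism.
untrusted = '{"json_class":"String","raw":[104,105]}'

# JSON.parse treats "json_class" as an ordinary key, so no object is
# instantiated from the payload.
data = JSON.parse(untrusted)
puts data.class
puts data["json_class"]
```

`JSON.unsafe_load`, as named in the message above, is then reserved for
trusted data.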
2024-11-01 13:04:24 +09:00
Jean Boussier
cc2e67a138 Elide Generator::State allocation until a to_json method has to be called
Fix: https://github.com/ruby/json/issues/655

For very small documents, the biggest performance gap with alternatives is
that the API imposes that we allocate the `State` object. In a real world app
this doesn't make much of a difference, but when running in a micro-benchmark
this doubles the allocations, causing twice the amount of GC runs, making us
look bad.

However, unless we have to call a `to_json` method, the `State` object isn't
visible, so with some refactoring, we can elide that allocation entirely.

Instead we allocate the State internal struct on the stack, and if we need
to call a `to_json` method, we allocate the `State` and spill the struct on
the heap.

As a result, `JSON.generate` is now as fast as re-using a `State` instance,
as long as only primitives are generated.
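
The two call styles compared in the benchmarks can be sketched as follows,
assuming default options on both sides. The sample document is arbitrary.

```ruby
require 'json'

doc = [1, "string", { "a" => 1, "b" => 2 }, [3, 4, 5]]

# One-shot call: per the message above, this path no longer has to
# allocate a State object when only primitives are generated.
one_shot = JSON.generate(doc)

# Explicit reuse of one State across calls, as in the "json (reuse)" rows.
state = JSON::State.new
reused = state.generate(doc)

puts one_shot == reused
```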

Before:
```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   598.654k i/100ms
                json   400.542k i/100ms
                  oj   533.353k i/100ms
Calculating -------------------------------------
        json (reuse)      6.371M (± 8.6%) i/s  (156.96 ns/i) -     31.729M in   5.059195s
                json      4.120M (± 6.6%) i/s  (242.72 ns/i) -     20.828M in   5.090549s
                  oj      5.622M (± 6.4%) i/s  (177.86 ns/i) -     28.268M in   5.061473s

Comparison:
        json (reuse):  6371126.6 i/s
                  oj:  5622452.0 i/s - same-ish: difference falls within error
                json:  4119991.1 i/s - 1.55x  slower

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   248.125k i/100ms
                json   215.255k i/100ms
                  oj   217.531k i/100ms
Calculating -------------------------------------
        json (reuse)      2.628M (± 6.1%) i/s  (380.55 ns/i) -     13.151M in   5.030281s
                json      2.185M (± 6.7%) i/s  (457.74 ns/i) -     10.978M in   5.057655s
                  oj      2.217M (± 6.7%) i/s  (451.10 ns/i) -     11.094M in   5.044844s

Comparison:
        json (reuse):  2627799.4 i/s
                  oj:  2216824.8 i/s - 1.19x  slower
                json:  2184669.5 i/s - 1.20x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   641.334k i/100ms
                json   322.745k i/100ms
                  oj   642.450k i/100ms
Calculating -------------------------------------
        json (reuse)      7.133M (± 6.5%) i/s  (140.19 ns/i) -     35.915M in   5.068201s
                json      4.615M (± 7.0%) i/s  (216.70 ns/i) -     22.915M in   5.003718s
                  oj      6.912M (± 6.4%) i/s  (144.68 ns/i) -     34.692M in   5.047690s

Comparison:
        json (reuse):  7133123.3 i/s
                  oj:  6911977.1 i/s - same-ish: difference falls within error
                json:  4614696.6 i/s - 1.55x  slower
```

After:

```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   572.751k i/100ms
                json   457.741k i/100ms
                  oj   512.247k i/100ms
Calculating -------------------------------------
        json (reuse)      6.324M (± 6.9%) i/s  (158.12 ns/i) -     31.501M in   5.023093s
                json      6.263M (± 6.9%) i/s  (159.66 ns/i) -     31.126M in   5.017086s
                  oj      5.569M (± 6.6%) i/s  (179.56 ns/i) -     27.661M in   5.003739s

Comparison:
        json (reuse):  6324183.5 i/s
                json:  6263204.9 i/s - same-ish: difference falls within error
                  oj:  5569049.2 i/s - same-ish: difference falls within error

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   258.505k i/100ms
                json   242.335k i/100ms
                  oj   220.678k i/100ms
Calculating -------------------------------------
        json (reuse)      2.589M (± 9.6%) i/s  (386.17 ns/i) -     12.925M in   5.071853s
                json      2.594M (± 6.6%) i/s  (385.46 ns/i) -     13.086M in   5.083035s
                  oj      2.250M (± 2.3%) i/s  (444.43 ns/i) -     11.255M in   5.004707s

Comparison:
        json (reuse):  2589499.6 i/s
                json:  2594321.0 i/s - same-ish: difference falls within error
                  oj:  2250064.0 i/s - 1.15x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   656.373k i/100ms
                json   644.135k i/100ms
                  oj   650.283k i/100ms
Calculating -------------------------------------
        json (reuse)      7.202M (± 7.1%) i/s  (138.84 ns/i) -     36.101M in   5.051438s
                json      7.278M (± 1.7%) i/s  (137.40 ns/i) -     36.716M in   5.046300s
                  oj      7.036M (± 1.7%) i/s  (142.12 ns/i) -     35.766M in   5.084729s

Comparison:
        json (reuse):  7202447.9 i/s
                json:  7277883.0 i/s - same-ish: difference falls within error
                  oj:  7036115.2 i/s - same-ish: difference falls within error

```
2024-11-01 13:04:24 +09:00
Jean Boussier
7daa1083c9 [ruby/json] Move State#configure back into C
While less nice, this opens the door to eliding the State object
allocation when possible.

5c0d428d4c
2024-11-01 13:04:24 +09:00
Jean Boussier
5dc3b15b3c [ruby/json] generator.c: store pretty strings in VALUE
Given we expect these to almost always be null, we might as
well keep them in RString.

And even when provided, assuming we're passed frozen strings
we'll save on copying them.

This also reduces the size of the struct from 112B to 72B.

6382c231b0
2024-11-01 13:04:24 +09:00
Jean Boussier
4a5e44953a [ruby/json] Make fbuffer_inc_capa easier to inline
With the extra logic added for stack allocation, and especially the
memcpy, it became harder for compilers to inline.

This doesn't fully reclaim the speed lost with the stack allocation,
but it's getting closer.

Before:

```
== Encoding twitter.json (466906 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   160.000 i/100ms
                  oj   225.000 i/100ms
Calculating -------------------------------------
                json      1.577k (± 2.0%) i/s  (634.20 μs/i) -      8.000k in   5.075561s
                  oj      2.264k (± 2.3%) i/s  (441.79 μs/i) -     11.475k in   5.072205s

Comparison:
                json:     1576.8 i/s
                  oj:     2263.5 i/s - 1.44x  faster

== Encoding citm_catalog.json (500298 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   101.000 i/100ms
                  oj   123.000 i/100ms
Calculating -------------------------------------
                json      1.033k (± 2.6%) i/s  (968.06 μs/i) -      5.252k in   5.087617s
                  oj      1.257k (± 2.2%) i/s  (795.54 μs/i) -      6.396k in   5.090830s

Comparison:
                json:     1033.0 i/s
                  oj:     1257.0 i/s - 1.22x  faster
```

After:

```
== Encoding twitter.json (466906 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin23]
Warming up --------------------------------------
                json   213.000 i/100ms
                  oj   230.000 i/100ms
Calculating -------------------------------------
                json      2.064k (± 3.6%) i/s  (484.44 μs/i) -     10.437k in   5.063685s
                  oj      2.246k (± 0.7%) i/s  (445.19 μs/i) -     11.270k in   5.017541s

Comparison:
                json:     2064.2 i/s
                  oj:     2246.2 i/s - 1.09x  faster

== Encoding citm_catalog.json (500298 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin23]
Warming up --------------------------------------
                json   133.000 i/100ms
                  oj   132.000 i/100ms
Calculating -------------------------------------
                json      1.327k (± 1.7%) i/s  (753.69 μs/i) -      6.650k in   5.013565s
                  oj      1.305k (± 2.2%) i/s  (766.40 μs/i) -      6.600k in   5.061089s

Comparison:
                json:     1326.8 i/s
                  oj:     1304.8 i/s - same-ish: difference falls within error
```

89f816e868
2024-11-01 13:04:24 +09:00
Jean Boussier
59eebeca02 [ruby/json] Allocate the initial generator buffer on the stack
Ref: https://github.com/ruby/json/issues/655
Followup: https://github.com/ruby/json/issues/657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size on the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use cases for fairly small JSON documents in actual real world
scenarios, so thrashing the GC less in such cases makes sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```

fe607f4806
2024-11-01 13:04:24 +09:00
Jean Boussier
d329896fb5 [ruby/json] Fix a memory leak in #to_json methods
Fix: https://github.com/ruby/json/issues/460

The various `to_json` methods must rescue exceptions
to free the buffer.

```
require 'json'

data = 10_000.times.to_a << BasicObject.new
20.times do
  100.times do
    begin
      data.to_json
    rescue NoMethodError
    end
  end
  puts `ps -o rss= -p #{$$}`
end
```

```
 20128
 24992
 29920
 34672
 39600
 44336
 49136
 53936
 58816
 63616
 68416
 73232
 78032
 82896
 87696
 92528
 97408
102208
107008
111808
```

d227d225ca
2024-11-01 13:04:24 +09:00
Kazuki Yamaguchi
27d77a9c73 [ruby/openssl] pkcs7: remove default cipher from PKCS7.encrypt
Require that users explicitly specify the desired algorithm. In my
opinion, we are not in a position to specify the default cipher.

When OpenSSL::PKCS7.encrypt is given only two arguments, it uses
"RC2-40-CBC" as the symmetric cipher algorithm. 40-bit RC2 is a US
export-grade cipher and considered insecure.

Although this is technically a breaking change, the impact should be
minimal. Even when OpenSSL is compiled with RC2 support and the macro
OPENSSL_NO_RC2 is not defined, it will not actually work on modern
systems because RC2 is part of the legacy provider.
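
A minimal sketch of a call that names the cipher explicitly. The throwaway
self-signed certificate and the choice of AES-256-CBC are assumptions made
for illustration, not recommendations from the commit.

```ruby
require 'openssl'

# Throwaway self-signed certificate so the example is self-contained.
key = OpenSSL::PKey::RSA.new(2048)
cert = OpenSSL::X509::Certificate.new
cert.version = 2
cert.serial = 1
cert.subject = cert.issuer = OpenSSL::X509::Name.parse("/CN=pkcs7-example")
cert.public_key = key
cert.not_before = Time.now - 60
cert.not_after = Time.now + 3600
cert.sign(key, OpenSSL::Digest.new("SHA256"))

data = "secret payload"

# Pass the cipher as the third argument instead of relying on a default.
cipher = OpenSSL::Cipher.new("aes-256-cbc")
envelope = OpenSSL::PKCS7.encrypt([cert], data, cipher, OpenSSL::PKCS7::BINARY)

# Round-trip through DER and decrypt with the matching private key.
decrypted = OpenSSL::PKCS7.new(envelope.to_der).decrypt(key, cert)
puts decrypted == data
```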

439f456bfa
2024-10-31 08:31:16 +00:00
Kazuki Yamaguchi
339a8dd5e7 [ruby/openssl] ssl: remove redundant ossl_ssl_ex_vcb_idx
The SSL ex_data index is used for storing the verify_callback Proc. The
only user of it, ossl_ssl_verify_callback(), can find the callback by
looking at the SSLContext object which is always known.

3a3d6e258b
2024-10-31 08:28:34 +00:00
Yuki Morohoshi
772a213a29 [ruby/openssl] [DOC] better wording for OpenSSL::Config document.
26370636f3

Co-authored-by: Olle Jonsson <olle.jonsson@gmail.com>
2024-10-31 08:26:12 +00:00
Yuki Morohoshi
9d94a3b8aa [ruby/openssl] [DOC] Replace removed method in example for OpenSSL::Config#to_s
93c7bf52ac
2024-10-31 08:26:12 +00:00
kojix2
550ac2f2ed [DOC] Fix typos 2024-10-31 12:44:50 +09:00
Peter Zhu
e077be119b [ruby/json] Remove double semicolon at end of line in parser
f6d6ca3c17
2024-10-30 10:13:49 +09:00
Jean Boussier
f2e51146f8 [ruby/json] Remove dead cases from convert_UTF8_to_* functions
d54063a790
2024-10-30 10:13:49 +09:00
Jean Boussier
5d176436ce [ruby/json] Allocate the FBuffer struct on the stack
Ref: https://github.com/ruby/json/issues/655

The actual buffer is still on the heap, but this saves a pair
of malloc/free.

This helps a lot on micro-benchmarks

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   531.598k i/100ms
          JSON reuse   417.666k i/100ms
Calculating -------------------------------------
                  Oj      5.735M (± 1.3%) i/s  (174.35 ns/i) -     28.706M in   5.005900s
          JSON reuse      4.604M (± 1.4%) i/s  (217.18 ns/i) -     23.389M in   5.080779s

Comparison:
                  Oj:  5735475.6 i/s
          JSON reuse:  4604380.3 i/s - 1.25x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

Bench:

```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end
```

72110f7992
2024-10-30 10:13:48 +09:00
Jean Boussier
2e43621806 [ruby/json] Optimize fbuffer_append_long
Ref: https://github.com/ruby/json/issues/655

Rather than writing the number backward, and then reversing
the buffer, we can start from the back of the buffer and write
the number in the proper direction.
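
The technique can be sketched in Ruby. The actual function is C;
`append_long_sketch` and `MAX_DIGITS` are hypothetical names, and the
scratch size assumes a 64-bit integer.

```ruby
# Hypothetical sketch of writing digits from the back of a scratch buffer.
MAX_DIGITS = 20 # 19 digits for a 64-bit integer plus a sign

def append_long_sketch(buffer, number)
  scratch = "\0".b * MAX_DIGITS
  pos = MAX_DIGITS
  negative = number < 0
  n = number.abs

  # Fill the scratch space right to left, so the digits land in their
  # final order and no reversal step is needed.
  loop do
    pos -= 1
    scratch.setbyte(pos, 48 + n % 10) # 48 is the byte for "0"
    n /= 10
    break if n.zero?
  end

  if negative
    pos -= 1
    scratch.setbyte(pos, 45) # 45 is the byte for "-"
  end

  # Append only the used tail of the scratch space.
  buffer << scratch.byteslice(pos, MAX_DIGITS - pos)
end
```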

Before:

```
== Encoding integers (8009 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     8.606k i/100ms
                  oj     9.598k i/100ms
Calculating -------------------------------------
                json     86.059k (± 0.8%) i/s   (11.62 μs/i) -    430.300k in   5.000416s
                  oj     97.409k (± 0.6%) i/s   (10.27 μs/i) -    489.498k in   5.025360s

Comparison:
                json:    86058.8 i/s
                  oj:    97408.8 i/s - 1.13x  faster
```

After:

```
== Encoding integers (8009 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)     9.500k i/100ms
                json     9.359k i/100ms
                  oj     9.722k i/100ms
Calculating -------------------------------------
        json (reuse)     96.270k (± 0.4%) i/s   (10.39 μs/i) -    484.500k in   5.032777s
                json     94.800k (± 2.2%) i/s   (10.55 μs/i) -    477.309k in   5.037495s
                  oj     97.131k (± 0.7%) i/s   (10.30 μs/i) -    486.100k in   5.004822s

Comparison:
        json (reuse):    96270.1 i/s
                  oj:    97130.5 i/s - same-ish: difference falls within error
                json:    94799.9 i/s - same-ish: difference falls within error
```

0655b58d14
2024-10-29 13:25:01 +09:00
Nobuyoshi Nakada
484ea00d2e [ruby/stringio] An empty string should be converted to empty in any encoding
ef03f9368d
2024-10-26 13:20:34 +00:00
Nobuyoshi Nakada
f513863c81 [ruby/stringio] Unreachable after an invalid argument exception
a2aab4721c
2024-10-26 12:55:45 +00:00
Nobuyoshi Nakada
393c5df008 [ruby/stringio] Remove SafeStringValue
In Ruby 2.7 and later, it is the same as `StringValue`.

561ea67ea8
2024-10-26 12:55:45 +00:00
Hiroshi SHIBATA
caa946f2de Restore ext/json/extconf.rb 2024-10-26 18:44:15 +09:00
Jean Boussier
e136e552b6 [ruby/json] Instantiate Parser with a kwsplat
Prior to 2.7.3, `JSON::Ext::Parser` would only take kwargs.
So if json_pure 2.7.4 is loaded with `json <= 2.7.2` (or stdlib)
it blows up.
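
The difference between the two call styles can be illustrated with a
stand-in class. `KwargsOnlyParser` is hypothetical and only mimics a
constructor that takes keyword options.

```ruby
# Hypothetical stand-in for a parser whose constructor only takes keywords.
class KwargsOnlyParser
  attr_reader :opts

  def initialize(source, **opts)
    @source = source
    @opts = opts
  end
end

opts = { symbolize_names: true }

# Splatting the hash passes it as keywords, which a kwargs-only
# constructor accepts.
parser = KwargsOnlyParser.new("{}", **opts)
puts parser.opts.inspect

# Passing the hash positionally raises ArgumentError on Ruby 3.0+
# (Ruby 2.7 converts it with a warning instead).
begin
  KwargsOnlyParser.new("{}", opts)
rescue ArgumentError => e
  puts "positional hash rejected: #{e.message}"
end
```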

Ref: https://github.com/ruby/json/issues/650
Fix: https://github.com/ruby/json/issues/651

4d9dc98817
2024-10-26 18:44:15 +09:00
Jean Boussier
8018a3121f [ruby/json] Workaround being loaded alongside a different json_pure version
Fix: https://github.com/ruby/json/issues/646

Since both `json` and `json_pure` expose the same files, if the
versions don't match, the native extension may be loaded with Ruby
code that doesn't match and is incompatible.

By doing the `require json/ext/generator/state` from C we ensure
we're at least loading that.

But this is a dirty workaround for the 2.7.x branch, we should
find a better way to fully isolate the two gems.

dfdd4acf36
2024-10-26 18:44:15 +09:00
Jean Boussier
a5bd0c638a [ruby/json] Workaround rubygems $LOAD_PATH bug
Ref: https://github.com/ruby/json/issues/647
Ref: https://github.com/rubygems/rubygems/pull/6490

Older rubygems are executing `extconf.rb` with a broken `$LOAD_PATH`
causing the `json` gem native extension to be loaded with the stdlib
version of the `.rb` files.

This fails with

```
json/common.rb:82:in `initialize': wrong number of arguments (given 1, expected 0) (ArgumentError)
```

Since this is just for `extconf.rb` we can probably just accept that
extra argument and ignore it.

The bug was fixed in rubygems 3.4.9 / 2023-03-20

1f5e849fe0
2024-10-26 18:44:15 +09:00
Jean Boussier
a3c21756e9 [ruby/json] Use smaller types for JSON_Parser boolean fields
7f079b25be
2024-10-26 18:44:15 +09:00
Jean Boussier
1045b9f820 [ruby/json] Modernize heredocs
fb25e94aea
2024-10-26 18:44:15 +09:00
Jean Boussier
bfdf02ea72 pretty_generate: don't apply object_nl / array_nl for empty containers
Fix: https://github.com/ruby/json/issues/437

Before:

```json
{
  "foo": {
  },
  "bar": [
  ]
}
```

After:

```json
{
  "foo": {},
  "bar": []
}
```
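
A small script to check this locally. The exact output depends on the
installed json version; with this change applied it should match the
"After" block above.

```ruby
require 'json'

doc = { "foo" => {}, "bar" => [] }

# Prints the pretty-printed document; empty containers are the case
# this commit changes.
puts JSON.pretty_generate(doc)
```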
2024-10-26 18:44:15 +09:00
Jean Boussier
1d4708565f Set Ruby 2.7 as the required version 2024-10-26 18:44:15 +09:00
Jean Boussier
7d37ae6751 [ruby/json] Start 2.8.0 development
937c8d2e65
2024-10-26 18:44:15 +09:00
Jean Boussier
9d3fd50cfe [ruby/json] Release 2.7.3
7a3b482013
2024-10-26 18:44:15 +09:00
Jean Boussier
89d4bbacfb [ruby/json] Release 2.7.3.rc1
a48be35825
2024-10-26 18:44:15 +09:00
Jean Boussier
925131073d Merge json and json-java gemspecs 2024-10-26 18:44:15 +09:00
Jean Boussier
fc9f0cb8c5 [ruby/json] JSON.dump / String#to_json: raise on invalid encoding
This regressed since 2.7.2.

35407d6635
2024-10-26 18:44:15 +09:00
Benoit Daloze
1cf1bf9588 [ruby/json] Add lib/json/ext/generator/state.rb to the gemspec
* Otherwise the gem always uses the pure-Ruby backend
  as it's missing that file and rescuing the LoadError.

1e2809b0b0
2024-10-26 18:44:15 +09:00
Jean Boussier
70f554efb4 [ruby/json] raise_parse_error: avoid UB
Fix: https://github.com/ruby/json/pull/625

Declaring the buffer in a sub block causes bugs on some compilers.

90967c9eb0
2024-10-26 18:44:15 +09:00
Étienne Barrié
82f7550f65 Use frozen string literals
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Étienne Barrié
5f97468958 [ruby/json] Drop compatibility for missing Time#tv_nsec (Ruby 1.8)
b240bde402

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
a052d96103 [ruby/json] Compile with std=c99
d4968d2e48
2024-10-26 18:44:15 +09:00
Jean Boussier
cbd933bcf1 [ruby/json] convert_UTF8_to_ASCII_only_JSON: apply the same optimization pass
42edaf7f17
2024-10-26 18:44:15 +09:00
Jean Boussier
e52b47680e [ruby/json] Reduce encoding benchmark size
Profiling revealed that we were spending lots of time growing the buffer.
Buffer operations are definitely something we want to optimize, but for
this specific benchmark what we're interested in is UTF-8 scanning performance.

Each iteration of the two scanning benchmarks was producing 20MB of JSON,
now they only produce 5MB.

Now:

```
== Encoding mostly utf8 (5001001 bytes)
ruby 3.4.0dev (2024-10-18T19:01:45Z master 7be9a333ca) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    35.000 i/100ms
                  oj    36.000 i/100ms
           rapidjson    10.000 i/100ms
Calculating -------------------------------------
                json    359.161 (± 1.4%) i/s    (2.78 ms/i) -      1.820k in   5.068542s
                  oj    359.699 (± 0.6%) i/s    (2.78 ms/i) -      1.800k in   5.004291s
           rapidjson     99.687 (± 2.0%) i/s   (10.03 ms/i) -    500.000 in   5.017321s

Comparison:
                json:      359.2 i/s
                  oj:      359.7 i/s - same-ish: difference falls within error
           rapidjson:       99.7 i/s - 3.60x  slower
```

1a338532d2
2024-10-26 18:44:15 +09:00
Jean Boussier
97713ac952 [ruby/json] convert_UTF8_to_JSON: repurpose the escape tables into size tables
Since we're looking up the table anyway, we might as well store the
UTF-8 char length in it. For single byte characters that don't need
escaping we store `0`.
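
A Ruby sketch of a size table of this kind. The table values other than
`0`, the helper names, and the `\uXXXX` escaping of control characters are
assumptions made for illustration; the sketch also assumes valid UTF-8
input.

```ruby
# Hypothetical table: 0 = plain ASCII, 1 = single byte needing an escape,
# 2..4 = byte length of the UTF-8 character starting at this byte.
SIZE_TABLE = Array.new(256) do |byte|
  if byte < 0x20 || byte == 0x22 || byte == 0x5C
    1
  elsif byte < 0x80
    0
  elsif byte >= 0xF0
    4
  elsif byte >= 0xE0
    3
  else
    2
  end
end.freeze

def escape_byte(byte)
  case byte
  when 0x22 then '\\"'
  when 0x5C then '\\\\'
  else format('\\u%04x', byte)
  end
end

def convert_utf8_to_json_sketch(str)
  bytes = str.b
  out = String.new('"', encoding: Encoding::BINARY)
  pos = 0
  run_start = 0
  while pos < bytes.bytesize
    byte = bytes.getbyte(pos)
    size = SIZE_TABLE[byte]
    if size == 1
      # Flush the pending run of unescaped bytes, then emit the escape.
      out << bytes.byteslice(run_start, pos - run_start) << escape_byte(byte)
      pos += 1
      run_start = pos
    else
      # One lookup says how far to skip: 1 byte for plain ASCII, or the
      # whole multi-byte character at once.
      pos += (size.zero? ? 1 : size)
    end
  end
  out << bytes.byteslice(run_start, bytes.bytesize - run_start) << '"'
  out.force_encoding(Encoding::UTF_8)
end
```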

This helps on strings with lots of multi-byte characters:

Before:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     6.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     67.978 (± 1.5%) i/s   (14.71 ms/i) -    342.000 in   5.033062s
                  oj    100.876 (± 2.0%) i/s    (9.91 ms/i) -    510.000 in   5.058080s
           rapidjson     26.389 (± 7.6%) i/s   (37.89 ms/i) -    132.000 in   5.027681s

Comparison:
                json:       68.0 i/s
                  oj:      100.9 i/s - 1.48x  faster
           rapidjson:       26.4 i/s - 2.58x  slower
```

After:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     7.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     75.187 (± 2.7%) i/s   (13.30 ms/i) -    378.000 in   5.030111s
                  oj     95.196 (± 2.1%) i/s   (10.50 ms/i) -    480.000 in   5.043565s
           rapidjson     25.969 (± 3.9%) i/s   (38.51 ms/i) -    130.000 in   5.011471s

Comparison:
                json:       75.2 i/s
                  oj:       95.2 i/s - 1.27x  faster
           rapidjson:       26.0 i/s - 2.90x  slower
```

51e2631d1f
2024-10-26 18:44:15 +09:00
Jean Boussier
9f300d0541 [ruby/json] Optimize convert_UTF8_to_JSON for mostly ASCII strings
If we assume that even UTF-8 strings are mostly ASCII, we can implement a
fast path for the ASCII parts.

Before:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json     5.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     49.403 (± 2.0%) i/s   (20.24 ms/i) -    250.000 in   5.062647s
                  oj    100.120 (± 2.0%) i/s    (9.99 ms/i) -    504.000 in   5.035349s
           rapidjson     26.404 (± 0.0%) i/s   (37.87 ms/i) -    132.000 in   5.001025s

Comparison:
                json:       49.4 i/s
                  oj:      100.1 i/s - 2.03x  faster
           rapidjson:       26.4 i/s - 1.87x  slower
```

After:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    10.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     95.686 (± 2.1%) i/s   (10.45 ms/i) -    480.000 in   5.018575s
                  oj     96.875 (± 2.1%) i/s   (10.32 ms/i) -    486.000 in   5.019097s
           rapidjson     26.260 (± 3.8%) i/s   (38.08 ms/i) -    132.000 in   5.033151s

Comparison:
                json:       95.7 i/s
                  oj:       96.9 i/s - same-ish: difference falls within error
           rapidjson:       26.3 i/s - 3.64x  slower
```

f8166c2d7f
2024-10-26 18:44:15 +09:00
Jean Boussier
07fc21cfad [ruby/json] Ext::Parser avoid costly check on decimal_class when it is nil
Closes: https://github.com/ruby/json/pull/512

d882a45d82

Co-Authored-By: lukeg <luke.gru@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
9045258c88 [ruby/json] Limit the size of ParserError exception messages
Fix: https://github.com/ruby/json/issues/534

Only include up to 32 bytes of the unparseable source.
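
The truncation can be sketched as follows. The helper name and the message
format are illustrative assumptions; only the 32-byte limit comes from the
message above.

```ruby
MAX_SNIPPET_BYTES = 32 # limit named in the commit message

# Hypothetical helper: builds an error message from a bounded slice of the
# source instead of everything that remains after the failure position.
def parse_error_message_sketch(source, pos)
  # byteslice may cut a multi-byte character in half; scrub drops the
  # resulting invalid bytes.
  snippet = source.byteslice(pos, MAX_SNIPPET_BYTES).to_s.scrub("")
  "unexpected token at '#{snippet}'"
end
```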

f44995cfb6
2024-10-26 18:44:15 +09:00