89 commits

Author SHA1 Message Date
Jean Boussier
d188a6883f [ruby/json] Implement a fast path for integer parsing
`rb_cstr2inum` isn't very fast because it handles tons of
different scenarios, and it also requires a NUL-terminated string,
which forces us to copy the number into a secondary buffer.

But since the parser already computed the length, we can much more
cheaply do this with a very simple function as long as the number
is small enough to fit into a native type (`long long`).

If the number is too long, we can fall back to the `rb_cstr2inum`
slow path.
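
As an illustration only (this is not the code from the commit), the fast path could look like the sketch below. `fast_parse_integer` and `FAST_INT_MAX_DIGITS` are hypothetical names; the 18-digit cutoff is an assumption, chosen because any 18-digit decimal value fits in a 64-bit `long long`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cutoff: every decimal number with at most 18 digits fits in a
 * long long, which is at least 64 bits wide. */
#define FAST_INT_MAX_DIGITS 18

/* Hypothetical helper, not the actual ruby/json function.
 * Parses an optionally negative decimal integer from buf[0..len).
 * Returns true and stores the value in *out when the number is short
 * enough for the fast path; returns false so the caller can fall back
 * to the slow path. */
static bool fast_parse_integer(const char *buf, size_t len, long long *out)
{
    size_t i = 0;
    bool negative = false;

    if (len > 0 && buf[0] == '-') {
        negative = true;
        i = 1;
    }

    size_t digits = len - i;
    if (digits == 0 || digits > FAST_INT_MAX_DIGITS) {
        return false;
    }

    long long value = 0;
    for (; i < len; i++) {
        if (buf[i] < '0' || buf[i] > '9') {
            return false;
        }
        value = value * 10 + (buf[i] - '0');
    }

    *out = negative ? -value : value;
    return true;
}
```

When the helper returns false, the caller would copy the digits into a NUL-terminated buffer and take the `rb_cstr2inum` slow path, as before.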

Before:

```
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    40.000 i/100ms
                  oj    35.000 i/100ms
          Oj::Parser    45.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    425.941 (± 1.9%) i/s    (2.35 ms/i) -      2.160k in   5.072833s
                  oj    349.617 (± 1.7%) i/s    (2.86 ms/i) -      1.750k in   5.006953s
          Oj::Parser    464.767 (± 1.7%) i/s    (2.15 ms/i) -      2.340k in   5.036381s
           rapidjson    382.413 (± 2.4%) i/s    (2.61 ms/i) -      1.938k in   5.070757s

Comparison:
                json:      425.9 i/s
          Oj::Parser:      464.8 i/s - 1.09x  faster
           rapidjson:      382.4 i/s - 1.11x  slower
                  oj:      349.6 i/s - 1.22x  slower
```

After:

```
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    46.000 i/100ms
                  oj    33.000 i/100ms
          Oj::Parser    45.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    462.332 (± 3.2%) i/s    (2.16 ms/i) -      2.346k in   5.080504s
                  oj    351.140 (± 1.1%) i/s    (2.85 ms/i) -      1.782k in   5.075616s
          Oj::Parser    473.500 (± 1.3%) i/s    (2.11 ms/i) -      2.385k in   5.037695s
           rapidjson    395.052 (± 3.5%) i/s    (2.53 ms/i) -      1.989k in   5.042275s

Comparison:
                json:      462.3 i/s
          Oj::Parser:      473.5 i/s - same-ish: difference falls within error
           rapidjson:      395.1 i/s - 1.17x  slower
                  oj:      351.1 i/s - 1.32x  slower
```

3a4dc9e1b4
2024-11-06 23:31:30 +01:00
Jean Boussier
6cea370b23 [ruby/json] parser.rl: parse_string implement a fast path
If we assume most strings don't contain any escape sequences, we can avoid
a lot of costly operations when that holds true.
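
A rough sketch of such a check, assuming the parser already knows where the string body starts and how long it is. `string_needs_unescape` is a hypothetical name, and the real parser may well detect escapes while scanning rather than with `memchr`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether the raw bytes between the quotes
 * contain a backslash. When they do not, the bytes can be used as they
 * are and no unescaping pass is needed. */
static bool string_needs_unescape(const char *start, size_t len)
{
    return memchr(start, '\\', len) != NULL;
}
```

Only strings for which this returns true would go through the slower unescaping code.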

Before:

```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json   884.000 i/100ms
                  oj   789.000 i/100ms
          Oj::Parser   943.000 i/100ms
           rapidjson   584.000 i/100ms
Calculating -------------------------------------
                json      8.897k (± 1.3%) i/s  (112.40 μs/i) -     45.084k in   5.068520s
                  oj      7.967k (± 1.5%) i/s  (125.52 μs/i) -     40.239k in   5.051985s
          Oj::Parser      9.564k (± 1.4%) i/s  (104.56 μs/i) -     48.093k in   5.029626s
           rapidjson      5.947k (± 1.4%) i/s  (168.16 μs/i) -     29.784k in   5.009437s

Comparison:
                json:     8896.5 i/s
          Oj::Parser:     9563.8 i/s - 1.08x  faster
                  oj:     7966.8 i/s - 1.12x  slower
           rapidjson:     5946.7 i/s - 1.50x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    83.000 i/100ms
                  oj    64.000 i/100ms
          Oj::Parser    77.000 i/100ms
           rapidjson    54.000 i/100ms
Calculating -------------------------------------
                json    823.083 (± 1.8%) i/s    (1.21 ms/i) -      4.150k in   5.043805s
                  oj    632.538 (± 1.4%) i/s    (1.58 ms/i) -      3.200k in   5.060073s
          Oj::Parser    769.122 (± 1.8%) i/s    (1.30 ms/i) -      3.850k in   5.007501s
           rapidjson    548.494 (± 1.5%) i/s    (1.82 ms/i) -      2.754k in   5.022153s

Comparison:
                json:      823.1 i/s
          Oj::Parser:      769.1 i/s - 1.07x  slower
                  oj:      632.5 i/s - 1.30x  slower
           rapidjson:      548.5 i/s - 1.50x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    41.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    45.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    427.162 (± 1.2%) i/s    (2.34 ms/i) -      2.173k in   5.087666s
                  oj    351.463 (± 2.8%) i/s    (2.85 ms/i) -      1.768k in   5.035149s
          Oj::Parser    461.849 (± 3.7%) i/s    (2.17 ms/i) -      2.340k in   5.074461s
           rapidjson    395.155 (± 1.8%) i/s    (2.53 ms/i) -      1.989k in   5.034927s

Comparison:
                json:      427.2 i/s
          Oj::Parser:      461.8 i/s - 1.08x  faster
           rapidjson:      395.2 i/s - 1.08x  slower
                  oj:      351.5 i/s - 1.22x  slower
```

After:

```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json   953.000 i/100ms
                  oj   813.000 i/100ms
          Oj::Parser   956.000 i/100ms
           rapidjson   563.000 i/100ms
Calculating -------------------------------------
                json      9.525k (± 1.2%) i/s  (104.98 μs/i) -     47.650k in   5.003252s
                  oj      8.117k (± 0.5%) i/s  (123.20 μs/i) -     40.650k in   5.008283s
          Oj::Parser      9.590k (± 3.2%) i/s  (104.27 μs/i) -     48.756k in   5.089794s
           rapidjson      6.020k (± 0.9%) i/s  (166.10 μs/i) -     30.402k in   5.050155s

Comparison:
                json:     9525.3 i/s
          Oj::Parser:     9590.1 i/s - same-ish: difference falls within error
                  oj:     8116.7 i/s - 1.17x  slower
           rapidjson:     6020.5 i/s - 1.58x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    87.000 i/100ms
                  oj    64.000 i/100ms
          Oj::Parser    75.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    866.563 (± 0.8%) i/s    (1.15 ms/i) -      4.350k in   5.020138s
                  oj    643.567 (± 0.8%) i/s    (1.55 ms/i) -      3.264k in   5.072101s
          Oj::Parser    777.346 (± 3.5%) i/s    (1.29 ms/i) -      3.900k in   5.023933s
           rapidjson    557.158 (± 0.7%) i/s    (1.79 ms/i) -      2.805k in   5.034731s

Comparison:
                json:      866.6 i/s
          Oj::Parser:      777.3 i/s - 1.11x  slower
                  oj:      643.6 i/s - 1.35x  slower
           rapidjson:      557.2 i/s - 1.56x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                json    41.000 i/100ms
                  oj    35.000 i/100ms
          Oj::Parser    40.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    429.216 (± 1.2%) i/s    (2.33 ms/i) -      2.173k in   5.063351s
                  oj    354.755 (± 1.1%) i/s    (2.82 ms/i) -      1.785k in   5.032374s
          Oj::Parser    465.114 (± 3.7%) i/s    (2.15 ms/i) -      2.360k in   5.081634s
           rapidjson    387.135 (± 1.3%) i/s    (2.58 ms/i) -      1.950k in   5.037787s

Comparison:
                json:      429.2 i/s
          Oj::Parser:      465.1 i/s - 1.08x  faster
           rapidjson:      387.1 i/s - 1.11x  slower
                  oj:      354.8 i/s - 1.21x  slower
```

96bd97c61e
2024-11-06 23:31:30 +01:00
Nobuyoshi Nakada
8254f6492c [ruby/json] Categorize deprecated warning
1acce7aceb
2024-11-06 23:31:30 +01:00
Jean Boussier
ca8f21ace8 [ruby/json] Resync
2024-11-05 18:00:36 +01:00
Jean Boussier
ed22e68379 [ruby/json] JSON::Ext::Parser mark the name cache entries when not on the heap
This is somewhat dead code: unless you are using `JSON::Parser.new`
directly, we never allocate `JSON::Ext::Parser` anymore.

But still, we should mark all its references in case some code out there
uses that.

Followup: #675

8bf74a977b
2024-11-05 18:00:36 +01:00
Jean Boussier
ee4fa4ccee [ruby/json] json_string_unescape: Use the returned RString as buffer
Rather than copying into a buffer to unescape and then copying that
buffer into the final string, we can unescape directly into the final
string.

The downside is that if the string contains a lot of escaping, we
end up returning a string that's larger than strictly necessary, but
it's probably fine.
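
A minimal sketch of the idea, under the assumption that the destination is allocated with the same capacity as the escaped input. `unescape_into` is a hypothetical helper that handles only the single-character escapes; `\uXXXX` sequences are left out for brevity.

```c
#include <stddef.h>

/* Hypothetical sketch: unescape src[0..len) straight into dst, which the
 * caller has allocated with a capacity of at least len bytes.
 * Returns the number of bytes written, which is never more than len. */
static size_t unescape_into(char *dst, const char *src, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        if (c == '\\' && i + 1 < len) {
            i++; /* consume the character that follows the backslash */
            switch (src[i]) {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                case 'r': c = '\r'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                default:  c = src[i]; break; /* escaped quote, backslash, slash */
            }
        }
        dst[out++] = c;
    }
    return out;
}
```

Because every escape sequence is at least as long as the bytes it decodes to, the output always fits; the unused tail of the destination is the extra size mentioned above.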

Before:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    58.000 i/100ms
           oj strict    74.000 i/100ms
          Oj::Parser    76.000 i/100ms
           rapidjson    52.000 i/100ms
Calculating -------------------------------------
                json    556.659 (± 2.9%) i/s    (1.80 ms/i) -      2.800k in   5.034719s
                  oj    604.077 (± 3.8%) i/s    (1.66 ms/i) -      3.016k in   5.001546s
           oj strict    706.942 (± 3.5%) i/s    (1.41 ms/i) -      3.552k in   5.030954s
          Oj::Parser    752.917 (± 3.2%) i/s    (1.33 ms/i) -      3.800k in   5.052707s
           rapidjson    546.470 (± 3.5%) i/s    (1.83 ms/i) -      2.756k in   5.049855s

Comparison:
                json:      556.7 i/s
          Oj::Parser:      752.9 i/s - 1.35x  faster
           oj strict:      706.9 i/s - 1.27x  faster
                  oj:      604.1 i/s - 1.09x  faster
           rapidjson:      546.5 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    29.000 i/100ms
                  oj    32.000 i/100ms
           oj strict    38.000 i/100ms
          Oj::Parser    42.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    317.858 (± 3.1%) i/s    (3.15 ms/i) -      1.595k in   5.023245s
                  oj    348.168 (± 2.6%) i/s    (2.87 ms/i) -      1.760k in   5.058431s
           oj strict    394.599 (± 2.8%) i/s    (2.53 ms/i) -      1.976k in   5.012073s
          Oj::Parser    403.771 (± 3.0%) i/s    (2.48 ms/i) -      2.058k in   5.101578s
           rapidjson    383.441 (± 3.7%) i/s    (2.61 ms/i) -      1.938k in   5.061355s

Comparison:
                json:      317.9 i/s
          Oj::Parser:      403.8 i/s - 1.27x  faster
           oj strict:      394.6 i/s - 1.24x  faster
           rapidjson:      383.4 i/s - 1.21x  faster
                  oj:      348.2 i/s - 1.10x  faster
```

After:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    62.000 i/100ms
           oj strict    73.000 i/100ms
          Oj::Parser    76.000 i/100ms
           rapidjson    54.000 i/100ms
Calculating -------------------------------------
                json    561.009 (± 7.5%) i/s    (1.78 ms/i) -      2.800k in   5.039548s
                  oj    601.124 (± 4.3%) i/s    (1.66 ms/i) -      3.038k in   5.064686s
           oj strict    707.455 (± 3.4%) i/s    (1.41 ms/i) -      3.577k in   5.062540s
          Oj::Parser    751.799 (± 3.1%) i/s    (1.33 ms/i) -      3.800k in   5.059509s
           rapidjson    535.641 (± 3.2%) i/s    (1.87 ms/i) -      2.700k in   5.045816s

Comparison:
                json:      561.0 i/s
          Oj::Parser:      751.8 i/s - 1.34x  faster
           oj strict:      707.5 i/s - 1.26x  faster
                  oj:      601.1 i/s - same-ish: difference falls within error
           rapidjson:      535.6 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    30.000 i/100ms
                  oj    32.000 i/100ms
           oj strict    36.000 i/100ms
          Oj::Parser    42.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    313.248 (± 7.3%) i/s    (3.19 ms/i) -      1.560k in   5.014118s
                  oj    341.977 (± 4.1%) i/s    (2.92 ms/i) -      1.728k in   5.063332s
           oj strict    387.062 (± 6.2%) i/s    (2.58 ms/i) -      1.944k in   5.045961s
          Oj::Parser    400.423 (± 4.0%) i/s    (2.50 ms/i) -      2.016k in   5.044513s
           rapidjson    379.046 (± 6.1%) i/s    (2.64 ms/i) -      1.911k in   5.064461s

Comparison:
                json:      313.2 i/s
          Oj::Parser:      400.4 i/s - 1.28x  faster
           oj strict:      387.1 i/s - 1.24x  faster
           rapidjson:      379.0 i/s - 1.21x  faster
                  oj:      342.0 i/s - same-ish: difference falls within error
```

5e1ec4a268
2024-11-01 13:04:24 +09:00
Jean Boussier
b8b33efd4d [ruby/json] Remove String#-@ check in extconf.rb
Now that older rubies have been dropped, we no longer need to check
for all that.

35cf2b84e0
2024-11-01 13:04:24 +09:00
Jean Boussier
165cc6cf40 [ruby/json] json_string_unescape: assume the string doesn't need escaping
If that assumption holds true, then we don't need to copy the
string into a buffer to unescape it. For small strings it just saves
copying, but for large ones it also saves a malloc/free combo.

Before:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    52.000 i/100ms
                  oj    61.000 i/100ms
           oj strict    70.000 i/100ms
          Oj::Parser    71.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    510.111 (± 2.9%) i/s    (1.96 ms/i) -      2.548k in   5.000029s
                  oj    610.232 (± 3.1%) i/s    (1.64 ms/i) -      3.050k in   5.003725s
           oj strict    713.231 (± 3.2%) i/s    (1.40 ms/i) -      3.570k in   5.010902s
          Oj::Parser    762.598 (± 3.0%) i/s    (1.31 ms/i) -      3.834k in   5.033130s
           rapidjson    553.029 (± 7.4%) i/s    (1.81 ms/i) -      2.750k in   5.022630s

Comparison:
                json:      510.1 i/s
          Oj::Parser:      762.6 i/s - 1.49x  faster
           oj strict:      713.2 i/s - 1.40x  faster
                  oj:      610.2 i/s - 1.20x  faster
           rapidjson:      553.0 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    28.000 i/100ms
                  oj    33.000 i/100ms
           oj strict    37.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    303.853 (± 3.6%) i/s    (3.29 ms/i) -      1.540k in   5.076079s
                  oj    348.009 (± 2.0%) i/s    (2.87 ms/i) -      1.749k in   5.027738s
           oj strict    396.679 (± 3.3%) i/s    (2.52 ms/i) -      1.998k in   5.042271s
          Oj::Parser    406.699 (± 2.2%) i/s    (2.46 ms/i) -      2.064k in   5.077587s
           rapidjson    393.463 (± 3.3%) i/s    (2.54 ms/i) -      1.976k in   5.028501s

Comparison:
                json:      303.9 i/s
          Oj::Parser:      406.7 i/s - 1.34x  faster
           oj strict:      396.7 i/s - 1.31x  faster
           rapidjson:      393.5 i/s - 1.29x  faster
                  oj:      348.0 i/s - 1.15x  faster
```

After:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    62.000 i/100ms
           oj strict    72.000 i/100ms
          Oj::Parser    77.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    568.025 (± 2.1%) i/s    (1.76 ms/i) -      2.856k in   5.030272s
                  oj    630.936 (± 1.4%) i/s    (1.58 ms/i) -      3.162k in   5.012630s
           oj strict    705.784 (±11.2%) i/s    (1.42 ms/i) -      3.456k in   5.006706s
          Oj::Parser    783.989 (± 1.7%) i/s    (1.28 ms/i) -      3.927k in   5.010343s
           rapidjson    557.630 (± 2.0%) i/s    (1.79 ms/i) -      2.805k in   5.032388s

Comparison:
                json:      568.0 i/s
          Oj::Parser:      784.0 i/s - 1.38x  faster
           oj strict:      705.8 i/s - 1.24x  faster
                  oj:      630.9 i/s - 1.11x  faster
           rapidjson:      557.6 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    29.000 i/100ms
                  oj    33.000 i/100ms
           oj strict    38.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    37.000 i/100ms
Calculating -------------------------------------
                json    319.271 (± 3.1%) i/s    (3.13 ms/i) -      1.595k in   5.001128s
                  oj    347.946 (± 1.7%) i/s    (2.87 ms/i) -      1.749k in   5.028395s
           oj strict    396.914 (± 3.0%) i/s    (2.52 ms/i) -      2.014k in   5.079645s
          Oj::Parser    409.311 (± 2.7%) i/s    (2.44 ms/i) -      2.064k in   5.046626s
           rapidjson    394.752 (± 1.5%) i/s    (2.53 ms/i) -      1.998k in   5.062776s

Comparison:
                json:      319.3 i/s
          Oj::Parser:      409.3 i/s - 1.28x  faster
           oj strict:      396.9 i/s - 1.24x  faster
           rapidjson:      394.8 i/s - 1.24x  faster
                  oj:      347.9 i/s - 1.09x  faster
```

7e0f66546a
2024-11-01 13:04:24 +09:00
Jean Boussier
081689b9e2 [ruby/json] parser.rl: extract build_string
7e557ee291
2024-11-01 13:04:24 +09:00
Benoit Daloze
6412e6f6c3 [ruby/json] Use String#encode instead of rb_str_conv_enc()
* rb_str_conv_enc() returns the source string unmodified
  if the conversion did not work. But we should be consistent with
  the generator here and only accept BINARY or convertible to UTF-8.

1344ad6f66
2024-11-01 13:04:24 +09:00
Jean Boussier
3782600f0f [ruby/json] Emit warnings when dumping binary strings
Because of its Ruby 1.8 heritage, the C extension doesn't care
much about string encodings. We should get stricter over time.

42402fc13f
2024-11-01 13:04:24 +09:00
Jean Boussier
f2b8829df0 Deprecate unsafe default options of JSON.load
[Feature #19528]

Ref: https://bugs.ruby-lang.org/issues/19528

`load` is understood as the default method for serializer-style libraries, and
the default options of `JSON.load` have caused many security vulnerabilities over the
years.

The plan is to do like YAML/Psych: deprecate these default options and direct
users toward `JSON.unsafe_load`, so at least it's obvious it should not be
used against untrusted data.
2024-11-01 13:04:24 +09:00
Jean Boussier
59eebeca02 [ruby/json] Allocate the initial generator buffer on the stack
Ref: https://github.com/ruby/json/issues/655
Followup: https://github.com/ruby/json/issues/657

Assuming the generator might be used for fairly small documents,
we can start with a reasonable buffer size on the stack, and if
we outgrow it, we can spill to the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use cases for fairly small JSON documents in real-world
scenarios, so thrashing the GC less in such cases makes sense.
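
The stack-then-heap strategy can be sketched roughly as below. This is not the gem's `FBuffer`; `sketch_buffer` and its functions are hypothetical names, and the growth policy (doubling) is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *ptr;
    size_t len;
    size_t capa;
    bool on_heap; /* false while ptr still points at the caller's storage */
} sketch_buffer;

/* Start out on caller-provided storage, typically an array on the stack. */
static void sketch_buffer_init(sketch_buffer *b, char *storage, size_t capa)
{
    b->ptr = storage;
    b->len = 0;
    b->capa = capa;
    b->on_heap = false;
}

/* Append n bytes, moving to the heap the first time the initial storage
 * is outgrown. Returns false if the allocation fails. */
static bool sketch_buffer_append(sketch_buffer *b, const char *data, size_t n)
{
    if (b->len + n > b->capa) {
        size_t new_capa = b->capa ? b->capa * 2 : 16;
        while (new_capa < b->len + n) new_capa *= 2;

        char *heap;
        if (b->on_heap) {
            heap = realloc(b->ptr, new_capa);
            if (!heap) return false;
        } else {
            heap = malloc(new_capa);
            if (!heap) return false;
            memcpy(heap, b->ptr, b->len); /* carry over what was on the stack */
        }
        b->ptr = heap;
        b->capa = new_capa;
        b->on_heap = true;
    }
    memcpy(b->ptr + b->len, data, n);
    b->len += n;
    return true;
}

/* Only heap storage needs to be released. */
static void sketch_buffer_free(sketch_buffer *b)
{
    if (b->on_heap) free(b->ptr);
}
```

A caller would declare something like `char storage[512];` next to the `sketch_buffer` (the size is an arbitrary choice here); small documents then never touch `malloc` at all.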

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```

fe607f4806
2024-11-01 13:04:24 +09:00
Peter Zhu
e077be119b [ruby/json] Remove double semicolon at end of line in parser
f6d6ca3c17
2024-10-30 10:13:49 +09:00
Jean Boussier
5d176436ce [ruby/json] Allocate the FBuffer struct on the stack
Ref: https://github.com/ruby/json/issues/655

The actual buffer is still on the heap, but this saves a pair
of malloc/free.

This helps a lot on micro-benchmarks

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   531.598k i/100ms
          JSON reuse   417.666k i/100ms
Calculating -------------------------------------
                  Oj      5.735M (± 1.3%) i/s  (174.35 ns/i) -     28.706M in   5.005900s
          JSON reuse      4.604M (± 1.4%) i/s  (217.18 ns/i) -     23.389M in   5.080779s

Comparison:
                  Oj:  5735475.6 i/s
          JSON reuse:  4604380.3 i/s - 1.25x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

Bench:

```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end
```

72110f7992
2024-10-30 10:13:48 +09:00
Jean Boussier
fc9f0cb8c5 [ruby/json] JSON.dump / String#to_json: raise on invalid encoding
This regressed since 2.7.2.

35407d6635
2024-10-26 18:44:15 +09:00
Jean Boussier
70f554efb4 [ruby/json] raise_parse_error: avoid UB
Fix: https://github.com/ruby/json/pull/625

Declaring the buffer in a sub-block causes bugs on some compilers.
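
The pattern at issue can be sketched as follows. `format_error` and `SNIPPET_MAX` are hypothetical, and the message format is only illustrative; the point is where `buffer` is declared.

```c
#include <stdio.h>
#include <string.h>

#define SNIPPET_MAX 32 /* assumed limit on how much input is echoed back */

/* The buffer lives at function scope, so `shown` stays valid until the
 * final snprintf. Had `buffer` been declared inside the `if` block,
 * `shown` would dangle once that block ended, and reading it would be
 * undefined behavior. */
static void format_error(char *out, size_t out_size, const char *input)
{
    char buffer[SNIPPET_MAX + 4]; /* room for "..." and the terminator */
    const char *shown = input;

    if (strlen(input) > SNIPPET_MAX) {
        memcpy(buffer, input, SNIPPET_MAX);
        memcpy(buffer + SNIPPET_MAX, "...", 4); /* copies the NUL as well */
        shown = buffer;
    }

    snprintf(out, out_size, "unexpected token at '%s'", shown);
}
```

Some compilers reuse the stack slot of a block-scoped array as soon as the block closes, which is why the broken variant only misbehaves on certain toolchains.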

90967c9eb0
2024-10-26 18:44:15 +09:00
Jean Boussier
07fc21cfad [ruby/json] Ext::Parser avoid costly check on decimal_class when it is nil
Closes: https://github.com/ruby/json/pull/512

d882a45d82

Co-Authored-By: lukeg <luke.gru@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
9045258c88 [ruby/json] Limit the size of ParserError exception messages
Fix: https://github.com/ruby/json/issues/534

Only include up to 32 bytes of the unparseable source.

f44995cfb6
2024-10-26 18:44:15 +09:00
Jean Boussier
7dfc1f3d66 [ruby/json] parser.c: refactor raise_parse_error
09e1df2643
2024-10-26 18:44:15 +09:00
Jean Boussier
618085f48d [ruby/json] Get rid of the remaining tabs.
1a9af430d2
2024-10-26 18:44:15 +09:00
Jean Boussier
e0f8732023 Reduce allocations in parse and load argument handling
Avoid needless hash allocations and such that degrade performance
significantly on micro-benchmarks.
2024-10-26 18:44:15 +09:00
Jean Boussier
8e7e638221 Add more precise documentation for object_class and array_class
Fix: https://github.com/ruby/json/issues/419
2024-10-26 18:44:15 +09:00
Takumasa Ochi
20dc1e5c25 [ruby/json] Always dup argument to preserve original encoding for force_encoding
db9a489ca2
2024-10-18 11:30:42 +09:00
Jean Boussier
c4d4c6b846 [ruby/json] Speedup Parser initialization
Extracted from: https://github.com/ruby/json/pull/512

Use `rb_hash_lookup2` to check for hash key existence instead
of going through `rb_funcall`.

43835a0d13

Co-Authored-By: lukeg <luke.gru@gmail.com>
2024-10-18 11:28:12 +09:00
Jean Boussier
df48f597cf [ruby/json] Get rid of some more outdated compatibility code
All these macros are available on Ruby 2.3+

227885f460
2024-10-17 13:02:13 +00:00
Hiroshi SHIBATA
8a79f345a2 [ruby/json] Unicode string like § is not allowed in C files at ruby/ruby repo
53409bcc74
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
74d459fd52 [ruby/json] Adjust to the CVTUTF code being gone
I, Luke T. Shumaker, am the sole author of the added code.

I did not reference CVTUTF when writing it.  I did reference the
Unicode standard (15.0.0), the Wikipedia article on UTF-8, and the
Wikipedia article on UTF-16.  When I saw some tests fail, I did
reference the old deleted code (but a JSON-specific part, inherently
not as based on CVTUTF) to determine that script_safe should also
escape U+2028 and U+2029.

I targeted simplicity and clarity when writing the code--it can likely
be optimized.  In my mind, the obvious next optimization is to have it
combine contiguous non-escaped characters into just one call to
fbuffer_append(), instead of calling fbuffer_append() for each
character.
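
A possible shape for that optimization, sketched outside of the generator. `escape_line_separators` is a hypothetical function, `memcpy` stands in for `fbuffer_append()`, and only U+2028 and U+2029 are handled, not the rest of JSON escaping.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch. Copies src[0..len) to dst, replacing the UTF-8
 * encodings of U+2028 (E2 80 A8) and U+2029 (E2 80 A9) with the text
 * \u2028 and \u2029. Runs of bytes that need no escaping are copied with
 * a single memcpy instead of one call per character. dst must have room
 * for 2 * len bytes, since each 3-byte sequence becomes 6 bytes.
 * Returns the number of bytes written. */
static size_t escape_line_separators(char *dst, const char *src, size_t len)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t out = 0;
    size_t run_start = 0; /* start of the pending unescaped run */

    for (size_t i = 0; i + 2 < len; i++) {
        if (s[i] == 0xE2 && s[i + 1] == 0x80 &&
            (s[i + 2] == 0xA8 || s[i + 2] == 0xA9)) {
            memcpy(dst + out, src + run_start, i - run_start); /* flush run */
            out += i - run_start;
            memcpy(dst + out, s[i + 2] == 0xA8 ? "\\u2028" : "\\u2029", 6);
            out += 6;
            i += 2;            /* skip the rest of the 3-byte sequence */
            run_start = i + 1;
        }
    }

    memcpy(dst + out, src + run_start, len - run_start); /* flush the tail */
    out += len - run_start;
    return out;
}
```

For input with no characters to escape, the whole string is copied in one call.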

Regarding the use of the "modern" types `uint32_t`, `uint16_t`, and
`bool`:
 - ruby.h is guaranteed to give us uint32_t and uint16_t.
 - Since Ruby 3.0.0, ruby.h is guaranteed to give us bool... but we
   support down to Ruby 2.3.  But, ruby.h is guaranteed to give us
   HAVE_STDBOOL_H for the C99 stdbool.h; so use that to include
   stdbool.h if we can, and if not then fall back to a copy of the
   same bool definition that Ruby 3.0.5 uses with C89.

c96351f874
2024-10-08 14:10:05 +09:00
Jean Boussier
d612f9fd34 [flori/json] Remove outdated ifdef checks
`json` requires Ruby 2.3, so `HAVE_RUBY_ENCODING_H` and `HAVE_RB_ENC_RAISE`
are always true.

5c8dc6b70a
2024-09-03 11:51:51 +09:00
Jean Boussier
c5ae432ec8
[flori/json] Cleanup useless ifdef
The json gem now requires Ruby 2.3, so there is no point keeping
compatibility code for older releases that don't have the
TypedData API.

45c86e153f
2024-06-04 12:23:48 +09:00
Hiroshi SHIBATA
86045fca24
Manually merged from flori/json
> https://github.com/flori/json/pull/525
  > Rename escape_slash to script_safe and also escape U+2028 and U+2029

  Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

  > https://github.com/flori/json/pull/454
  > Remove unnecessary initialization of create_id in JSON.parse()

  Co-authored-by: Watson <watson1978@gmail.com>
2023-12-01 16:47:06 +09:00
Jean Boussier
698cb84062
Use ruby_xfree to free buffers
They are allocated with ruby_xmalloc, they should be freed with
ruby_xfree.
2023-12-01 16:47:06 +09:00
John Hawthorn
4b770527c2
[flori/json] Fix "unexpected token" offset for Infinity
Previously in the JSON::Ext parser, when we encountered an "Infinity"
token (and weren't allowing NaN/Infinity) we would try to display the
"unexpected token" at the character before.

42ac170712
2023-12-01 16:47:06 +09:00
Nobuyoshi Nakada
104089ce02 [flori/json] [DOC] Remove duplicate sentence
ed242667b4
2023-07-19 00:02:58 +09:00
Nobuyoshi Nakada
f1f84ca71c [flori/json] Remove HAVE_RB_SCAN_ARGS_OPTIONAL_HASH check
This macro has been defined since Ruby 2.1, which is older than the required
Ruby version.

dd1d54e78a
2023-07-19 00:02:58 +09:00
Dimitar Haralanov
9977462fd9 [flori/json] Rename JSON::ParseError to JSON::ParserError
20b80ca317
2023-07-18 12:25:54 +09:00
Jean Boussier
66b52f046f [flori/json] Stop including the parser source __LINE__ in exceptions
It makes testing for JSON errors very tedious. You either have
to use a Regexp or to regularly update all your assertions
when JSON is upgraded.

de9eb1d28e
2022-07-29 19:10:10 +09:00
Andrew Bromwich
a15d0e267a
[flori/json] Fix parser bug for empty string allocation
When `HAVE_RB_ENC_INTERNED_STR` is enabled, it is possible to
pass a null pointer to `rb_enc_interned_str`, resulting
in a segfault.

Fixes #495

b59368a8c2
2022-05-20 17:49:13 +09:00
Hiroshi SHIBATA
767f3904ee
[flori/json] Doc: Improve documentation on JSON#parse and JSON#parse!
75ada77b96

Co-authored-by: Bruno Gomes da Silva <brunojabs@gmail.com>
2022-05-20 17:49:13 +09:00
Jean Boussier
2de594ca98
[flori/json] Deduplicate strings inside json_string_unescape
[ci 2]

1982070cb8
2021-05-17 19:51:51 +09:00
Jean Boussier
1d2b4ccaf2
[flori/json] Refactor json_string_unescape
f398769332
2021-05-17 19:51:50 +09:00
Kenta Murata
14d7d1df25
[json] Make json Ractor safe
2020-12-21 22:10:43 +09:00
Kenta Murata
4c2e7f26bd
[json] JSON_parse_float: Fix how to convert number
Stop BigDecimal-specific optimization.  Instead, it tries the conversion
methods in the following order:

1. `try_convert`,
2. `new`, and
3. class-named function, e.g. `Foo::Bar.Baz` function for `Foo::Bar::Baz` class

If all the above candidates are unavailable, it falls back to Float.
2020-12-21 22:10:43 +09:00
Jean Boussier
520e0916af
Implement a freeze: parser option
If set to true all parsed objects will be
immediately frozen, and strings will be
deduplicated if the Ruby implementation
allows it.
2020-10-20 21:40:25 +09:00
Watson
cb3e62511c
[flori/json] Use frozen string for hash key
When a non-frozen string is used as a hash key with `rb_hash_aset()`, it is duplicated and frozen internally.
To avoid that duplicate-and-freeze step, this patch passes an already frozen string to `rb_hash_aset()`.

```
Warming up --------------------------------------
                json    14.000  i/100ms
Calculating -------------------------------------
                json    148.844  (± 1.3%) i/s -    756.000  in   5.079969s
```

```
Warming up --------------------------------------
                json    16.000  i/100ms
Calculating -------------------------------------
                json    165.608  (± 1.8%) i/s -    832.000  in   5.025367s
```

```ruby
require 'json'
require 'securerandom'
require 'benchmark/ips'

obj = []

1000.times do |i|
  obj << {
    "id": i,
    "uuid": SecureRandom.uuid,
    "created_at": Time.now
  }
end

json = obj.to_json

Benchmark.ips do |x|
  x.report "json" do |iter|
    count = 0
    while count < iter
      JSON.parse(json)
      count += 1
    end
  end
end
```

18292c0c1d
2020-07-01 18:47:51 +09:00
Florian Frank
7376d70cb0
[flori/json] Only attempt to resize strings not other objects
167ada8da7
2019-10-14 19:54:48 +09:00
Yusuke Endoh
417c64b9a8 ext/json/parser/parser.rl: Use "signed" char to contain negative values
char is not always signed.  In fact, it is unsigned on ARM.

20191004T181708Z.log.html.gz
```
compiling parser.c
parser.rl: In function ‘unescape_unicode’:
parser.rl:50:5: warning: comparison is always false due to limited range of data type [-Wtype-limits]
     if (b < 0) return UNI_REPLACEMENT_CHAR;
     ^
```
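
The warning comes from storing a possibly negative value in a plain `char`. A small sketch of the fix, with hypothetical helpers `hex_digit_value` and `decode_hex4` that are not the parser's actual functions:

```c
/* Value of a hexadecimal digit, or -1 when the byte is not one.
 * The result type is spelled `signed char` because plain `char` may be
 * unsigned, in which case -1 would turn into 255 and a `b < 0` check
 * could never be true. */
static signed char hex_digit_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return (signed char)(c - '0');
    if (c >= 'a' && c <= 'f') return (signed char)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (signed char)(c - 'A' + 10);
    return -1;
}

/* Decodes four hex digits (the XXXX of a \uXXXX escape).
 * Returns -1 on invalid input. */
static long decode_hex4(const char *p)
{
    long value = 0;
    for (int i = 0; i < 4; i++) {
        signed char b = hex_digit_value((unsigned char)p[i]);
        if (b < 0) return -1; /* reachable whatever the signedness of char */
        value = (value << 4) | b;
    }
    return value;
}
```

With `char b` in place of `signed char b`, the `b < 0` branch is the comparison the compiler flags as always false on platforms where `char` is unsigned.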
2019-10-05 07:00:57 +09:00
Yusuke Endoh
076d3d758b ext/json/parser/parser.rl: Update the source code of parser.c
There have been some direct changes in parser.c which is automatically
generated from parser.rl.  This updates parser.rl to sync the changes:

* 91793b8967
* 79ead821dd
* 80b5a0ff2a
2019-10-05 06:34:40 +09:00
mrkn
a7e3516ff1 Fix JSON::Parser against bigdecimal updates
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66127 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02 05:21:57 +00:00
eregon
e7da0fc34e ext/json/parser/parser.c: do not call rb_str_resize() on Time object
* See https://github.com/flori/json/issues/342

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64177 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-08-03 15:11:36 +00:00