163 commits

Author SHA1 Message Date
Jean Boussier
1510d72bec [ruby/json] Fix generate(script_safe: true) to not confuse unrelated characters
Fix: https://github.com/ruby/json/issues/715

The first byte check was missing.
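
As a rough illustration of the kind of check involved (a Ruby sketch, not the actual `generator.c` code; the helper name is hypothetical): U+2028 and U+2029 are encoded in UTF-8 as `E2 80 A8` and `E2 80 A9`, so a test that only looked at the last two bytes would also match unrelated characters such as U+1028 (`E1 80 A8`).

```ruby
# Hypothetical helper: does the 3-byte sequence starting at `pos`
# encode U+2028 (line separator) or U+2029 (paragraph separator)?
def script_unsafe_separator?(bytes, pos)
  bytes[pos] == 0xE2 &&                 # the first-byte check
    bytes[pos + 1] == 0x80 &&
    (bytes[pos + 2] == 0xA8 || bytes[pos + 2] == 0xA9)
end

script_unsafe_separator?("\u2028".bytes, 0) # true
script_unsafe_separator?("\u1028".bytes, 0) # false: first byte is 0xE1
```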

93a7f8717d
2024-12-05 09:16:22 +01:00
Yusuke Endoh
209f8ba7c4 [ruby/json] Prevent a warning of "a candidate for gnu_printf format attribute"
GCC 13 prints the following warning.

20241127T001003Z.log.html.gz
```
compiling generator.c
generator.c: In function ‘raise_generator_error’:
generator.c:91:5: warning: function ‘raise_generator_error’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
   91 |     VALUE str = rb_vsprintf(fmt, args);
      |     ^~~~~
```

This change prevents the warning by specifying the format attribute.

b8c1490846
2024-11-27 23:35:20 +09:00
Jean Boussier
693a793521 JSON::GeneratorError expose invalid object
Fix: https://github.com/ruby/json/issues/710

Makes it easier to debug why a given tree of objects can't
be dumped as JSON.
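
A minimal sketch of how a caller might use this, assuming the error exposes the object through an `invalid_object` reader on versions that include this change (the helper name is hypothetical, and the code probes for the reader so it also runs on older versions):

```ruby
require 'json'

# Returns [error, culprit] if `data` cannot be dumped, or nil on success.
def explain_generate_failure(data)
  JSON.generate(data)
  nil
rescue JSON::GeneratorError => e
  # Assumption: `invalid_object` is only present on versions that
  # include this change, so probe for it instead of calling it blindly.
  culprit = e.respond_to?(:invalid_object) ? e.invalid_object : nil
  [e, culprit]
end

# Float::NAN cannot be represented in JSON by default.
error, culprit = explain_generate_failure({ "values" => [1.0, Float::NAN] })
warn "#{error.message} (offending object: #{culprit.inspect})" if error
```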

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2024-11-26 15:11:05 +09:00
Jean Boussier
ee0de3fd4e [ruby/json] JSON.dump: write directly into the provided IO
Ref: https://github.com/ruby/json/issues/524

Rather than buffering everything in memory.

Unfortunately Ruby doesn't provide an API to write into
an IO without first allocating a string, which is a bit
wasteful.
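
For reference, a short usage sketch of the two-argument form this commit concerns. A `StringIO` stands in for a real file or socket; any object that responds to `write` should work.

```ruby
require 'json'
require 'stringio'

io = StringIO.new
JSON.dump({ "id" => 1, "tags" => ["a", "b"] }, io)

io.string # '{"id":1,"tags":["a","b"]}'
```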

f017af6c0a
2024-11-26 15:11:05 +09:00
Nobuyoshi Nakada
29d76d8c8b [ruby/json] Fix right shift warnings
Ignore platforms where `CHAR_BIT` > 8, since `ch` indexes
`escape_table`, which is hard-coded to 256 elements.

```
../../../../src/ext/json/generator/generator.c(121): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(122): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(243): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(244): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(291): warning C4333: '>>': right shift by too large amount, data loss
../../../../src/ext/json/generator/generator.c(292): warning C4333: '>>': right shift by too large amount, data loss
```

fb82373612
2024-11-06 23:31:30 +01:00
Jean Boussier
9987298654 Update depend files 2024-11-05 18:00:36 +01:00
Jean Boussier
ca8f21ace8 [ruby/json] Resync 2024-11-05 18:00:36 +01:00
Jean Boussier
f664e7eaab [ruby/json] Add tests for the behavior of JSON.generate with base types subclasses
Ref: https://github.com/ruby/json/pull/674
Ref: https://github.com/ruby/json/pull/668

The behavior in such cases is quite unclear; the goal here is to
figure out what the behavior was in the C extension version of `json 2.7.0`
and get all implementations to converge.

We can then decide to make them all behave differently if we so wish.

614921dcef
2024-11-05 18:00:36 +01:00
Jean Boussier
2f84a02ad5 [ruby/json] Use rb_str_new_frozen
90c8aaaa6a
2024-11-05 18:00:36 +01:00
Jean Boussier
b85a7a44fa [ruby/json] Trigger write barrier when setting Generator::State configs
Followup: 6382c231b0

0c797b4a11
2024-11-01 13:04:24 +09:00
Jean Boussier
ef5565f5d1 JSON.generate: call to_json on String subclasses
Fix: https://github.com/ruby/json/issues/667

This is yet another behavior on which the various implementations
differed, but the C implementation used to call `to_json` on String
subclasses used as keys.

This was optimized out in e125072130229e54a651f7b11d7d5a782ae7fb65
but there is an Active Support test case for it, so it's best to
make all 3 implementations respect this behavior.
2024-11-01 13:04:24 +09:00
Jean Boussier
3782600f0f [ruby/json] Emit warnings when dumping binary strings
Because of its Ruby 1.8 heritage, the C extension doesn't care
much about string encodings. We should get stricter over time.
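
One way a caller might avoid the warning is to retag strings that hold UTF-8 bytes but carry the binary encoding before dumping them. This is a sketch only, and the helper name is hypothetical:

```ruby
require 'json'

# Hypothetical helper: return a UTF-8 tagged copy of a binary string,
# refusing bytes that are not valid UTF-8.
def utf8_for_json(str)
  return str unless str.encoding == Encoding::BINARY

  retagged = str.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "not valid UTF-8" unless retagged.valid_encoding?

  retagged
end

binary = "caf\xC3\xA9".b              # UTF-8 bytes tagged as ASCII-8BIT
JSON.generate(utf8_for_json(binary))  # dumps a proper UTF-8 string
```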

42402fc13f
2024-11-01 13:04:24 +09:00
Jean Boussier
cc2e67a138 Elide Generator::State allocation until a to_json method has to be called
Fix: https://github.com/ruby/json/issues/655

For very small documents, the biggest performance gap with alternatives is
that the API imposes that we allocate the `State` object. In a real-world app
this doesn't make much of a difference, but when running in a micro-benchmark
this doubles the allocations, causing twice the amount of GC runs, making us
look bad.

However, unless we have to call a `to_json` method, the `State` object isn't
visible, so with some refactoring, we can elide that allocation entirely.

Instead we allocate the State internal struct on the stack, and if we need
to call a `to_json` method, we allocate the `State` and spill the struct on
the heap.

As a result, `JSON.generate` is now as fast as re-using a `State` instance,
as long as only primitives are generated.
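
To make the "visible" case concrete, here is a small sketch of a custom `to_json` method receiving the state object. The `Point` class is hypothetical and only serves the illustration.

```ruby
require 'json'

class Point
  attr_reader :seen_state

  def initialize(x, y)
    @x = x
    @y = y
  end

  # The generator hands its State object to custom to_json methods,
  # which is the one situation where the State must really exist.
  def to_json(state = nil, *)
    @seen_state = state
    { "x" => @x, "y" => @y }.to_json(state)
  end
end

JSON.generate([1, "two", nil])  # primitives only: no to_json call is needed

point = Point.new(1, 2)
JSON.generate([point])          # '[{"x":1,"y":2}]'
point.seen_state.class          # the generator's State class
```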

Before:
```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   598.654k i/100ms
                json   400.542k i/100ms
                  oj   533.353k i/100ms
Calculating -------------------------------------
        json (reuse)      6.371M (± 8.6%) i/s  (156.96 ns/i) -     31.729M in   5.059195s
                json      4.120M (± 6.6%) i/s  (242.72 ns/i) -     20.828M in   5.090549s
                  oj      5.622M (± 6.4%) i/s  (177.86 ns/i) -     28.268M in   5.061473s

Comparison:
        json (reuse):  6371126.6 i/s
                  oj:  5622452.0 i/s - same-ish: difference falls within error
                json:  4119991.1 i/s - 1.55x  slower

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   248.125k i/100ms
                json   215.255k i/100ms
                  oj   217.531k i/100ms
Calculating -------------------------------------
        json (reuse)      2.628M (± 6.1%) i/s  (380.55 ns/i) -     13.151M in   5.030281s
                json      2.185M (± 6.7%) i/s  (457.74 ns/i) -     10.978M in   5.057655s
                  oj      2.217M (± 6.7%) i/s  (451.10 ns/i) -     11.094M in   5.044844s

Comparison:
        json (reuse):  2627799.4 i/s
                  oj:  2216824.8 i/s - 1.19x  slower
                json:  2184669.5 i/s - 1.20x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   641.334k i/100ms
                json   322.745k i/100ms
                  oj   642.450k i/100ms
Calculating -------------------------------------
        json (reuse)      7.133M (± 6.5%) i/s  (140.19 ns/i) -     35.915M in   5.068201s
                json      4.615M (± 7.0%) i/s  (216.70 ns/i) -     22.915M in   5.003718s
                  oj      6.912M (± 6.4%) i/s  (144.68 ns/i) -     34.692M in   5.047690s

Comparison:
        json (reuse):  7133123.3 i/s
                  oj:  6911977.1 i/s - same-ish: difference falls within error
                json:  4614696.6 i/s - 1.55x  slower
```

After:

```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   572.751k i/100ms
                json   457.741k i/100ms
                  oj   512.247k i/100ms
Calculating -------------------------------------
        json (reuse)      6.324M (± 6.9%) i/s  (158.12 ns/i) -     31.501M in   5.023093s
                json      6.263M (± 6.9%) i/s  (159.66 ns/i) -     31.126M in   5.017086s
                  oj      5.569M (± 6.6%) i/s  (179.56 ns/i) -     27.661M in   5.003739s

Comparison:
        json (reuse):  6324183.5 i/s
                json:  6263204.9 i/s - same-ish: difference falls within error
                  oj:  5569049.2 i/s - same-ish: difference falls within error

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   258.505k i/100ms
                json   242.335k i/100ms
                  oj   220.678k i/100ms
Calculating -------------------------------------
        json (reuse)      2.589M (± 9.6%) i/s  (386.17 ns/i) -     12.925M in   5.071853s
                json      2.594M (± 6.6%) i/s  (385.46 ns/i) -     13.086M in   5.083035s
                  oj      2.250M (± 2.3%) i/s  (444.43 ns/i) -     11.255M in   5.004707s

Comparison:
        json (reuse):  2589499.6 i/s
                json:  2594321.0 i/s - same-ish: difference falls within error
                  oj:  2250064.0 i/s - 1.15x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   656.373k i/100ms
                json   644.135k i/100ms
                  oj   650.283k i/100ms
Calculating -------------------------------------
        json (reuse)      7.202M (± 7.1%) i/s  (138.84 ns/i) -     36.101M in   5.051438s
                json      7.278M (± 1.7%) i/s  (137.40 ns/i) -     36.716M in   5.046300s
                  oj      7.036M (± 1.7%) i/s  (142.12 ns/i) -     35.766M in   5.084729s

Comparison:
        json (reuse):  7202447.9 i/s
                json:  7277883.0 i/s - same-ish: difference falls within error
                  oj:  7036115.2 i/s - same-ish: difference falls within error

```
2024-11-01 13:04:24 +09:00
Jean Boussier
7daa1083c9 [ruby/json] Move State#configure back into C
While less nice, this opens the door to eliding the State object
allocation when possible.

5c0d428d4c
2024-11-01 13:04:24 +09:00
Jean Boussier
5dc3b15b3c [ruby/json] generator.c: store pretty strings in VALUE
Given we expect these to almost always be null, we might as
well keep them in RString.

And even when provided, assuming we're passed frozen strings
we'll save on copying them.

This also reduces the size of the struct from 112B to 72B.

6382c231b0
2024-11-01 13:04:24 +09:00
Jean Boussier
59eebeca02 [ruby/json] Allocate the initial generator buffer on the stack
Ref: https://github.com/ruby/json/issues/655
Followup: https://github.com/ruby/json/issues/657

Assuming the generator might be used for fairly small documents,
we can start with a reasonably sized buffer on the stack, and if
we outgrow it, we can spill to the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use cases for fairly small JSON documents in real-world
scenarios, so thrashing the GC less in such cases makes sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```

fe607f4806
2024-11-01 13:04:24 +09:00
Jean Boussier
d329896fb5 [ruby/json] Fix a memory leak in #to_json methods
Fix: https://github.com/ruby/json/issues/460

The various `to_json` methods must rescue exceptions
to free the buffer.

```ruby
require 'json'

data = 10_000.times.to_a << BasicObject.new
20.times do
  100.times do
    begin
      data.to_json
    rescue NoMethodError
    end
  end
  puts `ps -o rss= -p #{$$}`
end
```

```
 20128
 24992
 29920
 34672
 39600
 44336
 49136
 53936
 58816
 63616
 68416
 73232
 78032
 82896
 87696
 92528
 97408
102208
107008
111808
```

d227d225ca
2024-11-01 13:04:24 +09:00
Jean Boussier
f2e51146f8 [ruby/json] Remove dead cases from convert_UTF8_to_* functions
d54063a790
2024-10-30 10:13:49 +09:00
Jean Boussier
5d176436ce [ruby/json] Allocate the FBuffer struct on the stack
Ref: https://github.com/ruby/json/issues/655

The actual buffer is still on the heap, but this saves a pair
of malloc/free.

This helps a lot on micro-benchmarks.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   531.598k i/100ms
          JSON reuse   417.666k i/100ms
Calculating -------------------------------------
                  Oj      5.735M (± 1.3%) i/s  (174.35 ns/i) -     28.706M in   5.005900s
          JSON reuse      4.604M (± 1.4%) i/s  (217.18 ns/i) -     23.389M in   5.080779s

Comparison:
                  Oj:  5735475.6 i/s
          JSON reuse:  4604380.3 i/s - 1.25x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

Bench:

```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end
```

72110f7992
2024-10-30 10:13:48 +09:00
Jean Boussier
8018a3121f [ruby/json] Workaround being loaded alongside a different json_pure version
Fix: https://github.com/ruby/json/issues/646

Since both `json` and `json_pure` expose the same files, if the
versions don't match, the native extension may be loaded with Ruby
code that doesn't match and is incompatible.

By doing the `require json/ext/generator/state` from C we ensure
we're at least loading that.

But this is a dirty workaround for the 2.7.x branch, we should
find a better way to fully isolate the two gems.

dfdd4acf36
2024-10-26 18:44:15 +09:00
Jean Boussier
a5bd0c638a [ruby/json] Workaround rubygems $LOAD_PATH bug
Ref: https://github.com/ruby/json/issues/647
Ref: https://github.com/rubygems/rubygems/pull/6490

Older rubygems are executing `extconf.rb` with a broken `$LOAD_PATH`
causing the `json` gem native extension to be loaded with the stdlib
version of the `.rb` files.

This fails with

```
json/common.rb:82:in `initialize': wrong number of arguments (given 1, expected 0) (ArgumentError)
```

Since this is just for `extconf.rb` we can probably just accept that
extra argument and ignore it.

The bug was fixed in rubygems 3.4.9 / 2023-03-20

1f5e849fe0
2024-10-26 18:44:15 +09:00
Jean Boussier
bfdf02ea72 pretty_generate: don't apply object_nl / array_nl for empty containers
Fix: https://github.com/ruby/json/issues/437

Before:

```json
{
  "foo": {
  },
  "bar": [
  ]
}
```

After:

```json
{
  "foo": {},
  "bar": []
}
```
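
To try this locally, a snippet along these lines should do. Which of the two forms above gets printed depends on whether the installed `json` version includes this change.

```ruby
require 'json'

doc = { "foo" => {}, "bar" => [] }
pretty = JSON.pretty_generate(doc)
puts pretty

# Either form parses back to the same structure.
JSON.parse(pretty) == doc
```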
2024-10-26 18:44:15 +09:00
Jean Boussier
fc9f0cb8c5 [ruby/json] JSON.dump / String#to_json: raise on invalid encoding
This regressed since 2.7.2.
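
A small sketch of the behavior being restored, assuming a `json` version where the check is in place. The helper name is hypothetical.

```ruby
require 'json'

# Hypothetical helper: return the JSON text, or the error that was raised.
def dump_or_error(str)
  JSON.dump(str)
rescue JSON::GeneratorError => e
  e
end

dump_or_error("ok")    # "\"ok\""
dump_or_error("\xFF")  # a JSON::GeneratorError: 0xFF is never valid UTF-8
```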

35407d6635
2024-10-26 18:44:15 +09:00
Jean Boussier
a052d96103 [ruby/json] Compile with std=c99
d4968d2e48
2024-10-26 18:44:15 +09:00
Jean Boussier
cbd933bcf1 [ruby/json] convert_UTF8_to_ASCII_only_JSON: apply the same optimization pass
42edaf7f17
2024-10-26 18:44:15 +09:00
Jean Boussier
e52b47680e [ruby/json] Reduce encoding benchmark size
Profiling revealed that we were spending lots of time growing the buffer.
Buffer operations are definitely something we want to optimize, but for
this specific benchmark what we're interested in is UTF-8 scanning performance.

Each iteration of the two scanning benchmarks was producing 20MB of JSON;
now they only produce 5MB.

Now:

```
== Encoding mostly utf8 (5001001 bytes)
ruby 3.4.0dev (2024-10-18T19:01:45Z master 7be9a333ca) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    35.000 i/100ms
                  oj    36.000 i/100ms
           rapidjson    10.000 i/100ms
Calculating -------------------------------------
                json    359.161 (± 1.4%) i/s    (2.78 ms/i) -      1.820k in   5.068542s
                  oj    359.699 (± 0.6%) i/s    (2.78 ms/i) -      1.800k in   5.004291s
           rapidjson     99.687 (± 2.0%) i/s   (10.03 ms/i) -    500.000 in   5.017321s

Comparison:
                json:      359.2 i/s
                  oj:      359.7 i/s - same-ish: difference falls within error
           rapidjson:       99.7 i/s - 3.60x  slower
```

1a338532d2
2024-10-26 18:44:15 +09:00
Jean Boussier
97713ac952 [ruby/json] convert_UTF8_to_JSON: repurpose the escape tables into size tables
Since we're looking up the table anyway, we might as well store the
UTF-8 char length in it. For single byte characters that don't need
escaping we store `0`.
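
The idea can be sketched in Ruby as follows. This is an illustration rather than the C table itself: the constant and method names are hypothetical, and the values stored for escaped ASCII and for invalid bytes are assumptions.

```ruby
# Hypothetical size table indexed by byte value.
SIZE_TABLE = Array.new(256) do |byte|
  if byte < 0x20 || byte == 0x22 || byte == 0x5C
    1   # ASCII that must be escaped (control chars, '"', '\')
  elsif byte < 0x80
    0   # plain ASCII: nothing to escape
  elsif byte.between?(0xC2, 0xDF)
    2   # lead byte of a 2-byte character
  elsif byte.between?(0xE0, 0xEF)
    3   # lead byte of a 3-byte character
  elsif byte.between?(0xF0, 0xF4)
    4   # lead byte of a 4-byte character
  else
    1   # continuation or invalid lead byte (assumption: treated as one byte)
  end
end.freeze

# Find character boundaries using only table lookups.
def char_lengths(str)
  lengths = []
  pos = 0
  while pos < str.bytesize
    size = SIZE_TABLE[str.getbyte(pos)]
    size = 1 if size.zero?
    lengths << size
    pos += size
  end
  lengths
end

char_lengths("a\u00E9\u20AC") # [1, 2, 3]
```

A single lookup thus answers both "does this byte need attention?" and "how many bytes does the character span?".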

This helps on strings with lots of multi-byte characters:

Before:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     6.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     67.978 (± 1.5%) i/s   (14.71 ms/i) -    342.000 in   5.033062s
                  oj    100.876 (± 2.0%) i/s    (9.91 ms/i) -    510.000 in   5.058080s
           rapidjson     26.389 (± 7.6%) i/s   (37.89 ms/i) -    132.000 in   5.027681s

Comparison:
                json:       68.0 i/s
                  oj:      100.9 i/s - 1.48x  faster
           rapidjson:       26.4 i/s - 2.58x  slower
```

After:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     7.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     75.187 (± 2.7%) i/s   (13.30 ms/i) -    378.000 in   5.030111s
                  oj     95.196 (± 2.1%) i/s   (10.50 ms/i) -    480.000 in   5.043565s
           rapidjson     25.969 (± 3.9%) i/s   (38.51 ms/i) -    130.000 in   5.011471s

Comparison:
                json:       75.2 i/s
                  oj:       95.2 i/s - 1.27x  faster
           rapidjson:       26.0 i/s - 2.90x  slower
```

51e2631d1f
2024-10-26 18:44:15 +09:00
Jean Boussier
9f300d0541 [ruby/json] Optimize convert_UTF8_to_JSON for mostly ASCII strings
If we assume that even UTF-8 strings are mostly ASCII, we can implement a
fast path for the ASCII parts.

Before:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json     5.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     49.403 (± 2.0%) i/s   (20.24 ms/i) -    250.000 in   5.062647s
                  oj    100.120 (± 2.0%) i/s    (9.99 ms/i) -    504.000 in   5.035349s
           rapidjson     26.404 (± 0.0%) i/s   (37.87 ms/i) -    132.000 in   5.001025s

Comparison:
                json:       49.4 i/s
                  oj:      100.1 i/s - 2.03x  faster
           rapidjson:       26.4 i/s - 1.87x  slower
```

After:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    10.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     95.686 (± 2.1%) i/s   (10.45 ms/i) -    480.000 in   5.018575s
                  oj     96.875 (± 2.1%) i/s   (10.32 ms/i) -    486.000 in   5.019097s
           rapidjson     26.260 (± 3.8%) i/s   (38.08 ms/i) -    132.000 in   5.033151s

Comparison:
                json:       95.7 i/s
                  oj:       96.9 i/s - same-ish: difference falls within error
           rapidjson:       26.3 i/s - 3.64x  slower
```

f8166c2d7f
2024-10-26 18:44:15 +09:00
Jean Boussier
618085f48d [ruby/json] Get rid of the remaining tabs.
1a9af430d2
2024-10-26 18:44:15 +09:00
Peter Zhu
48899d56a9 [ruby/json] Sync changes
Some changes were missed in the automatic sync.
2024-10-17 21:07:54 +02:00
Peter Zhu
e4330536d2 [ruby/json] Fix State#max_nesting=
Returning state->max_nesting is not valid because it's not a Ruby object.

6679ceb
2024-10-17 13:39:48 -04:00
Jean Boussier
a7317f53e0 Add a fast path for ASCII strings
This optimization is based on a few assumptions:

  - Most strings are ASCII only.
  - Most strings had their coderange scanned already.

If the above is true, then by checking the string coderange, we can
use a much more streamlined function to encode ASCII strings.
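
At the Ruby level, the closest analogue of the coderange check is `String#ascii_only?`. The following sketch (hypothetical names, not the C implementation) picks a simpler escaping routine when the string is known to be ASCII and defers to the general path otherwise:

```ruby
require 'json'

ASCII_ESCAPES = {
  '"' => '\"', "\\" => '\\\\', "\b" => '\b', "\f" => '\f',
  "\n" => '\n', "\r" => '\r', "\t" => '\t'
}.freeze

# Hypothetical encoder with an ASCII-only fast path.
def encode_string(str)
  if str.ascii_only?
    # Streamlined path: no multi-byte characters to worry about.
    escaped = str.gsub(/[\x00-\x1f"\\]/) do |char|
      ASCII_ESCAPES.fetch(char) { format('\u%04x', char.ord) }
    end
    "\"#{escaped}\""
  else
    JSON.generate(str) # general path
  end
end

encode_string("plain")   # "\"plain\""
encode_string("a\"b\n")  # the quote and the newline are escaped
```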

Before:

```
== Encoding twitter.json (466906 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   140.000 i/100ms
                  oj   230.000 i/100ms
           rapidjson   108.000 i/100ms
Calculating -------------------------------------
                json      1.464k (± 1.4%) i/s  (682.83 μs/i) -      7.420k in   5.067573s
                  oj      2.338k (± 1.5%) i/s  (427.64 μs/i) -     11.730k in   5.017336s
           rapidjson      1.075k (± 1.6%) i/s  (930.40 μs/i) -      5.400k in   5.025469s

Comparison:
                json:     1464.5 i/s
                  oj:     2338.4 i/s - 1.60x  faster
           rapidjson:     1074.8 i/s - 1.36x  slower

```

After:

```
== Encoding twitter.json (466906 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   189.000 i/100ms
                  oj   228.000 i/100ms
           rapidjson   108.000 i/100ms
Calculating -------------------------------------
                json      1.903k (± 1.2%) i/s  (525.55 μs/i) -      9.639k in   5.066521s
                  oj      2.306k (± 1.3%) i/s  (433.71 μs/i) -     11.628k in   5.044096s
           rapidjson      1.069k (± 2.4%) i/s  (935.38 μs/i) -      5.400k in   5.053794s

Comparison:
                json:     1902.8 i/s
                  oj:     2305.7 i/s - 1.21x  faster
           rapidjson:     1069.1 i/s - 1.78x  slower
```
2024-10-17 15:21:34 +00:00
Jean Boussier
df48f597cf [ruby/json] Get rid of some more outdated compatibility code
All these macros are available on Ruby 2.3+

227885f460
2024-10-17 13:02:13 +00:00
Jean Boussier
6105bae331 [ruby/json] Get rid of compatibility code for older rubies
All of these are for rubies older than 2.3.

811297f86a
2024-10-17 12:22:16 +00:00
Jean Boussier
a1c420c740 [ruby/json] generator.c: reduce the number of globals
Most of these classes and modules don't need to be global variables.

b783445ec9
2024-10-17 11:35:32 +00:00
Jean Boussier
43e08133c3 [ruby/json] Convert Generator initialize and configure method into Ruby
This helps very marginally with allocation speed.

25db79dfaa
2024-10-17 11:35:32 +00:00
Yusuke Endoh
233f63c7fb [ruby/json] Use RB_ENCODING_GET instead of rb_enc_get to improve performance
This speeds up `JSON.generate` by about 12% in a benchmark.

4329e30826
2024-10-17 08:54:48 +00:00
Yusuke Endoh
0b4257efa3 [ruby/json] Apply RB_UNLIKELY for less frequently used options
This speeds up `JSON.generate` by about 4% in a benchmark.

6471710cfc
2024-10-17 08:54:47 +00:00
Yusuke Endoh
64c24f6971 [ruby/json] Stop prebuilding object_delim2
Also, remove static functions that are no longer used.

This speeds up `JSON.generate` by about 5% in a benchmark.

4c984b2017
2024-10-17 08:54:47 +00:00
Yusuke Endoh
186e77209e [ruby/json] Stop prebuilding object_delim
This speeds up `JSON.generate` by about 4% in a benchmark.

ed47a10e4f
2024-10-17 08:54:47 +00:00
Yusuke Endoh
88719fb300 [ruby/json] Stop prebuilding array_delim
The purpose of this change is to exploit `fbuffer_append_char` that is
faster than `fbuffer_append`.

`array_delim` was a buffer that concatenated a single comma with
`array_nl`. However, in the typical use case (`JSON.generate(data)`),
`array_nl` is empty. This means that `array_delim` was a
single-character buffer in many cases.

`fbuffer_append(buffer, array_delim)` used `memcpy` to copy one byte,
which was not so efficient.
Rather, this change uses `fbuffer_append_char(buffer, ',')` and then
`fbuffer_append(buffer, array_nl)` only when `array_nl` is not NULL.
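
The control flow described above, transposed to Ruby as a simplified sketch. The method name is hypothetical, the items are assumed to be already-serialized fragments, and indentation handling is left out.

```ruby
# Hypothetical analogue of the array-writing loop. `array_nl` is nil
# unless a newline string was configured (e.g. when pretty-printing).
def write_array(buffer, items, array_nl: nil)
  buffer << "["
  items.each_with_index do |item, index|
    if index > 0
      buffer << ","                   # single-character append
      buffer << array_nl if array_nl  # only when a newline string is set
    end
    buffer << item
  end
  buffer << "]"
end

write_array(+"", %w[1 2 3])                  # "[1,2,3]"
write_array(+"", %w[1 2 3], array_nl: "\n")  # "[1,\n2,\n3]"
```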

This speeds up `JSON.generate` by about 9% in a benchmark.

445de6e459
2024-10-17 08:54:46 +00:00
Yusuke Endoh
fb84aa5501 [ruby/json] Directly use generate_json_string for object keys
... instead of `generate_json`.

Since the object key is already confirmed to be a string, using a
generic dispatch function brings an unnecessary overhead.

This speeds up `JSON.generate` by about 3% in a benchmark.

e125072130
2024-10-17 08:54:46 +00:00
Yusuke Endoh
3911189fba [ruby/json] Use efficient object-type dispatching
Dispatching based on Ruby's VALUE structure is more efficient than
simply cascaded "if ... else if ..." checks.

This speeds up `JSON.generate` by about 5% in a benchmark.

4f9180debb
2024-10-17 08:54:45 +00:00
Yusuke Endoh
7962b4c342 [ruby/json] Use RARRAY_AREF instead of rb_ary_entry to improve performance
It is safe to use `RARRAY_AREF` here because no Ruby code is executed
between `RARRAY_LEN` and `RARRAY_AREF`.

This speeds up `JSON.generate` by about 4% in a benchmark.

c5d80f9fd4
2024-10-17 08:54:45 +00:00
Jean Boussier
613694734e [ruby/json] generator.c: better fix for comparison of integers of different signs
c372dc9268
2024-10-08 12:22:42 +00:00
Hiroshi SHIBATA
c684164534 Fixed C23 compilation error with ruby/ruby master 2024-10-08 14:10:05 +09:00
Jean Boussier
ea9d34082f [ruby/json] Fix compilation warning
```
generator.c:69:27: warning: comparison of integers of different signs: 'short' and 'unsigned long' [-Wsign-compare]
            for (i = 1; i < ch_len; i++) {
```

ff8edcd47c
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
934d67b415 [ruby/json] generator.c: Optimize by combining calls to fbuffer_append
62301c0bc3
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
74d459fd52 [ruby/json] Adjust to the CVTUTF code being gone
I, Luke T. Shumaker, am the sole author of the added code.

I did not reference CVTUTF when writing it.  I did reference the
Unicode standard (15.0.0), the Wikipedia article on UTF-8, and the
Wikipedia article on UTF-16.  When I saw some tests fail, I did
reference the old deleted code (but a JSON-specific part, inherently
not as based on CVTUTF) to determine that script_safe should also
escape U+2028 and U+2029.

I targeted simplicity and clarity when writing the code--it can likely
be optimized.  In my mind, the obvious next optimization is to have it
combine contiguous non-escaped characters into just one call to
fbuffer_append(), instead of calling fbuffer_append() for each
character.
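
The optimization described in the previous paragraph can be sketched in Ruby like this (hypothetical names; `<<` plays the role of `fbuffer_append()`): unescaped bytes are accumulated into a run and flushed with one append whenever an escape is needed.

```ruby
SHORT_ESCAPES = {
  0x22 => '\"', 0x5C => '\\\\', 0x08 => '\b', 0x0C => '\f',
  0x0A => '\n', 0x0D => '\r', 0x09 => '\t'
}.freeze

# Hypothetical escaper that appends contiguous unescaped bytes in one call.
def escape_with_runs(str)
  src = str.b # work on raw bytes
  out = String.new('"', encoding: Encoding::BINARY)
  run_start = 0

  src.each_byte.with_index do |byte, pos|
    next unless byte < 0x20 || SHORT_ESCAPES.key?(byte)

    # Flush the pending run of unescaped bytes with a single append.
    out << src.byteslice(run_start, pos - run_start) if pos > run_start
    out << (SHORT_ESCAPES[byte] || format('\u%04x', byte))
    run_start = pos + 1
  end

  # Flush whatever remains after the last escape.
  out << src.byteslice(run_start, src.bytesize - run_start) if run_start < src.bytesize
  out << '"'
  out.force_encoding(Encoding::UTF_8)
end

escape_with_runs("h\u00E9llo \"w\"\n")
```

Bytes of multi-byte UTF-8 characters are all 0x80 or above, so they never trigger an escape here and are copied as part of a run.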

Regarding the use of the "modern" types `uint32_t`, `uint16_t`, and
`bool`:
 - ruby.h is guaranteed to give us uint32_t and uint16_t.
 - Since Ruby 3.0.0, ruby.h is guaranteed to give us bool... but we
   support down to Ruby 2.3.  But, ruby.h is guaranteed to give us
   HAVE_STDBOOL_H for the C99 stdbool.h; so use that to include
   stdbool.h if we can, and if not then fall back to a copy of the
   same bool definition that Ruby 3.0.5 uses with C89.

c96351f874
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
6e47968929 [ruby/json] Delete code that is based on CVTUTF
I did this based on manual inspection, comparing the code to my re-created
history of CVTUTF at https://git.lukeshu.com/2git/cvtutf/ (created by the
scripts at https://git.lukeshu.com/2git/cvtutf-make/)

0819553144
2024-10-08 14:10:05 +09:00