archive/ruby - Eplg Git: Free And Private Git Hosting

mirror of https://github.com/ruby/ruby.git synced 2025-08-24 13:34:17 +02:00

Author	SHA1	Message	Date
Jean Boussier	4dde7101c7	Refactor jeaiii-ltoa.h Some relatively minor changes to make the library more in line with the gem. Some renaming, etc.	2025-03-27 13:54:12 +09:00
eno	d1f3c81258	Faster integer formatting This commit provides an alternative implementation for a long → decimal conversion. The main difference is that it uses an algorithm pulled from https://github.com/jeaiii/itoa. The source there is C++; it was converted by hand to C for inclusion with this gem. jeaiii's algorithm is covered by the MIT License, see source code. In addition, this version now also generates the string directly into the fbuffer, forgoing the need to run a separate memory copy. As a result, I see a speedup of 32% on Apple Silicon M1 for an integer set of benchmarks.	2025-03-27 11:37:27 +09:00
Hiroshi SHIBATA	6b15857e25	Removed trailing space	2025-03-24 15:20:59 +09:00
Jean Boussier	f3f4524d19	Reorganize `fpconv` vendoring Make it a single file and declare the dependency.	2025-03-24 14:49:44 +09:00
eno	528c08cc5f	[ruby/json] Adjust fpconv to add ".0" to integers Adds a test case fix `fa5bdf87cb`	2025-03-24 14:35:04 +09:00
eno	a59333c58b	[ruby/json] Faster float formatting This commit provides an alternative implementation for a float → decimal conversion. It integrates a C implementation of Florian Loitsch's Grisu algorithm [[pdf]](http://florian.loitsch.com/publications/dtoa-pldi2010.pdf), extracted from https://github.com/night-shift/fpconv. The relevant files are added in this PR; they are, as is all of https://github.com/night-shift/fpconv, available under an MIT License. As a result, I see a speedup of 900% on Apple Silicon M1 for a float set of benchmarks. Floats don't have a single correct string representation: a float like `1000.0` can be represented as "1000", "1e3", "1000.0" (and more). The Grisu algorithm converts floating point numbers to an optimal decimal string representation without loss of precision. As a result, a float that is exactly an integer (like `Float(10)`) will be converted by that algorithm into `"10"`. While technically correct – the JSON format treats floats and integers identically –, this differs from the current behaviour of the `"json"` gem. To address this, the integration checks for that case, and explicitly adds a ".0" suffix in those cases. This is sufficient to meet all existing tests; there is, however, a chance that the current implementation and this implementation occasionally encode floats differently.
``` == Encoding floats (4179311 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json (local) 4.000 i/100ms Calculating ------------------------------------- json (local) 46.046 (± 2.2%) i/s (21.72 ms/i) - 232.000 in 5.039611s Normalize to 2090234 byte == Encoding floats (4179242 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json (2.10.2) 1.000 i/100ms Calculating ------------------------------------- json (2.10.2) 4.614 (± 0.0%) i/s (216.74 ms/i) - 24.000 in 5.201871s ``` These benchmarks are run via a script ([link](https://gist.github.com/radiospiel/04019402726a28b31616df3d0c17bd1c)) which is based on the gem's `benchmark/encoder.rb` file. There are probably better ways to run benchmarks :) My version allows to combine multiple test cases into a single one. The `dumps` benchmark, which covers the JSON files in `benchmark/data/*.json` – with the exception of `canada.json` – , reported a minor speedup within statistical uncertainty. `7d77415108`	2025-03-24 14:35:04 +09:00
Jean Boussier	293ad8a4e9	Fix a compatibility issue with `MultiJson.dump(obj, pretty: true)` Fix: https://github.com/ruby/json/issues/748 `MultiJson` passes `State#to_h` as options, and the `as_json` property defaults to `false`, but `false` wasn't accepted by the constructor.	2025-02-12 13:15:01 +09:00
Étienne Barrié	b4bfbcaddc	Optimize Symbol generation in strict mode Co-authored-by: Jean Boussier <jean.boussier@gmail.com>	2025-02-06 16:02:03 +09:00
Étienne Barrié	f865148e19	Fix JSON::Coder to call as_json proc for NaN and Infinity Co-authored-by: Jean Boussier <jean.boussier@gmail.com>	2025-02-06 16:02:03 +09:00
Jean Boussier	98c56de823	[ruby/json] Refactor further to expose the simplest escape search possible `e03515ac8b`	2025-02-03 10:05:26 +09:00
Jean Boussier	98e1c2845a	[ruby/json] Refactor convert_UTF8_to_JSON to split searching and escaping code The goal is to be able to dispatch to more optimized search implementations without having to duplicate the escaping code. Somehow, this is a few % faster already: ``` == Encoding activitypub.json (52595 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 2.257k i/100ms Calculating ------------------------------------- after 22.930k (± 1.3%) i/s (43.61 μs/i) - 115.107k in 5.020814s Comparison: before: 21604.0 i/s after: 22930.1 i/s - 1.06x faster == Encoding citm_catalog.json (500298 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 137.000 i/100ms Calculating ------------------------------------- after 1.397k (± 1.1%) i/s (715.57 μs/i) - 6.987k in 5.000408s Comparison: before: 1344.4 i/s after: 1397.5 i/s - 1.04x faster == Encoding twitter.json (466906 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 249.000 i/100ms Calculating ------------------------------------- after 2.464k (± 1.8%) i/s (405.81 μs/i) - 12.450k in 5.054131s Comparison: before: 2326.5 i/s after: 2464.2 i/s - 1.06x faster ``` `8fb5ae807f`	2025-02-03 10:05:25 +09:00
Étienne Barrié	89e316ad06	Introduce JSON::Coder Co-authored-by: Jean Boussier <jean.boussier@gmail.com>	2025-01-28 15:41:47 +09:00
Étienne Barrié	e8676cada8	[ruby/json] Introduce JSON::Fragment `9e3500f345` Co-authored-by: Jean Boussier <jean.boussier@gmail.com>	2025-01-20 14:20:55 +01:00
Étienne Barrié	f301383cdd	Remove Generator::State#_generate Co-authored-by: Jean Boussier <jean.boussier@gmail.com>	2025-01-14 12:24:37 +09:00
Jean Boussier	f756950d82	Improve lookup tables for string escaping. Introduce a simplified table for the most common case, which is `script_safe: false, ascii_only: false`. On the `script_safe` table, now only `0xE2` does a multi-byte check. Merge back `convert_ASCII_to_JSON`, as it no longer helps much with the simplified escape table. ``` == Encoding mixed utf8 (5003001 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 38.000 i/100ms Calculating ------------------------------------- after 398.220 (± 3.0%) i/s (2.51 ms/i) - 2.014k in 5.061659s Comparison: before: 381.8 i/s after: 398.2 i/s - same-ish: difference falls within error == Encoding mostly utf8 (5001001 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 39.000 i/100ms Calculating ------------------------------------- after 393.337 (± 2.5%) i/s (2.54 ms/i) - 1.989k in 5.059397s Comparison: before: 304.3 i/s after: 393.3 i/s - 1.29x faster == Encoding twitter.json (466906 bytes) ruby 3.4.1 (2024-12-25 revision `48d4efcb85`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- after 244.000 i/100ms Calculating ------------------------------------- after 2.436k (± 0.9%) i/s (410.43 μs/i) - 12.200k in 5.007702s Comparison: before: 2125.9 i/s after: 2436.5 i/s - 1.15x faster ```	2025-01-07 13:21:46 +09:00
Jean Boussier	1510d72bec	[ruby/json] Fix generate(script_safe: true) to not confuse unrelated characters Fix: https://github.com/ruby/json/issues/715 The first byte check was missing. `93a7f8717d`	2024-12-05 09:16:22 +01:00
Yusuke Endoh	209f8ba7c4	[ruby/json] Prevent a warning of "a candidate for gnu_printf format attribute" GCC 13 prints the following warning. `20241127`T001003Z.log.html.gz ``` compiling generator.c generator.c: In function ‘raise_generator_error’: generator.c:91:5: warning: function ‘raise_generator_error’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format] 91 \| VALUE str = rb_vsprintf(fmt, args); \| ^~~~~ ``` This change prevents the warning by specifying the format attribute. `b8c1490846`	2024-11-27 23:35:20 +09:00
Jean Boussier	693a793521	JSON::GeneratorError expose invalid object Fix: https://github.com/ruby/json/issues/710 Makes it easier to debug why a given tree of objects can't be dumped as JSON. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>	2024-11-26 15:11:05 +09:00
Jean Boussier	ee0de3fd4e	[ruby/json] JSON.dump: write directly into the provided IO Ref: https://github.com/ruby/json/issues/524 Write to the IO directly rather than buffering everything in memory. Unfortunately Ruby doesn't provide an API to write into an IO without first allocating a string, which is a bit wasteful. `f017af6c0a`	2024-11-26 15:11:05 +09:00
Nobuyoshi Nakada	29d76d8c8b	[ruby/json] Fix right shift warnings Ignoring `CHAR_BITS` > 8 platform, as far as `ch` indexes `escape_table` that is hard-coded as 256 elements. ``` ../../../../src/ext/json/generator/generator.c(121): warning C4333: '>>': right shift by too large amount, data loss ../../../../src/ext/json/generator/generator.c(122): warning C4333: '>>': right shift by too large amount, data loss ../../../../src/ext/json/generator/generator.c(243): warning C4333: '>>': right shift by too large amount, data loss ../../../../src/ext/json/generator/generator.c(244): warning C4333: '>>': right shift by too large amount, data loss ../../../../src/ext/json/generator/generator.c(291): warning C4333: '>>': right shift by too large amount, data loss ../../../../src/ext/json/generator/generator.c(292): warning C4333: '>>': right shift by too large amount, data loss ``` `fb82373612`	2024-11-06 23:31:30 +01:00
Jean Boussier	ca8f21ace8	[ruby/json] Resync	2024-11-05 18:00:36 +01:00
Jean Boussier	f664e7eaab	[ruby/json] Add tests for the behavior of JSON.generate with base types subclasses Ref: https://github.com/ruby/json/pull/674 Ref: https://github.com/ruby/json/pull/668 The behavior in such cases is quite unclear; the goal here is to figure out what the behavior was on the Cext version of `json 2.7.0` and get all implementations to converge. We can then decide to make them all behave differently if we so wish. `614921dcef`	2024-11-05 18:00:36 +01:00
Jean Boussier	2f84a02ad5	[ruby/json] Use rb_str_new_frozen `90c8aaaa6a`	2024-11-05 18:00:36 +01:00
Jean Boussier	b85a7a44fa	[ruby/json] Trigger write barrier when setting Generator::State configs Followup: `6382c231b0` `0c797b4a11`	2024-11-01 13:04:24 +09:00
Jean Boussier	ef5565f5d1	JSON.generate: call to_json on String subclasses Fix: https://github.com/ruby/json/issues/667 This is yet another behavior on which the various implementations differed, but the C implementation used to call `to_json` on String subclasses used as keys. This was optimized out in e125072130229e54a651f7b11d7d5a782ae7fb65 but there is an Active Support test case for it, so it's best to make all 3 implementations respect this behavior.	2024-11-01 13:04:24 +09:00
Jean Boussier	3782600f0f	[ruby/json] Emit warnings when dumping binary strings Because of its Ruby 1.8 heritage, the C extension doesn't care much about string encodings. We should get stricter over time. `42402fc13f`	2024-11-01 13:04:24 +09:00
Jean Boussier	cc2e67a138	Elide Generator::State allocation until a `to_json` method has to be called Fix: https://github.com/ruby/json/issues/655 For very small documents, the biggest performance gap with alternatives is that the API imposes that we allocate the `State` object. In a real world app this doesn't make much of a difference, but when running in a micro-benchmark this doubles the allocations, causing twice the amount of GC runs, making us look bad. However, unless we have to call a `to_json` method, the `State` object isn't visible, so with some refactoring, we can elide that allocation entirely. Instead we allocate the State internal struct on the stack, and if we need to call a `to_json` method, we allocate the `State` and spill the struct on the heap. As a result, `JSON.generate` is now as fast as re-using a `State` instance, as long as only primitives are generated. Before: ``` == Encoding small mixed (34 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 598.654k i/100ms json 400.542k i/100ms oj 533.353k i/100ms Calculating ------------------------------------- json (reuse) 6.371M (± 8.6%) i/s (156.96 ns/i) - 31.729M in 5.059195s json 4.120M (± 6.6%) i/s (242.72 ns/i) - 20.828M in 5.090549s oj 5.622M (± 6.4%) i/s (177.86 ns/i) - 28.268M in 5.061473s Comparison: json (reuse): 6371126.6 i/s oj: 5622452.0 i/s - same-ish: difference falls within error json: 4119991.1 i/s - 1.55x slower == Encoding small nested array (121 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 248.125k i/100ms json 215.255k i/100ms oj 217.531k i/100ms Calculating ------------------------------------- json (reuse) 2.628M (± 6.1%) i/s (380.55 ns/i) - 13.151M in 5.030281s json 2.185M (± 6.7%) i/s (457.74 ns/i) - 10.978M in 5.057655s oj 2.217M (± 6.7%) i/s (451.10 ns/i) - 11.094M in 5.044844s Comparison: 
json (reuse): 2627799.4 i/s oj: 2216824.8 i/s - 1.19x slower json: 2184669.5 i/s - 1.20x slower == Encoding small hash (65 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 641.334k i/100ms json 322.745k i/100ms oj 642.450k i/100ms Calculating ------------------------------------- json (reuse) 7.133M (± 6.5%) i/s (140.19 ns/i) - 35.915M in 5.068201s json 4.615M (± 7.0%) i/s (216.70 ns/i) - 22.915M in 5.003718s oj 6.912M (± 6.4%) i/s (144.68 ns/i) - 34.692M in 5.047690s Comparison: json (reuse): 7133123.3 i/s oj: 6911977.1 i/s - same-ish: difference falls within error json: 4614696.6 i/s - 1.55x slower ``` After: ``` == Encoding small mixed (34 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 572.751k i/100ms json 457.741k i/100ms oj 512.247k i/100ms Calculating ------------------------------------- json (reuse) 6.324M (± 6.9%) i/s (158.12 ns/i) - 31.501M in 5.023093s json 6.263M (± 6.9%) i/s (159.66 ns/i) - 31.126M in 5.017086s oj 5.569M (± 6.6%) i/s (179.56 ns/i) - 27.661M in 5.003739s Comparison: json (reuse): 6324183.5 i/s json: 6263204.9 i/s - same-ish: difference falls within error oj: 5569049.2 i/s - same-ish: difference falls within error == Encoding small nested array (121 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 258.505k i/100ms json 242.335k i/100ms oj 220.678k i/100ms Calculating ------------------------------------- json (reuse) 2.589M (± 9.6%) i/s (386.17 ns/i) - 12.925M in 5.071853s json 2.594M (± 6.6%) i/s (385.46 ns/i) - 13.086M in 5.083035s oj 2.250M (± 2.3%) i/s (444.43 ns/i) - 11.255M in 5.004707s Comparison: json (reuse): 2589499.6 i/s json: 2594321.0 i/s - same-ish: difference falls within error oj: 2250064.0 i/s - 1.15x slower == Encoding small hash (65 bytes) ruby 3.3.4 
(2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json (reuse) 656.373k i/100ms json 644.135k i/100ms oj 650.283k i/100ms Calculating ------------------------------------- json (reuse) 7.202M (± 7.1%) i/s (138.84 ns/i) - 36.101M in 5.051438s json 7.278M (± 1.7%) i/s (137.40 ns/i) - 36.716M in 5.046300s oj 7.036M (± 1.7%) i/s (142.12 ns/i) - 35.766M in 5.084729s Comparison: json (reuse): 7202447.9 i/s json: 7277883.0 i/s - same-ish: difference falls within error oj: 7036115.2 i/s - same-ish: difference falls within error ```	2024-11-01 13:04:24 +09:00
Jean Boussier	7daa1083c9	[ruby/json] Move State#configure back into C While less nice, this opens the door to eliding the State object allocation when possible. `5c0d428d4c`	2024-11-01 13:04:24 +09:00
Jean Boussier	5dc3b15b3c	[ruby/json] generator.c: store pretty strings in VALUE Given we expect these to almost always be null, we might as well keep them in RString. And even when provided, assuming we're passed frozen strings we'll save on copying them. This also reduces the size of the struct from 112B to 72B. `6382c231b0`	2024-11-01 13:04:24 +09:00
Jean Boussier	59eebeca02	[ruby/json] Allocate the initial generator buffer on the stack Ref: https://github.com/ruby/json/issues/655 Followup: https://github.com/ruby/json/issues/657 Assuming the generator might be used for fairly small documents we can start with a reasonable buffer size on the stack, and if we outgrow it, we can spill on the heap. In a way this is optimizing for micro-benchmarks, but there are valid use cases for fairly small JSON documents in actual real world scenarios, so thrashing the GC less in such cases makes sense. Before: ``` ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- Oj 518.700k i/100ms JSON reuse 483.370k i/100ms Calculating ------------------------------------- Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s Comparison: Oj: 5722283.8 i/s JSON reuse: 5278061.7 i/s - 1.08x slower ``` After: ``` ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- Oj 517.837k i/100ms JSON reuse 548.871k i/100ms Calculating ------------------------------------- Oj 5.693M (± 1.6%) i/s (175.65 ns/i) - 28.481M in 5.004056s JSON reuse 5.855M (± 1.2%) i/s (170.80 ns/i) - 29.639M in 5.063004s Comparison: Oj: 5692985.6 i/s JSON reuse: 5854857.9 i/s - 1.03x faster ``` `fe607f4806`	2024-11-01 13:04:24 +09:00
Jean Boussier	d329896fb5	[ruby/json] Fix a memory leak in #to_json methods Fix: https://github.com/ruby/json/issues/460 The various `to_json` methods must rescue exceptions to free the buffer. ``` require 'json' data = 10_000.times.to_a << BasicObject.new 20.times do 100.times do begin data.to_json rescue NoMethodError end end puts `ps -o rss= -p #{$$}` end ``` ``` 20128 24992 29920 34672 39600 44336 49136 53936 58816 63616 68416 73232 78032 82896 87696 92528 97408 102208 107008 111808 ``` `d227d225ca`	2024-11-01 13:04:24 +09:00
Jean Boussier	f2e51146f8	[ruby/json] Remove dead cases from convert_UTF8_to_* functions `d54063a790`	2024-10-30 10:13:49 +09:00
Jean Boussier	5d176436ce	[ruby/json] Allocate the FBuffer struct on the stack Ref: https://github.com/ruby/json/issues/655 The actual buffer is still on the heap, but this saves a pair of malloc/free. This helps a lot on micro-benchmarks Before: ``` ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- Oj 531.598k i/100ms JSON reuse 417.666k i/100ms Calculating ------------------------------------- Oj 5.735M (± 1.3%) i/s (174.35 ns/i) - 28.706M in 5.005900s JSON reuse 4.604M (± 1.4%) i/s (217.18 ns/i) - 23.389M in 5.080779s Comparison: Oj: 5735475.6 i/s JSON reuse: 4604380.3 i/s - 1.25x slower ``` After: ``` ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- Oj 518.700k i/100ms JSON reuse 483.370k i/100ms Calculating ------------------------------------- Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s Comparison: Oj: 5722283.8 i/s JSON reuse: 5278061.7 i/s - 1.08x slower ``` Bench: ```ruby require 'benchmark/ips' require 'oj' require 'json' json_encoder = JSON::State.new(JSON.dump_default_options) test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]] Oj.default_options = Oj.default_options.merge(mode: :compat) Benchmark.ips do \|x\| x.config(time: 5, warmup: 2) x.report("Oj") do Oj.dump(test_data) end x.report("JSON reuse") do json_encoder.generate(test_data) end x.compare!(order: :baseline) end ``` `72110f7992`	2024-10-30 10:13:48 +09:00
Jean Boussier	8018a3121f	[ruby/json] Workaround being loaded alongside a different `json_pure` version Fix: https://github.com/ruby/json/issues/646 Since both `json` and `json_pure` expose the same files, if the versions don't match, the native extension may be loaded with Ruby code that doesn't match and is incompatible. By doing the `require json/ext/generator/state` from C we ensure we're at least loading that. But this is a dirty workaround for the 2.7.x branch; we should find a better way to fully isolate the two gems. `dfdd4acf36`	2024-10-26 18:44:15 +09:00
Jean Boussier	a5bd0c638a	[ruby/json] Workaround rubygems $LOAD_PATH bug Ref: https://github.com/ruby/json/issues/647 Ref: https://github.com/rubygems/rubygems/pull/6490 Older rubygems are executing `extconf.rb` with a broken `$LOAD_PATH` causing the `json` gem native extension to be loaded with the stdlib version of the `.rb` files. This fails with ``` json/common.rb:82:in `initialize': wrong number of arguments (given 1, expected 0) (ArgumentError) ``` Since this is just for `extconf.rb` we can probably just accept that extra argument and ignore it. The bug was fixed in rubygems 3.4.9 / 2023-03-20 `1f5e849fe0`	2024-10-26 18:44:15 +09:00
Jean Boussier	bfdf02ea72	pretty_generate: don't apply object_nl / array_nl for empty containers Fix: https://github.com/ruby/json/issues/437 Before: ```json { "foo": { }, "bar": [ ] } ``` After: ```json { "foo": {}, "bar": [] } ```	2024-10-26 18:44:15 +09:00
Jean Boussier	fc9f0cb8c5	[ruby/json] JSON.dump / String#to_json: raise on invalid encoding This regressed since 2.7.2. `35407d6635`	2024-10-26 18:44:15 +09:00
Jean Boussier	a052d96103	[ruby/json] Compile with std=c99 `d4968d2e48`	2024-10-26 18:44:15 +09:00
Jean Boussier	cbd933bcf1	[ruby/json] convert_UTF8_to_ASCII_only_JSON: apply the same optimization pass `42edaf7f17`	2024-10-26 18:44:15 +09:00
Jean Boussier	e52b47680e	[ruby/json] Reduce encoding benchmark size Profiling revealed that we were spending lots of time growing the buffer. Buffer operations are definitely something we want to optimize, but for this specific benchmark what we're interested in is UTF-8 scanning performance. Each iteration of the two scanning benchmarks was producing 20MB of JSON; now they only produce 5MB. Now: ``` == Encoding mostly utf8 (5001001 bytes) ruby 3.4.0dev (2024-10-18T19:01:45Z master `7be9a333ca`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 35.000 i/100ms oj 36.000 i/100ms rapidjson 10.000 i/100ms Calculating ------------------------------------- json 359.161 (± 1.4%) i/s (2.78 ms/i) - 1.820k in 5.068542s oj 359.699 (± 0.6%) i/s (2.78 ms/i) - 1.800k in 5.004291s rapidjson 99.687 (± 2.0%) i/s (10.03 ms/i) - 500.000 in 5.017321s Comparison: json: 359.2 i/s oj: 359.7 i/s - same-ish: difference falls within error rapidjson: 99.7 i/s - 3.60x slower ``` `1a338532d2`	2024-10-26 18:44:15 +09:00
Jean Boussier	97713ac952	[ruby/json] convert_UTF8_to_JSON: repurpose the escape tables into size tables Since we're looking up the table anyway, we might as well store the UTF-8 char length in it. For single byte characters that don't need escaping we store `0`. This helps on strings with lots of multi-byte characters: Before: ``` == Encoding mostly utf8 (20004001 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 6.000 i/100ms oj 10.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 67.978 (± 1.5%) i/s (14.71 ms/i) - 342.000 in 5.033062s oj 100.876 (± 2.0%) i/s (9.91 ms/i) - 510.000 in 5.058080s rapidjson 26.389 (± 7.6%) i/s (37.89 ms/i) - 132.000 in 5.027681s Comparison: json: 68.0 i/s oj: 100.9 i/s - 1.48x faster rapidjson: 26.4 i/s - 2.58x slower ``` After: ``` == Encoding mostly utf8 (20004001 bytes) ruby 3.3.4 (2024-07-09 revision `be1089c8ec`) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 7.000 i/100ms oj 10.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 75.187 (± 2.7%) i/s (13.30 ms/i) - 378.000 in 5.030111s oj 95.196 (± 2.1%) i/s (10.50 ms/i) - 480.000 in 5.043565s rapidjson 25.969 (± 3.9%) i/s (38.51 ms/i) - 130.000 in 5.011471s Comparison: json: 75.2 i/s oj: 95.2 i/s - 1.27x faster rapidjson: 26.0 i/s - 2.90x slower ``` `51e2631d1f`	2024-10-26 18:44:15 +09:00
Jean Boussier	9f300d0541	[ruby/json] Optimize convert_UTF8_to_JSON for mostly ASCII strings If we assume that even UTF-8 strings are mostly ASCII, we can implement a fast path for the ASCII parts. Before: ``` == Encoding mixed utf8 (20012001 bytes) ruby 3.4.0dev (2024-10-18T15:12:54Z master `d1b5c10957`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 5.000 i/100ms oj 9.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 49.403 (± 2.0%) i/s (20.24 ms/i) - 250.000 in 5.062647s oj 100.120 (± 2.0%) i/s (9.99 ms/i) - 504.000 in 5.035349s rapidjson 26.404 (± 0.0%) i/s (37.87 ms/i) - 132.000 in 5.001025s Comparison: json: 49.4 i/s oj: 100.1 i/s - 2.03x faster rapidjson: 26.4 i/s - 1.87x slower ``` After: ``` == Encoding mixed utf8 (20012001 bytes) ruby 3.4.0dev (2024-10-18T15:12:54Z master `d1b5c10957`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 10.000 i/100ms oj 9.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 95.686 (± 2.1%) i/s (10.45 ms/i) - 480.000 in 5.018575s oj 96.875 (± 2.1%) i/s (10.32 ms/i) - 486.000 in 5.019097s rapidjson 26.260 (± 3.8%) i/s (38.08 ms/i) - 132.000 in 5.033151s Comparison: json: 95.7 i/s oj: 96.9 i/s - same-ish: difference falls within error rapidjson: 26.3 i/s - 3.64x slower ``` `f8166c2d7f`	2024-10-26 18:44:15 +09:00
Peter Zhu	48899d56a9	[ruby/json] Sync changes Some changes were missed in the automatic sync.	2024-10-17 21:07:54 +02:00
Peter Zhu	e4330536d2	[ruby/json] Fix State#max_nesting= Returning state->max_nesting is not valid because it's not a Ruby object. `6679ceb`	2024-10-17 13:39:48 -04:00
Jean Boussier	a7317f53e0	Add a fast path for ASCII strings This optimization is based on a few assumptions: - Most strings are ASCII only. - Most strings had their coderange scanned already. If the above is true, then by checking the string coderange, we can use a much more streamlined function to encode ASCII strings. Before: ``` == Encoding twitter.json (466906 bytes) ruby 3.4.0preview2 (2024-10-07 master `32c733f57b`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 140.000 i/100ms oj 230.000 i/100ms rapidjson 108.000 i/100ms Calculating ------------------------------------- json 1.464k (± 1.4%) i/s (682.83 μs/i) - 7.420k in 5.067573s oj 2.338k (± 1.5%) i/s (427.64 μs/i) - 11.730k in 5.017336s rapidjson 1.075k (± 1.6%) i/s (930.40 μs/i) - 5.400k in 5.025469s Comparison: json: 1464.5 i/s oj: 2338.4 i/s - 1.60x faster rapidjson: 1074.8 i/s - 1.36x slower ``` After: ``` == Encoding twitter.json (466906 bytes) ruby 3.4.0preview2 (2024-10-07 master `32c733f57b`) +YJIT +PRISM [arm64-darwin23] Warming up -------------------------------------- json 189.000 i/100ms oj 228.000 i/100ms rapidjson 108.000 i/100ms Calculating ------------------------------------- json 1.903k (± 1.2%) i/s (525.55 μs/i) - 9.639k in 5.066521s oj 2.306k (± 1.3%) i/s (433.71 μs/i) - 11.628k in 5.044096s rapidjson 1.069k (± 2.4%) i/s (935.38 μs/i) - 5.400k in 5.053794s Comparison: json: 1902.8 i/s oj: 2305.7 i/s - 1.21x faster rapidjson: 1069.1 i/s - 1.78x slower ```	2024-10-17 15:21:34 +00:00
Jean Boussier	df48f597cf	[ruby/json] Get rid of some more outdated compatibility code All these macros are available on Ruby 2.3+ `227885f460`	2024-10-17 13:02:13 +00:00
Jean Boussier	a1c420c740	[ruby/json] generator.c: reduce the number of globals Most of these classes and modules don't need to be global variables `b783445ec9`	2024-10-17 11:35:32 +00:00
Jean Boussier	43e08133c3	[ruby/json] Convert Generator initialize and configure method into Ruby This helps very marginally with allocation speed. `25db79dfaa`	2024-10-17 11:35:32 +00:00
Yusuke Endoh	233f63c7fb	[ruby/json] Use `RB_ENCODING_GET` instead of `rb_enc_get` to improve performance This speeds up `JSON.generate` by about 12% in a benchmark. `4329e30826`	2024-10-17 08:54:48 +00:00
Yusuke Endoh	0b4257efa3	[ruby/json] Apply RB_UNLIKELY for less frequently used options This speeds up `JSON.generate` by about 4% in a benchmark. `6471710cfc`	2024-10-17 08:54:47 +00:00
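
Several of the commits above change behavior that is visible from plain Ruby. The following is a minimal sketch, not taken from the repository, that exercises three of them through the public `json` API: the ".0" suffix kept for integral floats (528c08cc5f), `JSON.dump` writing into a provided IO (ee0de3fd4e), and re-using a `JSON::State` instance as the benchmarks do. The sample data and variable names are illustrative assumptions, and `StringIO` merely stands in for a real file or socket.

```ruby
require 'json'
require 'stringio'

# A float that is exactly an integer keeps its ".0" suffix when encoded.
float_json = JSON.generate(10.0)

# JSON.dump accepts an IO-like object as its second argument and writes
# the document into it; StringIO is used here as a stand-in for a file.
io = StringIO.new
JSON.dump({ "a" => [1, 2.5] }, io)
dumped = io.string

# Re-using one State instance across calls, as the "json (reuse)"
# benchmark entries do, avoids allocating a new State per document.
state = JSON::State.new
reused = state.generate([1, "string", { "a" => 1 }])

puts float_json
puts dumped
puts reused
```

Whether re-using a `State` is measurably faster depends on the installed `json` version; per cc2e67a138, plain `JSON.generate` is expected to match it when only primitives are generated.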

140 commits