Fix: https://github.com/ruby/json/issues/655

For very small documents, the biggest performance gap with alternatives is that the API imposes allocating a `State` object. In a real-world app this doesn't make much of a difference, but in a micro-benchmark it doubles the allocations, causing twice as many GC runs and making us look bad.

However, unless we have to call a `to_json` method, the `State` object isn't visible, so with some refactoring we can elide that allocation entirely. Instead we allocate the `State`'s internal struct on the stack, and only if we need to call a `to_json` method do we allocate the `State` object and spill the struct onto the heap.

As a result, `JSON.generate` is now as fast as re-using a `State` instance, as long as only primitives are generated.

Before:

```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   598.654k i/100ms
                json   400.542k i/100ms
                  oj   533.353k i/100ms
Calculating -------------------------------------
        json (reuse)      6.371M (± 8.6%) i/s  (156.96 ns/i) -     31.729M in   5.059195s
                json      4.120M (± 6.6%) i/s  (242.72 ns/i) -     20.828M in   5.090549s
                  oj      5.622M (± 6.4%) i/s  (177.86 ns/i) -     28.268M in   5.061473s

Comparison:
        json (reuse):  6371126.6 i/s
                  oj:  5622452.0 i/s - same-ish: difference falls within error
                json:  4119991.1 i/s - 1.55x  slower

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   248.125k i/100ms
                json   215.255k i/100ms
                  oj   217.531k i/100ms
Calculating -------------------------------------
        json (reuse)      2.628M (± 6.1%) i/s  (380.55 ns/i) -     13.151M in   5.030281s
                json      2.185M (± 6.7%) i/s  (457.74 ns/i) -     10.978M in   5.057655s
                  oj      2.217M (± 6.7%) i/s  (451.10 ns/i) -     11.094M in   5.044844s

Comparison:
        json (reuse):  2627799.4 i/s
                  oj:  2216824.8 i/s - 1.19x  slower
                json:  2184669.5 i/s - 1.20x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   641.334k i/100ms
                json   322.745k i/100ms
                  oj   642.450k i/100ms
Calculating -------------------------------------
        json (reuse)      7.133M (± 6.5%) i/s  (140.19 ns/i) -     35.915M in   5.068201s
                json      4.615M (± 7.0%) i/s  (216.70 ns/i) -     22.915M in   5.003718s
                  oj      6.912M (± 6.4%) i/s  (144.68 ns/i) -     34.692M in   5.047690s

Comparison:
        json (reuse):  7133123.3 i/s
                  oj:  6911977.1 i/s - same-ish: difference falls within error
                json:  4614696.6 i/s - 1.55x  slower
```

After:

```
== Encoding small mixed (34 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   572.751k i/100ms
                json   457.741k i/100ms
                  oj   512.247k i/100ms
Calculating -------------------------------------
        json (reuse)      6.324M (± 6.9%) i/s  (158.12 ns/i) -     31.501M in   5.023093s
                json      6.263M (± 6.9%) i/s  (159.66 ns/i) -     31.126M in   5.017086s
                  oj      5.569M (± 6.6%) i/s  (179.56 ns/i) -     27.661M in   5.003739s

Comparison:
        json (reuse):  6324183.5 i/s
                json:  6263204.9 i/s - same-ish: difference falls within error
                  oj:  5569049.2 i/s - same-ish: difference falls within error

== Encoding small nested array (121 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   258.505k i/100ms
                json   242.335k i/100ms
                  oj   220.678k i/100ms
Calculating -------------------------------------
        json (reuse)      2.589M (± 9.6%) i/s  (386.17 ns/i) -     12.925M in   5.071853s
                json      2.594M (± 6.6%) i/s  (385.46 ns/i) -     13.086M in   5.083035s
                  oj      2.250M (± 2.3%) i/s  (444.43 ns/i) -     11.255M in   5.004707s

Comparison:
        json (reuse):  2589499.6 i/s
                json:  2594321.0 i/s - same-ish: difference falls within error
                  oj:  2250064.0 i/s - 1.15x  slower

== Encoding small hash (65 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
        json (reuse)   656.373k i/100ms
                json   644.135k i/100ms
                  oj   650.283k i/100ms
Calculating -------------------------------------
        json (reuse)      7.202M (± 7.1%) i/s  (138.84 ns/i) -     36.101M in   5.051438s
                json      7.278M (± 1.7%) i/s  (137.40 ns/i) -     36.716M in   5.046300s
                  oj      7.036M (± 1.7%) i/s  (142.12 ns/i) -     35.766M in   5.084729s

Comparison:
        json (reuse):  7202447.9 i/s
                json:  7277883.0 i/s - same-ish: difference falls within error
                  oj:  7036115.2 i/s - same-ish: difference falls within error
```
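For reference, a minimal sketch of the two public call patterns this change brings to parity for primitive-only payloads (both also appear in the benchmark script below):

```ruby
require "json"

payload = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

# One-shot API: used to allocate a fresh State object on every call; its internal
# struct now lives on the stack unless a #to_json call forces it onto the heap.
JSON.generate(payload)

# Explicit State reuse: the previous workaround to avoid the per-call allocation.
state = JSON::State.new(JSON.dump_default_options)
state.generate(payload)
```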
85 lines · 3.7 KiB · Ruby
require "benchmark/ips"
|
|
require "json"
|
|
require "oj"
|
|
|
|
Oj.default_options = Oj.default_options.merge(mode: :compat)
|
|
|
|
if ENV["ONLY"]
  RUN = ENV["ONLY"].split(/[,: ]/).map { |x| [x.to_sym, true] }.to_h
  RUN.default = false
elsif ENV["EXCEPT"]
  RUN = ENV["EXCEPT"].split(/[,: ]/).map { |x| [x.to_sym, false] }.to_h
  RUN.default = true
else
  RUN = Hash.new(true)
end

# The candidate encoders to compare: reusing a JSON::State instance,
# calling JSON.generate directly, and Oj.
def implementations(ruby_obj)
  state = JSON::State.new(JSON.dump_default_options)
  {
    json_state: ["json (reuse)", proc { state.generate(ruby_obj) }],
    json: ["json", proc { JSON.generate(ruby_obj) }],
    oj: ["oj", proc { Oj.dump(ruby_obj) }],
  }
end

# Benchmarks each implementation on `ruby_obj` with benchmark-ips, skipping any that
# raise or (when `check_expected` is true) whose output doesn't match JSON.dump.
def benchmark_encoding(benchmark_name, ruby_obj, check_expected: true, except: [])
  json_output = JSON.dump(ruby_obj)
  puts "== Encoding #{benchmark_name} (#{json_output.bytesize} bytes)"

  impls = implementations(ruby_obj).select { |name| RUN[name] }
  except.each { |i| impls.delete(i) }

  Benchmark.ips do |x|
    expected = ::JSON.dump(ruby_obj) if check_expected
    impls.values.each do |name, block|
      begin
        result = block.call
        if check_expected && expected != result
          puts "#{name} does not match expected output. Skipping"
          puts "Expected:" + '-' * 40
          puts expected
          puts "Actual:" + '-' * 40
          puts result
          puts '-' * 40
          next
        end
      rescue => error
        puts "#{name} unsupported (#{error})"
        next
      end
      x.report(name, &block)
    end
    x.compare!(order: :baseline)
  end
  puts
end

# On the first two micro benchmarks, the limiting factor is that we have to create a Generator::State object for every
# call to `JSON.dump`, so we cause 2 allocations per call where alternatives only do one allocation.
# The performance difference is mostly more time spent in GC because of this extra pressure.
# If we re-use the same `JSON::State` instance, we're faster than Oj on the array benchmark, and much closer
# on the Hash one.
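
# Optional, rough illustration of that extra allocation (not part of the benchmark run;
# exact counts vary by Ruby and json version). Run with COUNT_ALLOCATIONS=1 to print it.
if ENV["COUNT_ALLOCATIONS"]
  before = GC.stat(:total_allocated_objects)
  JSON.dump([1, "string", { a: 1, b: 2 }])
  puts "Objects allocated by one JSON.dump call: #{GC.stat(:total_allocated_objects) - before}"
end
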
benchmark_encoding "small mixed", [1, "string", { a: 1, b: 2 }, [3, 4, 5]]
|
|
benchmark_encoding "small nested array", [[1,2,3,4,5]]*10
|
|
benchmark_encoding "small hash", { "username" => "jhawthorn", "id" => 123, "event" => "wrote json serializer" }
|
|
|
|
# On these benchmarks we perform well: either on par with Oj or only marginally faster/slower.
benchmark_encoding "integers", (1_000_000..1_001_000).to_a, except: %i(json_state)
|
|
benchmark_encoding "mixed utf8", ([("a" * 5000) + "€" + ("a" * 5000)] * 500), except: %i(json_state)
|
|
benchmark_encoding "mostly utf8", ([("€" * 3333)] * 500), except: %i(json_state)
|
|
benchmark_encoding "twitter.json", JSON.load_file("#{__dir__}/data/twitter.json"), except: %i(json_state)
|
|
benchmark_encoding "citm_catalog.json", JSON.load_file("#{__dir__}/data/citm_catalog.json"), except: %i(json_state)
|
|
|
|
# This benchmark spends the overwhelming majority of its time in `ruby_dtoa`. We rely on Ruby's implementation,
# which uses a relatively old version of dtoa.c from David M. Gay.
# Oj in `compat` mode is ~10% slower than `json`, but in its default mode it is noticeably faster here because
# it limits the precision of floats, breaking roundtripping. That's not something we should emulate.
#
# In recent years much faster float-to-string implementations have appeared, such as Ryu and Dragonbox,
# but they are all implemented in C++11 or newer, making them hard if not impossible to include.
# Short of a pure C99 implementation of these newer algorithms, there isn't much that can be done to match
# Oj's speed without losing precision.
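#
# To illustrate what roundtripping means here (illustrative values):
#
#   f = 0.1 + 0.2                      # => 0.30000000000000004
#   JSON.parse(JSON.generate(f)) == f  # => true, the shortest form parses back exactly
#
# A formatter that truncates precision (e.g. "%.16g") would emit "0.3" for this value,
# which no longer parses back to the same Float.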
benchmark_encoding "canada.json", JSON.load_file("#{__dir__}/data/canada.json"), check_expected: false, except: %i(json_state)
|
|
|
|
benchmark_encoding "many #to_json calls", [{object: Object.new, int: 12, float: 54.3, class: Float, time: Time.now, date: Date.today}] * 20, except: %i(json_state)
|