mirror of
https://github.com/ruby/ruby.git
synced 2025-08-15 13:39:04 +02:00
[ruby/json] convert_UTF8_to_JSON: repurpose the escape tables into size tables
Since we're looking up the table anyway, we might as well store the UTF-8 char length in it. For single byte characters that don't need escaping we store `0`. This helps on strings with lots of multi-byte characters: Before: ``` == Encoding mostly utf8 (20004001 bytes) ruby 3.3.4 (2024-07-09 revisionbe1089c8ec
) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 6.000 i/100ms oj 10.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 67.978 (± 1.5%) i/s (14.71 ms/i) - 342.000 in 5.033062s oj 100.876 (± 2.0%) i/s (9.91 ms/i) - 510.000 in 5.058080s rapidjson 26.389 (± 7.6%) i/s (37.89 ms/i) - 132.000 in 5.027681s Comparison: json: 68.0 i/s oj: 100.9 i/s - 1.48x faster rapidjson: 26.4 i/s - 2.58x slower ``` After: ``` == Encoding mostly utf8 (20004001 bytes) ruby 3.3.4 (2024-07-09 revisionbe1089c8ec
) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 7.000 i/100ms oj 10.000 i/100ms rapidjson 2.000 i/100ms Calculating ------------------------------------- json 75.187 (± 2.7%) i/s (13.30 ms/i) - 378.000 in 5.030111s oj 95.196 (± 2.1%) i/s (10.50 ms/i) - 480.000 in 5.043565s rapidjson 25.969 (± 3.9%) i/s (38.51 ms/i) - 130.000 in 5.011471s Comparison: json: 75.2 i/s oj: 95.2 i/s - 1.27x faster rapidjson: 26.0 i/s - 2.90x slower ```51e2631d1f
This commit is contained in:
parent
9f300d0541
commit
97713ac952
2 changed files with 101 additions and 66 deletions
|
@ -59,6 +59,9 @@ end
|
|||
benchmark_encoding "small nested array", [[1,2,3,4,5]]*10
|
||||
benchmark_encoding "small hash", { "username" => "jhawthorn", "id" => 123, "event" => "wrote json serializer" }
|
||||
|
||||
# On this one we're a bit slower (~25%).
|
||||
benchmark_encoding "mostly utf8", ([("€" * 3333)] * 2000), except: %i(json_state)
|
||||
|
||||
# On these three benchmarks we perform well. Either on par or very closely faster/slower
|
||||
benchmark_encoding "mixed utf8", ([("a" * 5000) + "€" + ("a" * 5000)] * 2000), except: %i(json_state)
|
||||
benchmark_encoding "twitter.json", JSON.load_file("#{__dir__}/data/twitter.json"), except: %i(json_state)
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue