Jean Boussier
b8b33efd4d
[ruby/json] Remove String#-@ check in extconf.rb
...
Now that older rubies have been dropped, we no longer need to check
for all that.
35cf2b84e0
2024-11-01 13:04:24 +09:00
Jean Boussier
165cc6cf40
[ruby/json] json_string_unescape: assume the string doesn't need escaping
...
If that assumption holds true, then we don't need to copy the
string into a buffer to unescape it. For small strings it just saves
copying, but for large ones it also saves a malloc/free combo.
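The fast path described above can be sketched in Ruby. This is only an illustration of the idea, not the C code from the extension: `unescape_json_string` and `SIMPLE_ESCAPES` are hypothetical names, and `\uXXXX` escapes are deliberately left out to keep the sketch short.

```ruby
# Hypothetical escape table; \uXXXX handling is omitted from this sketch.
SIMPLE_ESCAPES = {
  '"' => '"', "\\" => "\\", "/" => "/",
  "b" => "\b", "f" => "\f", "n" => "\n", "r" => "\r", "t" => "\t",
}.freeze

# `raw` is assumed to be the text between the quotes of a JSON string.
def unescape_json_string(raw)
  # Fast path: no backslash means nothing to unescape, so no buffer is built.
  return raw unless raw.include?("\\")

  # Slow path: copy into a fresh buffer, translating escapes along the way.
  buffer = +""
  index = 0
  while index < raw.length
    char = raw[index]
    if char == "\\"
      buffer << SIMPLE_ESCAPES.fetch(raw[index + 1])
      index += 2
    else
      buffer << char
      index += 1
    end
  end
  buffer
end
```

When the input has no escape sequences, the very same object is returned, which is the copy (and, for large strings, the allocation) that the commit avoids.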
Before:
```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
json 52.000 i/100ms
oj 61.000 i/100ms
oj strict 70.000 i/100ms
Oj::Parser 71.000 i/100ms
rapidjson 55.000 i/100ms
Calculating -------------------------------------
json 510.111 (± 2.9%) i/s (1.96 ms/i) - 2.548k in 5.000029s
oj 610.232 (± 3.1%) i/s (1.64 ms/i) - 3.050k in 5.003725s
oj strict 713.231 (± 3.2%) i/s (1.40 ms/i) - 3.570k in 5.010902s
Oj::Parser 762.598 (± 3.0%) i/s (1.31 ms/i) - 3.834k in 5.033130s
rapidjson 553.029 (± 7.4%) i/s (1.81 ms/i) - 2.750k in 5.022630s
Comparison:
json: 510.1 i/s
Oj::Parser: 762.6 i/s - 1.49x faster
oj strict: 713.2 i/s - 1.40x faster
oj: 610.2 i/s - 1.20x faster
rapidjson: 553.0 i/s - same-ish: difference falls within error
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
json 28.000 i/100ms
oj 33.000 i/100ms
oj strict 37.000 i/100ms
Oj::Parser 43.000 i/100ms
rapidjson 38.000 i/100ms
Calculating -------------------------------------
json 303.853 (± 3.6%) i/s (3.29 ms/i) - 1.540k in 5.076079s
oj 348.009 (± 2.0%) i/s (2.87 ms/i) - 1.749k in 5.027738s
oj strict 396.679 (± 3.3%) i/s (2.52 ms/i) - 1.998k in 5.042271s
Oj::Parser 406.699 (± 2.2%) i/s (2.46 ms/i) - 2.064k in 5.077587s
rapidjson 393.463 (± 3.3%) i/s (2.54 ms/i) - 1.976k in 5.028501s
Comparison:
json: 303.9 i/s
Oj::Parser: 406.7 i/s - 1.34x faster
oj strict: 396.7 i/s - 1.31x faster
rapidjson: 393.5 i/s - 1.29x faster
oj: 348.0 i/s - 1.15x faster
```
After:
```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
json 56.000 i/100ms
oj 62.000 i/100ms
oj strict 72.000 i/100ms
Oj::Parser 77.000 i/100ms
rapidjson 55.000 i/100ms
Calculating -------------------------------------
json 568.025 (± 2.1%) i/s (1.76 ms/i) - 2.856k in 5.030272s
oj 630.936 (± 1.4%) i/s (1.58 ms/i) - 3.162k in 5.012630s
oj strict 705.784 (±11.2%) i/s (1.42 ms/i) - 3.456k in 5.006706s
Oj::Parser 783.989 (± 1.7%) i/s (1.28 ms/i) - 3.927k in 5.010343s
rapidjson 557.630 (± 2.0%) i/s (1.79 ms/i) - 2.805k in 5.032388s
Comparison:
json: 568.0 i/s
Oj::Parser: 784.0 i/s - 1.38x faster
oj strict: 705.8 i/s - 1.24x faster
oj: 630.9 i/s - 1.11x faster
rapidjson: 557.6 i/s - same-ish: difference falls within error
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
json 29.000 i/100ms
oj 33.000 i/100ms
oj strict 38.000 i/100ms
Oj::Parser 43.000 i/100ms
rapidjson 37.000 i/100ms
Calculating -------------------------------------
json 319.271 (± 3.1%) i/s (3.13 ms/i) - 1.595k in 5.001128s
oj 347.946 (± 1.7%) i/s (2.87 ms/i) - 1.749k in 5.028395s
oj strict 396.914 (± 3.0%) i/s (2.52 ms/i) - 2.014k in 5.079645s
Oj::Parser 409.311 (± 2.7%) i/s (2.44 ms/i) - 2.064k in 5.046626s
rapidjson 394.752 (± 1.5%) i/s (2.53 ms/i) - 1.998k in 5.062776s
Comparison:
json: 319.3 i/s
Oj::Parser: 409.3 i/s - 1.28x faster
oj strict: 396.9 i/s - 1.24x faster
rapidjson: 394.8 i/s - 1.24x faster
oj: 347.9 i/s - 1.09x faster
```
7e0f66546a
2024-11-01 13:04:24 +09:00
Jean Boussier
081689b9e2
[ruby/json] parser.rl: extract build_string
...
7e557ee291
2024-11-01 13:04:24 +09:00
Benoit Daloze
6412e6f6c3
[ruby/json] Use String#encode instead of rb_str_conv_enc()
...
* rb_str_conv_enc() returns the source string unmodified
if the conversion did not work. But we should be consistent with
the generator here and only accept BINARY or convertible to UTF-8.
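The difference matters because `String#encode` raises when a conversion is impossible instead of silently handing back the source. A minimal Ruby sketch of the rule stated above follows; `convert_to_utf8` is a hypothetical helper, and retagging binary input as UTF-8 is an assumption of this sketch rather than something the commit message spells out.

```ruby
# Hypothetical helper: accept BINARY input or anything convertible to UTF-8.
def convert_to_utf8(source)
  if source.encoding == Encoding::BINARY
    # Assumption: binary input already holds UTF-8 bytes, so retag a copy.
    source.dup.force_encoding(Encoding::UTF_8)
  else
    # String#encode raises an EncodingError subclass when conversion fails.
    source.encode(Encoding::UTF_8)
  end
end

latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)
converted = convert_to_utf8(latin1) # transcoded to UTF-8

broken = "\xFF".b.force_encoding(Encoding::US_ASCII)
begin
  convert_to_utf8(broken)
rescue EncodingError => error
  failure = error # the invalid input is rejected, not passed through
end
```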
1344ad6f66
2024-11-01 13:04:24 +09:00
Jean Boussier
3782600f0f
[ruby/json] Emit warnings when dumping binary strings
...
Because of its Ruby 1.8 heritage, the C extension doesn't care
much about strings encoding. We should get stricter over time.
42402fc13f
2024-11-01 13:04:24 +09:00
Jean Boussier
f2b8829df0
Deprecate unsafe default options of JSON.load
...
[Feature #19528]
Ref: https://bugs.ruby-lang.org/issues/19528
`load` is understood as the default method for serializer-style libraries, and
the default options of `JSON.load` have caused many security vulnerabilities over the
years.
The plan is to do like YAML/Psych: deprecate these default options and direct
users toward `JSON.unsafe_load`, so at least it's obvious it should not be
used against untrusted data.
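As an illustration of the safer alternative for untrusted input, the sketch below uses `JSON.parse`, which does not honor `json_class` by default. The payload is made up for this example.

```ruby
require "json"

# Made-up untrusted payload that tries to trigger object creation.
untrusted = '{"json_class":"String","raw":[97,98,99]}'

# JSON.parse leaves "json_class" alone by default, so the result is a plain Hash.
data = JSON.parse(untrusted)

# Per the plan above, JSON.unsafe_load is the explicitly named entry point
# for trusted data that relies on the unsafe options.
```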
2024-11-01 13:04:24 +09:00
Jean Boussier
59eebeca02
[ruby/json] Allocate the initial generator buffer on the stack
...
Ref: https://github.com/ruby/json/issues/655
Followup: https://github.com/ruby/json/issues/657
Assuming the generator might be used for fairly small documents,
we can start with a reasonable buffer size on the stack, and if
we outgrow it, we can spill to the heap.
In a way this is optimizing for micro-benchmarks, but there are
valid use cases for fairly small JSON documents in actual real-world
scenarios, so thrashing the GC less in such cases makes sense.
Before:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 518.700k i/100ms
JSON reuse 483.370k i/100ms
Calculating -------------------------------------
Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s
JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s
Comparison:
Oj: 5722283.8 i/s
JSON reuse: 5278061.7 i/s - 1.08x slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 517.837k i/100ms
JSON reuse 548.871k i/100ms
Calculating -------------------------------------
Oj 5.693M (± 1.6%) i/s (175.65 ns/i) - 28.481M in 5.004056s
JSON reuse 5.855M (± 1.2%) i/s (170.80 ns/i) - 29.639M in 5.063004s
Comparison:
Oj: 5692985.6 i/s
JSON reuse: 5854857.9 i/s - 1.03x faster
```
fe607f4806
2024-11-01 13:04:24 +09:00
Peter Zhu
e077be119b
[ruby/json] Remove double semicolon at end of line in parser
...
f6d6ca3c17
2024-10-30 10:13:49 +09:00
Jean Boussier
5d176436ce
[ruby/json] Allocate the FBuffer struct on the stack
...
Ref: https://github.com/ruby/json/issues/655
The actual buffer is still on the heap, but this saves a pair
of malloc/free.
This helps a lot on micro-benchmarks.
Before:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 531.598k i/100ms
JSON reuse 417.666k i/100ms
Calculating -------------------------------------
Oj 5.735M (± 1.3%) i/s (174.35 ns/i) - 28.706M in 5.005900s
JSON reuse 4.604M (± 1.4%) i/s (217.18 ns/i) - 23.389M in 5.080779s
Comparison:
Oj: 5735475.6 i/s
JSON reuse: 4604380.3 i/s - 1.25x slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 518.700k i/100ms
JSON reuse 483.370k i/100ms
Calculating -------------------------------------
Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s
JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s
Comparison:
Oj: 5722283.8 i/s
JSON reuse: 5278061.7 i/s - 1.08x slower
```
Bench:
```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]
Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)
  x.report("Oj") do
    Oj.dump(test_data)
  end
  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end
  x.compare!(order: :baseline)
end
```
72110f7992
2024-10-30 10:13:48 +09:00
Jean Boussier
a3c21756e9
[ruby/json] Use smaller types for JSON_Parser boolean fields
...
7f079b25be
2024-10-26 18:44:15 +09:00
Jean Boussier
fc9f0cb8c5
[ruby/json] JSON.dump / String#to_json: raise on invalid encoding
...
This regressed since 2.7.2.
35407d6635
2024-10-26 18:44:15 +09:00
Jean Boussier
70f554efb4
[ruby/json] raise_parse_error: avoid UB
...
Fix: https://github.com/ruby/json/pull/625
Declaring the buffer in a sub-block causes bugs on some compilers.
90967c9eb0
2024-10-26 18:44:15 +09:00
Étienne Barrié
82f7550f65
Use frozen string literals
...
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
a052d96103
[ruby/json] Compile with std=c99
...
d4968d2e48
2024-10-26 18:44:15 +09:00
Jean Boussier
07fc21cfad
[ruby/json] Ext::Parser avoid costly check on decimal_class when it is nil
...
Closes: https://github.com/ruby/json/pull/512
d882a45d82
Co-Authored-By: lukeg <luke.gru@gmail.com>
2024-10-26 18:44:15 +09:00
Jean Boussier
9045258c88
[ruby/json] Limit the size of ParserError exception messages
...
Fix: https://github.com/ruby/json/issues/534
Only include up to 32 bytes of the unparseable source.
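The truncation can be pictured with a short Ruby sketch. `MAX_SNIPPET_BYTES` and `error_snippet` are hypothetical names, and repairing a multibyte character cut in half with `scrub` is an assumption of this sketch, not necessarily what the extension does.

```ruby
# Hypothetical constant and helper, named for this sketch only.
MAX_SNIPPET_BYTES = 32

def error_snippet(source, offset)
  # byteslice counts bytes, not characters; scrub replaces any character
  # that was cut in half at either end of the slice.
  source.byteslice(offset, MAX_SNIPPET_BYTES).to_s.scrub("?")
end

snippet = error_snippet("x" * 100, 10)
```

However large the document is, the part quoted in the message stays bounded.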
f44995cfb6
2024-10-26 18:44:15 +09:00
Jean Boussier
7dfc1f3d66
[ruby/json] parser.c: refactor raise_parse_error
...
09e1df2643
2024-10-26 18:44:15 +09:00
Jean Boussier
618085f48d
[ruby/json] Get rid of the remaining tabs.
...
1a9af430d2
2024-10-26 18:44:15 +09:00
Jean Boussier
e0f8732023
Reduce allocations in `parse` and `load` argument handling
...
Avoid needless hash allocations and such that degrade performance
significantly on micro-benchmarks.
2024-10-26 18:44:15 +09:00
Jean Boussier
8e7e638221
Add more precise documentation for `object_class` and `array_class`
...
Fix: https://github.com/ruby/json/issues/419
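For readers unfamiliar with the two options, here is a small usage sketch. `Settings` and `Items` are hypothetical classes made up for this example; it only shows the options being passed to `JSON.parse`, not the wording of the new documentation.

```ruby
require "json"

# Hypothetical container classes for this sketch.
class Settings < Hash; end
class Items < Array; end

doc = JSON.parse(
  '{"names":["a","b"]}',
  object_class: Settings, # used for JSON objects
  array_class: Items      # used for JSON arrays
)
```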
2024-10-26 18:44:15 +09:00
Takumasa Ochi
20dc1e5c25
[ruby/json] Always dup argument to preserve original encoding for force_encoding
...
db9a489ca2
2024-10-18 11:30:42 +09:00
Jean Boussier
c4d4c6b846
[ruby/json] Speedup Parser initialization
...
Extracted from: https://github.com/ruby/json/pull/512
Use `rb_hash_lookup2` to check for hash key existence instead
of going through `rb_funcall`.
43835a0d13
Co-Authored-By: lukeg <luke.gru@gmail.com>
2024-10-18 11:28:12 +09:00
Peter Zhu
48899d56a9
[ruby/json] Sync changes
...
Some changes were missed in the automatic sync.
2024-10-17 21:07:54 +02:00
Jean Boussier
df48f597cf
[ruby/json] Get rid of some more outdated compatibility code
...
All these macros are available on Ruby 2.3+
227885f460
2024-10-17 13:02:13 +00:00
Jean Boussier
6105bae331
[ruby/json] Get rid of compatibility code for older rubies
...
All of these are for rubies older than 2.3.
811297f86a
2024-10-17 12:22:16 +00:00
Hiroshi SHIBATA
8a79f345a2
[ruby/json] Unicode string like § is not allowed in C files at ruby/ruby repo
...
53409bcc74
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
74d459fd52
[ruby/json] Adjust to the CVTUTF code being gone
...
I, Luke T. Shumaker, am the sole author of the added code.
I did not reference CVTUTF when writing it. I did reference the
Unicode standard (15.0.0), the Wikipedia article on UTF-8, and the
Wikipedia article on UTF-16. When I saw some tests fail, I did
reference the old deleted code (but a JSON-specific part, inherently
not as based on CVTUTF) to determine that script_safe should also
escape U+2028 and U+2029.
I targeted simplicity and clarity when writing the code--it can likely
be optimized. In my mind, the obvious next optimization is to have it
combine contiguous non-escaped characters into just one call to
fbuffer_append(), instead of calling fbuffer_append() for each
character.
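The optimization mentioned above, appending contiguous non-escaped characters in one call, can be sketched in Ruby. This is not the C code in question: `ESCAPES` and `escape_with_runs` are hypothetical, the escape table only covers a handful of characters, and the append counter exists purely to make the effect visible.

```ruby
# Hypothetical escape table, limited to the characters this sketch handles.
ESCAPES = {
  '"' => '\"',
  "\\" => "\\\\",
  "\n" => '\n',
  "\u2028" => '\u2028',
  "\u2029" => '\u2029',
}.freeze

# Returns the quoted string and the number of appends that were needed.
def escape_with_runs(source)
  buffer = +'"'
  run_start = 0
  appends = 0
  source.each_char.with_index do |char, index|
    escaped = ESCAPES[char]
    next unless escaped

    # Flush the pending run of unescaped characters in a single append.
    if index > run_start
      buffer << source[run_start...index]
      appends += 1
    end
    buffer << escaped
    appends += 1
    run_start = index + 1
  end
  # Flush whatever trails the last escaped character.
  if run_start < source.length
    buffer << source[run_start..-1]
    appends += 1
  end
  buffer << '"'
  [buffer, appends]
end

quoted, appends = escape_with_runs("hello\nworld")
```

Here the eleven input characters need three appends (two runs plus one escape) instead of one append per character.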
Regarding the use of the "modern" types `uint32_t`, `uint16_t`, and
`bool`:
- ruby.h is guaranteed to give us uint32_t and uint16_t.
- Since Ruby 3.0.0, ruby.h is guaranteed to give us bool... but we
support down to Ruby 2.3. But, ruby.h is guaranteed to give us
HAVE_STDBOOL_H for the C99 stdbool.h; so use that to include
stdbool.h if we can, and if not then fall back to a copy of the
same bool definition that Ruby 3.0.5 uses with C89.
c96351f874
2024-10-08 14:10:05 +09:00
Luke T. Shumaker
6e47968929
[ruby/json] Delete code that is based on CVTUTF
...
I did this based on manual inspection, comparing the code to my re-created
history of CVTUTF at https://git.lukeshu.com/2git/cvtutf/ (created by the
scripts at https://git.lukeshu.com/2git/cvtutf-make/ )
0819553144
2024-10-08 14:10:05 +09:00
Jean Boussier
d612f9fd34
[flori/json] Remove outdated ifdef checks
...
`json` requires Ruby 2.3, so `HAVE_RUBY_ENCODING_H` and `HAVE_RB_ENC_RAISE`
are always true.
5c8dc6b70a
2024-09-03 11:51:51 +09:00
Jean Boussier
c5ae432ec8
[flori/json] Cleanup useless ifdef
...
The json gem now requires Ruby 2.3, so there is no point keeping
compatibility code for older releases that don't have the
TypedData API.
45c86e153f
2024-06-04 12:23:48 +09:00
卜部昌平
c844968b72
ruby tool/update-deps --fix
2024-04-27 21:55:28 +09:00
Hiroshi SHIBATA
86045fca24
Manually merged from flori/json
...
> https://github.com/flori/json/pull/525
> Rename escape_slash to script_safe and also escape U+2028 and U+2029
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
> https://github.com/flori/json/pull/454
> Remove unnecessary initialization of create_id in JSON.parse()
Co-authored-by: Watson <watson1978@gmail.com>
2023-12-01 16:47:06 +09:00
Jean Boussier
698cb84062
Use ruby_xfree to free buffers
...
They are allocated with ruby_xmalloc, they should be freed with
ruby_xfree.
2023-12-01 16:47:06 +09:00
John Hawthorn
4b770527c2
[flori/json] Fix "unexpected token" offset for Infinity
...
Previously in the JSON::Ext parser, when we encountered an "Infinity"
token (and weren't allowing NaN/Infinity) we would try to display the
"unexpected token" at the character before.
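The situation is easy to reproduce from Ruby. The sketch below only shows when the error is raised; the exact message text varies between json versions, so it is not asserted here.

```ruby
require "json"

begin
  # NaN/Infinity are rejected by default, so this raises.
  JSON.parse("[Infinity]")
rescue JSON::ParserError => error
  rejected = error
end

# With allow_nan: true the same token is accepted.
accepted = JSON.parse("[Infinity]", allow_nan: true)
```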
42ac170712
2023-12-01 16:47:06 +09:00
Nobuyoshi Nakada
19486ebd72
[flori/json] Re-generate parser.c
...
82a75ba98e
2023-07-19 00:02:58 +09:00
Nobuyoshi Nakada
104089ce02
[flori/json] [DOC] Remove duplicate sentence
...
ed242667b4
2023-07-19 00:02:58 +09:00
Nobuyoshi Nakada
f1f84ca71c
[flori/json] Remove `HAVE_RB_SCAN_ARGS_OPTIONAL_HASH` check
...
This macro is defined since ruby 2.1, which is older than the required
ruby version.
dd1d54e78a
2023-07-19 00:02:58 +09:00
Dimitar Haralanov
9977462fd9
[flori/json] Rename JSON::ParseError to JSON::ParserError
...
20b80ca317
2023-07-18 12:25:54 +09:00
Matt Valentine-House
5e4b80177e
Update the depend files
2023-02-28 09:09:00 -08:00
Matt Valentine-House
f38c6552f9
Remove intern/gc.h from Make deps
2023-02-27 10:11:56 -08:00
Nobuyoshi Nakada
899ea35035
Extract include/ruby/internal/attr/packed_struct.h
...
Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the
macros below:
* `RBIMPL_ATTR_PACKED_STRUCT_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_END`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN`
* `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END`
2023-02-08 12:34:13 +09:00
Jean Boussier
66b52f046f
[flori/json] Stop including the parser source __LINE__ in exceptions
...
It makes testing for JSON errors very tedious. You either have
to use a Regexp or to regularly update all your assertions
when JSON is upgraded.
de9eb1d28e
2022-07-29 19:10:10 +09:00
Andrew Bromwich
a15d0e267a
[flori/json] Fix parser bug for empty string allocation
...
When `HAVE_RB_ENC_INTERNED_STR` is enabled it is possible to
pass through a null pointer to `rb_enc_interned_str` resulting
in a segfault.
Fixes #495
b59368a8c2
2022-05-20 17:49:13 +09:00
Hiroshi SHIBATA
767f3904ee
[flori/json] Doc: Improve documentation on JSON#parse and JSON#parse!
...
75ada77b96
Co-authored-by: Bruno Gomes da Silva <brunojabs@gmail.com>
2022-05-20 17:49:13 +09:00
Peter Zhu
2d5ecd60a5
[Feature #18249] Update dependencies
2022-02-22 09:55:21 -05:00
Nobuyoshi Nakada
ac152b3cac
Update dependencies
2021-11-21 16:21:18 +09:00
卜部昌平
5c167a9778
ruby tool/update-deps --fix
2021-10-05 14:18:23 +09:00
Nobuyoshi Nakada
1d170fdc6d
ext/json/parser/parser.h: Add fallback MAYBE_UNUSED
...
e2ad91fc20
2021-05-19 10:16:22 +09:00
Nobuyoshi Nakada
7c716b686c
ext/json/parser/prereq.mk: fix warnings for code generated by ragel
...
* type-limits when plain-char is unsigned
* unused-const-variable for NFA constants
2021-05-18 23:26:03 +09:00
Jean Boussier
2de594ca98
[flori/json] Deduplicate strings inside json_string_unescape
...
[ci 2]
1982070cb8
2021-05-17 19:51:51 +09:00