Commit graph

209 commits

Author SHA1 Message Date
Peter Zhu
606db2c423 Move special const checks to rb_gc_mark_weak 2024-09-12 16:03:28 -04:00
Peter Zhu
1205f17125 ASAN unlock freelist in size_pool_add_page 2024-09-09 10:55:18 -04:00
Peter Zhu
f2057277ea ASAN unlock freelist in gc_sweep_step 2024-09-09 10:23:25 -04:00
Peter Zhu
5a502c1873 Add keys to GC.stat and fix tests
This adds keys heap_empty_pages and heap_allocatable_slots to GC.stat.
2024-09-09 10:15:21 -04:00
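A minimal sketch of reading the two keys this commit names. It assumes only that `GC.stat` returns a Hash keyed by symbols; on Ruby builds that predate this commit the keys are absent, so the lookup is guarded.

```ruby
# Keys named in the commit message above. Older builds do not have them,
# so report :unavailable instead of raising.
new_keys = %i[heap_empty_pages heap_allocatable_slots]
stat = GC.stat

report = new_keys.to_h do |key|
  [key, stat.key?(key) ? stat[key] : :unavailable]
end

report.each { |key, value| puts "#{key}: #{value}" }
```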
Peter Zhu
079ef92b5e Implement global allocatable slots and empty pages
[Bug #20710]

This commit moves allocatable slots and empty pages from per
size pool to global. This allows size pools to grow globally and allows
empty pages to move between size pools.

For the benchmark in [Bug #20710], this significantly improves performance:

    Before:
        new_env      2.563 (± 0.0%) i/s -     26.000 in  10.226703s
        new_rails_env      0.293 (± 0.0%) i/s -      3.000 in  10.318960s

    After:
        new_env      3.781 (±26.4%) i/s -     37.000 in  10.302374s
        new_rails_env      0.911 (± 0.0%) i/s -      9.000 in  10.049337s

In the headline benchmarks on yjit-bench, we see the performance is
basically on par with before, with ruby-lsp being significantly faster
and activerecord and erubi-rails being slightly slower:

    --------------  -----------  ----------  -----------  ----------  --------------  -------------
    bench           master (ms)  stddev (%)  branch (ms)  stddev (%)  branch 1st itr  master/branch
    activerecord    452.2        0.3         479.4        0.4         0.96            0.94
    chunky-png      1157.0       0.4         1172.8       0.1         0.99            0.99
    erubi-rails     905.4        0.3         967.2        0.4         0.94            0.94
    hexapdf         3566.6       0.6         3553.2       0.3         1.03            1.00
    liquid-c        88.9         0.9         89.0         1.3         0.98            1.00
    liquid-compile  93.4         0.9         89.9         3.5         1.01            1.04
    liquid-render   224.1        0.7         227.1        0.5         1.00            0.99
    lobsters        1052.0       3.5         1067.4       2.1         0.99            0.99
    mail            197.1        0.4         196.5        0.5         0.98            1.00
    psych-load      2960.3       0.1         2988.4       0.8         1.00            0.99
    railsbench      2252.6       0.4         2255.9       0.5         0.99            1.00
    rubocop         262.7        1.4         270.1        1.8         1.02            0.97
    ruby-lsp        275.4        0.5         242.0        0.3         0.97            1.14
    sequel          98.4         0.7         98.3         0.6         1.01            1.00
    --------------  -----------  ----------  -----------  ----------  --------------  -------------
2024-09-09 10:15:21 -04:00
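The idea in 079ef92b5e, that empty pages live in one global set rather than being owned by a size pool, can be sketched with a toy model. Everything here (`ToyHeap`, `grow`, `release`, the pool names) is hypothetical and does not mirror the structures in gc/default.c.

```ruby
# Toy model only: names are illustrative, not taken from gc/default.c.
class ToyHeap
  attr_reader :empty_pages, :pools

  def initialize(pool_names, empty_pages:)
    @empty_pages = empty_pages                     # global count of empty pages
    @pools = pool_names.to_h { |name| [name, 0] }  # pages in use per size pool
  end

  # A pool that needs room takes any empty page from the shared set.
  def grow(pool)
    return false if @empty_pages.zero?

    @empty_pages -= 1
    @pools[pool] += 1
    true
  end

  # A page that becomes completely free goes back to the shared set.
  def release(pool)
    raise ArgumentError, "#{pool} has no pages" if @pools.fetch(pool).zero?

    @pools[pool] -= 1
    @empty_pages += 1
  end
end

heap = ToyHeap.new(%i[small large], empty_pages: 1)
heap.grow(:small)     # the small pool takes the only empty page
heap.release(:small)  # the page empties and returns to the global set
heap.grow(:large)     # the large pool reuses that same page
puts heap.pools.inspect
```

With per-pool ownership, the last `grow(:large)` would have had nothing to take, because the freed page would still belong to the small pool.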
Peter Zhu
de7ac11a09 Replace heap_allocated_pages with rb_darray_size 2024-09-09 10:15:21 -04:00
Peter Zhu
b66d6e48c8 Switch sorted list of pages in the GC to a darray 2024-09-09 10:15:21 -04:00
Peter Zhu
ae84c017d6 Remove unused allocatable_pages field in objspace 2024-09-04 09:29:18 -04:00
Peter Zhu
e7fbdf8187 Fix indentation broken in 53eaa67 [ci skip] 2024-09-03 13:45:54 -04:00
Peter Zhu
53eaa67305 Unpoison the object in rb_gc_impl_garbage_object_p 2024-09-03 13:43:33 -04:00
Peter Zhu
3c63a01295 Move responsibility of heap walking into Ruby
This commit removes the need for the GC implementation to implement heap
walking and instead Ruby will implement it.
2024-09-03 10:05:38 -04:00
Peter Zhu
6b08a50a62 Move checks for special const for marking
This commit moves checks to RB_SPECIAL_CONST_P out of the GC implementation
and into gc.c.
2024-08-29 09:11:40 -04:00
Peter Zhu
8c01dec827 Skip assertion in gc/default.c when multi-Ractor
The counter for total allocated objects may not be accurate when there are
multiple Ractors since it is not atomic so there could be race conditions
when it is incremented.
2024-08-26 13:25:12 -04:00
Peter Zhu
1cafc9d51d Use rb_gc_multi_ractor_p in gc/default.c 2024-08-26 13:25:12 -04:00
Peter Zhu
80d457b4b4 Fix object allocation counters in compaction
When we move an object in compaction, we do not increment the total_freed_objects
of the original size pool or increment the total_allocated_objects of the
new size pool. This means that when this object dies, it will appear as
if the object was never freed from the original size pool and the new
size pool will have one more free than expected. This means that the new
size pool could appear to have a negative number of live objects.
2024-08-26 09:40:07 -04:00
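The counter imbalance described in 80d457b4b4 can be reproduced with a toy model. `ToyPool` and `simulate` are hypothetical names, and treating a move as "one free in the source pool plus one allocation in the destination pool" is an assumption made for illustration, not a copy of the actual fix.

```ruby
# Toy model only: ToyPool is a stand-in, not a structure from gc/default.c.
ToyPool = Struct.new(:total_allocated, :total_freed) do
  def live
    total_allocated - total_freed
  end
end

# Allocate one object in `src`, move it to `dest`, then free it from `dest`.
def simulate(adjust_on_move:)
  src = ToyPool.new(0, 0)
  dest = ToyPool.new(0, 0)

  src.total_allocated += 1
  if adjust_on_move
    src.total_freed += 1       # the object has left the source pool
    dest.total_allocated += 1  # and now lives in the destination pool
  end
  dest.total_freed += 1        # the object dies in the destination pool

  [src.live, dest.live]
end

unbalanced = simulate(adjust_on_move: false)
balanced = simulate(adjust_on_move: true)
puts "without adjustment: #{unbalanced.inspect}"
puts "with adjustment:    #{balanced.inspect}"
```

Without the adjustment the destination pool ends with a live count of -1, which is the negative count the commit message warns about.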
Peter Zhu
c3dc1322ba Move final_slots_count to per size pool 2024-08-26 09:40:07 -04:00
Peter Zhu
3f6be01bfc Make object ID faster by checking flags
We can improve object ID performance by checking the FL_SEEN_OBJ_ID flag
instead of looking up in the table.
2024-08-23 10:49:27 -04:00
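A rough sketch of the fast path in 3f6be01bfc: a flag bit on the object tells us whether an ID was ever assigned, so the table is only consulted when the answer is yes. `ToyObject`, `ToyIdTable`, `FL_SEEN`, and the ID step of 8 are all hypothetical; `FL_SEEN` merely stands in for `FL_SEEN_OBJ_ID`.

```ruby
# Toy model only: none of these names come from gc/default.c.
ToyObject = Struct.new(:flags)
FL_SEEN = 0x1  # hypothetical bit, stands in for FL_SEEN_OBJ_ID

class ToyIdTable
  attr_reader :lookups

  def initialize
    @ids = {}.compare_by_identity  # key on object identity, not contents
    @next_id = 8
    @lookups = 0
  end

  def object_id_for(obj)
    if (obj.flags & FL_SEEN) != 0
      # Flag set: an ID exists, so a table lookup is needed to fetch it.
      @lookups += 1
      @ids.fetch(obj)
    else
      # Flag clear: no ID yet, so skip the lookup and assign a fresh one.
      obj.flags |= FL_SEEN
      id = @next_id
      @next_id += 8
      @ids[obj] = id
    end
  end
end

table = ToyIdTable.new
object = ToyObject.new(0)
first_id = table.object_id_for(object)   # assigned without any lookup
second_id = table.object_id_for(object)  # fetched with one lookup
puts "ids: #{first_id}, #{second_id}; lookups: #{table.lookups}"
```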
Peter Zhu
165635049a Don't use gc_impl.h inside of gc/gc.h
Using gc_impl.h inside of gc/gc.h will cause gc/gc.h to use the functions
in gc/default.c when building with shared GC support because gc/gc.h is
included into gc.c before the rb_gc_impl functions are overridden by the
preprocessor.
2024-08-22 13:50:17 -04:00
Peter Zhu
b0c92d6c3f Change hash_replace_ref_value to assume value moved
When hash_foreach_replace_value returns ST_REPLACE, it's guaranteed that
the value has moved in hash_replace_ref_value.
2024-08-22 13:50:17 -04:00
Peter Zhu
e15b454bc3 Simplify how finalizers are run at shutdown
We don't need to build a linked list from the finalizer table and
instead we can just run the finalizers by iterating the ST table.

This also improves the performance at shutdown, for example:

    1_000_000.times.map do
      o = Object.new
      ObjectSpace.define_finalizer(o, proc { })
      o
    end

Before:

    Time (mean ± σ):      1.722 s ±  0.056 s    [User: 1.597 s, System: 0.113 s]
    Range (min … max):    1.676 s …  1.863 s    10 runs

After:

    Time (mean ± σ):      1.538 s ±  0.025 s    [User: 1.437 s, System: 0.093 s]
    Range (min … max):    1.510 s …  1.586 s    10 runs
2024-08-21 11:12:07 -04:00
Peter Zhu
cb28487722 Make assertions allow incremental GC when disabled
When assertions are enabled, the following code triggers an assertion
error:

    GC.disable
    GC.start(immediate_mark: false, immediate_sweep: false)

    10_000_000.times { Object.new }

This is because GC.start ignores that the GC is disabled and will
start incremental marking and lazy sweeping. But the assertions in
gc_marks_continue and gc_sweep_continue assert that GC is not disabled.

This commit changes the assertions to pass if the GC was triggered
from a method.
2024-08-19 10:58:36 -04:00
Peter Zhu
bbbe07a5db Speed up finalizers for objects without object ID
If the object being finalized does not have an object ID, then we don't
need to insert into the object ID table, we can simply just allocate a
new object ID by bumping the next_object_id counter. This speeds up
finalization for objects that don't have an object ID. For example, the
following script now runs faster:

    1_000_000.times do
      o = Object.new
      ObjectSpace.define_finalizer(o) {}
    end

Before:

    Time (mean ± σ):      1.462 s ±  0.019 s    [User: 1.360 s, System: 0.094 s]
    Range (min … max):    1.441 s …  1.503 s    10 runs

After:

    Time (mean ± σ):      1.199 s ±  0.015 s    [User: 1.103 s, System: 0.086 s]
    Range (min … max):    1.181 s …  1.229 s    10 runs
2024-08-16 09:26:51 -04:00
Peter Zhu
2c6e16eb51 Don't assume st_data_t and VALUE are the same in rb_gc_impl_object_id 2024-08-15 14:33:13 -04:00
Peter Zhu
8312c5be74 Fix GC_ASSERT for gc.c and gc/default.c
gc.c mistakenly defined GC_ASSERT as blank, which caused it to be a
no-op. This caused all assertions in gc.c and gc/default.c to not do
anything. This commit fixes it by moving the definition of GC_ASSERT
to gc/gc.h.
2024-08-15 10:38:24 -04:00
Peter Zhu
0610f1b083 Fix crash when GC runs during finalizers at shutdown
We need to remove from the finalizer_table after running all the
finalizers because GC could trigger during the finalizer which could
reclaim the finalizer table array.

The following code crashes:

    1_000_000.times do
      o = Object.new
      ObjectSpace.define_finalizer(o, proc { })
    end
2024-08-14 13:49:52 -04:00
Nobuyoshi Nakada
21a9d7664c Fix flag test macro
`RBOOL` is a macro to convert C boolean to Ruby boolean.
2024-08-11 02:36:37 +09:00
Nobuyoshi Nakada
04d57e2c5c Evaluate macro arguments just once
And fix unclosed parenthesis.
2024-08-11 02:36:11 +09:00
Peter Zhu
c91ec7ba1e Remove rb_gc_impl_objspace_mark
It's not necessary for the GC implementation to call rb_gc_mark_roots
which calls back into the GC implementation's rb_gc_impl_objspace_mark.
2024-08-09 10:27:40 -04:00
Peter Zhu
868d63f0a3 Disable GC even during finalizing
We're seeing a crash during shutdown in rb_gc_impl_objspace_free because
it's running lazy sweeping during shutdown. This appears to be caused by
`finalizing` being set, which prevents the GC from being aborted and
disabled, leaving it in lazy sweeping at shutdown.

The full stack trace is:

    #6  rb_bug (fmt=fmt@entry=0x5643b8ebde78 "lazy sweeping underway when freeing object space") at error.c:1095
    #7  0x00005643b8a3c697 in rb_gc_impl_objspace_free (objspace_ptr=<optimized out>) at gc/default.c:9507
    #8  0x00005643b8c269eb in ruby_vm_destruct (vm=0x7e2fdc84d000) at vm.c:3141
    #9  0x00005643b8a5147b in rb_ec_cleanup (ec=<optimized out>, ex=<optimized out>) at eval.c:263
    #10 0x00005643b8a51c93 in ruby_run_node (n=<optimized out>) at eval.c:319
    #11 0x00005643b8a4c7c7 in rb_main (argv=0x7fffef15e7f8, argc=18) at ./main.c:43
    #12 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:62
2024-08-08 10:11:49 -04:00
Peter Zhu
f6e829603e Removed unused macro RVALUE_PAGE_MARKED 2024-08-01 15:54:08 -04:00
git
cb5c460594 * expand tabs. [ci skip]
Please consider using misc/expand_tabs.rb as a pre-commit hook.
2024-07-26 15:44:44 +00:00
Alan Wu
158177e399 Improve allocation throughput by outlining cache miss code path
Previously, GCC 11 on x86-64 inlined the heavy weight logic for
potentially triggering GC into newobj_alloc(). This slowed down
the hotter code path where the ractor cache hits, causing a degradation
to allocation throughput.

Outline the logic into a separate function and have it never inlined.

This restores allocation throughput to the same level as
98eeadc ("Development of 3.4.0 started.").

To evaluate, instrument miniruby so it allocates a bunch of objects and
then exits:

    diff --git a/eval.c b/eval.c
    --- a/eval.c
    +++ b/eval.c
    @@ -92,6 +92,15 @@ ruby_setup(void)
         }
         EC_POP_TAG();

    +rb_gc_disable();
    +rb_execution_context_t *ec = GET_EC();
    +long const n = 20000000;
    +for (long i = 0; i < n; ++i) {
    +    rb_wb_protected_newobj_of(ec, 0, T_OBJECT, 40);
    +}
    +printf("alloc %ld\n", n);
    +exit(0);
    +
         return state;
     }

With `3.3-equiv` being 98eeadc, and `pre` being f2728c3393
and `post` being this commit, I have:

    $ hyperfine -L buildtag post,pre,3.3-equiv '/ruby/build-{buildtag}/miniruby'
    Benchmark 1: /ruby/build-post/miniruby
      Time (mean ± σ):     873.4 ms ±   2.8 ms    [User: 377.6 ms, System: 490.2 ms]
      Range (min … max):   868.3 ms … 877.8 ms    10 runs

    Benchmark 2: /ruby/build-pre/miniruby
      Time (mean ± σ):     960.1 ms ±   2.8 ms    [User: 430.8 ms, System: 523.9 ms]
      Range (min … max):   955.5 ms … 964.2 ms    10 runs

    Benchmark 3: /ruby/build-3.3-equiv/miniruby
      Time (mean ± σ):     886.9 ms ±   2.8 ms    [User: 379.5 ms, System: 501.0 ms]
      Range (min … max):   883.0 ms … 890.8 ms    10 runs

    Summary
      '/ruby/build-post/miniruby' ran
        1.02 ± 0.00 times faster than '/ruby/build-3.3-equiv/miniruby'
        1.10 ± 0.00 times faster than '/ruby/build-pre/miniruby'

These results are from a Skylake server with GCC 11.
2024-07-26 11:44:34 -04:00
Alan Wu
0ada02abe2 Put the default GC implementation back into gc.o
We discovered that having gc.o and gc_impl.o in separate translation
units diminishes codegen quality with GCC 11 on x86-64. This commit
solves that problem by including gc/default.c into gc.c, letting the
optimizer have visibility into the body of functions again in builds
not using link-time optimization, which are common.

This effectively restores things to the way they were before
[Feature #20470] from the optimizer's perspective while maintaining the
ability to build gc/default.c as a DSO.

There were a few functions duplicated across gc.c and gc/default.c.
Extract them and put them into gc/gc.h.
2024-07-26 11:44:34 -04:00
Alan Wu
cef959df90 Delete unused rb_gc_impl_get_finalizers() not in gc_impl.h 2024-07-26 11:44:34 -04:00
Alan Wu
83b0cedffe Add branch prediction annotations for object allocation
I get a slight boost from these with GCC 11 on Intel Skylake.

Part of a larger story to fix an allocation throughput regression
compared to 98eeadc ("Development of 3.4.0 started.") as the baseline.
2024-07-25 12:46:33 -04:00
Peter Zhu
0a9f771e19 Don't check live slot count when multi-Ractor 2024-07-24 09:44:54 -04:00
Peter Zhu
6770bb4a8c Fix running GC in finalizer when RUBY_FREE_AT_EXIT
The following code crashes because running the GC during finalizers
leaves T_ZOMBIE objects on the heap, and calling rb_gc_obj_free on a
T_ZOMBIE object crashes:

    raise_proc = proc do |id|
      GC.start
    end
    1000.times do
      ObjectSpace.define_finalizer(Object.new, raise_proc)
    end
2024-07-23 14:45:45 -04:00
Peter Zhu
51505f70e3 Move frozen check out of rb_gc_impl_undefine_finalizer 2024-07-19 08:53:32 -04:00
Peter Zhu
4b05d2dbb0 Make rb_gc_impl_undefine_finalizer return void 2024-07-19 08:53:32 -04:00
Peter Zhu
57d9b8ee07 Assert that object is not frozen in rb_gc_impl_define_finalizer 2024-07-19 08:53:32 -04:00
Peter Zhu
e8aa9daa5b Move return value of rb_define_finalizer out
Moves return value logic of rb_define_finalizer out from
rb_gc_impl_define_finalizer.
2024-07-19 08:53:32 -04:00
Peter Zhu
d6ef74407b Use rb_obj_hide instead of setting klass to 0 2024-07-18 13:47:00 -04:00
Peter Zhu
573c2893dc Don't disable GC in rb_gc_impl_object_id
Disabling GC when creating the object ID was introduced in commit
67b2c21, but we shouldn't need to disable the GC.
2024-07-17 15:46:41 -04:00
Peter Zhu
403f44ec2c Make OBJ_ID_INCREMENT == RUBY_IMMEDIATE_MASK + 1
All the non-GC objects (i.e. immediates) have values such that
`obj & RUBY_IMMEDIATE_MASK != 0` (except for `Qfalse`, which is 0). We
can define `OBJ_ID_INCREMENT` as `RUBY_IMMEDIATE_MASK + 1` which should
guarantee that GC objects never have conflicting object IDs with
immediates.
2024-07-17 09:01:42 -04:00
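The arithmetic behind 403f44ec2c can be checked with a short sketch. It assumes `RUBY_IMMEDIATE_MASK` is `0x07` and that immediates are recognised by having at least one bit set under that mask; both are assumptions for illustration, and the tagged fixnum encoding `2n + 1` is used only as one example of an immediate.

```ruby
IMMEDIATE_MASK = 0x07                  # assumption: value of RUBY_IMMEDIATE_MASK
OBJ_ID_INCREMENT = IMMEDIATE_MASK + 1  # 8 under that assumption

# Object IDs produced by bumping a counter in steps of OBJ_ID_INCREMENT.
gc_object_ids = (1..1_000).map { |n| n * OBJ_ID_INCREMENT }

# Fixnum-style tagged values (2n + 1) as an example of immediates.
tagged_fixnums = (0..1_000).map { |n| 2 * n + 1 }

# Multiples of MASK + 1 never have a bit set under the mask...
ids_with_tag_bits = gc_object_ids.count { |id| (id & IMMEDIATE_MASK) != 0 }
# ...while every tagged fixnum has at least the lowest bit set.
fixnums_without_tag_bits = tagged_fixnums.count { |v| (v & IMMEDIATE_MASK) == 0 }
overlap = gc_object_ids & tagged_fixnums

puts "IDs with tag bits: #{ids_with_tag_bits}"
puts "fixnums without tag bits: #{fixnums_without_tag_bits}"
puts "overlap: #{overlap.size}"
```

Because the two sets differ in their low bits, an ID generated this way cannot equal a tagged immediate value.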
Matt Valentine-House
690ea013ca Remove unused variable from GC compaction path 2024-07-17 12:47:27 +01:00
Peter Zhu
4fe3082b63 [DOC] Fix typo in gc/default.c 2024-07-16 09:55:48 -04:00
Peter Zhu
93489d536b Remove dependency on dtrace when building shared GC 2024-07-16 09:09:41 -04:00
卜部昌平
a887b41875 static const char *type_name() implemented
The function body was missing.
2024-07-16 13:09:19 +09:00
卜部昌平
963059a8d2 fix compile error 2024-07-16 13:09:19 +09:00
Peter Zhu
2245f278d3 Remove unused ruby_initial_gc_stress 2024-07-15 11:28:00 -04:00