Commit graph

70 commits

Author SHA1 Message Date
Peter Zhu
f6dcab5f50 Assert that objects in write barrier are not dead 2024-09-23 10:36:48 -04:00
KJ Tsanaktsidis
02b36f7572 Unpoison page->freelist before trying to assert on it
Otherwise trying to deref the pointer can cause an ASAN crash, even
though the only reason we're dereferencing it is so that we can assert
on it.
2024-09-23 10:11:54 +10:00
Peter Zhu
2882408dcb Remove unneeded function prototype for rb_gc_impl_mark 2024-09-20 10:58:19 -04:00
Peter Zhu
167fba52f0 Remove rb_gc_impl_initial_stress_set 2024-09-19 08:21:10 -04:00
Peter Zhu
5df5eba465 Change rb_gc_impl_get_measure_total_time to return a bool 2024-09-18 10:18:47 -04:00
Peter Zhu
5307c65c76 Make rb_gc_impl_set_measure_total_time return void 2024-09-17 16:35:52 -04:00
Peter Zhu
dc61c7fc7d Rename rb_gc_impl_get_profile_total_time to rb_gc_impl_get_total_time 2024-09-17 15:22:43 -04:00
Peter Zhu
2af080bd30 Change rb_gc_impl_get_profile_total_time to return unsigned long long 2024-09-17 15:22:43 -04:00
Peter Zhu
5de7517bcb Use unsigned long long for marking and sweeping time 2024-09-17 15:22:43 -04:00
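From the Ruby side, the timing commits above presumably sit behind `GC.measure_total_time`, `GC.measure_total_time=` and `GC.total_time`. That mapping is an assumption, not something the commit messages state. A minimal sketch of reading the accumulated GC time, assuming Ruby 3.1 or later (where `GC.total_time` reports nanoseconds):

```ruby
# Sketch: read the GC's accumulated time from Ruby.
# Assumes Ruby 3.1+, where GC.total_time (in nanoseconds) exists; on older
# versions the helper returns nil instead of raising.
def gc_time_spent_ns
  return nil unless GC.respond_to?(:total_time)

  GC.measure_total_time = true   # turn measurement on
  before = GC.total_time
  200_000.times { Object.new }   # create some garbage
  GC.start                       # force a full collection
  GC.total_time - before
end

elapsed = gc_time_spent_ns
if elapsed
  puts "GC time: #{elapsed} ns (measuring: #{GC.measure_total_time})"
else
  puts "GC.total_time is not available on Ruby #{RUBY_VERSION}"
end
```

The helper name `gc_time_spent_ns` is hypothetical and exists only for this illustration.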
Peter Zhu
50d4840bd9 Move desired_compaction_pages_i inside of GC_CAN_COMPILE_COMPACTION
Fixes the following warning on WebAssembly:

    gc/default.c:7306:1: warning: unused function 'desired_compaction_pages_i' [-Wunused-function]
    desired_compaction_pages_i(struct heap_page *page, void *data)
2024-09-16 15:58:27 -04:00
Peter Zhu
50564f8882 ASAN unpoison whole heap page after adding to size pool 2024-09-16 09:27:29 -04:00
Peter Zhu
46ba3752c2 Don't return inside of asan_unpoisoning_object 2024-09-16 09:27:29 -04:00
Peter Zhu
c5a782dfb0 Replace with asan_unpoisoning_object 2024-09-16 09:27:29 -04:00
Peter Zhu
0fc8422a05 Move checks for heap traversal to rb_gc_mark_weak
If we are in the middle of a heap traversal, we don't want to call rb_gc_impl_mark_weak.
This commit moves that check from rb_gc_impl_mark_weak to rb_gc_mark_weak.
2024-09-12 16:03:28 -04:00
Peter Zhu
606db2c423 Move special const checks to rb_gc_mark_weak 2024-09-12 16:03:28 -04:00
Peter Zhu
1205f17125 ASAN unlock freelist in size_pool_add_page 2024-09-09 10:55:18 -04:00
Peter Zhu
f2057277ea ASAN unlock freelist in gc_sweep_step 2024-09-09 10:23:25 -04:00
Peter Zhu
5a502c1873 Add keys to GC.stat and fix tests
This adds keys heap_empty_pages and heap_allocatable_slots to GC.stat.
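A minimal sketch of reading the two new keys from Ruby. It assumes only that `GC.stat` returns a Hash keyed by symbols; on a Ruby that predates this commit the keys are simply absent, so the sketch reports that rather than failing. The constant and helper names are hypothetical.

```ruby
# Sketch: read the two GC.stat keys named in this commit, tolerating
# Rubies that predate them (the keys are absent there).
NEW_KEYS = %i[heap_empty_pages heap_allocatable_slots].freeze

def new_gc_stat_values
  stat = GC.stat
  NEW_KEYS.to_h { |key| [key, stat[key]] }   # nil when the key is missing
end

new_gc_stat_values.each do |key, value|
  puts "#{key}: #{value.nil? ? 'not reported by this Ruby' : value}"
end
```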
2024-09-09 10:15:21 -04:00
Peter Zhu
079ef92b5e Implement global allocatable slots and empty pages
[Bug #20710]

This commit moves allocatable slots and empty pages from per
size pool to global. This allows size pools to grow globally and allows
empty pages to move between size pools.
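The idea can be sketched as a toy model in Ruby. This is not the logic in gc/default.c: the class, method names, and the exact growth rule are assumptions made for illustration. It only shows the shape of the change, where any size pool draws from one shared set of empty pages and one shared budget of allocatable slots.

```ruby
# Toy model: empty pages and the growth budget live in one global place,
# so any size pool can reuse a page that another pool emptied.
class ToyObjspace
  attr_reader :empty_pages, :allocatable_slots

  def initialize(allocatable_slots)
    @empty_pages = 0                        # shared by all size pools
    @allocatable_slots = allocatable_slots  # shared growth budget, in slots
  end

  # A size pool hands back a page that no longer holds live objects.
  def release_page
    @empty_pages += 1
  end

  # A size pool asks for a page; slots_per_page depends on its slot size.
  # Returns :reused, :grown, or :needs_gc.
  def acquire_page(slots_per_page)
    if @empty_pages > 0
      @empty_pages -= 1
      :reused
    elsif @allocatable_slots >= slots_per_page
      @allocatable_slots -= slots_per_page
      :grown
    else
      :needs_gc
    end
  end
end

objspace = ToyObjspace.new(100)
results = []
results << objspace.acquire_page(40)   # grows using the global budget
objspace.release_page                  # some pool empties a page
results << objspace.acquire_page(80)   # a different pool reuses that page
results << objspace.acquire_page(80)   # budget too small for this pool
p results
```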

For the benchmark in [Bug #20710], this significantly improves performance:

    Before:
        new_env      2.563 (± 0.0%) i/s -     26.000 in  10.226703s
        new_rails_env      0.293 (± 0.0%) i/s -      3.000 in  10.318960s

    After:
        new_env      3.781 (±26.4%) i/s -     37.000 in  10.302374s
        new_rails_env      0.911 (± 0.0%) i/s -      9.000 in  10.049337s

In the headline benchmarks on yjit-bench, we see the performance is
basically on-par with before, with ruby-lsp being significantly faster
and activerecord and erubi-rails being slightly slower:

    --------------  -----------  ----------  -----------  ----------  --------------  -------------
    bench           master (ms)  stddev (%)  branch (ms)  stddev (%)  branch 1st itr  master/branch
    activerecord    452.2        0.3         479.4        0.4         0.96            0.94
    chunky-png      1157.0       0.4         1172.8       0.1         0.99            0.99
    erubi-rails     905.4        0.3         967.2        0.4         0.94            0.94
    hexapdf         3566.6       0.6         3553.2       0.3         1.03            1.00
    liquid-c        88.9         0.9         89.0         1.3         0.98            1.00
    liquid-compile  93.4         0.9         89.9         3.5         1.01            1.04
    liquid-render   224.1        0.7         227.1        0.5         1.00            0.99
    lobsters        1052.0       3.5         1067.4       2.1         0.99            0.99
    mail            197.1        0.4         196.5        0.5         0.98            1.00
    psych-load      2960.3       0.1         2988.4       0.8         1.00            0.99
    railsbench      2252.6       0.4         2255.9       0.5         0.99            1.00
    rubocop         262.7        1.4         270.1        1.8         1.02            0.97
    ruby-lsp        275.4        0.5         242.0        0.3         0.97            1.14
    sequel          98.4         0.7         98.3         0.6         1.01            1.00
    --------------  -----------  ----------  -----------  ----------  --------------  -------------
2024-09-09 10:15:21 -04:00
Peter Zhu
de7ac11a09 Replace heap_allocated_pages with rb_darray_size 2024-09-09 10:15:21 -04:00
Peter Zhu
b66d6e48c8 Switch sorted list of pages in the GC to a darray 2024-09-09 10:15:21 -04:00
Peter Zhu
ae84c017d6 Remove unused allocatable_pages field in objspace 2024-09-04 09:29:18 -04:00
Peter Zhu
e7fbdf8187 Fix indentation broken in 53eaa67 [ci skip] 2024-09-03 13:45:54 -04:00
Peter Zhu
53eaa67305 Unpoison the object in rb_gc_impl_garbage_object_p 2024-09-03 13:43:33 -04:00
Peter Zhu
3c63a01295 Move responsibility of heap walking into Ruby
This commit removes the need for the GC implementation to implement heap
walking and instead Ruby will implement it.
2024-09-03 10:05:38 -04:00
Peter Zhu
6b08a50a62 Move checks for special const for marking
This commit moves checks to RB_SPECIAL_CONST_P out of the GC implementation
and into gc.c.
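As background, special constants (such as nil, true, false, small integers, and literal symbols) are encoded directly in the VALUE and occupy no heap slot, which is why marking can skip them. A rough way to observe this from Ruby is `ObjectSpace.memsize_of`, which reports 0 for such values. This illustrates the concept only; it does not exercise the check that this commit moves.

```ruby
require "objspace"

# Special constants live inside the VALUE itself, so they have no heap
# slot and nothing for the GC to mark.
immediates   = [nil, true, false, 42, :sym]
heap_objects = [Object.new, "a string", [1, 2, 3]]

immediates.each   { |v| puts "#{v.inspect}: #{ObjectSpace.memsize_of(v)} bytes" }
heap_objects.each { |v| puts "#{v.class}: #{ObjectSpace.memsize_of(v)} bytes" }
```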
2024-08-29 09:11:40 -04:00
Peter Zhu
8c01dec827 Skip assertion in gc/default.c when multi-Ractor
The counter for total allocated objects may not be accurate when there are
multiple Ractors since it is not atomic so there could be race conditions
when it is incremented.
2024-08-26 13:25:12 -04:00
Peter Zhu
1cafc9d51d Use rb_gc_multi_ractor_p in gc/default.c 2024-08-26 13:25:12 -04:00
Peter Zhu
80d457b4b4 Fix object allocation counters in compaction
When we move an object in compaction, we do not increment the total_freed_objects
of the original size pool or increment the total_allocated_objects of the
new size pool. This means that when this object dies, it will appear as
if the object was never freed from the original size pool and the new
size pool will have one more free than expected. This means that the new
size pool could appear to have a negative number of live objects.
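The bookkeeping problem can be reproduced with a toy model. The struct and method below are hypothetical and are not the GC's data structures; the sketch assumes that the live count of a pool is its allocated count minus its freed count, and that the fix counts a move as a free in the original pool and an allocation in the new pool.

```ruby
# Toy model of per-size-pool counters; not the actual GC code.
ToyPool = Struct.new(:total_allocated_objects, :total_freed_objects) do
  def live_objects
    total_allocated_objects - total_freed_objects
  end
end

# Move one object from src to dest, then let it die in dest.
def move_then_die(account_for_move:)
  src  = ToyPool.new(1, 0)   # the object was allocated here
  dest = ToyPool.new(0, 0)

  if account_for_move
    src.total_freed_objects += 1       # the object leaves src
    dest.total_allocated_objects += 1  # and arrives in dest
  end

  dest.total_freed_objects += 1        # the object dies in dest
  [src.live_objects, dest.live_objects]
end

p move_then_die(account_for_move: false)  # => [1, -1]
p move_then_die(account_for_move: true)   # => [0, 0]
```

Without the accounting, the original pool keeps a phantom live object and the new pool drops to a negative live count, which matches the symptom described above.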
2024-08-26 09:40:07 -04:00
Peter Zhu
c3dc1322ba Move final_slots_count to per size pool 2024-08-26 09:40:07 -04:00
Peter Zhu
3f6be01bfc Make object ID faster by checking flags
We can improve object ID performance by checking the FL_SEEN_OBJ_ID flag
instead of looking up in the table.
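A toy model of the idea, written in Ruby rather than C. The class and its names are hypothetical; the boolean field stands in for FL_SEEN_OBJ_ID. The sketch assumes the saving comes from the first request for an ID, which can skip the table lookup because the flag already says no ID exists.

```ruby
# Toy model: a per-object flag records whether an ID was ever assigned, so
# the first request can skip the table lookup. Not the actual GC code.
class ToyObjectIds
  ToyObject = Struct.new(:seen_obj_id)   # stands in for FL_SEEN_OBJ_ID

  attr_reader :lookups

  def initialize
    @obj_to_id = {}.compare_by_identity  # key on identity, not contents
    @next_object_id = 8                  # arbitrary start for the sketch
    @lookups = 0
  end

  def object_id_for(obj)
    if obj.seen_obj_id
      @lookups += 1
      @obj_to_id.fetch(obj)              # ID exists: one table lookup
    else
      id = @next_object_id
      @next_object_id += 8               # arbitrary step for the sketch
      @obj_to_id[obj] = id
      obj.seen_obj_id = true
      id                                 # first request: no lookup needed
    end
  end
end

ids = ToyObjectIds.new
obj = ToyObjectIds::ToyObject.new(false)
first  = ids.object_id_for(obj)
second = ids.object_id_for(obj)
puts "ids equal: #{first == second}, lookups: #{ids.lookups}"
```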
2024-08-23 10:49:27 -04:00
Peter Zhu
165635049a Don't use gc_impl.h inside of gc/gc.h
Using gc_impl.h inside of gc/gc.h will cause gc/gc.h to use the functions
in gc/default.c when building with shared GC support because gc/gc.h is
included into gc.c before the rb_gc_impl functions are overridden by the
preprocessor.
2024-08-22 13:50:17 -04:00
Peter Zhu
e15b454bc3 Simplify how finalizers are run at shutdown
We don't need to build a linked list from the finalizer table and
instead we can just run the finalizers by iterating the ST table.

This also improves the performance at shutdown, for example:

    1_000_000.times.map do
      o = Object.new
      ObjectSpace.define_finalizer(o, proc { })
      o
    end

Before:

    Time (mean ± σ):      1.722 s ±  0.056 s    [User: 1.597 s, System: 0.113 s]
    Range (min … max):    1.676 s …  1.863 s    10 runs

After:

    Time (mean ± σ):      1.538 s ±  0.025 s    [User: 1.437 s, System: 0.093 s]
    Range (min … max):    1.510 s …  1.586 s    10 runs
2024-08-21 11:12:07 -04:00
Peter Zhu
cb28487722 Make assertions allow incremental GC when disabled
When assertions are enabled, the following code triggers an assertion
error:

    GC.disable
    GC.start(immediate_mark: false, immediate_sweep: false)

    10_000_000.times { Object.new }

This is because GC.start ignores that the GC is disabled and starts
incremental marking and lazy sweeping, while the assertions in
gc_marks_continue and gc_sweep_continue require that GC is not disabled.

This commit makes the assertions pass if the GC was triggered from a
method.
2024-08-19 10:58:36 -04:00
Peter Zhu
bbbe07a5db Speed up finalizers for objects without object ID
If the object being finalized does not have an object ID, then we don't
need to insert into the object ID table, we can simply just allocate a
new object ID by bumping the next_object_id counter. This speeds up
finalization for objects that don't have an object ID. For example, the
following script now runs faster:

    1_000_000.times do
      o = Object.new
      ObjectSpace.define_finalizer(o) {}
    end

Before:

    Time (mean ± σ):      1.462 s ±  0.019 s    [User: 1.360 s, System: 0.094 s]
    Range (min … max):    1.441 s …  1.503 s    10 runs

After:

    Time (mean ± σ):      1.199 s ±  0.015 s    [User: 1.103 s, System: 0.086 s]
    Range (min … max):    1.181 s …  1.229 s    10 runs
2024-08-16 09:26:51 -04:00
Peter Zhu
2c6e16eb51 Don't assume st_data_t and VALUE are the same in rb_gc_impl_object_id 2024-08-15 14:33:13 -04:00
Peter Zhu
8312c5be74 Fix GC_ASSERT for gc.c and gc/default.c
gc.c mistakenly defined GC_ASSERT as blank, which caused it to be a
no-op. This caused all assertions in gc.c and gc/default.c to not do
anything. This commit fixes it by moving the definition of GC_ASSERT
to gc/gc.h.
2024-08-15 10:38:24 -04:00
Peter Zhu
0610f1b083 Fix crash when GC runs during finalizers at shutdown
We need to remove entries from the finalizer_table after running all the
finalizers, because GC could trigger during a finalizer and reclaim the
finalizer table array.

The following code crashes:

    1_000_000.times do
      o = Object.new
      ObjectSpace.define_finalizer(o, proc { })
    end
2024-08-14 13:49:52 -04:00
Nobuyoshi Nakada
21a9d7664c Fix flag test macro
`RBOOL` is a macro to convert C boolean to Ruby boolean.
2024-08-11 02:36:37 +09:00
Nobuyoshi Nakada
04d57e2c5c Evaluate macro arguments just once
And fix unclosed parenthesis.
2024-08-11 02:36:11 +09:00
Peter Zhu
c91ec7ba1e Remove rb_gc_impl_objspace_mark
It's not necessary for the GC implementation to call rb_gc_mark_roots
which calls back into the GC implementation's rb_gc_impl_objspace_mark.
2024-08-09 10:27:40 -04:00
Peter Zhu
868d63f0a3 Disable GC even during finalizing
We're seeing a crash during shutdown in rb_gc_impl_objspace_free because
lazy sweeping is still underway at shutdown. It appears to be due to
`finalizing` being set, which prevents the GC from being aborted and
disabled, leaving it in lazy sweeping at shutdown.

The full stack trace is:

    #6  rb_bug (fmt=fmt@entry=0x5643b8ebde78 "lazy sweeping underway when freeing object space") at error.c:1095
    #7  0x00005643b8a3c697 in rb_gc_impl_objspace_free (objspace_ptr=<optimized out>) at gc/default.c:9507
    #8  0x00005643b8c269eb in ruby_vm_destruct (vm=0x7e2fdc84d000) at vm.c:3141
    #9  0x00005643b8a5147b in rb_ec_cleanup (ec=<optimized out>, ex=<optimized out>) at eval.c:263
    #10 0x00005643b8a51c93 in ruby_run_node (n=<optimized out>) at eval.c:319
    #11 0x00005643b8a4c7c7 in rb_main (argv=0x7fffef15e7f8, argc=18) at ./main.c:43
    #12 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:62
2024-08-08 10:11:49 -04:00
Peter Zhu
f6e829603e Removed unused macro RVALUE_PAGE_MARKED 2024-08-01 15:54:08 -04:00
git
cb5c460594 * expand tabs. [ci skip]
Please consider using misc/expand_tabs.rb as a pre-commit hook.
2024-07-26 15:44:44 +00:00
Alan Wu
158177e399 Improve allocation throughput by outlining cache miss code path
Previously, GCC 11 on x86-64 inlined the heavy weight logic for
potentially triggering GC into newobj_alloc(). This slowed down
the hotter code path where the ractor cache hits, causing a degradation
to allocation throughput.

Outline the logic into a separate function and have it never inlined.

This restores allocation throughput to the same level as
98eeadc ("Development of 3.4.0 started.").

To evaluate, instrument miniruby so it allocates a bunch of objects and
then exits:

    diff --git a/eval.c b/eval.c
    --- a/eval.c
    +++ b/eval.c
    @@ -92,6 +92,15 @@ ruby_setup(void)
         }
         EC_POP_TAG();

    +rb_gc_disable();
    +rb_execution_context_t *ec = GET_EC();
    +long const n = 20000000;
    +for (long i = 0; i < n; ++i) {
    +    rb_wb_protected_newobj_of(ec, 0, T_OBJECT, 40);
    +}
    +printf("alloc %ld\n", n);
    +exit(0);
    +
         return state;
     }

With `3.3-equiv` being 98eeadc, and `pre` being f2728c3393
and `post` being this commit, I have:

    $ hyperfine -L buildtag post,pre,3.3-equiv '/ruby/build-{buildtag}/miniruby'
    Benchmark 1: /ruby/build-post/miniruby
      Time (mean ± σ):     873.4 ms ±   2.8 ms    [User: 377.6 ms, System: 490.2 ms]
      Range (min … max):   868.3 ms … 877.8 ms    10 runs

    Benchmark 2: /ruby/build-pre/miniruby
      Time (mean ± σ):     960.1 ms ±   2.8 ms    [User: 430.8 ms, System: 523.9 ms]
      Range (min … max):   955.5 ms … 964.2 ms    10 runs

    Benchmark 3: /ruby/build-3.3-equiv/miniruby
      Time (mean ± σ):     886.9 ms ±   2.8 ms    [User: 379.5 ms, System: 501.0 ms]
      Range (min … max):   883.0 ms … 890.8 ms    10 runs

    Summary
      '/ruby/build-post/miniruby' ran
        1.02 ± 0.00 times faster than '/ruby/build-3.3-equiv/miniruby'
        1.10 ± 0.00 times faster than '/ruby/build-pre/miniruby'

These results are from a Skylake server with GCC 11.
2024-07-26 11:44:34 -04:00
Alan Wu
0ada02abe2 Put the default GC implementation back into gc.o
We discovered that having gc.o and gc_impl.o in separate translation
units diminishes codegen quality with GCC 11 on x86-64. This commit
solves that problem by including gc/default.c into gc.c, letting the
optimizer have visibility into the body of functions again in builds
not using link-time optimization, which are common.

This effectively restores things to the way they were before
[Feature #20470] from the optimizer's perspective while maintaining the
ability to build gc/default.c as a DSO.

There were a few functions duplicated across gc.c and gc/default.c.
Extract them and put them into gc/gc.h.
2024-07-26 11:44:34 -04:00
Alan Wu
cef959df90 Delete unused rb_gc_impl_get_finalizers() not in gc_impl.h 2024-07-26 11:44:34 -04:00
Alan Wu
83b0cedffe Add branch prediction annotations for object allocation
I get a slight boost from these with GCC 11 on Intel Skylake.

Part of a larger story to fix an allocation throughput regression
compared to 98eeadc ("Development of 3.4.0 started.") as the baseline.
2024-07-25 12:46:33 -04:00
Peter Zhu
0a9f771e19 Don't check live slot count when multi-Ractor 2024-07-24 09:44:54 -04:00
Peter Zhu
6770bb4a8c Fix running GC in finalizer when RUBY_FREE_AT_EXIT
The following code crashes because a GC run during finalizers leaves
T_ZOMBIE objects on the heap, and calling rb_gc_obj_free on them
crashes:

    raise_proc = proc do |id|
      GC.start
    end
    1000.times do
      ObjectSpace.define_finalizer(Object.new, raise_proc)
    end
2024-07-23 14:45:45 -04:00