Commit graph

27 commits

Author SHA1 Message Date
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
7db0e07134 Don't preserve object_id when moving object to another Ractor
That seemed like the logical thing to do to me, but ko1 disagree.
2025-03-31 12:01:55 +02:00
Jean Boussier
0350290262 Ractor: Fix moving embedded objects
[Bug #20271]
[Bug #20267]
[Bug #20255]

`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.

Instead we can use the same logic than ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.

We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:

```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but wasn't.
```

If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
2025-03-31 12:01:55 +02:00
Peter Zhu
319fcca656 Move rb_gc_impl_ractor_cache_free to shutdown section 2025-03-24 08:49:30 -04:00
Peter Zhu
a572ec1ba0 Move rb_gc_impl_objspace_free to shutdown section 2025-03-24 08:49:30 -04:00
Peter Zhu
7b6e07ea93 Add rb_gc_object_metadata API
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
2025-02-19 09:47:28 -05:00
Peter Zhu
c45503f957 Add rb_gc_impl_active_gc_name to gc/gc_impl.h 2024-12-06 10:22:03 -05:00
Peter Zhu
ce1ad1b816 Standardize on the name "modular GC"
We have name fragmentation for this feature, including "shared GC",
"modular GC", and "external GC". This commit standardizes the feature
name to "modular GC" and the implementation to "GC library".
2024-12-05 10:33:26 -05:00
Peter Zhu
62b51d9ad7 Use BUILDING_SHARED_GC instead of RB_AMALGAMATED_DEFAULT_GC
We can use the BUILDING_SHARED_GC flag to check if we're building gc_impl.h
as a shared GC or building the default GC.
2024-12-04 10:25:43 -05:00
卜部昌平
25ad7e8e6c rb_gc_impl_malloc can return NULL
Let there be rooms for each GC implementations how to handle multi
threaded situations.  They can be totally reentrant, or can have
their own mutex, or can rely on rb_thread_call_with_gvl.

In any ways the allocator (has been, but now officially is)
expected to run properly without a GVL.  This means there need be
a way for them to inform the interpreter about their allocation
failures, without relying on raising exceptions.

Let them do so by returning NULL.
2024-11-29 23:19:05 +09:00
Matt Valentine-House
551be8219e Place all non-default GC API behind USE_SHARED_GC
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Matt Valentine-House
d61933e503 Use extconf to build external GC modules
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Peter Zhu
d3aaca9785 Make rb_gc_impl_stat_heap return a VALUE instead of size_t 2024-10-23 13:18:09 -04:00
Peter Zhu
9dea0fae25 Make rb_gc_impl_stat return a VALUE instead of size_t 2024-10-23 13:18:09 -04:00
Peter Zhu
3d8fe462df Move return value of rb_gc_impl_config_set to gc.c 2024-10-10 14:34:54 -04:00
Matt Valentine-House
8e7df4b7c6 Rename size_pool -> heap
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap. So that Ruby contains multiple heaps, with
different sized objects.

The term heap as a collection of memory pages is more in memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.

The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.

Since then various changes have happend in Ruby's memory layout:

* The concept of tomb heaps has been replaced by a global free pages list,
  with each page having it's slot size reconfigured at the point when it
  is resurrected
* the eden heap has been inlined into the size pool itself, so that now
  the size pool directly controls the free_pages list, the sweeping
  page, the compaction cursor and the other state that was previously
  being managed by the eden heap.

Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool
2024-10-03 21:20:09 +01:00
Peter Zhu
167fba52f0 Remove rb_gc_impl_initial_stress_set 2024-09-19 08:21:10 -04:00
Peter Zhu
5df5eba465 Change rb_gc_impl_get_measure_total_time to return a bool 2024-09-18 10:18:47 -04:00
Peter Zhu
5307c65c76 Make rb_gc_impl_set_measure_total_time return void 2024-09-17 16:35:52 -04:00
Peter Zhu
dc61c7fc7d Rename rb_gc_impl_get_profile_total_time to rb_gc_impl_get_total_time 2024-09-17 15:22:43 -04:00
Peter Zhu
2af080bd30 Change rb_gc_impl_get_profile_total_time to return unsigned long long 2024-09-17 15:22:43 -04:00
Peter Zhu
c91ec7ba1e Remove rb_gc_impl_objspace_mark
It's not necessary for the GC implementation to call rb_gc_mark_roots
which calls back into the GC implementation's rb_gc_impl_objspace_mark.
2024-08-09 10:27:40 -04:00
Alan Wu
0ada02abe2 Put the default GC implementation back into gc.o
We discovered that having gc.o and gc_impl.o in separate translation
units diminishes codegen quality with GCC 11 on x86-64. This commit
solves that problem by including default/gc.c into gc.c, letting the
optimizer have visibility into the body of functions again in builds
not using link-time optimization, which are common.

This effectively restores things to the way they were before
[Feature #20470] from the optimizer's perspective while maintaining the
ability to build gc/default.c as a DSO.

There were a few functions duplicated across gc.c and gc/default.c.
Extract them and put them into gc/gc.h.
2024-07-26 11:44:34 -04:00
Peter Zhu
4b05d2dbb0 Make rb_gc_impl_undefine_finalizer return void 2024-07-19 08:53:32 -04:00
Peter Zhu
4b0244a1f3 Rename GC_IMPL_H macro to GC_GC_IMPL_H 2024-07-15 08:57:14 -04:00
Matt Valentine-House
f543c68e1c Provide GC.config to disable major GC collections
This feature provides a new method `GC.config` that configures internal
GC configuration variables provided by an individual GC implementation.

Implemented in this PR is the option `full_mark`: a boolean value that
will determine whether the Ruby GC is allowed to run a major collection
while the process is running.

It has the following semantics

This feature configures Ruby's GC to only run minor GC's. It's designed
to give users relying on Out of Band GC complete control over when a
major GC is run. Configuring `full_mark: false` does two main things:

* Never runs a Major GC. When the heap runs out of space during a minor
  and when a major would traditionally be run, instead we allocate more
  heap pages, and mark objspace as needing a major GC.
* Don't increment object ages. We don't promote objects during GC, this
  will cause every object to be scanned on every minor. This is an
  intentional trade-off between minor GC's doing more work every time,
  and potentially promoting objects that will then never be GC'd.

The intention behind not aging objects is that users of this feature
should use a preforking web server, or some other method of pre-warming
the oldgen (like Nakayoshi fork)before disabling Majors. That way most
objects that are going to be old will have already been promoted.

This will interleave major and minor GC collections in exactly the same
what that the Ruby GC runs in versions previously to this. This is the
default behaviour.

* This new method has the following extra semantics:
  - `GC.config` with no arguments returns a hash of the keys of the
    currently configured GC
  - `GC.config` with a key pair (eg. `GC.config(full_mark: true)` sets
    the matching config key to the corresponding value and returns the
    entire known config hash, including the new values. If the key does
    not exist, `nil` is returned

* When a minor GC is run, Ruby sets an internal status flag to determine
  whether the next GC will be a major or a minor. When `full_mark:
  false` this flag is ignored and every GC will be a minor.

  This status flag can be accessed at
  `GC.latest_gc_info(:needs_major_by)`. Any value other than `nil` means
  that the next collection would have been a major.

  Thus it's possible to use this feature to check at a predetermined
  time, whether a major GC is necessary and run one if it is. eg. After
  a request has finished processing.

  ```ruby
  if GC.latest_gc_info(:needs_major_by)
    GC.start(full_mark: true)
  end
  ```

[Feature #20443]
2024-07-12 14:43:33 +01:00
Peter Zhu
00d0ddd48a Add gc/gc_impl.h for GC implementation headers 2024-07-12 08:41:33 -04:00