Provide GC.config to disable major GC collections

This feature provides a new method `GC.config` that configures internal
GC configuration variables provided by an individual GC implementation.

Implemented in this PR is the option `full_mark`: a boolean value that
will determine whether the Ruby GC is allowed to run a major collection
while the process is running.

It has the following semantics.

Setting `full_mark: false` configures Ruby's GC to only run minor GCs.
It's designed to give users relying on Out of Band GC complete control
over when a major GC is run. Configuring `full_mark: false` does two
main things:

* Never runs a major GC. When the heap runs out of space during a minor
  GC, at the point where a major would traditionally be run, we instead
  allocate more heap pages and mark objspace as needing a major GC.
* Doesn't increment object ages. Objects are not promoted during GC, so
  every object that is not already old is scanned on every minor. This
  is an intentional trade-off between minor GCs doing more work every
  time and potentially promoting objects that will then never be GC'd.
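
The effect of the two points above can be observed from Ruby. The
following is a minimal sketch, not part of this change: the helper name
`major_gcs_during` and its `key:` parameter are hypothetical, and the
`:full_mark` key is taken from this commit message, so the helper checks
that the running Ruby exposes it before touching the config.

```ruby
# Hypothetical helper: run a block with major GCs disabled (when the
# option is available) and report how many major GCs happened meanwhile.
def major_gcs_during(key: :full_mark)
  # Assumption: GC.config returns a hash with symbol keys. On Rubies
  # without GC.config, or without this key, the block runs unchanged.
  supported = GC.respond_to?(:config) && GC.config.key?(key)
  previous = GC.config[key] if supported
  GC.config(key => false) if supported

  before = GC.stat(:major_gc_count)
  yield
  GC.stat(:major_gc_count) - before
ensure
  # Restore whatever was configured before the block ran.
  GC.config(key => previous) if supported
end

count = major_gcs_during { 50_000.times { "x" * 32 } }
puts "major GCs during block: #{count}"
```

With majors disabled, the count should stay at zero unless the block
itself calls `GC.start`; pass a different `key:` if your Ruby names the
option differently.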

The intention behind not aging objects is that users of this feature
should use a preforking web server, or some other method of pre-warming
the oldgen (like Nakayoshi fork) before disabling majors. That way most
objects that are going to be old will have already been promoted.
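
As an illustration of that pre-warming step, here is a hedged sketch of
what a server might run once the application is loaded and before it
starts serving (or forking workers). The method name, the `key:` and
`rounds:` parameters, and the choice of four rounds are assumptions made
for this example, not something this change provides.

```ruby
# Sketch: age long-lived objects into the old generation, then turn off
# major collections. `key` and `rounds` are illustrative assumptions.
def prewarm_then_disable_majors(key: :full_mark, rounds: 4)
  # Objects are promoted only after surviving several collections, so
  # run a few minor GCs while the loaded application is still referenced.
  rounds.times { GC.start(full_mark: false) }

  # Only touch the config if this Ruby actually exposes the option.
  return false unless GC.respond_to?(:config) && GC.config.key?(key)

  GC.config(key => false)
  true
end

disabled = prewarm_then_disable_majors
puts(disabled ? "major GCs disabled" : "option not available; GC unchanged")
```

The return value tells the caller whether majors were actually disabled,
so the same boot code can run on Rubies that do not have the option.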

Setting `full_mark: true` interleaves major and minor GC collections in
exactly the same way that the Ruby GC runs in previous versions. This is
the default behaviour.

* This new method has the following extra semantics:
  - `GC.config` with no arguments returns a hash of the config keys of
    the currently configured GC
  - `GC.config` with a key-value pair (e.g. `GC.config(full_mark: true)`)
    sets the matching config key to the corresponding value and returns
    the entire known config hash, including the new values. If the key
    does not exist, `nil` is returned

* When a minor GC is run, Ruby sets an internal status flag to determine
  whether the next GC will be a major or a minor. When `full_mark:
  false` is set, this flag is ignored and every GC will be a minor.

  This status flag can be accessed at
  `GC.latest_gc_info(:need_major_by)`. Any value other than `nil` means
  that the next collection would have been a major.

  Thus it's possible to use this feature to check, at a predetermined
  time, whether a major GC is necessary and run one if it is, e.g. after
  a request has finished processing.

  ```ruby
  if GC.latest_gc_info(:need_major_by)
    GC.start(full_mark: true)
  end
  ```
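
To show where such a check could live, here is a minimal sketch of a
Rack-style wrapper. The class name `OutOfBandMajorGC`, the injectable
`gc:` object and the `info_key:` parameter are all hypothetical; only
`GC.latest_gc_info` and `GC.start(full_mark: true)` come from Ruby itself.

```ruby
# Hypothetical Rack-style wrapper that runs a pending major GC after
# each request. `gc` is injectable so the logic can be exercised without
# depending on the state of the real collector.
class OutOfBandMajorGC
  # info_key defaults to the key CRuby's GC.latest_gc_info uses for the
  # pending-major flag; adjust it if your Ruby names it differently.
  def initialize(app, gc: GC, info_key: :need_major_by)
    @app = app
    @gc = gc
    @info_key = info_key
  end

  def call(env)
    @app.call(env)
  ensure
    # Simplification: this runs before the response is handed back. A
    # real server would do it after the response has been written.
    run_major_if_pending
  end

  # Returns true if a major GC was started, false otherwise.
  def run_major_if_pending
    return false unless @gc.latest_gc_info(@info_key)

    @gc.start(full_mark: true)
    true
  end
end
```

Because the check is a single hash lookup, calling it after every
request is cheap; the cost is only paid when a major GC is pending.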

[Feature #20443]
Matt Valentine-House 2024-07-04 15:21:09 +01:00
parent 00d0ddd48a
commit f543c68e1c
6 changed files with 235 additions and 5 deletions

gc.rb

@@ -253,6 +253,30 @@ module GC
    Primitive.gc_stat_heap heap_name, hash_or_key
  end

  # call-seq:
  #     GC.config -> hash
  #     GC.config(hash) -> hash
  #
  # Sets or gets information about the current GC config.
  #
  # The contents of the hash are implementation specific and may change in
  # the future without notice.
  #
  # If the optional argument, hash, is given, it is overwritten and returned.
  #
  # This method is only expected to work on CRuby.
  #
  # The hash includes the following keys about the internal information in
  # the \GC:
  #
  # [slot_size]
  #   The slot size of the heap in bytes.
  def self.config hash = nil
    return Primitive.gc_config_get unless hash
    Primitive.gc_config_set hash
  end

  # call-seq:
  #     GC.latest_gc_info -> hash
  #     GC.latest_gc_info(hash) -> hash