This commit changes the way compaction moves objects and sweeps pages in
order to better facilitate object movement between size pools.
Previously we would move the scan cursor first until we found an empty
slot and then we'd decrement the compact cursor until we found something
to move into that slot. We would sweep the page that contained the scan
cursor before trying to fill it
In this algorithm we first move the compact cursor down until we find an
object to move - We then take a free page from the desired destination
heap (always the same heap in this current iteration of the code).
If there is no free page we sweep the page at the sweeping_page cursor,
add it to the free pages, and advance the cursor to the next page, and
try again.
We sweep one page from each size pool in this way, and then repeat that
process until all the size pools are compacted (all the cursors have
met), and then we update references and sweep the rest of the heap.
Currently, the number of incremental marking steps is calculated based
on the number of pooled pages available. This means that if we make Ruby
heap pages larger, it would run fewer incremental marking steps (which
would mean each incremental marking step takes longer).
This commit changes incremental marking to run after every
INCREMENTAL_MARK_STEP_ALLOCATIONS number of allocations. This means that
the behaviour of incremental marking remains the same regardless of the
Ruby heap page size.
I've benchmarked against discourse benchmarks and did not get a
significant change in response times beyond the margin of error. This is
expected as this new incremental marking algorithm behaves very
similarly to the previous one.
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using
this macro will make it easier for us to change the allocation strategy
of rb_iseq_constant_body when using Variable Width Allocation.
Previously, we would build a new `superclasses` array for each class,
even though for all immediate subclasses of a class, the array is
identical.
This avoids duplicating the arrays on leaf classes (those without
subclasses) by calculating and storing a "superclasses including self"
array on a class when it's first inherited and sharing that among all
superclasses.
An additional trick used is that the "superclass array including self"
is valid as "self"'s superclass array. It just has it's own class at the
end. We can use this to avoid an extra pointer of storage and can use
one bit of a flag to track that we've "upgraded" the array.
Previously when checking ancestors, we would walk all the way up the
ancestry chain checking each parent for a matching class or module.
I believe this was especially unfriendly to CPU cache since for each
step we need to check two cache lines (the class and class ext).
This check is used quite often in:
* case statements
* rescue statements
* Calling protected methods
* Class#is_a?
* Module#===
* Module#<=>
I believe it's most common to check a class against a parent class, to
this commit aims to improve that (unfortunately does not help checking
for an included Module).
This is done by storing on each class the number and an array of all
parent classes, in order (BasicObject is at index 0). Using this we can
check whether a class is a subclass of another in constant time since we
know the location to expect it in the hierarchy.
(1) gc_verify_internal_consistency() use barrier locking
for consistency while `during_gc == true` at the end
of the sweep on `RGENGC_CHECK_MODE >= 2`.
(2) `rb_objspace_reachable_objects_from()` is called without
VM synchronization and it checks `during_gc != true`.
So (1) and (2) causes BUG because of `during_gc == true`.
To prevent this error, wait for VM barrier on `during_gc == false`
and introduce VM locking on `rb_objspace_reachable_objects_from()`.
http://ci.rvm.jp/results/trunk-asserts@phosphorus-docker/3830088
gc_marks_continue will start sweeping when it finishes marking. However,
if the heap we are trying to allocate into is full, then the sweeping
may not yield any free slots. If we don't call gc_sweep_continue
immediate after this, then another GC will be started halfway during
lazy sweeping. gc_sweep_continue will either grow the heap or finish
sweeping.
Add a new macro BASE_SLOT_SIZE that determines the slot size.
For Variable Width Allocation (compiled with USE_RVARGC=1), all slot
sizes are powers-of-2 multiples of BASE_SLOT_SIZE.
For USE_RVARGC=0, BASE_SLOT_SIZE is set to sizeof(RVALUE).
Renames rb_id_table_foreach_with_replace to
rb_id_table_foreach_values_with_replace and passes only the value to the
callback. We can use this in GC compaction when we cannot access the
global symbol array.
NUM_IN_PAGE could return a value much larger than 64. According to the
C11 spec 6.5.7 paragraph 3 this is undefined behavior:
> If the value of the right operand is negative or is greater than or
> equal to the width of the promoted left operand, the behavior is
> undefined.
On most platforms, this is usually not a problem as the architecture
will mask off all out-of-range bits.
WebAssembly has function local infinite registers and stack values, but
there is no way to scan the values in a call stack for now.
This implementation uses Asyncify to spilling out wasm locals into
linear memory.
On 32-bit systems, VWA causes class_serial to not be aligned (it only
guarantees 4 byte alignment but class_serial is 8 bytes and requires 8
byte alignment). This commit uses a hack to allocate class_serial
through malloc. Once VWA allocates with 8 byte alignment in the future,
we will revert this commit.
This commit switches from a custom implemented bsearch algorithm to
use the one provided by the C standard library.
Because `is_pointer_to_heap` will only return true if the pointer
being searched for is a valid slot starting address within the heap
page body, we've extracted the bsearch call site into a more general
function so we can use it elsewhere.
The new function `heap_page_for_ptr` returns the heap page for any heap
page pointer, regardless of whether that is at the start of a slot or
in the middle of one.
We then use this function as the basis of `is_pointer_to_heap`.
Some callable method entries (cme) can be a key of `overloaded_cme_table`
and the keys should be pinned because the table is numtable (VALUE is a key).
Before the patch GC checks the cme is in `overloaded_cme_table` by looking up
the table, but it needs VM locking.
It works well in normal GC marking because it is protected by the VM lock,
but it doesn't work on `rb_objspace_reachable_objects_from` because it doesn't
use VM lock.
Now, the number of target cmes are small enough, I decide to pin down
all possible cmes instead of using looking up the table.
`overloaded_cme_table` keeps cme -> monly_cme pairs to manage
corresponding `monly_cme` for `cme`. The lifetime of the `monly_cme`
should be longer than `monly_cme`, but the previous patch losts the
reference to the living `monly_cme`.
Now `overloaded_cme_table` values are always root (keys are only weak
reference), it means `monly_cme` does not freed until corresponding
`cme` is invalidated.
To make managing easy, move `overloaded_cme_table` to `rb_vm_t`.
`def` (`rb_method_definition_t`) is shared by multiple callable
method entries (cme, `rb_callable_method_entry_t`).
There are two issues:
* old -> young reference: `cme1->def->mandatory_only_cme = monly_cme`
if `cme1` is young and `monly_cme` is young, there is no problem.
Howevr, another old `cme2` can refer `def`, in this case, old `cme2`
points young `monly_cme` and it violates gengc assumption.
* cme can have different `defined_class` but `monly_cme` only has
one `defined_class`. It does not make sense and `monly_cme`
should be created for a cme (not `def`).
To solve these issues, this patch allocates `monly_cme` per `cme`.
`cme` does not have another room to store a pointer to the `monly_cme`,
so this patch introduces `overloaded_cme_table`, which is weak key map
`[cme] -> [monly_cme]`.
`def::body::iseqptr::monly_cme` is deleted.
The first issue is reported by Alan Wu.
When using `rp(obj)` for debugging during development, it may be
useful to know that an object is soon to be swept. Add a new letter to
the object dump for whether the object is garbage. It's easy to forget
about lazy sweep.
Except on Windows and MinGW, we can only use compaction on systems that
use mmap (only systems that use mmap can use the read barrier that
compaction requires). We don't need to separately detect whether we can
support compaction or not.
* Lazily create singletons on instance_{exec,eval}
Previously when instance_exec or instance_eval was called on an object,
that object would be given a singleton class so that method
definitions inside the block would be added to the object rather than
its class.
This commit aims to improve performance by delaying the creation of the
singleton class unless/until one is needed for method definition. Most
of the time instance_eval is used without any method definition.
This was implemented by adding a flag to the cref indicating that it
represents a singleton of the object rather than a class itself. In this
case CREF_CLASS returns the object's existing class, but in cases that
we are defining a method (either via definemethod or
VM_SPECIAL_OBJECT_CBASE which is used for undef and alias).
This also happens to fix what I believe is a bug. Previously
instance_eval behaved differently with regards to constant access for
true/false/nil than for all other objects. I don't think this was
intentional.
String::Foo = "foo"
"".instance_eval("Foo") # => "foo"
Integer::Foo = "foo"
123.instance_eval("Foo") # => "foo"
TrueClass::Foo = "foo"
true.instance_eval("Foo") # NameError: uninitialized constant Foo
This also slightly changes the error message when trying to define a method
through instance_eval on an object which can't have a singleton class.
Before:
$ ruby -e '123.instance_eval { def foo; end }'
-e:1:in `block in <main>': no class/module to add method (TypeError)
After:
$ ./ruby -e '123.instance_eval { def foo; end }'
-e:1:in `block in <main>': can't define singleton (TypeError)
IMO this error is a small improvement on the original and better matches
the (both old and new) message when definging a method using `def self.`
$ ruby -e '123.instance_eval{ def self.foo; end }'
-e:1:in `block in <main>': can't define singleton (TypeError)
Co-authored-by: Matthew Draper <matthew@trebex.net>
* Remove "under" argument from yield_under
* Move CREF_SINGLETON_SET into vm_cref_new
* Simplify vm_get_const_base
* Fix leaf VM_SPECIAL_OBJECT_CONST_BASE
Co-authored-by: Matthew Draper <matthew@trebex.net>
suseconds_t, which is the type of tv_usec, may be defined with a longer
size type than tv_nsec's type (long). So usec to nsec conversion needs
an explicit casting.