2203 commits

Author SHA1 Message Date
Peter Zhu
0ddf29f4d1 Remove unused preprocessor block 2023-02-09 11:38:32 -05:00
Matt Valentine-House
72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Peter Zhu
861d70e383 Rename iseq_mark_and_update to iseq_mark_and_move
The new name is more consistent.
2023-02-08 12:43:25 -05:00
Jean Boussier
3ab3455145 Add RUBY_GC_HEAP_INIT_SIZE_%d_SLOTS to pre-init pools granularly
The old RUBY_GC_HEAP_INIT_SLOTS isn't really usable anymore as
it initializes all the pools by the same factor, but it's unlikely
that pools will need similar sizes.

In production our 40B pool is 5 to 6 times bigger than our 80B pool.
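
As a rough sketch of how the per-pool variables could be assembled, the snippet below builds one variable per slot size using the `RUBY_GC_HEAP_INIT_SIZE_%d_SLOTS` pattern. The helper name `heap_init_env`, the `POOL_TARGETS` constant, and the slot counts are hypothetical and chosen only for illustration.

```ruby
# Hypothetical per-pool targets: slot size in bytes => desired initial slots.
# The sizes and counts below are illustrative, not measured values.
POOL_TARGETS = { 40 => 600_000, 80 => 100_000, 160 => 20_000 }.freeze

# Build one environment variable per pool, following the
# RUBY_GC_HEAP_INIT_SIZE_%d_SLOTS pattern named in this commit.
def heap_init_env(targets)
  targets.each_with_object({}) do |(slot_size, slots), env|
    env[format("RUBY_GC_HEAP_INIT_SIZE_%d_SLOTS", slot_size)] = slots.to_s
  end
end

heap_init_env(POOL_TARGETS).each { |name, value| puts "#{name}=#{value}" }
```

The resulting hash can be passed as the first argument to `Kernel#spawn` when launching a child Ruby process.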
2023-02-08 09:26:07 +01:00
Jean byroot Boussier
4713b084da Revert "Revert "Consider DATA objects without a mark function as protected""
This reverts commit 6eae8e5f51.
2023-02-07 22:33:12 +01:00
Jean Boussier
6eae8e5f51 Revert "Consider DATA objects without a mark function as protected"
This reverts commit 6e4c242130.
2023-02-07 15:22:06 +01:00
Jean Boussier
6e4c242130 Consider DATA objects without a mark function as protected
It's not uncommon for simple bindings to wrap structs without
any Ruby object references, and hence with no `mark` function.

Might as well mark them as protected by a write barrier.
2023-02-07 11:48:49 +01:00
Peter Zhu
c6f84e9189 [Bug #19398] Memory leak in WeakMap
There's a memory leak in ObjectSpace::WeakMap due to not freeing
the `struct weakmap`. It can be seen in the following script:

```
100.times do
  10000.times do
    ObjectSpace::WeakMap.new
  end

  # Output the Resident Set Size (memory usage, in KB) of the current Ruby process
  puts `ps -o rss= -p #{$$}`
end
```
2023-02-01 13:23:55 -05:00
Kunshan Wang
de724487f0 Copying GC support for EXIVAR
Instance variables held in gen_ivtbl are marked with rb_gc_mark.  It
prevents the referenced objects from moving, which is bad for copying
garbage collectors.

This commit allows those instance variables to be updated during
gc_update_object_references.
2023-01-31 09:24:26 -05:00
Peter Zhu
41bf2354e3 Add rb_gc_mark_and_move and implement on iseq
This commit adds rb_gc_mark_and_move which takes a pointer to an object
and marks it during marking phase and updates references during compaction.
This allows for marking and reference updating to be combined into a
single function, which reduces code duplication and prevents bugs if
marking and reference updating go out of sync.

This commit also implements rb_gc_mark_and_move on iseq as an example.
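
The idea can be illustrated with a toy model in Ruby. This is only a sketch of the pattern, not the C implementation: `ToyGC`, `visit_iseq`, and the hash standing in for an iseq are all hypothetical names. A single traversal function lists the references once, and the collector decides per phase whether to mark or to rewrite them.

```ruby
# Toy model: one traversal function serves both marking and compaction.
class ToyGC
  attr_reader :marked

  def initialize
    @phase = :marking
    @marked = []
    @forwarding = {} # old reference => new reference
  end

  def start_compaction(forwarding)
    @phase = :compacting
    @forwarding = forwarding
  end

  # holder/key stand in for a pointer to a reference slot.
  def mark_and_move(holder, key)
    ref = holder[key]
    return if ref.nil?

    if @phase == :marking
      @marked << ref
    else
      holder[key] = @forwarding.fetch(ref, ref)
    end
  end
end

# The references are listed in exactly one place, so the marking and
# updating passes cannot drift apart.
def visit_iseq(gc, iseq)
  gc.mark_and_move(iseq, :location)
  gc.mark_and_move(iseq, :local_table)
end

demo_iseq = { location: :old_loc, local_table: :table }
demo_gc = ToyGC.new
visit_iseq(demo_gc, demo_iseq)                 # marking pass
demo_gc.start_compaction({ old_loc: :new_loc })
visit_iseq(demo_gc, demo_iseq)                 # reference-updating pass
p demo_gc.marked
p demo_iseq
```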
2023-01-19 11:23:35 -05:00
Peter Zhu
abff5f6203 Move classpath to rb_classext_t
This commit moves the classpath (and tmp_classpath) from instance
variables to the rb_classext_t. This improves performance as we no
longer need to set an instance variable when assigning a classpath to
a class.

I benchmarked with the following script:

```ruby
require "benchmark"

name = :MyClass

puts(Benchmark.measure do
  10_000_000.times do |i|
    Object.const_set(name, Class.new)
    Object.send(:remove_const, name)
  end
end)
```

Before this patch:

```
  5.440119   0.025264   5.465383 (  5.467105)
```

After this patch:

```
  4.889646   0.028325   4.917971 (  4.942678)
```
2023-01-11 11:06:58 -05:00
Peter Zhu
3be2acfafd Fix re-embedding of strings during compaction
The reference updating code for strings is not re-embedding strings
because the code is incorrectly wrapped inside of a
`if (STR_SHARED_P(obj))` clause. Shared strings can't be re-embedded
so this ends up being a no-op. This means that strings can be moved to a
large size pool during compaction, but won't be re-embedded, which would
waste the space.
2023-01-09 08:49:29 -05:00
Peter Zhu
3bcf92d8af Allow malloc during gc when GC has been disabled
We should allow malloc during GC when GC has been explicitly disabled
since garbage_collect_with_gvl won't do anything if GC has been disabled.
2023-01-04 09:10:58 -05:00
Peter Zhu
184739f1e2 [ci skip] Remove trailing semicolon in gc.c 2023-01-03 11:43:43 -05:00
Peter Zhu
90a80eb076 Fix integer underflow when using HEAP_INIT_SLOTS
There is an integer underflow when the environment variable
RUBY_GC_HEAP_INIT_SLOTS is less than the number of slots currently
in the Ruby heap.
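
To illustrate the arithmetic, here is a minimal Ruby sketch that emulates unsigned subtraction, assuming a 64-bit `size_t`. The helper names `unsigned_sub` and `extra_slots_needed` are hypothetical, and the guarded version shows one way to avoid the wraparound rather than the actual change in gc.c.

```ruby
SIZE_T_MASK = (1 << 64) - 1 # assumes a 64-bit size_t

# Emulates C unsigned subtraction, which wraps around instead of going negative.
def unsigned_sub(a, b)
  (a - b) & SIZE_T_MASK
end

# Guarded variant: only subtract when the result cannot underflow.
def extra_slots_needed(init_slots, current_slots)
  init_slots > current_slots ? init_slots - current_slots : 0
end

init_slots = 10_000    # illustrative value of RUBY_GC_HEAP_INIT_SLOTS
current_slots = 20_000 # illustrative number of slots already in the heap

puts unsigned_sub(init_slots, current_slots)       # a huge wrapped value
puts extra_slots_needed(init_slots, current_slots) # no extra slots required
```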

[Bug #19284]
2022-12-30 09:01:50 -05:00
Nobuyoshi Nakada
5df7118445
Skip insanely memory consuming tests
These tests not only consume hundreds of GiB of memory, but also result in
`rb_bug` when `RUBY_DEBUG` is enabled.
2022-12-26 15:01:44 +09:00
Peter Zhu
39e70eef72 [DOC] Fix formatting for GC.compact 2022-12-20 15:18:36 -05:00
Peter Zhu
9f4472cad7 [DOC] Escape all usages of GC
RDoc was making every usage of the word "GC" link to the page for GC
(which is the same page).
2022-12-20 15:16:36 -05:00
Peter Zhu
63fe03aa4e [DOC] Fix call-seq for GC methods
RDoc parses the last arrow in the call-seq as the arrow for the return
type. It was getting confused over the arrow in the hash.
2022-12-20 15:09:14 -05:00
Peter Zhu
ae53986834 [DOC] Fix formatting for GC#latest_compact_info 2022-12-20 15:06:06 -05:00
Peter Zhu
80e56d1438 Fix thrashing of major GC when size pool is small
If a size pool is small, then `min_free_slots < heap_init_slots` is true.
This means that min_free_slots will be set to heap_init_slots. This
causes `swept_slots < min_free_slots` to be true in a later if statement.
The if statement could trigger a major GC which could cause major GC
thrashing.
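
The two conditions can be sketched as a simplified Ruby model. This is not the code in gc.c: `major_gc_requested?` is a hypothetical helper, and `min_free_ratio` with its default of 0.2 is an assumed stand-in for the configured minimum free-slot ratio.

```ruby
# Simplified model of the check described above; not the code in gc.c.
def major_gc_requested?(total_slots:, swept_slots:, heap_init_slots:, min_free_ratio: 0.2)
  min_free_slots = (total_slots * min_free_ratio).to_i
  # A small pool has min_free_slots < heap_init_slots, so the floor applies.
  min_free_slots = heap_init_slots if min_free_slots < heap_init_slots
  swept_slots < min_free_slots
end

# Small pool: even if every slot was swept, the floor can never be reached,
# so each sweep asks for a major GC again.
puts major_gc_requested?(total_slots: 1_000, swept_slots: 1_000, heap_init_slots: 10_000)

# Larger pool: the ratio dominates and a healthy sweep does not trigger one.
puts major_gc_requested?(total_slots: 100_000, swept_slots: 30_000, heap_init_slots: 10_000)
```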
2022-12-20 11:32:51 -05:00
Peter Zhu
e7915d6d70 Fix misfire of compaction read barrier
gc_compact_move incorrectly returns false when destination heap is full
after sweeping. It returns false even if destination heap is different
than source heap (returning false means that the source heap has
finished compacting). This causes the source page to get locked, which
causes a read barrier to fire when we try to compact the source heap again.
2022-12-19 17:09:08 -05:00
Peter Zhu
8275cad1e1 Fix buffer overrun when re-embedding objects
We eagerly set the new shape of an object when moving an object during
compaction. This new shape may have a different capacity than the
current original shape capacity. This means that we cannot copy from the
original buffer using size of the new capacity. Instead, we should use
the ivar count (which is less than or equal to both the new and original
capacities).

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2022-12-19 13:13:26 -05:00
Peter Zhu
6e3bc67103 Hard crash when allocating in GC when RUBY_DEBUG
Not all builds have RGENGC_CHECK_MODE set, so it should also crash when
RUBY_DEBUG is set.
2022-12-17 09:18:54 -05:00
Peter Zhu
965f4259db Move check for GC to xmalloc and xcalloc
Moves the check earlier to before we actually perform the allocation.
2022-12-17 09:16:26 -05:00
Peter Zhu
2ccf6e5394 Don't allow allocating memory during GC
Allocating memory (xmalloc and xrealloc) during GC could cause GC to
trigger, which would crash with `[BUG] during_gc != 0`. This is an
intermittent bug which could be hard to debug.

This commit changes it so that any memory allocation during GC will
emit a warning. When debug flags are enabled it will also cause a crash.
2022-12-16 10:01:53 -05:00
Peter Zhu
5e81cf8fd0 Refactor to only attempt to move movable objects
Moves check for gc_is_moveable_obj from try_move to gc_compact_plane.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2022-12-15 15:27:38 -05:00
Matt Valentine-House
bfc66e07b7 Fix Object Movement allocation in GC
When moving Objects between size pools we have to assign a new shape.

This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.

This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.

This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.

This allows us to remove the shape allocation from update references.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2022-12-15 15:27:38 -05:00
Jemma Issroff
c1ab6ddc9a Transition complex objects to "too complex" shape
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.

Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.

This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.

For example:

```ruby
class NG; end

HUGE_NUMBER.times do
  NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```

We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.

For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:

```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
```

However, the following class is *not* too complex because it only has
one leaf in the shape tree:

```ruby
class Foo
  def initialize
    @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
  end
end
9.times { Foo.new }
```

This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-15 10:06:04 -08:00
Peter Zhu
f50aa19da6 Revert "Fix Object Movement allocation in GC"
This reverts commit 9c54466e29.

We're seeing crashes in Shopify CI after this commit.
2022-12-15 12:00:30 -05:00
Matt Valentine-House
9c54466e29 Fix Object Movement allocation in GC
When moving Objects between size pools we have to assign a new shape.

This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.

This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.

This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.

This allows us to remove the shape allocation from update references.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2022-12-15 09:04:30 -05:00
Matt Valentine-House
856e0279ec fix indentation: gc_compact_destination_pool
[ci skip]

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2022-12-13 13:31:10 -05:00
Peter Zhu
0b4fda11ec [DOC] Don't document private methods in objspace 2022-12-12 09:48:06 -05:00
Mirek Klimos
ea613c6360
Expose need_major_gc via GC.latest_gc_info (#6791) 2022-12-10 13:35:31 -05:00
Matt Valentine-House
12b5268679 Remove unused counter for heap_page->pinned_slots 2022-12-09 09:34:17 -05:00
Jemma Issroff
9c5e3671eb
Increment max_iv_count on class based on number of set_iv in initialize (#6788)
We can loosely predict the number of ivar sets on a class based on the
number of iv set instructions in the initialize method. This should give
us a more accurate estimate to use for initial size pool allocation,
which should in turn give us more cache hits.
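
A rough way to see the same signal from Ruby, assuming CRuby (where `RubyVM::InstructionSequence` is available), is to count `setinstancevariable` instructions in the disassembly of `initialize`. The `Point` class and the `count_ivar_sets` helper are hypothetical, and this only approximates what the VM does internally.

```ruby
class Point
  def initialize
    @x = 0
    @y = 0
    @z = 0
  end
end

# Count the instance variable set instructions compiled for initialize.
def count_ivar_sets(klass)
  iseq = RubyVM::InstructionSequence.of(klass.instance_method(:initialize))
  iseq.disasm.scan(/\bsetinstancevariable\b/).size
end

puts count_ivar_sets(Point)
```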
2022-11-22 15:28:14 -05:00
Peter Zhu
5f95228c76 Add RVALUE_OVERHEAD and move ractor_belonging_id
This commit adds RVALUE_OVERHEAD for storing metadata at the end of the
slot. This commit moves the ractor_belonging_id in debug builds from the
flags to RVALUE_OVERHEAD which frees the 16 bits in the headers for
object shapes.
2022-11-21 11:26:26 -05:00
Aaron Patterson
10788166e7 Differentiate T_OBJECT shapes from other objects
We would like to differentiate types of objects via their shape.  This
commit adds a special T_OBJECT shape when we allocate an instance of
T_OBJECT.  This allows us to avoid testing whether an object is an
instance of a T_OBJECT or not, we can just check the shape.
2022-11-18 08:31:56 -08:00
S-H-GAMELINKS
1f4f6c9832 Using UNDEF_P macro 2022-11-16 18:58:33 +09:00
Jemma Issroff
c726c48a3d Remove numiv from RObject
Since object shapes store the capacity of an object, we no longer
need the numiv field on RObjects. This gives us one extra slot which
we can use to give embedded objects one more instance variable (for a
total of 3 ivs). This commit removes the concept of numiv from RObject.
2022-11-10 10:11:34 -05:00
Jemma Issroff
5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Yuta Saito
3a6cdeda89 [wasm] Scan machine stack based on ec->machine.stack_{start,end}
The fiber machine stack is placed outside of the C stack allocated by wasm-ld,
so the highest stack address recorded by `rb_wasm_record_stack_base` is
invalid when running on a non-main fiber.
Therefore, we should scan `stack_{start,end}`, which always point to a valid
stack range in any context.
2022-11-06 05:03:21 +09:00
Jemma Issroff
6e4b97f1da Increment max_iv_count on class in gc marking, not gc freeing
We were previously incrementing the max_iv_count on a class in gc
freeing. By the time we free an object though, we're not guaranteed its
class is still valid. Instead, we can do this when marking and we're
guaranteed the object still knows its class.
2022-11-04 11:41:10 -04:00
John Hawthorn
02f1554224
Implement object shapes for T_CLASS and T_MODULE (#6637)
* Avoid RCLASS_IV_TBL in marshal.c
* Avoid RCLASS_IV_TBL for class names
* Avoid RCLASS_IV_TBL for autoload
* Avoid RCLASS_IV_TBL for class variables
* Avoid copying RCLASS_IV_TBL onto ICLASSes
* Use object shapes for Class and Module IVs
2022-10-31 14:05:37 -07:00
Aaron Patterson
5e0432f59b
fix ASAN error in GC 2022-10-28 16:10:55 -07:00
Jemma Issroff
a11952dac1 Rename iv_count on shapes to next_iv_index
`iv_count` is a misleading name because when IVs are unset, the new
shape doesn't decrement this value. `next_iv_index` is a more accurate and
descriptive name.
2022-10-21 14:57:34 -07:00
Jemma Issroff
13bd617ea6 Remove unused class serial
Before object shapes, we were using class serial to invalidate
inline caches. Now that we use shape_id for inline cache keys,
the class serial is unnecessary.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-10-21 14:56:48 -07:00
Nobuyoshi Nakada
e72c5044ce
Check writebarrier arguments only when RGENGC_CHECK_MODE [ci skip]
The commit 575ae50d16 was for debugging
the failure triggered by f55212bce9, and
it was fixed at the commit 39f7eddec4.
2022-10-21 10:02:16 +09:00
Nobuyoshi Nakada
9a0a165a5d Check writebarrier arguments 2022-10-20 15:43:34 -04:00
Aaron Patterson
eeea633eb2 Stop zeroing memory on allocation / copy
Shapes give us an almost exact count of instance variables on an
object.  Since we know the number of instance variables that have been
set, we will never access slots that haven't been initialized with an
IV.
2022-10-19 07:54:46 -07:00