Commit graph

2250 commits

Author SHA1 Message Date
Jemma Issroff
ad63b668e2
Revert "Revert "This commit implements the Object Shapes technique in CRuby.""
This reverts commit 9a6803c90b.
2022-10-11 08:40:56 -07:00
Samuel Williams
e4f91bbdba
Add IO#timeout attribute and use it for blocking IO operations. (#5653) 2022-10-07 21:48:38 +13:00
Nobuyoshi Nakada
40ceceb1a5 [Bug #19028] Suppress GCC 12 -Wuse-after-free false warning
GCC 12 introduced a new warning flag `-Wuse-after-free`, however it
has a false positive at `realloc` when optimization is disabled, since
the memory requested for reallocation is guaranteed to not be touched.
This workaround is very unclear why the false warning is suppressed by
a statement-expression GCC extension.
2022-10-04 21:53:59 +09:00
Aaron Patterson
9a6803c90b
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 16:01:50 -07:00
Jemma Issroff
d594a5a8bd
This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-28 08:26:21 -07:00
Nobuyoshi Nakada
a05b261464 Always use the longer version of TRY_WITH_GC 2022-09-28 23:51:38 +09:00
Aaron Patterson
06abfa5be6
Revert this until we can figure out WB issues or remove shapes from GC
Revert "* expand tabs. [ci skip]"

This reverts commit 830b5b5c35.

Revert "This commit implements the Object Shapes technique in CRuby."

This reverts commit 9ddfd2ca00.
2022-09-26 16:10:11 -07:00
git
830b5b5c35 * expand tabs. [ci skip]
Tabs were expanded because the file did not have any tab indentation in unedited lines.
Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
2022-09-27 01:21:58 +09:00
Jemma Issroff
9ddfd2ca00 This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26 09:21:30 -07:00
Samuel Williams
22af2e9084 Rework vm_core to use int first_lineno struct member. 2022-09-26 00:41:16 +13:00
Nobuyoshi Nakada
ff07e5c264
Skip poisoned regions
Poisoned regions cannot be accessed without unpoisoning outside gc.c.
Specifically, debug.gem is terminated by AddressSanitizer.

```
SUMMARY: AddressSanitizer: use-after-poison iseq_collector.c:39 in iseq_i
```
2022-08-09 20:11:48 +09:00
Peter Zhu
229cf263df Lock the VM for rb_gc_writebarrier_unprotect
When using Ractors, rb_gc_writebarrier_unprotect requries a VM lock
since it modifies the bitmaps.
2022-07-28 10:02:12 -04:00
Peter Zhu
1c16645216 Make array slices views rather than copies
Before this commit, if the slice fits in VWA, it would make a copy
rather than a view. This is slower as it requires a memcpy of the
contents.
2022-07-28 10:02:12 -04:00
Peter Zhu
2375afb8d6 Refactor gc_ref_update_array 2022-07-28 10:02:12 -04:00
Nobuyoshi Nakada
5d5c1d0fbd
Suppress use-after-free warning by gcc-12 2022-07-28 09:06:42 +09:00
Nobuyoshi Nakada
f42230ff22
Adjust styles [ci skip] 2022-07-27 18:42:27 +09:00
git
3b1ed03d8c * expand tabs. [ci skip]
Tabs were expanded because the file did not have any tab indentation in unedited lines.
Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
2022-07-27 01:40:03 +09:00
Jemma Issroff
36d0c71ace
Refactored poisoning and unpoisoning freelist to simpler API 2022-07-26 09:39:31 -07:00
Peter Zhu
efb91ff19b Rename rb_ary_tmp_new to rb_ary_hidden_new
rb_ary_tmp_new suggests that the array is temporary in some way, but
that's not true, it just creates an array that's hidden and not on the
transient heap. This commit renames it to rb_ary_hidden_new.
2022-07-26 09:12:09 -04:00
Nobuyoshi Nakada
b30b727c24
Fix format specifier
`uintptr_t` is not always `unsigned long`, but can be casted to void
pointer safely.
2022-07-25 09:18:36 +09:00
Takashi Kokubun
5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
Peter Zhu
cdbb9b8555 [Bug #18929] Fix heap creation thrashing in GC
Before this commit, if we don't have enough slots after sweeping but
had pages on the tomb heap, then the GC would frequently allocate and
deallocate pages. This is because after sweeping it would set
allocatable pages (since there were not enough slots) but free the
pages on the tomb heap.

This commit reuses pages on the tomb heap if there's not enough slots
after sweeping.
2022-07-21 10:46:32 -04:00
Peter Zhu
1c9acb6bb1 Refactor macros of array.c
Move some macros in array.c to internal/array.h so that other files
can also access these macros.
2022-07-21 09:02:45 -04:00
Daniel Colson
32e406d6d3 Ensure _id2ref finds symbols with the correct type
Prior to this commit it was possible to call `ObjectSpace._id2ref` with
an offset static symbol object_id and get back a new, incorrectly tagged
symbol:

```
> sensible_sym = ObjectSpace._id2ref(:a.object_id)
=> :a
> nonsense_sym = ObjectSpace._id2ref(:a.object_id + 40)
=> :a
> sensible_sym == nonsense_sym
=> false
```

`nonsense_sym` ends up tagged with `RUBY_ID_INSTANCE` instead of
`RB_ID_LOCAL`. That means we can do silly things like:

```
> foo = Object.new
> foo.instance_variable_set(:a, 123)
(irb):2:in `instance_variable_set': `a' is not allowed as an instance variable name (NameError)
> foo.instance_variable_set(ObjectSpace._id2ref(:a.object_id + 40), 123)
=> 123
> foo.instance_variables
=> [:a]
```

This was happening because `get_id_entry` ignores the tag bits when
looking up the symbol. So `rb_id2str(symid)` would return a value and
then we'd continue on with the nonsense `symid`.

This commit prevents the situation by checking that the `symid` actually
matches what we get back from `get_id_entry`. Now we get a `RangeError`
for the nonsense id:

```
> ObjectSpace._id2ref(:a.object_id)
=> :a
> ObjectSpace._id2ref(:a.object_id + 40)
(irb):1:in `_id2ref': 0x000000000013f408 is not symbol id value (RangeError)
```

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2022-07-20 10:38:44 -07:00
Peter Zhu
86d061294d [Bug #18928] Fix crash in WeakMap
In wmap_live_p, if is_pointer_to_heap returns false, then the page is
either in the tomb or has already been freed, so the object is dead. In
this case, wmap_live_p should return false.
2022-07-20 08:40:31 -04:00
Nobuyoshi Nakada
472740de41
Fix free objects count condition
Free objects have `T_NONE` as the builtin type.  A pointer to a valid
array element will never be `NULL`.
2022-07-20 17:39:54 +09:00
Peter Zhu
7424ea184f Implement Objects on VWA
This commit implements Objects on Variable Width Allocation. This allows
Objects with more ivars to be embedded (i.e. contents directly follow the
object header) which improves performance through better cache locality.
2022-07-15 09:21:07 -04:00
Matt Valentine-House
214ed4cbc6 [Feature #18901] Support size pool movement for Arrays
This commit enables Arrays to move between size pools during compaction.
This can occur if the array is mutated such that it would fit in a
different size pool when embedded.

The move is carried out in two stages:

1. The RVALUE is moved to a destination heap during object movement
   phase of compaction
2. The array data is re-embedded and the original buffer free'd if
   required. This happens during the update references step
2022-07-12 08:50:33 -04:00
Matt Valentine-House
a6dd859aff Add expand_heap option to GC.verify_compaction_references
In order to reliably test compaction we need to be able to move objects
between size pools.

In order for this to happen there must be pages in a size pool into
which we can allocate.

The existing implementation of `double_heap` only doubled the existing
number of pages in the heap, so if a size pool had a low number of pages
(or 0) it's not guaranteed that enough space will be created to move
objects into that size pool.

This commit deprecates the `double_heap` option and replaces it with
`expand_heap` instead.

expand heap will expand each heap by enough pages to hold a number of
slots defined by `GC_HEAP_INIT_SLOTS` or by `heap->total_pags` whichever
is larger.

If both `double_heap` and `expand_heap` are present, a deprecation
warning will be shown for `double_heap` and the `expand_heap` behaviour
will take precedence

Given that this is an API intended for debugging and testing GC
compaction I'm not concerned about the extra memory usage or time taken
to create the pages. However, for completeness:

Running the following `test.rb` and using `time` on my Macbook Pro shows
the following memory usage and time impact:

pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}"
GC.verify_compaction_references(double_heap: true, toward: :empty)
pp "RSS (kb): #{`ps -o rss #{Process.pid}`.lines.last.to_i}"

❯ time make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin21-fake  ./test.rb
"RSS (kb): 24000"
<internal:gc>:251: warning: double_heap is deprecated and will be removed
"RSS (kb): 25232"

________________________________________________________
Executed in  124.37 millis    fish           external
   usr time   82.22 millis    0.09 millis   82.12 millis
   sys time   28.76 millis    2.61 millis   26.15 millis

❯ time make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin21-fake  ./test.rb
"RSS (kb): 24000"
"RSS (kb): 49040"

________________________________________________________
Executed in  150.13 millis    fish           external
   usr time  103.32 millis    0.10 millis  103.22 millis
   sys time   35.73 millis    2.59 millis   33.14 millis
2022-07-11 09:00:03 -04:00
Nobuyoshi Nakada
ec09ba58d1
Extract atomic_inc_wraparound function 2022-07-10 17:56:36 +09:00
Nobuyoshi Nakada
b1b8172328
Add asan_unpoisoning_object to execute the block with unpoisoning 2022-07-10 13:11:07 +09:00
Nobuyoshi Nakada
ec303e49af
Split rb_raw_obj_info 2022-07-10 13:11:07 +09:00
Nobuyoshi Nakada
233054a609
Cycle obj_info_buffers_index atomically 2022-07-10 13:11:06 +09:00
Nobuyoshi Nakada
a006dcb73f
APPEND_S for no conversion formats 2022-07-10 13:07:40 +09:00
Nobuyoshi Nakada
2bf0313561
Rewrite APPENDF using variadic arguments 2022-07-10 13:03:22 +09:00
Nobuyoshi Nakada
51025a9013
Use size_t for rb_raw_obj_info 2022-07-10 13:03:22 +09:00
Nobuyoshi Nakada
fbe3651466
Use asan_unpoison_object_temporary 2022-07-10 13:03:22 +09:00
Nobuyoshi Nakada
b16f44ad4f
Get rid of static buffer in obj_info 2022-07-10 13:03:21 +09:00
Nobuyoshi Nakada
61c7ae4d27 Gather heap page size conditions combination
When similar combination of conditions are separated in two places, it
is harder to make sure the conditional blocks match each other,
2022-07-07 22:39:59 +09:00
Peter Zhu
f36859869f Improve error message for segv in read_barrier_handler
If the page_body is a null pointer, then read_barrier_handler will
crash with an unrelated message. This commit improves the error message.

Before:

test.rb:1: [BUG] Couldn't unprotect page 0x0000000000000000, errno: Cannot allocate memory

After:

test.rb:1: [BUG] read_barrier_handler: segmentation fault at 0x14
2022-07-07 09:39:28 -04:00
Peter Zhu
d6c98626da Fix crash in compaction due to unlocked page
The page of src could be partially compacted, so it may contain
T_MOVED. Sweeping a page may read objects on this page, so we
need to lock the page.
2022-07-07 09:39:28 -04:00
Peter Zhu
d7c5a6d49b Fix typo in gc_compact_move
The page we're sweeping is on the destination heap `dheap`, not the
source heap `heap`.
2022-07-07 09:39:28 -04:00
Nobuyoshi Nakada
f681f9ae24
Adjust indents [ci skip] 2022-07-06 00:45:06 +09:00
Nobuyoshi Nakada
cab10a2c50
Extract protect_page_body to fix mismatched braces 2022-06-18 10:20:46 +09:00
KJ Tsanaktsidis
05ffc037ad Disable Mach exception handlers when read barriers in place
The GC compaction mechanism implements a kind of read barrier by marking
some (OS) pages as unreadable, and installing a SIGBUS/SIGSEGV handler
to detect when they're accessed and invalidate an attempt to move the
object.

Unfortunately, when a debugger is attached to the Ruby interpreter on
Mac OS, the debugger will trap the EXC_BAD_ACCES mach exception before
the runtime can transform that into a SIGBUS signal and dispatch it.
Thus, execution gets stuck; any attempt to continue from the debugger
re-executes the line that caused the exception and no forward progress
can be made.

This makes it impossible to debug either the Ruby interpreter or a C
extension whilst compaction is in use.

To fix this, we disable the EXC_BAD_ACCESS handler when installing the
SIGBUS/SIGSEGV handlers, and re-enable them once the compaction is done.
The debugger will still trap on the attempt to read the bad page, but it
will be trapping the SIGBUS signal, rather than the EXC_BAD_ACCESS mach
exception. It's possible to continue from this in the debugger, which
invokes the signal handler and allows forward progress to be made.
2022-06-18 00:10:16 +09:00
Nobuyoshi Nakada
2c19086323
Suppress code unused unless GC_CAN_COMPILE_COMPACTION 2022-06-17 10:47:16 +09:00
Peter Zhu
79eaaf2d0b Include runtime checks for compaction support
Commit 0c36ba5319 changed GC compaction
methods to not be implemented when not supported. However, that commit
only does compile time checks (which currently only checks for WASM),
but there are additional compaction support checks during run time.

This commit changes it so that GC compaction methods aren't defined
during run time if the platform does not support GC compaction.

[Bug #18829]
2022-06-16 10:18:46 -04:00
Peter Zhu
52d42e7023 Rename GC_COMPACTION_SUPPORTED
Naming this macro GC_COMPACTION_SUPPORTED is misleading because it
only checks whether compaction is supported at compile time.

[Bug #18829]
2022-06-16 10:18:46 -04:00
Takashi Kokubun
1162523bae
Remove MJIT worker thread (#6006)
[Misc #18830]
2022-06-15 09:40:54 -07:00
Matt Valentine-House
56cc3e99b6 Move String RVALUES between pools
And re-embed any strings that can now fit inside the slot they've been
moved to
2022-06-13 10:11:27 -07:00