349 commits

Author SHA1 Message Date
John Hawthorn
640a2f1dc7 Ensure ObjectSpace.dump won't call cc_cme on invalidated CC 2025-08-06 15:57:13 -07:00
Jean Boussier
f0c31c5e64 Get rid of RSHAPE_PARENT in favor of RSHAPE_DIRECT_CHILD_P
`RSHAPE_PARENT` is error prone because it returns a raw untagged
shape_id.

To check if a shape is a direct parent of another, tags should be
discarded. So providing a comparison function is better than exposing
untagged ids.
2025-07-31 21:55:51 +02:00
Jean Boussier
7ee127d2d1 Get rid of imemo_ast
It has been marked as obsolete for a while and I see no reason
to keep it.
2025-07-29 13:05:12 +02:00
Peter Zhu
f186f2cb70 Remove unused imemo_parser_strterm 2025-07-24 09:49:13 -04:00
Stan Lo
78820e86c7 Update doc for ObjectSpace.memsize_of 2025-07-23 15:49:04 -04:00
Jeremy Evans
0b23a8db60 Update dependencies for addition of set.h to public headers 2025-07-11 15:24:23 +09:00
Jean Boussier
cd9f447be2 Refactor generic fields to use T_IMEMO/fields objects.
Followup: https://github.com/ruby/ruby/pull/13589

This simplifies a lot of things, as we no longer need to manually
manage the memory, we can use the Read-Copy-Update pattern and
avoid numerous race conditions.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-17 15:28:05 +02:00
Jean Boussier
fb68721f63 Rename imemo_class_fields -> imemo_fields 2025-06-17 15:28:05 +02:00
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behaves almost exactly like a T_OBJECT; the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namespaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.

Second, it opens the door to later making class instance variable
writes atomic, and hence to reading class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespace we should likely mark classes that have
    multiple namespaces with a specific shape flag.
2025-06-12 07:58:16 +02:00
Jean Boussier
95201299fd Refactor the last references to rb_shape_t
The type isn't opaque because Ruby isn't often compiled with LTO,
so for optimization purposes it's better to allow as much inlining
as possible.

However ideally only `shape.c` and `shape.h` should deal with
the actual struct, and everything else should just deal with opaque
`shape_id_t`.
2025-06-11 16:38:38 +02:00
Jean Boussier
6eb0cd8df7 Get rid of SHAPE_T_OBJECT
Now that we have the `heap_index` in shape flags we no longer
need `T_OBJECT` shapes.
2025-06-07 18:30:44 +02:00
Jean Boussier
675f33508c Get rid of TOO_COMPLEX shape type
Instead it's now a `shape_id` flag.

This allows checking whether an object is complex without having
to chase the `rb_shape_t` pointer.
2025-06-04 13:13:50 +02:00
Jean Boussier
625d6a9cbb Get rid of frozen shapes.
Instead `shape_id_t` higher bits contain flags, and the first one
tells whether the shape is frozen.

This has multiple benefits:
  - Can check if a shape is frozen with a single bit check instead of
    dereferencing a pointer.
  - Guarantees it is always possible to transition to frozen.
  - Allows reclaiming `FL_FREEZE` (not done yet).

The downside is you have to be careful to preserve these flags
when transitioning.
2025-06-04 07:59:20 +02:00
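
The flag scheme described in this commit can be illustrated with a small sketch. It is written in Ruby for consistency with the other examples on this page, not in the C of `shape.c`, and the bit widths and names (`OFFSET_BITS`, `FLAG_FROZEN`, `transition_to`) are hypothetical, chosen only to show the idea of keeping flags in the higher bits of an integer id:

```ruby
# Hypothetical layout: low bits index the shape, higher bits hold flags.
# These widths and names are illustrative, not the values used in shape.h.
OFFSET_BITS = 19
OFFSET_MASK = (1 << OFFSET_BITS) - 1
FLAG_FROZEN = 1 << OFFSET_BITS
FLAGS_MASK  = ~OFFSET_MASK & 0xFFFF_FFFF

# Checking frozenness is a single bit test, with no pointer to dereference.
def shape_frozen?(shape_id)
  (shape_id & FLAG_FROZEN) != 0
end

# Transitioning to frozen only sets a bit, so it is always possible.
def transition_frozen(shape_id)
  shape_id | FLAG_FROZEN
end

# Moving to another shape has to carry the existing flags over.
def transition_to(shape_id, next_offset)
  (shape_id & FLAGS_MASK) | (next_offset & OFFSET_MASK)
end

frozen_id = transition_frozen(5)
moved_id  = transition_to(frozen_id, 9)
puts shape_frozen?(moved_id)
```

The `transition_to` helper shows the downside mentioned in the message: a transition that forgot to mask in the old flags would silently drop the frozen bit.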
Jean Boussier
e535f8248b Get rid of rb_shape_id(rb_shape_t *)
We should avoid conversions from `rb_shape_t *` into `shape_id_t`
outside of `shape.c` as the short term goal is to have `shape_id_t`
contain tags.
2025-05-27 12:45:24 +02:00
Jean Boussier
60ffb714d2 Ensure shape_id is never used on T_IMEMO
It doesn't make sense to set ivars or anything shape
related on a T_IMEMO.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2025-05-15 16:06:52 +02:00
Yusuke Endoh
cb99e54486 Update common.mk dependencies 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
e4f97ce387 Refactor rb_shape_depth to take an ID rather than a pointer.
As well as `rb_shape_edges_count` and `rb_shape_memsize`.
2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
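
From the Ruby side, the two directions described in this commit look roughly as follows. This is a minimal sketch using only `Object#object_id` and `ObjectSpace._id2ref`; how the lookup is implemented internally depends on the Ruby version, and newer Rubies may warn that `_id2ref` is deprecated:

```ruby
obj = Object.new

# Forward direction: the id is assigned on first request and then
# stays stable for the lifetime of the object.
id = obj.object_id

# Inverse direction: map the id back to the object. Per the commit
# message, this is the rare path for which the inverse table is built lazily.
found = ObjectSpace._id2ref(id)

puts found.equal?(obj)
```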
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Instance variables will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

`field` encompasses anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Jean Boussier
3ec7bfff2e Use a set_table for rb_vm_struct.unused_block_warning_table
Now that we have a hash-set implementation we can use that
instead of a hash-table with a static value.
2025-04-27 11:59:28 +02:00
Peter Zhu
bdb25959fb Move object_id to flags for ObjectSpace dumps
Moving object_id dumping from ObjectSpace to the GC flags allows ObjectSpace
to not assume the FL_SEEN_OBJ_ID flag and instead move it to the responsibility
of the GC.
2025-03-13 10:12:24 -04:00
Peter Zhu
7b6e07ea93 Add rb_gc_object_metadata API
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
2025-02-19 09:47:28 -05:00
Peter Zhu
d729c1575e Output object_id in ObjectSpace.dump
Outputs the object ID in the dump for objects whose ID has already been seen.
2025-01-30 11:48:14 -05:00
Koichi Sasada
c695536cc8 use st_update to prevent table extension
to prevent the following scenario:

1. `delete_unique_str()` can be called during GC (sweeping)
2. it calls `st_insert()` to decrement the counter
3. `st_insert()` can try to extend the table even if the key exists
4. extending the table calls `xmalloc` during GC, which causes a BUG
2024-12-23 11:05:34 +09:00
Peter Zhu
a58675386c Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
Peter Zhu
516a6cd1ad Check whether object is valid in allocation_info_tracer_compact
When reference updating ObjectSpace.trace_object_allocations, we need to
check whether the object is valid or not because it does not mark the
object so the object may be dead. This can cause a segmentation fault
if the object is on a free heap page.

For example, the following script crashes:

    require "objspace"

    objs = []
    ObjectSpace.trace_object_allocations do
      1_000_000.times do
        objs << Object.new
      end
    end

    objs = nil

    # Free pages that the objs were on
    GC.start

    # Run compaction and check that it doesn't crash
    GC.compact
2024-12-16 12:24:24 -05:00
Peter Zhu
15765eac0a Fix ObjectSpace.trace_object_allocations for compaction
We need to reinsert into the ST table when an object moves because it is
a numtable that hashes on the object address, so when an object moves we
need to reinsert it rather than just updating the key.
2024-12-16 10:12:54 -05:00
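
The need to reinsert can be shown by analogy with a plain Ruby `Hash`. This is only an analogy, not the ST numtable code: here the key's hash changes because the key is mutated, whereas in the commit it changes because the object's address changes during compaction. In both cases the entry sits in a position computed from the old hash, so overwriting the key in place is not enough:

```ruby
key = [1, 2]
table = { key => "allocation info" }

# Changing the key changes its hash value, much like a moved object
# gets a new address. The entry is still filed under the old hash.
key << 3
stale = table[key]   # typically nil: the lookup probes the wrong position

# Reinserting every entry under its current hash repairs the table.
table.rehash
fresh = table[key]

puts fresh
```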
Peter Zhu
b038530506 Fix compaction check for ObjectSpace.trace_object_allocations
We should be checking the key for moved objects rather than the value
because the key is a Ruby object and the value is malloc'd memory.
2024-12-16 10:12:54 -05:00
Alan Wu
476d655053 objspace_dump: Use FILE* to avoid crashing in mark functions
We observed crashes from rb_io_bufwrite() thread switching (through
rb_thread_check_ints()) in the middle of rb_execution_context_mark(). By
the time rb_execution_context_mark() gets a timeslice again, it reads
garbage from a frame that was already popped in another thread, crashing
the process with a SEGV. Other mark functions probably have their own ways
of breaking, but clearly, the usual IO code does too much for this
perilous pseudo GC context.

Use `FILE*` like before 5001cc4716
("Optimize ObjectSpace.dump_all"). Also, add type checking for
the private _dump methods.

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2024-12-09 16:08:35 -05:00
Jean Boussier
ee1cd1656f ObjectSpace.dump: handle Module#set_temporary_name
[Bug #20892]

Until the introduction of that method, it was impossible for a
Module name not to be valid JSON, hence it wasn't going through
the slower escaping function.

This assumption no longer holds.
2024-11-12 20:21:27 +01:00
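
A quick way to check the behaviour described here is to dump a module with an awkward temporary name and try to parse the result. The sketch below assumes only `ObjectSpace.dump` and the standard `json` library; the helper `dump_as_json` is a hypothetical name introduced for illustration, and the `respond_to?` guard is there because `Module#set_temporary_name` does not exist on older Rubies:

```ruby
require "objspace"
require "json"

# Hypothetical helper: returns the parsed dump, or nil if the dump
# is not valid JSON (which is what the bug would produce).
def dump_as_json(obj)
  JSON.parse(ObjectSpace.dump(obj))
rescue JSON::ParserError
  nil
end

mod = Module.new
# A name containing double quotes has to be escaped in the JSON output.
mod.set_temporary_name('fake "quoted" name') if mod.respond_to?(:set_temporary_name)

info = dump_as_json(mod)
puts info ? info["type"] : "dump was not valid JSON"
```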
Peter Zhu
51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
卜部昌平
c844968b72 ruby tool/update-deps --fix 2024-04-27 21:55:28 +09:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
nor disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
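
The user-visible difference between a chilled literal and a truly frozen string can be sketched as below. The example assumes it is run from a file without a `frozen_string_literal` magic comment and without `--enable-frozen-string-literal`. On Rubies that predate chilled strings the first mutation simply succeeds silently, and whether the warning is shown depends on the deprecation warning category being enabled:

```ruby
# Chilled-string warnings are assumed to be in the deprecated category.
Warning[:deprecated] = true

greeting = "hello"      # a plain literal: chilled on Rubies that support it
greeting << " world"    # mutation succeeds; a warning may be emitted
puts greeting

frozen = "hello".freeze # explicitly frozen: mutation is an error
begin
  frozen << " world"
rescue FrozenError => e
  puts "frozen string raised #{e.class}"
end
```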
Jean Boussier
b4a69351ec Move FL_SINGLETON to FL_USER1
This frees FL_USER0 on both T_MODULE and T_CLASS.

Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-06 13:11:41 -05:00
Lazarus Lazaridis
61ea202f8b [DOC] Fix invalid documentation for reachable_objects_from (#10172)
The previous documentation stated the opposite (that the method won't
work for CRuby).
2024-03-05 01:35:51 +09:00
Peter Zhu
386a006630 Use rb_hash_foreach in objspace.c
Using RHASH_TBL_RAW is a private API, so we should use rb_hash_foreach
rather than RHASH_TBL_RAW with st_foreach.
2024-02-23 10:24:21 -05:00
KJ Tsanaktsidis
61da90c1b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-19 09:55:12 +11:00
KJ Tsanaktsidis
688a6ff510 Revert "Mark asan fake stacks during machine stack marking"
This reverts commit d10bc3a2b8.
2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis
d10bc3a2b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-12 17:29:48 +11:00
Jean Boussier
6391ae9ebc objspace_dump.c: dump call cache ids with dump_append_id
Not all `ID` have an associated string.

Fixes a SEGFAULT in ObjectSpace.dump_all spec.
2023-11-22 10:24:35 +01:00
yui-knk
c3ab946e86 ObjectSpace.count_nodes doesn't count nodes
Nodes have not been managed by the GC since Ruby 2.5, so this code is
no longer needed. If ObjectSpace depends on Node, it has to be updated
whenever a node type is updated. Delete the node-related code to avoid
such updates.
2023-11-21 14:39:06 +09:00
Aaron Patterson
6fce8c7980 Don't try compacting ivars on Classes that are "too complex"
Too complex classes use a hash table to store ivs, and should always pin
their IVs.  We shouldn't touch those classes in compaction.
2023-11-20 16:09:48 -08:00
Peter Zhu
68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e3.
2023-11-13 18:26:36 -05:00
John Hawthorn
b41270842a Record more info from CALLCACHE in heap dumps
This records the called_id and klass from imemo_callcache objects in
heap dumps.
2023-11-13 15:03:11 -08:00
Peter Zhu
5f3fb4f4e3 Revert "Remove SHAPE_CAPACITY_CHANGE shapes"
This reverts commit f6910a6112.

We're seeing crashes in the test suite of Shopify's core monolith after
this change.
2023-11-10 11:27:49 -05:00
Peter Zhu
f6910a6112 Remove SHAPE_CAPACITY_CHANGE shapes
We don't need to create a shape to transition capacity as we can
transition the capacity when the capacity of the SHAPE_IVAR changes.
2023-11-09 09:25:02 -05:00