archive/ruby - Eplg Git: Free And Private Git Hosting

mirror of https://github.com/ruby/ruby.git synced 2025-08-15 13:39:04 +02:00

Author	SHA1	Message	Date
Jean Boussier	f3206cc79b	Struct: keep direct reference to IMEMO/fields when space allows It's not rare for structs to have additional ivars, hence are one of the most common, if not the most common type in the `gen_fields_tbl`. This can cause Ractor contention, but even in single ractor mode means having to do a hash lookup to access the ivars, and increase GC work. Instead, unless the struct is perfectly right sized, we can store a reference to the associated IMEMO/fields object right after the last struct member. ``` compare-ruby: ruby 3.5.0dev (2025-08-06T12:50:36Z struct-ivar-fields-2 9a30d141a1) +PRISM [arm64-darwin24] built-ruby: ruby 3.5.0dev (2025-08-06T12:57:59Z struct-ivar-fields-2 2ff3ec237f) +PRISM [arm64-darwin24] warming up..... \| \|compare-ruby\|built-ruby\| \|:---------------------\|-----------:\|---------:\| \|member_reader \| 590.317k\| 579.246k\| \| \| 1.02x\| -\| \|member_writer \| 543.963k\| 527.104k\| \| \| 1.03x\| -\| \|member_reader_method \| 213.540k\| 213.004k\| \| \| 1.00x\| -\| \|member_writer_method \| 192.657k\| 191.491k\| \| \| 1.01x\| -\| \|ivar_reader \| 403.993k\| 569.915k\| \| \| -\| 1.41x\| ``` Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>	2025-08-06 17:07:49 +02:00
Jean Boussier	1064c63643	Fix rb_shape_transition_object_id transition to TOO_COMPLEX If `get_next_shape_internal` fails to return a shape, we must transition to a complex shape. `shape_transition_object_id` mistakenly didn't. Co-Authored-By: Peter Zhu <peter@peterzhu.ca>	2025-08-01 12:39:14 +02:00
Jean Boussier	f2a7e48dea	Make `RClass.cc_table` a managed object For now this doesn't change anything, but now that the table is managed by GC, it opens the door to using RCU in multi-ractor mode, and hence allows unsynchronized reads.	2025-08-01 10:42:04 +02:00
Jean Boussier	fc5e1541e4	Use `rb_gc_mark_weak` for `cc->klass`. One of the biggest remaining contention points is `RClass.cc_table`. The logical solution would be to turn it into a managed object, so we can use an RCU strategy, given it's read heavy. However, that's not currently possible because the table can't be freed before the owning class, given the class free function MUST go over all the CC entries to invalidate them. However if the `CC->klass` reference is weak marked, then the GC will take care of setting the reference to `Qundef`.	2025-08-01 10:42:04 +02:00
Peter Zhu	7cece235ab	Don't check the symbol's fstr at shutdown During Ruby's shutdown, we no longer need to check the fstr of the symbol because we don't use the fstr anymore for freeing the symbol. This can also fix the following ASAN error: ==2721247==ERROR: AddressSanitizer: use-after-poison on address 0x75fa90a627b8 at pc 0x64a7b06fb4bc bp 0x7ffdf95ba9b0 sp 0x7ffdf95ba9a8 READ of size 8 at 0x75fa90a627b8 thread T0 #0 0x64a7b06fb4bb in RB_BUILTIN_TYPE include/ruby/internal/value_type.h:191:30 #1 0x64a7b06fb4bb in rb_gc_shutdown_call_finalizer_p gc.c:357:18 #2 0x64a7b06fb4bb in rb_gc_impl_shutdown_call_finalizer gc/default/default.c:3045:21 #3 0x64a7b06fb4bb in rb_objspace_call_finalizer gc.c:1739:5 #4 0x64a7b06ca1b2 in rb_ec_finalize eval.c:165:5 #5 0x64a7b06ca1b2 in rb_ec_cleanup eval.c:256:5 #6 0x64a7b06c98a3 in ruby_cleanup eval.c:179:12	2025-07-30 13:57:32 -04:00
Peter Zhu	a2e165e8a0	Remove dsymbol_fstr_hash We don't need to delay the freeing of the fstr for the symbol if we store the hash of the fstr in the dynamic symbol and we use compare-by-identity for removing the dynamic symbol from the sym_set.	2025-07-21 10:58:30 -04:00
Peter Zhu	2bcb155b49	Convert global symbol table to concurrent set	2025-07-21 10:58:30 -04:00
Yusuke Endoh	830ab2c5b5	Add a comment to count_objects to prevent future regression	2025-07-16 20:14:20 +09:00
Yusuke Endoh	6d17a3e647	Prevent ObjectSpace.count_objects from allocating extra arrays `ObjectSpace.count_objects` could cause an unintended array allocation. It returns a hash like `{ :T_ARRAY => 100, :T_STRING => 100, ... }`, so it creates the key symbol (e.g., `:T_STRING`) for the first time. On rare occasions, this symbol creation internally allocates a new array for symbol management. This led to a problematic side effect where calling `count_objects` twice in a row could produce inconsistent results: the first call would trigger the hidden array allocation, and the second call would then report an increased count for `:T_ARRAY`. This behavior caused test failures in `test/ruby/test_allocation.rb`, which performs a baseline measurement before an operation and then asserts the exact number of new allocations. `20250716T053005Z.fail.html.gz` > 1) Failure: > TestAllocation::ProcCall::WithBlock#test_ruby2_keywords [...]: > Expected 1 array allocations for "r2k.(1, a: 2, &block)", but 2 arrays allocated. This change resolves the issue by pre-interning all key symbols used by `ObjectSpace.count_objects` before its counting. This eliminates the side effect and ensures the stability of allocation-sensitive tests. Co-authored-by: Koichi Sasada <ko1@atdot.net>	2025-07-16 18:31:10 +09:00
Kunshan Wang	51a3ea5ade	YJIT: Set code mem permissions in bulk Some GC modules, notably MMTk, support parallel GC, i.e. multiple GC threads work in parallel during a GC. Currently, when two GC threads scan two iseq objects simultaneously when YJIT is enabled, both threads will attempt to borrow `CodeBlock::mem_block`, which will result in panic. This commit makes one part of the change. We now set the YJIT code memory to writable in bulk before the reference-updating phase, and reset it to executable in bulk after the reference-updating phase. Previously, YJIT lazily sets memory pages writable while updating object references embedded in JIT-compiled machine code, and sets the memory back to executable by calling `mark_all_executable`. This approach is inherently unfriendly to parallel GC because (1) it borrows `CodeBlock::mem_block`, and (2) it sets the whole `CodeBlock` as executable which races with other GC threads that are updating other iseq objects. It also has performance overhead due to the frequent invocation of system calls. We now set the permission of all the code memory in bulk before and after the reference updating phase. Multiple GC threads can now perform raw memory writes in parallel. We should also see performance improvement during moving GC because of the reduced number of `mprotect` system calls.	2025-07-14 16:21:55 -04:00
Peter Zhu	ead3739c34	Inline ASAN poison functions when ASAN is not enabled The ASAN poison functions were always defined in gc.c, even if ASAN was not enabled. This caused function calls to happen all the time even if ASAN was not enabled. This commit defines these functions as empty macros when ASAN is not enabled.	2025-06-30 10:25:58 -04:00
Peter Zhu	d9b2d89976	Extract Ractor safe table used for frozen strings This commit extracts the Ractor safe table used for frozen strings into ractor_safe_table.c, which will allow it to be used elsewhere, including for the global symbol table.	2025-06-27 09:23:14 -04:00
Jean Boussier	242343ff80	variable.c: Refactor `generic_field_set` / `generic_ivar_set` These two functions are very similar, they can share most of their logic.	2025-06-26 16:25:57 +02:00
Peter Zhu	aed7a95f9d	Move RUBY_ATOMIC_VALUE_LOAD to ruby_atomic.h Deduplicates RUBY_ATOMIC_VALUE_LOAD by moving it to ruby_atomic.h.	2025-06-25 13:04:25 -04:00
Jean Boussier	071b9affe6	Ensure `RCLASS_CLASSEXT_TBL` accessor is always used.	2025-06-23 10:04:58 +01:00
Jean Boussier	cd9f447be2	Refactor generic fields to use `T_IMEMO/fields` objects. Followup: https://github.com/ruby/ruby/pull/13589 This simplifies a lot of things: as we no longer need to manually manage the memory, we can use the Read-Copy-Update pattern and avoid numerous race conditions. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>	2025-06-17 15:28:05 +02:00
Satoshi Tagomori	50c6bd47ef	Update vm->self location and mark it in vm.c for consistency	2025-06-17 10:07:53 +09:00
ydah	251cfdfe22	Fix typo in rb_bug message for unreachable code	2025-06-15 22:35:26 +09:00
Jean Boussier	15084fbc3c	Get rid of FL_EXIVAR Now that the shape_id gives us all the same information, it's no longer needed.	2025-06-13 23:50:30 +02:00
Jean Boussier	6dbe24fe56	Use the `shape_id` rather than `FL_EXIVAR` We still keep setting `FL_EXIVAR` so that `rb_shape_verify_consistency` can detect discrepancies.	2025-06-13 23:50:30 +02:00
Jean Boussier	b51078f82e	Enforce consistency between shape_id and FL_EXIVAR The FL_EXIVAR is a bit redundant with the shape_id. Now that the `shape_id` is embedded in all objects on all archs, we can cheaply check if an object has any fields with a simple bitmask.	2025-06-13 23:50:30 +02:00
Jean Boussier	3abdd4241f	Turn `rb_classext_t.fields` into a T_IMEMO/class_fields This behaves almost exactly like a T_OBJECT; the layout is entirely compatible. This aims to solve two problems. First, it solves the problem of namespaced classes having a single `shape_id`. Now each namespaced classext has an object that can hold the namespace specific shape. Second, it opens the door to later making class instance variable writes atomic, and hence being able to read class variables without locking the VM. In the future, in multi-ractor mode, we can do the write on a copy of the `fields_obj` and then atomically swap it. Considerations: - Right now the `RClass` shape_id is always synchronized, but with namespaces we should likely mark classes that have multiple namespaces with a specific shape flag.	2025-06-12 07:58:16 +02:00
Peter Zhu	837699e160	Take file and line in GC VM locks This commit adds file and line to GC VM locking functions for debugging purposes and adds upper case macros to pass __FILE__ and __LINE__.	2025-06-09 13:57:18 -04:00
Jean Boussier	f9966b9b76	Get rid of `gen_fields_tbl.fields_count` This data is redundant because the shape already contains both the length and capacity of the object's fields. So it both wastes space and creates the possibility of a desync between the two. We also do not need to initialize everything to Qundef; this seems to be a left-over from pre-shape instance variables.	2025-06-09 16:38:29 +02:00
alpaca-tc	c8ddc0a843	Optimize callcache invalidation for refinements Fixes [Bug #21201] This change addresses a performance regression where defining methods inside `refine` blocks caused severe slowdowns. The issue was due to `rb_clear_all_refinement_method_cache()` triggering a full object space scan via `rb_objspace_each_objects` to find and invalidate affected callcaches, which is very inefficient. To fix this, I introduce `vm->cc_refinement_table` to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement.	2025-06-09 12:33:35 +09:00
Jean Boussier	a640723d31	Simplify `rb_gc_rebuild_shape` Now that there are no longer multiple shape roots, all we need to do when moving an object from one slot to the other is to update the `heap_index` part of the shape_id. Since this never needs to create a shape transition, it will always work and never result in a complex shape.	2025-06-07 18:30:44 +02:00
Koichi Sasada	1605704117	ignore confirming belonging while finalizer A finalizer registered in Ractor A can be invoked in B. ```ruby require "tempfile" r = Ractor.new{ 10_000.times{|i| Tempfile.new(["file_to_require_from_ractor#{i}", ".rb"]) } } sleep 0.1 ``` For example, the above script makes tempfiles which have finalizers on Ractor r, but at the end of the process, the main Ractor will invoke the finalizers, which violates the belonging check. This patch just ignores the belonging check to avoid CI failure. Of course it violates Ractor's isolation and is the wrong workaround. This issue will be solved with Ractor local GC.	2025-06-07 09:52:03 +09:00
Koichi Sasada	1baa396e21	fix `rp(obj)` for any object Now `rp(obj)` doesn't work if the `obj` is out-of-heap because of `asan_unpoisoning_object()`, so this patch solves it. Also add pointer information and type information to show.	2025-06-06 13:44:15 +09:00
Jean Boussier	772fc1f187	Get rid of `rb_shape_t.flags` Now all flags are only in the `shape_id_t`, and can all be checked without needing to dereference a pointer.	2025-06-05 07:44:44 +02:00
Peter Zhu	99cc100cdf	Remove dead rb_malloc_info_show_results	2025-06-04 14:07:19 -04:00
John Hawthorn	e596cf6e93	Make FrozenCore a plain T_CLASS	2025-06-02 14:57:48 -04:00
Koichi Sasada	ef2bb61018	`Ractor::Port` * Added `Ractor::Port` * `Ractor::Port#receive` (support multi-threads) * `Ractor::Port#close` * `Ractor::Port#closed?` * Added some methods * `Ractor#join` * `Ractor#value` * `Ractor#monitor` * `Ractor#unmonitor` * Removed some methods * `Ractor#take` * `Ractor.yield` * Change the spec * `Ractor.select` You can wait for multiple sequences of messages with `Ractor::Port`. ```ruby ports = 3.times.map{ Ractor::Port.new } ports.map.with_index do |port, ri| Ractor.new port,ri do |port, ri| 3.times{|i| port << "r#{ri}-#{i}"} end end p ports.each{|port| pp 3.times.map{port.receive}} ``` In this example, we use 3 ports, and 3 Ractors send messages to them respectively. We can receive a series of messages from each port. You can use `Ractor#value` to get the last value of a Ractor's block: ```ruby result = Ractor.new do heavy_task() end.value ``` You can wait for the termination of a Ractor with `Ractor#join` like this: ```ruby Ractor.new do some_task() end.join ``` `#value` and `#join` are similar to `Thread#value` and `Thread#join`. To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) is introduced. This commit changes the `Ractor.select()` method. It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates. We remove `Ractor.yield` and `Ractor#take` because: * `Ractor::Port` supports most similar use cases in a simpler manner. * Removing them significantly simplifies the code. We also change the internal thread scheduler code (thread_pthread.c): * During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks. This lock is released by `rb_ractor_sched_barrier_end()` which is called at the end of operations that require the barrier. * fix potential deadlock issues by checking interrupts just before setting UBF. https://bugs.ruby-lang.org/issues/21262	2025-05-31 04:01:33 +09:00
John Hawthorn	6a62a46c3c	Read {max_iv,variation}_count from prime classext MAX_IV_COUNT is a hint which determines the size of variable width allocation we should use for a given class. We don't need to scope this by namespace, if we end up with larger builtin objects on some namespaces that isn't a user-visible problem, just extra memory use. Similarly variation_count is used to track if a given object has had too many branches in shapes it has used, and to use too_complex when that happens. That's also just a hint, so we can use the same value across namespaces without it being visible to users. Previously variation_count was being incremented (written to) on the RCLASS_EXT_READABLE ext, which seems incorrect if we wanted it to be different across namespaces	2025-05-29 16:02:07 -04:00
Jean Boussier	925dec8d70	Rename `rb_shape_set_shape_id` in `rb_obj_set_shape_id`	2025-05-27 15:34:02 +02:00
Jean Boussier	ccf2b7c5b8	Refactor `rb_shape_too_complex_p` to take a `shape_id_t`.	2025-05-27 15:34:02 +02:00
Jean Boussier	a1f72d23a9	Refactor `rb_shape_has_object_id` Now takes a `shape_id_t` and the version that takes a `rb_shape_t *` is private.	2025-05-27 15:34:02 +02:00
Jean Boussier	a80a5000ab	Refactor `rb_obj_shape` out. It still exists but only in `shape.c`.	2025-05-27 15:34:02 +02:00
Peter Zhu	be5450467b	Fix reference updating for id2ref table The id2ref table could contain dead entries which should not be passed into rb_gc_location. Also, we already update references in gc_update_references using rb_gc_vm_weak_table_foreach so we do not need to update it again.	2025-05-27 08:22:26 +02:00
John Hawthorn	f483befd90	Add shape_id to RBasic under 32 bit This makes `RBobject` `4B` larger on 32 bit systems but simplifies the implementation a lot. [Feature #21353] Co-authored-by: Jean Boussier <byroot@ruby-lang.org>	2025-05-26 10:31:54 +02:00
Nobuyoshi Nakada	aad9fa2853	Use RB_VM_LOCKING	2025-05-25 15:22:43 +09:00
John Hawthorn	11ad7f5f47	Don't use namespaced classext for superclasses Superclasses can't be modified by user code, so do not need namespace indirection. For example Object.superclass is always BasicObject, no matter what modules are included onto it.	2025-05-23 10:22:24 -07:00
Nobuyoshi Nakada	7154b4208b	Fix a -Wmaybe-uninitialized lev in rb_gc_vm_lock() is uninitialized in single ractor mode.	2025-05-22 10:55:19 +09:00
John Hawthorn	6a16c3e26d	Remove too_complex GC assertion Classes from the default namespace are not writable, however they do not transition to too_complex until they have been written to inside a user namespace. So this assertion is invalid (as is the previous location it was) but it doesn't seem to provide us much value. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2025-05-21 17:23:18 -07:00
Aaron Patterson	6ea893f376	Add assertion for RCLASS_SET_PRIME_CLASSEXT_WRITABLE When classes are booted, they should all be writeable unless namespaces are enabled. This commit adds an assertion to ensure that classes are writable.	2025-05-21 09:51:32 -07:00
Peter Zhu	ac23fa0902	Use rb_id_table_foreach_values for mark_cc_tbl We don't need the key, so we can improve performance by only iterating on the value. This will also fix the MMTk build because looking up the key in rb_id_table_foreach requires locking the VM, which is not supported in the MMTk worker threads.	2025-05-21 11:27:02 -04:00
Jean Boussier	31ba881684	Disable GC when building id2ref table Building that table will likely malloc several times, which can trigger GC and cause a race condition by freeing objects that were just added to the table. Disabling GC to prevent the race condition isn't elegant, but given this is a deprecated callpath that is executed at most once per process, it seems acceptable.	2025-05-15 16:29:45 +02:00
Jean Boussier	60ffb714d2	Ensure shape_id is never used on T_IMEMO It doesn't make sense to set ivars or anything shape related on a T_IMEMO. Co-Authored-By: John Hawthorn <john@hawthorn.email>	2025-05-15 16:06:52 +02:00
Jean Boussier	b5575a80bc	Reduce `Object#object_id` contention. If the object isn't shareable and already has an object_id we can access it without a lock. If we need to generate an ID, we may need to lock to find the child shape. We also generate the next `object_id` using atomics.	2025-05-14 14:41:46 +02:00
Jean Boussier	f9c3feccf4	Rename `id_to_obj_tbl` -> `id2ref_tbl` As well as associated functions, this should make it more obvious what the purpose is.	2025-05-14 11:41:14 +02:00
Jean Boussier	9400119702	Fix `object_id` for classes and modules in namespace context Given classes and modules have a different set of fields in every namespace, we can't store the object_id in fields for them. Given that some space was freed in `RClass` we can store it there instead.	2025-05-14 10:26:48 +02:00
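
For readers unfamiliar with the case that commit f3206cc79b in the log above optimizes, here is a minimal sketch of a struct that carries an instance variable in addition to its members. The struct and the `label` accessor are hypothetical names chosen for illustration; nothing in this snippet reflects the internal `IMEMO/fields` layout, it only shows the Ruby-level situation the commit message describes.

```ruby
# Hypothetical struct for illustration; the names are not from the commit.
point_class = Struct.new(:x, :y) do
  # @label is an ordinary instance variable, not a struct member.
  def label=(value)
    @label = value
  end

  def label
    @label
  end
end

pt = point_class.new(1, 2)
pt.label = "sample"

members = pt.to_a                     # struct members only
extra_ivars = pt.instance_variables   # ivars stored outside the member list
puts "members=#{members.inspect} ivars=#{extra_ivars.inspect} label=#{pt.label}"
```

Reading `pt.label` here corresponds to the `ivar_reader` row of the benchmark quoted in that commit, while `pt.x` corresponds to `member_reader`.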

2705 commits
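
Commit 6d17a3e647 in the log above concerns tests that take a baseline with `ObjectSpace.count_objects` and then assert on the difference. The following is a rough sketch of that measurement pattern, not the code in `test/ruby/test_allocation.rb`. It assumes the default GC; the variable names are illustrative, and GC is disabled during the measurement so that no array can be reclaimed between the two readings.

```ruby
GC.start      # finish any pending collection first
GC.disable    # nothing may be freed while we measure
begin
  counts = {}
  ObjectSpace.count_objects(counts)   # warm-up call, reusing one result hash

  before = ObjectSpace.count_objects(counts)[:T_ARRAY]
  held = Array.new(3) { [] }          # one outer array plus three inner arrays
  after = ObjectSpace.count_objects(counts)[:T_ARRAY]

  delta = after - before
ensure
  GC.enable
end

puts "arrays allocated between the two readings: #{delta}"
```

The sketch only expects `delta` to be at least 4, since the runtime may allocate further arrays internally; the commit exists precisely because one such hidden allocation made exact assertions flaky.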