archive/ruby - Eplg Git: Free And Private Git Hosting

mirror of https://github.com/ruby/ruby.git synced 2025-08-15 13:39:04 +02:00

Author	SHA1	Message	Date
Peter Zhu	95320f1ddf	Fix RUBY_FREE_AT_EXIT for static symbols Since static symbols allocate memory, we should deallocate them at shutdown to prevent memory leaks from being reported with RUBY_FREE_AT_EXIT.	2025-08-05 12:04:27 -04:00
Jean Boussier	fc5e1541e4	Use `rb_gc_mark_weak` for `cc->klass`. One of the biggest remaining contention points is `RClass.cc_table`. The logical solution would be to turn it into a managed object, so we can use an RCU strategy, given it's read heavy. However, that's not currently possible because the table can't be freed before the owning class, given the class free function MUST go over all the CC entries to invalidate them. However if the `CC->klass` reference is weak marked, then the GC will take care of setting the reference to `Qundef`.	2025-08-01 10:42:04 +02:00
Takashi Kokubun	2cd10de330	ZJIT: Prepare for sharing JIT hooks with ZJIT (#14044)	2025-07-30 10:11:10 -07:00
Takashi Kokubun	b22eb0e468	ZJIT: Add --zjit-stats (#14034)	2025-07-29 10:00:15 -07:00
Peter Zhu	2bcb155b49	Convert global symbol table to concurrent set	2025-07-21 10:58:30 -04:00
John Hawthorn	cfc006d410	Always use atomics to get the shape count When sharing between threads we need both atomic reads and writes. We probably didn't need to use this in some cases (where we weren't running in multi-ractor mode) but I think it's best to be consistent.	2025-07-09 10:38:04 -07:00
John Hawthorn	2ed4862690	Remove unnecessary union	2025-06-24 20:02:30 -07:00
Luke Gruber	e3ec101cc2	thread_cleanup: set CFP to NULL before clearing ec's stack We clear the CFP first so that if a sampling profiler interrupts the current thread during `rb_ec_set_vm_stack`, `thread_profile_frames` returns early instead of trying to walk the stack that's no longer set on the ec. The early return in `thread_profile_frames` was introduced at `eab7f4623f`. Fixes [Bug #21441]	2025-06-17 15:03:39 -07:00
Satoshi Tagomori	50c6bd47ef	Update vm->self location and mark it in vm.c for consistency	2025-06-17 10:07:53 +09:00
Jean Boussier	7c22330cd2	Allocate `rb_shape_tree` statically There is no point allocating it during init, it adds a useless indirection.	2025-06-12 17:08:22 +02:00
Jean Boussier	de4b910381	Get rid of GET_SHAPE_TREE() It's a useless indirection.	2025-06-12 17:08:22 +02:00
alpaca-tc	c8ddc0a843	Optimize callcache invalidation for refinements Fixes [Bug #21201] This change addresses a performance regression where defining methods inside `refine` blocks caused severe slowdowns. The issue was due to `rb_clear_all_refinement_method_cache()` triggering a full object space scan via `rb_objspace_each_objects` to find and invalidate affected callcaches, which is very inefficient. To fix this, I introduce `vm->cc_refinement_table` to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement.	2025-06-09 12:33:35 +09:00
John Hawthorn	e596cf6e93	Make FrozenCore a plain T_CLASS	2025-06-02 14:57:48 -04:00
Jean Boussier	e9fd44dd72	shape.c: Implement a lock-free version of get_next_shape_internal Whenever we run into an inline cache miss when we try to set an ivar, we may need to take the global lock, just to be able to lookup inside `shape->edges`. To solve that, when we're in multi-ractor mode, we can treat the `shape->edges` as immutable. When we need to add a new edge, we first copy the table, and then replace it with CAS. This increases memory allocations, however we expect that creating new transitions becomes increasingly rare over time. ```ruby class A def initialize(bool) @a = 1 if bool @b = 2 else @c = 3 end end def test @d = 4 end end def bench(iterations) i = iterations while i > 0 A.new(true).test A.new(false).test i -= 1 end end if ARGV.first == "ractor" ractors = 8.times.map do Ractor.new do bench(20_000_000 / 8) end end ractors.each(&:take) else bench(20_000_000) end ``` The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4, and only 1.7s with this branch. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>	2025-06-02 17:49:53 +02:00
Koichi Sasada	ef2bb61018	`Ractor::Port` * Added `Ractor::Port` * `Ractor::Port#receive` (support multi-threads) * `Ractor::Port#close` * `Ractor::Port#closed?` * Added some methods * `Ractor#join` * `Ractor#value` * `Ractor#monitor` * `Ractor#unmonitor` * Removed some methods * `Ractor#take` * `Ractor.yield` * Change the spec * `Ractor.select` You can wait for multiple sequences of messages with `Ractor::Port`. ```ruby ports = 3.times.map{ Ractor::Port.new } ports.map.with_index do |port, ri| Ractor.new port,ri do |port, ri| 3.times{|i| port << "r#{ri}-#{i}"} end end p ports.each{|port| pp 3.times.map{port.receive}} ``` In this example, we use 3 ports, and 3 Ractors send messages to them respectively. We can receive a series of messages from each port. You can use `Ractor#value` to get the last value of a Ractor's block: ```ruby result = Ractor.new do heavy_task() end.value ``` You can wait for the termination of a Ractor with `Ractor#join` like this: ```ruby Ractor.new do some_task() end.join ``` `#value` and `#join` are similar to `Thread#value` and `Thread#join`. To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) is introduced. This commit changes `Ractor.select()` method. It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates. We remove `Ractor.yield` and `Ractor#take` because: * `Ractor::Port` supports most of similar use cases in a simpler manner. * Removing them significantly simplifies the code. We also change the internal thread scheduler code (thread_pthread.c): * During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks. This lock is released by `rb_ractor_sched_barrier_end()` which is called at the end of operations that require the barrier. * fix potential deadlock issues by checking interrupts just before setting UBF. https://bugs.ruby-lang.org/issues/21262	2025-05-31 04:01:33 +09:00
Peter Zhu	b5f5672034	Set iclass_is_origin flag for FrozenCore We don't free the method table for FrozenCore since it is converted to an iclass and doesn't have the iclass_is_origin flag set. This causes a memory leak to be reported during RUBY_FREE_AT_EXIT: 14 dyld 0x19f13ab98 start + 6076 13 miniruby 0x100644928 main + 96 main.c:62 12 miniruby 0x10064498c rb_main + 48 main.c:42 11 miniruby 0x10073be0c ruby_init + 16 eval.c:98 10 miniruby 0x10073bc6c ruby_setup + 232 eval.c:87 9 miniruby 0x100786b98 rb_call_inits + 168 inits.c:63 8 miniruby 0x1009b5010 Init_VM + 212 vm.c:4017 7 miniruby 0x10067aae8 rb_class_new + 44 class.c:834 6 miniruby 0x10067a04c rb_class_boot + 48 class.c:748 5 miniruby 0x10067a250 class_initialize_method_table + 32 class.c:721 4 miniruby 0x1009412a8 rb_id_table_create + 24 id_table.c:98 3 miniruby 0x100759fac ruby_xmalloc + 24 gc.c:5201 2 miniruby 0x10075fc14 ruby_xmalloc_body + 52 gc.c:5211 1 miniruby 0x1007726b4 rb_gc_impl_malloc + 92 default.c:8141 0 libsystem_malloc.dylib 0x19f30d12c _malloc_zone_malloc_instrumented_or_legacy + 152	2025-05-28 13:25:37 -04:00
Nobuyoshi Nakada	aad9fa2853	Use RB_VM_LOCKING	2025-05-25 15:22:43 +09:00
Jean Boussier	83d636f2d0	Free shapes last [Bug #21352] `rb_objspace_free_objects` may need to check objects shapes to know how to free them.	2025-05-19 15:06:08 +02:00
Alan Wu	92b218fbc3	YJIT: ZJIT: Allow both JITs in the same build This commit allows building YJIT and ZJIT simultaneously, a "combo build". Previously, `./configure --enable-yjit --enable-zjit` failed. At runtime, though, only one of the two can be enabled at a time. Add a root Cargo workspace that contains both the yjit and zjit crates. The common Rust build integration mechanisms are factored out into defs/jit.mk. Combo YJIT+ZJIT dev builds are supported; if either JIT uses `--enable-*=dev`, both of them are built in dev mode. The combo build requires Cargo, but building one JIT at a time with only rustc in release build remains supported.	2025-05-15 00:39:03 +09:00
Luke Gruber	1d4822a175	Get ractor message passing working with > 1 thread sending/receiving values in same ractor Rework ractors so that any ractor action (Ractor.receive, Ractor#send, Ractor.yield, Ractor#take, Ractor.select) will operate on the thread that called the action. It will put that thread to sleep if it's a blocking function and it needs to put it to sleep, and the awakening action (Ractor.yield, Ractor#send) will wake up the blocked thread. Before this change every blocking ractor action was associated with the ractor struct and its fields. If a ractor called Ractor.receive, its wait status was wait_receiving, and when another ractor calls r.send on it, it will look for that status in the ractor struct fields and wake it up. The problem arises when 2 threads call blocking ractor actions in the same ractor. Imagine if 1 thread has called Ractor.receive and another r.take. Then, when a different ractor calls r.send on it, it doesn't know which ruby thread is associated to which ractor action, so what ruby thread should it schedule? This change moves some fields onto the ruby thread itself so that ruby threads are the ones that have ractor blocking statuses, and threads are then specifically scheduled when unblocked. Fixes [#17624] Fixes [#21037]	2025-05-13 13:23:57 -07:00
Samuel Williams	425fa0aeb5	Make `waiting_fd` behaviour per-IO. (#13127) - `rb_thread_fd_close` is deprecated and now a no-op. - IO operations (including close) no longer take a vm-wide lock.	2025-05-13 19:02:03 +09:00
Satoshi Tagomori	382645d440	namespace on read	2025-05-11 23:32:50 +09:00
Jean Boussier	7116b0a7f1	Extract `rb_shape_free_all`	2025-05-09 10:22:51 +02:00
Jean Boussier	0ea210d1ea	Rename `ivptr` -> `fields`, `next_iv_index` -> `next_field_index` Ivars will no longer be the only thing stored inline via shapes, so keeping the `iv_index` and `ivptr` names would be confusing. `field` encompasses anything that can be stored in a VALUE array. Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.	2025-05-08 07:58:05 +02:00
Jean Boussier	3ec7bfff2e	Use a `set_table` for `rb_vm_struct.unused_block_warning_table` Now that we have a hash-set implementation we can use that instead of a hash-table with a static value.	2025-04-27 11:59:28 +02:00
刘皓	45e814d116	Fix jump buffer leak in WASI builds	2025-04-27 15:47:30 +09:00
Takashi Kokubun	8b72e07359	Disable ZJIT profiling at call-threshold (https://github.com/Shopify/zjit/pull/99) * Disable ZJIT profiling at call-threshold * Stop referencing ZJIT instructions in codegen	2025-04-18 21:53:01 +09:00
Takashi Kokubun	2915806820	Add --zjit-num-profiles option (https://github.com/Shopify/zjit/pull/98) * Add --zjit-profile-interval option * Fix min to max * Avoid rewriting instructions for --zjit-call-threshold=1 * Rename the option to --zjit-num-profiles	2025-04-18 21:53:01 +09:00
Takashi Kokubun	bb46bb781c	Stub Init_builtin_zjit for --disable-zjit	2025-04-18 21:53:00 +09:00
Takashi Kokubun	14253e7d12	Implement Insn::Param using the SP register (https://github.com/Shopify/zjit/pull/39)	2025-04-18 21:52:59 +09:00
Takashi Kokubun	22c73f1ccb	Implement FixnumAdd and stub PatchPoint/GuardType (https://github.com/Shopify/zjit/pull/30) * Implement FixnumAdd and stub PatchPoint/GuardType Co-authored-by: Max Bernstein <max.bernstein@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> * Clone Target for arm64 * Use $create instead of use create Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> * Fix misindentation from suggested changes * Drop an unneeded variable for mut * Load operand into a register only if necessary --------- Co-authored-by: Max Bernstein <max.bernstein@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2025-04-18 21:52:59 +09:00
Takashi Kokubun	0a543daf15	Add zjit_* instructions to profile the interpreter (https://github.com/Shopify/zjit/pull/16) * Add zjit_* instructions to profile the interpreter * Rename FixnumPlus to FixnumAdd * Update a comment about Invalidate * Rename Guard to GuardType * Rename Invalidate to PatchPoint * Drop unneeded debug!() * Plan on profiling the types * Use the output of GuardType as type refined outputs	2025-04-18 21:52:59 +09:00
Takashi Kokubun	53bee25068	Implement --zjit-call-threshold As a preparation for introducing a profiling layer, we need to be able to raise the threshold to run a few cycles for profiling.	2025-04-18 21:52:58 +09:00
Takashi Kokubun	06d875b979	Backport the latest jit_compile()	2025-04-18 21:52:58 +09:00
Alan Wu	1d95139bf6	`miniruby --zjit -e nil` runs through iseq_to_ssa	2025-04-18 21:52:56 +09:00
Takashi Kokubun	0bb709718b	Hook ZJIT compilation	2025-04-18 21:52:56 +09:00
John Hawthorn	57b6a7503f	Lock-free hash set for fstrings [Feature #21268] This implements a hash set which is wait-free for lookup and lock-free for insert (unless resizing) to use for fstring de-duplication. As highlighted in https://bugs.ruby-lang.org/issues/19288, heavy use of fstrings (frozen interned strings) can significantly reduce the parallelism of Ractors. I tried a few other approaches first: using an RWLock, striping a series of RWlocks (partitioning the hash N-ways to reduce lock contention), and putting a cache in front of it. All of these improved the situation, but were unsatisfying as all still required locks for writes (and granular locks are awkward, since we run the risk of needing to reach a vm barrier) and this table is somewhat write-heavy. My main reference for this was Cliff Click's talk on a lock free hash-table for java https://www.youtube.com/watch?v=HJ-719EGIts. It turns out this lock-free hash set is made easier to implement by a few properties: * We only need a hash set rather than a hash table (we only need keys, not values), and so the full entry can be written as a single VALUE * As a set we only need lookup/insert/delete, no update * Delete is only run inside GC so does not need to be atomic (It could be made concurrent) * I use rb_vm_barrier for the (rare) table rebuilds (It could be made concurrent) We VM lock (but don't require other threads to stop) for table rebuilds, as those are rare * The conservative garbage collector makes deferred replication easy, using a T_DATA object Another benefit of having a table specific to fstrings is that we compare by value on lookup/insert, but by identity on delete, as we only want to remove the exact string which is being freed. This is faster and provides a second way to avoid the race condition in https://bugs.ruby-lang.org/issues/21172. This is a pretty standard open-addressing hash table with quadratic probing. Similar to our existing st_table or id_table. Deletes (which happen on GC) replace existing keys with a tombstone, which is the only type of update which can occur. Tombstones are only cleared out on resize. Unlike st_table, the VALUEs are stored in the hash table itself (st_table's bins) rather than as a compact index. This avoids an extra pointer dereference and is possible because we don't need to preserve insertion order. The table targets a load factor of 1/2 (it is enlarged once it is half full).	2025-04-18 13:03:54 +09:00
Aaron Patterson	3628e9e30d	Remove unused field on Thread struct It looks like stat_insn_usage was introduced with YARV, but as far as I can tell the field has never been used. I think we should remove the field since we don't use it.	2025-04-11 10:28:26 -07:00
lukeg	d80f3a287c	Ractor.make_shareable(proc_obj) makes inner structure shareable Proc objects are now traversed like other objects when making them shareable. Fixes [Bug #19372] Fixes [Bug #19374]	2025-03-26 16:05:02 -07:00
Alan Wu	08b3a45bc9	Push a real iseq in rb_vm_push_frame_fname() Previously, vm_make_env_each() (used during proc creation and for the debug inspector C API) picked up the non-GC-allocated iseq that rb_vm_push_frame_fname() creates, which led to a SEGV when the GC tried to mark the non GC object. Put a real iseq imemo instead. Speed should be about the same since the old code also did an imemo allocation and a malloc allocation. Real iseq allows ironing out the special-casing of dummy frames in rb_execution_context_mark() and rb_execution_context_update(). A check is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy iseqs. [Bug #21180] Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2025-03-12 15:00:26 -04:00
Yusuke Endoh	993fd96ce6	reject numbered parameters from Binding#local_variables Also, Binding#local_variable_get and #local_variable_set reject access to numbered parameters. [Bug #20965] [Bug #21049]	2025-02-18 16:23:24 +09:00
Takashi Kokubun	c1ce3d719d	Streamline YJIT checks on jit_compile()	2025-02-14 10:40:10 -08:00
Nobuyoshi Nakada	4a67ef09cc	[Feature #21116] Extract RJIT as a third-party gem	2025-02-13 18:01:03 +09:00
Aaron Patterson	d680a13ad0	Always return jit_entry even if NULL We can just always return the jit_entry since it will be initialized to NULL. There is no reason to specifically return NULL if yjit / rjit are disabled	2025-02-10 15:50:23 -05:00
Peter Zhu	5032791330	Fix conversion of RubyVM::FrozenCore to T_ICLASS We shouldn't directly set the flags of an object because there could be other flags set that would be erased. Instead, we can unset T_MASK and set T_ICLASS instead.	2025-01-30 10:10:48 -05:00
Peter Zhu	98b36f6f36	Use rb_gc_vm_weak_table_foreach for reference updating We can use rb_gc_vm_weak_table_foreach for reference updating of weak tables in the default GC.	2025-01-27 10:28:36 -05:00
Nobuyoshi Nakada	f7059af50a	Use no-inline version `rb_current_ec` on Arm64 The TLS across .so issue seems related to Arm64, but not Darwin.	2025-01-17 22:48:10 +09:00
Peter Zhu	707c6420b1	Don't reference update frames with VM_FRAME_MAGIC_DUMMY Frames with VM_FRAME_MAGIC_DUMMY pushed by rb_vm_push_frame_fname have allocated iseq, so we should not reference update it.	2024-12-17 11:03:38 -05:00
Peter Zhu	92dd9734a9	Fix use-after-free in ep in Proc#dup for ifunc procs [Bug #20950] ifunc proc has the ep allocated in the cfunc_proc_t which is the data of the TypedData object. If an ifunc proc is duplicated, the ep points to the ep of the source object. If the source object is freed, then the ep of the duplicated object now points to a freed memory region. If we try to use the ep we could crash. For example, the following script crashes: p = { a: 1 }.to_proc 100.times do p = p.dup GC.start p.call rescue ArgumentError end This commit changes ifunc proc to also duplicate the ep when it is duplicated.	2024-12-13 10:10:03 -05:00
Randy Stauner	b021f6f8a7	Use symbol.h in vm.c to get macro for faster ID to sym (#12272) The macro provided by symbol.h uses STATIC_ID2SYM when it can which speeds up methods that declare keyword args. Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>	2024-12-05 17:51:32 -05:00
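
Note on commit 57b6a7503f (lock-free hash set for fstrings): the table layout that message describes (open addressing, quadratic probing, tombstones on delete, compare by value on lookup/insert but by identity on delete, rebuild at half full) can be sketched in plain Ruby. This is a single-threaded illustration only. It does not attempt the atomic operations of the real C implementation, and the names `ProbingSet`, `insert`, `lookup` and `delete_exact` are hypothetical, not taken from the Ruby source.

```ruby
# Hypothetical single-threaded sketch; not the data structure from the commit.
class ProbingSet
  EMPTY = Object.new.freeze      # marker: slot has never held a key
  TOMBSTONE = Object.new.freeze  # marker: slot held a key that was deleted

  attr_reader :size

  def initialize
    @capacity = 8                # kept a power of two so `& mask` works
    @slots = Array.new(@capacity, EMPTY)
    @used = 0                    # live keys plus tombstones
    @size = 0                    # live keys only
  end

  # Returns the stored element equal to `key`, inserting `key` if absent.
  def insert(key)
    rebuild if (@used + 1) * 2 > @capacity   # stay at most half full
    each_probe(key) do |idx, slot|
      if slot.equal?(EMPTY)
        @slots[idx] = key
        @used += 1
        @size += 1
        return key
      end
      return slot if !slot.equal?(TOMBSTONE) && slot.eql?(key)
    end
  end

  # Compare by value; nil when no equal element is stored.
  def lookup(key)
    each_probe(key) do |_idx, slot|
      return nil if slot.equal?(EMPTY)
      return slot if !slot.equal?(TOMBSTONE) && slot.eql?(key)
    end
  end

  # Compare by identity: only the exact object is replaced with a tombstone.
  def delete_exact(obj)
    each_probe(obj) do |idx, slot|
      return false if slot.equal?(EMPTY)
      if slot.equal?(obj)
        @slots[idx] = TOMBSTONE
        @size -= 1
        return true
      end
    end
  end

  private

  # Quadratic (triangular-number) probe sequence: offsets 0, 1, 3, 6, ...
  # With a power-of-two capacity this visits every slot.
  def each_probe(key)
    mask = @capacity - 1
    idx = key.hash & mask
    step = 0
    loop do
      yield idx, @slots[idx]
      step += 1
      idx = (idx + step) & mask
    end
  end

  # Tombstones are dropped only here, when the table is rebuilt.
  def rebuild
    live = @slots.reject { |s| s.equal?(EMPTY) || s.equal?(TOMBSTONE) }
    @capacity *= 2 while live.size * 4 >= @capacity
    @slots = Array.new(@capacity, EMPTY)
    @used = 0
    @size = 0
    live.each { |k| insert(k) }
  end
end
```

Inserting two distinct but equal strings returns the first one both times, which is the de-duplication behaviour the commit is after; `delete_exact` on the second string leaves the stored one untouched.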

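Note on commit e9fd44dd72 (lock-free `get_next_shape_internal`): the message describes treating `shape->edges` as immutable and, when a new edge is needed, copying the table and swapping it in with CAS. The retry loop can be sketched as follows. Ruby's core library has no compare-and-swap primitive, so the hypothetical `AtomicRef` class below emulates one with a `Mutex`; that makes the sketch not lock-free, and it only shows the copy, publish and retry structure. `EdgeTable`, `child_for` and `Shape` are likewise made-up names.

```ruby
# Hypothetical stand-in for an atomic pointer (Mutex-based CAS emulation).
class AtomicRef
  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  def get
    @lock.synchronize { @value }
  end

  # Store new_value only if the current value is still `expected` (identity).
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end
end

Shape = Struct.new(:name)   # placeholder for a child shape

class EdgeTable
  def initialize
    @edges = AtomicRef.new({}.freeze)   # published tables are never mutated
  end

  # Returns the child for `name`, creating and publishing it if missing.
  def child_for(name)
    loop do
      current = @edges.get
      existing = current[name]
      return existing if existing

      # Copy the table, add the new edge, then try to publish the copy.
      updated = current.merge(name => Shape.new(name)).freeze
      return updated[name] if @edges.compare_and_set(current, updated)
      # Another thread published first; retry against the newer table.
    end
  end

  def snapshot
    @edges.get
  end
end
```

Readers only ever see a complete frozen table, and a writer that loses the race discards its copy and retries, so every thread ends up with the same child object for a given name.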

1253 commits