Fix use-after-free in WeakKeyMap#clear
[Bug #20691]
If the WeakKeyMap has been marked but sweeping hasn't started yet and we
cann WeakKeyMap#clear, then there could be a use-after-free because we do
not call rb_gc_remove_weak to remove the key from the GC.
For example, the following code triggers use-after-free errors in Valgrind:
map = ObjectSpace::WeakKeyMap.new
1_000.times do
1_000.times do
map[Object.new] = nil
end
map.clear
end
Output from Valgrind:
==61230== Invalid read of size 8
==61230== at 0x25CAF8: gc_update_weak_references (default.c:5593)
==61230== by 0x25CAF8: gc_marks_finish (default.c:5641)
==61230== by 0x26031C: gc_marks_continue (default.c:5987)
==61230== by 0x26031C: gc_continue (default.c:2255)
==61230== by 0x2605FC: newobj_cache_miss (default.c:2589)
==61230== by 0x26111F: newobj_alloc (default.c:2622)
==61230== by 0x26111F: rb_gc_impl_new_obj (default.c:2701)
==61230== by 0x26111F: newobj_of (gc.c:890)
==61230== by 0x26111F: rb_wb_protected_newobj_of (gc.c:917)
==61230== by 0x2DE218: rb_class_allocate_instance (object.c:131)
==61230== by 0x2E32A8: class_call_alloc_func (object.c:2141)
==61230== by 0x2E32A8: rb_class_alloc (object.c:2113)
==61230== by 0x2E32A8: rb_class_new_instance_pass_kw (object.c:2172)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
==61230== Address 0x2159cb00 is 0 bytes inside a block of size 8 free'd
==61230== at 0x4849B2C: free (vg_replace_malloc.c:989)
==61230== by 0x248EF1: rb_gc_impl_free (default.c:8512)
==61230== by 0x248EF1: rb_gc_impl_free (default.c:8493)
==61230== by 0x248EF1: ruby_sized_xfree.constprop.0 (gc.c:4178)
==61230== by 0x4627EC: wkmap_free_table_i (weakmap.c:652)
==61230== by 0x3A54AF: apply_functor (st.c:1633)
==61230== by 0x3A54AF: st_general_foreach (st.c:1543)
==61230== by 0x3A54AF: rb_st_foreach (st.c:1640)
==61230== by 0x46203C: wkmap_clear (weakmap.c:973)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
==61230== Block was alloc'd at
==61230== at 0x484680F: malloc (vg_replace_malloc.c:446)
==61230== by 0x25C68E: rb_gc_impl_malloc (default.c:8527)
==61230== by 0x4622E9: wkmap_aset_replace (weakmap.c:817)
==61230== by 0x3A4D02: rb_st_update (st.c:1487)
==61230== by 0x4623E4: wkmap_aset (weakmap.c:854)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
==61230==
==61230== Invalid write of size 8
==61230== at 0x25CB3B: gc_update_weak_references (default.c:5598)
==61230== by 0x25CB3B: gc_marks_finish (default.c:5641)
==61230== by 0x26031C: gc_marks_continue (default.c:5987)
==61230== by 0x26031C: gc_continue (default.c:2255)
==61230== by 0x2605FC: newobj_cache_miss (default.c:2589)
==61230== by 0x26111F: newobj_alloc (default.c:2622)
==61230== by 0x26111F: rb_gc_impl_new_obj (default.c:2701)
==61230== by 0x26111F: newobj_of (gc.c:890)
==61230== by 0x26111F: rb_wb_protected_newobj_of (gc.c:917)
==61230== by 0x2DE218: rb_class_allocate_instance (object.c:131)
==61230== by 0x2E32A8: class_call_alloc_func (object.c:2141)
==61230== by 0x2E32A8: rb_class_alloc (object.c:2113)
==61230== by 0x2E32A8: rb_class_new_instance_pass_kw (object.c:2172)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
==61230== Address 0x2159cb00 is 0 bytes inside a block of size 8 free'd
==61230== at 0x4849B2C: free (vg_replace_malloc.c:989)
==61230== by 0x248EF1: rb_gc_impl_free (default.c:8512)
==61230== by 0x248EF1: rb_gc_impl_free (default.c:8493)
==61230== by 0x248EF1: ruby_sized_xfree.constprop.0 (gc.c:4178)
==61230== by 0x4627EC: wkmap_free_table_i (weakmap.c:652)
==61230== by 0x3A54AF: apply_functor (st.c:1633)
==61230== by 0x3A54AF: st_general_foreach (st.c:1543)
==61230== by 0x3A54AF: rb_st_foreach (st.c:1640)
==61230== by 0x46203C: wkmap_clear (weakmap.c:973)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
==61230== Block was alloc'd at
==61230== at 0x484680F: malloc (vg_replace_malloc.c:446)
==61230== by 0x25C68E: rb_gc_impl_malloc (default.c:8527)
==61230== by 0x4622E9: wkmap_aset_replace (weakmap.c:817)
==61230== by 0x3A4D02: rb_st_update (st.c:1487)
==61230== by 0x4623E4: wkmap_aset (weakmap.c:854)
==61230== by 0x4296BC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3788)
==61230== by 0x44A9CD: vm_sendish (vm_insnhelper.c:5955)
==61230== by 0x44A9CD: vm_exec_core (insns.def:898)
==61230== by 0x43A0E4: rb_vm_exec (vm.c:2564)
==61230== by 0x2341B4: rb_ec_exec_node (eval.c:281)
==61230== by 0x236258: ruby_run_node (eval.c:319)
==61230== by 0x15D665: rb_main (main.c:43)
==61230== by 0x15D665: main (main.c:62)
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
* Add struct weakmap_entry for WeakMap entries
* Refactor wmap_foreach to pass weakmap_entry
* Use wmap_foreach for wmap_mark
* Refactor wmap_compact to use wmap_foreach
* Remove wmap_free_entry
* Fix WeakMap use-after-free
[Bug #20688]
We cannot free the weakmap_entry before the ST_DELETE because it could
hash the key which would read the weakmap_entry and would cause a
use-after-free. Instead, we store the entry and free it on the next
iteration.
For example, the following script triggers a use-after-free in Valgrind:
weakmap = ObjectSpace::WeakMap.new
10_000.times { weakmap[Object.new] = Object.new }
==25795== Invalid read of size 8
==25795== at 0x462297: wmap_cmp (weakmap.c:165)
==25795== by 0x3A2B1C: find_table_bin_ind (st.c:930)
==25795== by 0x3A5EAA: st_general_foreach (st.c:1599)
==25795== by 0x3A5EAA: rb_st_foreach (st.c:1640)
==25795== by 0x25C991: gc_mark_children (default.c:4870)
==25795== by 0x25C991: gc_marks_wb_unprotected_objects_plane (default.c:5565)
==25795== by 0x25C991: rgengc_rememberset_mark_plane (default.c:5557)
==25795== by 0x25C991: rgengc_rememberset_mark (default.c:6233)
==25795== by 0x25C991: gc_marks_start (default.c:6057)
==25795== by 0x25C991: gc_marks (default.c:6077)
==25795== by 0x25C991: gc_start (default.c:6723)
==25795== by 0x260F96: heap_prepare (default.c:2282)
==25795== by 0x260F96: heap_next_free_page (default.c:2489)
==25795== by 0x260F96: newobj_cache_miss (default.c:2598)
==25795== by 0x26197F: newobj_alloc (default.c:2622)
==25795== by 0x26197F: rb_gc_impl_new_obj (default.c:2701)
==25795== by 0x26197F: newobj_of (gc.c:890)
==25795== by 0x26197F: rb_wb_protected_newobj_of (gc.c:917)
==25795== by 0x2DEA88: rb_class_allocate_instance (object.c:131)
==25795== by 0x2E3B18: class_call_alloc_func (object.c:2141)
==25795== by 0x2E3B18: rb_class_alloc (object.c:2113)
==25795== by 0x2E3B18: rb_class_new_instance_pass_kw (object.c:2172)
==25795== by 0x429DDC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3786)
==25795== by 0x44B08D: vm_sendish (vm_insnhelper.c:5953)
==25795== by 0x44B08D: vm_exec_core (insns.def:898)
==25795== by 0x43A7A4: rb_vm_exec (vm.c:2564)
==25795== by 0x234914: rb_ec_exec_node (eval.c:281)
==25795== Address 0x21603710 is 0 bytes inside a block of size 16 free'd
==25795== at 0x4849B2C: free (vg_replace_malloc.c:989)
==25795== by 0x249651: rb_gc_impl_free (default.c:8527)
==25795== by 0x249651: rb_gc_impl_free (default.c:8508)
==25795== by 0x249651: ruby_sized_xfree.constprop.0 (gc.c:4178)
==25795== by 0x4626EC: ruby_sized_xfree_inlined (gc.h:277)
==25795== by 0x4626EC: wmap_free_entry (weakmap.c:45)
==25795== by 0x4626EC: wmap_mark_weak_table_i (weakmap.c:61)
==25795== by 0x3A5CEF: apply_functor (st.c:1633)
==25795== by 0x3A5CEF: st_general_foreach (st.c:1543)
==25795== by 0x3A5CEF: rb_st_foreach (st.c:1640)
==25795== by 0x25C991: gc_mark_children (default.c:4870)
==25795== by 0x25C991: gc_marks_wb_unprotected_objects_plane (default.c:5565)
==25795== by 0x25C991: rgengc_rememberset_mark_plane (default.c:5557)
==25795== by 0x25C991: rgengc_rememberset_mark (default.c:6233)
==25795== by 0x25C991: gc_marks_start (default.c:6057)
==25795== by 0x25C991: gc_marks (default.c:6077)
==25795== by 0x25C991: gc_start (default.c:6723)
==25795== by 0x260F96: heap_prepare (default.c:2282)
==25795== by 0x260F96: heap_next_free_page (default.c:2489)
==25795== by 0x260F96: newobj_cache_miss (default.c:2598)
==25795== by 0x26197F: newobj_alloc (default.c:2622)
==25795== by 0x26197F: rb_gc_impl_new_obj (default.c:2701)
==25795== by 0x26197F: newobj_of (gc.c:890)
==25795== by 0x26197F: rb_wb_protected_newobj_of (gc.c:917)
==25795== by 0x2DEA88: rb_class_allocate_instance (object.c:131)
==25795== by 0x2E3B18: class_call_alloc_func (object.c:2141)
==25795== by 0x2E3B18: rb_class_alloc (object.c:2113)
==25795== by 0x2E3B18: rb_class_new_instance_pass_kw (object.c:2172)
==25795== by 0x429DDC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3786)
==25795== by 0x44B08D: vm_sendish (vm_insnhelper.c:5953)
==25795== by 0x44B08D: vm_exec_core (insns.def:898)
==25795== by 0x43A7A4: rb_vm_exec (vm.c:2564)
==25795== Block was alloc'd at
==25795== at 0x484680F: malloc (vg_replace_malloc.c:446)
==25795== by 0x25CE9E: rb_gc_impl_malloc (default.c:8542)
==25795== by 0x462A39: wmap_aset_replace (weakmap.c:423)
==25795== by 0x3A5542: rb_st_update (st.c:1487)
==25795== by 0x462B8E: wmap_aset (weakmap.c:452)
==25795== by 0x429DDC: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3786)
==25795== by 0x44B08D: vm_sendish (vm_insnhelper.c:5953)
==25795== by 0x44B08D: vm_exec_core (insns.def:898)
==25795== by 0x43A7A4: rb_vm_exec (vm.c:2564)
==25795== by 0x234914: rb_ec_exec_node (eval.c:281)
==25795== by 0x2369B8: ruby_run_node (eval.c:319)
==25795== by 0x15D675: rb_main (main.c:43)
==25795== by 0x15D675: main (main.c:62)
* Fix use-after-free for WeakKeyMap
[Bug #20688]
We cannot free the key before the ST_DELETE because it could hash the
key which would read the key and would cause a use-after-free. Instead,
we store the key and free it on the next iteration.
Enhance docs for WeakMap and WeakKeyMap
* WeakKeyMap: more class-level explanations, more details
on #getkey, fix a slight bug in code of #delete example;
* WeekMap: a bit more detailed class- and method-level docs.
If we're during incremental marking, then Ruby code can execute that
deallocates certain memory buffers that have been called with
rb_gc_mark_weak, which can cause use-after-free bugs.
[Bug #19531]
```ruby
wmap[1] = "A"
wmap[1] = "B"
```
In the example above, we need to remove the `"A" => 1` inverse reference
so that when `"A"` is GCed the `1` key isn't deleted.
In wmap_final_func, j is the number of elements + 1 (since j also
includes the length at the 0th index), so we should resize the buffer
to size j and the new length is j - 1.
[Bug #19529]
The fix for [Bug #19529] in commit 548086b contained a bug that crashes
on the following script:
```
wm = ObjectSpace::WeakMap.new
obj = Object.new
100.times do
wm[Object.new] = obj
GC.start
end
GC.compact
```
[Bug #19529]
`rb_gc_update_tbl_refs` can't be used on `w->obj2wmap` because it's
not a `VALUE -> VALUE` table, but a `VALUE -> VALUE *` table, so
we need some dedicated iterator.
For both we mark the lambda finalizer.
ObjectSpace::WeakMap doesn't mark any other reference, so we can just add the flag.
ObjectSpace::WeakKeyMap only ever add new refs in `wkmap_aset`, so we can just trigger the write barrier there.
These classes don't belong in gc.c as they're not actually part of the
GC. This commit refactors the code by moving all the code into a
weakmap.c file.