[Bug #21170]
st_table reserves -1 as a special hash value to indicate that an entry
has been deleted. To keep -1 a valid return value for the hash
function, do_hash replaces it with 0 so that it is not mistaken for
the sentinel.
Previously, when upgrading an AR table to an ST table,
rb_st_add_direct_with_hash was used, which did not perform the same
conversion. This could leave a hash in a broken state where one of its
entries, which was supposed to exist, was marked as a tombstone.
The hash could then become further corrupted when the ST table required
resizing, as the falsely tombstoned entry would be skipped but would
still be counted in num_entries, leading to an uninitialized entry at
index 15.
This should be very rare in practice, unless a very poorly
implemented custom hash function is used.
This also adds two debug assertions, one that st_add_direct_with_hash
does not receive the reserved hash value, and a second in
rebuild_table_with, which ensures that after we rebuild/compact a table
it contains the expected number of elements.
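
The conversion and the first assertion can be sketched in a few lines of
C. This is a simplified model, not the code in st.c: hash_value_t,
normalize_hash, struct entry, add_direct_with_hash and entry_deleted_p are
hypothetical names, and the two macros are assumptions modeled on the
description above (all bits set as the sentinel, 0 as its substitute).

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t hash_value_t; /* hypothetical stand-in for the table's hash type */

/* Assumption: the sentinel is "all bits set", i.e. (hash_value_t)-1. */
#define RESERVED_HASH_VAL ((hash_value_t)~(hash_value_t)0)
#define RESERVED_HASH_SUBSTITUTION_VAL ((hash_value_t)0)

struct entry {
    hash_value_t hash;
    int key;
    int value;
};

/* Map a raw hash onto the values that are safe to store in an entry. */
static hash_value_t
normalize_hash(hash_value_t raw)
{
    return raw == RESERVED_HASH_VAL ? RESERVED_HASH_SUBSTITUTION_VAL : raw;
}

/* Store an entry with a precomputed hash. The caller must normalize first. */
static void
add_direct_with_hash(struct entry *slot, int key, int value, hash_value_t hash)
{
    assert(hash != RESERVED_HASH_VAL); /* would otherwise look like a tombstone */
    slot->hash = hash;
    slot->key = key;
    slot->value = value;
}

/* An entry counts as deleted when its hash equals the sentinel. */
static int
entry_deleted_p(const struct entry *e)
{
    return e->hash == RESERVED_HASH_VAL;
}
```

Passing every hash through normalize_hash before add_direct_with_hash keeps
a live entry from ever being read back as deleted.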
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
* delete `ar_try_convert` and use `ar_force_convert_table` instead
to make the program simpler.
* `ar_force_convert_table` checks for hash modification while
calling the `#hash` method, using the following strategy:
1. copy keys (and vals) of ar_table
2. calc hashes from keys
3. compare the copied keys with the hash's keys. if they do not match, repeat from 1
fix [Bug #20050]
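
The three-step strategy above amounts to a snapshot-and-retry loop. The
following is a minimal sketch under simplifying assumptions, not the
actual `ar_force_convert_table`: keys are plain ints, and small_table,
hash_fn, compute_hashes_stable, mutate_once and mutating_hash are all
hypothetical names. mutating_hash stands in for a `#hash` method that
modifies the table while it is being called.

```c
#include <string.h>

#define AR_MAX 8 /* assumed capacity of the small array-backed table */

struct small_table {
    int keys[AR_MAX];
    unsigned n;
};

/* A user-supplied hash function; it may modify the table through ctx. */
typedef unsigned long (*hash_fn)(int key, void *ctx);

/* Fill hashes[] for the table's keys, retrying until the keys are stable. */
static unsigned
compute_hashes_stable(struct small_table *t, hash_fn fn, void *ctx,
                      unsigned long *hashes)
{
    int snapshot[AR_MAX];
    unsigned n, i;

    do {
        n = t->n;
        memcpy(snapshot, t->keys, n * sizeof(int));  /* 1. copy keys */
        for (i = 0; i < n; i++) {
            hashes[i] = fn(snapshot[i], ctx);        /* 2. calc hashes */
        }
        /* 3. compare the copy with the live keys; repeat on mismatch */
    } while (n != t->n || memcmp(snapshot, t->keys, n * sizeof(int)) != 0);

    return n;
}

/* Demo callback: changes the first key the first time it is called. */
struct mutate_once {
    struct small_table *table;
    int done;
};

static unsigned long
mutating_hash(int key, void *ctx)
{
    struct mutate_once *m = ctx;

    if (!m->done) {
        m->done = 1;
        m->table->keys[0] = 100;
    }
    return (unsigned long)key * 31u;
}
```

With mutating_hash, the first pass is discarded because the keys changed
underneath it, and the second pass produces hashes that match the table.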
According to the C99 specification section 7.20.3.2 paragraph 2:
> If ptr is a null pointer, no action occurs.
So we do not need to check that the pointer is a null pointer.
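
As a small illustration of the rule quoted above, a cleanup function can
pass a possibly-null pointer straight to free. The struct and function
names here are hypothetical and are not taken from st.c.

```c
#include <stdlib.h>

struct buffer {
    char *data; /* may be NULL when nothing has been allocated */
    size_t len;
};

/* Release the buffer's storage. No null check is needed before free. */
static void
buffer_clear(struct buffer *buf)
{
    free(buf->data); /* no action occurs if buf->data is NULL */
    buf->data = NULL;
    buf->len = 0;
}
```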
st_copy allocates a st_table, which is not needed for hashes since it is
allocated by VWA and embedded, so this causes a memory leak.
The following script demonstrates the issue:
```ruby
20.times do
  100_000.times do
    {a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9}
  end

  puts `ps -o rss= -p #{$$}`
end
```
st tables maintain insertion order, so we can marshal dump / load
objects with instance variables in the same order they were set on that
particular instance.
[ruby-core:112926] [Bug #19535]
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
When the generic_iv_tbl is resized up, rebuild_table performs
allocations that can trigger GC. If autocompaction is enabled, then
moved objects are removed from and inserted into the generic_iv_tbl.
This may cause another call to rebuild_table to resize the
generic_iv_tbl. When returning back to the original rebuild_table, some
of the data may be stale, causing the generic_iv_tbl to be corrupted.
This commit changes rebuild_table to only read data from the st_table
after the allocations have completed.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
tab->entries_bound is used to check if the bins are full in
rebuild_table_if_necessary.
Hash#shift against an empty hash assigned 0 to tab->entries_bound, but
didn't clear the bins. Thus, the table was not rebuilt even when the bins
were full. Attempting to add a new element to a hash with full bins got stuck.
This change stops clearing tab->entries_bound in Hash#shift.
[Bug #18578]
iff means if and only if, but readers without that knowledge might
assume this to be a spelling mistake. To me, this seems like
exclusionary language that is unnecessary. Simply using "if and only if"
instead should suffice.
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure should be synchronized with other ractors,
so this patch introduces some VM locks.
This patch also introduces an atomic ivar cache used by the
set/getinlinecache instructions. To make updating the ivar cache (IVC)
atomic, we changed the iv_index_tbl data structure to manage (ID -> entry),
where an entry holds a serial and an index. The IVC points to this entry so
that cache updates become atomic.
This compile-time option has been broken for years (at least since
commit 4663c224fa, according to git
bisect). Let's delete code that no longer works.
Saves committers' daily life by no longer #include-ing everything from
internal.h and making each file include what it needs instead. This would
significantly speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are the win32-related headers, which tend not to be
self-contained (headers have inclusion order dependencies).
The original st.c was a public domain hash table implementation, but
Ruby's st.c is highly modified, and its data structure is not
compatible with the original one.
Therefore, when creating an extension library to wrap C code that uses
the original st.c, the symbols conflict, which leads to a segfault.
This changes the prefix `st_*` of st.c functions to `rb_st_*` to
reflect that they are specific to Ruby and to avoid symbol conflicts.
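
One common way to rename exported symbols without touching every caller is
a macro alias in the header. Whether this commit does exactly that is an
assumption. The sketch below uses a hypothetical function,
rb_st_demo_count_nonzero, which does not exist in st.c.

```c
#include <stddef.h>

/* The exported symbol carries the rb_ prefix, so it cannot collide with a
 * function of the same unprefixed name in another library. */
size_t
rb_st_demo_count_nonzero(const int *items, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (items[i] != 0) count++;
    }
    return count;
}

/* Source-compatibility alias: existing code that spells the old name still
 * compiles, but the linker only ever sees the rb_-prefixed symbol. */
#define st_demo_count_nonzero rb_st_demo_count_nonzero
```

Callers written against the old name need only a recompile. A wrapped
library that still ships its own unprefixed functions no longer clashes at
link time.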
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct. This commit adds function prototypes
for struct st_hash_type. Honestly I don't understand why they were
commented out in the first place.
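
To illustrate what full prototypes buy, here is a sketch of a hash-type
struct whose members declare their parameters. The names data_t, index_t,
hash_type, str_compare and str_hash are stand-ins chosen for this example
and are not the declarations in st.h. The "compare returns 0 on equality"
convention is likewise an assumption of the sketch.

```c
#include <stdint.h>
#include <string.h>

typedef uintptr_t data_t;  /* hypothetical stand-in for st_data_t */
typedef uintptr_t index_t; /* hypothetical stand-in for st_index_t */

struct hash_type {
    int (*compare)(data_t a, data_t b); /* 0 when the keys are equal */
    index_t (*hash)(data_t key);
};

/* Keys are pointers to NUL-terminated strings, carried as integers. */
static int
str_compare(data_t a, data_t b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* A simple multiplicative string hash, for illustration only. */
static index_t
str_hash(data_t key)
{
    const unsigned char *p = (const unsigned char *)key;
    index_t h = 5381;

    while (*p) h = h * 33 + *p++;
    return h;
}

/* With prototypes in the struct, a function of the wrong arity or
 * parameter type is diagnosed by the compiler here. */
static const struct hash_type str_hash_type = { str_compare, str_hash };
```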
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct. This commit deletes ANYARGS from
st_foreach. I strongly believe that this commit should have come
with b0af0592fd, which added an extra
parameter to st_foreach callbacks.
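
The following sketch shows an iteration function that takes a fully typed
callback instead of an unprototyped one. It is a toy model over an array
of pairs, not st_foreach itself. data_t, foreach_callback, foreach_pairs,
sum_values and the ITER_* constants are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t data_t; /* hypothetical stand-in for st_data_t */

enum { ITER_CONTINUE, ITER_STOP };

/* The callback's full signature is part of the type, so a callback with a
 * missing or extra parameter no longer compiles silently. */
typedef int foreach_callback(data_t key, data_t value, data_t arg);

struct pair {
    data_t key;
    data_t value;
};

/* Call func on each pair. Returns 1 if the callback stopped the iteration. */
static int
foreach_pairs(const struct pair *pairs, size_t n, foreach_callback *func,
              data_t arg)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (func(pairs[i].key, pairs[i].value, arg) == ITER_STOP) return 1;
    }
    return 0;
}

/* Example callback: arg carries a pointer to an accumulator. */
static int
sum_values(data_t key, data_t value, data_t arg)
{
    (void)key;
    *(data_t *)arg += value;
    return ITER_CONTINUE;
}
```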
For some reason symbols (or classes) are being overridden in trunk
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit adds the new method `GC.compact` and compacting GC support.
Please see this issue for caveats:
https://bugs.ruby-lang.org/issues/15626
[Feature #15626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67576 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
It is hard to single out only the commits related to r67479,
so please commit the change again.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit adds the new method `GC.compact` and compacting GC support.
Please see this issue for caveats:
https://bugs.ruby-lang.org/issues/15626
[Feature #15626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
"hash_bulk_insert" first expands the table, but the target size was
wrong: it was calculated as "num_entries + (size to bulk insert)", which
is too small when "num_entries < entries_bound", i.e., when the table has
a deleted entry. "hash_bulk_insert" adds the given entries starting from
entries_bound, which led to out-of-bounds write access. [Bug #15536]
As a simple fix, this commit changes the calculation to "entries_bound +
size". I'm afraid this might be inefficient, but I think it is safe
anyway.
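
The arithmetic can be checked with a small model. The struct and function
names below are hypothetical. Only the two field names, num_entries and
entries_bound, come from the description above.

```c
#include <stddef.h>

struct table_counts {
    size_t num_entries;   /* live entries only */
    size_t entries_bound; /* one past the last used slot, deleted slots included */
};

/* Old calculation: ignores slots still occupied by deleted entries. */
static size_t
target_size_old(const struct table_counts *t, size_t bulk)
{
    return t->num_entries + bulk;
}

/* Fixed calculation: sized from where the bulk insert starts writing. */
static size_t
target_size_fixed(const struct table_counts *t, size_t bulk)
{
    return t->entries_bound + bulk;
}

/* A bulk insert writes slots entries_bound .. entries_bound + bulk - 1. */
static int
bulk_insert_fits(const struct table_counts *t, size_t bulk, size_t capacity)
{
    return capacity >= t->entries_bound + bulk;
}
```

For example, with 3 live entries, one deleted entry (entries_bound of 4)
and 4 entries to insert, the old formula asks for 7 slots while the writes
reach slot index 7, which needs 8.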
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66832 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
The reserved hash values in hash.c must be consistent with st.c.
[ruby-core:90356] [Bug #15389]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66274 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
When EMPTY_OR_DELETED_BIN_P(bin) is true, it is wrong to
subtract ENTRY_BASE from it. Delay doing so until we are sure it is
safe.
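
A minimal sketch of the ordering issue follows. It assumes bins hold either
a marker value (empty or deleted) or an entry index offset by ENTRY_BASE.
The concrete values 0, 1 and 2, the bin_t type and bin_to_entry_index are
assumptions made for this example.

```c
#include <stddef.h>

/* Assumed encoding: 0 = empty, 1 = deleted, anything else = index + 2. */
#define EMPTY_BIN 0
#define DELETED_BIN 1
#define ENTRY_BASE 2
#define EMPTY_OR_DELETED_BIN_P(b) ((b) <= DELETED_BIN)

typedef size_t bin_t;

/* Store the entry index in *index_out and return 1 if the bin refers to a
 * live entry; return 0 and leave *index_out untouched otherwise. */
static int
bin_to_entry_index(bin_t bin, size_t *index_out)
{
    if (EMPTY_OR_DELETED_BIN_P(bin)) {
        return 0; /* bin - ENTRY_BASE would wrap around for 0 and 1 */
    }
    *index_out = bin - ENTRY_BASE; /* safe: bin >= ENTRY_BASE here */
    return 1;
}
```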
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65635 b2dd03c8-39d4-4d8f-98ff-823fe69b080e