118 commits

Author SHA1 Message Date
Jean Boussier
1986d775cd symbol.c: use rb_gc_mark_and_move over rb_gc_location
The `p->field = rb_gc_location(p->field)` isn't ideal because it means all
references are rewritten on compaction, regardless of whether the referenced
object has moved. This isn't good for caches nor for Copy-on-Write.

`rb_gc_mark_and_move` avoids needless writes and, most of the time, allows
a single function to handle both marking and updating references.
2025-08-07 21:00:00 +02:00
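The difference described in the 1986d775cd message can be sketched with a toy model. This is not the C API: the slot array, the `FORWARDING` table, and both method names are hypothetical stand-ins, and the only point illustrated is that storing only when an object has moved touches fewer slots.

```ruby
# Toy model only: slots hold object "addresses", and FORWARDING maps an old
# address to a new one for objects that compaction moved (hypothetical data).
FORWARDING = { 0x1000 => 0x2000 }.freeze

# Mirrors the `p->field = rb_gc_location(p->field)` pattern:
# every slot is stored to, whether or not its object moved.
def update_unconditionally(slots)
  writes = 0
  slots.each_index do |i|
    slots[i] = FORWARDING.fetch(slots[i], slots[i])
    writes += 1
  end
  writes
end

# Mirrors the intent the commit message gives for rb_gc_mark_and_move:
# store only when the referenced object actually moved.
def update_only_if_moved(slots)
  writes = 0
  slots.each_index do |i|
    new_address = FORWARDING.fetch(slots[i], slots[i])
    next if new_address == slots[i]

    slots[i] = new_address
    writes += 1
  end
  writes
end

ALWAYS_WRITES = update_unconditionally([0x1000, 0x3000, 0x4000])
MOVED_WRITES  = update_only_if_moved([0x1000, 0x3000, 0x4000])
puts "unconditional: #{ALWAYS_WRITES} writes, only-if-moved: #{MOVED_WRITES} writes"
```

Fewer stores means fewer dirtied cache lines and fewer pages copied under Copy-on-Write, which is the concern the message raises.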
Peter Zhu
71b46938a7 Fix off-by-one in symbol next_id
Symbol last_id was renamed to next_id, but it was still initialized to
tNEXT_ID - 1, causing the first static symbol to overlap with
the last built-in symbol in id.def.
2025-08-06 13:40:27 -04:00
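The off-by-one in 71b46938a7 can be modeled in a few lines. This is a sketch under assumptions: `T_NEXT_ID` is a made-up value (the real one is generated from id.def), and the two allocation helpers are hypothetical names that only illustrate "increment, then use" versus "use, then increment" counters.

```ruby
# Hypothetical value; the real tNEXT_ID is generated from id.def.
T_NEXT_ID = 100
LAST_BUILTIN_ID = T_NEXT_ID - 1

# "last_id" style: the counter holds the last ID handed out,
# so it is incremented before use.
def allocate_with_last_id(counter)
  counter[:value] += 1
  counter[:value]
end

# "next_id" style: the counter holds the next free ID,
# so it is used first and incremented afterwards.
def allocate_with_next_id(counter)
  id = counter[:value]
  counter[:value] += 1
  id
end

# Old semantics with the old initial value: first ID is free.
FIRST_ID_OLD = allocate_with_last_id({ value: T_NEXT_ID - 1 })
# New semantics with the old initial value: first ID collides.
FIRST_ID_BUGGY = allocate_with_next_id({ value: T_NEXT_ID - 1 })
# New semantics with the corrected initial value.
FIRST_ID_FIXED = allocate_with_next_id({ value: T_NEXT_ID })

puts "collision with last built-in: #{FIRST_ID_BUGGY == LAST_BUILTIN_ID}"
```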
Peter Zhu
95320f1ddf Fix RUBY_FREE_AT_EXIT for static symbols
Since static symbols allocate memory, we should deallocate them at shutdown
to prevent memory leaks from being reported with RUBY_FREE_AT_EXIT.
2025-08-05 12:04:27 -04:00
Peter Zhu
6c24904a69 Make static symbol ID atomic
We don't need the VM lock if we make static symbol IDs atomic.
2025-07-31 11:09:03 -04:00
Jean Boussier
d488935910 Get rid of ID_JUNK
It has been aliased as ID_INTERNAL for a long time and that alias
is much more descriptive.
2025-07-28 12:22:42 +02:00
Peter Zhu
bd2d6845f1 Remove VM lock in register_static_symid 2025-07-25 09:51:24 -04:00
Peter Zhu
42f95456cc Remove VM lock for sym_find 2025-07-25 09:51:24 -04:00
Peter Zhu
2235fdb6f1 Remove VM lock for rb_id_attrset 2025-07-25 09:51:24 -04:00
Peter Zhu
93be578691 Remove global symbol locks for rb_intern 2025-07-23 10:07:11 -04:00
Peter Zhu
33a849e385 Remove global symbol lock for rb_gc_free_dsymbol 2025-07-23 10:07:11 -04:00
Peter Zhu
66349692f0 Introduce free function to rb_concurrent_set_funcs
If we create a key but don't insert it (due to another Ractor winning the
race), then it would leak memory if we don't free it. This introduces a
new function to free that memory for this case.
2025-07-21 10:58:30 -04:00
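The race described in 66349692f0 can be sketched with a toy set. Everything here is hypothetical: `ToySet`, its `create:` and `free:` callbacks, and `insert_directly` are illustration names, and a `Mutex` stands in for whatever synchronization the real concurrent set uses. The losing race is simulated by having the create callback insert a rival entry.

```ruby
# Toy stand-in for a set with create/free callbacks (not the real C structure).
class ToySet
  def initialize(create:, free:)
    @create = create
    @free = free
    @entries = {}
    @lock = Mutex.new
  end

  # Test hook used below to simulate another Ractor inserting first.
  def insert_directly(key, value)
    @lock.synchronize { @entries[key] ||= value }
  end

  def find_or_insert(key)
    existing = @lock.synchronize { @entries[key] }
    return existing if existing

    candidate = @create.call(key) # built optimistically, outside the lock
    @lock.synchronize do
      winner = @entries[key]
      if winner
        @free.call(candidate)     # we lost the race: release our unused key
        winner
      else
        @entries[key] = candidate
      end
    end
  end
end

FREED = []
TOY_SET = ToySet.new(
  create: ->(key) {
    TOY_SET.insert_directly(key, "rival") # simulate the other side winning
    "candidate"
  },
  free: ->(candidate) { FREED << candidate }
)

RESULT = TOY_SET.find_or_insert(:hello)
puts "returned #{RESULT.inspect}, freed #{FREED.inspect}"
```

Without the free callback, the losing candidate would simply be dropped, which in C means leaked memory.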
Peter Zhu
061224f3cb Remove lock for dynamic symbol
Benchmark:

    ARGV[0].to_i.times.map do
      Ractor.new do
        1_000_000.times do |i|
          "hello#{i}".to_sym
        end
      end
    end.map(&:value)

Results:

| Ractor count | Branch (s) | Master (s) |
|--------------|------------|------------|
| 1            | 0.364      | 0.401      |
| 2            | 0.555      | 1.149      |
| 3            | 0.583      | 3.890      |
| 4            | 0.680      | 3.288      |
| 5            | 0.789      | 5.107      |
2025-07-21 10:58:30 -04:00
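To make the 061224f3cb table easier to read, the speedup per Ractor count can be computed directly from its figures. The numbers below are copied from the table; the constant names are only for illustration.

```ruby
# Timings in seconds, copied from the benchmark table:
# Ractor count => [branch, master].
TIMINGS = {
  1 => [0.364, 0.401],
  2 => [0.555, 1.149],
  3 => [0.583, 3.890],
  4 => [0.680, 3.288],
  5 => [0.789, 5.107]
}.freeze

# Speedup of the branch over master, rounded to two decimals.
SPEEDUPS = TIMINGS.transform_values do |(branch, master)|
  (master / branch).round(2)
end

SPEEDUPS.each do |count, ratio|
  puts "#{count} Ractor(s): #{ratio}x faster than master"
end
```

The ratio grows from about 1.1x with one Ractor to roughly 6.5x with five, which is consistent with the lock having been the contention point.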
Peter Zhu
a2e165e8a0 Remove dsymbol_fstr_hash
We don't need to delay the freeing of the fstr for the symbol if we store
the hash of the fstr in the dynamic symbol and we use compare-by-identity
for removing the dynamic symbol from the sym_set.
2025-07-21 10:58:30 -04:00
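The compare-by-identity removal mentioned in a2e165e8a0 can be illustrated with Ruby's own `Hash#compare_by_identity`. This is only an analogy and an assumption about why identity matters here: the strings stand in for symbol entries, and the point is that an identity-based delete cannot remove a different object that merely has equal contents.

```ruby
LIVE  = String.new("hello")
STALE = String.new("hello") # equal contents, but a different object

# Default hash: keys are matched by contents (hash and eql?).
BY_VALUE = { LIVE => :entry }

# Identity hash: keys are matched by object identity only.
BY_IDENTITY = {}.compare_by_identity
BY_IDENTITY[LIVE] = :entry

BY_VALUE.delete(STALE)    # removes LIVE's entry, since contents match
BY_IDENTITY.delete(STALE) # no-op, since STALE is not the stored object

puts "by value empty: #{BY_VALUE.empty?}, by identity keeps LIVE: #{BY_IDENTITY.key?(LIVE)}"
```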
Peter Zhu
2bcb155b49 Convert global symbol table to concurrent set 2025-07-21 10:58:30 -04:00
Peter Zhu
116d11062f Assume that symbol in rb_check_symbol is not garbage
rb_check_symbol is a public API, so it is always a bug if the user holds
on to a dead object and passes it in.
2025-07-04 17:41:57 -04:00
Peter Zhu
8b2d76136b Assume that the symbol is not garbage in rb_sym2id
rb_sym2id is a public API, so it is always a bug if the user holds on to
a dead object and passes it in.
2025-07-03 09:05:23 -04:00
Jean Boussier
1f976509a5 symbol.c: enforce intern_str is always called with a lock
Add missing locks in `rb_intern_str`, `rb_id_attrset` and `rb_intern3`.
2025-07-03 12:19:04 +02:00
Nobuyoshi Nakada
edaa27ce45 Suppress warnings by gcc-13 with -Og 2025-06-05 22:33:02 +09:00
Nobuyoshi Nakada
fc518fe1ff Delimit the scopes using encoding/symbol tables 2025-05-25 15:22:43 +09:00
Nobuyoshi Nakada
bbf1130f91 Add RBIMPL_ATTR_NONSTRING_ARRAY() macro for GCC 15 2025-05-05 18:25:04 +09:00
Takashi Kokubun
67b91e7807 Drop an ignored attribute
GCC 13.3.0 (Ubuntu 24.04) emits the following warning:

../symbol.c: In function ‘rb_id_attrset’:
../symbol.c:175:9: warning: ‘nonstring’ attribute ignored on objects of type ‘const char[][8]’ [-Wattributes]
  175 |         RBIMPL_ATTR_NONSTRING() static const char id_types[][8] = {
      |         ^~~~~~~~~~~~~~~~~~~~~
2025-05-01 10:26:20 -07:00
Nobuyoshi Nakada
b42afa1dbc Suppress gcc 15 unterminated-string-initialization warnings 2025-04-30 20:04:10 +09:00
Peter Zhu
3fb455adab Move global symbol reference updating to rb_sym_global_symbols_update_references 2025-02-10 08:47:44 -05:00
Peter Zhu
8d0416ae0b Make ruby_global_symbols movable
The `ids` array and `dsymbol_fstr_hash` were pinned because they were
kept alive by rb_vm_register_global_object. This prevented the GC from
moving them even though there was reference-updating code.

This commit changes it to be marked movable by marking it as a root object.
2025-02-10 08:47:44 -05:00
Nobuyoshi Nakada
4dd9e5cf74 Add builtin type assertion 2024-04-08 11:13:29 +09:00
Peter Zhu
43dcf4d1a6 Assert correct types in get_id_serial_entry 2024-04-05 16:15:48 -04:00
Peter Zhu
a80e8ba1c4 Assert correct types in set_id_entry 2024-04-05 16:15:40 -04:00
Peter Zhu
37490474c4 Assert that rb_sym2str returns 0 or a T_STRING 2024-04-05 16:15:33 -04:00
Jean Boussier
d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only uses `405` of them,
it's a net `7kB` saving.

`vm->mark_object_ary` is also being refactored.

Prior to this change, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minor GC even though it's a mostly append-only list.

But by using a custom TypedData we can avoid having to mark
all the references on minor GC runs.

Additionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
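The memory arithmetic in d4f3dcf4df can be checked with the figures it quotes. The 8-bytes-per-slot value is derived from those figures, and the reading that the net saving equals the whole `st_table` (because the already preallocated spare slots absorb the classes) is an interpretation, not something the message states outright.

```ruby
# Figures quoted in the commit message.
ROOTED_CLASSES     = 252
ST_TABLE_BYTES     = 7224
RARRAY_BYTES       = 2016
PREALLOCATED_SLOTS = 1024
USED_SLOTS         = 405

# Derived: 2016 bytes for 252 references is 8 bytes per slot.
BYTES_PER_SLOT = RARRAY_BYTES / ROOTED_CLASSES

# Baseline: replace the st_table with an equally sized array.
BASELINE_SAVING = ST_TABLE_BYTES - RARRAY_BYTES

# Interpretation: if the classes fit into spare preallocated slots,
# the array costs nothing extra and the whole st_table is saved.
FITS_IN_SPARE_SLOTS = USED_SLOTS + ROOTED_CLASSES <= PREALLOCATED_SLOTS
NET_SAVING = FITS_IN_SPARE_SLOTS ? ST_TABLE_BYTES : BASELINE_SAVING

puts "baseline: #{BASELINE_SAVING} B, net: #{NET_SAVING} B"
```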
Alan Wu
ee3b4bec0e YJIT: Simplify Kernel#send guards and admit more cases (#9956)
Previously, our compile time check rejected dynamic symbols (e.g. what
String#to_sym could return) even though we could handle them just fine.
The runtime guard for the type of the method name was also overly
restrictive and didn't accept dynamic symbols.

Fold the type check into the rb_get_symbol_id() and take advantage of
the guard already checking for 0. This also avoids generating the same
call twice in case the same method name is presented as different
types.
2024-02-14 11:19:04 -05:00
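At the Ruby level, the cases ee3b4bec0e talks about look like the calls below: the same method name passed to `Kernel#send` as a literal symbol, as the result of `String#to_sym`, and as a string. The `Greeter` class is a made-up example, and nothing here shows what YJIT compiles; it only shows that the three spellings are equivalent to the caller.

```ruby
# Hypothetical class used only to have a method to call.
class Greeter
  def greet
    "hello"
  end
end

greeter = Greeter.new

# Same method name, presented as three different types.
VIA_LITERAL_SYMBOL = greeter.send(:greet)
VIA_TO_SYM         = greeter.send(%w[gre et].join.to_sym)
VIA_STRING         = greeter.send("greet")

puts [VIA_LITERAL_SYMBOL, VIA_TO_SYM, VIA_STRING].uniq.inspect
```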
Burdette Lamar
65f5435540 [DOC] Doc compliance (#9955) 2024-02-14 10:47:42 -05:00
Peter Zhu
4d3fc96b8b Change dsymbol_alloc to use NEWOBJ_OF 2024-02-13 15:30:06 -05:00
Peter Zhu
01fd262e62 Fix crash when checking symbol encoding
[Bug #20245]

We sometimes pass in a fake string to sym_check_asciionly. This can crash
if sym_check_asciionly raises because it creates a CFP with the fake
string as the receiver which will crash if GC tries to mark the CFP.

For example, the following script crashes:

    GC.stress = true
    Object.const_defined?("\xC3")
2024-02-08 10:12:56 -05:00
Adam Hess
6816e8efcf Free everything at shutdown
When the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
2023-12-07 15:52:35 -05:00
Nobuyoshi Nakada
79eb75a8dd [Bug #20025] Check if upper/lower before fallback to case-folding 2023-11-29 14:40:21 +09:00
Nobuyoshi Nakada
e7dc8f0b27 Compile debugging code for symbol and ID always 2023-06-30 23:59:05 +09:00
Nobuyoshi Nakada
ac0163949a Compile code without Symbol GC always 2023-06-30 23:59:05 +09:00
Matt Valentine-House
72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Takashi Kokubun
e7443dbbca Rewrite Symbol#to_sym and #intern in Ruby (#6683) 2022-11-15 21:34:30 -08:00
Jimmy Miller
467992ee35 Implement optimize send in yjit (#6488)
* Implement optimize send in yjit

This successfully makes all our benchmarks exit way less for optimize send reasons.
It makes some benchmarks faster, but not by as much as I'd like. I think this implementation
works, but there are definitely more optimal arrangements. For example, what if we compiled
send to a jump table? That seems like perhaps the most optimal we could do, but it is not obvious (to me)
how to implement given our current setup.

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

* Attempt at fixing the issues raised by @XrXr

* fix allowlist

* returns 0 instead of nil when not found

* remove comment about encoding exception

* Fix up c changes

* Update assert

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

* get rid of unneeded code and fix the flags

* Apply suggestions from code review

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

* rename and fix typo

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2022-10-11 16:37:05 -04:00
Peter Zhu
efb91ff19b Rename rb_ary_tmp_new to rb_ary_hidden_new
rb_ary_tmp_new suggests that the array is temporary in some way, but
that's not true, it just creates an array that's hidden and not on the
transient heap. This commit renames it to rb_ary_hidden_new.
2022-07-26 09:12:09 -04:00
Takashi Kokubun
5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
Daniel Colson
32e406d6d3 Ensure _id2ref finds symbols with the correct type
Prior to this commit it was possible to call `ObjectSpace._id2ref` with
an offset static symbol object_id and get back a new, incorrectly tagged
symbol:

```
> sensible_sym = ObjectSpace._id2ref(:a.object_id)
=> :a
> nonsense_sym = ObjectSpace._id2ref(:a.object_id + 40)
=> :a
> sensible_sym == nonsense_sym
=> false
```

`nonsense_sym` ends up tagged with `RUBY_ID_INSTANCE` instead of
`RUBY_ID_LOCAL`. That means we can do silly things like:

```
> foo = Object.new
> foo.instance_variable_set(:a, 123)
(irb):2:in `instance_variable_set': `a' is not allowed as an instance variable name (NameError)
> foo.instance_variable_set(ObjectSpace._id2ref(:a.object_id + 40), 123)
=> 123
> foo.instance_variables
=> [:a]
```

This was happening because `get_id_entry` ignores the tag bits when
looking up the symbol. So `rb_id2str(symid)` would return a value and
then we'd continue on with the nonsense `symid`.

This commit prevents the situation by checking that the `symid` actually
matches what we get back from `get_id_entry`. Now we get a `RangeError`
for the nonsense id:

```
> ObjectSpace._id2ref(:a.object_id)
=> :a
> ObjectSpace._id2ref(:a.object_id + 40)
(irb):1:in `_id2ref': 0x000000000013f408 is not symbol id value (RangeError)
```

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2022-07-20 10:38:44 -07:00
Nobuyoshi Nakada
8f17591435 [Bug #18905] Check symbol name types more strictly 2022-07-20 00:23:38 +09:00
Nobuyoshi Nakada
5d45afdbbf [DOC] Move the documentations of moved Symbol methods 2022-04-14 11:17:37 +09:00
Nobuyoshi Nakada
c14f230b26 Assign temporary ID to anonymous ID [Bug #18250]
Dumped iseq binary can not have unnamed symbols/IDs, and ID 0 is
stored instead.  As `struct rb_id_table` disallows ID 0, also for
the distinction, re-assign a new temporary ID based on the local
variable table index when loading from the binary, as well as the
parser.
2021-11-23 21:03:19 +09:00
Nobuyoshi Nakada
334b69e504 rb_id_serial_to_id: return unregistered ID as an internal ID
```ruby
def foo(*); ->{ super }; end
```

This code creates an anonymous parameter which is not registered as an
ID.  The problem is that when Ractors try to scan `getlocal`
instructions, they put the Symbol corresponding to the parameter
into a hash.  Since it is not registered, we end up with a
strange exception.  This commit wraps the unregistered ID in an
internal ID so that we get the same exception for `...` as `*`.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2021-11-07 12:40:27 +09:00
Nobuyoshi Nakada
1aa9fcca76 Fix STATIC_SYM2ID for large ID on IL32LLP64 platforms 2021-10-14 01:11:31 +09:00
Nobuyoshi Nakada
aa5759a22b rb_id_serial_to_id is used in key2id since 4c2d014e92 2021-10-13 11:27:09 +09:00
卜部昌平
73d2bf97c1 include/ruby/internal/symbol.h: add doxygen
Must not be a bad idea to improve documents. [ci skip]
2021-09-10 20:00:06 +09:00