92 commits

Author SHA1 Message Date
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Jean Boussier
83f57ca3d2 String.new(capacity:) don't subtract termlen
[Bug #20585]

This was changed in 36a06efdd9 because
`String.new(1024)` would end up allocating `1025` bytes, but the problem
with this change is that the caller may be trying to right-size a String.

So instead, we should just better document the behavior of `capacity:`.
2024-06-19 15:11:07 +02:00
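A minimal sketch of what `capacity:` does from the caller's side, as discussed in 83f57ca3d2. It assumes CRuby with the `objspace` extension available; the variable names and the 4096 figure are illustrative only, and the exact footprint reported varies between Ruby versions and platforms.

```ruby
require "objspace"

# Ask for a buffer that can hold 4096 bytes without reallocating.
buf = String.new(capacity: 4096)

# The string is still empty: capacity is reserved space, not content.
empty_bytes = buf.bytesize

# On CRuby the reported footprint covers the reserved buffer (plus the
# terminator and the object slot), so it is at least the requested capacity.
footprint = ObjectSpace.memsize_of(buf)

puts "bytesize=#{empty_bytes} memsize=#{footprint}"
```

The point for callers that right-size a buffer is that the requested capacity is usable in full; whether the allocator then rounds the underlying allocation up is an implementation detail.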
tompng
04467218ce Add rb_str_resize coderange test
2024-06-13 18:27:02 +02:00
Jean Boussier
730e3b2ce0 Stop exposing rb_str_chilled_p
[Feature #20205]

Now that chilled strings no longer appear as frozen, there is no
need to offer an API to check for chilled strings.

We however need to change `rb_check_frozen_internal` to no
longer be a macro, as it needs to check for chilled strings.
2024-06-02 13:53:35 +02:00
Étienne Barrié
2b08406cd0 Expose rb_str_chilled_p
Some extensions (like stringio) may need to differentiate between
chilled strings and frozen strings.

They can now use rb_str_chilled_p but must check for its presence since
the function will be removed when chilled strings are removed.

[Bug #20389]

[Feature #20205]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-26 12:54:54 +01:00
Thomas Marshall
7e4b1f8e19 [Bug #20322] Fix rb_enc_interned_str_cstr null encoding
The documentation for `rb_enc_interned_str_cstr` notes that `enc` can be
a null pointer, but this currently causes a segmentation fault when
trying to autoload the encoding. This commit fixes the issue by checking
for NULL before calling `rb_enc_autoload`.
2024-03-03 10:43:35 +00:00
Hiroshi SHIBATA
1d972498eb Use Encoding::CESU_8 for test case
Encoding::Windows_31J is already loaded in mswin platform
2024-01-25 16:06:06 +09:00
Nobuyoshi Nakada
1910bd4247 String for string literal is not resizable
2023-11-08 00:59:45 +09:00
Jean Boussier
ac8ec004e5 Make String.new size pools aware.
If the required capacity would fit in an embedded string,
return one.

This can reduce malloc churn for code that uses string buffers.
2023-11-02 23:34:58 +01:00
Nobuyoshi Nakada
6b66b5fded [Bug #19902] Update the coderange regarding the changed region
2023-09-26 15:35:40 +09:00
Peter Zhu
4b6c584023 Remove --disable-gems for assert_separately
assert_separately adds --disable=gems so we don't need to add
--disable-gems when calling assert_separately.
2023-08-03 09:11:08 +09:00
Peter Zhu
7577c101ed Unify length field for embedded and heap strings (#7908)
* Unify length field for embedded and heap strings

The length field is of the same type and position in RString for both
embedded and heap allocated strings, so we can unify it.

* Remove RSTRING_EMBED_LEN
2023-06-06 10:19:20 -04:00
Peter Zhu
1da2e7fca3 [Feature #19579] Remove !USE_RVARGC code (#7655)
Remove !USE_RVARGC code

[Feature #19579]

The Variable Width Allocation feature was turned on by default in Ruby
3.2. Since then, we haven't received bug reports or backports to the
non-Variable Width Allocation code paths, so we assume that nobody is
using it. We also don't plan on maintaining the non-Variable Width
Allocation code, so we are going to remove it.
2023-04-04 17:30:06 -04:00
Benoit Daloze
6abe20e87b Remove Encoding#replicate
2023-01-11 13:41:41 +01:00
Jemma Issroff
5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Peter Zhu
7b77d46671 Decouple GC slot sizes from RVALUE
Add a new macro BASE_SLOT_SIZE that determines the slot size.

For Variable Width Allocation (compiled with USE_RVARGC=1), all slot
sizes are powers-of-2 multiples of BASE_SLOT_SIZE.

For USE_RVARGC=0, BASE_SLOT_SIZE is set to sizeof(RVALUE).
2022-02-02 09:52:04 -05:00
Peter Zhu
018036c282 Remove assert_equal that will never be run
`@s1.set_len(3)` will raise so the `assert_equal` will never be run.
2022-01-28 13:33:03 -05:00
Peter Zhu
2d81a718ec Make embedded string length a long for VWA
A short (2 bytes) will cause unaligned struct accesses when strings are
used as a buffer to directly store binary data.
2022-01-12 12:00:55 -05:00
Nobuyoshi Nakada
39bc5de833 Remove tainted and trusted features
These had already been announced for removal in 3.2.
2021-12-26 23:28:54 +09:00
Peter Zhu
a5b6598192 [Feature #18239] Implement VWA for strings
This commit adds support for embedded strings with variable capacity and
uses Variable Width Allocation to allocate strings.
2021-10-25 13:26:23 -04:00
Nobuyoshi Nakada
391abc543c Scan the coderange in the given encoding
2021-06-26 16:05:15 +09:00
Jean Boussier
7e8a9af9db rb_enc_interned_str: handle autoloaded encodings
If called with an autoloaded encoding that was not yet
initialized, `rb_enc_interned_str` would crash with
a NULL pointer exception.

See: https://github.com/ruby/ruby/pull/4119#issuecomment-800189841
2021-03-22 21:37:48 +09:00
Jean Boussier
6bef49427a Fix rb_interned_str_* functions to not assume static strings
Fixes [Feature #13381]

When passed a `fake_str`, `register_fstring` would create new strings
with `str_new_static`. That's not what was expected, and it serves
almost no use cases.
2020-11-30 17:33:28 +09:00
Jeremy Evans
58325daae3 Make String methods return String instances when called on a subclass instance
This modifies the following String methods to return String instances
instead of subclass instances:

* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase

This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.

Some string methods were left to return subclass instances:

* String#+@
* String#-@

Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.

Fixes [#10845]
2020-11-20 16:30:23 -08:00
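A small sketch of the behavior change in 58325daae3, assuming Ruby 3.0 or later (where this change shipped). `TaggedString` is a hypothetical subclass used only for illustration.

```ruby
# Hypothetical String subclass, not part of Ruby.
class TaggedString < String; end

s = TaggedString.new("hello")

# One of the listed methods: on Ruby 3.0+ the result is a plain String,
# on earlier versions it was a TaggedString.
upcased = s.upcase

# String#+@ was left alone: for an unfrozen receiver it returns the
# receiver itself, so the class stays TaggedString.
same = +s

puts "upcase -> #{upcased.class}, +@ -> #{same.class}"
```

Code that relied on chained calls staying inside the subclass needs to re-wrap the result explicitly, for example `TaggedString.new(s.upcase)`.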
Nobuyoshi Nakada
daa04c5562 Word array instead of splitting
2020-03-08 17:39:22 +09:00
Yusuke Endoh
0c0278b90a test/-ext-/string/test_fstring.rb: suppress a warning for taint
2019-11-18 09:25:49 -06:00
Jeremy Evans
ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
NARUSE, Yui
bea322a352 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]"
This reverts commit 6ffc045a81.
2019-11-05 17:30:54 +09:00
卜部昌平
11b6ff12af more use of RbConfig::LIMITS
`8 * RbConfig::SIZEOF` ... is not straightforward.
2019-10-08 11:21:20 +09:00
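To illustrate the motivation behind 11b6ff12af, here is a sketch comparing the two ways of obtaining the largest C `long` from Ruby. It assumes CRuby with the `rbconfig/sizeof` extension and a two's-complement `long`; the variable names are illustrative.

```ruby
require "rbconfig/sizeof"

# Indirect: derive the limit from the byte size of the type.
bits = 8 * RbConfig::SIZEOF["long"]
derived_long_max = (1 << (bits - 1)) - 1

# Direct: read the limit that the build recorded.
long_max = RbConfig::LIMITS["LONG_MAX"]

puts "derived=#{derived_long_max} recorded=#{long_max}"
```

Both values agree on such platforms, but the `LIMITS` lookup states the intent without the bit arithmetic.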
Yusuke Endoh
fc66947c61 test/-ext-/string/test_fstring.rb: suppress "possibly useless use of -@"
"in void context" by assigning the result to a dummy variable.
2019-09-30 20:22:29 +09:00
Benoit Daloze
6ffc045a81 [EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 10:23:02 +02:00
Alan Wu
93faa011d3 Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.

Introduce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].

The included test case tests for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby-level repros.

[Bug #16151]
2019-09-26 15:30:18 +09:00
John Hawthorn
04bc4c0662 Resize capacity for fstring
When a string is #frozen, its capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.

Closes: https://github.com/ruby/ruby/pull/2256
2019-06-26 15:01:48 +09:00
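The deduplication that 04bc4c0662 builds on can be observed from Ruby. This is a sketch assuming CRuby 2.5 or later, where `String#-@` returns a shared frozen string for equal contents; the variable names are illustrative.

```ruby
# Two separately allocated strings with the same bytes, one over-allocated.
a = -String.new("hello", capacity: 1000)
b = -String.new("hello")

# Both calls resolve to the same interned, frozen object.
puts a.equal?(b)
puts a.frozen?
```

Because the interned copy may live for a long time, trimming its unused capacity (as the commit does) avoids holding on to the extra reserved bytes.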
Nobuyoshi Nakada
3840791b7e Get rid of error with frozen string literal
[Bug #14194]
2019-06-23 07:56:43 +09:00
Alan Wu
c06ddfee87 str_duplicate: Don't share with a frozen shared string
This is a follow-up for 3f9562015e.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.

Such a string looks like:

```
 --------                    -----------------
 | root | ------ owns -----> | root's buffer |
 --------                    -----------------
     ^                             ^   ^
 -----------                       |   |
 | shared1 | ------ references -----   |
 -----------                           |
     ^                                 |
 -----------                           |
 | shared2 | ------ references ---------
 -----------
```

This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:

```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
    str_make_independent(str);  /* no frozen check */
}
```

If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:

```
 -----------                         --------------------
 | shared1 | -------- owns --------> | shared1's buffer |
 -----------                         --------------------
      ^
      |
 -----------                         -------------------------
 | shared2 | ------ references ----> | root's buffer (freed) |
 -----------                         -------------------------
```

Here is a reproduction script for the situation this commit fixes.

```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```

 - string.c (str_duplicate): always share with the root string when
   the original is a shared string.
 - test_rb_str_dup.rb: specifically test `rb_str_dup` to make
   sure it does not try to share with a shared string.

[Bug #15792]

Closes: https://github.com/ruby/ruby/pull/2159
2019-05-09 10:04:19 +09:00
shyouhei
b57915eddc Add FrozenError as a subclass of RuntimeError
FrozenError will be used instead of RuntimeError for exceptions
raised when there is an attempt to modify a frozen object. The
reason for this change is to differentiate exceptions related
to frozen objects from generic exceptions such as those generated
by Kernel#raise without an exception class.

From: Jeremy Evans <code@jeremyevans.net>
Signed-off-by: Urabe Shyouhei <shyouhei@ruby-lang.org>


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-12 00:46:34 +00:00
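A short sketch of what b57915eddc means for callers, assuming Ruby 2.5 or later (where `FrozenError` exists). The variable names are illustrative.

```ruby
frozen = "immutable".freeze

caught = nil
begin
  frozen << "!"          # attempt to modify a frozen object
rescue FrozenError => e  # specific class introduced by this change
  caught = e
end

# Existing `rescue RuntimeError` handlers keep working because
# FrozenError is a subclass of RuntimeError.
puts caught.class
puts caught.is_a?(RuntimeError)
```

Rescuing `FrozenError` specifically lets code distinguish a frozen-object failure from a bare `raise "message"`, which also produces a `RuntimeError`.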
naruse
7a96d788dc Add test for Bug::String.buf_new
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60993 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-03 12:06:16 +00:00
nobu
a1692f7fdf string.c: fix rb_external_str_new_with_enc
* string.c (rb_external_str_new_with_enc): do not search non-ascii
  by NULL pointer.  [ruby-core:84055] [Bug #14150]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60979 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-02 07:09:16 +00:00
hsbt
6693e3e723 Fixed misspelled words.
These are detected by https://github.com/client9/misspell

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-22 11:27:06 +00:00
nobu
9f1994ed5c io.c: shrink read buffer
* io.c (io_setstrbuf): return true if the buffer is newly created.

* io.c (io_set_read_length): shrink the read buffer if it is a new
  object and is too large.  [ruby-core:81370] [Bug #13597]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31 08:21:46 +00:00
nobu
1c16a35017 test_modify_expand.rb: skip if no overflow
* test/-ext-/string/test_modify_expand.rb (test_integer_overflow):
  no longer happens on platforms where size_t is larger than long,
  e.g. 64bit windows, since r57122.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-22 23:20:00 +00:00
nobu
c62ab010c8 test_fstring.rb: fix exception
* test/-ext-/string/test_fstring.rb (test_singleton_class): fix
  expected exception class.  [ruby-dev:49867] [Bug #12923]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56754 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-12 13:24:23 +00:00
nobu
ee160e68f9 class.c: no fstring singleton class
* class.c (singleton_class_of): prohibit fstrings from creating
  singleton classes.
  temporary measure for [ruby-dev:49867] [Bug #12923]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-12 09:43:05 +00:00
rhe
be3baa4380 string.c: fix buffer overflow check condition in rb_str_set_len()
* string.c (rb_str_set_len): The buffer overflow check is wrong. The
  space for termlen is allocated outside the capacity returned by
  rb_str_capacity(). This fixes r41920 ("string.c: multi-byte
  terminator", 2013-07-11).  [ruby-core:77257] [Bug #12757]

* test/-ext-/string/test_set_len.rb (test_capacity_equals_to_new_size):
  Test for this change. Applying only the test will trigger [BUG].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-09-13 07:08:15 +00:00
naruse
8667e8b186 require "rbconfig/sizeof"
They may fail parallel test-all

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55602 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-06 18:17:36 +00:00
nobu
79a85b18cc string.c: return reallocated pointer
* string.c (str_fill_term): return new pointer reallocated by
  filling terminator.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 07:20:28 +00:00
nobu
b493d156de string.c: integer overflow
* string.c (rb_str_modify_expand): check integer overflow.
  [ruby-core:75592] [Bug #12390]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-18 05:52:40 +00:00
naruse
d46e2aea71 * string.c (rb_str_init): introduce String.new(capacity: size)
[Feature #12024]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-17 03:21:35 +00:00
naruse
040ce05610 * string.c (str_new_frozen): if the given string is embeddable
but not embedded, embed a new copied string. [Bug #11946]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53724 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-03 04:52:13 +00:00
hsbt
b7b5692aea * test/-ext-/string/test_capacity.rb: Added missing library.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53674 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-27 12:04:47 +00:00