[DOC] How to get the longest last match [Bug #18415]
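The title does not name the method, so the following is only a sketch, assuming it refers to backward regexp search with String#rindex. It shows why the last match is not necessarily the longest one, and one way to get the longest:

```ruby
# A backward search stops at the last position where the pattern can
# match, so a greedy quantifier may capture less than expected.
last_pos = "foo".rindex(/o+/)
last_match = $~[0]

# A negative lookbehind forbids starting in the middle of a run of "o",
# so the match has to begin at the start of the final run.
longest_pos = "foo".rindex(/(?<!o)o+/)
longest_match = $~[0]

p [last_pos, last_match]       # => [2, "o"]
p [longest_pos, longest_match] # => [1, "oo"]
```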
---
string.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
Fix documentation for String#{<<,concat,prepend}
These methods mutate and return the receiver; they don't create
and return a new string.
Fixes [Bug #18241]
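A minimal sketch of the documented behavior (variable names are illustrative only): each call returns the very object it was called on.

```ruby
s = +"foo"               # unary plus guarantees an unfrozen string
r1 = s << "bar"
r2 = s.concat("baz")
r3 = s.prepend(">> ")

p s                                       # => ">> foobarbaz"
p [r1, r2, r3].all? { |r| r.equal?(s) }   # => true
```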
---
string.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
Fix documentation of #<=> and #casecmp [ci skip]
Descriptions for return values of -1 and 1 were reversed.
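For reference, a short sketch of the return values in question (variable names are illustrative only):

```ruby
smaller = ("a" <=> "b")       # receiver sorts before the argument
larger  = ("b" <=> "a")       # receiver sorts after the argument
equal   = ("a" <=> "a")

ci_smaller = "a".casecmp("B") # case-insensitive: "a" before "b"
ci_larger  = "B".casecmp("a")
ci_equal   = "a".casecmp("A")

p [smaller, larger, equal]          # => [-1, 1, 0]
p [ci_smaller, ci_larger, ci_equal] # => [-1, 1, 0]
```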
---
string.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
[Bug #18154] Fix memory leak in String#initialize
String#initialize can leak memory when called on a string that is marked
with STR_NOFREE because it does not unset the STR_NOFREE flag.
---
string.c | 2 +-
test/ruby/test_string.rb | 10 ++++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
Scan the coderange in the given encoding
---
ext/-test-/string/enc_str_buf_cat.c | 14 ++++++++++++++
string.c | 32 ++++++++++++++++++++++---------
test/-ext-/string/test_enc_str_buf_cat.rb | 9 +++++++++
3 files changed, 46 insertions(+), 9 deletions(-)
Work around transcoding issue with non-ASCII compatible
encodings and xml escaping
When using a non-ASCII compatible source and destination encoding
and xml escaping (the :xml option to String#encode), the resulting
string was broken, as it used the correct non-ASCII compatible
encoding, but contained data that was ASCII-compatible instead of
compatible with the string's encoding.
Work around this issue by detecting the case where both the
source and destination encoding are non-ASCII compatible, and
transcoding the source string from the non-ASCII compatible
encoding to UTF-8. The xml escaping code will correctly handle
the UTF-8 source string and then return the correctly encoded
and escaped value.
Fixes [Bug #12052]
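A rough check of the intended result, assuming a Ruby build that includes this workaround. The choice of encodings is illustrative. Every combination should decode back to the same escaped text:

```ruby
# Non-ASCII-compatible encodings used as both source and destination.
encodings = %w[UTF-16LE UTF-16BE UTF-32LE UTF-32BE]

results = encodings.product(encodings).map do |src_enc, dst_enc|
  escaped = "<>".encode(src_enc).encode(dst_enc, xml: :text)
  # Decode to UTF-8 so the escaped text can be compared as plain ASCII.
  [src_enc, dst_enc, escaped.encoding.name, escaped.encode("UTF-8")]
end

all_escaped = results.all? { |_, _, _, utf8| utf8 == "&lt;&gt;" }
kept_target = results.all? { |_, dst, actual, _| dst == actual }
puts all_escaped, kept_target
```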
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
---
test/ruby/test_transcode.rb | 19 +++++++++++++++++++
transcode.c | 6 ++++++
2 files changed, 25 insertions(+)
- add regression tests for U+6E7F (湿) in ISO-2022-JP
In ISO-2022-JP, the bytes used to encode 湿 are the same as those for "<>".
This adds regression tests to make sure that these bytes, when representing
湿, are NOT escaped with encode("ISO-2022-JP", xml: :text) or similar.
These are additional regression tests for #12052.
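A sketch of the property being tested (not the actual test code): escaping should leave the encoded character untouched, so the result equals a plain encode and round-trips to the original.

```ruby
kanji = "\u6E7F" # 湿

plain   = kanji.encode("ISO-2022-JP")
escaped = kanji.encode("ISO-2022-JP", xml: :text)

unchanged  = (plain == escaped)
round_trip = (escaped.encode("UTF-8") == kanji)
puts unchanged, round_trip
```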
---
test/ruby/test_transcode.rb | 3 +++
1 file changed, 3 insertions(+)
Make String#{strip,lstrip}{,!} strip leading NUL bytes
The documentation already specifies that they strip whitespace
and defines whitespace to include null.
This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.
Fixes [Bug #17467]
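A small illustration of the new behavior, assuming a Ruby that includes this change (older versions leave the leading NUL in place):

```ruby
s = "\0 \tabc \0"

left = s.lstrip  # leading NUL and whitespace removed, trailing part kept
both = s.strip   # NUL and whitespace removed on both ends

p left == "abc \0"
p both == "abc"
```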
---
spec/ruby/core/string/lstrip_spec.rb | 18 ++++++++++++------
spec/ruby/core/string/strip_spec.rb | 22 ++++++++++------------
string.c | 4 ++--
test/ruby/test_string.rb | 16 ++++++++++++++++
4 files changed, 40 insertions(+), 20 deletions(-)
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
were previously reviewed by shyouhei.
Comment a couple of locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
Fixes [Feature #13381]
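As an illustration only, assuming the :category option mentioned above is the keyword accepted by Kernel#warn, and using a hypothetical capture_stderr helper: a warning in the deprecated category is emitted only while Warning[:deprecated] is enabled.

```ruby
require "stringio"

# Hypothetical helper: run the block with $stderr redirected and
# return whatever was written to it.
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  yield
  $stderr.string
ensure
  $stderr = original
end

previous = Warning[:deprecated]

Warning[:deprecated] = false
silenced = capture_stderr { warn("old API", category: :deprecated) }

Warning[:deprecated] = true
shown = capture_stderr { warn("old API", category: :deprecated) }

Warning[:deprecated] = previous # restore the caller's setting

p silenced # => ""
p shown    # => "old API\n"
```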
When passed a `fake_str`, `register_fstring` would create new strings
with `str_new_static`. That's not what was expected, and answers
almost no use cases.
since 58325daae3.
../string.c:1339:1: warning: ‘str_new_empty’ defined but not used [-Wunused-function]
1339 | str_new_empty(VALUE str)
| ^~~~~~~~~~~~~
This modifies the following String methods to return String instances
instead of subclass instances:
* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase
This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.
Some string methods were left to return subclass instances:
* String#+@
* String#-@
Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.
Fixes [#10845]
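A brief sketch of the resulting behavior, assuming a Ruby that includes this change. MyString is a hypothetical subclass used only for illustration:

```ruby
class MyString < String; end # hypothetical subclass

s = MyString.new("hello")

upcased  = s.upcase
sliced   = s[0, 2]
centered = s.center(9)
same     = +s # unfrozen receiver: +@ returns the receiver itself

empty   = MyString.new("")
swapped = empty.swapcase # now a copy, not the receiver

p [upcased.class, sliced.class, centered.class] # => [String, String, String]
p same.equal?(s)                                # => true
p swapped.equal?(empty)                         # => false
```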
`encoding` can be not only an encoding name, but also an Encoding object.
```
s = String.new('foo', encoding: Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
```
When the pattern Regexp given to String#index and String#rindex
contains a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.
```
# without patch
"abcdbce".index(/b\Kc/) # => 1
"abcdbce".rindex(/b\Kc/) # => 4
```
This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".index(/b\Kc/) # => 2
"abcdbce".rindex(/b\Kc/) # => 5
```
Fixes [Bug #17118]
When the pattern given to String#partition and String#rpartition
contains a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.
```
# without patch
"abcdbce".partition(/b\Kc/) # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcd", "c", "ce"]
```
This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".partition(/b\Kc/) # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcdb", "c", "e"]
```
As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.
Fixes [Bug #17119]
Use BEG(0) instead of the result of rb_reg_search to handle the cases
where the separator Regexp contains a /\K/ (lookbehind) operator.
Fixes [Bug #17113]
Not every compiler understands that rb_raise does not return. When a
function does not end with a return statement, such compilers can issue
warnings. We had better tell them about reachability.