Work around transcoding issue with non-ASCII compatible encodings and xml escaping

When using a non-ASCII compatible source and destination encoding
together with xml escaping (the :xml option to String#encode), the
resulting string was broken: it was tagged with the correct non-ASCII
compatible encoding, but its contents were ASCII-compatible data
rather than data valid in the string's encoding.
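
To make the terms concrete, here is a small Ruby sketch of what "non-ASCII compatible" means for UTF-16LE, and of what a string with a mismatched label and contents could look like. The variable names are illustrative only, and the mislabeled string is an imitation built with force_encoding, not output captured from an affected Ruby version.

```ruby
# Illustration only: none of these names come from the commit.
utf16 = Encoding::UTF_16LE

# UTF-16LE is not ASCII-compatible: each ASCII character takes two bytes.
ascii_compatible = utf16.ascii_compatible?   # false
lt_bytes = "<".encode(utf16).bytes           # [0x3C, 0x00]

# A correctly escaped "<" in UTF-16LE is "&lt;" at two bytes per character.
valid_escape = "&lt;".encode(utf16)
valid_size = valid_escape.bytesize           # 8

# Assumption: the breakage resembles single-byte ASCII data carrying a
# UTF-16LE label, which can be imitated with force_encoding.
mislabeled = "&lt;".b.force_encoding(utf16)
mislabeled_size = mislabeled.bytesize        # 4
```

The two strings share an encoding label but differ in their bytes, which is the kind of mismatch the description refers to.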

Work around this issue by detecting the case where both the
source and destination encoding are non-ASCII compatible, and
transcoding the source string from the non-ASCII compatible
encoding to UTF-8. The xml escaping code will correctly handle
the UTF-8 source string and then return the correctly encoded
and escaped value.
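
The same idea can be sketched at the Ruby level. The helper below is hypothetical (xml_escape_transcode is not part of Ruby or of this commit); it mirrors the condition used in the C change and assumes, as the message above states, that escaping from a UTF-8 source works correctly. On Ruby versions that include this fix, calling String#encode directly should be enough.

```ruby
# Hypothetical helper mirroring the workaround in Ruby; not from the commit.
def xml_escape_transcode(str, dst_enc, xml: :text)
  dst = Encoding.find(dst_enc)
  # Same condition as the C change: both encodings are non-ASCII compatible.
  unless str.encoding.ascii_compatible? || dst.ascii_compatible?
    # Go through UTF-8 first so the escaping step sees an
    # ASCII-compatible source string.
    str = str.encode(Encoding::UTF_8)
  end
  str.encode(dst, xml: xml)
end

src = "<&>".encode("UTF-16BE")
escaped = xml_escape_transcode(src, "UTF-16LE")
escaped_encoding = escaped.encoding          # Encoding::UTF_16LE
escaped_utf8 = escaped.encode("UTF-8")       # "&lt;&amp;&gt;"
```

Converting the result back to UTF-8 is only a convenient way to inspect it; the escaped string itself stays in the requested destination encoding.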

Fixes [Bug #12052]

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Jeremy Evans 2021-06-26 12:32:39 -07:00 committed by GitHub
parent 391abc543c
commit e86c1f6fc5
GPG key ID: 4AEE18F83AFDEB23
Notes: git 2021-06-27 04:33:08 +09:00
Merged: https://github.com/ruby/ruby/pull/4605

Merged-By: jeremyevans <code@jeremyevans.net>
2 changed files with 25 additions and 0 deletions

@@ -2719,6 +2719,12 @@ str_transcode0(int argc, VALUE *argv, VALUE *self, int ecflags, VALUE ecopts)
            }
        }
        else {
            if (senc && denc && !rb_enc_asciicompat(senc) && !rb_enc_asciicompat(denc)) {
                rb_encoding *utf8 = rb_utf8_encoding();
                str = rb_str_conv_enc(str, senc, utf8);
                senc = utf8;
                sname = "UTF-8";
            }
            if (encoding_equal(sname, dname)) {
                sname = "";
                dname = "";