encoding.c: handle needmore error from rb_enc_precise_mbclen()
rb_enc_ascget() erroneously reports success even if the given byte
sequence is incomplete, for non-ASCII compatible encoding strings.
rb_enc_precise_mbclen() may return a negative value on error, and thus
rb_enc_ascget() must not store the return value in 'unsigned int';
otherwise the subsequent MBCLEN_CHARFOUND_P() check won't catch the
error. [ruby-core:78646] [Bug #13034]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@57217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: Fix memory corruptions when using UTF-16/32 strings.
[Bug #12536] [ruby-dev:49699]
* string.c (TERM_LEN_MAX): Macro for the longest TERM_FILL length,
the same as largest value of rb_enc_mbminlen(enc) among encodings.
* string.c (str_new, rb_str_buf_new, str_shared_replace): Allocate
+TERM_LEN_MAX bytes instead of +1. This change may increase memory
usage.
* string.c (rb_str_new_with_class): Use TERM_LEN of the "obj".
* string.c (rb_str_plus, rb_str_justify): Use str_new0 which is aware
of termlen.
* string.c (str_shared_replace): Copy +termlen bytes instead of +1.
* string.c (rb_str_times): termlen should not be included in capa.
* string.c (RESIZE_CAPA_TERM): When using RSTRING_EMBED_LEN_MAX,
termlen should be counted with it because embedded strings are
also processed by TERM_FILL.
* string.c (rb_str_capacity, str_shared_replace, str_buf_cat): ditto.
* string.c (rb_str_drop_bytes, rb_str_setbyte, str_byte_substr): ditto.
* string.c (rb_str_subseq, str_substr): When RSTRING_EMBED_LEN_MAX
is used, TERM_LEN(str) should be considered with it because
embedded strings are also processed by TERM_FILL.
Additional fix for [Bug #12536] [ruby-dev:49699].
Additional fix for [Bug #12536] [ruby-dev:49699].
* string.c (rb_usascii_str_new, rb_utf8_str_new): Specify termlen
which is apparently 1 for the encodings.
* string.c (str_new0_cstr): New static function to create a String
object from a C string with specifying termlen.
* string.c (rb_usascii_str_new_cstr, rb_utf8_str_new_cstr): Specify
termlen by using new str_new0_cstr().
* string.c (str_new_static): Specify termlen from the given encoding
when creating a new String object is needed.
* string.c (rb_tainted_str_new_with_enc): New function to create a
tainted String object with the given encoding. This means that
the termlen is correctly specified. Curretly static function.
The function name might be renamed to rb_tainted_enc_str_new
or rb_enc_tainted_str_new.
* string.c (rb_external_str_new_with_enc): Use encoding by using the
above rb_tainted_str_new_with_enc().
* string.c (str_fill_term): When termlen increases, re-allocation
of memory for termlen should always be needed.
In this fix, if possible, decrease capa instead of realloc.
[Bug #12536] [ruby-dev:49699]
* string.c: Partially reverts r55547 and r55555.
ChangeLog about the reverted changes are also deleted in this file.
[Bug #12536] [ruby-dev:49699] [ruby-dev:49702]
* string.c: Specify termlen as far as possible.
the termlen is correctly specified. Currently static function.
* string.c (rb_str_change_terminator_length): New function to change
termlen and resize heap for the terminator. This is split from
rb_str_fill_terminator (str_fill_term) because filling terminator
and changing terminator length are different things. [Bug #12536]
* internal.h: declaration for rb_str_change_terminator_length.
* string.c (str_fill_term): Simplify only to zero-fill the terminator.
For non-shared strings, it assumes that (capa + termlen) bytes of
heap is allocated. This partially reverts r55557.
* encoding.c (rb_enc_associate_index): rb_str_change_terminator_length
is used, and it should be called whenever the termlen is changed.
* string.c (str_capacity): New static function to return capacity
of a string with the given termlen, because the termlen may
sometimes be different from TERM_LEN(str) especially during
changing termlen or filling terminator with specific termlen.
* string.c (rb_str_capacity): Use str_capacity.
* string.c (str_buf_cat): Fix capa size for embed string.
Fix bug in r55547. [Bug #12536]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@56301 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_m_loader): defer finding encoding object not to
be infected by marshal source. [ruby-core:71793] [Bug #11760]
* marshal.c (r_object0): enable compatible loader on USERDEF
class. the loader function is called with the class itself,
instead of an allocated object, and the loaded data.
* marshal.c (compat_allocator_table): intialize
compat_allocator_tbl on demand.
* object.c (rb_undefined_alloc): extract from rb_obj_alloc.
* marshal.c (compat_allocator_table): initialize
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@52974 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (load_encoding): use rb_require_internal instead of
calling rb_require_safe with protection.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@48700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
use 0 for rb_data_type_t::reserved instead of NULL, since its type
may be changed in the future and possibly not a pointer type.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@48662 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c: move `ruby_encoding_index` stuff from
include/ruby/encoding.h to hide the extra field.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46323 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_default_internal): fix rdoc. `__FILE__` is
in filesystem encoding but not `default_internal`.
[ruby-core:61894] [Bug #9713]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45539 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* parse.y (rb_str_dynamic_intern): associate proper encoding with
the result symbol.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45439 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_get_index): the encoding of dynamic symbol
comes from fstr.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_capable): Symbol is now encoding capable.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45436 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (get_actual_encoding): get actual encoding according to
the BOM if exists.
* string.c (rb_str_inspect): use according encoding, instead of
pseudo encodings, UTF-{16,32}. [ruby-core:59757] [Bug #8940]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44605 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (load_encoding): should preserve outer errinfo, so that
expected exception may not be lost. [ruby-core:57949] [Bug #9038]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43376 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (is_obj_encoding): new macro to check if obj is an
Encoding. obj can be any type while is_data_encoding expects T_DATA
only.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42166 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_code_to_mbclen): add new function which returns
mbclen from codepoint like as rb_enc_codelen() but 0 for invalid
char.
* include/ruby/encoding.h (rb_enc_code_to_mbclen): declaration and
shortcut macro.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42077 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_set_index): since r41967, old terminator is dealt
with in str_fill_term(). should not consider it here because this
function is called before any encoding is set.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41994 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_inspect): use PRIsVALUE to preserve the result
encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (enc_set_index): deal with terminator so that
rb_enc_set_index also works.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41964 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_associate_index): fill new terminator length, not
old one.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41937 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_enc_associate_index): refill the terminator if it
becomes longer than before. [ruby-dev:47500] [Bug #8624]
* string.c (str_null_char, str_fill_term): get rid of out of bound
access.
* string.c (rb_str_fill_terminator): add a parameter for the length of
new terminator.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41930 b2dd03c8-39d4-4d8f-98ff-823fe69b080e