1033 commits

Author SHA1 Message Date
nobu
bebc52a4a7 string.c: enable capacity when setting capa
* string.c (rb_str_modify_expand): enable capacity and disable
  association with packed objects when setting capa, so that
  pack("p") string fails to unpack properly after being modified.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44803 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-02-04 03:55:32 +00:00
nobu
6951fbca43 string.c: respect BOM
* string.c (get_encoding): respect BOM on pseudo encodings.
  [ruby-dev:47895] [Bug #9415]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44606 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-01-15 05:04:36 +00:00
nobu
77ae7b2e83 string.c: use actual encodings
* string.c (get_actual_encoding): get actual encoding according to
  the BOM if it exists.
* string.c (rb_str_inspect): use the actual encoding, instead of
  pseudo encodings, UTF-{16,32}.  [ruby-core:59757] [Bug #8940]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44605 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-01-15 05:03:49 +00:00
ko1
c702005a7b * include/ruby/ruby.h: rename OBJ_WRITE and OBJ_WRITTEN into
RB_OBJ_WRITE and RB_OBJ_WRITTEN.
* array.c, class.c, compile.c, hash.c, internal.h, iseq.c,
  proc.c, process.c, re.c, string.c, variable.c, vm.c,
  vm_eval.c, vm_insnhelper.c, vm_insnhelper.h,
  vm_method.c: catch up this change.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44299 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-20 08:07:47 +00:00
tmm1
98a74d4dd5 parse.y: use rb_fstring() for strings stored in the symbol table
* parse.y (register_symid_str): use fstrings in symbol table
  [Bug #9171] [ruby-core:58656]
* parse.y (rb_id2str): ditto
* string.c (rb_fstring): create frozen_strings on first usage. this
  allows rb_fstring() calls from the parser (before cString is created)
* string.c (fstring_set_class_i): set klass on fstrings generated
  before cString was defined
* string.c (Init_String): convert frozen_strings table to String
  objects after boot
* ext/-test-/symbol/type.c (bug_sym_id2str): expose rb_id2str()
* test/-ext-/symbol/test_type.rb (module Test_Symbol): verify symbol
  table entries are fstrings

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-08 01:39:27 +00:00
nobu
efbcd1cb25 * string.c (rb_str_scrub): [DOC] add param str.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@44018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-05 13:05:04 +00:00
nobu
5a7ee1e117 string.c: fix declaration-after-statement
* string.c (fstr_update_callback): move a variable declaration since
  ISO C90 forbids mixed declarations and code.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43988 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-04 04:20:15 +00:00
tmm1
753fe47175 * string.c (fstr_update_callback): Improve implementation in r43968
based on feedback from @nagachika. In the existing case, we can
  return ST_STOP to prevent any hash modification. In the !existing
  case, set both key and value to the fstr.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43986 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-04 04:05:13 +00:00
tmm1
6edaf997e3 * string.c (rb_fstring): Use st_update instead of st_lookup +
st_insert.
* string.c (fstr_update_callback): New callback for st_update.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43968 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-12-03 03:40:56 +00:00
ko1
d7df3e2830 * string.c (rb_fstring): fstrings should not be ELTS_SHARED.
If we resurrect dying objects (non-marked, but not swept yet),
  the shared string they point to may already have been collected.
  To avoid this issue, fstrings (recorded in fstring_table)
  should not be ELTS_SHARED (should not have a shared string).



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43887 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-28 00:22:45 +00:00
tmm1
cbb56e30a4 * compile.c: Use rb_fstring() to de-duplicate string literals in code. [ruby-core:58599] [Bug #9159] [ruby-core:54405]
* iseq.c (prepare_iseq_build): De-duplicate iseq labels and source locations.
* re.c (rb_reg_initialize): Use rb_fstring() for regex string.
* string.c (rb_fstring): Handle non-string and already-fstr arguments.
* vm_eval.c (eval_string_with_cref): De-duplicate eval source filename.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-26 23:30:25 +00:00
nobu
4760b9824f string.c: fix memsize of frozen shared string
* string.c (str_new4): copy the original capacity so that memsize of
  frozen shared string returns correct size.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-26 17:27:01 +00:00
ko1
2bfd722d80 * internal.h: do not use ruby_sized_xrealloc() and ruby_sized_xfree()
if HAVE_MALLOC_USABLE_SIZE (or _WIN32) is defined.
  We don't need these functions if malloc_usable_size() is available.
* gc.c: catch up this change.
* gc.c: define HAVE_MALLOC_USABLE_SIZE on _WIN32.
* array.c (ary_resize_capa): do not use ruby_sized_xfree() with
  local variable to avoid "unused local variable" warning.
  This change has little impact.
* string.c (rb_str_resize): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43839 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-25 01:13:31 +00:00
ko1
301223df95 * gc.c (rb_gc_resurrect): added.
rb_fstring() used rb_gc_mark() to avoid freeing a string still in use.
  However, rb_gc_mark() sets the mark bit *and* pushes onto the mark stack.
  rb_gc_resurrect() only sets the mark bit if it is before sweeping.
* string.c (rb_fstring): use rb_gc_resurrect.
* internal.h: add decl.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-19 09:48:47 +00:00
nobu
b97f754876 string.c: constify
* string.c (tr_find): constify argument table.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43699 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-17 14:34:24 +00:00
tmm1
8f3934261a * internal.h: move common string/hash flags to include file.
* ext/objspace/objspace_dump.c: remove flags shared above.
* hash.c: ditto.
* string.c: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43647 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-11 09:39:13 +00:00
charliesome
07ac58747f * compile.c (iseq_compile_each): emit opt_str_freeze if the #freeze
method is called on a static string literal with no arguments.

* defs/id.def (firstline): add freeze so idFreeze is available

* insns.def (opt_str_freeze): add opt_str_freeze instruction which
  pushes a frozen string literal without allocating a new object if
  String#freeze is not overridden

* string.c (Init_String): define String#freeze

* vm.c (vm_init_redefined_flag): define BOP_FREEZE on String class as
  a basic operation

* vm_insnhelper.h: ditto

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43627 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-09 21:17:06 +00:00
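
The effect of opt_str_freeze together with the fstring table can be observed from Ruby. This is a minimal sketch, assuming Ruby 2.1 or later and that String#freeze has not been redefined; the variable names are illustrative only.

```ruby
# Two separate literals with identical contents, each frozen at the call site.
a = "commit".freeze
b = "commit".freeze

# With opt_str_freeze, both expressions are expected to push the same
# deduplicated frozen object instead of allocating a new String each time.
puts a.equal?(b)   # object identity, not just content equality
puts a.frozen?
```

If String#freeze is redefined, the instruction is expected to fall back to an ordinary method call, so the identity check above no longer says anything about deduplication.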
nobu
e7fac351c2 string.c: fix typo
* string.c (rb_str_scrub): fix typo, should yield invalid byte
  sequence to be scrubbed.  reported by znz at IRC.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-01 11:53:59 +00:00
nobu
adbdd97d28 string.c: export rb_str_scrub
* string.c (rb_str_scrub): export with fixed length arguments, and
  allow nil as replacement string instead of omitting.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43500 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-11-01 07:55:56 +00:00
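
The two rb_str_scrub entries above concern String#scrub, whose Ruby-level behavior can be sketched as follows. The sample string and the replacement format are assumptions made for illustration, not taken from the commits.

```ruby
# "\xFF" can never appear in valid UTF-8, so this string is broken.
broken = "abc\xFFdef"
puts broken.valid_encoding?

# Replace each invalid byte sequence with a fixed replacement string.
puts broken.scrub("?")

# With a block, the invalid byte sequence itself is yielded, which is the
# behavior the typo fix restores. Here it is rendered as hex.
puts broken.scrub { |bytes| "<#{bytes.unpack("H*").first}>" }
```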
ko1
cf0106827d * string.c (STR_HEAP_SIZE): includes TERM_LEN(str).
* string.c (rb_str_memsize): use STR_HEAP_SIZE().



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43335 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-10-17 09:49:58 +00:00
ko1
76b06555d0 * gc.c, internal.h: rename ruby_xsizefree/realloc to
rb_sized_free/realloc.
* array.c: catch up these changes.
* string.c: ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-10-17 08:41:23 +00:00
ko1
3de7ec0a3f * array.c, string.c: use ruby_xsizedfree() and ruby_xsizedrealloc().
* internal.h (SIZED_REALLOC_N): define a macro as REALLOC_N().



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43332 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-10-17 08:35:06 +00:00
nobu
3d3a0d88c9 string.c: use str_duplicate
* string.c (rb_str_resurrect): use str_duplicate(), which does
  exactly the same thing.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43233 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-10-10 03:13:34 +00:00
nobu
a28de81aec string.c: mark frozen string
* string.c (rb_fstring): because of lazy sweep, str may be unmarked
  already and swept at next time, so mark it for the time being.
  [ruby-core:57756]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43210 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-10-09 08:40:18 +00:00
ko1
dc626dbab3 * include/ruby/ruby.h: rename RARRAY_RAWPTR() to RARRAY_CONST_PTR().
RARRAY_RAWPTR(ary) returns a (const VALUE *) type pointer, and the
  use case of this macro is not to acquire a raw pointer but to acquire
  a read-only pointer, so we rename it to a better name.
  RSTRUCT_RAWPTR() is also renamed to RSTRUCT_CONST_PTR()
  (I expect that nobody uses it).
* array.c, compile.c, cont.c, enumerator.c, gc.c, proc.c, random.c,
  string.c, struct.c, thread.c, vm_eval.c, vm_insnhelper.c:
  catch up this change.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43043 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-25 08:24:34 +00:00
nobu
3788742bc9 string.c: fix for UTF-16/32
* string.c (rb_str_inspect): get rid of out-of-bound access.
* string.c (rb_str_inspect): when a UTF-16/32 string doesn't have a
  BOM, inspect as a dummy encoding string.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@43035 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-24 08:39:01 +00:00
nobu
1b3adaefd9 string.c: scan coderange
* string.c (rb_str_conv_enc_opts): make sure to scan coderange to get
  rid of unnecessary conversion.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42998 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-20 15:47:46 +00:00
glass
81629f0531 * string.c (rb_str_enumerate_lines): make String#each_line and
#lines not raise an invalid byte sequence error when they are called
  with an argument. The patch also improves performance.
  [ruby-dev:47549] [Bug #8698]

* test/ruby/test_m17n_comb.rb (test_str_each_line): remove
  assertions which check that String#each_line and #lines will
  raise an error if the receiver includes invalid byte sequence.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-18 14:34:04 +00:00
charliesome
7eafeaa313 * string.c (fstring_cmp): take string encoding into account when
comparing fstrings [ruby-core:57037] [Bug #8866]

* test/ruby/test_string.rb: add test

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42847 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-05 09:07:48 +00:00
nobu
dffae9a1f9 string.c: reduce objects in rb_fstring
* string.c (rb_fstring, rb_str_free): use st_data_t instead of VALUE.
* string.c (rb_fstring): get rid of duplicating already frozen object.
* parse.y (str_suffix_gen): freeze in advance to reduce objects.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42846 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-05 08:25:56 +00:00
charliesome
6fd9000076 * include/ruby/ruby.h: add RSTRING_FSTR flag
* internal.h: add rb_fstring() prototype
* parse.y (str_suffix_gen): deduplicate frozen string literals
* string.c (rb_fstring): deduplicate frozen string literals
* string.c (rb_str_free): delete fstrings from frozen_strings table when
  they are GC'd
* string.c (Init_String): initialize frozen_strings table
* test/ruby/test_string.rb: test frozen strings are deduplicated

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-05 04:49:16 +00:00
nobu
b31965cb43 string.c: fix for \K
* string.c (str_gsub): use BEG(0) for whole matched position not
  return value from rb_reg_search(), for \K matching.
  [ruby-dev:47694] [Bug #8856]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42820 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-04 02:13:42 +00:00
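
The \K case addressed by the str_gsub fix above can be illustrated with a small sketch. The input string and pattern are made up for this example and are not from the bug report.

```ruby
# \K drops everything matched so far from the reported match, so only
# "bar" is replaced while the preceding "foo" is kept as required context.
result = "foobar foobaz".gsub(/foo\Kbar/, "BAR")
puts result
```

The replacement has to start at the beginning of the reported match (after \K), not at the position where the regexp search began, which is the distinction the commit message draws between BEG(0) and the return value of rb_reg_search().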
nobu
5669902126 string.c: rb_enc_str_new_cstr
* string.c (rb_enc_str_new_cstr): new function to create a string from
  the C-string pointer with the specified encoding.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-09-03 13:03:54 +00:00
nobu
378161fe68 dir.c: reduce string object
* dir.c (dir_each): get rid of allocating a new string from a UTF-8 string.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42737 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-08-31 04:30:25 +00:00
ko1
e0932e3ad3 * string.c (rb_str_format_m): use RARRAY_RAWPTR() instead of
RARRAY_PTR() because there is no new reference.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-08-08 10:58:03 +00:00
zzak
aa66f59c97 * string.c: [DOC] Description of rb_str_equal [Fixes GH-375]
Based on a patch by @markijbema
  https://github.com/ruby/ruby/pull/375


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-08-07 00:02:17 +00:00
nobu
c837fe4056 string.c: fix typo
* string.c (rb_str_ellipsize): [DOC] fix typo, "encoding" instead of
  "encoded" which is probably a slip of the auto-completion.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42395 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-08-06 07:04:00 +00:00
glass
79be10475f * string.c (str_rindex): remove comment.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-08-01 13:32:06 +00:00
glass
0e2d0bb970 * string.c (rb_str_rindex): fix bug introduced in r42269.
"".rindex("") should return 0.
  (str_rindex): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-31 11:18:18 +00:00
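
The edge case named in the rindex fix above is easy to check from Ruby. This is a small sketch; the second and third strings are extra illustrations that do not appear in the commit.

```ruby
# The case from the commit message: an empty needle in an empty string.
puts "".rindex("").inspect

# An empty needle matches at the very end of a non-empty string.
puts "abc".rindex("").inspect

# An ordinary search returns the start of the last occurrence.
puts "abcabc".rindex("bc").inspect
```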
glass
867876ab9b * string.c (rb_str_rindex): performance improvement by using
memrchr(3).

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-31 08:47:13 +00:00
glass
8b126d59b3 * string.c (rb_str_rindex): refactor and avoid calling str_nth() if
pos == 0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-31 07:53:08 +00:00
glass
dd8f7a6cfd * string.c: add internal API rb_str_locktmp_ensure().
* io.c (io_fread): use rb_str_locktmp_ensure().
  [ruby-core:56121] [Bug #8669]

* test/ruby/test_io.rb: add a test for above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-28 08:49:25 +00:00
glass
0780974482 * string.c (rb_str_enumerate_chars): specify array capa
with str_strlen().

* string.c (rb_str_enumerate_codepoints): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-22 09:42:15 +00:00
glass
fa20fb3728 * string.c (rb_str_enumerate_chars): specify array capa.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-22 09:03:44 +00:00
glass
f775a27bd3 * string.c (rb_str_each_char_size): performance improvement by
using rb_str_length().

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42110 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-22 08:37:46 +00:00
naruse
42bf899458 * string.c (rb_str_succ): add missing case NEIGHBOR_WRAPPED.
r42078 caused buggy behavior like "\xFF".b -> "\x01\xFF".b

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42082 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-20 09:10:12 +00:00
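
The wrap-around case behind the NEIGHBOR_WRAPPED fix above can be observed with String#succ on a binary string. This is a sketch of the behavior a current Ruby is expected to show, not a reproduction of the r42078 regression.

```ruby
# A one-byte binary (ASCII-8BIT) string holding the largest byte value.
s = "\xFF".b

# Incrementing wraps the byte around to 0x00 and carries into a new
# leading byte, rather than leaving 0xFF in place as in the buggy output.
t = s.succ
puts t.bytes.inspect
```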
nobu
e6a6dd8e7e string.c: wchar succ
* string.c (enc_succ_char, enc_pred_char): consider wchar case.
  [ruby-core:56071] [Bug #8653]
* string.c (rb_str_succ): do not replace with invalid char.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-20 03:14:09 +00:00
ko1
7497452930 * string.c (str_alloc): no need to clear RString (already cleared).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42033 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-18 05:39:30 +00:00
nobu
460d8c11cd string.c: char length
* string.c (str_null_char): calculate char length.  fix commit miss at
  r41967.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41971 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-14 17:40:32 +00:00
nobu
a7481aae3f string.c: consider old terminator
* string.c (str_fill_term): consider old terminator length, and should
  not use rb_enc_ascget since it depends on the current encoding which
  may not be compatible with the new terminator.  [Bug #8634]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41967 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-07-14 17:21:41 +00:00