Commit graph

1411 commits

Author SHA1 Message Date
normal
2079b71000 string.c: fix String#crypt leak introduced in r58866
* string.c (rb_str_crypt): define LARGE_CRYPT_DATA when allocating

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 21:26:14 +00:00
nobu
92261511b6 string.c: for small crypt_data
* string.c (rb_str_crypt): struct crypt_data defined in
  missing/crypt.h is small enough.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 06:55:09 +00:00
ko1
9e1624cfe8 Add debug counters.
* debug_counter.h: add the following counters to measure object types.
  obj_free: freed count
  obj_str_ptr: freed count of Strings they have extra buff.
  obj_str_embed: freed count of Strings they don't have extra buff.
  obj_str_shared: freed count of Strings they have shared extra buff.
  obj_str_nofree: freed count of Strings they are marked as nofree.
  obj_str_fstr: freed count of Strings they are marked as fstr.
  obj_ary_ptr: freed count of Arrays they have extra buff.
  obj_ary_embed: freed count of Arrays they don't have extra buff.
  obj_obj_ptr: freed count of Objects (T_OBJECT) they have extra buff.
  obj_obj_embed: freed count of Objects they don't have extra buff.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 06:46:44 +00:00
normal
144e067007 string.c (rb_str_crypt): fix excessive stack use with crypt_r
"struct crypt_data" is 131232 bytes on x86-64 GNU/Linux,
making it unsafe to use tiny Fiber stack sizes.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 03:01:44 +00:00
stomar
40bc846bf8 string.c: fix String#{casecmp,casecmp?} for non-string arguments
* string.c: make String#{casecmp,casecmp?} return nil for
  non-string arguments instead of raising a TypeError.

* test/ruby/test_string.rb: add tests.

Reported by Marcus Stollsteimer.  Based on a patch by Shingo Morita.
[ruby-core:80145] [Bug #13312]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58837 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-21 19:28:48 +00:00
nobu
f3a49ebc92 string.c: cut down intermediate string
* string.c (rb_external_str_new_with_enc): cut down intermediate
  string for conversion source, by appending with conversion.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-14 00:21:00 +00:00
nobu
7323de517d revert r58703 & r58705
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58708 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 16:04:05 +00:00
nobu
a3960d0a60 string.c: fix up r58703
* string.c (rb_external_str_new_with_enc): fix the case of
  conversion failure.  when conversion failed for some reason,
  just ignores the default internal encoding and returns in the
  given encoding.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58705 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 14:20:19 +00:00
nobu
7678c0d702 string.c: cut down intermediate string
* string.c (rb_external_str_new_with_enc): cut down intermediate
  string for conversion source, by appending with conversion.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 13:34:39 +00:00
nobu
7d52ad9e17 string.c: fix one-off bug
* string.c (rb_str_cat_conv_enc_opts): fix one-off bug.  `ofs`
  equals `olen` when appending at the end.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58702 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 12:31:01 +00:00
nobu
2c0baa97a9 string.c: remove bare Unicode.
* string.c (rb_str_unicode_normalize): remove bare Unicode.  do
  not assume that all compilers can handle UTF-8.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-12 15:29:55 +00:00
stomar
8303bc6cf6 string.c: docs for String#match
* string.c: [DOC] add example for String#match with pos argument.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 18:59:45 +00:00
stomar
4a0eaeeb4d string.c: docs for Symbol
* string.c: [DOC] adopt call-seq's for Symbol#{match,match?} from
  String methods; other small improvements for Symbol docs.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 18:58:27 +00:00
stomar
477cb2b8d8 string.c: docs for Symbol#{match,match?}
* string.c: [DOC] mention pos argument for Symbol#{match,match?}.
  Patch by Yuki Kurihara (ksss).  [Fix GH-1606]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 18:56:32 +00:00
nobu
208240870a string.c: fix r58618
* string.c (unicode_normalize_common): aggregation type cannot be
  initialized with dynamic values, in C89.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-09 14:11:46 +00:00
duerst
a4301ec214 replace hand-written argument check by call to rb_scan_args in unicode_normalize_common
In string.c, replace hand-written argument count check by call to rb_scan_args.
This allows to use rb_funcallv once, rather than using rb_funcall twice.
Thanks to Hanmac (Hans Mackowiak) for the idea, see
https://bugs.ruby-lang.org/issues/11078#note-7.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-09 11:13:45 +00:00
nobu
d7f2c72322 string.c: fix types
* string.c (id_normalize, id_normalized_p): fix types, IDs should
  be ID.

* string.c (unicode_normalize_common): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-06 01:01:52 +00:00
stomar
6ae3cf02f7 string.c: [DOC] improve docs for String.new
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04 13:19:43 +00:00
ktsj
4e6daedc70 string.c: [DOC] Properly refer to keyword argument by its name
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04 08:59:01 +00:00
duerst
f47033e237 refactor common parts of unicode normalization functions into unicode_normalize_common
In string.c, refactor the common parts (requiring of unicode_normalize/normalize.rb,
check of number of arguments) of the unicode normalization functions
(rb_str_unicode_normalize, rb_str_unicode_normalize_bang, rb_str_unicode_normalized_p)
into the new function unicode_normalize_common.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04 02:16:27 +00:00
duerst
140560e4ee move definition of String#unicode_normalized? to C to make sure it is documented
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalized?
  (including documentation). Leave a comment explaining that the file is now empty.
* string.c: Define String#unicode_normalized? in rb_str_unicode_normalized_p in C,
  (including documentation)
* lib/unicode_normalize/normalize.rb: Remove (re)definition of
  String#unicode_normalized? to avoid warnings (when $VERBOSE==true) and
  problems when String is frozen

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04 02:00:19 +00:00
duerst
90ab1ee023 move definition of String#unicode_normalize! to C to make sure it is documented
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalize!
  (including documentation)
* string.c: Define String#unicode_normalize! in rb_str_unicode_normalize_bang in C,
  (including documentation)
* lib/unicode_normalize/normalize.rb: Remove (re)definition of
  String#unicode_normalize! to avoid warnings (when $VERBOSE==true) and
  problems when String is frozen

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04 01:36:52 +00:00
svn
7abf8bae23 * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-03 12:18:37 +00:00
duerst
5fee67c9ba move definition of String#unicode_normalize to C to make sure it is documented
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalize
  (including documentation)
* string.c: Define String#unicode_normalize in rb_str_unicode_normalize in C,
  (including documentation)
* lib/unicode_normalize/normalize.rb: Remove (re)definition of
  String#unicode_normalize to avoid warnings (when $VERBOSE==true) and
  problems when String is frozen

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-03 12:18:37 +00:00
nobu
cc68af3d02 string.c: improve insertion performace
* string.c (rb_str_splice_0): improve performace of single byte
  optimizable cases, insertion 7bit string to 7bit string.
  [ruby-dev:49984] [Bug #13228]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-04-17 13:38:34 +00:00
sorah
31a755e4f2 string.c: Supress logical-op-parentheses warning
* string.c(rb_str_upcase_bang): Supress logical-op-parentheses warning
  Patch by Fukuo Kadota <fukuo-kadota@cookpad.com>,
  Closes [GH-1570] [Bug #13387].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58211 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-29 11:33:59 +00:00
nobu
f0e6e47999 string.c: use the usable size
* string.c (rb_str_change_terminator_length): when called after
  the content has been copied, old terminator length no longer
  makes sense.  use the whole usable size instead of capacity
  without terminator.  [ruby-core:80257] [Bug #13339]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58042 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-21 05:28:38 +00:00
duerst
a5330fa9ea fix accidental reversal of r57997 in r58000
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58009 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-18 01:35:03 +00:00
duerst
67c1197835 clarifiy 'codepoint' in documentation of String#each_codepoint
Make sure it's clear that the returned values are not Unicode codepoints
for encodings other than UTF-8/UTF-16(BE|LE)/UTF-32(BE|LE).

[ci skip] [Bug #13321]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-17 02:24:53 +00:00
normal
9eb94b4dc1 deduplicate static rb_str_format format strings
Anybody who hits these code paths can hit them again in the
future, so try deduplicating across multiple runs of these
methods to reduce garbage.

* string.c (str_upto_each): fstring on "%.*d"
* strftime.c (rb_strftime_with_timespec): fstring on "%0*d"

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57997 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-17 00:55:55 +00:00
nobu
8c661ba264 string.c: shortcut argument check
* string.c (str_casecmp, str_casecmp_p): split to skip argument
  check when it is a String certainly.

* string.c (sym_casecmp, sym_casecmp_p): shortcut argument checks.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-15 07:57:11 +00:00
nobu
9fa56026e5 string.c: use rb_check_string_type
* string.c (rb_str_cmp_m): use rb_check_string_type for check and
  conversion, instead of calling the conversion method directly.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-14 03:42:43 +00:00
stomar
e6dec8f92e docs for Symbol#casecmp and Symbol#casecmp?
* string.c: [DOC] improve docs of Symbol#casecmp and Symbol#casecmp?
  according to the similar String methods; fix RDoc markup and typos;
  fix call-seq's for Symbol#{upcase,downcase,capitalize,swapcase}.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57963 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 20:20:40 +00:00
nobu
bd17e25588 string.c (rb_str_set_len): pathological check
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 11:47:45 +00:00
nobu
16e804117c string.c: $; is a GC-root
* string.c (Init_String): $; must be a GC-root, not to be
  collected.  [ruby-core:79582]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 09:12:05 +00:00
stomar
605b472d2d docs for String#casecmp and String#casecmp?
* string.c: [DOC] specify when String#casecmp and String#casecmp?
  return nil; modify examples to better show difference to <=>;
  fix RDoc markup and typos.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-11 20:01:55 +00:00
normal
e064879e5a string.c (str_uminus): update doc for deduplication
As of r57698, String#-@ can return pre-existing strings.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-08 21:24:24 +00:00
nobu
e7f4d90930 fix paren
* string.c (str_byte_substr): fix misplaced parenthesis at r56155.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57809 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-08 08:19:56 +00:00
kazu
d0708e9e2a string.c: [DOC] Fix a typo in String#dump
[Fix GH-1531][ci skip]
Author:    Alex Semyonov <alex@semyonov.us>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57802 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 13:04:39 +00:00
nobu
d69d98f61a string.c: negation of LONG_MIN
* string.c (rb_str_update): do not use negation of LONG_MIN, which
  is negative too.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 09:13:41 +00:00
nobu
f4d13801b6 string.c: fix integer overflow
* string.c (str_byte_substr): fix another integer overflow which
  can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled.
  [ruby-core:79951] [Bug #13289]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 09:07:57 +00:00
nobu
72f8df158f string.c: fix integer overflow
* string.c (rb_str_subpos): fix integer overflow which can happen
  only when SHARABLE_MIDDLE_SUBSTRING is enabled.  incorpolate
  7db0786abd

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 05:48:15 +00:00
stomar
3ca1cbecc6 string.c: [DOC] fix doc formatting for String#==, #===
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-04 20:08:04 +00:00
stomar
a698d99703 string.c: restore documentation for String#<<
* string.c: [DOC] restore documentation for String#<<
  which became undocumented with r56021; fix a typo.
  [ruby-core:79865] [Bug #13268]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57758 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-02 10:31:56 +00:00
normal
4e90dcc9d7 string.c (str_uminus): deduplicate strings
This exposes the rb_fstring internal function to return a
deduped and frozen string when a non-frozen string is given.
This is useful for writing all sorts of record processing key
values maybe stored, but certain keys and values are often
duplicated at a high frequency, so memory savings can
noticeable.

Use cases are many:

* email/NNTP header processing

  There are some standard header keys everybody uses
  (From/To/Cc/Date/Subject/Received/Message-ID/References/In-Reply-To),
  as well as common ones specific to a certain lists:
  (ruby-core has X-Redmine-* headers)
  It is also useful to dedupe values, as most inboxes have
  multiple messages from the same sender, or MUA.

* package management systems -
  things like RubyGems stores identical strings for licenses,
  dependency names, author names/emails, etc

* HTTP headers/trailers -
  standard headers (Host/Accept/Accept-Encoding/User-Agent/...)
  are common, but there are also uncommon ones.
  Values may be deduped, as well, as it is likely a user
  agent will make multiple/parallel requests to the same
  server.

* version control systems -
  this can be useful for deduplicating names of frequent
  committers (like "nobu" :)

  In linux.git and git.git, there are also common
  trailers such as Signed-Off-By/Acked-by/Reviewed-by/Fixes/...
  as well as less common ones.

* audio metadata -

  There are commonly used tags (Artist/Album/Title/Tracknumber),
  but Vorbis comments allows arbitrary key values to be stored.
  Music collections contain songs by the same artist or mutiple
  songs from the same album, so deduplicating values will be
  helpful there, too.

* JSON, YAML, XML, HTML processing

  Certain fields, tags and attributes are commonly used
  across the same and multiple documents

There is no security concern in this being a DoS vector by
causing immortal strings.  The fstring table is not a GC-root
and not walked during the mark phase.  GC-able dynamic symbols
since Ruby 2.2 are handled in the same manner, and that
implementation also relies on the non-immortality of fstrings.

[Feature #13077] [ruby-core:79663]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-24 01:01:23 +00:00
nobu
7de42daa21 string.c: assertion
* string.c (str_shared_replace): use RUBY_ASSERT for
  pre-condition.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-14 12:29:56 +00:00
nobu
957e6e4b14 initialize variables
* string.c (rb_str_enumerate_lines): initialize conditionally
  used variable.

* thread.c (rb_fd_no_init): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-14 07:52:30 +00:00
nobu
959aac29e7 suppress warnings
* string.c (rb_str_enumerate_lines): hint to suppress a
  maybe-uninitialized warning by gcc.

* thread.c (rb_fd_no_init): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-13 05:44:15 +00:00
normal
a6b9b360ce doc: Add example for Symbol#to_s
* string.c: add example for Symbol#to_s.

The docs for Symbol#to_s only include an example for
Symbol#id2name, but not for #to_s which is an alias;
the docs should include examples for both methods.

From: Marcus Stollsteimer <sto.mar@web.de>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-05 00:22:03 +00:00
normal
b0cfa46bce symbol.c (rb_id2str): eliminate branch to set class
Since the fstring table encompasses all strings in the
symbol table, we may reuse the fstring table walk to set
the class and eliminate the branch in rb_id2str.

* string.c (Init_String): use rb_cString immediately after definition
* symbol.c (rb_id2str): eliminate branch to set class

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57521 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-03 23:55:06 +00:00