These APIs are much like <valgrind/memcheck.h>. Use them to
fine-grain annotate the usage of our memory.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
In these functions we are intentionally reading memory address
not owned by us. These reads should not be diagnosed.
See also [Bug #8680]
See also 451202718
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (rb_raw_obj_info): fix type mismatch specified by the following
build log: 448634481
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c, internal.h: support theap for small Hash.
Introduce RHASH_ARRAY (li_table) besides st_table and small Hash
(<=8 entries) are managed by an array data structure.
This array data can be managed by theap.
If st_table is needed, then converting array data to st_table data.
For st_table using code, we prepare "stlike" APIs which accepts hash value
and are very similar to st_ APIs.
This work is based on the GSoC achievement
by tacinight <tacingiht@gmail.com> and refined by ko1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65454 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c: now instance variable space has theap supports.
obj_ivar_heap_alloc() tries to acquire memory from theap.
* debug_counter.h: add some counters for theap.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65451 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
(re-commit of r65444)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* variable.c: now instance variable space has theap supports.
obj_ivar_heap_alloc() tries to acquire memory from theap.
* debug_counter.h: add some counters for theap.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65446 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* class.c (rb_keyword_error_new): use RARRAY_AREF() because
RARRAY_CONST_PTR() can introduce additional overhead in a futre.
Same fixes for other files.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65430 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* debug_counter.h: add the following counters.
* frame_push: control frame counts (total counts).
* frame_push_*: control frame counts per every frame type.
* obj_*: add free'ed counts for each type.
* gc.c: ditto.
* vm_insnhelper.c (vm_push_frame): ditto.
* debug_counter.c (rb_debug_counter_show_results): widen counts field
to show >10G numbers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64867 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (obj_free): a table can be accessed for debug counters.
[Bug #15165] [Fix GH-1964]
A patch from Joe Truba <jtruba@meraki.com>
Also check USE_DEBUG_COUNTER macro.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64857 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Now that we can say for sure if an instruction calls a method or
not internally, it is now possible to reroute the bugs that
forced us to revert the "move PC around" optimization.
First try: r62051
Reverted: r63763
See also: r63999
----
trunk: ruby 2.6.0dev (2018-09-13 trunk 64736) [x86_64-darwin15]
ours: ruby 2.6.0dev (2018-09-13 trunk 64736) [x86_64-darwin15]
last_commit=move ADD_PC around (take 2)
Calculating -------------------------------------
trunk ours
so_ackermann 1.884 2.278 i/s - 1.000 times in 0.530926s 0.438935s
so_array 1.178 1.157 i/s - 1.000 times in 0.848786s 0.864467s
so_binary_trees 0.176 0.177 i/s - 1.000 times in 5.683895s 5.657707s
so_concatenate 0.220 0.221 i/s - 1.000 times in 4.546896s 4.518949s
so_count_words 6.729 6.470 i/s - 1.000 times in 0.148602s 0.154561s
so_exception 3.324 3.688 i/s - 1.000 times in 0.300872s 0.271147s
so_fannkuch 0.546 0.968 i/s - 1.000 times in 1.831328s 1.033376s
so_fasta 0.541 0.547 i/s - 1.000 times in 1.849923s 1.827091s
so_k_nucleotide 0.800 0.777 i/s - 1.000 times in 1.250635s 1.286295s
so_lists 2.101 1.848 i/s - 1.000 times in 0.475954s 0.541095s
so_mandelbrot 0.435 0.408 i/s - 1.000 times in 2.299328s 2.450535s
so_matrix 1.946 1.912 i/s - 1.000 times in 0.513872s 0.523076s
so_meteor_contest 0.311 0.317 i/s - 1.000 times in 3.219297s 3.152052s
so_nbody 0.746 0.703 i/s - 1.000 times in 1.339815s 1.423441s
so_nested_loop 0.899 0.901 i/s - 1.000 times in 1.111767s 1.109555s
so_nsieve 0.559 0.579 i/s - 1.000 times in 1.787763s 1.726552s
so_nsieve_bits 0.435 0.428 i/s - 1.000 times in 2.296282s 2.333852s
so_object 1.368 1.442 i/s - 1.000 times in 0.731237s 0.693684s
so_partial_sums 0.616 0.546 i/s - 1.000 times in 1.623592s 1.833097s
so_pidigits 0.831 0.832 i/s - 1.000 times in 1.203117s 1.202334s
so_random 2.934 2.724 i/s - 1.000 times in 0.340791s 0.367150s
so_reverse_complement 0.583 0.866 i/s - 1.000 times in 1.714144s 1.154615s
so_sieve 1.829 2.081 i/s - 1.000 times in 0.546607s 0.480562s
so_spectralnorm 0.524 0.558 i/s - 1.000 times in 1.908716s 1.792382s
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64737 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This commit fixes a build error on systems that have
`malloc_usable_size` but also enable jemalloc via `--with-jemalloc`.
For example, Ubuntu Precise defines `malloc_usable_size` in malloc.h, so
gc.c will include malloc.h. This definition conflicts with jemalloc's
definition, so the following error occurs:
```
compiling gc.c
compiling hash.c
In file included from gc.c:50:0:
/usr/include/malloc.h:152:15: error: conflicting types for 'malloc_usable_size'
/usr/include/jemalloc/jemalloc.h:45:8: note: previous declaration of 'malloc_usable_size' was here
cc1: warning: unrecognized command line option "-Wno-self-assign" [enabled by default]
cc1: warning: unrecognized command line option "-Wno-constant-logical-operand" [enabled by default]
cc1: warning: unrecognized command line option "-Wno-parentheses-equality" [enabled by default]
cc1: warning: unrecognized command line option "-Wno-tautological-compare" [enabled by default]
```
Since jemalloc always defines `malloc_usable_size`, this patch just
includes the jemalloc header instead of malloc.h if it's available.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63992 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I introduced this mechanism in r62051 to speed things up. Later it
was reported that the change causes problems. I searched for
workarounds but nothing seemed appropriate. I hereby officially
give it up. The idea to move ADD_PC around was a mistake.
Fixes [Bug #14809] and [Bug #14834].
Signed-off-by: Urabe, Shyouhei <shyouhei@ruby-lang.org>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We often had SEGV in ruby_xfree when USE_GC_MALLOC_OBJ_INFO_DETAILS is 1
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/defines.h: introduce `USE_GC_MALLOC_OBJ_INFO_DETAILS`
to show malloc statistics by replace ruby_xmalloc() and so on with
macros.
* gc.c (struct malloc_obj_info): introduced to save per-malloc information.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Shouldn't affect production use, but good to fix regardless :>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63695 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
rb_gc_mark_encodings has been empty for a decade
(since r17875 / 28b216ac45).
Just remove it and its only caller in gc.c
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Every time I look at gc.c, I get confused by argument ordering:
gc_start(..., TRUE, TRUE, FALSE, ...)
gc_start(..., FALSE, FALSE, FALSE, ... )
While we do not have kwargs in C, we can use flags to improve readability:
gc_start(...,
GPR_FLAG_FULL_MARK | GPR_FLAG_IMMEDIATE_MARK |
GPR_FLAG_IMMEDIATE_SWEEP | ...)
[ruby-core:87311] [Misc #14798]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
We may add gc_*_continue calls in a few more places, and adding
more #ifdefs around those is ugly. For now, this makes the
heap_prepare function look better.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Repeatedly checking for malloc_limit_max in gc_reset_malloc_info
is unnecessary, check and set it once during initialization
in ruby_gc_set_params to simplify the hotter path
* gc.c (gc_reset_malloc_info): remove zero check
(ruby_gc_set_params): treat malloc_limit_max==0 as SIZE_MAX
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Most (if not all) architectures have instructions for comparing
against zero, allowing compilers to generate more compact code.
Other MEMOP_TYPE_* enum values are not compared in hot paths,
but MEMOP_TYPE_MALLOC is checked in objspace_malloc_increase
text data bss dec hex filename
84088 264 3664 88016 157d0 gc-before.o
83784 264 3664 87712 156a0 gc.o
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63546 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
There were major size regressions I failed to notice before in:
bm_array_sample_100k__6k
bm_array_sample_100k___10k
bm_array_sample_100k___50k
This reverts commit r63463 / 14fb10a9ec
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63464 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
atomic_sub_nounderflow is expensive and objspace_malloc_increase
was showing up near the top of some `perf` profiles. The new
implementation allows the compiler to inline and eliminate
some branches from objspace_malloc_increase.
Furthermore, we do not need atomics for oldmalloc_increase
This consistently improves bm_so_count_words benchmark by
around 10% on my hardware.
name built
so_count_words 1.107
[ruby-core:87096] [Feature #14767]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This seems to improve the readability of gc.c a small amount
and it doesn't have any measurable performance impact.
[ruby-core:87067] [Misc #14762]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* gc.c (rb_alloc_tmp_buffer_with_count): keep the order; allocate
an empty imemo first then xmalloc, to get rid of potential
memory leak when allocation imemo failed.
* parse.y (rb_parser_malloc, rb_parser_calloc, rb_parser_realloc):
ditto.
* process.c (rb_execarg_allocate_dup2_tmpbuf): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63385 b2dd03c8-39d4-4d8f-98ff-823fe69b080e