1771 commits

Author SHA1 Message Date
Aaron Patterson
02b216e5a7
Combine sweeping and moving
This commit combines the sweep step with moving objects.  With this
commit, we can do:

```ruby
GC.start(compact: true)
```

This code will do the following 3 steps:

1. Fully mark the heap
2. Sweep + Move objects
3. Update references

By default, this will compact in the order that heap pages are allocated.
In other words, objects will be packed towards older heap pages (as
opposed to heap pages with more pinned objects like `GC.compact` does).
2020-05-29 15:24:32 -07:00
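The three steps listed in 02b216e5a7 can be illustrated with a toy model. This is only a sketch under stated assumptions: the heap is a plain Ruby array, references are slot indices, and `Slot`, `mark`, `sweep_and_move`, and `update_references` are hypothetical names that do not exist in the VM. The single sliding pass below is a simplification and is not claimed to match the cursor logic in gc.c.

```ruby
# Toy model only: none of these names exist in the Ruby VM.
Slot = Struct.new(:name, :refs, :marked, :pinned)

# Step 1: mark everything reachable from the roots.
def mark(heap, roots)
  stack = roots.dup
  until stack.empty?
    slot = heap[stack.pop]
    next if slot.nil? || slot.marked
    slot.marked = true
    stack.concat(slot.refs)
  end
end

# Step 2: one pass that frees unmarked slots and slides marked,
# unpinned objects toward the lowest free index.
def sweep_and_move(heap)
  forwarding = {}
  free = 0
  heap.each_index do |idx|
    slot = heap[idx]
    next if slot.nil?
    unless slot.marked
      heap[idx] = nil # sweep a dead object
      next
    end
    next if slot.pinned # pinned objects stay where they are
    free += 1 while free < idx && heap[free]
    next unless free < idx
    heap[free] = slot
    heap[idx] = nil
    forwarding[idx] = free # remember where the object went
  end
  forwarding
end

# Step 3: rewrite every reference that points at a moved slot.
def update_references(heap, roots, forwarding)
  heap.each do |slot|
    next unless slot
    slot.refs.map! { |i| forwarding.fetch(i, i) }
    slot.marked = false
  end
  roots.map { |i| forwarding.fetch(i, i) }
end

heap = [
  Slot.new("a", [3], false, false),
  Slot.new("dead", [], false, false),
  nil,
  Slot.new("b", [4], false, false),
  Slot.new("c", [], false, true) # pinned
]
roots = [0]

mark(heap, roots)
forwarding = sweep_and_move(heap)
roots = update_references(heap, roots, forwarding)
layout = heap.map { |slot| slot && slot.name }
```

In this model, "b" slides into the slot freed by "dead", the pinned "c" stays in place, and the reference from "a" is rewritten to follow "b".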
Aaron Patterson
c7ceaa6d3c
Extract "free moved list" function
Extract a function to free the moved list.  We'll use this function
later on to compact at the same time as sweep.
2020-05-28 15:01:10 -07:00
Jeremy Evans
ad729a1d11 Fix origin iclass pointer for modules
If a module has an origin, and that module is included in another
module or class, previously the iclass created for the module had
an origin pointer to the module's origin instead of the iclass's
origin.

Setting the origin pointer correctly requires using a stack, since
the origin iclass is not created until after the iclass itself.
Use a hidden ruby array to implement that stack.

Correctly assigning the origin pointers in the iclass caused a
use-after-free in GC.  If a module with an origin is included
in a class, the iclass shares a method table with the module
and the iclass origin shares a method table with module origin.

Mark iclass origin with a flag that notes that even though the
iclass is an origin, it shares a method table, so the method table
should not be garbage collected.  The shared method table will be
garbage collected when the module origin is garbage collected.
I've tested that this does not introduce a memory leak.

This change caused a VM assertion failure, which was traced to callable
method entries using the incorrect defined_class.  Update
rb_vm_check_redefinition_opt_method and find_defined_class_by_owner
to treat iclass origins differently from class origins to avoid this
issue.

This also includes a fix for Module#included_modules to skip
iclasses with origins.

Fixes [Bug #16736]
2020-05-22 20:31:23 -07:00
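For context on ad729a1d11, a module gets an origin in CRuby when another module is prepended to it. The sketch below sets up that situation from Ruby code; the module and class names are hypothetical, and it only shows the lookup order visible to users, not the internal iclass pointers the commit fixes.

```ruby
# Hypothetical names, used only to build a module that has an origin.
module Logging
  def call
    [:logging] + super()
  end
end

module Service
  def call
    [:service]
  end
end

# Prepending gives Service an origin.
Service.prepend(Logging)

# Including Service now creates iclasses for Logging, Service,
# and Service's origin in Worker's ancestry.
class Worker
  include Service
end

lookup_order = Worker.ancestors.take(3)
result = Worker.new.call
included = Worker.included_modules.first(2)
```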
Jeremy Evans
8d798e7c53 Revert "Fix origin iclass pointer for modules"
This reverts commit c745a60634.

This triggers a VM assertion.  Reverting until the issue can be
debugged.
2020-05-22 07:54:34 -07:00
Jeremy Evans
c745a60634 Fix origin iclass pointer for modules
If a module has an origin, and that module is included in another
module or class, previously the iclass created for the module had
an origin pointer to the module's origin instead of the iclass's
origin.

Setting the origin pointer correctly requires using a stack, since
the origin iclass is not created until after the iclass itself.
Use a hidden ruby array to implement that stack.

Correctly assigning the origin pointers in the iclass caused a
use-after-free in GC.  If a module with an origin is included
in a class, the iclass shares a method table with the module
and the iclass origin shares a method table with module origin.

Mark iclass origin with a flag that notes that even though the
iclass is an origin, it shares a method table, so the method table
should not be garbage collected.  The shared method table will be
garbage collected when the module origin is garbage collected.
I've tested that this does not introduce a memory leak.

This also includes a fix for Module#included_modules to skip
iclasses with origins.

Fixes [Bug #16736]
2020-05-22 07:36:52 -07:00
Aaron Patterson
6e7e7c1e57
Only marked objects should be considered movable
Ruby's GC is incremental, meaning that during the mark phase (and also
the sweep phase) programs are allowed to run.  This means that programs
can allocate objects before the mark or sweep phase has actually
completed.  Those objects may not have had a chance to be marked, so we
can't know if they are movable or not. Something that references the
newly created object might have called the pinning function during the
mark phase, but since the mark phase hasn't run we can't know if there
is a "pinning" relationship.

To be conservative, we must only allow objects that are both marked and
not pinned to move.
2020-05-20 15:00:32 -07:00
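The rule from 6e7e7c1e57 reduces to a two-flag check. A minimal sketch, assuming each object carries a mark bit and a pin bit (`HeapObject` and `movable?` are hypothetical names, not VM functions):

```ruby
# Hypothetical model of the two bits the rule depends on.
HeapObject = Struct.new(:marked, :pinned)

# An object may move only if marking has seen it and nothing pinned it.
def movable?(obj)
  obj.marked && !obj.pinned
end

seen_and_free  = HeapObject.new(true, false)
seen_and_pinned = HeapObject.new(true, true)
# Allocated while marking was in progress: never marked, so its pin
# state is unknown and it must be treated as immovable.
allocated_late = HeapObject.new(false, false)

decisions = [seen_and_free, seen_and_pinned, allocated_late].map { |o| movable?(o) }
```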
Aaron Patterson
6efb9fe042
Allow references stored in the VM stack to move
We can update these references too, so let's allow them to move.
2020-05-18 16:57:10 -07:00
卜部昌平
15e977349e more on NULL versus functions
Function pointers are not void*.  See also
115fec062c
ce4ea956d2
8427fca49b
2020-05-11 16:47:25 +09:00
卜部昌平
9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平
122f96c362 sed -i s/ruby3/rbimpl/g 2020-05-11 09:24:08 +09:00
卜部昌平
97672f669a sed -i s/RUBY3/RBIMPL/g
Devs do not love "3".  The only exception is RUBY3_KEYWORDS in parse.y,
which seems unrelated to our interests.
2020-05-11 09:24:08 +09:00
卜部昌平
d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada
5d430c1b34
Added more NORETURN declarations 2020-05-11 00:40:14 +09:00
Aaron Patterson
ff4f9cf95d
Allow global variables to move
This patch allows global variables that have been assigned in Ruby to
move.  I added a new function for the GC to call that will update
global references and introduced a new callback in the global variable
struct for updating references.

Only pure Ruby global variables are supported right now, other
references will be pinned.
2020-05-07 11:42:39 -07:00
Aaron Patterson
00698f26a9
T_MOVED should never be pushed on the mark stack
No objects should ever reference a `T_MOVED` slot.  If they do, it's
absolutely a bug.  If we kill the process when `T_MOVED` is pushed on
the mark stack it will make it easier to identify which object holds a
reference that hasn't been updated.
2020-05-07 08:44:11 -07:00
Aaron Patterson
5ef019e8af
Output compaction stats in one loop / eliminate 0 counts
We only need to loop `T_MASK` times once.  Also, not every value between
0 and `T_MASK` is an actual Ruby type.  Before this change, some
integers were being added to the result hash even though they aren't
actual types.  This patch omits considered / moved entries that total 0,
cleaning up the result hash and eliminating these "fake types".
2020-05-04 13:50:21 -07:00
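The change in 5ef019e8af can be sketched in plain Ruby: walk the type range once and skip entries whose count is 0. This is an illustration under assumptions, not the C code; `compaction_stats` is a hypothetical helper, and the type-number table passed in is a small illustrative subset.

```ruby
T_MASK = 0x1f

# Build both result hashes in a single loop, omitting zero counts so
# that integers which are not real types never appear in the output.
def compaction_stats(considered, moved, type_names)
  stats = { considered: {}, moved: {} }
  (0..T_MASK).each do |type|
    name = type_names.fetch(type, type)
    stats[:considered][name] = considered[type] if considered[type] > 0
    stats[:moved][name] = moved[type] if moved[type] > 0
  end
  stats
end

considered = Array.new(T_MASK + 1, 0)
moved = Array.new(T_MASK + 1, 0)
considered[5] = 12
moved[5] = 4
considered[7] = 3

stats = compaction_stats(considered, moved, { 5 => :T_STRING, 7 => :T_ARRAY })
```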
Benoit Daloze
c2dc52e18b Rename arguments for ObjectSpace::WeakMap#[]= for clarity 2020-05-02 16:16:56 +02:00
Benoit Daloze
a2be428c5f Fix ObjectSpace::WeakMap#key? to work if the value is nil
* Fixes [Bug #16826]
2020-05-02 16:08:36 +02:00
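The behavior fixed in a2be428c5f can be checked with a few lines, assuming a Ruby version that includes the fix:

```ruby
map = ObjectSpace::WeakMap.new
key = Object.new

# Store nil as the value; the key is still present in the map.
map[key] = nil

has_key = map.key?(key)
value = map[key]
```

Here `key` stays referenced by a local variable, so the entry cannot be collected before the checks run.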
Nobuyoshi Nakada
ac0c760843
Mark ruby_memerror as NORETURN 2020-04-29 00:34:14 +09:00
Yusuke Endoh
1994ed90e4 Remove debugging code from gc.c
Partially revert adab82b9a7 and
c63b5c6179.
The issue that these commits attempted to address may have been fixed by
1c7f5a5712.
2020-04-29 00:05:46 +09:00
Kazuhiro NISHIYAMA
fd2df58451
Fix a typo [ci skip] 2020-04-27 09:41:45 +09:00
Nobuyoshi Nakada
42ac3f79ba
Assert that typed data is distinguished from non-typed 2020-04-25 09:29:27 +09:00
卜部昌平
c63b5c6179 rb_memerror: abort immediately
Ditto for adab82b9a7.  TRY_WITH_GC was
found innocent.
2020-04-21 16:30:33 +09:00
Nobuyoshi Nakada
dc9089b51f
Fixed a typo [ci skip] 2020-04-21 13:35:31 +09:00
卜部昌平
adab82b9a7 TRY_WITH_GC: abort immediately
NoMemoryError is observed on icc, but I have failed to reproduce it so far.  Let me
see the backtrace on CI.
2020-04-21 12:59:35 +09:00
Nobuyoshi Nakada
693378f105
Moved noreturn call to end of noreturn function 2020-04-16 18:02:11 +09:00
Nobuyoshi Nakada
e474c189da
Suppress -Wswitch warnings 2020-04-08 15:13:37 +09:00
卜部昌平
9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada
2a4049b23c
Bail out before pushing unexpected object 2020-04-03 01:16:57 +09:00
Koichi Sasada
d05455d083 fix type cast 2020-03-11 02:55:07 +09:00
Koichi Sasada
ec78b8b62a show method entry with iseq details 2020-03-11 02:50:44 +09:00
卜部昌平
97fa6468dc fix compile error w/ -DCALC_EXACT_MALLOC_SIZE 2020-03-04 12:30:42 +09:00
卜部昌平
62c2b8c74e kill USE_RGENGC=0
This compile-time option has been broken for years (at least since
commit 49369ef173, according to git
bisect).  Let's delete code that no longer works.
2020-02-26 16:00:10 +09:00
卜部昌平
e7bcb416af avoid #if inside of rb_str_new_cstr
ISO/IEC 9899:1999 section 6.10.3 paragraph 11 explicitly states that
"If there are sequences of preprocessing tokens within the list of
arguments that would otherwise act as preprocessing directives, the
behavior is undefined."

rb_str_new_cstr is in fact a macro.  We cannot do this.
2020-02-26 16:00:10 +09:00
Koichi Sasada
b9007b6c54 Introduce disposable call-cache.
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Make the call-cache (CC) an RVALUE (GC target object) and allocate a
      new CC on cache miss.
    * This technique allows race-free access from parallel processing
      elements like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of fixed-size global method cache (GMC), pCMC allows flexible
      cache size.
    * Caching CCs reduces CC allocation and allows sharing a CC's fast-path
      between call-sites with the same call-info (CI).
(3) Invalidate an inline method cache by invalidating corresponding method
    entries (MEs)
    * Instead of using class serials, we set an "invalidated" flag on the
      method entry itself to represent cache invalidation.
    * Compared with using class serials, the impact of method modification
      (add/overwrite/delete) is small.
    * Updating class serials invalidates all method caches of the class and
      its sub-classes.
    * The proposed approach invalidates the method cache of only one ME.

See [Feature #16614] for more details.
2020-02-22 09:58:59 +09:00
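Ideas (1) and (3) of b9007b6c54 can be modeled in a few lines of Ruby. This is a toy sketch under assumptions, not the VM's data structures: `MethodEntry`, `CallCache`, `ToyClass`, and `CallSite` are hypothetical names. Redefining a method flags only the old method entry, and a call site allocates a fresh cache object on a miss instead of mutating the old one.

```ruby
# Toy model; these names are hypothetical and do not mirror VM structs.
MethodEntry = Struct.new(:name, :body, :invalidated)
CallCache = Struct.new(:klass, :method_entry)

class ToyClass
  def initialize
    @methods = {}
  end

  # Redefinition invalidates only the replaced method entry.
  def define(name, &body)
    old = @methods[name]
    old.invalidated = true if old
    @methods[name] = MethodEntry.new(name, body, false)
  end

  def lookup(name)
    @methods.fetch(name)
  end
end

class CallSite
  attr_reader :misses

  def initialize(name)
    @name = name
    @cache = nil
    @misses = 0
  end

  def call(klass, *args)
    cc = @cache
    if cc.nil? || !cc.klass.equal?(klass) || cc.method_entry.invalidated
      @misses += 1
      # Disposable cache: build a new object rather than updating in place.
      cc = CallCache.new(klass, klass.lookup(@name)).freeze
      @cache = cc
    end
    cc.method_entry.body.call(*args)
  end
end

klass = ToyClass.new
klass.define(:greet) { "hello" }
klass.define(:other) { 1 }

greet_site = CallSite.new(:greet)
other_site = CallSite.new(:other)

results = [greet_site.call(klass), greet_site.call(klass)]
other_site.call(klass)

klass.define(:greet) { "hi" } # invalidates only the old :greet entry
results << greet_site.call(klass)
other_site.call(klass)
```

In this model the `:other` call site keeps hitting its cache after `:greet` is redefined, which is the contrast with a class-serial scheme that the commit message draws.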
Koichi Sasada
f2286925f0 VALUE size packed callinfo (ci).
Now, rb_call_info describes how to call the method with a tuple of
(mid, orig_argc, flags, kwarg). In most cases, kwarg == NULL and
mid+argc+flags only requires 64 bits. So this patch packs
rb_call_info into a VALUE (1 word) in such cases. If we cannot
represent it in a VALUE, then use imemo_callinfo, which contains the
conventional callinfo (rb_callinfo, renamed from rb_call_info).

iseq->body->ci_kw_size is removed because every callinfo is VALUE
size (packed ci or a pointer to imemo_callinfo).

To access ci information, we need to use these functions:
vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci).

struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc()
are temporarily removed because cd->ci should be marked.
2020-02-22 09:58:59 +09:00
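The packing described in f2286925f0 can be illustrated with integer bit operations. This is a sketch under assumptions: the field widths below (1 tag bit, 15 bits for argc, 16 for flags, 32 for the method id) and the helper names `packable?`, `pack_ci`, `ci_mid`, `ci_flag`, and `ci_argc` are chosen for illustration and are not taken from the commit message.

```ruby
# Assumed field widths; together they fill one 64-bit word.
TAG_BITS  = 1
ARGC_BITS = 15
FLAG_BITS = 16
ID_BITS   = 32

ARGC_SHIFT = TAG_BITS
FLAG_SHIFT = ARGC_SHIFT + ARGC_BITS
ID_SHIFT   = FLAG_SHIFT + FLAG_BITS

# Packing is possible only without kwarg and when every field fits.
def packable?(mid, flag, argc, kwarg)
  kwarg.nil? &&
    mid < (1 << ID_BITS) &&
    flag < (1 << FLAG_BITS) &&
    argc < (1 << ARGC_BITS)
end

# The low tag bit distinguishes a packed word from a pointer.
def pack_ci(mid, flag, argc)
  (mid << ID_SHIFT) | (flag << FLAG_SHIFT) | (argc << ARGC_SHIFT) | 1
end

def ci_mid(ci)
  (ci >> ID_SHIFT) & ((1 << ID_BITS) - 1)
end

def ci_flag(ci)
  (ci >> FLAG_SHIFT) & ((1 << FLAG_BITS) - 1)
end

def ci_argc(ci)
  (ci >> ARGC_SHIFT) & ((1 << ARGC_BITS) - 1)
end

mid, flag, argc = 1234, 0b101, 2
fits = packable?(mid, flag, argc, nil)
packed = pack_ci(mid, flag, argc)
unpacked = [ci_mid(packed), ci_flag(packed), ci_argc(packed)]
too_wide = packable?(1 << ID_BITS, flag, argc, nil)
```

A call that carries kwarg, or whose fields overflow these widths, would take the other path the message describes and use a separately allocated callinfo object.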
卜部昌平
984e0233fe TestTime#test_memsize: skip when on GC_DEBUG
GC_DEBUG=1 makes this test fail because it changes the size of struct
RVALUE.  I don't think the test is useful then.  Let's just skip.
2020-02-20 11:46:54 +09:00
Yusuke Endoh
912ef0b559 Revert "gc.c: make the stack overflow detection earlier under s390x"
This reverts commit a28c166f78.

This change didn't help.
According to odaira, the issue was fixed by increasing `ulimit -s`.
2020-02-10 14:13:48 +09:00
Nobuyoshi Nakada
0f05b234fb
Disable GC until VM objects get initialized [Bug #16616] 2020-02-09 17:15:55 +09:00
Nobuyoshi Nakada
aeaf0dc555
Separate objspace argument for rb_gc_disable and rb_gc_enable 2020-02-09 17:06:31 +09:00
Yusuke Endoh
a28c166f78 gc.c: make the stack overflow detection earlier under s390x
On s390x, TestFiber#test_stack_size fails with SEGV.

20200205T223421Z.fail.html.gz

```
TestFiber#test_stack_size [/home/chkbuild/build/20200205T223421Z/ruby/test/ruby/test_fiber.rb:356]:
pid 23844 killed by SIGABRT (signal 6) (core dumped)
| -e:1:in `times': stack level too deep (SystemStackError)
| 	from -e:1:in `rec'
| 	from -e:1:in `block (3 levels) in rec'
| 	from -e:1:in `times'
| 	from -e:1:in `block (2 levels) in rec'
| 	from -e:1:in `times'
| 	from -e:1:in `block in rec'
| 	from -e:1:in `times'
| 	from -e:1:in `rec'
| 	 ... 172 levels...
| 	from -e:1:in `block in rec'
| 	from -e:1:in `times'
| 	from -e:1:in `rec'
| 	from -e:1:in `block in <main>'
| -e: [BUG] Segmentation fault at 0x0000000000000000
```

This change tries a fix similar to
ef64ab917e and
3ddbba84b5.
2020-02-09 12:55:44 +09:00
Nobuyoshi Nakada
0c4bbb46f1
Removed type-punning pointer casts around st_data_t 2020-01-31 12:13:00 +09:00
Nobuyoshi Nakada
af899503a6
Moved GC.verify_compaction_references to gc.rb
Also fixed a segfault caused by coercion of `Qundef` when any keyword
argument is given without the `toward:` option.
2020-01-27 10:52:37 +09:00
Lourens Naudé
61ff5cd5fd Fix syntax error in obj_free with hash size debug counter when USE_DEBUG_COUNTER is enabled 2020-01-13 08:03:01 +09:00
Kenta Murata
e082f41611
Introduce BIGNUM_EMBED_P to check BIGNUM_EMBED_FLAG (#2802)
* bignum.h: Add BIGNUM_EMBED_P

* bignum.c: Use macros for handling BIGNUM_EMBED_FLAG
2019-12-31 22:48:23 +09:00
Nobuyoshi Nakada
d7bef803ac Separate builtin initialization calls 2019-12-29 12:34:55 +09:00
卜部昌平
5e22f873ed decouple internal.h headers
Saves committers' daily life by no longer #include-ing everything from
internal.h and making each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not to be
self-contained (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
卜部昌平
b739a63eb4 split internal.h into files
One day, I could not resist the way it was written.  I finally started
to make the code clean.  This changeset is the beginning of a series of
housekeeping commits.  It is a simple refactoring; split internal.h into
files, so that we can divide and conquer in the upcoming commits.  No
lines of code are either added or removed, except the obvious file
headers/footers.  The generated binary is identical to the one before.
2019-12-26 20:45:12 +09:00
Koichi Sasada
100fc2750b fix wmap_finalize.
wmap_finalize expects id2ref() to return the corresponding object
even if the object is dead. Add id2ref_obj_tbl() for this
purpose.
2019-12-23 17:04:31 +09:00
Koichi Sasada
9eeaae432b add more debug counters to count numeric objects. 2019-12-23 16:31:17 +09:00