94 commits

Author SHA1 Message Date
Jean Boussier
6391ae9ebc objspace_dump.c: dump call cache ids with dump_append_id
Not every `ID` has an associated string.

Fixes a SEGFAULT in ObjectSpace.dump_all spec.
2023-11-22 10:24:35 +01:00
Peter Zhu
68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e3.
2023-11-13 18:26:36 -05:00
John Hawthorn
b41270842a Record more info from CALLCACHE in heap dumps
This records the called_id and klass from imemo_callcache objects in
heap dumps.
2023-11-13 15:03:11 -08:00
Peter Zhu
5f3fb4f4e3 Revert "Remove SHAPE_CAPACITY_CHANGE shapes"
This reverts commit f6910a6112.

We're seeing crashes in the test suite of Shopify's core monolith after
this change.
2023-11-10 11:27:49 -05:00
Peter Zhu
f6910a6112 Remove SHAPE_CAPACITY_CHANGE shapes
We don't need to create a shape to transition capacity as we can
transition the capacity when the capacity of the SHAPE_IVAR changes.
2023-11-09 09:25:02 -05:00
Peter Zhu
38ba040d8b Make every initial size pool shape a root shape
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
2023-11-02 13:42:11 -04:00
John Hawthorn
1c871c08d9 Switch mid dump to dump_append_string_value
I don't think it's possible to create a CI with a mid which would need
escaping to be in a JSON string, but I think we might as well not rely
on that assumption.
2023-10-12 10:22:32 +02:00
John Hawthorn
635b92099e Fix ObjectSpace.dump with super() callinfo
super() uses 0 as mid for its callinfo, so we need to check for that to
avoid a segfault when using dump_all.
2023-10-12 10:22:32 +02:00
Peter Zhu
63e504d6e6 Dump name of method for imemo callinfo
This commit dumps the `mid` of the imemo callinfo when calling
`ObjectSpace.dump_all`.
2023-10-02 09:49:13 -04:00
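A minimal sketch of how the `mid` field could be read back from a full heap dump. It assumes callinfo records carry `"imemo_type":"callinfo"` and a `"mid"` key, which only holds on Ruby builds that include the commits above; on older builds the result is simply empty. The helper name `callinfo_method_names` is hypothetical.

```ruby
require "objspace"
require "json"

# Collect the method names recorded on imemo callinfo entries.
# Returns an empty array when the running Ruby does not emit "mid".
def callinfo_method_names
  ObjectSpace.dump_all(output: :string).each_line.filter_map do |line|
    record = JSON.parse(line)
    record["mid"] if record["imemo_type"] == "callinfo"
  rescue JSON::ParserError, EncodingError
    nil # skip any line that does not parse
  end
end

names = callinfo_method_names
# Show the five most frequently referenced method names, if any.
puts names.tally.max_by(5) { |_, count| count }.inspect
```

Note that super() call sites use 0 as their mid, so not every callinfo record can be expected to have a name.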
Samuel Williams
3fe09eba9d
Add deprecations for public struct rb_io members. (#7916)
* Add deprecations for public struct rb_io members.
2023-06-08 20:22:43 +09:00
NARUSE, Yui
85dcc4866d Revert "Hide most of the implementation of struct rb_io. (#6511)"
This reverts commit 18e55fc1e1.

fix [Bug #19704]
https://bugs.ruby-lang.org/issues/19704
This breaks compatibility for extension libraries. Such changes
need a discussion.
2023-06-01 08:43:22 +09:00
Samuel Williams
18e55fc1e1
Hide most of the implementation of struct rb_io. (#6511)
* Add rb_io_path and rb_io_open_descriptor.

* Use rb_io_open_descriptor to create PTY objects

* Rename FMODE_PREP -> FMODE_EXTERNAL and expose it

FMODE_PREP I believe refers to the concept of a "pre-prepared" file, but
FMODE_EXTERNAL is clearer about what the file descriptor represents and
aligns with language in the IO::Buffer module.

* Ensure that rb_io_open_descriptor closes the FD if it fails

If FMODE_EXTERNAL is not set, then it's guaranteed that Ruby will be
responsible for closing your file, eventually, if you pass it to
rb_io_open_descriptor, even if it raises an exception.

* Rename IS_EXTERNAL_FD -> RUBY_IO_EXTERNAL_P

* Expose `rb_io_closed_p`.

* Add `rb_io_mode` to get IO mode.

---------

Co-authored-by: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
2023-05-30 10:02:40 +09:00
Matt Valentine-House
72aba64fff Merge gc.h and internal/gc.h
[Feature #19425]
2023-02-09 10:32:29 -05:00
Peter Zhu
2056c0a7c6 Add embedded status to dumps of T_OBJECT
This commit adds `"embedded":true` in ObjectSpace.dump for T_OBJECTs
that are embedded.
2023-01-05 16:00:36 -05:00
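As a rough illustration, the flag can be inspected by parsing the JSON that `ObjectSpace.dump` returns. The `Point` class is made up for this example, and the sketch falls back to `false` when the key is missing, because the dump omits it for non-embedded objects and on Ruby versions that predate this commit.

```ruby
require "objspace"
require "json"

# Hypothetical class with two instance variables.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

info = JSON.parse(ObjectSpace.dump(Point.new(1, 2)))

puts "type:     #{info["type"]}"
# "embedded" is only present when the object is embedded.
puts "embedded: #{info.fetch("embedded", false)}"
```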
Jemma Issroff
e9ba3042e1 Indicate if a shape is too_complex in ObjectSpace#dump 2022-12-15 13:41:47 -08:00
Jemma Issroff
c1ab6ddc9a Transition complex objects to "too complex" shape
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.

Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.

This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.

For example:

```ruby
class NG; end

HUGE_NUMBER.times do
  NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```

We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.

For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:

```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
```

However, the following class is *not* too complex because it only has
one leaf in the shape tree:

```ruby
class Foo
  def initialize
    @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
  end
end
9.times { Foo.new }
```

This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-15 10:06:04 -08:00
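The transition can be observed from Ruby with a small experiment. This is a sketch under assumptions: the class name `Sprawling` is invented, and the code only looks for any dump key containing `too_complex` rather than relying on an exact key name, so it reports zero on Ruby versions without this feature.

```ruby
require "objspace"
require "json"

# Hypothetical class whose instances each get a differently named ivar,
# so every instance adds a new leaf to the shape tree.
class Sprawling; end

objects = 20.times.map do |i|
  obj = Sprawling.new
  obj.instance_variable_set(:"@uniq_#{i}", 1)
  obj
end

# Count how many dumps carry a "too complex" marker.
flagged = objects.count do |obj|
  JSON.parse(ObjectSpace.dump(obj)).keys.any? { |key| key.include?("too_complex") }
end

puts "#{flagged} of #{objects.size} objects are marked too complex"
```

The exact count depends on the Ruby version and on the value of SHAPE_MAX_VARIATIONS, so no particular number should be expected.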
Jemma Issroff
a3d552aedd Add variation_count on classes
Count how many "variations" each class creates. A "variation" is a
unique ordering of instance variables on a particular class. This can
also be thought of as a branch in the shape tree.

For example, the following Foo class will have 2 variations:

```ruby
class Foo ; end

Foo.new.instance_variable_set(:@a, 1) # case 1: creates one variation
Foo.new.instance_variable_set(:@b, 1) # case 2: creates another variation

foo = Foo.new
foo.instance_variable_set(:@a, 1) # does not create a new variation
foo.instance_variable_set(:@b, 1) # does not create a new variation (a continuation of the variation in case 1)
```

We will use this number to limit the number of shapes that a class can
create and fall back to using a hash iv lookup.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-12-15 10:06:04 -08:00
Jean Boussier
1df6d0e578 objspace_dump.c: don't dump class of T_IMEMO
They don't actually have a class.
2022-12-14 15:53:41 +01:00
Peter Zhu
0b4fda11ec [DOC] Don't document private methods in objspace 2022-12-12 09:48:06 -05:00
Jean Boussier
d7812d1949 objspace_dump.c: dump the capacity field for INITIAL_CAPACITY shapes
We forgot about that one, it's quite useful to see which capacity
we started from.
2022-12-09 17:06:21 +01:00
Jean Boussier
73771e4b19 ObjectSpace.dump_all: dump shapes as well
I see several arguments for doing so.

First they use a non trivial amount of memory, so for various memory
profiling/mapping tools it is relevant to have visibility of the space
occupied by shapes.

Then, some pathological code can create tons of shapes, so it is
valuable to have a way to observe shapes without having
to compile Ruby with `SHAPE_DEBUG=1`.

And additionally it's likely much faster to dump them this way than
to use `RubyVM::Shape`.

There are however a few open questions:

- Shapes can't respect the `since:` argument. Not sure what to do when
  it is provided. Would probably make sense to not dump them.
- Maybe it would make more sense to have a separate `ObjectSpace.dump_shapes`?
- Maybe instead `dump_all` should take a `shapes: false` argument?

Additionally, `ObjectSpace.dump_shapes` is added for the use case of
debugging the evolution of the shape tree.
2022-12-08 18:46:16 +01:00
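A possible way to look at the shape tree from Ruby, assuming `ObjectSpace.dump_shapes` accepts `output: :string` like the other dump methods and that each record has a `"shape_type"` key. The helper `shape_records` is hypothetical and returns an empty list when the method is not available.

```ruby
require "objspace"
require "json"

# Parse every shape record, or return [] when dump_shapes is missing.
def shape_records
  return [] unless ObjectSpace.respond_to?(:dump_shapes)

  ObjectSpace.dump_shapes(output: :string).each_line.filter_map do |line|
    JSON.parse(line)
  rescue JSON::ParserError
    nil # skip any line that does not parse
  end
end

records = shape_records
puts "#{records.size} shape records"
# Group by the (assumed) shape_type key to see how the tree is composed.
puts records.map { |r| r["shape_type"] }.tally.inspect
```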
Jemma Issroff
e4aba8f519 Add shape_id to heap dump 2022-12-05 14:33:16 -08:00
Jemma Issroff
c726c48a3d Remove numiv from RObject
Since object shapes store the capacity of an object, we no longer
need the numiv field on RObjects. This gives us one extra slot which
we can use to give embedded objects one more instance variable (for a
total of 3 ivs). This commit removes the concept of numiv from RObject.
2022-11-10 10:11:34 -05:00
Peter Zhu
4a8cd9e8bc Use shared flags of the type
The ELTS_SHARED flag is generic, so we should prefer to use the flags
specific to the type (STR_SHARED for strings and RARRAY_SHARED_FLAG
for arrays).
2022-11-02 11:03:21 -04:00
Nobuyoshi Nakada
92c7417d73
Adjust indents [ci skip] 2022-07-22 21:59:27 +09:00
Nobuyoshi Nakada
c6aa65430f
Get rid of magic numbers 2022-07-22 10:41:44 +09:00
Nobuyoshi Nakada
cf7d07570f
Dump non-ASCII char as unsigned
Non-ASCII code may be negative on platforms where plain char is signed.
2022-07-22 09:56:48 +09:00
Jean byroot Boussier
f0ae583a3d Revert "objspace_dump.c: skip dumping method name if not pure ASCII"
This reverts commit 79406e3600.
2022-07-21 19:56:08 +02:00
Jean Boussier
79406e3600 objspace_dump.c: skip dumping method name if not pure ASCII
Sidekiq has a method named `❨╯°□°❩╯︵┻━┻` which corrupts
heap dumps.

Normally we could just dump it as is since it's valid UTF-8 and needs
no escaping. But our code to escape control characters isn't UTF-8
aware so it's more complicated than it seems.

Ultimately since the overwhelming majority of method names are
pure ASCII, it's not a big loss to just skip it.
2022-07-21 18:43:45 +02:00
Takashi Kokubun
5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
Nobuyoshi Nakada
a070d4cceb
Local functions should be static 2022-07-05 09:30:05 +09:00
Jean Boussier
890df5f812 ObjectSpace.dump: Include string coderange
I suspect that some shared pages are invalidated because
some static strings don't have their coderange set eagerly.

So the first time they are scanned, the entire memory page is
invalidated.

Being able to see the coderange in `ObjectSpace` would help debug
this.

And in addition `dump` currently calls `is_broken_string()` and `is_ascii_string()`
which both end up scanning the string and assigning coderange. I think it's
undesirable as `dump` should be read only.
2022-07-04 20:04:59 +02:00
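A short sketch of reading the coderange back out of a dump. It assumes the field is emitted under a `"coderange"` key; the helper `coderange_of` is hypothetical and returns `nil` on Ruby versions that do not include this commit.

```ruby
require "objspace"
require "json"

# Return the coderange recorded in the dump of a string, or nil if absent.
def coderange_of(str)
  JSON.parse(ObjectSpace.dump(str))["coderange"]
end

samples = {
  "ascii"   => "plain ascii",
  "unicode" => "caf\u00e9",
  "broken"  => "\xff".dup.force_encoding("UTF-8"),
}

samples.each do |label, str|
  puts "#{label}: #{coderange_of(str).inspect}"
end
```

Because the dump is meant to be read only, a string that has never been scanned may report an unknown coderange rather than its eventual value.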
Peter Zhu
fb724a887a Show embed status of array when len is 0 in objspace dump 2022-03-01 10:55:53 -05:00
Matt Valentine-House
9fab2c1a1a Add the size pool slot size to the output of ObjectSpace.dump/dump_all 2022-02-03 15:07:35 -05:00
Peter Zhu
4f88acc833 Fix compiler warnings in objspace_dump.c when assertions are turned on
Example:

```
In file included from ../../../include/ruby/defines.h:72,
                 from ../../../include/ruby/ruby.h:23,
                 from ../../../gc.h:3,
                 from ../../../ext/objspace/objspace_dump.c:15:
../../../ext/objspace/objspace_dump.c: In function ‘dump_append_ld’:
../../../ext/objspace/objspace_dump.c:95:26: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int’ [-Wsign-compare]
   95 |     RUBY_ASSERT(required <= width);
      |                          ^~
```
2021-04-26 19:26:50 -04:00
Jean Boussier
3a888398a6 objspace_dump.c: tag singleton classes and reference the superclass 2021-02-04 09:53:31 -08:00
Jean Boussier
6ca3d1af33 objspace_dump.c: Handle allocation path and line missing 2021-01-20 10:48:13 -08:00
Aaron Patterson
18b3f0f54c Make ext/objspace ASAN friendly
ext/objspace iterates over the heap, but some slots in the heap are
poisoned, so we need to take care of that when running with ASAN
2020-09-28 08:20:23 -07:00
Jean Boussier
fbba6bd4e3 Parse ObjectSpace.dump_all / dump arguments in Ruby to avoid allocation noise
[Feature #17045] ObjectSpace.dump_all should allocate as little as possible in the GC heap

Up until this commit ObjectSpace.dump_all allocates two Hashes because of `rb_scan_args`.

It also can allocate a `File` because of `rb_io_get_write_io`.

These allocations are problematic because `dump_all` dumps the Ruby
heap, so it should try to modify as little as possible what it is
observing.
2020-09-15 09:18:13 -07:00
Kazuhiro NISHIYAMA
406559a268
Add missing break
pointed out by Coverity Scan
2020-09-11 11:02:24 +09:00
Jean Boussier
5001cc4716 Optimize ObjectSpace.dump_all
The two main optimizations are:
  - buffer writes for improved performance
  - avoid formatting functions when possible

```

|                   |compare-ruby|built-ruby|
|:------------------|-----------:|---------:|
|dump_all_string    |       1.038|   195.925|
|                   |           -|   188.77x|
|dump_all_file      |      33.453|   139.645|
|                   |           -|     4.17x|
|dump_all_dev_null  |      44.030|   278.552|
|                   |           -|     6.33x|
```
2020-09-09 11:11:36 -07:00
Jean Boussier
b49a870414 Add a :since option to dump_all
This is useful to see what a block of code allocated, e.g.

```
GC.start
GC.disable
ObjectSpace.trace_object_allocations do
  # run some code
end
gc_gen = GC.count
allocations = ObjectSpace.dump_all(output: :file, since: gc_gen)
GC.enable
GC.start
retentions = ObjectSpace.dump_all(output: :file, since: gc_gen)
```
2020-09-09 08:05:14 -07:00
John Hawthorn
971857c332 Fix method name escaping in ObjectSpace.dump
It's possible to define methods with any name, even if the parser
doesn't support it and it can only be used with e.g. send.

This fixes an issue where invalid JSON was output from ObjectSpace.dump
when a method name needed escaping.
2020-08-17 09:47:53 -07:00
Nobuyoshi Nakada
27f7b047e0
Also escape DEL code 2020-08-17 22:36:48 +09:00
Nobuyoshi Nakada
7b4b5e0840
Fixed the radix for control chars 2020-08-17 22:30:26 +09:00
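The escaping concern behind the three commits above can be sketched in plain Ruby. This is not the C code from objspace_dump.c; it is an illustrative helper (the name `escape_for_json` is made up) that escapes quotes, backslashes, control characters and DEL as hexadecimal `\uXXXX` sequences while leaving other UTF-8 characters untouched.

```ruby
require "json"

# Escape a method name so it can be embedded in a JSON string literal.
def escape_for_json(str)
  str.each_char.map do |ch|
    code = ch.ord
    if ch == '"' || ch == "\\"
      "\\#{ch}"                   # quote and backslash get a backslash prefix
    elsif code < 0x20 || code == 0x7f
      format("\\u%04x", code)     # control chars and DEL, in hexadecimal
    else
      ch                          # everything else, including non-ASCII, as is
    end
  end.join
end

name = "tab\there \"quoted\" del\x7f ❨╯°□°❩╯︵┻━┻"
encoded = %("#{escape_for_json(name)}")

puts encoded
# Round-trip through a JSON parser to confirm the literal is valid.
puts JSON.parse("[#{encoded}]").first == name
```

Iterating by character rather than by byte is what keeps multi-byte UTF-8 names intact in this sketch.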
Jean Boussier
6a0cb1d649 Avoid allocating a string when dumping an anonymous module or class 2020-07-23 10:52:30 +09:00
Alan Wu
cbf52087a2 Fix missing imemo cases in objspace_dump by refactoring
imemo_callcache and imemo_callinfo were not handled by the `objspace`
module and were showing up as "unknown" in the dump. Extract the code for
naming imemos and use that in both the GC and the `objspace` module.
2020-07-10 22:42:35 -04:00
Nobuyoshi Nakada
e474c189da
Suppress -Wswitch warnings 2020-04-08 15:13:37 +09:00
卜部昌平
5e22f873ed decouple internal.h headers
Saves committers' daily life by avoiding #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not to be
self-contained (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
git
e315f3a134 * expand tabs. 2019-07-31 10:22:47 +09:00