mirror of https://github.com/ruby/ruby.git synced 2025-08-15 13:39:04 +02:00


Jeremy Evans e4f85bfc31 Implement Set as a core class

Set has been an autoloaded standard library since Ruby 3.2. The standard library Set is less efficient than it could be, as it uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`. Other than `s/st_/set_/`, the main difference is that the stored records do not have values, making them 1/3 smaller. `st_table_entry` stores `hash`, `key`, and `record` (value), while `set_table_entry` only stores `hash` and `key`. This results in large sets using ~33% less memory compared to stdlib Set. For small sets, core Set uses 12% more memory (160 byte object slot and 64 malloc bytes, while stdlib set uses 40 for Set and 160 for Hash). More memory is used because the set_table is embedded and 72 bytes in the object slot are currently wasted. Hopefully we can make this more efficient and have it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print methods, which were moved to `lib/pp.rb` (which is where the pretty_print methods for other core classes are defined). As is typical for core classes, internal calls call C functions and not Ruby methods. For example, to check if something is a Set, `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the related object.
* Almost all methods use the same algorithm that the pure-Ruby implementation used. The exception is when calling `Set#divide` with a block with 2-arity. The pure-Ruby method used tsort to implement this. I developed an algorithm that only allocates a single intermediate hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in `inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.

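The memory figures in the first bullet can be checked with a little arithmetic. The sketch below assumes a 64-bit build in which every entry field occupies one 8-byte word; that word size and the constant names are assumptions for illustration, while the byte counts for small sets are taken from the commit message.

```ruby
# Assumption: 64-bit build, each entry field is one 8-byte machine word.
WORD_BYTES = 8

st_entry_bytes  = 3 * WORD_BYTES # st_table_entry: hash, key, record
set_entry_bytes = 2 * WORD_BYTES # set_table_entry: hash, key

# Per-entry saving, which dominates for large sets.
large_set_saving = 1 - set_entry_bytes.fdiv(st_entry_bytes)

# Small-set totals quoted in the commit message.
core_small_bytes   = 160 + 64  # object slot + malloc bytes
stdlib_small_bytes = 40 + 160  # Set object + Hash object

small_set_overhead = core_small_bytes.fdiv(stdlib_small_bytes) - 1

puts "per-entry saving:   #{(large_set_saving * 100).round}%"
puts "small-set overhead: #{(small_set_overhead * 100).round}%"
```

Under these assumptions the per-entry saving rounds to 33% and the small-set overhead to 12%, matching the numbers stated above.
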
This includes a comprehensive benchmark suite for all public Set methods. As you would expect, the native version is faster in the vast majority of cases, and multiple times faster in many cases. There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table optimization that Hash does, so a new set_table is initialized for each call. I'm not sure it's worth the complexity to have an AR table-like optimization for small sets (for hashes it makes sense, as small hashes are used everywhere in Ruby).

The rbs and repl_type_completor bundled gems will need updates to support core Set. The pull request marks them as allowed failures.

This passes all set tests with no changes. The following specs needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same object as both the first and second argument (this seems like an issue with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C function).
* `Set#flatten_merge` protected method is not implemented.

Previously, `set.rb` added a `SortedSet` autoload, which loads `set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb` with a `SortedSet` autoload, but I recommend removing it and `set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`, reflecting that switch to a core class. This does not move the spec files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as possible, and only adds set_* types and functions as needed.

The underlying set_table implementation is stored in st.c, but there is no public C-API for it, nor is there one planned, in order to keep the ability to change the internals going forward. For internal uses of st_table with Qtrue values, those can probably be replaced with set_table. To do that, include internal/set_table.h. To handle symbol visibility (rb_ prefix), internal/set_table.h uses the same macro approach that include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c. There isn't currently a C-API for the Set class, though C-API functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>

2025-04-26 10:31:11 +09:00
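
For readers unfamiliar with the two-argument form of `Set#divide` mentioned in the commit message, here is a small usage sketch. It only exercises the public method; the sample numbers and variable names are illustrative, and it says nothing about how either implementation works internally.

```ruby
# Load the stdlib Set on Rubies where Set is not already available.
require "set" unless defined?(Set)

numbers = Set[1, 3, 4, 6, 9, 10, 11]

# With a 2-arity block, elements are grouped when the block links them,
# directly or through a chain of other elements. Here: consecutive integers.
groups = numbers.divide { |a, b| (a - b).abs == 1 }

expected = Set[Set[1], Set[3, 4], Set[6], Set[9, 10, 11]]
puts groups == expected # prints "true"
```

The comparison uses set equality rather than `inspect` output, since the commit message notes that `inspect` behavior differs between the stdlib and core versions.
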
bin	Sync Bundler and adapt to new spec setup	2025-04-10 19:21:51 +09:00
bundler	all of examples at commands/newgem_spec.rb are working on ruby repo	2025-04-23 18:09:55 +09:00
lib	Use release version of turbo_tests	2025-03-26 19:37:22 +09:00
mspec	Update to ruby/mspec@484310d	2025-03-27 11:09:24 +01:00
ruby	Implement Set as a core class	2025-04-26 10:31:11 +09:00
syntax_suggest	[ruby/syntax_suggest] Resolve to lint failure of standardrb	2025-01-10 05:38:39 +00:00
bundled_gems.mspec	Convert ostruct to openstruct	2025-01-08 17:12:19 +09:00
bundled_gems_spec.rb	Refactor bundled condition	2025-04-10 17:29:39 +09:00
default.mspec	Convert ostruct to openstruct	2025-01-08 17:12:19 +09:00
mmtk.mspec	[ruby/mmtk] Add MMTk test exclusions for Ruby CI	2024-12-05 20:12:45 +00:00
README.md	[DOC] Update to use `SPECOPTS` instead of `MSPECOPT`	2023-08-12 12:33:05 +09:00

README.md

spec/bundler

spec/bundler contains the RSpec examples for the Bundler library (lib/bundler.rb, lib/bundler/*).

Running spec/bundler

To run rspec for bundler:

make test-bundler

or run rspec with parallel execution:

make test-bundler-parallel

If you specify BUNDLER_SPECS=foo/bar_spec.rb then only spec/bundler/foo/bar_spec.rb will be run.

spec/ruby

ruby/spec (https://github.com/ruby/spec/) is a test suite for the Ruby language.

Once a month, @eregon merges the in-tree copy under spec/ruby with the upstream repository, preserving the commits and history. The same happens for other implementations such as JRuby and TruffleRuby.

Feel free to modify the in-tree spec/ruby. The purpose of the in-tree copy is to make it easy for MRI developers to contribute to ruby/spec.

New features, additional tests for existing features, and regression tests are all welcome in ruby/spec. Very little behavior is implementation-specific because, in the end, user programs tend to rely on every behavior MRI exhibits. In other words: if adding a spec might reveal a bug in another implementation, then it is worth adding it. Currently, the only module which is MRI-specific is RubyVM.

Changing behavior and version guards

Version guards (ruby_version_is) must be added for new features or features which change behavior or are removed. This is necessary for other Ruby implementations to still be able to run the specs and contribute new specs.

For example, change:

describe "Some spec" do
  it "some example" do
    # Old behavior for Ruby < 2.7
  end
end

to:

describe "Some spec" do
  ruby_version_is ""..."2.7" do
    it "some example" do
      # Old behavior for Ruby < 2.7
    end
  end

  ruby_version_is "2.7" do
    it "some example" do
      # New behavior for Ruby >= 2.7
    end
  end
end

See spec/ruby/CONTRIBUTING.md for more documentation about guards.
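
To make the guard semantics above concrete, here is a minimal sketch of how such a check could behave. It is not mspec's implementation: the method name `version_guard` and its `current` parameter are hypothetical, chosen so the example can be run against any version string.

```ruby
require "rubygems" # provides Gem::Version for comparing version strings

# Hypothetical stand-in for a version guard; NOT mspec's actual code.
# `requirement` is either a single version string ("this version or newer")
# or a Range of version strings, as in the README example.
def version_guard(requirement, current = RUBY_VERSION)
  now = Gem::Version.new(current)
  matches =
    if requirement.is_a?(Range)
      # An empty lower bound ("") is treated as "no lower limit".
      low = Gem::Version.new(requirement.begin.empty? ? "0" : requirement.begin)
      high = Gem::Version.new(requirement.end)
      below_high = requirement.exclude_end? ? now < high : now <= high
      now >= low && below_high
    else
      now >= Gem::Version.new(requirement)
    end
  yield if matches
end

ran = []
version_guard(""..."2.7", "2.6.5") { ran << "old behavior" }
version_guard("2.7", "2.6.5") { ran << "new behavior" }
puts ran.inspect # prints ["old behavior"]
```

With a current version of 2.6.5 only the first block runs; passing "2.7.0" or later instead would run only the second.
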

To verify specs are compatible with older Ruby versions:

cd spec/ruby
$RUBY_MANAGER use 2.4.9
../mspec/bin/mspec -j

Running ruby/spec

To run all specs:

make test-spec

Extra arguments can be added via SPECOPTS. For instance, to show the help:

make test-spec SPECOPTS=-h

You can also run the specs in parallel, which is currently experimental. It takes around 10s instead of 60s on a quad-core laptop.

make test-spec SPECOPTS=-j

To run a specific test, add its path to the command:

make test-spec SPECOPTS=spec/ruby/language/for_spec.rb

If ruby trunk is your current ruby in $PATH, you can also run mspec directly:

# change ruby to trunk
ruby -v # => trunk
spec/mspec/bin/mspec spec/ruby/language/for_spec.rb

ruby/spec and test/

The main difference between a "spec" under spec/ruby/ and a test under test/ is that specs document what they test. This is extremely valuable when reading these tests, as it helps to quickly understand what specific behavior is tested, and how a method should behave. Basic English is fine for spec descriptions. Specs also tend to have few expectations (assertions) per spec, as they specify one aspect of the behavior and not everything at once. Beyond that, the syntax is slightly different but it does the same thing: assert_equal 3, 1+2 is just (1+2).should == 3.

Example:

describe "The for expression" do
  it "iterates over an Enumerable passing each element to the block" do
    j = 0
    for i in 1..3
      j += i
    end
    j.should == 6
  end
end
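
The equivalence between `assert_equal 3, 1+2` and `(1+2).should == 3` can be illustrated with a toy matcher. This is only a sketch of the general shape: `ShouldProxy`, `MiniShould`, `ExpectationFailed`, and the local `assert_equal` are hypothetical names, not the classes mspec or test-unit actually use.

```ruby
# Hypothetical mini-framework, for illustration only.
class ExpectationFailed < StandardError; end

# Wraps a value so that `==` raises on mismatch instead of returning false.
class ShouldProxy
  def initialize(actual)
    @actual = actual
  end

  def ==(expected)
    return true if @actual == expected
    raise ExpectationFailed, "Expected #{@actual.inspect} == #{expected.inspect}"
  end
end

module MiniShould
  def should
    ShouldProxy.new(self)
  end
end

# Limited to Integer here to keep the monkey patch small.
Integer.include(MiniShould)

# Test-style assertion with the same pass/fail behavior.
def assert_equal(expected, actual)
  return true if expected == actual
  raise ExpectationFailed, "Expected #{expected.inspect}, got #{actual.inspect}"
end

assert_equal 3, 1 + 2
(1 + 2).should == 3
puts "both styles passed"
```

Both lines check the same condition; the spec style simply reads as a statement about the value under test.
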

For more details, see spec/ruby/CONTRIBUTING.md.

spec/syntax_suggest

Running spec/syntax_suggest

To run rspec for syntax_suggest:

make test-syntax-suggest

If you specify SYNTAX_SUGGEST_SPECS=foo/bar_spec.rb then only spec/syntax_suggest/foo/bar_spec.rb will be run.