mirror of https://github.com/ruby/ruby.git synced 2025-08-15 13:39:04 +02:00


Jeremy Evans e4f85bfc31 Implement Set as a core class	2025-04-26 10:31:11 +09:00

Set has been an autoloaded standard library since Ruby 3.2. The standard library Set is less efficient than it could be, as it uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`. Other than `s/st_/set_/`, the main difference is that the stored records do not have values, making them 1/3 smaller. `st_table_entry` stores `hash`, `key`, and `record` (value), while `set_table_entry` only stores `hash` and `key`. This results in large sets using ~33% less memory compared to stdlib Set. For small sets, core Set uses 12% more memory (160 byte object slot and 64 malloc bytes, while stdlib set uses 40 for Set and 160 for Hash). More memory is used because the set_table is embedded and 72 bytes in the object slot are currently wasted. Hopefully we can make this more efficient and have it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print methods, which were moved to `lib/pp.rb` (which is where the pretty_print methods for other core classes are defined). As is typical for core classes, internal calls call C functions and not Ruby methods. For example, to check if something is a Set, `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the related object.
* Almost all methods use the same algorithm that the pure-Ruby implementation used. The exception is when calling `Set#divide` with a block with 2-arity. The pure-Ruby method used tsort to implement this. I developed an algorithm that only allocates a single intermediate hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in `inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.

This includes a comprehensive benchmark suite for all public Set methods. As you would expect, the native version is faster in the vast majority of cases, and multiple times faster in many cases. There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table optimization that Hash does, so a new set_table is initialized for each call. I'm not sure it's worth the complexity to have an AR table-like optimization for small sets (for hashes it makes sense, as small hashes are used everywhere in Ruby).

The rbs and repl_type_completor bundled gems will need updates to support core Set. The pull request marks them as allowed failures.

This passes all set tests with no changes. The following specs needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same object as both the first and second argument (this seems like an issue with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C function).
* `Set#flatten_merge` protected method is not implemented.

Previously, `set.rb` added a `SortedSet` autoload, which loads `set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb` with a `SortedSet` autoload, but I recommend removing it and `set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`, reflecting the switch to a core class. This does not move the spec files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as possible, and only adds set_* types and functions as needed. The underlying set_table implementation is stored in st.c, but there is no public C-API for it, nor is there one planned, in order to keep the ability to change the internals going forward.

For internal uses of st_table with Qtrue values, those can probably be replaced with set_table. To do that, include internal/set_table.h. To handle symbol visibility (rb_ prefix), internal/set_table.h uses the same macro approach that include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c. There isn't currently a C-API for the Set class, though C-API functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>
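The memory figures quoted in the commit message can be reproduced with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes each entry field (hash, key, record) occupies one 8-byte word on a 64-bit build and ignores any other table overhead. The method names are made up for this illustration.

```ruby
# Fraction of per-entry memory saved by dropping the value field.
# Assumption: hash, key and record each occupy one word (8 bytes by default).
def entry_saving(word_bytes = 8)
  st_entry  = 3 * word_bytes # hash + key + record
  set_entry = 2 * word_bytes # hash + key
  1 - set_entry.fdiv(st_entry)
end

# Relative extra memory for a small core Set, using the byte counts
# quoted in the commit message.
def small_set_overhead
  core   = 160 + 64 # object slot + malloc bytes
  stdlib = 40 + 160 # Set object + Hash object
  core.fdiv(stdlib) - 1
end

puts format("per-entry saving:   %.0f%%", entry_saving * 100)
puts format("small-set overhead: %.0f%%", small_set_overhead * 100)
```

Under these assumptions the two results round to 33% and 12%, matching the "~33% less memory" and "12% more memory" statements.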
bin	Update to ruby/spec@54c391e	2024-11-06 21:58:28 +01:00
command_line	Freeze $/ and make it ractor safe	2025-03-27 17:54:56 +01:00
core	Implement Set as a core class	2025-04-26 10:31:11 +09:00
fixtures	Update to ruby/spec@18032a7	2025-01-07 12:30:52 +01:00
language	Implement Set as a core class	2025-04-26 10:31:11 +09:00
library	Implement Set as a core class	2025-04-26 10:31:11 +09:00
optional/capi	Expose `ruby_thread_has_gvl_p`.	2025-04-14 18:28:09 +09:00
security	spec/mspec/tool/wrap_with_guard.rb 'ruby_version_is ...3.5' spec/ruby/security/cve_2020_10663_spec.rb	2025-03-28 12:44:53 +09:00
shared	Removed Solaris conditions from optional and shared directories	2025-04-02 16:24:47 +09:00
.gitignore
.mspec.constants	Define RactorLocalSingleton on .mspec.constants	2024-10-01 18:41:38 +09:00
.rubocop.yml	Update to ruby/spec@54c391e	2024-11-06 21:58:28 +01:00
.rubocop_todo.yml	Update to ruby/spec@54c391e	2024-11-06 21:58:28 +01:00
CONTRIBUTING.md	Update to ruby/spec@18032a7	2025-01-07 12:30:52 +01:00
default.mspec	Define RactorLocalSingleton on .mspec.constants	2024-10-01 18:41:38 +09:00
LICENSE
README.md	Update to ruby/spec@5e579e2	2025-03-27 11:09:24 +01:00
spec_helper.rb	Update to ruby/spec@96d1072	2023-09-04 16:07:46 +02:00
TODO

README.md

The Ruby Spec Suite

The Ruby Spec Suite, abbreviated ruby/spec, is a test suite for the behavior of the Ruby programming language.

Description and Motivation

It is not a standardized specification like the ISO one, and does not aim to become one. Instead, it is a practical tool to describe and test the behavior of Ruby with code.

Every code example has a textual description, which presents several advantages:

It is easier to understand the intent of the author
It documents how recent versions of Ruby should behave
It helps Ruby implementations to agree on a common behavior

The specs are written with syntax similar to RSpec 2. They are run with MSpec, the purpose-built framework for running the Ruby Spec Suite. For more information, see the MSpec project.
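As a rough illustration of that style, the sketch below pairs each example with a textual description. It is not taken from the suite. So that it runs without MSpec, `describe`, `it` and `should` are defined here as minimal hypothetical stand-ins (the `MiniSpec` module is invented for this example); real spec files rely on MSpec for these methods instead.

```ruby
# Hypothetical stand-ins (not MSpec's implementation) so the sketch runs alone.
module MiniSpec
  @passed = []

  class << self
    attr_reader :passed
    attr_accessor :subject
  end

  # Wraps a value so that `value.should == expected` raises on a mismatch.
  class Expectation
    def initialize(actual)
      @actual = actual
    end

    def ==(expected)
      return true if @actual == expected
      raise "expected #{expected.inspect}, got #{@actual.inspect}"
    end
  end
end

class Object
  def should
    MiniSpec::Expectation.new(self)
  end
end

# Records the subject under test and runs the enclosed examples.
def describe(subject)
  MiniSpec.subject = subject
  yield
end

# Runs one example and logs its description if no expectation raised.
def it(description)
  yield
  MiniSpec.passed << "#{MiniSpec.subject} #{description}"
end

# The part below mirrors the shape of a spec file.
describe "Array#size" do
  it "returns the number of elements" do
    [1, 2, 3].size.should == 3
  end

  it "returns 0 for an empty array" do
    [].size.should == 0
  end
end

puts MiniSpec.passed
```

Reading the recorded descriptions aloud ("Array#size returns the number of elements") shows how the description doubles as documentation of the expected behavior.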

The specs describe the language syntax, the core library, the standard library, the C API for extensions and the command line flags. The language specs are grouped by keyword while the core and standard library specs are grouped by class and method.

ruby/spec is known to be tested in these implementations for every commit:

MRI on 30 platforms and 4 versions
JRuby for both 1.7 and 9.x
TruffleRuby
Opal
Artichoke

ruby/spec describes the behavior of Ruby 3.1 and more recent Ruby versions. More precisely, the latest stable MRI release of each of these series (3.1.x, 3.2.x, etc.) should pass all specs of ruby/spec, and those releases are tested in CI.

Synchronization with Ruby Implementations

The specs are synchronized both ways, around once a month, by @andrykonchin between ruby/spec, MRI, JRuby and TruffleRuby, using this script. Each of these repositories has a full copy of the specs under spec/ruby to ease editing specs. Any of these repositories can be used to add or edit specs; use whichever is most convenient for you.

For testing the development version of a Ruby implementation, one should always test against that implementation's copy of the specs under spec/ruby, as that's what the Ruby implementation tests against in its CI. Also, this repository doesn't always contain the latest spec changes from MRI (it's synchronized monthly), and does not contain tags (specs marked as failing on that Ruby implementation). Running specs on a Ruby implementation can be done with:

$ cd ruby_implementation/spec/ruby
# Add ../ruby_implementation/bin in PATH, or pass -t /path/to/bin/ruby
$ ../mspec/bin/mspec

Specs for old Ruby versions

For older specs try these commits:

Ruby 2.0.0-p647 - Suite using MSpec (may encounter 2 failures)
Ruby 2.1.9 - Suite using MSpec
Ruby 2.2.10 - Suite using MSpec
Ruby 2.3.8 - Suite using MSpec
Ruby 2.4.10 - Suite using MSpec
Ruby 2.5.9 - Suite using MSpec
Ruby 2.6.10 - Suite using MSpec
Ruby 2.7.8 - Suite using MSpec
Ruby 3.0.7 - Suite using MSpec

Running the specs

First, clone this repository:

$ git clone https://github.com/ruby/spec.git

Then move to it:

$ cd spec

Clone MSpec:

$ git clone https://github.com/ruby/mspec.git ../mspec

And run the spec suite:

$ ../mspec/bin/mspec

This will execute all the specs using the executable named ruby on your current PATH.

Running Specs with a Specific Ruby Implementation

Use the -t option to specify the Ruby implementation with which to run the specs. The argument is either a full path to the Ruby binary, or an executable in $PATH.

$ ../mspec/bin/mspec -t /path/to/some/bin/ruby

Running Selected Specs

To run a single spec file, pass the filename to mspec:

$ ../mspec/bin/mspec core/kernel/kind_of_spec.rb

You can also pass a directory, in which case all specs in that directory will be run:

$ ../mspec/bin/mspec core/kernel

Finally, you can also run them per group as defined in default.mspec. The following command will run all language specs:

$ ../mspec/bin/mspec :language

In similar fashion, the following commands run the respective specs:

$ ../mspec/bin/mspec :core
$ ../mspec/bin/mspec :library
$ ../mspec/bin/mspec :capi
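How a group name expands to directories can be pictured as a plain lookup. The table below is a hypothetical illustration, not a copy of default.mspec: the directory names are assumed from the top-level layout of this repository (language, core, library, optional/capi), and `expand_spec_args` is a name invented for this sketch.

```ruby
# Hypothetical group table; the authoritative definitions live in default.mspec.
SPEC_GROUPS = {
  language: ["language"],
  core:     ["core"],
  library:  ["library"],
  capi:     ["optional/capi"],
}.freeze

# Expands arguments such as ":core" into directories; plain paths pass through.
def expand_spec_args(args)
  args.flat_map do |arg|
    if arg.start_with?(":")
      SPEC_GROUPS.fetch(arg.delete_prefix(":").to_sym) do
        raise ArgumentError, "unknown group #{arg}"
      end
    else
      [arg]
    end
  end
end

p expand_spec_args([":language", "core/kernel/kind_of_spec.rb"])
```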

Sanity Checks When Running Specs

A number of checks for various kinds of "leaks" (file descriptors, temporary files, threads, subprocesses, ENV, ARGV, global encodings, top-level constants) can be enabled with CHECK_LEAKS=true:

$ CHECK_LEAKS=true ../mspec/bin/mspec

New top-level constants should only be introduced when needed or follow the pattern <ClassBeingTested>Specs such as module StringSpecs. Other constants used for testing should be nested under such a module.
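A minimal sketch of that naming pattern follows. The helper `new_top_level_constants` is written for this illustration and is not how MSpec implements its leak check. It only shows that nesting helpers under one `<ClassBeingTested>Specs` module adds a single top-level name.

```ruby
# Returns the top-level constants that exist after the block but not before it.
# Illustration only; MSpec's own CHECK_LEAKS logic is not reproduced here.
def new_top_level_constants
  before = Object.constants
  yield
  Object.constants - before
end

added = new_top_level_constants do
  # One top-level module named after the class being tested.
  module StringSpecs
    # Helper class nested under the module, so it adds no top-level name.
    class MyString < String; end

    def self.sample
      MyString.new("hello")
    end
  end
end

p added
p StringSpecs.sample.class
```

Only `StringSpecs` shows up as a new top-level constant, while the helper is reached as `StringSpecs::MyString`.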

Exceptions to these rules are contained in the file .mspec.constants. MSpec can automatically add new top-level constants in this file with:

$ CHECK_LEAKS=save ../mspec/bin/mspec file

Running Specs on S390x CPU Architecture

Run the specs with DFLTCC=0 if you see failing specs related to the zlib library on the s390x CPU architecture. The failures can happen when the zlib library has the patch madler/zlib#410 applied, which enables a deflate algorithm that produces a different compressed byte stream.

$ DFLTCC=0 ../mspec/bin/mspec
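One way to see why a different byte stream matters: an assertion on the exact compressed bytes depends on the zlib build, whereas a round-trip assertion does not. The sketch below uses Ruby's zlib bindings and is not taken from the specs. `zlib_round_trip?` is a helper name invented for this illustration.

```ruby
require "zlib"

# True when deflating and then inflating returns the original data.
# The exact compressed bytes are an implementation detail of the zlib build,
# so they are deliberately not compared against a fixed expected string.
def zlib_round_trip?(data)
  compressed = Zlib::Deflate.deflate(data)
  Zlib::Inflate.inflate(compressed) == data
end

puts zlib_round_trip?("ruby/spec " * 100)
```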

Contributing and Writing Specs

See CONTRIBUTING.md for documentation about contributing and writing specs (guards, matchers, etc).

Dependencies

These command-line executables are needed to run the specs.

echo
stat for core/file/*time_spec.rb
find for core/file/fixtures/file_types.rb (package findutils, not needed on Windows)

The file /etc/services is required for socket specs (package netbase on Debian, not needed on Windows).
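A quick way to check for these dependencies before a run is sketched below. Both method names are invented for this example, and the lookup is a simple POSIX-style PATH scan. On Windows, executables carry extensions, and find and /etc/services are not needed there, so the result would need a different interpretation.

```ruby
# Returns the path of the first executable called `name` found on PATH, or nil.
def find_executable_on_path(name, path_env = ENV.fetch("PATH", ""))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Lists the dependencies named above that appear to be missing on this machine.
def missing_spec_dependencies
  missing = %w[echo stat find].reject { |cmd| find_executable_on_path(cmd) }
  missing << "/etc/services" unless File.exist?("/etc/services")
  missing
end

missing = missing_spec_dependencies
puts missing.empty? ? "all dependencies found" : "missing: #{missing.join(', ')}"
```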

Socket specs from rubysl-socket

Most specs under library/socket were imported from the rubysl-socket project (which is no longer on GitHub). The 3 copyright holders of rubysl-socket, Yorick Peterse, Chuck Remes and Brian Shirai, agreed to relicense those specs under the MIT license in ruby/spec.

History and RubySpec

This project was originally born from Rubinius tests being converted to the spec style. The revision history of these specs is available here. These specs were later extracted to their own project, RubySpec, with a specific vision and principles. At the end of 2014, Brian Shirai, the creator of RubySpec, decided to end RubySpec. A couple of months later, the different repositories were merged and the project was revived. On 12 January 2016, the name was changed to "The Ruby Spec Suite" for clarity and to let the RubySpec ideology rest in peace.