Set has been an autoloaded standard library since Ruby 3.2. The standard library Set is less efficient than it could be, as it uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`. Other than `s/st_/set_/`, the main difference is that the stored records do not have values, making them 1/3 smaller. `st_table_entry` stores `hash`, `key`, and `record` (value), while `set_table_entry` only stores `hash` and `key`. This results in large sets using ~33% less memory compared to stdlib Set. For small sets, core Set uses 12% more memory (160 byte object slot and 64 malloc bytes, while stdlib set uses 40 for Set and 160 for Hash). More memory is used because the set_table is embedded and 72 bytes in the object slot are currently wasted. Hopefully we can make this more efficient and have it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print methods, which were moved to `lib/pp.rb` (which is where the pretty_print methods for other core classes are defined). As is typical for core classes, internal calls call C functions and not Ruby methods. For example, to check if something is a Set, `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the related object.
* Almost all methods use the same algorithm that the pure-Ruby implementation used. The exception is when calling `Set#divide` with a block with 2-arity. The pure-Ruby method used tsort to implement this. I developed an algorithm that only allocates a single intermediate hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in `inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.

This includes a comprehensive benchmark suite for all public Set methods. As you would expect, the native version is faster in the vast majority of cases, and multiple times faster in many cases. There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table optimization that Hash does, so a new set_table is initialized for each call. I'm not sure it's worth the complexity to have an AR table-like optimization for small sets (for hashes it makes sense, as small hashes are used everywhere in Ruby).

The rbs and repl_type_completor bundled gems will need updates to support core Set. The pull request marks them as allowed failures.

This passes all set tests with no changes. The following specs needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same object as both the first and second argument (this seems like an issue with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C function).
* `Set#flatten_merge` protected method is not implemented.

Previously, `set.rb` added a `SortedSet` autoload, which loads `set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb` with a `SortedSet` autoload, but I recommend removing it and `set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`, reflecting that switch to a core class. This does not move the spec files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as possible, and only adds set_* types and functions as needed. The underlying set_table implementation is stored in st.c, but there is no public C-API for it, nor is there one planned, in order to keep the ability to change the internals going forward.

For internal uses of st_table with Qtrue values, those can probably be replaced with set_table. To do that, include internal/set_table.h. To handle symbol visibility (rb_ prefix), internal/set_table.h uses the same macro approach that include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c. There isn't currently a C-API for the Set class, though C-API functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>
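The memory figures quoted in the commit message can be checked with a little arithmetic. The sketch below assumes each table-entry field occupies one 8-byte word (a typical 64-bit build; the commit message does not state the field sizes) and otherwise uses only the byte counts given above.

```ruby
# Assumption: every entry field is one 8-byte machine word (64-bit build).
WORD_BYTES = 8

st_entry_bytes  = 3 * WORD_BYTES  # st_table_entry: hash, key, record
set_entry_bytes = 2 * WORD_BYTES  # set_table_entry: hash, key

# Fraction of per-entry memory saved by dropping the record field.
entry_saving = 1 - set_entry_bytes.fdiv(st_entry_bytes)

# Small-set figures quoted in the commit message, in bytes.
core_small   = 160 + 64   # object slot + malloc bytes
stdlib_small = 40 + 160   # Set object + Hash object

# Relative extra memory used by a small core Set.
small_overhead = core_small.fdiv(stdlib_small) - 1

puts format("per-entry saving: %.1f%%", entry_saving * 100)
puts format("small-set overhead: %.1f%%", small_overhead * 100)
```

Under that word-size assumption, the entry shrinks from 24 to 16 bytes, which matches the "1/3 smaller" claim, and 224 bytes against 200 bytes matches the quoted 12% overhead for small sets.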
The Ruby Spec Suite
The Ruby Spec Suite, abbreviated ruby/spec, is a test suite for the behavior of the Ruby programming language.
Description and Motivation
It is not a standardized specification like the ISO one, and does not aim to become one. Instead, it is a practical tool to describe and test the behavior of Ruby with code.
Every code example has a textual description, which presents several advantages:
- It is easier to understand the intent of the author
- It documents how recent versions of Ruby should behave
- It helps Ruby implementations to agree on a common behavior
The specs are written with syntax similar to RSpec 2. They are run with MSpec, the purpose-built framework for running the Ruby Spec Suite. For more information, see the MSpec project.
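To give a feel for the spec style, here is a minimal sketch of two examples with textual descriptions. So that it runs on its own, it defines a tiny stand-in for `describe`, `it` and `should`; that shim (`MiniExpectation`, `SPEC_RESULTS`, `$current_subject`) is hypothetical and is not how MSpec is implemented. In the real suite these methods come from MSpec.

```ruby
# Hypothetical stand-in for MSpec, defined only so this example is runnable.
SPEC_RESULTS = []

class MiniExpectation
  def initialize(actual)
    @actual = actual
  end

  # Raises when the actual value does not equal the expected one.
  def ==(expected)
    return true if @actual == expected
    raise "expected #{expected.inspect}, got #{@actual.inspect}"
  end
end

class Object
  # Wraps the receiver so that `value.should == expected` can be written.
  def should
    MiniExpectation.new(self)
  end
end

# Records the subject under test for the examples in the block.
def describe(subject)
  $current_subject = subject
  yield
end

# Runs one example and records whether its expectation held.
def it(description)
  yield
  SPEC_RESULTS << [$current_subject, description, :passed]
rescue RuntimeError
  SPEC_RESULTS << [$current_subject, description, :failed]
end

# The part below is what a spec file looks like.
describe "Array#first" do
  it "returns the first element" do
    [1, 2, 3].first.should == 1
  end

  it "returns nil for an empty array" do
    [].first.should == nil
  end
end

SPEC_RESULTS.each do |subject, description, status|
  puts "#{subject} #{description}: #{status}"
end
```

Each `it` string states the intended behavior in plain words, which is what lets a failing example double as documentation.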
The specs describe the language syntax, the core library, the standard library, the C API for extensions and the command line flags. The language specs are grouped by keyword while the core and standard library specs are grouped by class and method.
ruby/spec is known to be tested in these implementations for every commit:
- MRI on 30 platforms and 4 versions
- JRuby for both 1.7 and 9.x
- TruffleRuby
- Opal
- Artichoke
ruby/spec describes the behavior of Ruby 3.1 and more recent Ruby versions. More precisely, every latest stable MRI release should pass all specs of ruby/spec (3.1.x, 3.2.x, etc), and those are tested in CI.
Synchronization with Ruby Implementations
The specs are synchronized both ways around once a month by @andrykonchin between ruby/spec, MRI, JRuby and TruffleRuby, using this script.
Each of these repositories has a full copy of the specs under spec/ruby to ease editing specs.
Any of these repositories can be used to add or edit specs; use whichever is most convenient for you.
For testing the development version of a Ruby implementation, one should always test against that implementation's copy of the specs under spec/ruby, as that's what the Ruby implementation tests against in its CI.
Also, this repository doesn't always contain the latest spec changes from MRI (it's synchronized monthly), and does not contain tags (specs marked as failing on that Ruby implementation).
Running specs on a Ruby implementation can be done with:
$ cd ruby_implementation/spec/ruby
# Add ../ruby_implementation/bin in PATH, or pass -t /path/to/bin/ruby
$ ../mspec/bin/mspec
Specs for old Ruby versions
For older specs try these commits:
- Ruby 2.0.0-p647 - Suite using MSpec (may encounter 2 failures)
- Ruby 2.1.9 - Suite using MSpec
- Ruby 2.2.10 - Suite using MSpec
- Ruby 2.3.8 - Suite using MSpec
- Ruby 2.4.10 - Suite using MSpec
- Ruby 2.5.9 - Suite using MSpec
- Ruby 2.6.10 - Suite using MSpec
- Ruby 2.7.8 - Suite using MSpec
- Ruby 3.0.7 - Suite using MSpec
Running the specs
First, clone this repository:
$ git clone https://github.com/ruby/spec.git
Then move to it:
$ cd spec
Clone MSpec:
$ git clone https://github.com/ruby/mspec.git ../mspec
And run the spec suite:
$ ../mspec/bin/mspec
This will execute all the specs using the executable named ruby on your current PATH.
Running Specs with a Specific Ruby Implementation
Use the -t option to specify the Ruby implementation with which to run the specs.
The argument is either a full path to the Ruby binary, or an executable in $PATH.
$ ../mspec/bin/mspec -t /path/to/some/bin/ruby
Running Selected Specs
To run a single spec file, pass the filename to mspec:
$ ../mspec/bin/mspec core/kernel/kind_of_spec.rb
You can also pass a directory, in which case all specs in that directory will be run:
$ ../mspec/bin/mspec core/kernel
Finally, you can also run them per group as defined in default.mspec.
The following command will run all language specs:
$ ../mspec/bin/mspec :language
In similar fashion, the following commands run the respective specs:
$ ../mspec/bin/mspec :core
$ ../mspec/bin/mspec :library
$ ../mspec/bin/mspec :capi
Sanity Checks When Running Specs
A number of checks for various kinds of "leaks" (file descriptors, temporary files, threads, subprocesses, ENV, ARGV, global encodings, top-level constants) can be enabled with CHECK_LEAKS=true:
$ CHECK_LEAKS=true ../mspec/bin/mspec
New top-level constants should only be introduced when needed or follow the pattern <ClassBeingTested>Specs, such as module StringSpecs.
Other constants used for testing should be nested under such a module.
Exceptions to these rules are contained in the file .mspec.constants.
MSpec can automatically add new top-level constants in this file with:
$ CHECK_LEAKS=save ../mspec/bin/mspec file
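The naming rule for top-level constants described above can be sketched as follows. The names `SubString` and `SAMPLE_TEXT` are hypothetical helpers invented for illustration, and the before/after comparison of `Object.constants` is only a rough picture of what a constant-leak check looks for, not MSpec's actual implementation.

```ruby
# Snapshot of top-level constants before the spec helpers are defined.
constants_before = Object.constants

# One top-level module per class being tested: <ClassBeingTested>Specs.
module StringSpecs
  # Helper constants are nested here instead of living at the top level.
  class SubString < String; end
  SAMPLE_TEXT = "hello"
end

# Only the wrapper module itself should show up as a new top-level constant.
new_top_level = Object.constants - constants_before
puts new_top_level.inspect
```

Keeping helpers nested this way means a spec file adds at most one predictable name to the global namespace.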
Running Specs on S390x CPU Architecture
Run the specs with DFLTCC=0 if you see failing specs related to the zlib library on the s390x CPU architecture. The failures can occur when the zlib library has the patch madler/zlib#410 applied, which makes the deflate algorithm produce a different compressed byte stream.
$ DFLTCC=0 ../mspec/bin/mspec
Contributing and Writing Specs
See CONTRIBUTING.md for documentation about contributing and writing specs (guards, matchers, etc).
Dependencies
These command-line executables are needed to run the specs.
- echo
- stat for core/file/*time_spec.rb
- find for core/file/fixtures/file_types.rb (package findutils, not needed on Windows)
The file /etc/services is required for socket specs (package netbase on Debian, not needed on Windows).
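Before a full run, it can be handy to confirm that the dependencies listed above are present. The following is a minimal sketch for Unix-like systems; the helper `find_on_path` and the constant names are hypothetical and not part of ruby/spec or MSpec.

```ruby
# Returns the full path of an executable found on PATH, or nil if absent.
def find_on_path(command)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Dependencies named in this README.
REQUIRED_COMMANDS = %w[echo stat find].freeze
REQUIRED_FILES = ["/etc/services"].freeze

# Collect every command or file that could not be found.
missing = REQUIRED_COMMANDS.reject { |cmd| find_on_path(cmd) }
missing += REQUIRED_FILES.reject { |path| File.exist?(path) }

if missing.empty?
  puts "All listed dependencies were found."
else
  puts "Missing: #{missing.join(', ')}"
end
```

On Windows the check would need adjusting, since executables carry extensions there and, as noted above, some of these dependencies are not required.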
Socket specs from rubysl-socket
Most specs under library/socket were imported from the rubysl-socket project (which is no longer on GitHub).
The 3 copyright holders of rubysl-socket, Yorick Peterse, Chuck Remes and
Brian Shirai, agreed to relicense those specs under the MIT license in ruby/spec.
History and RubySpec
This project was originally born from Rubinius tests being converted to the spec style. The revision history of these specs is available here. These specs were later extracted to their own project, RubySpec, with a specific vision and principles. At the end of 2014, Brian Shirai, the creator of RubySpec, decided to end RubySpec. A couple of months later, the different repositories were merged and the project was revived. On 12 January 2016, the name was changed to "The Ruby Spec Suite" for clarity and to let the RubySpec ideology rest in peace.