This gets in the way if we ever start allowing gems to be built as if
using a different RubyGems version than the one being run.
That could be useful to make `gem rebuild` a little more usable, and
Bundler specs already do it by making this method a no-op
when they need to.
I'm not sure that forcefully setting this, even when the user explicitly
specified something else, is helpful.
Since this could potentially prevent gems that explicitly set a constant
RubyGems version from building, I changed the incorrect RubyGems version
check from a hard error to a warning, since it will start
happening in those cases if we stop overwriting the version.
45676af80d
This is to prevent a malicious gem from causing a denial of service by
including a very large metadata or checksums file,
which is then read into memory in its entirety just by opening the gem package.
This is guaranteed to limit the amount of memory needed, since
gzip streams (which use deflate for compression) have a maximum compression
ratio of 1032:1, so the uncompressed size of the metadata or checksums file
will be at most 1032 times the size of the (limited) amount of data read.
This prevents a gem from causing 500GB of memory to be allocated
to read a 500MB metadata file.
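A minimal sketch of a bounded read, for illustration only. The helper name `limited_read`, the `ArgumentError`, and the limits used here are assumptions, not the names or values used in the actual change:

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: read at most `limit` bytes from `io` and fail loudly
# instead of buffering anything larger than that.
def limited_read(io, name, limit)
  data = io.read(limit + 1).to_s # one extra byte reveals oversized input
  raise ArgumentError, "#{name} is too big (over #{limit} bytes)" if data.bytesize > limit
  data
end

compressed = Zlib.gzip("a" * 100)
reader = Zlib::GzipReader.new(StringIO.new(compressed))
limited_read(reader, "metadata.gz", 200) # fits under the limit, returns the payload
```

Because the read never asks for more than `limit + 1` bytes, the memory held for the file stays bounded no matter how large the entry claims to be.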
a596e3c5ec
Most of the calls to `FormatError.new` pass `@gem` for the second argument, which has a `path` method.
But in one case (package.rb:691, in `verify_gz`), the `source` argument is a `String`.
So if there's ever a GZip decode error when attempting to read the contents of the `data.tar.gz` file, instead of reporting the underlying GZip error (which might be something like "unexpected end of file"), we would instead report a NoMethodError coming from package.rb:
```
Exception while verifying sorbet-0.5.11301.gem
ERROR: While executing gem ... (NoMethodError)
undefined method `path' for "data.tar.gz":String
@path = source.path
^^^^^
```
There are two ways to fix this:
1. Make `FormatError#initialize` aware of the fact that `source` might sometimes be a `String`
2. Make the call to `FormatError.new` in `verify_gz` pass `@gem` instead of `entry.full_name`.
I've chosen 1 because I think it's more useful to see "unexpected end of file in data.tar.gz" instead of "unexpected end of file in sorbet-0.5.11301.gem." The end of file **is actually** in data.tar.gz, not in the gem file itself, which was decoded successfully.
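Option 1 could look roughly like the sketch below. `SketchFormatError` and `FakeGem` are hypothetical stand-ins for illustration; the real `FormatError#initialize` may differ in its details:

```ruby
# Hypothetical stand-in for the error class: accepts either a String
# (such as an entry name) or any object that responds to #path.
class SketchFormatError < StandardError
  attr_reader :path

  def initialize(message, source = nil)
    if source
      @path = source.is_a?(String) ? source : source.path
      message = "#{message} in #{@path}" if @path
    end
    super(message)
  end
end

# Hypothetical object with a #path method, standing in for @gem.
FakeGem = Struct.new(:path)

SketchFormatError.new("unexpected end of file", "data.tar.gz").message
SketchFormatError.new("unexpected end of file", FakeGem.new("sorbet-0.5.11301.gem")).message
```

With this shape, existing callers that pass an object with `path` keep working, and the `verify_gz` call can keep passing the entry name.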
For now, on a small Rails app I have hanging around:
```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained: 40.48 kB (326 objects)
==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained: 40.40 kB (328 objects)
```
Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.
Additionally, this will pay huge dividends once
https://github.com/ruby/zlib/pull/61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently account for most of the memory usage shown above.
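As a rough illustration of why repeated `readpartial` calls can avoid allocations, the sketch below passes a reusable output buffer to `readpartial`. The helper `chunked_digest`, the chunk size, and the use of a digest as the consumer are assumptions for illustration, not the code from this change:

```ruby
require "digest"
require "stringio"

# Hypothetical helper: hash an IO in fixed-size chunks, reusing one buffer
# so each readpartial call fills existing memory instead of a new String.
def chunked_digest(io, chunk_size = 16_384)
  digest = Digest::SHA256.new
  buffer = String.new(capacity: chunk_size)
  begin
    loop { digest << io.readpartial(chunk_size, buffer) }
  rescue EOFError
    # readpartial signals end of input by raising EOFError
  end
  digest.hexdigest
end

chunked_digest(StringIO.new("a" * 50_000), 1024)
```

Whether the buffer is actually reused depends on the IO implementation honoring the `outbuf` argument, which is what the linked zlib pull request is about for gzip readers.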
f78d45d927
Gem::Package::TarReader::Entry now raises EOFError or returns nil
appropriately based on Ruby core IO.read and IO.readpartial behavior.
Zlib will respond accordingly by raising Zlib::GzipFile::Error on EOF.
When verifying a gem or extracting contents, raise FormatError similar
to other cases of corrupt gems.
Addresses a bug where Gem::Package would attempt to call size on nil
instead of raising a more descriptive and useful error, leading users
to assume the problem is internal to rubygems.
Remove unused error class TarReader::UnexpectedEOF that was never raised
since the NoMethodError on nil would happen first. Use EOFError instead.
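For reference, the core IO behavior at end of file that the entry now mirrors can be observed with a plain pipe. The method name `eof_behavior` is hypothetical and exists only for this demonstration:

```ruby
# Hypothetical demonstration of how core IO behaves once all data is consumed.
def eof_behavior
  reader, writer = IO.pipe
  writer.write("abc")
  writer.close
  reader.read(3) # consume everything that was written

  results = {}
  results[:read_with_length] = reader.read(1) # nil at EOF
  results[:read_without_length] = reader.read # empty String at EOF
  begin
    reader.readpartial(1)
    results[:readpartial] = :no_error
  rescue EOFError
    results[:readpartial] = :eof_error # readpartial raises at EOF
  end
  results
ensure
  reader&.close
end

eof_behavior
```

A caller that expects a String from `read(length)` and calls `size` on the result hits the `nil` case, which matches the NoMethodError described above.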
dc6129644b
When extracting files from the tarball, a mode is retrieved from
the header. Occasionally you'll encounter a gem that was packaged
on a system whose permission bits result in a value that is larger
than the value that File.chmod will allow (anything >= 2^16). In
that case the extraction fails with a RangeError, which is pretty
esoteric.
If you extract the tarball with the tar and gunzip utilities, the
file permissions end up being just the bottom 16 bits masked off
from the original value. I've mirrored that behavior here. Per the
tar spec:
> Modes which are not supported by the operating system restoring
> files from the archive will be ignored.
I think that basically means what I've done here.
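The masking itself is a single bitwise AND. In this sketch the constant and method names are hypothetical, and the oversized mode is a made-up example value:

```ruby
# Hypothetical names; keeps only the bottom 16 bits of a tar header mode.
MODE_BITS = 16

def mask_mode(mode)
  mode & ((1 << MODE_BITS) - 1)
end

mask_mode(0o100644)  # already fits in 16 bits, so it is returned unchanged
mask_mode(0o2100644) # higher bits are dropped, leaving 0o100644
```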
---
This commit also changes the behavior very slightly with regard to
when the chmod is called. Previously it was called while the file
descriptor was still open, but after the write call.
When write flushes, the file permissions are changed to the mode
value from the File.open call, undoing the changes made by
FileUtils.chmod. CRuby appears to flush the buffer after the
chmod call, whereas TruffleRuby flushes before the chmod call.
So the file permissions can change depending on implementation.
Both implementations end up getting the correct file permissions
for the bottom 9 bits (user, group, world), but differ with
regard to the sticky bit in the next 3.
To get consistent behavior, this commit changes it to close the
file descriptor before attempting to chmod anything, which makes
it consistent because the write has already been flushed by the
time chmod runs on either implementation.
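The ordering can be sketched as follows. `write_then_chmod` is a hypothetical helper, and `File.chmod` is used here for brevity where the commit mentions `FileUtils.chmod`:

```ruby
require "tmpdir"

# Hypothetical helper: write and close first, change permissions afterwards.
def write_then_chmod(path, contents, mode)
  File.open(path, "wb", 0o600) do |out|
    out.write(contents)
  end # leaving the block flushes and closes the descriptor
  File.chmod(mode, path) # runs only after the data has been flushed
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "example.txt")
  write_then_chmod(path, "hello", 0o640)
  File.stat(path).mode & 0o777 # permission bits after the chmod
end
```

Since nothing is left buffered when `File.chmod` runs, a later flush cannot interact with the permissions that were just set.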
22ce076e99
If we explicitly disallow the creation of symlinks that point to files
outside of the destination directory, we can avoid any other safety
checks while creating directories, because we can be sure they will
always fall under the destination directory as well.
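A minimal sketch of such a check, assuming a purely lexical comparison of paths. The method name `symlink_escapes?` and the example paths are hypothetical, and this version does not resolve symlinks that already exist on disk:

```ruby
# Hypothetical check: does a symlink at `link_path` (relative to the
# destination) with target `link_target` point outside `destination_dir`?
def symlink_escapes?(destination_dir, link_path, link_target)
  dest = File.expand_path(destination_dir)
  link_dir = File.dirname(File.expand_path(link_path, dest))
  resolved = File.expand_path(link_target, link_dir) # relative targets resolve from the link's directory
  !(resolved == dest || resolved.start_with?(dest + File::SEPARATOR))
end

symlink_escapes?("/tmp/dest", "lib/link", "../bin/tool")      # stays inside
symlink_escapes?("/tmp/dest", "lib/link", "../../etc/passwd") # points outside
```

Comparing against `dest + File::SEPARATOR` rather than `dest` alone avoids treating a sibling such as `/tmp/dest-other` as being inside `/tmp/dest`.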
555692b8de