Previously, rb_ractor_interrupt_exec used an intermediate function to
create a new thread running the actual target function, replacing the
data being passed in with a piece of malloc'd memory holding the "next"
function and the original data.
Because of this, passing rb_interrupt_exec_flag_value_data to
rb_ractor_interrupt_exec didn't have the intended effect of allowing
data to be passed in and marked.
This commit adds a rb_interrupt_exec_flag_new_thread flag, which
both simplifies the implementation and allows the original data to be
marked.
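A hedged sketch of how a caller might combine the two flags after this
change (the exact call shape and callback signature are assumptions
drawn from the description above, not a verbatim excerpt):

```c
/* Sketch against internal APIs (ractor_core.h); the callback is assumed
 * to receive `data` back and to run on the newly created thread. */
static VALUE
print_payload(void *data)
{
    VALUE str = (VALUE)data; /* value_data: `data` is a VALUE the GC marks */
    rb_p(str);
    return Qnil;
}

static void
schedule_print(rb_ractor_t *target)
{
    VALUE payload = rb_str_new_cstr("hello");
    /* new_thread: spawn the thread directly for `print_payload`, with no
     * intermediate trampoline holding malloc'd copies of func and data. */
    rb_ractor_interrupt_exec(target, print_payload, (void *)payload,
                             rb_interrupt_exec_flag_value_data |
                             rb_interrupt_exec_flag_new_thread);
}
```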
If one thread is reading from a socket and another closes it, the close
blocks waiting for the read to abort cleanly. This ensures that Ruby is
totally done with the file descriptor _BEFORE_ we tell the OS to close
and potentially re-use it.
When the read is correctly terminated, the close should be unblocked.
That currently works if the close happens on a plain thread, but it
does NOT work if it happens on a fiber with a fiber scheduler.
This patch ensures that if the close happened on a fiber-scheduled
thread, the scheduler is notified that the fiber is unblocked.
[Bug #20723]
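The public C API for this notification lives in `ruby/fiber/scheduler.h`;
a minimal sketch of the idea (the wakeup helper, and how the waiter's
scheduler and fiber are obtained, are illustrative, not the actual patch):

```c
#include "ruby/fiber/scheduler.h"

/* If the waiter that called IO#close is running under a fiber scheduler,
 * route the wakeup through the scheduler so its fiber gets resumed;
 * otherwise the existing native wakeup path still applies. */
static void
wakeup_close_waiter(VALUE scheduler, VALUE blocker, VALUE waiting_fiber)
{
    if (scheduler != Qnil) {
        rb_fiber_scheduler_unblock(scheduler, blocker, waiting_fiber);
    }
    /* else: fall back to the native (condvar-style) wakeup */
}
```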
[Feature #20590]
For better or for worse, fork(2) remains the primary provider of
parallelism in Ruby programs. Even though it's frowned upon in
many circles, and a lot of literature will simply state that only
async-signal-safe APIs are safe to use after `fork()`, in practice
most APIs work well as long as you are careful not to fork
while another thread is holding a pthread mutex.
One of the APIs known to cause fork-safety issues is `getaddrinfo`.
If you fork while another thread is inside `getaddrinfo`, a mutex
may be left locked in the child, with no way to unlock it.
I think we could reduce the impact of this problem by preventing it
in the most notorious and common cases: guarding
`fork(2)` and known-unsafe APIs with a read-write lock.
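A minimal sketch of the idea with a plain pthread read-write lock (names
are illustrative, not the actual CRuby API): many threads may hold the
read side while calling known-unsafe functions, and `fork(2)` takes the
write side, so it only proceeds while no such call is in flight.

```c
#include <pthread.h>
#include <netdb.h>
#include <unistd.h>
#include <sys/types.h>

static pthread_rwlock_t fork_safety_lock = PTHREAD_RWLOCK_INITIALIZER;

static int
guarded_getaddrinfo(const char *node, const char *service,
                    const struct addrinfo *hints, struct addrinfo **res)
{
    pthread_rwlock_rdlock(&fork_safety_lock); /* shared: many callers at once */
    int err = getaddrinfo(node, service, hints, res);
    pthread_rwlock_unlock(&fork_safety_lock);
    return err;
}

static pid_t
guarded_fork(void)
{
    pthread_rwlock_wrlock(&fork_safety_lock); /* exclusive: waits out unsafe calls */
    pid_t pid = fork();
    pthread_rwlock_unlock(&fork_safety_lock);
    return pid;
}
```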
Before this patch, the MN scheduler waits for the IO with the
following steps:
1. `poll(fd, timeout=0)` to check whether the fd is ready.
2. If the fd is not ready, wait with the MN thread scheduler.
3. Call `func` to issue the blocking I/O call.
The advantage of polling in advance is that we can wait for
readiness on any fd. However, the `poll()` call becomes overhead
for fds that are already ready.
This patch changes the steps as follows (sketched in C below):
1. Call `func` to issue the blocking I/O call.
2. If `func` returns `EWOULDBLOCK`, the fd is `O_NONBLOCK` and we
   need to wait until it is ready, so wait with the MN thread
   scheduler.
With this approach we only ever wait on `O_NONBLOCK` fds; other fds
simply block inside the operation itself, such as the `read()` system
call. Either way, we no longer need a `poll()` in advance to check
whether the fd is ready.
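A minimal sketch of the new order of operations, with a plain `poll()`
standing in for "wait with the MN thread scheduler" (the helper name is
illustrative, not the real internal function):

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Stand-in for parking the Ruby thread on the MN scheduler until the fd
 * is readable; the real scheduler integration is internal to the VM. */
static void
wait_readable(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    poll(&pfd, 1, -1);
}

static ssize_t
mn_read(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);  /* step 1: just issue the call */
        if (n >= 0) return n;
        if (errno == EWOULDBLOCK || errno == EAGAIN) {
            /* step 2: only O_NONBLOCK fds reach here; wait for readiness,
             * then retry. Blocking fds never pay for a readiness check. */
            wait_readable(fd);
            continue;
        }
        return -1;  /* genuine error */
    }
}
```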
With this patch we can observe a performance improvement
on a microbenchmark that repeats blocking I/O (on a fd without
`O_NONBLOCK`) with and without the MN thread scheduler.
```ruby
require 'benchmark'
f = open('/dev/null', 'w')
f.sync = true
TN = 1
N = 1_000_000 / TN
Benchmark.bm{|x|
x.report{
TN.times.map{
Thread.new{
N.times{f.print '.'}
}
}.each(&:join)
}
}
__END__
TN = 1
user system total real
ruby32 0.393966 0.101122 0.495088 ( 0.495235)
ruby33 0.493963 0.089521 0.583484 ( 0.584091)
ruby33+MN 0.639333 0.200843 0.840176 ( 0.840291) <- Slow
this+MN 0.512231 0.099091 0.611322 ( 0.611074) <- Good
```
This patch introduces an M:N thread scheduler for the Ractor system.
In general, an M:N thread scheduler employs N native threads (OS threads)
to manage M user-level threads (Ruby threads in this case).
In the Ruby interpreter, 1 native thread is provided per Ractor
and all of that Ractor's Ruby threads are managed by the native thread.
Since Ruby 1.9, the interpreter has used a 1:1 thread scheduler, meaning
1 Ruby thread has 1 native thread. The M:N scheduler changes this strategy.
Because of compatibility issues (and the stability of the implementation),
the main Ractor does not use the M:N scheduler by default. In other words,
threads on the main Ractor are managed with the 1:1 thread scheduler.
There are additional settings via environment variables:
`RUBY_MN_THREADS=1` enables the M:N thread scheduler on the main Ractor.
Note that non-main Ractors use the M:N scheduler without this
configuration. With this configuration, single-Ractor applications
run their threads on an M:1 thread scheduler (green threads,
user-level threads).
`RUBY_MAX_CPU=n` specifies the maximum number of native threads for the
M:N scheduler (default: 8).
This patch will be reverted soon if non-trivial issues are found.
[Bug #19842]
Because a thread calling IO#close now blocks in a native condvar wait,
it's possible for there to be _no_ threads left to actually handle
incoming signals/ubf calls/etc.
This manifested as failing tests on Solaris 10 (SPARC), because:
* One thread called IO#close, which sent a SIGVTALRM to the other
thread to interrupt it, and then waited on the condvar to be notified
that the reading thread was done.
* One thread was calling IO#read, but it hadn't yet reached the actual
call to select(2) when the SIGVTALRM arrived, so it never unblocked
itself.
This results in a deadlock.
The fix is to use a real Ruby mutex for the close lock; that way, the
closing thread goes into sigwait-sleep and can keep trying to interrupt
the select(2) thread.
See the discussion in: https://github.com/ruby/ruby/pull/7865/
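As a hedged sketch, the shape of the fix using the public
`rb_mutex_new`/`rb_mutex_synchronize` C API (the per-IO bookkeeping and
the actual close handshake are elided):

```c
#include "ruby.h"

/* Blocking on a Ruby-level mutex keeps the closing thread inside the
 * VM's scheduler (sigwait-sleep), so it still services interrupts and
 * can keep re-signalling the thread stuck before select(2). */
static VALUE
close_body(VALUE io)
{
    /* ... the actual close handshake runs here, under the lock ... */
    return Qnil;
}

static VALUE
close_with_ruby_lock(VALUE io, VALUE close_lock /* e.g. rb_mutex_new(), stored per IO */)
{
    return rb_mutex_synchronize(close_lock, close_body, io);
}
```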
When one thread is closing a file descriptor whilst another thread is
concurrently reading it, we need to wait for the reading thread to be
done with it to prevent a potential EBADF (or, worse, file descriptor
reuse).
At the moment, that is done by keeping a list of threads still using the
file descriptor in io_close_fptr. It then continually calls
rb_thread_schedule() in fptr_finalize_flush until said list is empty.
That busy-looping seems to behave rather poorly on some OSes,
particularly FreeBSD. It can cause the TestIO#test_race_gets_and_close
test to fail (even with its very long 200-second timeout) because the
closing thread starves out the using thread.
To fix that, I introduce the concept of struct rb_io_close_wait_list: a
list of threads still using a file descriptor that we want to close. We
call `rb_notify_fd_close` to let the thread scheduler know we're closing
an FD, which fills the list with threads. Then, we call
`rb_notify_fd_close_wait`, which blocks the thread until all of the
still-using threads are done.
This is implemented with a condition variable sleep, so no busy-looping
is required.
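A minimal sketch of the synchronization pattern (the structure and names
are illustrative; the real implementation lives in the VM's thread code):

```c
#include <pthread.h>

/* Illustrative shape of rb_io_close_wait_list: a count of threads still
 * using the fd, a mutex, and a condvar the closing thread sleeps on. */
struct close_wait_list {
    pthread_mutex_t lock;
    pthread_cond_t  done;
    int             busy_count; /* threads still using the fd */
};

/* Each using thread checks out once it is finished with the fd. */
static void
using_thread_done(struct close_wait_list *list)
{
    pthread_mutex_lock(&list->lock);
    if (--list->busy_count == 0)
        pthread_cond_broadcast(&list->done); /* wake the closing thread */
    pthread_mutex_unlock(&list->lock);
}

/* The closing thread sleeps on the condvar instead of busy-looping. */
static void
close_wait(struct close_wait_list *list)
{
    pthread_mutex_lock(&list->lock);
    while (list->busy_count > 0)
        pthread_cond_wait(&list->done, &list->lock);
    pthread_mutex_unlock(&list->lock);
}
```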
[Bug #19415]
If multiple threads attempt to load the same file concurrently,
it's not a circular dependency issue.
So we check that the existing ThreadShield is owned by the current
fiber before warning about circular dependencies.
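A minimal sketch of the check, assuming a predicate such as
`rb_thread_shield_owned()` that reports whether the current fiber holds
the shield (the helper and surrounding shape are illustrative):

```c
#include "ruby.h"

/* Only warn when the shield is held by the current fiber, i.e. *we*
 * re-entered require for the same feature (a true circular require).
 * A shield held by another thread just means a concurrent load, so we
 * should wait on it silently instead of warning. */
static void
check_circular_require(VALUE shield, VALUE feature)
{
    if (rb_thread_shield_owned(shield)) { /* assumed predicate */
        rb_warning("loading in progress, circular require considered harmful - %"PRIsVALUE,
                   feature);
    }
    /* else: block on the shield until the other thread finishes loading */
}
```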
According to the MSVC manual (*1), cl.exe can skip including a header file
when it:
- contains #pragma once, or
- starts with #ifndef, or
- starts with #if ! defined.
GCC has a similar trick (*2), but it is stricter (e.g. there must
be _no tokens_ outside of the #ifndef...#endif).
Sun C lacked #pragma once for a looong time. Oracle Developer Studio
12.5 finally implemented it, but we cannot assume such a recent version.
This changeset modifies the header files so that each of them includes
exactly one #ifndef...#endif block. I believe this is the most portable
way to trigger these compiler optimizations. [Bug #16770]
*1: https://docs.microsoft.com/en-us/cpp/preprocessor/once
*2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
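For illustration, the shape each header ends up with (the guard macro
name here is made up):

```c
/* Exactly one #ifndef...#endif spans the whole file, with no tokens
 * outside it, so both cl.exe and GCC can apply their include-guard
 * optimization and skip re-reading the file on subsequent #includes. */
#ifndef INTERNAL_EXAMPLE_H
#define INTERNAL_EXAMPLE_H

void example_function(void);

#endif /* INTERNAL_EXAMPLE_H */
```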
One day, I could not resist the way it was written. I finally started
to make the code clean. This changeset is the beginning of a series of
housekeeping commits. It is a simple refactoring: split internal.h into
separate files, so that we can divide and conquer in the upcoming
commits. No lines of code are either added or removed, except the
obvious file headers/footers. The generated binary is identical to the
one before.