With the latest version of the postponed job patchset merged, we don't
actually need to go through the contortions of keeping the data in a
global variable; we can just update `data` with multiple calls to
rb_postponed_job_preregister.
This _partially_ reverts commit
d2ba8ea54a, but for UDP sockets only.
With TCP sockets (and other things which use `rsock_init_inetsock`), the
order of operations is to call `getaddrinfo(3)` with AF_UNSPEC, look at
the returned addresses, pick one, and then call `socket(2)` with the
family for that address (i.e. AF_INET or AF_INET6).
With UDP sockets, however, this is reversed; `UDPSocket.new` takes an
address family as an argument, and then calls `socket(2)` with that
family. A subsequent call to UDPSocket#connect will then call
`getaddrinfo(3)` with that family.
The problem here is that...
* If you are in a networking situation that _only_ has loopback addrs,
* And you want to look up a name like "localhost" (or NULL)
* And you pass AF_INET or AF_INET6 as the ai_family argument to
getaddrinfo(3),
* And you pass AI_ADDRCONFIG to the hints argument as well,
then glibc on Linux will not return an address. This is because
AI_ADDRCONFIG is supposed to return addresses for families we actually
have an address for and could conceivably connect to, but also is
documented to explicitly ignore localhost in that situation.
It honestly doesn't make a ton of sense to pass AI_ADDRCONFIG if you're
explicitly passing the address family anyway, because you're not looking
for "an address for this name we can connect to"; you're looking for "an
IPv(4|6) address for this name". And the original glibc bug that
d2ba8ea5 was supposed to work around was related to parallel issuance of
A and AAAA queries, which of course won't happen if an address family is
explicitly specified.
So, we fix this by not passing AI_ADDRCONFIG for calls to
`rsock_addrinfo` that we also pass an explicit family to (i.e. for
UDPsocket).
[Bug #20048]
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.
We made two attempts to fix this whilst keeping the promised semantics,
but:
* The first one involved masking/unmasking when flushing jobs, which
was believed to be too expensive
* The second one involved a lock-free, multi-producer, single-consumer
ringbuffer, which was too complex
The critical insight behind this third solution is that essentially the
only user of these APIs are a) internal, or b) profiling gems.
For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.
For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.
So, this comit:
* Introduces a pre-registration API for jobs, with a GVL-requiring
rb_postponed_job_prereigster, which returns a handle which can be
used with an async-signal-safe rb_postponed_job_trigger.
* Deprecates rb_postponed_job_register (and re-implements it on top of
the preregister function for compatability)
* Moves all the internal usages of postponed job register
pre-registration
When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the
hints we send to getaddrinfo(3) (if supported). This will prompt the
resolver to _NOT_ issue A or AAAA queries if the system does not
actually have an IPv4 or IPv6 address (respectively).
This makes outgoing connections marginally more efficient on
non-dual-stack systems, since we don't have to try connecting to an
address which can't possibly work.
More importantly, however, this works around a race condition present
in some older versions of glibc on aarch64 where it could accidently
send the two outgoing DNS queries with the same DNS txnid, and get
confused when receiving the responses. This manifests as outgoing
connections sometimes taking 5 seconds (the DNS timeout before retry) to
be made.
Fixes#19144
> https://github.com/flori/json/pull/525
> Rename escape_slash in script_safe and also escape E+2028 and E+2029
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
> https://github.com/flori/json/pull/454
> Remove unnecessary initialization of create_id in JSON.parse()
Co-authored-by: Watson <watson1978@gmail.com>
It is rather common to directly interpolate JSON string inside
<script> tags in HTML as to provide configuration or parameters to a
script.
However this may lead to XSS vulnerabilities, to prevent that 3
characters need to be escaped:
- `/` (forward slash)
- `U+2028` (LINE SEPARATOR)
- `U+2029` (PARAGRAPH SEPARATOR)
The forward slash need to be escaped to prevent closing the script
tag early, and the other two are valid JSON but invalid Javascript
and can be used to break JS parsing.
Given that the intent of escaping forward slash is the same than escaping
U+2028 and U+2029, I chos to rename and repurpose the existing `escape_slash`
option.
Previously in the JSON::Ext parser, when we encountered an "Infinity"
token (and weren't allowing NaN/Infinity) we would try to display the
"unexpected token" at the character before.
42ac170712
According to https://bugs.openjdk.org/browse/JDK-8268605, pthread_create
may fail spuriously. This change implements a simple retry as a modest
measure, which is also used by JDK.
This entirely changes how it is tested. Rather than to use counters
we now record the timeline of events with associated threads which
makes it much easier to assert that certains events are only preceded
by a specific event, and makes it much easier to debug unexpected
timelines.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
Co-Authored-By: JP Camara <jp@jpcamara.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Before this it was compiled but not used, because TruffleRuby has
a stringio.rb in stdlib and .rb has precedence over .so.
In fact that extension never worked on TruffleRuby,
because rb_io_extract_modeenc() has never been defined on TruffleRuby.
* So this just skip compiling the extension since compilation of it now fails:
https://github.com/ruby/openssl/issues/699d791b63df6
Node has not been managed by GC from Ruby 2.5.
Therefore these codes are not needed. If ObjectSpace depends on Node,
it needs to update the file when node type is updated. Delete node
related codes to avoid such update.
Context: https://github.com/ivoanjo/gvl-tracing/pull/4
Some hooks may want to collect data on a per thread basis.
Right now the only way to identify the concerned thread is to
use `rb_nativethread_self()` or similar, but even then because
of the thread cache or MaNy, two distinct Ruby threads may report
the same native thread id.
By passing `thread->self`, hooks can use it as a key to store
the metadata.
NB: Most hooks are executed outside the GVL, so such data collection
need to use a thread-safe data-structure, and shouldn't use the
reference in other ways from inside the hook.
They must also either pin that value or handle compaction.