Commit graph

1188 commits

Author SHA1 Message Date
Nobuyoshi Nakada
e9a7801a93
Drop support for old ERB 2024-03-03 00:55:45 +09:00
Nobuyoshi Nakada
d4e24021d3
Revise 9ec342e07d 2024-02-26 13:12:05 +09:00
Nobuyoshi Nakada
a0f7de814a
[Bug #20296] Fix the default assertion message 2024-02-26 12:29:23 +09:00
Misaki Shioi
9ec342e07d
Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp (#9374)
* Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp

This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp.

[Background]
Currently, `Socket.tcp` synchronously resolves names and makes connection attempts with `Addrinfo::foreach.`
This implementation has the following two problems.

1. In name resolution, the program stops until the DNS server responds to all DNS queries.
2. In a connection attempt, while an IP address is trying to connect to the destination host and is taking time, the program stops, and other resolved IP addresses cannot try to connect.

[Proposal]
"Happy Eyeballs" ([RFC 8305](https://datatracker.ietf.org/doc/html/rfc8305)) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially.

I implemented it into `Socket.tcp` by using `Addrinfo.getaddrinfo` in each thread spawned per address family to resolve the hostname asynchronously, and using `Socket::connect_nonblock` to try to connect with multiple addrinfo in parallel.

[Outcome]

This change eliminates a fatal defect in the following cases.

Case 1. One of the A or AAAA DNS queries does not return

---
require 'socket'

class Addrinfo
  class << self
    # Current Socket.tcp depends on foreach
    def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block)
      getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout)
        .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout))
        .each(&block)
    end

    def getaddrinfo(_, _, family, *_)
      case family
      when Socket::AF_INET6 then sleep
      when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)]
      end
    end
  end
end

Socket.tcp("localhost", 4567)
---

Because the current `Socket.tcp` cannot resolve IPv6 names, the program stops in this case. It cannot start to connect with IPv4 address.
Though `Socket.tcp` with HEv2 can promptly start a connection attempt with IPv4 address in this case.

 Case 2. Server does not promptly return ack for syn of either IPv4 / IPv6 address family

---
require 'socket'

fork do
  socket = Socket.new(Socket::AF_INET6, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '::1'))
  sleep
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

fork do
  socket = Socket.new(Socket::AF_INET, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1'))
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

Socket.tcp("localhost", 4567)
---

The current `Socket.tcp` tries to connect serially, so when its first name resolves an IPv6 address and initiates a connection to an IPv6 server, this server does not return an ACK, and the program stops.
Though `Socket.tcp` with HEv2 starts to connect sequentially and in parallel so a connection can be established promptly at the socket that attempted to connect to the IPv4 server.

In exchange, the performance of `Socket.tcp` with HEv2 will be degraded.

---
100.times { Socket.tcp("www.ruby-lang.org", 80) }
---

This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to `IO::select` in the implementation.

* Avoid NameError of Socket::EAI_ADDRFAMILY in MinGW

* Support Windows with SO_CONNECT_TIME

* Improve performance

I have additionally implemented the following patterns:

- If the host is single-stack, name resolution is performed in the main thread. This reduces the cost of creating threads.
- If an IP address is specified, name resolution is performed in the main thread. This also reduces the cost of creating threads.
- If only one IP address is resolved, connect is executed in blocking mode. This reduces the cost of calling IO::select.

Also, I have added a fast_fallback option for users who wish not to use HE.
Here are the results of each performance test.

```ruby
require 'socket'
require 'benchmark'

HOSTNAME = "www.ruby-lang.org"
PORT = 80

ai = Addrinfo.tcp(HOSTNAME, PORT)

Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end

  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end

  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end
```

```
                           user     system      total        real
Domain name            0.015567   0.032511   0.048078 (  0.325284)
IP Address             0.004458   0.014219   0.018677 (  0.284361)
fast_fallback: false   0.005869   0.021511   0.027380 (  0.321891)
````

And this is the measurement result when executed in a single stack environment.

```
                           user     system      total        real
Domain name            0.007062   0.019276   0.026338 (  1.905775)
IP Address             0.004527   0.012176   0.016703 (  3.051192)
fast_fallback: false   0.005546   0.019426   0.024972 (  1.775798)
```

The following is the result of the run on Ruby 3.3.0.

(on Dual stack environment)

```
                 user     system      total        real
Ruby 3.3.0   0.007271   0.027410   0.034681 (  0.472510)
```

(on Single stack environment)

```
                 user     system      total        real
Ruby 3.3.0  0.005353   0.018898   0.024251 (  1.774535)
```

* Do not cache `Socket.ip_address_list`

As mentioned in the comment at https://github.com/ruby/ruby/pull/9374#discussion_r1482269186, caching Socket.ip_address_list does not follow changes in network configuration.
But if we stop caching, it becomes necessary to check every time `Socket.tcp` is called whether it's a single stack or not, which could further degrade performance in the case of a dual stack.
From this, I've changed the approach so that when a domain name is passed, it doesn't check whether it's a single stack or not and resolves names in parallel each time.

The performance measurement results are as follows.

require 'socket'
require 'benchmark'

HOSTNAME = "www.ruby-lang.org"
PORT = 80

ai = Addrinfo.tcp(HOSTNAME, PORT)

Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end

  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end

  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end

                           user     system      total        real
Domain name            0.004085   0.011873   0.015958 (  0.330097)
IP Address             0.000993   0.004400   0.005393 (  0.257286)
fast_fallback: false   0.001348   0.008266   0.009614 (  0.298626)

* Wait forever if fallback addresses are unresolved, unless resolv_timeout

Changed from waiting only 3 seconds for name resolution when there is no fallback address available, to waiting as long as there is no resolv_timeout.
This is in accordance with the current `Socket.tcp` specification.

* Use exact pattern to match IPv6 address format for specify address family
2024-02-26 12:14:11 +09:00
Peter Zhu
ce8531fed4 Stop using rb_str_locktmp_ensure publicly
rb_str_locktmp_ensure is a private API.
2024-02-23 14:08:29 -05:00
Marek Küthe
8b2c421a17 Add option for mtu discovery flag
Signed-off-by: Marek Küthe <m.k@mk16.de>
2024-02-23 09:47:09 -08:00
Marek Küthe
4bb4327228 Fixes [Bug #20258]
Signed-off-by: Marek Küthe <m.k@mk16.de>
2024-02-23 09:47:09 -08:00
KJ Tsanaktsidis
da33c5ac9f Revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns"
This reverts commit 673ed41c81.
2024-02-01 11:09:54 +11:00
Nobuyoshi Nakada
0f417d640d
Initialize errno variables and fix maybe-uninitialized warnings 2024-01-24 19:33:25 +09:00
KJ Tsanaktsidis
6c0e58a54e Make sure the correct error is raised for EAI_SYSTEM resolver fail
In case of EAI_SYSTEM, getaddrinfo is supposed to set more detail in
errno; however, because we call getaddrinfo on a thread now, and errno
is threadlocal, that information is being lost. Instead, we just raise
whatever errno happens to be on the calling thread (which can be
something very confusing, like `ECHILD`).

Fix it by explicitly propagating errno back to the calling thread
through the getaddrinfo_arg structure.

[Bug #20198]
2024-01-22 14:34:31 +11:00
KJ Tsanaktsidis
61da90c1b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-19 09:55:12 +11:00
KJ Tsanaktsidis
688a6ff510 Revert "Mark asan fake stacks during machine stack marking"
This reverts commit d10bc3a2b8.
2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis
d10bc3a2b8 Mark asan fake stacks during machine stack marking
ASAN leaves a pointer to the fake frame on the stack; we can use the
__asan_addr_is_in_fake_stack API to work out the extent of the fake
stack and thus mark any VALUEs contained therein.

[Bug #20001]
2024-01-12 17:29:48 +11:00
Yusuke Endoh
1bd98c820d Remove setaffinity of pthread for getaddrinfo
It looks like `sched_getcpu(3)` returns a strange number on some
(virtual?) environments.

I decided to remove the setaffinity mechanism because the performance
does not appear to degrade on a quick benchmark even if removed.

[Bug #20172]
2024-01-11 12:38:16 +09:00
Adam Hess
6aacbd690c Free pthread_attr after setting up the thread
[bug #20149]
2024-01-05 08:56:44 +09:00
Jean Boussier
b2fc1b054e Update BasicSocket#recv documentation on return value
Ref: https://github.com/ruby/ruby/pull/6407

[Bug #19012]

`0` is now interpreted as closed connection an not an
empty packet, as these are very rare and pretty much
useless.
2023-12-18 12:58:08 +01:00
Nobuyoshi Nakada
0601bce6fc
[DOC] Add Socket::ResolutionError documentation 2023-12-18 08:49:06 +09:00
Nobuyoshi Nakada
71c4a9c38f
[DOC] Correct the location of Addrinfo document
The document must be placed immediately before the class definition.
No other statements can be placed in between.
2023-12-18 08:47:59 +09:00
Nobuyoshi Nakada
e316128e3d
[DOC] Stop unintentional references to builtin or standard names 2023-12-18 08:38:59 +09:00
Nobuyoshi Nakada
7bfa1c3dc9
Revert "[DOC] Make undocumented socket constans nodoc"
This reverts commit cbda94edd8, because
`:nodoc:` does not work for constants.

In the case of `rb_define_const`, RDoc parses the preceeding comment
as in `"/* definition: comment */"` form.
2023-12-17 22:37:15 +09:00
Nobuyoshi Nakada
cbda94edd8
[DOC] Make undocumented socket constans nodoc 2023-12-17 20:17:45 +09:00
Nobuyoshi Nakada
557d929ba6
[DOC] Utilize COMMENTS.default_proc to add fallback documents 2023-12-17 20:17:05 +09:00
KJ Tsanaktsidis
25711e7063 Partially revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls"
This _partially_ reverts commit
d2ba8ea54a, but for UDP sockets only.

With TCP sockets (and other things which use `rsock_init_inetsock`), the
order of operations is to call `getaddrinfo(3)` with AF_UNSPEC, look at
the returned addresses, pick one, and then call `socket(2)` with the
family for that address (i.e. AF_INET or AF_INET6).

With UDP sockets, however, this is reversed; `UDPSocket.new` takes an
address family as an argument, and then calls `socket(2)` with that
family. A subsequent call to UDPSocket#connect will then call
`getaddrinfo(3)` with that family.

The problem here is that...

* If you are in a networking situation that _only_ has loopback addrs,
* And you want to look up a name like "localhost" (or NULL)
* And you pass AF_INET or AF_INET6 as the ai_family argument to
  getaddrinfo(3),
* And you pass AI_ADDRCONFIG to the hints argument as well,

then glibc on Linux will not return an address. This is because
AI_ADDRCONFIG is supposed to return addresses for families we actually
have an address for and could conceivably connect to, but also is
documented to explicitly ignore localhost in that situation.

It honestly doesn't make a ton of sense to pass AI_ADDRCONFIG if you're
explicitly passing the address family anyway, because you're not looking
for "an address for this name we can connect to"; you're looking for "an
IPv(4|6) address for this name". And the original glibc bug that
d2ba8ea5 was supposed to work around was related to parallel issuance of
A and AAAA queries, which of course won't happen if an address family is
explicitly specified.

So, we fix this by not passing AI_ADDRCONFIG for calls to
`rsock_addrinfo` that we also pass an explicit family to (i.e. for
UDPsocket).

[Bug #20048]
2023-12-12 20:05:21 +11:00
KJ Tsanaktsidis
f8effa209a Change the semantics of rb_postponed_job_register
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.

We made two attempts to fix this whilst keeping the promised semantics,
but:
  * The first one involved masking/unmasking when flushing jobs, which
    was believed to be too expensive
  * The second one involved a lock-free, multi-producer, single-consumer
    ringbuffer, which was too complex

The critical insight behind this third solution is that essentially the
only user of these APIs are a) internal, or b) profiling gems.

For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.

For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.

So, this comit:
  * Introduces a pre-registration API for jobs, with a GVL-requiring
    rb_postponed_job_prereigster, which returns a handle which can be
    used with an async-signal-safe rb_postponed_job_trigger.
  * Deprecates rb_postponed_job_register (and re-implements it on top of
    the preregister function for compatability)
  * Moves all the internal usages of postponed job register
    pre-registration
2023-12-10 15:00:37 +09:00
KJ Tsanaktsidis
d2ba8ea54a
Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns (#7295)
When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the
hints we send to getaddrinfo(3) (if supported). This will prompt the
resolver to _NOT_ issue A or AAAA queries if the system does not
actually have an IPv4 or IPv6 address (respectively).

This makes outgoing connections marginally more efficient on
non-dual-stack systems, since we don't have to try connecting to an
address which can't possibly work.

More importantly, however, this works around a race condition present
in some older versions of glibc on aarch64 where it could accidently
send the two outgoing DNS queries with the same DNS txnid, and get
confused when receiving the responses. This manifests as outgoing
connections sometimes taking 5 seconds (the DNS timeout before retry) to
be made.

Fixes #19144
2023-12-07 17:55:15 +09:00
Nobuyoshi Nakada
ac9fdb7a50
Adjust indent [ci skip] 2023-11-30 13:32:53 +09:00
Misaki Shioi
5f62b1d00c Rename rsock_raise_socket_error to rsock_raise_resolution_error
Again, rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail
2023-11-30 13:27:19 +09:00
Misaki Shioi
52f6de4196 Replace SocketError with Socket::ResolutionError in rsock_raise_socket_error
rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail
2023-11-30 13:27:19 +09:00
Misaki Shioi
e9050270d7 Add Socket::ResolutionError & Socket::ResolutionError#error_code
Socket::ResolutionError#error_code returns Socket::EAI_XXX
2023-11-30 13:27:19 +09:00
Yusuke Endoh
62c816410f Retry pthread_create a few times
According to https://bugs.openjdk.org/browse/JDK-8268605, pthread_create
may fail spuriously. This change implements a simple retry as a modest
measure, which is also used by JDK.
2023-11-28 20:49:12 +09:00
Yusuke Endoh
49b6dc8f07 Prevent cpu_set_t overflow even if there are more than 63 cores
Do not use `pthread_attr_setaffinity_np` if `sched_getcpu()` exceeds
`CPU_SETSIZE`. (Using `CPU_ALLOC()` would be more appropriate.)
2023-11-07 04:39:09 +09:00
Yusuke Endoh
deb6dd76e1 Fix a memory leak
pointed by @nobu
2023-11-07 04:39:09 +09:00
Yusuke Endoh
dc636fec2a Use pthread_attr_setaffinity_np instead of pthread_setaffinity_np 2023-11-07 04:39:09 +09:00
Yusuke Endoh
d0066211f2 Detach a pthread after pthread_setaffinity_np
After a pthread for getaddrinfo is detached, we cannot predict when the
thread will exit. It would lead to a segfault by setting
pthread_setaffinity to the terminated pthread.  I guess this problem
would be more likely to occur in high-load environments.

This change detaches the pthread after pthread_setaffinity is called.
[Feature #19965]
2023-11-07 04:39:09 +09:00
Yusuke Endoh
15560cce5f Revert "Do not use pthread_setaffinity_np on s390x"
This reverts commit de82439215.
2023-11-07 04:39:09 +09:00
Yusuke Endoh
de82439215 Do not use pthread_setaffinity_np on s390x
Looks like it randomly causes a segfault

20231025T093302Z.fail.html.gz
```
[11186/26148] TestNetHTTP_v1_2#test_set_form/home/chkbuild/build/20231025T093302Z/ruby/tool/lib/webrick/httprequest.rb:197: [BUG] Segmentation fault at 0x000003ff1ffff000
ruby 3.3.0dev (2023-10-25T07:50:00Z master 526292d9fe) [s390x-linux]
```
2023-10-25 20:04:18 +09:00
Yusuke Endoh
25c1204fe7 rb_getaddrinfo should return EAI_AGAIN instead of EAGAIN 2023-10-24 12:22:53 +09:00
Yusuke Endoh
c08020254e Indent critical regions with blocks
Cosmetic change per ko1's preference
2023-10-24 12:22:53 +09:00
Yusuke Endoh
acd774263c Do not use pthread on mingw 2023-10-24 12:22:53 +09:00
Yusuke Endoh
16d6a22757 Make rb_getnameinfo interruptible
Same as previous commit for rb_getnameinfo.
2023-10-24 12:22:53 +09:00
Yusuke Endoh
3dc311bdc8 Make rb_getaddrinfo interruptible
When pthread_create is available, rb_getaddrinfo creates a pthread and
executes getaddrinfo(3) in it. The caller thread waits for the pthread
to complete, but detaches it if interrupted. This allows name resolution
to be interuppted by Timeout.timeout, etc. even if it takes a long time
(for example, when the DNS server does not respond).  [Feature #19965]
2023-10-24 12:22:53 +09:00
Yusuke Endoh
efd58f19ea Expand macro branches to make them plain 2023-10-24 12:22:53 +09:00
Yusuke Endoh
25ef8d262a Refactor GETADDRINFO_IMPL instead of GETADDRINFO_EMU
This is a preparation for introducing cancellable
getaddrinfo/getnameinfo.
2023-10-24 12:22:53 +09:00
Yusuke Endoh
9ce607a8d4 refactor a call to getaddrinfo 2023-10-24 12:22:53 +09:00
Yusuke Endoh
7362c484c8 Use rb_getnameinfo instead of directly using getnameinfo 2023-10-17 18:21:40 +09:00
Jean Boussier
bcc905100f BasicSocket#recv* return nil rather than an empty packet
[Bug #19012]

man recvmsg(2) states:

> Return Value
> These calls return the number of bytes received, or -1 if an error occurred.
> The return value will be 0 when the peer has performed an orderly shutdown.

Not too sure how one is supposed to make the difference between a packet of
size 0 and a closed connection.
2023-08-30 10:07:18 +02:00
Peter Zhu
58386814a7 Don't check for null pointer in calls to free
According to the C99 specification section 7.20.3.2 paragraph 2:

> If ptr is a null pointer, no action occurs.

So we do not need to check that the pointer is a null pointer.
2023-06-30 09:13:31 -04:00
yui-knk
b481b673d7 [Feature #19719] Universal Parser
Introduce Universal Parser mode for the parser.
This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions
  are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
2023-06-12 18:23:48 +09:00
Matt Valentine-House
2a34bcaa10 Update VPATH for socket, & dependencies
The socket extensions rubysocket.h pulls in the "private" include/gc.h,
which now depends on vm_core.h. vm_core.h pulls in id.h

when tool/update-deps generates the dependencies for the makefiles, it
generates the line for id.h to be based on VPATH, which is configured in
the extconf.rb for each of the extensions. By default VPATH does not
include the actual source directory of the current Ruby so the
dependency fails to resolve and linking fails.

We need to append the topdir and top_srcdir to VPATH to have the
dependancy picked up correctly (and I believe we need both of these to
cope with in-tree and out-of-tree builds).

I copied this from the approach taken in
https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
2023-04-06 11:07:16 +01:00
Matt Valentine-House
5e4b80177e Update the depend files 2023-02-28 09:09:00 -08:00