archive/ruby - Eplg Git: Free And Private Git Hosting

mirror of https://github.com/ruby/ruby.git synced 2025-08-23 04:55:21 +02:00

Author	SHA1	Message	Date
Misaki Shioi	b3baa11ee9	Improve Socket.tcp (#11187) [Feature #20646] Improve Socket.tcp This is a proposed improvement to `Socket.tcp`, which has implemented Happy Eyeballs version 2 (RFC8305) in PR9374. 1. Background I implemented Happy Eyeballs version 2 (HEv2) for Socket.tcp in PR9374, but several issues have been identified: - `IO.select` waits for name resolution or connection establishment in v46w, but it does not consider the case where both events occur simultaneously when it returns a value. - In this case, Socket.tcp can only capture one event and needs to execute an unnecessary loop to capture the other one, calling `IO.select` one extra time. - `IO.select` waits for both IPv6/IPv4 name resolution (in start), but when it returns a value, it doesn't consider the case where name resolution for both address families is complete. - In this case, `Socket.tcp` can only obtain the addresses of one address family and needs to execute an unnecessary loop to obtain the other addresses, calling `IO.select` one extra time. - The consideration for `connect_timeout` was insufficient. After initiating one or more connections, it raises a 'user specified timeout' after the `connect_timeout` period even if some resolved addresses have not yet been tried. - It does not retry with another address in case of a connection failure. - It executes unnecessary state transitions even when an IP address is passed as the `host` argument. - The regex for IP addresses did not correctly specify the start and end. 2. Proposal & Outcome To overcome the aforementioned issues, this PR introduces the following changes: - Previously, each loop iteration represented a single state transition. This has been changed to execute all processes that meet the execution conditions within a single loop iteration. - This prevents unnecessary repeated loops and calls to `IO.select`. - Introduced logic to determine the timeout value set for `IO.select`.
During the Resolution Delay and Connection Attempt Delay, the user-specified timeout is ignored. Otherwise, the timeout value is set to the larger of `resolv_timeout` and `connect_timeout`. - This ensures that the `connect_timeout` is only detected after attempting to connect to all resolved addresses. - Retry with another address in case of a connection failure. - This prevents unnecessary repeated loops upon connection failure. - Call `tcp_without_fast_fallback` when an IP address is passed as the host argument. - This prevents unnecessary state transitions when an IP address is passed. - Fixed regex for IP addresses. Additionally, the code has been reduced by over 100 lines, and redundancy has been minimized, which is expected to improve readability. 3. Performance No significant performance changes were observed in the happy case before and after the improvement. However, improvements in state transition deficiencies are expected to enhance performance in edge cases. ```ruby require 'socket' require 'benchmark' Benchmark.bmbm do |x| x.report('fast_fallback: true') do 30.times { Socket.tcp("www.ruby-lang.org", 80) } end x.report('fast_fallback: false') do # Same as Ruby 3.3 30.times { Socket.tcp("www.ruby-lang.org", 80, fast_fallback: false) } end end ``` Before: ``` ~/s/build ❯❯❯ ../install/bin/ruby ../ruby/test.rb user system total real fast_fallback: true 0.021315 0.040723 0.062038 ( 0.504866) fast_fallback: false 0.007553 0.026248 0.033801 ( 0.533211) ``` After: ``` ~/s/build ❯❯❯ ../install/bin/ruby ../ruby/test.rb user system total real fast_fallback: true 0.023081 0.040525 0.063606 ( 0.406219) fast_fallback: false 0.007302 0.025515 0.032817 ( 0.418680) ```	2024-07-30 12:58:31 +09:00
Nobuyoshi Nakada	d8c6e91748	Fix dangling `else`	2024-06-23 09:42:25 +09:00
Dmitry Davydov	fba8aff7af	[Bug #20592] Fix segfault when sending NULL to freeaddrinfo On Alpine, freeaddrinfo does not accept a NULL pointer.	2024-06-22 22:05:31 +09:00
Koichi Sasada	bd583ca645	retry on cancelling of `getaddrinfo` When the registered unblock function is called, it should retry the cancelled blocking function if possible after checkints. For example, `SIGCHLD` can cancel this method, but it should not raise any exception if there are no trap handlers. The following is repro-code: ```ruby require 'socket' PN = 10_000 1000000.times{ p _1 PN.times{ fork{ sleep rand(0.3) } } i = 0 while i<PN cpid = Process.wait -1, Process::WNOHANG if cpid # p [i, cpid] i += 1 end begin TCPServer.new(nil, 0).close rescue p $! exit! end end } ```	2024-06-21 22:36:42 +09:00
Yusuke Endoh	b346eb8f14	Raise EAI_SYSTEM when pthread_create fails in getaddrinfo Previously, EAI_AGAIN was raised. In our CI, "Temporary failure in name resolution" (EAI_AGAIN) is often raised. We are not sure if this was caused by pthread_create failure or getaddrinfo failure. To make it possible to distinguish between them, this changeset raises EAI_SYSTEM instead of EAI_AGAIN on pthread_create failure.	2024-06-03 10:44:30 +09:00
Nobuyoshi Nakada	a720a1c447	Suppress -Wmaybe-uninitialized warnings with LTO	2024-06-01 16:22:31 +09:00
卜部昌平	c844968b72	ruby tool/update-deps --fix	2024-04-27 21:55:28 +09:00
Nobuyoshi Nakada	e9a7801a93	Drop support for old ERB	2024-03-03 00:55:45 +09:00
Nobuyoshi Nakada	d4e24021d3	Revise `9ec342e07d`	2024-02-26 13:12:05 +09:00
Nobuyoshi Nakada	a0f7de814a	[Bug #20296] Fix the default assertion message	2024-02-26 12:29:23 +09:00
Misaki Shioi	9ec342e07d	Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp (#9374 ) * Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp. [Background] Currently, `Socket.tcp` synchronously resolves names and makes connection attempts with `Addrinfo::foreach.` This implementation has the following two problems. 1. In name resolution, the program stops until the DNS server responds to all DNS queries. 2. In a connection attempt, while an IP address is trying to connect to the destination host and is taking time, the program stops, and other resolved IP addresses cannot try to connect. [Proposal] "Happy Eyeballs" ([RFC 8305](https://datatracker.ietf.org/doc/html/rfc8305)) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially. I implemented it into `Socket.tcp` by using `Addrinfo.getaddrinfo` in each thread spawned per address family to resolve the hostname asynchronously, and using `Socket::connect_nonblock` to try to connect with multiple addrinfo in parallel. [Outcome] This change eliminates a fatal defect in the following cases. Case 1. One of the A or AAAA DNS queries does not return --- require 'socket' class Addrinfo class << self # Current Socket.tcp depends on foreach def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block) getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout) .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout)) .each(&block) end def getaddrinfo(_, _, family, _) case family when Socket::AF_INET6 then sleep when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)] end end end end Socket.tcp("localhost", 4567) --- Because the current `Socket.tcp` cannot resolve IPv6 names, the program stops in this case. 
It cannot start to connect with IPv4 address. Though `Socket.tcp` with HEv2 can promptly start a connection attempt with IPv4 address in this case. Case 2. Server does not promptly return ack for syn of either IPv4 / IPv6 address family --- require 'socket' fork do socket = Socket.new(Socket::AF_INET6, :STREAM) socket.setsockopt(:SOCKET, :REUSEADDR, true) socket.bind(Socket.pack_sockaddr_in(4567, '::1')) sleep socket.listen(1) connection, _ = socket.accept connection.close socket.close end fork do socket = Socket.new(Socket::AF_INET, :STREAM) socket.setsockopt(:SOCKET, :REUSEADDR, true) socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1')) socket.listen(1) connection, _ = socket.accept connection.close socket.close end Socket.tcp("localhost", 4567) --- The current `Socket.tcp` tries to connect serially, so when its first name resolves an IPv6 address and initiates a connection to an IPv6 server, this server does not return an ACK, and the program stops. Though `Socket.tcp` with HEv2 starts to connect sequentially and in parallel so a connection can be established promptly at the socket that attempted to connect to the IPv4 server. In exchange, the performance of `Socket.tcp` with HEv2 will be degraded. --- 100.times { Socket.tcp("www.ruby-lang.org", 80) } --- This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to `IO::select` in the implementation. Avoid NameError of Socket::EAI_ADDRFAMILY in MinGW * Support Windows with SO_CONNECT_TIME * Improve performance I have additionally implemented the following patterns: - If the host is single-stack, name resolution is performed in the main thread. This reduces the cost of creating threads. - If an IP address is specified, name resolution is performed in the main thread. This also reduces the cost of creating threads. - If only one IP address is resolved, connect is executed in blocking mode. This reduces the cost of calling IO::select. 
Also, I have added a fast_fallback option for users who wish not to use HE. Here are the results of each performance test. ```ruby require 'socket' require 'benchmark' HOSTNAME = "www.ruby-lang.org" PORT = 80 ai = Addrinfo.tcp(HOSTNAME, PORT) Benchmark.bmbm do |x| x.report("Domain name") do 30.times { Socket.tcp(HOSTNAME, PORT).close } end x.report("IP Address") do 30.times { Socket.tcp(ai.ip_address, PORT).close } end x.report("fast_fallback: false") do 30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close } end end ``` ``` user system total real Domain name 0.015567 0.032511 0.048078 ( 0.325284) IP Address 0.004458 0.014219 0.018677 ( 0.284361) fast_fallback: false 0.005869 0.021511 0.027380 ( 0.321891) ``` And this is the measurement result when executed in a single stack environment. ``` user system total real Domain name 0.007062 0.019276 0.026338 ( 1.905775) IP Address 0.004527 0.012176 0.016703 ( 3.051192) fast_fallback: false 0.005546 0.019426 0.024972 ( 1.775798) ``` The following is the result of the run on Ruby 3.3.0. (on Dual stack environment) ``` user system total real Ruby 3.3.0 0.007271 0.027410 0.034681 ( 0.472510) ``` (on Single stack environment) ``` user system total real Ruby 3.3.0 0.005353 0.018898 0.024251 ( 1.774535) ``` * Do not cache `Socket.ip_address_list` As mentioned in the comment at https://github.com/ruby/ruby/pull/9374#discussion_r1482269186, caching Socket.ip_address_list does not follow changes in network configuration. But if we stop caching, it becomes necessary to check every time `Socket.tcp` is called whether it's a single stack or not, which could further degrade performance in the case of a dual stack. From this, I've changed the approach so that when a domain name is passed, it doesn't check whether it's a single stack or not and resolves names in parallel each time. The performance measurement results are as follows.
require 'socket' require 'benchmark' HOSTNAME = "www.ruby-lang.org" PORT = 80 ai = Addrinfo.tcp(HOSTNAME, PORT) Benchmark.bmbm do |x| x.report("Domain name") do 30.times { Socket.tcp(HOSTNAME, PORT).close } end x.report("IP Address") do 30.times { Socket.tcp(ai.ip_address, PORT).close } end x.report("fast_fallback: false") do 30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close } end end user system total real Domain name 0.004085 0.011873 0.015958 ( 0.330097) IP Address 0.000993 0.004400 0.005393 ( 0.257286) fast_fallback: false 0.001348 0.008266 0.009614 ( 0.298626) * Wait forever if fallback addresses are unresolved, unless resolv_timeout Changed from waiting only 3 seconds for name resolution when there is no fallback address available, to waiting as long as there is no resolv_timeout. This is in accordance with the current `Socket.tcp` specification. * Use exact pattern to match IPv6 address format to specify the address family	2024-02-26 12:14:11 +09:00
Peter Zhu	ce8531fed4	Stop using rb_str_locktmp_ensure publicly rb_str_locktmp_ensure is a private API.	2024-02-23 14:08:29 -05:00
Marek Küthe	8b2c421a17	Add option for mtu discovery flag Signed-off-by: Marek Küthe <m.k@mk16.de>	2024-02-23 09:47:09 -08:00
Marek Küthe	4bb4327228	Fixes [Bug #20258] Signed-off-by: Marek Küthe <m.k@mk16.de>	2024-02-23 09:47:09 -08:00
KJ Tsanaktsidis	da33c5ac9f	Revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns" This reverts commit `673ed41c81`.	2024-02-01 11:09:54 +11:00
Nobuyoshi Nakada	0f417d640d	Initialize errno variables and fix maybe-uninitialized warnings	2024-01-24 19:33:25 +09:00
KJ Tsanaktsidis	6c0e58a54e	Make sure the correct error is raised for EAI_SYSTEM resolver fail In case of EAI_SYSTEM, getaddrinfo is supposed to set more detail in errno; however, because we call getaddrinfo on a thread now, and errno is threadlocal, that information is being lost. Instead, we just raise whatever errno happens to be on the calling thread (which can be something very confusing, like `ECHILD`). Fix it by explicitly propagating errno back to the calling thread through the getaddrinfo_arg structure. [Bug #20198]	2024-01-22 14:34:31 +11:00
KJ Tsanaktsidis	61da90c1b8	Mark asan fake stacks during machine stack marking ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]	2024-01-19 09:55:12 +11:00
KJ Tsanaktsidis	688a6ff510	Revert "Mark asan fake stacks during machine stack marking" This reverts commit `d10bc3a2b8`.	2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis	d10bc3a2b8	Mark asan fake stacks during machine stack marking ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]	2024-01-12 17:29:48 +11:00
Yusuke Endoh	1bd98c820d	Remove setaffinity of pthread for getaddrinfo It looks like `sched_getcpu(3)` returns a strange number on some (virtual?) environments. I decided to remove the setaffinity mechanism because the performance does not appear to degrade on a quick benchmark even if removed. [Bug #20172]	2024-01-11 12:38:16 +09:00
Adam Hess	6aacbd690c	Free pthread_attr after setting up the thread [bug #20149]	2024-01-05 08:56:44 +09:00
Jean Boussier	b2fc1b054e	Update `BasicSocket#recv` documentation on return value Ref: https://github.com/ruby/ruby/pull/6407 [Bug #19012] `0` is now interpreted as a closed connection and not an empty packet, as these are very rare and pretty much useless.	2023-12-18 12:58:08 +01:00
Nobuyoshi Nakada	0601bce6fc	[DOC] Add Socket::ResolutionError documentation	2023-12-18 08:49:06 +09:00
Nobuyoshi Nakada	71c4a9c38f	[DOC] Correct the location of Addrinfo document The document must be placed immediately before the class definition. No other statements can be placed in between.	2023-12-18 08:47:59 +09:00
Nobuyoshi Nakada	e316128e3d	[DOC] Stop unintentional references to builtin or standard names	2023-12-18 08:38:59 +09:00
Nobuyoshi Nakada	7bfa1c3dc9	Revert "[DOC] Make undocumented socket constants nodoc" This reverts commit `cbda94edd8`, because `:nodoc:` does not work for constants. In the case of `rb_define_const`, RDoc parses the preceding comment as in `"/* definition: comment */"` form.	2023-12-17 22:37:15 +09:00
Nobuyoshi Nakada	cbda94edd8	[DOC] Make undocumented socket constants nodoc	2023-12-17 20:17:45 +09:00
Nobuyoshi Nakada	557d929ba6	[DOC] Utilize COMMENTS.default_proc to add fallback documents	2023-12-17 20:17:05 +09:00
KJ Tsanaktsidis	25711e7063	Partially revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls" This _partially_ reverts commit `d2ba8ea54a`, but for UDP sockets only. With TCP sockets (and other things which use `rsock_init_inetsock`), the order of operations is to call `getaddrinfo(3)` with AF_UNSPEC, look at the returned addresses, pick one, and then call `socket(2)` with the family for that address (i.e. AF_INET or AF_INET6). With UDP sockets, however, this is reversed; `UDPSocket.new` takes an address family as an argument, and then calls `socket(2)` with that family. A subsequent call to UDPSocket#connect will then call `getaddrinfo(3)` with that family. The problem here is that... * If you are in a networking situation that _only_ has loopback addrs, * And you want to look up a name like "localhost" (or NULL) * And you pass AF_INET or AF_INET6 as the ai_family argument to getaddrinfo(3), * And you pass AI_ADDRCONFIG to the hints argument as well, then glibc on Linux will not return an address. This is because AI_ADDRCONFIG is supposed to return addresses for families we actually have an address for and could conceivably connect to, but also is documented to explicitly ignore localhost in that situation. It honestly doesn't make a ton of sense to pass AI_ADDRCONFIG if you're explicitly passing the address family anyway, because you're not looking for "an address for this name we can connect to"; you're looking for "an IPv(4|6) address for this name". And the original glibc bug that `d2ba8ea5` was supposed to work around was related to parallel issuance of A and AAAA queries, which of course won't happen if an address family is explicitly specified. So, we fix this by not passing AI_ADDRCONFIG for calls to `rsock_addrinfo` that we also pass an explicit family to (i.e. for UDPSocket). [Bug #20048]	2023-12-12 20:05:21 +11:00
KJ Tsanaktsidis	f8effa209a	Change the semantics of rb_postponed_job_register Our current implementation of rb_postponed_job_register suffers from some safety issues that can lead to interpreter crashes (see bug #1991). Essentially, the issue is that jobs can be called with the wrong arguments. We made two attempts to fix this whilst keeping the promised semantics, but: * The first one involved masking/unmasking when flushing jobs, which was believed to be too expensive * The second one involved a lock-free, multi-producer, single-consumer ringbuffer, which was too complex The critical insight behind this third solution is that essentially the only users of these APIs are a) internal, or b) profiling gems. For a), none of the usages actually require variable data; they will work just fine with the preregistration interface. For b), generally profiling gems only call a single callback with a single piece of data (which is actually usually just zero) for the life of the program. The ringbuffer is complex because it needs to support multi-word inserts of job & data (which can't be atomic); but nobody actually even needs that functionality, really. So, this commit: * Introduces a pre-registration API for jobs, with a GVL-requiring rb_postponed_job_preregister, which returns a handle which can be used with an async-signal-safe rb_postponed_job_trigger. * Deprecates rb_postponed_job_register (and re-implements it on top of the preregister function for compatibility) * Moves all the internal usages of postponed job register to pre-registration	2023-12-10 15:00:37 +09:00
KJ Tsanaktsidis	d2ba8ea54a	Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns (#7295) When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the hints we send to getaddrinfo(3) (if supported). This will prompt the resolver to _NOT_ issue A or AAAA queries if the system does not actually have an IPv4 or IPv6 address (respectively). This makes outgoing connections marginally more efficient on non-dual-stack systems, since we don't have to try connecting to an address which can't possibly work. More importantly, however, this works around a race condition present in some older versions of glibc on aarch64 where it could accidentally send the two outgoing DNS queries with the same DNS txnid, and get confused when receiving the responses. This manifests as outgoing connections sometimes taking 5 seconds (the DNS timeout before retry) to be made. Fixes #19144	2023-12-07 17:55:15 +09:00
Nobuyoshi Nakada	ac9fdb7a50	Adjust indent [ci skip]	2023-11-30 13:32:53 +09:00
Misaki Shioi	5f62b1d00c	Rename rsock_raise_socket_error to rsock_raise_resolution_error Again, rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail	2023-11-30 13:27:19 +09:00
Misaki Shioi	52f6de4196	Replace SocketError with Socket::ResolutionError in rsock_raise_socket_error rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail	2023-11-30 13:27:19 +09:00
Misaki Shioi	e9050270d7	Add Socket::ResolutionError & Socket::ResolutionError#error_code Socket::ResolutionError#error_code returns Socket::EAI_XXX	2023-11-30 13:27:19 +09:00
Yusuke Endoh	62c816410f	Retry pthread_create a few times According to https://bugs.openjdk.org/browse/JDK-8268605, pthread_create may fail spuriously. This change implements a simple retry as a modest measure, which is also used by JDK.	2023-11-28 20:49:12 +09:00
Yusuke Endoh	49b6dc8f07	Prevent cpu_set_t overflow even if there are more than 63 cores Do not use `pthread_attr_setaffinity_np` if `sched_getcpu()` exceeds `CPU_SETSIZE`. (Using `CPU_ALLOC()` would be more appropriate.)	2023-11-07 04:39:09 +09:00
Yusuke Endoh	deb6dd76e1	Fix a memory leak pointed out by @nobu	2023-11-07 04:39:09 +09:00
Yusuke Endoh	dc636fec2a	Use pthread_attr_setaffinity_np instead of pthread_setaffinity_np	2023-11-07 04:39:09 +09:00
Yusuke Endoh	d0066211f2	Detach a pthread after pthread_setaffinity_np After a pthread for getaddrinfo is detached, we cannot predict when the thread will exit. It would lead to a segfault by setting pthread_setaffinity to the terminated pthread. I guess this problem would be more likely to occur in high-load environments. This change detaches the pthread after pthread_setaffinity is called. [Feature #19965]	2023-11-07 04:39:09 +09:00
Yusuke Endoh	15560cce5f	Revert "Do not use pthread_setaffinity_np on s390x" This reverts commit `de82439215`.	2023-11-07 04:39:09 +09:00
Yusuke Endoh	de82439215	Do not use pthread_setaffinity_np on s390x Looks like it randomly causes a segfault: 20231025T093302Z.fail.html.gz ``` [11186/26148] TestNetHTTP_v1_2#test_set_form/home/chkbuild/build/20231025T093302Z/ruby/tool/lib/webrick/httprequest.rb:197: [BUG] Segmentation fault at 0x000003ff1ffff000 ruby 3.3.0dev (2023-10-25T07:50:00Z master 526292d9fe) [s390x-linux] ```	2023-10-25 20:04:18 +09:00
Yusuke Endoh	25c1204fe7	rb_getaddrinfo should return EAI_AGAIN instead of EAGAIN	2023-10-24 12:22:53 +09:00
Yusuke Endoh	c08020254e	Indent critical regions with blocks Cosmetic change per ko1's preference	2023-10-24 12:22:53 +09:00
Yusuke Endoh	acd774263c	Do not use pthread on mingw	2023-10-24 12:22:53 +09:00
Yusuke Endoh	16d6a22757	Make rb_getnameinfo interruptible Same as previous commit for rb_getnameinfo.	2023-10-24 12:22:53 +09:00
Yusuke Endoh	3dc311bdc8	Make rb_getaddrinfo interruptible When pthread_create is available, rb_getaddrinfo creates a pthread and executes getaddrinfo(3) in it. The caller thread waits for the pthread to complete, but detaches it if interrupted. This allows name resolution to be interrupted by Timeout.timeout, etc. even if it takes a long time (for example, when the DNS server does not respond). [Feature #19965]	2023-10-24 12:22:53 +09:00
Yusuke Endoh	efd58f19ea	Expand macro branches to make them plain	2023-10-24 12:22:53 +09:00
Yusuke Endoh	25ef8d262a	Refactor GETADDRINFO_IMPL instead of GETADDRINFO_EMU This is a preparation for introducing cancellable getaddrinfo/getnameinfo.	2023-10-24 12:22:53 +09:00
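
The `Socket::ResolutionError` commits listed above (e9050270d7, 52f6de4196) can be tried out with a minimal sketch like the following. It assumes that `Socket::ResolutionError` is a subclass of `SocketError`, so rescuing `SocketError` also works on Rubies that predate it. The helper name `describe_resolution_failure` is hypothetical and not part of Ruby; `AI_NUMERICHOST` is used only so that the failure is triggered locally without any DNS traffic.

```ruby
require 'socket'

# Hypothetical helper (not part of Ruby): try to resolve +host+ strictly as a
# numeric address and summarize the failure, if any. Returns nil on success.
def describe_resolution_failure(host)
  # AI_NUMERICHOST forbids DNS lookups, so a non-numeric host fails locally.
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM, nil, Socket::AI_NUMERICHOST)
  nil
rescue SocketError => e
  {
    error_class: e.class,
    # error_code is assumed to exist only on Socket::ResolutionError.
    error_code: e.respond_to?(:error_code) ? e.error_code : nil,
    message: e.message
  }
end

p describe_resolution_failure("not-a-numeric-address")
p describe_resolution_failure("127.0.0.1")
```

On a Ruby that includes these commits, the `error_code` entry should be one of the `Socket::EAI_*` constants; on older versions the sketch falls back to `nil`.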

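The `fast_fallback` option described in 9ec342e07d and b3baa11ee9 can be exercised without external network access. The sketch below is a rough illustration under stated assumptions: `connect_tcp` is a hypothetical helper, the keyword is passed only when this Ruby's `Socket.tcp` declares it, and a throwaway local `TCPServer` stands in for a real host.

```ruby
require 'socket'

# Hypothetical helper: connect to host:port, passing fast_fallback only when
# this Ruby's Socket.tcp accepts that keyword (older versions do not).
def connect_tcp(host, port, fast_fallback: true, connect_timeout: 5)
  options = { connect_timeout: connect_timeout }
  supports_flag = Socket.method(:tcp).parameters.any? { |_, name| name == :fast_fallback }
  options[:fast_fallback] = fast_fallback if supports_flag
  Socket.tcp(host, port, **options)
end

# Throwaway local server on an ephemeral port, accepting two connections.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
acceptor = Thread.new { 2.times { server.accept.close } }

[true, false].each do |flag|
  socket = connect_tcp("127.0.0.1", port, fast_fallback: flag)
  puts "fast_fallback=#{flag}: connected to #{socket.remote_address.inspect_sockaddr}"
  socket.close
end

acceptor.join
server.close
```

Because an IP address is passed as the host, both calls should take the path without name resolution threads according to the commit messages above; a dual-stack hostname would be needed to observe the Happy Eyeballs behavior itself.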
1195 commits