Zero register, i.e. xzr, can be used directly to zero FP register.
TODO: FMOV from ZR may be slow on some cores and the preferred
instructio is MOVI with immediate zero [1]. However, MOVI is not
recoginized by DynASM/arm64.
[1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=523d72071960
Change-Id: I0eaee4445e05adb45c6bb80ddb62ea02cdc9f4db
Newer MSVC versions support the `inline` keyword, so we do not need to explicitly define it. We also shouldn't define `inline` to `__forceinline`, as we already have `zend_always_inline` to indicate this requirement.
* JIT/AArch64: [macos][ZTS] Support fast path for tlv_get_addr
Access to TLV(thread local variable) in macOS is in "dynamic" form and
function tlv_get_addr() is invoked to resolve the address. See the
example in [1].
Note there is one struct TLVDescriptor [2] for each TLV. The first
member holds the address of function tlv_get_addr(), and the other two
members, "key" and "offset", would be used inside tlv_get_addr().
The disassembly code for function tlv_get_addr() is shown in [3]. With
the value from system register, i.e. tpidrro_el0, together with "key"
and "offset", the TLV address can be obtained.
Note that the value from tpidrro_el0 varies for different threads, and
unique address for TLV is resolved.
It's worth noting that slow path would be executed, i.e. function
tlv_allocate_and_initialize_for_key(), for the first time of TLV access.
In this patch:
1. "_tsrm_ls_cache" is guaranteed to be accessed before any VM/JIT code
during the request startup, e.g. in init_executor(), therefore, slow
path can be avoided.
2. As TLVDecriptor is immutable and zend_jit_setup() executes once, we
get this structure in tsrm_get_ls_cache_tcb_offset(). Note the 'ldr'
instruction would be patched to 'add' by the linker.
3. Only fast path for tlv_get_addr() is implemented in macro
LOAD_TSRM_CACHE.
With this patch, all ~4k test cases can pass for ZTS+CALL in macOS on
Apple silicon.
[1] https://gist.github.com/shqking/4aab67e0105f7c1f2c549d57d5799f94
[2]
https://opensource.apple.com/source/dyld/dyld-195.6/src/threadLocalVariables.c.auto.html
[3] https://gist.github.com/shqking/329d7712c26bad49786ab0a544a4af43
Change-Id: I613e9c37e3ff2ecc3fab0f53f1e48a0246e12ee3
Excerpt from the release news:
Version 10.37 26-May-2021
-------------------------
A few more bug fixes and tidies. The only change of real note is the removal of
the actual POSIX names regcomp etc. from the POSIX wrapper library because
these have caused issues for some applications (see 10.33 #2 below).
Version 10.36 04-December-2020
------------------------------
Again, mainly bug fixes and tidies. The only enhancements are the addition of
GNU grep's -m (aka --max-count) option to pcre2grep, and also unifying the
handling of substitution strings for both -O and callouts in pcre2grep, with
the addition of $x{...} and $o{...} to allow for characters whose code points
are greater than 255 in Unicode mode.
NOTE: there is an outstanding issue with JIT support for MacOS on arm64
hardware. For details, please see Bugzilla issue #2618.
Signed-off-by: Anatol Belski <ab@php.net>
This format matches against null bytes, and prevents the test
expectation from being interpreted as binary data.
bless_tests.php will automatically replace \0 with %0 as well.
While reading the docs in the repo, I found & made a few improvements
to the documentation so it's clearer to its readers.
These improvements are around: typos, general punctuations, and grammar.
This tests takes about 2 minutes on AppVeyor CI, what is super slow.
The problem is that we're doing 50,000 inserts of small keys and values
instead of only few inserts with large values, what basically has the
same effect regarding the mmap size.
Closes GH-7073.
Now that inheritance can throw deprecations again, these may be
converted to exception by a custom error handler. In this case
we need to convert the exception to a fatal error, as inheritance
cannot safely throw in the general case.
- make_fcontext() never returns NULL.
- Show syscall error info in exception just like zend_alloc does (we also used php_win32_error_*() in Zend).
- MAP_ANON was marked as deprecated or compatibility macro on Linux, It would be better to use MAP_ANONYMOUS first.
- Makes the code consistent with the code of other modules in the kernel.
- adds some comments.
When row data split across multiple packets, allocate a temporary
buffer that can be reallocated, and only copy into the row buffer
pool arena once we know the final size. This avoids quadratic
memory usage for very large results.
(cherry picked from commit 1fc4c89214)