While __php_mempcpy is only used by ext/standard/crypt_sha*, the
mempcpy "pattern" is used everywhere.
This commit removes __php_mempcpy, adds zend_mempcpy and transforms
open-coded parts into function calls.
Thanks to Maurício Fauth for finding and reporting this bug.
The bug was introduced in October 2022. It originally only affected
text encodings which do not have a fixed byte width per characters
and for which mbstring does not have an mblen_table. However, I recently
made another change to mbstring, such that mb_substr no longer relies
on the mblen_table even if one is available. Because of this change,
the bug earlier introduced in October 2022 now affected a greater
number of text encodings, including UTF-8.
Because these functions are copied and not properly registered (which we
can't), the observer code doesn't add the temporaries on startup.
Add them via a callback during startup.
Closes GH-12906.
For GB18030, it is not generally possible to identify character
boundaries without scanning through the entire string. Therefore,
implement mb_strcut using a similar strategy as the mblen_table based
implementation in mbstring.c. The difference is that for GB18030, we
need to look at two leading bytes to determine the byte length of a
multi-byte character.
The new implementation is 4-5x faster for short strings, and more than
10x faster for long strings. (Part of the reason why this new code has
such a great performance advantage is because it is replacing code
based on the older text conversion filters provided by libmbfl, which
were quite slow.)
The behavior is the same as before for valid GB18030 strings; for
some invalid strings, mb_strcut will choose different 'cut' points
as compared to before. (Clang's libFuzzer was used to compare the
old and new implementations, searching for test cases where they had
different behavior; no such cases were found.)
* PHP-8.3:
Fix GH-12929: SimpleXMLElement with stream_wrapper_register can segfault
Fix getting the address of an uninitialized property of a SimpleXMLElement resulting in a crash
Fix GH-12962: Double free of init_file in phpdbg_prompt.c
Move SimpleXML invalidation code after node checks
This is safe, i.e. the tree hasn't been modified yet, because either we
didn't call a libxml modification function yet, or xmlNewChild is called
with a NULL pointer, which makes it bail out and return NULL.
Closes GH-12947.
* PHP-8.2:
Fix getting the address of an uninitialized property of a SimpleXMLElement resulting in a crash
Fix GH-12962: Double free of init_file in phpdbg_prompt.c
* Use strcspn() to optimize dom_html5_escape_string()
This routine implemented by libc uses a faster algorithm than the old
naive byte-per-byte approach here. It also is often optimized using
SIMD.
* Calculate mask outside of loop