PCRE2_EXTRA_CASELESS_RESTRICT is only available as of pcre2 10.43.
Note: no check is necessary for pcre2_set_compile_extra_options because
it is available since pcre2 10.30, which is the minimum version PHP
requires.
The memmove() function is a C99 standard function [1], and the check for it
was left in place only for the bundled PCRE2 library. This can be simplified
by passing the compile option directly instead of checking for a function
that is always available on current systems. An external PCRE2 library on
the system doesn't need this.
[1]: https://port70.net/~nsz/c/c99/n1256.html#7.21.2.2
Adds support for "Caseless restricted" matching added in PCRE2lib
10.43 with the "r" modifier.
This is `PCRE2_EXTRA_CASELESS_RESTRICT` in PCRE2. This is an "extra"
option, which means it is not possible to pass this option as a
pcre2_compile() function parameter.
Instead, this option is passed in a pcre2_set_compile_extra_options() call.
Previously, these extra options were set in php_pcre_init_pcre2(),
but after this change, it is possible to customize the options
by adding bits to `eoptions` in pcre_get_compiled_regex_cache_ex().
The tests for this change are ported from the upstream test suite[^1].
[^1]: c13d54f658 (diff-8c8312e4eb2d35bb16485404b7b5cc0eaef0bca1aa95ff5febf6a1890048305c)
While __php_mempcpy is only used by ext/standard/crypt_sha*, the
mempcpy "pattern" is used everywhere.
This commit removes __php_mempcpy, adds zend_mempcpy and transforms
open-coded parts into function calls.
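
The mempcpy "pattern" mentioned above is: copy n bytes, then continue
writing right after them. A minimal sketch of that idea follows. The names
sketch_mempcpy and join_with_sep are hypothetical and used only for
illustration; this is not the actual zend_mempcpy definition.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dest and return a pointer
 * to the byte just past the copied data in dest. */
static inline void *sketch_mempcpy(void *dest, const void *src, size_t n)
{
    return (char *)memcpy(dest, src, n) + n;
}

/* Hypothetical caller: join a, sep and b into buf.
 * buf must have room for all three strings plus the terminating NUL. */
static void join_with_sep(char *buf, const char *a, const char *sep, const char *b)
{
    char *p = buf;

    /* Open-coded, each step would be: memcpy(p, s, len); p += len; */
    p = sketch_mempcpy(p, a, strlen(a));
    p = sketch_mempcpy(p, sep, strlen(sep));
    p = sketch_mempcpy(p, b, strlen(b));
    *p = '\0';
}
```

Calling join_with_sep(buf, "ext", "/", "pcre") with a sufficiently large
buf leaves "ext/pcre" in buf.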
* Update signature of pcre API
This changes the variables that are bools to actually be bools instead
of ints, which allows some additional optimization by the compiler (e.g.
removing some ternaries and move extensions).
It also gets rid of the use_flags argument because that's just the same
as flags == 0. This reduces the call frame.
* Use zend_string_release_ex where possible
* Remove duplicate symbols from strchr
* Avoid useless value conversions
* Use a raw HashTable* instead of a zval
* Move condition
* Make for loop cheaper by reusing a recently used value as start iteration index
* Remove useless condition
This can't be true if the second condition is true because it would
require the string to occupy the entire address space.
* Upgrading + remark
* Always inline populate_match_value and fix argument type
The call overhead of this function is quite large.
* Use _new variant of zend_hash in some places to avoid additional check
* Move allocation of match_sets down to simplify and reduce code size
* Move pcre2_get_ovector_pointer out of the loop
The ovector is allocated together with the match data and stays loop
invariant: the pointer is always the same (the values it points to are not).
* Mark error condition as cold block
* Simplify condition: subpats is already checked
* Move array size preallocation so that it allocates the up-to-date size
* Simplify condition
* Rework internal functions to avoid repeated unwrapping
* Remember Z_ARRVAL_P(return_value)
The lookup is loop invariant.
* Mark some pointers as const
The code in the attached test used to work correctly in PHP 8.0, but not
in 8.1+. This is because PHP 8.1+ uses a more modern version of pcre2
than PHP 8.0, and that pcre2 version has a regression.
While upgrading pcre2lib seems to be only done for the master branch, it
is possible to backport upstream fixes to stable branches. This has
already been done in the past for JIT regressions [1], so it is not
unprecedented.
We backport the upstream pcre2 fix [2].
[1] 788a701e22
[2] https://github.com/PCRE2Project/pcre2/pull/135
Closes GH-12108.
The ZVAL_ARR macro always sets the zval type_info to IS_ARRAY_EX, even if the
hash table is immutable. Since in preg_replace_callback_array() we can return
the passed array directly, and that passed array can be immutable, we need to
reset the type_flags to keep the VM from performing ref-counting on the array.
Fixes GH-10968
Closes GH-10970