Provides the last PCRE error as a human-readable message, similar
to functionality existing in other extensions, such as
json_last_error_msg().
Closes GH-5185.
PCRE only validates the string starting from the start offset
(minus maximum look-behind, but let's ignore that), so we can
only remember that the string is fully valid UTF-8 is the original
start offset is zero.
We need not just the whole string to be UTF-8, but the start
position to be on a character boundary as well. Check this by
looking for a continuation byte.
A new function `pcre_get_compiled_regex_cache_ex()` is introduced,
which allows to compile regexp pattern using the "C" locale instead
of a current locale.
This will be needed to replace setlocale() usage in fileinfo,
which is not thread-safe.
Print a more informative message that indicates that this is
likely a permission issue, and also indicate that pcre.jit=0
can be used to work around it.
Also automatically disable the JIT, so that this message is
only shown once.
See bug #78630.
RFC: https://wiki.php.net/rfc/tostring_exceptions
And convert some object to string conversion related recoverable
fatal errors into Error exceptions.
Improve exception safety of internal code performing string
conversions.
The `<loccale.h>` header file, setlocale, and localeconv are part of the
standard C89 [1] and on current systems can be used unconditionally.
Since PHP 7.4 requires at least C89 or greater, the `HAVE_LOCALE_H`,
`HAVE_SETLOCALE`, and `HAVE_LOCALECONV` symbols defined by Autoconf in
configure.ac [2] can be ommitted and simplifed.
The bundled libmagic (file) has also been patched already in version
5.35 and up in upstream location so when it will be patched also in
php-src the check for locale.h header is still left in the configure.ac
and in windows headers definition file.
[1] https://port70.net/~nsz/c/c89/c89-draft.html#4.4
[2] https://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
Omit the bundled libmagic files
There will only be one request on the CLI SAPI, so there is no
advantage to having a persistent PCRE cache. Using a non-persistent
cache allows us to use arbitrary strings as cache keys.
Accept the two offsets directly, rather than doing length calculations
at all callsites. Also extract the logic to create a possibly interned
string.
Switch the split implementation to work on a char* subject internally,
because ZSTR_VAL(subject_str) is a mouthful...
This issue was mentioned in bug #73948. The PREG_PATTERN_ORDER
padding was performed without respecting the PREF_OFFSET_CAPTURE
flag, which resulted in unmatched subpatterns being either null or
[null, -1] depending on where they occur. Now they will always be
[null, -1], consistent with other usages.