Fixes GHSA-3qrf-m4j2-pcrr.
To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports providing default set options.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.
Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.
Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
* Add behavioural tests for incdec operators
* Add support to ++/-- for objects castable to _IS_NUMBER
* Add str_increment() function
* Add str_decrement() function
RFC: https://wiki.php.net/rfc/saner-inc-dec-operators
Co-authored-by: Ilija Tovilo <ilija.tovilo@me.com>
Co-authored-by: Arnaud Le Blanc <arnaud.lb@gmail.com>
After a hash filling routine the number of elements are set to the fill
index. However, if the fill index is larger than the number of elements,
the number of elements are no longer correct. This is observable at
least via count() and var_dump(). E.g. the attached test case would
incorrectly show int(17) instead of int(11).
Solve this by only increasing the number of elements by the actual
number that got added. Instead of adding a variable that increments per
iteration, I wanted to save some cycles in the iteration and simply
compute the number of added elements at the end.
I discovered this behaviour while fixing GH-11016, where this filling
routine is easily exposed to userland via a specialised VM path [1].
Since this seems to be more a general problem with the macros, and may
be triggered outside of the VM handlers, I fixed it in the macros
instead of modifying the VM to fixup the number of elements.
[1] b2c5acbb01/Zend/zend_vm_def.h (L6132-L6141)
* PHP-8.2:
Re-add some CTE functions that were removed from being CTE by a mistake
Fix GH-8065: opcache.consistency_checks > 0 causes segfaults in PHP >= 8.1.5 in fpm context
Fix GH-8646: Memory leak PHP FPM 8.1
Fixes GH-8646
See https://github.com/php/php-src/issues/8646 for thorough discussion.
Interned strings that hold class entries can get a corresponding slot in map_ptr for the CE cache.
map_ptr works like a bump allocator: there is a counter which increases to allocate the next slot in the map.
For class name strings in non-opcache we have:
- on startup: permanent + interned
- on request: interned
For class name strings in opcache we have:
- on startup: permanent + interned
- on request: either not interned at all, which we can ignore because they won't get a CE cache entry
or they were already permanent + interned
or we get a new permanent + interned string in the opcache persistence code
Notice that the map_ptr layout always has the permanent strings first, and the request strings after.
In non-opcache, a request string may get a slot in map_ptr, and that interned request string
gets destroyed at the end of the request. The corresponding map_ptr slot can thereafter never be used again.
This causes map_ptr to keep reallocating to larger and larger sizes.
We solve it as follows:
We can check whether we had any interned request strings, which only happens in non-opcache.
If we have any, we reset map_ptr to the last permanent string.
We can't lose any permanent strings because of map_ptr's layout.
Closes GH-10783.
Sets the UTF-8 valid flag if all parts are valid, or numeric (which are valid UTF-8 by definition).
* remove unuseful comments
* Imply UTF8 validity in implode function
* revert zend_string_dup change
(I agree that null should come last, but then it should rather throw with a proper message than emit an undefined key warning.)
Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
`zend_uchar` suggests that the value is an ASCII character, but here,
it's about very small integers. This is misleading, so let's use a
C99 integer instead.
On all architectures currently supported by PHP, `zend_uchar` and
`uint8_t` are identical. This change is only about code readability.
copy_file_range can return early without copying all the data. This is
legal behaviour and worked properly, unless the mmap fallback was used.
The mmap fallback would read too much data into the destination,
corrupting the destination file. Furthermore, if the mmap fallback would
fail and have to fallback to the regular file copying mechanism, a
similar issue would occur because both maxlen and haveread are modified.
Furthermore, there was a mmap-resource in one of the failure paths of
the mmap fallback code.
This patch fixes these issues. This also adds regression tests using the
new copy_file_range early-return simulation added in the previous
commit.
The UTF-8 valid flag needs to be copied upon interning,
otherwise strings that are concatenated at compile time lose this information.
However, if previously this string was interned without the flag it is not added
E.g. in the case the string is an existing class name.
Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>