A bunch of different issues:
1) The referenced value is copied without incrementing the refcount.
The reason the refcount isn't incremented is because otherwise
the array modifications would violate the RC1 constraints.
Solve this by copying the reference itself instead and always
read the referenced value.
2) No type checks on the array data, so malicious scripts could
cause type confusion bugs.
3) Potential overflow when the arrays resize and we access ctag.
Closes GH-17205.
Upon unwinding from an exception, the parser state is not stable, we
should not continue updating the values if an exception was thrown.
Closes GH-15879.
Upon unwinding from an exception, the parser state is not stable, we
should not continue updating the values if an exception was thrown.
Closes GH-15879.
libxml2 2.13 makes changes to how the parsing state is set, update our
code accordingly. In particular, it started reporting entities within
attributes, while it should only report entities inside text nodes.
Closes GH-14837.
* PHP-8.2:
NEWS for compatibility in XML
Stop setting parse options directly
Stop relying on lastError directly
Stop relying on the sax2 flag directly
Port XML_GetCurrentByteIndex to public APIs
The reflection failure is because the XML extension is used to check the
module dependency information, but that extension can be configured to
not depend on ext/libxml, resulting in a different output. The solution
is to check another extension instead.
The test failures in ext/xml/tests are because of different behaviour
between libxml2 and Expat error handling. These are expected differences
and the solution is to split the tests.
Closes GH-13522.
Fixes GHSA-3qrf-m4j2-pcrr.
To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports providing default set options.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.
Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.
Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
It's possible to categorise the failures into 2 categories:
- Changed error message. In this case we either duplicate the test and
modify the error message. Or if the change in error message is
small, we use the EXPECTF matchers to make the test compatible with both
old and new versions of libxml2.
- Missing warnings. This is caused by a change in libxml2 where the
parser started using SAX APIs internally [1]. In this case the
error_type passed to php_libxml_internal_error_handler() changed from
PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
started to use the SAX handlers instead of the generic handlers.
However, for the SAX handlers the current input stack is empty, so
nothing is actually printed. I fixed this by falling back to a
regular warning without a filename & line number reference, which
mimicks the old behaviour. Furthermore, this change now also shows
an additional warning in a test which was previously hidden.
[1] 9a82b94a94
Closes GH-11162.
This reverts commit 94ee4f9834.
The commit was a bit too late to be included in PHP 8.2 RC1. Given it's a massive ABI break, we decide to postpone the change to PHP 8.3.