Setting the recovery option by using a hardcoded value (1) already
worked for SimpleXML. For DOM, a small change is necessary because
otherwise the recover field overwrites the recovery option.
From a quick search on GitHub [1] it looks like this won't clash with
existing PHP code as no one seems to define (or use) a constant with
such a name.
[1] https://github.com/search?q=LIBXML_RECOVER+language%3APHP&type=code&l=PHP
I forgot to also update the document reference of attributes, so when
no variable holds a document reference anymore, but an attribute still
does, this can crash. Fix it by also updating the document references
for attributes.
Closes GH-13002.
There have been multiple long-standing issues here, see the GH thread [1]
for discussion.
There were already many fixes to this function previously, and as a
consequence of one of those fixes this started throwing exceptions for a
correct use-case. It turns out that even when reverting to the previous
behaviour there are still bugs. Just fix all of them while we have the
chance.
[1] https://github.com/php/php-src/issues/12870
Closes GH-12888.
This is a continuation of commit c2a58ab07d, in which several OOM error
handling paths were converted to throw an INVALID_STATE_ERR DOMException.
Some places were missed and they still returned false without an
exception, or threw a PHP_ERR DOMException.
Convert all of these to INVALID_STATE_ERR DOMExceptions. This also
reduces confusion for users reading the documentation [1].
Unfortunately, not all node creations are checked for a NULL pointer.
Some places therefore will not do anything if an OOM occurs (well,
except crash).
On the one hand it's nice to handle these OOM cases.
On the other hand, this adds some complexity and it's very unlikely to
happen in the real world. But then again, "unlikely" situations have
caused trouble before. Ideally all cases should be checked.
[1] https://github.com/php/doc-en/issues/1741
* Avoid passing NULL to xmlSwitchToEncoding
This otherwise switches to UTF-8 on libxml2 2.12.0
* Split tests for different error reporting behaviour in libxml2 2.12.0
* Avoid deprecation warnings for libxml2 2.12.0
We can't fully get rid of the parser globals as there are still APIs
that implicitly use them.
* Temporarily disable part of test for libxml 2.12.0 regression
See https://gitlab.gnome.org/GNOME/libxml2/-/issues/634
* Review fixes
* [ci skip] Update test description
This always results in a segfault when trying to instantiate, so this never
worked. At least throw an error instead of segfaulting to prevent developers
from being confused.
Closes GH-12420.
The original caching implementation had an oversight in combination with
the new lifetime management in DOM for 8.3.
The modification counter is stored on the document object itself, but as
that can get deallocated when all references disappear, stale cache data
can be used. Normally this isn't a problem, unless getElementsByTagName is
called not on the document but on a child node. Fix it by moving caching
data into the ref object, which will outlive all nodes from a document
even if the document object disappears.
Closes GH-12338.
The entry points are duplicated: they add bloat and make it easier to forget
to change something. Make maintenance easier by using @implementation-alias.
Also, this has the nice side-effect of slightly reducing the amount of
code and binary size.
Closes GH-12158.
Because the failure path did not release the string, there was a memory
leak.
As the only valid types for this function are IS_NULL and IS_STRING,
and IS_NULL is always rejected in practice, solve the issue by not using
a function that increments the refcount in the first place.
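The leak pattern can be sketched as follows. This is a toy model, not the
actual patch: toy_string and every function name below are hypothetical
stand-ins for the engine's refcounted string and its helpers.

```c
#include <assert.h>

/* Toy refcounted string; a hypothetical stand-in for an engine string. */
typedef struct {
    int refcount;
    const char *val;
} toy_string;

/* Takes a new reference: the caller must release it on every path. */
static toy_string *toy_string_copy(toy_string *s)
{
    s->refcount++;
    return s;
}

static void toy_string_release(toy_string *s)
{
    assert(s->refcount > 0);
    s->refcount--;
}

/* Leaky pattern: the early return skips the release. */
static int set_value_leaky(toy_string *input, int valid)
{
    toy_string *str = toy_string_copy(input);
    if (!valid) {
        return -1; /* bug: the reference taken above is never released */
    }
    /* ... use str ... */
    toy_string_release(str);
    return 0;
}

/* Fixed pattern: borrow the pointer, so no path has anything to release. */
static int set_value_borrowed(toy_string *input, int valid)
{
    toy_string *str = input; /* borrowed, refcount untouched */
    if (!valid) {
        return -1;
    }
    (void) str; /* ... use str ... */
    return 0;
}
```

After a failed call to set_value_leaky() the refcount stays one higher than
before, while set_value_borrowed() leaves it unchanged on both paths.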
Closes GH-12002.
There are two linked issues:
- Conflicts couldn't be resolved by changing the prefix name.
- Lacking a prefix would turn the namespace into the default namespace,
causing elements to suddenly become part of the namespace instead of
the attributes.
The output could still be improved by removing redundant namespace
declarations, but that's another issue. At least the output is
correct now.
Closes GH-11777.
Fixes GHSA-3qrf-m4j2-pcrr.
To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_PARSE_NOENT to
substitute entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports setting default options globally.
For example, if you call xmlSubstituteEntitiesDefault(1) then the
XML_PARSE_NOENT option will be added to the parsing options every time you
create a parsing context **even if the application never requested
XML_PARSE_NOENT**.
Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.
Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
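The save/force/restore idea behind the sanitizing helpers can be sketched
like this. It is only an illustration under assumptions: the global variable
and both functions are hypothetical stand-ins, not the real libxml2 globals
or the actual macro bodies.

```c
#include <assert.h>

/* Hypothetical stand-in for a library-wide default such as the one toggled
 * by xmlSubstituteEntitiesDefault(); not the real libxml2 variable. */
static int substitute_entities_default = 0;

/* Hypothetical parser entry point that implicitly reads the global. */
static int parse_uses_entity_substitution(void)
{
    return substitute_entities_default != 0;
}

/* Save the global, force a known value, run the parse, then restore the
 * previous value so other code still sees what it configured. */
static int parse_with_sanitized_globals(void)
{
    int saved = substitute_entities_default; /* remember the old value */
    substitute_entities_default = 0;         /* force the known default */

    int substituted = parse_uses_entity_substitution();

    substitute_entities_default = saved;     /* restore */
    assert(substitute_entities_default == saved);
    return substituted;
}
```

Even if other code sets the stand-in global to 1 beforehand, the wrapped
parse runs without entity substitution and the global is 1 again afterwards.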
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.
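A minimal sketch of the pitfall, not the actual patch: tri_state is a
hypothetical stand-in for the libxml2 field, and the helper names are
invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* tri_state models an int field where 1 = set, 0 = not set and a negative
 * value = unspecified (an assumption based on the description above). */

/* Buggy pattern: any non-zero int, including a negative one, casts to true. */
static bool flag_as_bool_naive(int tri_state)
{
    return (bool) tri_state;
}

/* Fixed pattern: only a strictly positive value counts as true. */
static bool flag_as_bool_checked(int tri_state)
{
    return tri_state > 0;
}

/* Usage: the unspecified state must not be reported as true. */
static void check_unspecified_is_false(void)
{
    assert(flag_as_bool_naive(-1));
    assert(!flag_as_bool_checked(-1));
}
```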
Closes GH-11793.
* PHP-8.2:
Fix empty argument cases for DOMParentNode methods
Fix DOMCharacterData::replaceWith() with itself
Fix incorrect attribute existence check in DOMElement::setAttributeNodeNS()
Fix DOMEntity field getter bugs
For the past 20 years this threw a "not yet implemented" exception. But
the function was actually there (albeit not documented) and could be called...
Closes GH-11333.
* PHP-8.2:
Fix GH-11455: Segmentation fault with custom object date properties
Revert "Fix GH-11404: DOMDocument::savexml and friends ommit xmlns="" declaration for null namespace, creating incorrect xml representation of the DOM"