Commit graph

1027 commits

Author SHA1 Message Date
Niels Dossche
bffc74474b Add missing EXTENSIONS section to DOM tests 2023-08-26 18:37:42 +02:00
Niels Dossche
20ac42e1b0 Fix memory leak when setting an invalid DOMDocument encoding
Because the failure path did not release the string, there was a memory
leak.
As the only valid types for this function are IS_NULL and IS_STRING, we
and IS_NULL is always rejected in practice, solve the issue by not using
a function that increments the refcount in the first place.

Closes GH-12002.
2023-08-20 14:05:26 +02:00
Niels Dossche
d19e4da125 Fix segfault when DOMParentNode::prepend() is called when the child disappears
Closes GH-11906.
2023-08-08 20:06:39 +02:00
Niels Dossche
df6e8bd4fd Fix viable next sibling search for replaceWith
Closes GH-11888.
2023-08-07 20:23:06 +02:00
Niels Dossche
dddd309da4 Fix GH-11830: ParentNode methods should perform their checks upfront
Closes GH-11887.
2023-08-07 19:39:05 +02:00
Niels Dossche
08c4db7f36 Fix manually calling __construct() on DOM classes
Closes GH-11894.
2023-08-07 19:37:47 +02:00
Niels Dossche
6e468bbd3b Fix json_encode result on DOMDocument
According to https://www.php.net/manual/en/class.domdocument:
  When using json_encode() on a DOMDocument object the result will be
  that of encoding an empty object.

But this was broken in 8.1. The output was `{"config": null}`.
That's because the config property is defined with a default value of
NULL, hence it was included. The other properties are not included
because they don't have a default property, and nothing is ever written
to their backing field. Hence, the JSON encoder excludes them.
Similarly, `(array) $doc` would yield the same `config` key in the
array.

Closes GH-11840.
2023-08-01 17:28:51 +02:00
Ben Ramsey
ebbccb3dc6
Merge branch 'PHP-8.0' into PHP-8.1 2023-07-31 20:01:03 -05:00
Niels Dossche
62228a2568
Disable global state test on Windows
It looks like the config.w32 uses CHECK_HEADER_ADD_INCLUDE to add the include
path to libxml into the search path.
That doesn't happen in zend-test.
To add to the Windows trouble, libxml is statically linked in, ext/libxml can
only be built statically but ext/zend-test can be built both statically and
dynamically.
So the regression tests won't work in all possible configurations anyway on Windows.
All of this is no problem on Linux because it just uses dynamic linking
and pkg-config, without any magic.

Signed-off-by: Ben Ramsey <ramsey@php.net>
2023-07-31 19:55:10 -05:00
Derick Rethans
0870ebb862 Merge branch 'PHP-8.0' into PHP-8.1 2023-07-31 19:53:43 +01:00
Niels Dossche
c283c3ab0b Sanitize libxml2 globals before parsing
Fixes GHSA-3qrf-m4j2-pcrr.

To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports providing default set options.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.

Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.

Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
2023-07-31 19:47:19 +01:00
Niels Dossche
bed0e54104 Fix DOM test 2023-07-26 18:05:24 +02:00
Niels Dossche
bf4e7bd3ed Fix GH-11791: Wrong default value of DOMDocument::xmlStandalone
At one point this was changed from a bool to an int in libxml2, with
negative values meaning it is unspecified. Because it is cast to a bool
this therefore returned true instead of the expected false.

Closes GH-11793.
2023-07-26 17:20:10 +02:00
Niels Dossche
abb1d2e824 Fix empty argument cases for DOMParentNode methods
Closes GH-11768.
2023-07-24 18:58:39 +02:00
Niels Dossche
1cf2d216a2 Fix DOMCharacterData::replaceWith() with itself
Previously, when replacing the node with itself (or contained within
itself), the node disappeared.

Closes GH-11770.
2023-07-24 18:58:17 +02:00
Niels Dossche
168bc8146f Fix incorrect attribute existence check in DOMElement::setAttributeNodeNS()
Closes GH-11776.
2023-07-24 18:57:16 +02:00
Niels Dossche
d439ee18ed Fix DOMEntity field getter bugs
- publicId could crash PHP if none was provided
- notationName never worked

The fields of this classs were untested. This new test file changes that.

Closes GH-11779.
2023-07-24 18:55:51 +02:00
Niels Dossche
5c26258eeb Handle fragments consisting out of multiple children without a single root correctly
Closes GH-11698.
2023-07-13 16:09:04 +02:00
Niels Dossche
48b246e038 Add regression test for GH-11682
This bug was already fixed via 15ff830, but we really need more
test coverage.

Co-authored-by: Arne Blankerts <arne@blankerts.de>
2023-07-11 23:02:01 +02:00
Niels Dossche
15ff830373 Fix GH-11625: DOMElement::replaceWith() doesn't replace node with DOMDocumentFragment but just deletes node or causes wrapping <></> depending on libxml2 version
Depending on the libxml2 version, the behaviour is either to not
render the fragment correctly, or to wrap it inside <></>. Fix it by
unpacking fragments manually. This has the side effect that we need to
move the unlinking check in the replacement function to earlier because
the empty child list is now possible in non-error cases.
Also fixes a mistake in the linked list management.

Closes GH-11627.
2023-07-10 13:29:31 +02:00
nielsdos
c174ebfce0 Revert "Fix GH-11404: DOMDocument::savexml and friends ommit xmlns="" declaration for null namespace, creating incorrect xml representation of the DOM"
This reverts commit 7eb3e9cd17.

Although the fix follows the spec, it causes issues because a lot of old
code assumes the incorrect behaviour PHP had since a long time.
We cannot do this yet, especially not in a stable release.
We revert this for the time being.
See GH-11428.
2023-06-19 19:37:46 +02:00
Niels Dossche
9f7d88802e Fix #80332: Completely broken array access functionality with DOMNamedNodeMap
The problem is the usage of zval_get_long(). In particular, if the
string is non-numeric the result of zval_get_long() will be 0 without
giving an error or warning. This is misleading for users: users get the
impression that they can use strings to access the map because it
coincidentally works for the first item (which is at index 0). Of
course, this fails with any other index which causes confusion and bugs.

This patch adds proper support for using string offsets while accessing
the map. It does so by detecting if it's a non-numeric string, and then
using the getNamedItem() method instead of item(). I had to split up the
array access implementation code for DOMNodeList and DOMNamedNodeMap
first to be able to do this.

Closes GH-11468.
2023-06-18 14:59:19 +02:00
nielsdos
7eb3e9cd17 Fix GH-11404: DOMDocument::savexml and friends ommit xmlns="" declaration for null namespace, creating incorrect xml representation of the DOM
The NULL namespace is only correct when there is no default namespace
override. When there is, we need to manually set it to the empty string
namespace.

Closes GH-11428.
2023-06-17 13:36:00 +02:00
nielsdos
b30be40b86 Fix bug #55294 and #47530 and #47847: namespace reconciliation issues
We'll use the DOM wrapper version of libxml2 instead of the regular one.
It's conforming to the behaviour we expect of DOM.
Most of this patch is tests.

I based and extended the tests on the code attached with the aforementioned
bug reports. Therefore the credits for the tests:
Co-authored-by: hilse at web dot de
Co-authored-by: robin2008 at altruists dot org
Co-authored-by: sgunderson at bigfoot dot com

We'll also change the searching point of the internal reconciliation to
start at the top of the added tree to avoid redundant work now that the
function is changed.

Closes GH-11454.
2023-06-15 21:50:00 +02:00
Niels Dossche
10d94aca4c Fix "invalid state error" with cloned namespace declarations
Closes GH-11429.
2023-06-13 17:30:18 +02:00
Niels Dossche
e309fd8461 Fix lifetime issue with getAttributeNodeNS()
It's the same issue that I fixed previously in GH-11402, but in a
different place.

Closes GH-11422.
2023-06-13 17:29:37 +02:00
nielsdos
f2d673fb18 Fix #70359 and #78577: segfaults with DOMNameSpaceNode
* Fix type confusion and parent reference
* Manually manage the lifetime of the parent
* Add regression tests
* Break out to a helper, and apply the use-after-free fix to xpath

Closes GH-11402.
2023-06-09 21:35:55 +02:00
Niels Dossche
0e34ac864a Fix bug #77686: Removed elements are still returned by getElementById
From the moment an ID is created, libxml2's behaviour is to cache that element,
even if that element is not yet attached to the document. Similarly, only upon
destruction of the element the ID is actually removed by libxml2.
Since libxml2 has such behaviour deeply ingrained in the library, and uses the
cache for various purposes, it seems like a bad idea and lost cause to fight it.
Instead, we'll simply walk the tree upwards to check if the node is attached to
the document.

Closes GH-11369.
2023-06-04 16:20:34 +02:00
Niels Dossche
23f7002527 Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
Closes GH-11363.
2023-06-04 16:19:48 +02:00
Niels Dossche
b1d8e240e6 Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
The test was amended from the original issue report. For the test:
Co-authored-by: php@deep-freeze.ca

The problem is that the regular dom_reconcile_ns() only works on a
single node. We actually have to reconciliate the whole tree in case a
fragment was added. This also required to move some code around such
that this special case could be handled separately.

Closes GH-11362.
2023-06-04 16:19:04 +02:00
nielsdos
7812772105 Fix GH-11347: Memory leak when calling a static method inside an xpath query
It's a type confusion bug. `zend_make_callable` may change the function name
of the fci to become an array, causing a crash in debug mode on
`zval_ptr_dtor_str(&fci.function_name);` in `dom_xpath_ext_function_php`.
On a production build it doesn't crash but only causes a leak, because
the array elements are not destroyed, only the array container itself
is. We can use the nogc variant because it cannot contain cycles, the
potential array can only contain 2 strings.

Closes GH-11350.
2023-05-31 17:14:57 +02:00
nielsdos
b374ec399d Fix DOMElement::append() and DOMElement::prepend() hierarchy checks
We could end up in an invalid hierarchy, resulting in infinite loops and
eventual crashes if we don't check for the DOM hierarchy validity.

Closes GH-11344.
2023-05-30 17:36:26 +02:00
Niels Dossche
154c251013 Fix spec compliance error for DOMDocument::getElementsByTagNameNS
Spec link: https://dom.spec.whatwg.org/#concept-getelementsbytagnamens
Spec says we should match any namespace when '*' is provided. This was
however not the case: elements that didn't have a namespace were not
returned. This patch fixes the error by modifying the namespace check.

Closes GH-11343.
2023-05-30 17:35:38 +02:00
divinity76
761b9a44f8 Fix return value in stub file for DOMNodeList::item
Not explicitly documenting the possibility of returning DOMElement causes
the Intelephense linter (a popular PHP linter with ~9 million downloads:
https://marketplace.visualstudio.com/items?itemName=bmewburn.vscode-intelephense-client)
to think this code is bad:

  $xp->query("whatever")->item(0)->getAttribute("foo");

DOMNode does not have getAttribute (while DOMElement does).
Documenting the DOMElement return type should fix Intelephense's linter.

Closes GH-11342.
2023-05-29 18:49:26 +02:00
Niels Dossche
c473787abb Fix GH-10234: Setting DOMAttr::textContent results in an empty attribute value
We can't directly call xmlNodeSetContent, because it might encode the string
through xmlStringLenGetNodeList for types
XML_DOCUMENT_FRAG_NODE, XML_ELEMENT_NODE, XML_ATTRIBUTE_NODE.
In these cases we need to use a text node to avoid the encoding.
For the other cases, we *can* rely on xmlNodeSetContent because it is either
a no-op, or handles the content without encoding and clears the properties
field if needed.

The test was taken from the issue report, for the test:
Co-authored-by: ThomasWeinert <thomas@weinert.info>

Closes GH-10245.
2023-05-29 14:10:59 +02:00
nielsdos
cba335d61e Fix GH-11288 and GH-11289 and GH-11290 and GH-9142: DOMExceptions and segfaults with replaceWith
This replaces the implementation of before and after with one following
the spec very strictly, instead of trying to figure out the state we're
in by looking at the pointers. Also relaxes the condition on text node
copying to prevent working on a stale node pointer.

Closes GH-11299.
2023-05-25 23:04:19 +02:00
Niels Dossche
7c0dfc5cf5 Fix GH-11160: Few tests failed building with new libxml 2.11.0
It's possible to categorise the failures into 2 categories:
  - Changed error message. In this case we either duplicate the test and
    modify the error message. Or if the change in error message is
    small, we use the EXPECTF matchers to make the test compatible with both
    old and new versions of libxml2.
  - Missing warnings. This is caused by a change in libxml2 where the
    parser started using SAX APIs internally [1]. In this case the
    error_type passed to php_libxml_internal_error_handler() changed from
    PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
    started to use the SAX handlers instead of the generic handlers.
    However, for the SAX handlers the current input stack is empty, so
    nothing is actually printed. I fixed this by falling back to a
    regular warning without a filename & line number reference, which
    mimicks the old behaviour. Furthermore, this change now also shows
    an additional warning in a test which was previously hidden.

[1] 9a82b94a94

Closes GH-11162.
2023-05-06 23:10:07 +02:00
Niels Dossche
0579beb842 Fix incorrect error handling in dom_zvals_to_fragment()
Discovered this pre-existing problem while testing GH-10682.
Note: this problem existed *before* that PR.

* Not all paths throw a hierarchy request error
* xmlFreeNode must be used instead of xmlFree for the fragment to also
  free its children.
* Free up nodes that couldn't be added when xmlAddChild fails.

I unified the error handling code that's exactly the same with a goto to
prevent at least some of such problems in the future.

Closes GH-10981.
2023-04-03 21:21:35 +02:00
NathanFreeman
2d6decc14c Fix bug #80602: Segfault when using DOMChildNode::before()
This furthermore fixes the logic error explained in
https://github.com/php/php-src/pull/8729#issuecomment-1161737132

Closes GH-10682.
2023-03-30 20:49:05 +02:00
Stanislav Malyshev
85d9278db2 Merge branch 'PHP-8.0' into PHP-8.1 2023-02-12 21:33:39 -07:00
Niels Dossche
ec10b28d64 Fix array overrun when appending slash to paths
Fix it by extending the array sizes by one character. As the input is
limited to the maximum path length, there will always be place to append
the slash. As the php_check_specific_open_basedir() simply uses the
strings to compare against each other, no new failures related to too
long paths are introduced.
We'll let the DOM and XML case handle a potentially too long path in the
library code.
2023-02-12 20:56:19 -07:00
George Peter Banyard
a4acba9e52
Add missing EXTENSION section to tests 2022-10-27 14:39:43 +01:00
Christoph M. Becker
9bd9e9a867
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #79451: DOMDocument->replaceChild on doctype causes double free
2022-08-19 18:13:48 +02:00
NathanFreeman
6027d441c1
Fix #79451: DOMDocument->replaceChild on doctype causes double free
We have to reset intSubset if replacing doctype with another doctype node.

Closes GH-9201.
Closes GH-9376.
2022-08-19 18:10:06 +02:00
George Peter Banyard
eb8ea14c66 Merge branch 'PHP-8.0' into PHP-8.1 2022-08-19 13:57:19 +01:00
George Peter Banyard
d6831e9a5c Revert Fixed bug #79451
The fix for 8.1 and above is not identical and I don't know how to fix without breaking the whole build apparently
2022-08-19 13:54:54 +01:00
George Peter Banyard
5739dd0030 Fix bad merge 2022-08-19 13:17:57 +01:00
George Peter Banyard
c36a1ea1ae Merge branch 'PHP-8.0' into PHP-8.1 2022-08-19 12:52:58 +01:00
NathanFreeman
1d4300d870 Fix bug #79451: Using DOMDocument->replaceChild on doctype causes double free
Closes GH-9201
2022-08-19 12:46:23 +01:00
Stanislav Malyshev
9de4eb9e37
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:29:59 -08:00