Commit graph

144 commits

Author SHA1 Message Date
Niels Dossche
ceca599649
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:07:48 +02:00
Niels Dossche
4049594adf
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:03:30 +02:00
Niels Dossche
b282dd749f
Fix UAF when removing doctype and using foreach iteration
This is an old bug, but this is pretty easy to fix.
It's basically applying the same fix as I did for e878b9f.
Reported by YuanchengJiang.

Closes GH-15143.
2024-07-30 20:01:22 +02:00
Niels Dossche
80a4783d25
Deduplicate NULL checks in ext/dom (#15015)
This introduces a new helper php_dom_create_nullable_object() that does
the NULL check and puts NULL in return_value. Otherwise it runs
php_dom_create_object(). This deduplicates a bit of code.
2024-07-18 21:20:03 +02:00
Niels Dossche
b3a4a6b1e1
Resolve TODOs in ext/dom around nullable content (#14999)
It's indeed possible this is NULL. When you create a new text-like node
in libxml and pass NULL as content, you do get NULL in the content field
instead of the empty string. You can hit this by creating DOMText or
DOMComment directly and not passing any argument. This could also be
created internally.
We refactor the code such that this detail is hidden and we add a test
to check that it correctly throws an exception.
2024-07-18 00:05:40 +02:00
Niels Dossche
4ef7539144
Split off private data from the ns mapper 2024-07-15 11:02:52 +02:00
Niels Dossche
8825235348
Reapply "Stop using reserved names in dom"
This reverts commit dda96768ec.
2024-07-08 17:27:39 +02:00
Niels Dossche
dda96768ec
Revert "Stop using reserved names in dom"
This reverts commit 013bc53f0c.

This somehow breaks the Windows build. Will investigate later.
2024-07-08 16:07:32 +02:00
Niels Dossche
013bc53f0c Stop using reserved names in dom 2024-07-08 06:09:04 -07:00
Niels Dossche
3303c15754 Make some pointers const in php_dom.h 2024-07-08 06:09:04 -07:00
Niels Dossche
fc09f4b2bc
Implement Dom\TokenList (#13664)
Part of RFC: https://wiki.php.net/rfc/dom_additions_84

Closes GH-11688.
2024-07-02 21:34:23 +02:00
Niels Dossche
768900b180 Implement Dom $innerHTML property 2024-07-02 11:15:38 -07:00
Niels Dossche
88da914910 Implement CSS selectors 2024-06-29 13:00:26 -07:00
Niels Dossche
8dc2391bae
Fix bug #79701: getElementById does not correctly work with duplicate definitions
This is a long standing bug: IDs aren't properly tracked causing either
outdated or plain incorrect results from getElementById.

This PR implements a pragmatic solution in which we still try to use the
ID lookup table to a degree, but only as a performance boost not as a
"single source of truth". Full details are explained in the
getElementById code.

Closes GH-14349.
2024-06-01 12:55:05 +02:00
Niels Dossche
ab80392710
Cleanup DOM exception throwing parameters (#14330) 2024-05-26 14:01:37 +02:00
Levi Morrison
c461b60060
refactor: change zend_is_true to return bool (#14301)
Previously this returned `int`. Many functions actually take advantage
of the fact this returns exactly 0 or 1. For instance,
`main/streams/xp_socket.c` does:

    sockopts |= STREAM_SOCKOP_IPV6_V6ONLY_ENABLED * zend_is_true(tmpzval);

And `Zend/zend_compile.c` does:

    child = &ast->child[2 - zend_is_true(zend_ast_get_zval(ast->child[0]))];

I changed a few places trivially from `int` to `bool`, but there are
still many places such as the object handlers which return `int` that
should eventually be `bool`.
2024-05-24 15:16:36 -06:00
Niels Dossche
e95b06c5ad Make some more arguments const 2024-05-13 19:46:51 +02:00
Niels Dossche
1fdbb0aba6 Get rid of unused declarations 2024-05-13 19:46:51 +02:00
Niels Dossche
44485892df Factor out all common code for XML serialization and merge common paths 2024-05-11 18:09:39 +02:00
Niels Dossche
e9355fa162 Remove unused prototype from php_dom.h 2024-05-11 01:21:15 +02:00
Niels Dossche
fae25ca2df Move dom_attr_value() into ext/libxml 2024-05-05 10:14:40 +02:00
Niels Dossche
6f989cdb75
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:53:48 +02:00
Niels Dossche
461d890f0a
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:38:32 +02:00
Niels Dossche
e878b9f390
Fix crashes when entity declaration is removed while still having entity references
libxml doesn't do reference counting inside its node types. It's
possible to remove an entity declaration out of the document, but then
entity references will keep pointing to that stale declaration. This
will cause crashes.

One idea would be to check when a declaration is removed, to trigger a
hook that updates all references. However this means we have to keep
track of all references somehow, which would be a high-overhead
solution. The solution in this patch makes sure that the fields are
always updated before they are read.

Closes GH-14089.
2024-04-30 22:29:44 +02:00
Niels Dossche
b3f820b408
Split off nodelist header components to nodelist.h 2024-04-14 14:45:46 +02:00
Niels Dossche
53f6e5ecd8
Move node list dimension handling to a separate file 2024-04-14 14:45:46 +02:00
Niels Dossche
ac039cf716
Implement HTMLCollection::namedItem() 2024-04-14 14:45:45 +02:00
Niels Dossche
5c69b2e86f
Factor out reading an attribute value 2024-04-14 14:45:45 +02:00
Niels Dossche
78ccea4e40
Fix GH-13863: Removal during NodeList iteration breaks loop
The list is live, so upon cache invalidation we should rewalk the tree
to sync the index again with the node list. We keep the legacy behaviour
for the old DOM classes.

Closes GH-13934.
2024-04-10 19:07:59 +02:00
Niels Dossche
15259a0a6c Factor out common "first container's child" code 2024-04-03 18:15:43 +02:00
Niels Dossche
fc9b58f602 Remove duplicated code for entity vs notation handling 2024-04-03 18:15:43 +02:00
Niels Dossche
e1630381b7
Fix GH-13764: xsl cannot build on PHP 8.4 (#13770)
Move some of the DOM APIs from the non-public php_dom.h header to the
public header xml_common.h.
2024-03-20 19:03:09 +01:00
Niels Dossche
751163d18e Change stricterror type to bool 2024-03-10 11:08:46 +01:00
Niels Dossche
14b6c981c3
[RFC] Add a way to opt-in ext/dom spec compliance (#13031)
RFC: https://wiki.php.net/rfc/opt_in_dom_spec_compliance
2024-03-09 16:56:00 +01:00
Niels Dossche
90785dd865
[RFC] Improve callbacks in ext/dom and ext/xsl (#12627) 2024-01-13 00:00:26 +01:00
Niels Dossche
ec79fc9d9c Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-12870: Creating an xmlns attribute results in a DOMException
2023-12-07 22:51:02 +01:00
Niels Dossche
e658f80501 Fix GH-12870: Creating an xmlns attribute results in a DOMException
There were multiple things here since forever, see the GH thread [1]
for discussion.

There were already many fixes to this function previously, and as a
consequence of one of those fixes this started throwing exceptions for a
correct use-case. It turns out that even when reverting to the previous
behaviour there are still bugs. Just fix all of them while we have the
chance.

[1] https://github.com/php/php-src/issues/12870

Closes GH-12888.
2023-12-07 22:42:32 +01:00
Niels Dossche
1492be5286
[RFC] DOM HTML5 parsing and serialization support (#12111) 2023-11-13 20:18:19 +01:00
Niels Dossche
3e33eda39a Fix cloning attribute with namespace disappearing namespace
Closes GH-12547.
2023-10-29 17:22:41 +01:00
Niels Dossche
b56141c5dd Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix broken cache invalidation with deallocated and reallocated document node
2023-10-01 17:07:11 +02:00
Niels Dossche
eebc528cbf Fix broken cache invalidation with deallocated and reallocated document node
The original caching implementation had an oversight in combination with
the new lifetime management in DOM for 8.3.
The modification counter is stored on the document object itself, but as
that can get deallocated when all references disappear, stale cache data
can be used. Normally this isn't a problem, unless getElementsByTagName is
called not on the document but on a child node. Fix it by moving caching
data into the ref object, which will outlive all nodes from a document
even if the document object disappears.

Closes GH-12338.
2023-10-01 17:06:02 +02:00
Niels Dossche
ac62eee842
Remove DOM_NO_ARGS() and DOM_NOT_IMPLEMENTED() (#12147)
DOM_NO_ARGS() has no users.
DOM_NOT_IMPLEMENTED() has a single user.
2023-09-08 17:02:52 +02:00
Niels Dossche
d46dc5694c Fix various namespace prefix conflict resolution bugs and namespace shift bugs
There are two linked issues:

- Conflicts couldn't be resolved by changing the prefix name.
- Lacking a prefix would shift the namespace as the default namespace,
  causing elements to suddenly become part of the namespace instead of
  the attributes.

The output could still be improved by removing redundant namespace
declarations, but that's another issue. At least the output is
correct now.

Closes GH-11777.
2023-08-15 20:42:42 +02:00
Niels Dossche
bb092ab4c6 Fix #80927: Removing documentElement after creating attribute node: possible use-after-free
Closes GH-11892.
2023-08-12 18:49:12 +02:00
Niels Dossche
a73f38f407
Implement DOMElement::insertAdjacent{Element,Text} (#11700)
* Implement DOMElement::insertAdjacent{Element,Text}

ref: https://dom.spec.whatwg.org/#dom-element-insertadjacentelement
ref: https://dom.spec.whatwg.org/#dom-element-insertadjacenttext

Closes GH-11700.
2023-07-17 17:42:47 +02:00
Niels Dossche
d38cc9b9b6 Implement DOMNode::isConnected and DOMNameSpaceNode::isConnected
ref: https://dom.spec.whatwg.org/#dom-node-isconnected

Closes GH-11677.
2023-07-17 13:14:13 +02:00
Niels Dossche
6560c9bf8e Implement DOMParentNode::replaceChildren()
ref: https://dom.spec.whatwg.org/#dom-parentnode-replacechildren
2023-07-14 14:34:29 +02:00
Niels Dossche
b3899eb569 Refactor dom_node_node_name_read() to avoid double allocation
We will use this helper later outside of the node name read handler.
2023-07-13 16:18:10 +02:00
Niels Dossche
87e7b61d8f Cleanup macro usage in ext/dom and ext/libxml
With the new lifetime system in place, we can simplify some macros.
2023-07-03 21:32:48 +02:00
nielsdos
941a7e59d9 Avoid allocation when getting the node content, if possible
Closes GH-11543.
2023-06-27 18:00:44 +02:00