Commit graph

127 commits

Author SHA1 Message Date
Niels Dossche
1fdbb0aba6 Get rid of unused declarations 2024-05-13 19:46:51 +02:00
Niels Dossche
44485892df Factor out all common code for XML serialization and merge common paths 2024-05-11 18:09:39 +02:00
Niels Dossche
e9355fa162 Remove unused prototype from php_dom.h 2024-05-11 01:21:15 +02:00
Niels Dossche
fae25ca2df Move dom_attr_value() into ext/libxml 2024-05-05 10:14:40 +02:00
Niels Dossche
6f989cdb75
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:53:48 +02:00
Niels Dossche
461d890f0a
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:38:32 +02:00
Niels Dossche
e878b9f390
Fix crashes when entity declaration is removed while still having entity references
libxml doesn't do reference counting inside its node types. It's
possible to remove an entity declaration out of the document, but then
entity references will keep pointing to that stale declaration. This
will cause crashes.

One idea would be to check when a declaration is removed, to trigger a
hook that updates all references. However this means we have to keep
track of all references somehow, which would be a high-overhead
solution. The solution in this patch makes sure that the fields are
always updated before they are read.

Closes GH-14089.
2024-04-30 22:29:44 +02:00
Niels Dossche
b3f820b408
Split off nodelist header components to nodelist.h 2024-04-14 14:45:46 +02:00
Niels Dossche
53f6e5ecd8
Move node list dimension handling to a separate file 2024-04-14 14:45:46 +02:00
Niels Dossche
ac039cf716
Implement HTMLCollection::namedItem() 2024-04-14 14:45:45 +02:00
Niels Dossche
5c69b2e86f
Factor out reading an attribute value 2024-04-14 14:45:45 +02:00
Niels Dossche
78ccea4e40
Fix GH-13863: Removal during NodeList iteration breaks loop
The list is live, so upon cache invalidation we should rewalk the tree
to sync the index again with the node list. We keep the legacy behaviour
for the old DOM classes.

Closes GH-13934.
2024-04-10 19:07:59 +02:00
Niels Dossche
15259a0a6c Factor out common "first container's child" code 2024-04-03 18:15:43 +02:00
Niels Dossche
fc9b58f602 Remove duplicated code for entity vs notation handling 2024-04-03 18:15:43 +02:00
Niels Dossche
e1630381b7
Fix GH-13764: xsl cannot build on PHP 8.4 (#13770)
Move some of the DOM APIs from the non-public php_dom.h header to the
public header xml_common.h.
2024-03-20 19:03:09 +01:00
Niels Dossche
751163d18e Change stricterror type to bool 2024-03-10 11:08:46 +01:00
Niels Dossche
14b6c981c3
[RFC] Add a way to opt-in ext/dom spec compliance (#13031)
RFC: https://wiki.php.net/rfc/opt_in_dom_spec_compliance
2024-03-09 16:56:00 +01:00
Niels Dossche
90785dd865
[RFC] Improve callbacks in ext/dom and ext/xsl (#12627) 2024-01-13 00:00:26 +01:00
Niels Dossche
ec79fc9d9c Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-12870: Creating an xmlns attribute results in a DOMException
2023-12-07 22:51:02 +01:00
Niels Dossche
e658f80501 Fix GH-12870: Creating an xmlns attribute results in a DOMException
There were multiple things here since forever, see the GH thread [1]
for discussion.

There were already many fixes to this function previously, and as a
consequence of one of those fixes this started throwing exceptions for a
correct use-case. It turns out that even when reverting to the previous
behaviour there are still bugs. Just fix all of them while we have the
chance.

[1] https://github.com/php/php-src/issues/12870

Closes GH-12888.
2023-12-07 22:42:32 +01:00
Niels Dossche
1492be5286
[RFC] DOM HTML5 parsing and serialization support (#12111) 2023-11-13 20:18:19 +01:00
Niels Dossche
3e33eda39a Fix cloning attribute with namespace disappearing namespace
Closes GH-12547.
2023-10-29 17:22:41 +01:00
Niels Dossche
b56141c5dd Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix broken cache invalidation with deallocated and reallocated document node
2023-10-01 17:07:11 +02:00
Niels Dossche
eebc528cbf Fix broken cache invalidation with deallocated and reallocated document node
The original caching implementation had an oversight in combination with
the new lifetime management in DOM for 8.3.
The modification counter is stored on the document object itself, but as
that can get deallocated when all references disappear, stale cache data
can be used. Normally this isn't a problem, unless getElementsByTagName is
called not on the document but on a child node. Fix it by moving caching
data into the ref object, which will outlive all nodes from a document
even if the document object disappears.

Closes GH-12338.
2023-10-01 17:06:02 +02:00
Niels Dossche
ac62eee842
Remove DOM_NO_ARGS() and DOM_NOT_IMPLEMENTED() (#12147)
DOM_NO_ARGS() has no users.
DOM_NOT_IMPLEMENTED() has a single user.
2023-09-08 17:02:52 +02:00
Niels Dossche
d46dc5694c Fix various namespace prefix conflict resolution bugs and namespace shift bugs
There are two linked issues:

- Conflicts couldn't be resolved by changing the prefix name.
- Lacking a prefix would shift the namespace as the default namespace,
  causing elements to suddenly become part of the namespace instead of
  the attributes.

The output could still be improved by removing redundant namespace
declarations, but that's another issue. At least the output is
correct now.

Closes GH-11777.
2023-08-15 20:42:42 +02:00
Niels Dossche
bb092ab4c6 Fix #80927: Removing documentElement after creating attribute node: possible use-after-free
Closes GH-11892.
2023-08-12 18:49:12 +02:00
Niels Dossche
a73f38f407
Implement DOMElement::insertAdjacent{Element,Text} (#11700)
* Implement DOMElement::insertAdjacent{Element,Text}

ref: https://dom.spec.whatwg.org/#dom-element-insertadjacentelement
ref: https://dom.spec.whatwg.org/#dom-element-insertadjacenttext

Closes GH-11700.
2023-07-17 17:42:47 +02:00
Niels Dossche
d38cc9b9b6 Implement DOMNode::isConnected and DOMNameSpaceNode::isConnected
ref: https://dom.spec.whatwg.org/#dom-node-isconnected

Closes GH-11677.
2023-07-17 13:14:13 +02:00
Niels Dossche
6560c9bf8e Implement DOMParentNode::replaceChildren()
ref: https://dom.spec.whatwg.org/#dom-parentnode-replacechildren
2023-07-14 14:34:29 +02:00
Niels Dossche
b3899eb569 Refactor dom_node_node_name_read() to avoid double allocation
We will use this helper later outside of the node name read handler.
2023-07-13 16:18:10 +02:00
Niels Dossche
87e7b61d8f Cleanup macro usage in ext/dom and ext/libxml
With the new lifetime system in place, we can simplify some macros.
2023-07-03 21:32:48 +02:00
nielsdos
941a7e59d9 Avoid allocation when getting the node content, if possible
Closes GH-11543.
2023-06-27 18:00:44 +02:00
Niels Dossche
a37ad648b8 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix #80332: Completely broken array access functionality with DOMNamedNodeMap
2023-06-18 15:21:12 +02:00
Niels Dossche
9f7d88802e Fix #80332: Completely broken array access functionality with DOMNamedNodeMap
The problem is the usage of zval_get_long(). In particular, if the
string is non-numeric the result of zval_get_long() will be 0 without
giving an error or warning. This is misleading for users: users get the
impression that they can use strings to access the map because it
coincidentally works for the first item (which is at index 0). Of
course, this fails with any other index which causes confusion and bugs.

This patch adds proper support for using string offsets while accessing
the map. It does so by detecting if it's a non-numeric string, and then
using the getNamedItem() method instead of item(). I had to split up the
array access implementation code for DOMNodeList and DOMNamedNodeMap
first to be able to do this.

Closes GH-11468.
2023-06-18 14:59:19 +02:00
Niels Dossche
30c5ae4219 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix #70359 and #78577: segfaults with DOMNameSpaceNode
2023-06-09 21:46:05 +02:00
nielsdos
f2d673fb18 Fix #70359 and #78577: segfaults with DOMNameSpaceNode
* Fix type confusion and parent reference
* Manually manage the lifetime of the parent
* Add regression tests
* Break out to a helper, and apply the use-after-free fix to xpath

Closes GH-11402.
2023-06-09 21:35:55 +02:00
Tim Starling
74910b1403 Factor out dom_remove_all_children()
A few callers remove all children of a node. The way it was done in
node.c was unsafe, because it left nodep->last dangling. It just happens
to not crash if xmlNodeSetContent() is called immediately afterwards.
2023-06-05 20:04:40 +02:00
Niels Dossche
a720268214
Use uint32_t for the number of nodes (#11371) 2023-06-05 13:57:41 +01:00
Niels Dossche
e8fb0edc69 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix bug #77686: Removed elements are still returned by getElementById
  Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
  Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
2023-06-04 16:34:42 +02:00
Niels Dossche
23f7002527 Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
Closes GH-11363.
2023-06-04 16:19:48 +02:00
Niels Dossche
b1d8e240e6 Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
The test was amended from the original issue report. For the test:
Co-authored-by: php@deep-freeze.ca

The problem is that the regular dom_reconcile_ns() only works on a
single node. We actually have to reconciliate the whole tree in case a
fragment was added. This also required to move some code around such
that this special case could be handled separately.

Closes GH-11362.
2023-06-04 16:19:04 +02:00
Niels Dossche
c3f0797385
Implement iteration cache, item cache and length cache for node list iteration (#11330)
* Implement iteration cache, item cache and length cache for node list iteration

The current implementation follows the spec requirement that the list
must be "live". This means that changes in the document must be
reflected in the existing node lists without requiring the user to
refetch the node list.
The consequence is that getting any item, or the length of the list,
always starts searching from the root element of the node list. This
results in O(n) time to get any item or the length. If there's a for
loop over the node list, this means the iterations will take O(n²) time
in total. This causes real-world performance issues with potential for
downtime (see GH-11308 and its references for details).

We fix this by introducing a caching strategy. We cache the last
iterated object in the iterator, the last requested item in the node
list, and the last length computation. To invalidate the cache, we
simply count the number of modifications made to the containing
document. If the modification number does not match what the number was
during caching, we know the document has been modified and the cache is
invalid. If this ever overflows, we saturate the modification number and
don't do any caching anymore. Note that we don't check for overflow on
64-bit systems because it would take hundreds of years to overflow.

Fixes GH-11308.
2023-06-03 00:13:14 +02:00
nielsdos
c6655fb719 Implement dom_get_doc_props_read_only()
I was surprised to see that getting the stricterror property showed in
in the Callgrind profile of some tests. Turns out we sometimes allocate
them. Fix this by returning the default in case no changes were made yet.

Closes GH-11345.
2023-05-30 17:48:29 +02:00
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
George Peter Banyard
d842bc7e22
Refactor dom_has_feature() to use zend_string* 2021-04-09 18:45:08 +01:00
Máté Kocsis
4c6533c257
Generate class entries from stubs for com, standard, xmlreader, xmlwriter, xsl, zip, Zend
Closes GH-6706
2021-02-22 15:24:03 +01:00
George Peter Banyard
8fef83dd3c Promote warnings to error in DOM extension
Closes GH-5418
2020-09-22 19:12:32 +01:00
Nikita Popov
634dd38289 Throw Error exception in DOM_GET_OBJ
Per general convention for handling of uninitialized objects.
2020-08-13 14:43:40 +02:00
George Peter Banyard
62b1d2cb69 Fix [-Wundef] warning in DOM extension 2020-05-16 15:31:15 +02:00