162 commits

Author SHA1 Message Date
Niels Dossche
91a310e603
Get rid of separate DOM HashPosition member (#18354)
Besides being used only for DOM_NODESET, which makes it a poor fit for
the iterator itself, it is also redundant now that we have the index
member.
2025-04-19 17:59:48 +02:00
Niels Dossche
33c4ca36e4
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix weird unpack behaviour in DOM
  Fix GH-17989: mb_output_handler crash with unset http_output_conv_mimetypes
2025-03-09 11:21:34 +01:00
Niels Dossche
aa6e58f82a
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix weird unpack behaviour in DOM
  Fix GH-17989: mb_output_handler crash with unset http_output_conv_mimetypes
2025-03-09 11:21:27 +01:00
Niels Dossche
9be9f70caa
Fix weird unpack behaviour in DOM
Engine pitfall: the iter index is only updated by foreach opcodes, so
the existing code, which used it as the node's index relative to the
start, did not work properly. Fix it by using our own counter.

Closes GH-18004.
2025-03-09 11:17:03 +01:00
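The pitfall described in this commit can be sketched in plain C. This is an illustrative model only, not the ext/dom code: `engine_iter`, `node_iter`, and `drain_without_foreach()` are hypothetical names, and `engine_index` merely stands in for an index that only one kind of caller advances.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an engine-owned iterator header. In this model,
 * engine_index is advanced only by one specific driver (think: foreach). */
typedef struct {
    size_t engine_index;
} engine_iter;

/* Hypothetical DOM-side iterator that tracks its own position. */
typedef struct {
    engine_iter base;
    size_t own_index; /* advanced on every move_forward, whoever calls it */
    size_t length;
} node_iter;

static bool node_iter_valid(const node_iter *it)
{
    return it->own_index < it->length;
}

static void node_iter_move_forward(node_iter *it)
{
    it->own_index++;
}

/* A consumer that is not foreach (for example, argument unpacking):
 * it never touches base.engine_index, so only own_index is reliable. */
static size_t drain_without_foreach(node_iter *it)
{
    size_t visited = 0;
    while (node_iter_valid(it)) {
        visited++;
        node_iter_move_forward(it);
    }
    return visited;
}
```

Because the iterator advances `own_index` itself, the position stays correct regardless of which caller drives the iteration.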
Niels Dossche
7be3649016
Cleanup iterator instantiation code (#17358)
Just using object_init_ex() directly makes the code a bit simpler and
avoids unnecessary indirections.
2025-01-04 16:48:41 +01:00
Niels Dossche
49fa2e4651 Make some arguments of dom_get_elements_by_tag_name_ns_raw() const 2025-01-03 17:50:01 +01:00
Niels Dossche
59a0d00a5d Avoid string duplications in dom iterators 2025-01-03 17:50:01 +01:00
Niels Dossche
4c3aeec74f
Minor cleanups in namednodemap.c (#17340) 2025-01-03 17:33:29 +01:00
Niels Dossche
f576b81340
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-16906: Reloading document can cause UAF in iterator
2024-11-24 18:20:29 +01:00
Niels Dossche
52c7c74ebb
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-16906: Reloading document can cause UAF in iterator
2024-11-24 18:20:21 +01:00
Niels Dossche
9d39ff764e
Fix GH-16906: Reloading document can cause UAF in iterator
Closes GH-16909.
2024-11-24 18:19:45 +01:00
Niels Dossche
7f5a888bdb
Change dom_node_is_read_only() to return bool (#16757)
Returning int or zend_result doesn't make sense; it's a yes/no question.
2024-11-11 20:57:52 +01:00
Niels Dossche
a3b27c083f
Add Dom\Element::insertAdjacentHTML() (#16614) 2024-11-09 10:52:06 +01:00
Niels Dossche
81a2cd4dac
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix various document ref pointer mismanagements
2024-10-17 21:21:49 +02:00
Niels Dossche
5cb38e9d24
Fix various document ref pointer mismanagements
- Properly handle attributes
- Fix potential NULL dereference if the intern document pointer is NULL

Fixes GH-16336.
Fixes GH-16338.
Closes GH-16345.
2024-10-17 21:18:50 +02:00
Niels Dossche
73b7993b0d
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
2024-08-31 11:56:34 +02:00
Niels Dossche
9cb23a3dec
Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
There are implicit truncating casts from zend_long to int, which cause
issues because the checks are done against the zend_long values. Since
the iterator infrastructure uses zend_long, just convert everything to
zend_long.

Closes GH-15669.
2024-08-31 11:47:08 +02:00
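A minimal sketch of why such a mismatch is dangerous, assuming a 64-bit `zend_long` (modelled here by the hypothetical typedef `wide_index`). The helper names are illustrative and do not come from ext/dom.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for zend_long on a 64-bit build. */
typedef int64_t wide_index;

/* Bounds check performed on the wide value, as the commit describes. */
static bool index_in_range(wide_index index, wide_index length)
{
    return index >= 0 && index < length;
}

/* Reports whether the value would be preserved if it were later
 * narrowed to int. A value can pass index_in_range() and still
 * fail this test, which is the mismatch the commit removes. */
static bool survives_narrowing(wide_index index)
{
    return index >= INT_MIN && index <= INT_MAX;
}
```

Keeping one integer type end to end means the value that was checked is the same value that is used.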
Máté Kocsis
7e45e57d8f
Suppress deprecation notices when ext/dom properties are accessed by the get_debug_info handler (#15530) 2024-08-23 10:39:11 +02:00
Niels Dossche
ceca599649
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:07:48 +02:00
Niels Dossche
4049594adf
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:03:30 +02:00
Niels Dossche
b282dd749f
Fix UAF when removing doctype and using foreach iteration
This is an old bug, but it is pretty easy to fix.
It basically applies the same fix as I did for e878b9f.
Reported by YuanchengJiang.

Closes GH-15143.
2024-07-30 20:01:22 +02:00
Niels Dossche
80a4783d25
Deduplicate NULL checks in ext/dom (#15015)
This introduces a new helper, php_dom_create_nullable_object(), that
performs the NULL check: if the pointer is NULL, it puts NULL in
return_value; otherwise it runs php_dom_create_object(). This
deduplicates a bit of code.
2024-07-18 21:20:03 +02:00
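The shape of such a helper can be sketched as follows. The types and function names below (`node`, `result_value`, `create_object()`, `create_nullable_object()`) are hypothetical stand-ins, not the real signatures from php_dom.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a libxml node and a PHP return value. */
typedef struct { int type; } node;
typedef struct {
    bool is_null;
    const node *wrapped;
} result_value;

/* Models the non-nullable constructor: the caller must pass a real node. */
static void create_object(const node *n, result_value *rv)
{
    rv->is_null = false;
    rv->wrapped = n;
}

/* Models the nullable wrapper: the NULL check lives in one place
 * instead of being repeated at every call site. */
static void create_nullable_object(const node *n, result_value *rv)
{
    if (n == NULL) {
        rv->is_null = true;
        rv->wrapped = NULL;
        return;
    }
    create_object(n, rv);
}
```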
Niels Dossche
b3a4a6b1e1
Resolve TODOs in ext/dom around nullable content (#14999)
It is indeed possible for the content to be NULL. When you create a new
text-like node in libxml and pass NULL as content, the content field is
NULL rather than the empty string. You can hit this by creating a
DOMText or DOMComment directly without passing any argument. Such a
node could also be created internally.
We refactor the code so that this detail is hidden, and we add a test
to check that it correctly throws an exception.
2024-07-18 00:05:40 +02:00
Niels Dossche
4ef7539144
Split off private data from the ns mapper 2024-07-15 11:02:52 +02:00
Niels Dossche
8825235348
Reapply "Stop using reserved names in dom"
This reverts commit dda96768ec.
2024-07-08 17:27:39 +02:00
Niels Dossche
dda96768ec
Revert "Stop using reserved names in dom"
This reverts commit 013bc53f0c.

This somehow breaks the Windows build. Will investigate later.
2024-07-08 16:07:32 +02:00
Niels Dossche
013bc53f0c Stop using reserved names in dom 2024-07-08 06:09:04 -07:00
Niels Dossche
3303c15754 Make some pointers const in php_dom.h 2024-07-08 06:09:04 -07:00
Niels Dossche
fc09f4b2bc
Implement Dom\TokenList (#13664)
Part of RFC: https://wiki.php.net/rfc/dom_additions_84

Closes GH-11688.
2024-07-02 21:34:23 +02:00
Niels Dossche
768900b180 Implement Dom $innerHTML property 2024-07-02 11:15:38 -07:00
Niels Dossche
88da914910 Implement CSS selectors 2024-06-29 13:00:26 -07:00
Niels Dossche
8dc2391bae
Fix bug #79701: getElementById does not correctly work with duplicate definitions
This is a long-standing bug: IDs aren't properly tracked, causing either
outdated or plain incorrect results from getElementById.

This PR implements a pragmatic solution in which we still try to use the
ID lookup table to a degree, but only as a performance boost, not as a
"single source of truth". Full details are explained in the
getElementById code.

Closes GH-14349.
2024-06-01 12:55:05 +02:00
Niels Dossche
ab80392710
Cleanup DOM exception throwing parameters (#14330) 2024-05-26 14:01:37 +02:00
Levi Morrison
c461b60060
refactor: change zend_is_true to return bool (#14301)
Previously this returned `int`. Many functions actually take advantage
of the fact this returns exactly 0 or 1. For instance,
`main/streams/xp_socket.c` does:

    sockopts |= STREAM_SOCKOP_IPV6_V6ONLY_ENABLED * zend_is_true(tmpzval);

And `Zend/zend_compile.c` does:

    child = &ast->child[2 - zend_is_true(zend_ast_get_zval(ast->child[0]))];

I changed a few places trivially from `int` to `bool`, but there are
still many places such as the object handlers which return `int` that
should eventually be `bool`.
2024-05-24 15:16:36 -06:00
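The two idioms quoted in this commit rely on a C guarantee: converting any scalar to `bool` yields exactly 0 or 1. A small sketch, where `is_true()` and `OPT_V6ONLY_ENABLED` are hypothetical stand-ins rather than the real Zend symbols:

```c
#include <stdbool.h>

/* Hypothetical flag value, for illustration only. */
#define OPT_V6ONLY_ENABLED 0x4

/* Conversion to bool normalizes any non-zero input to exactly 1. */
static bool is_true(int raw)
{
    return raw;
}

/* Multiplying a flag by a bool sets the flag or leaves the options as-is. */
static int apply_flag(int opts, int raw)
{
    return opts | (OPT_V6ONLY_ENABLED * is_true(raw));
}

/* Subtracting a bool selects index 1 when true and index 2 when false. */
static int pick_child(int raw)
{
    return 2 - is_true(raw);
}
```

Both call patterns therefore behave the same whether the function is declared to return `int` (restricted to 0 or 1) or `bool`.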
Niels Dossche
e95b06c5ad Make some more arguments const 2024-05-13 19:46:51 +02:00
Niels Dossche
1fdbb0aba6 Get rid of unused declarations 2024-05-13 19:46:51 +02:00
Niels Dossche
44485892df Factor out all common code for XML serialization and merge common paths 2024-05-11 18:09:39 +02:00
Niels Dossche
e9355fa162 Remove unused prototype from php_dom.h 2024-05-11 01:21:15 +02:00
Niels Dossche
fae25ca2df Move dom_attr_value() into ext/libxml 2024-05-05 10:14:40 +02:00
Niels Dossche
6f989cdb75
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:53:48 +02:00
Niels Dossche
461d890f0a
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:38:32 +02:00
Niels Dossche
e878b9f390
Fix crashes when entity declaration is removed while still having entity references
libxml doesn't do reference counting inside its node types. It's
possible to remove an entity declaration from the document, but then
entity references will keep pointing to that stale declaration. This
will cause crashes.

One idea would be to trigger a hook whenever a declaration is removed
and have it update all references. However, this means we would have to
keep track of all references somehow, which would be a high-overhead
solution. The solution in this patch makes sure that the fields are
always updated before they are read.

Closes GH-14089.
2024-04-30 22:29:44 +02:00
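The "update before read" approach can be illustrated with a simplified model. Everything here is hypothetical (`entity_decl`, `entity_ref`, `document`, and both functions); it does not use libxml's types or reflect the actual patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of declarations and references. */
typedef struct { const char *name; const char *content; } entity_decl;
typedef struct { const char *name; const entity_decl *cached; } entity_ref;
typedef struct { const entity_decl *decls; size_t count; } document;

/* Looks up a declaration by name among those currently in the document. */
static const entity_decl *lookup_decl(const document *doc, const char *name)
{
    for (size_t i = 0; i < doc->count; i++) {
        if (strcmp(doc->decls[i].name, name) == 0) {
            return &doc->decls[i];
        }
    }
    return NULL;
}

/* Refreshes the cached pointer on every read instead of trusting it,
 * so a removed declaration is never dereferenced through the reference. */
static const char *entity_ref_content(entity_ref *ref, const document *doc)
{
    ref->cached = lookup_decl(doc, ref->name);
    return ref->cached != NULL ? ref->cached->content : NULL;
}
```

The trade-off is a lookup on each read in exchange for not having to track every reference when a declaration is removed.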
Niels Dossche
b3f820b408
Split off nodelist header components to nodelist.h 2024-04-14 14:45:46 +02:00
Niels Dossche
53f6e5ecd8
Move node list dimension handling to a separate file 2024-04-14 14:45:46 +02:00
Niels Dossche
ac039cf716
Implement HTMLCollection::namedItem() 2024-04-14 14:45:45 +02:00
Niels Dossche
5c69b2e86f
Factor out reading an attribute value 2024-04-14 14:45:45 +02:00
Niels Dossche
78ccea4e40
Fix GH-13863: Removal during NodeList iteration breaks loop
The list is live, so upon cache invalidation we should re-walk the tree
to resynchronize the index with the node list. We keep the legacy
behaviour for the old DOM classes.

Closes GH-13934.
2024-04-10 19:07:59 +02:00
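One way to picture a live list that re-walks after invalidation is a cached position guarded by a modification counter. This is a sketch under that assumption; the struct layout and the names `parent`, `live_list`, and `live_list_item()` are hypothetical and not taken from ext/dom.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node { struct node *next; } node;

/* Hypothetical parent that bumps a counter on every mutation. */
typedef struct { node *first; unsigned long modification_nr; } parent;

/* Hypothetical live list with a cached (index, node) position. */
typedef struct {
    parent *p;
    node *cached_node;
    size_t cached_index;
    unsigned long cached_nr;
    bool has_cache;
} live_list;

static node *live_list_item(live_list *l, size_t index)
{
    node *cur = l->p->first;
    size_t i = 0;

    /* Reuse the cache only if the tree is unchanged since it was stored;
     * otherwise fall through and re-walk from the first child. */
    if (l->has_cache && l->cached_nr == l->p->modification_nr
            && l->cached_index <= index) {
        cur = l->cached_node;
        i = l->cached_index;
    }
    while (cur != NULL && i < index) {
        cur = cur->next;
        i++;
    }
    if (cur != NULL) {
        l->cached_node = cur;
        l->cached_index = i;
        l->cached_nr = l->p->modification_nr;
        l->has_cache = true;
    }
    return cur;
}
```

After a removal bumps the counter, the next lookup ignores the stale cached position and the index lines up with the current contents of the list again.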
Niels Dossche
15259a0a6c Factor out common "first container's child" code 2024-04-03 18:15:43 +02:00
Niels Dossche
fc9b58f602 Remove duplicated code for entity vs notation handling 2024-04-03 18:15:43 +02:00
Niels Dossche
e1630381b7
Fix GH-13764: xsl cannot build on PHP 8.4 (#13770)
Move some of the DOM APIs from the non-public php_dom.h header to the
public header xml_common.h.
2024-03-20 19:03:09 +01:00