417 commits

Author SHA1 Message Date
Niels Dossche
ff0a2cff05 Refactor implementation of DOM nodelists, named maps, and iterators
The code was really messy with lots of checks and inconsistencies.
This splits everything up into different functions and now everything is
relayed to a handler vtable.
2025-06-21 22:17:33 +02:00
Niels Dossche
4fadf647d2 Refactor dom_nnodemap_objects_new()
- Use ecalloc() to not miss initializing any field.
- Merge declarations and assignments.
2025-06-21 22:17:33 +02:00
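The allocation change described above can be sketched in plain C. This is only an illustration: `calloc()` stands in for the engine's `ecalloc()`, and the `demo_map` struct and its fields are hypothetical, not the real layout used by ext/dom.

```c
#include <stdlib.h>

/* Hypothetical object with several fields that must start out empty. */
typedef struct demo_map {
    int   nodetype;
    long  cached_length;
    void *cached_node;
    int   refcount;
} demo_map;

/* Zero-initializing allocation: a field added to the struct later starts
 * at zero even if nobody remembers to assign it here. On mainstream
 * platforms an all-zero pointer is also a NULL pointer. */
demo_map *demo_map_new(int nodetype)
{
    demo_map *map = calloc(1, sizeof(*map));
    if (map == NULL) {
        return NULL;
    }
    /* Only the fields with a non-zero starting value need an assignment. */
    map->nodetype = nodetype;
    map->refcount = 1;
    return map;
}
```

With a non-zeroing allocator, every field would need an explicit assignment, and forgetting one would leave it indeterminate.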
Niels Dossche
307ff3bdea
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-18744: PHP 8.4 classList works not correctly if copy HTMLElement by clone keyword.
2025-06-04 18:59:21 +02:00
Niels Dossche
111072a9f0
Fix GH-18744: PHP 8.4 classList works not correctly if copy HTMLElement by clone keyword.
The $classList property is special in the sense that it's a cached
object instance per (HTML)Element instance. The reason for this design
is that it has the [[SameObject]] IDL attribute.
Cloning in PHP also clones the properties, so it also clones the cached
instance. To solve this, we undo this by resetting the backing storage.

Closes GH-18749.
2025-06-04 18:59:05 +02:00
Máté Kocsis
7f59fccd52
Create separate lexbor extension (#18538)
An always-enabled lexbor extension is added. It contains the lexbor library, which was separated from the ext/dom extension in preparation for https://wiki.php.net/rfc/url_parsing_api. While at it, the lexbor library is upgraded to 2.5.0.

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Co-authored-by: Gina Peter Banyard <girgias@php.net>
2025-05-25 14:12:44 +02:00
Niels Dossche
3ba725a556
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-18309: ipv6 filter integer overflow
  Fix GH-18304: Changing the properties of a DateInterval through dynamic properties triggers a SegFault
2025-04-11 23:36:42 +02:00
Niels Dossche
a019fbd970
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-18309: ipv6 filter integer overflow
  Fix GH-18304: Changing the properties of a DateInterval through dynamic properties triggers a SegFault
2025-04-11 23:36:12 +02:00
Niels Dossche
ba0853888d
Fix GH-18304: Changing the properties of a DateInterval through dynamic properties triggers a SegFault
For dynamic fetches the cache_slot will be NULL, so we have to check for
that when resetting the cache. For zip and xmlreader this couldn't
easily be tested because of a lack of writable properties.

Closes GH-18307.
2025-04-11 23:33:58 +02:00
Niels Dossche
632357b275
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-17991: Assertion failure dom_attr_value_write
2025-03-07 22:43:48 +01:00
Niels Dossche
6083dc09a3
Fix GH-17991: Assertion failure dom_attr_value_write
Closes GH-17995.
2025-03-07 22:43:38 +01:00
Niels Dossche
d95b9d6d32
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-17736: Assertion failure zend_reference_destroy()
2025-03-02 22:41:21 +01:00
Niels Dossche
ee4a9a4a7c
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-17736: Assertion failure zend_reference_destroy()
2025-03-02 22:37:07 +01:00
Niels Dossche
ce8ab5f16a
Fix GH-17736: Assertion failure zend_reference_destroy()
The cache slot for FETCH_OBJ_W in function `test` is primed with the
class for C. The next call uses a simplexml instance and reuses the same
cache slot. simplexml's get_property_ptr handler does not use the cache
slot, so the old values remain in the cache slot. When
`zend_handle_fetch_obj_flags` is called this is not guarded by a check
for the class entry. So we end up using the prop_info from the property
C::$a instead of the simplexml property.

This patch adds a reset to the cache slots in the property address fetch
code and also in the extensions with a non-standard reference handler.
This keeps the run time cache consistent and avoids the issue without
complicating the fast paths.

Closes GH-17739.
2025-03-02 22:33:32 +01:00
Niels Dossche
e6f42c1ed0
Fix incorrect casts
2025-01-11 10:50:07 +01:00
Niels Dossche
6cbe2edaad
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-17397: Assertion failure ext/dom/php_dom.c
2025-01-08 19:46:23 +01:00
Niels Dossche
6d215981b6
Fix GH-17397: Assertion failure ext/dom/php_dom.c
The problem was that the wrong property hash tables were being merged,
a stupid typo (or caused by merging).

Closes GH-17406.
2025-01-08 19:45:40 +01:00
Niels Dossche
7be3649016
Cleanup iterator instantiation code (#17358)
Just using object_init_ex() directly makes the code a bit simpler and
avoids unnecessary indirections.
2025-01-04 16:48:41 +01:00
Niels Dossche
49fa2e4651 Make some arguments of dom_get_elements_by_tag_name_ns_raw() const
2025-01-03 17:50:01 +01:00
Niels Dossche
59a0d00a5d Avoid string duplications in dom iterators
2025-01-03 17:50:01 +01:00
Niels Dossche
c015242947
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-17145: DOM memory leak
2024-12-14 12:12:52 +01:00
Niels Dossche
4656c22526
Fix GH-17145: DOM memory leak
Because of the use of RETURN instead of RETVAL, the freeing code could not
be executed. This is only triggerable if the content of the attribute is
mixed text and entities, so it wasn't noticed earlier.

Closes GH-17147.
2024-12-14 12:12:40 +01:00
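The RETURN versus RETVAL distinction in the fix above can be modelled in a few lines of C. In php-src, the `RETVAL_*` macros only store the result while the `RETURN_*` macros also leave the function. The `DEMO_*` macros, the helper functions, and the buffer counter below are hypothetical stand-ins, not the actual DOM code.

```c
#include <stdlib.h>
#include <string.h>

/* Number of buffers allocated and not yet freed (for illustration only). */
static int demo_live_buffers = 0;

/* Store the result and keep executing. */
#define DEMO_RETVAL(value) (*return_value = (value))
/* Store the result and leave the function immediately. */
#define DEMO_RETURN(value) do { *return_value = (value); return; } while (0)

static char *demo_dup(const char *text)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
        demo_live_buffers++;
    }
    return copy;
}

static void demo_free(char *copy)
{
    free(copy);
    demo_live_buffers--;
}

/* Leaky variant: the early return skips the cleanup below it. */
void demo_length_leaky(const char *text, size_t *return_value)
{
    char *copy = demo_dup(text);
    if (copy == NULL) { DEMO_RETURN(0); }
    DEMO_RETURN(strlen(copy));
    demo_free(copy); /* never reached */
}

/* Fixed variant: set the result first, then run the cleanup. */
void demo_length_fixed(const char *text, size_t *return_value)
{
    char *copy = demo_dup(text);
    if (copy == NULL) { DEMO_RETURN(0); }
    DEMO_RETVAL(strlen(copy));
    demo_free(copy);
}
```

Both variants produce the same result value; only the fixed one releases the temporary buffer.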
Niels Dossche
f576b81340
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-16906: Reloading document can cause UAF in iterator
2024-11-24 18:20:29 +01:00
Niels Dossche
52c7c74ebb
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-16906: Reloading document can cause UAF in iterator
2024-11-24 18:20:21 +01:00
Niels Dossche
9d39ff764e
Fix GH-16906: Reloading document can cause UAF in iterator
Closes GH-16909.
2024-11-24 18:19:45 +01:00
Niels Dossche
7f5a888bdb
Change dom_node_is_read_only() to return bool (#16757)
Returning int or zend_result doesn't make sense; it's a yes/no question.
2024-11-11 20:57:52 +01:00
Niels Dossche
1083872a08
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-16465: Heap buffer overflow in DOMNode->getElementByTagName
2024-10-16 22:55:29 +02:00
Niels Dossche
d70f3ba9a5
Fix GH-16465: Heap buffer overflow in DOMNode->getElementByTagName
If the input contains NUL bytes then the length doesn't match the actual
duplicated string's length. Note that libxml can't handle this properly
anyway so we just reject NUL bytes and too long strings.

Closes GH-16467.
2024-10-16 22:55:18 +02:00
DanielEScherzer
41996e8d4f
ext/[cd]*: fix a bunch of typos (#16298)
Only functional change is the renaming of the functions
`dom_document_substitue_entities_(read|write)` to replace `substitue` with
`substitute`.
2024-10-09 17:40:42 +02:00
Niels Dossche
e4e65aa255
Add Dom\Element::$outerHTML setter
Reference: https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#the-outerhtml-property
2024-10-05 23:24:27 +02:00
Niels Dossche
402b1c29b6
Add Dom\Element::$outerHTML getter
Reference: https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#the-outerhtml-property
2024-10-05 23:24:27 +02:00
Niels Dossche
7bf5b7fa78
Use cache slot for dom_property_exists() (#15941)
2024-09-18 17:23:24 +02:00
Niels Dossche
c9a4abadcc
Fix unsetting DOM properties
This never did anything in lower versions, but on master this crashes
because the virtual properties don't have backing storage. Just forbid
it since it was useless to begin with.

Closes GH-15891.
2024-09-17 19:24:49 +02:00
Niels Dossche
82c504fa9c
Fix GH-15670: Polymorphic cache slot issue in DOM (#15676)
A cache slot can be hit with different DOM object types, so we should
check if we're still handling the same type.
2024-08-31 12:13:21 +02:00
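A simplified model of the type check described in the fix above is sketched here. All `demo_*` names are hypothetical, and the real run-time cache in the engine has a different layout. The idea shown is that a cached handler is reused only when the slot was primed for the same class, and that a missing slot is tolerated.

```c
#include <stddef.h>

typedef int (*demo_reader)(void);

/* Hypothetical class descriptor: each class has its own property reader. */
typedef struct demo_class {
    const char *name;
    demo_reader read_length;
} demo_class;

/* Hypothetical cache slot: remembers which class the handler belongs to. */
typedef struct demo_cache_slot {
    const demo_class *cls;
    demo_reader handler;
} demo_cache_slot;

static int demo_element_length(void) { return 1; }
static int demo_text_length(void) { return 2; }

static const demo_class demo_element_class = { "Element", demo_element_length };
static const demo_class demo_text_class = { "Text", demo_text_length };

/* Stand-in for the slow path (e.g. a hash table lookup). */
static demo_reader demo_lookup(const demo_class *cls)
{
    return cls->read_length;
}

int demo_read_cached(const demo_class *cls, demo_cache_slot *slot)
{
    /* Fast path: valid only if the slot was primed for this exact class. */
    if (slot != NULL && slot->cls == cls) {
        return slot->handler();
    }
    demo_reader handler = demo_lookup(cls);
    /* A dynamic access may come without a slot, so guard the update. */
    if (slot != NULL) {
        slot->cls = cls;
        slot->handler = handler;
    }
    return handler();
}
```

Without the `slot->cls == cls` comparison, a slot primed by one object type would hand its handler to a different type at the same call site.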
Niels Dossche
73b7993b0d
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
2024-08-31 11:56:34 +02:00
Niels Dossche
9cb23a3dec
Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
There are implicit truncating casts from zend_long to int, which cause
issues because checks are done against the zend_longs. Since the
iterator infrastructure uses zend_longs, just convert everything to
zend_long.

Closes GH-15669.
2024-08-31 11:47:08 +02:00
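The hazard described above, a check on a wide value followed by use of a narrowed copy, goes away when one integer type is used throughout. The sketch below assumes a 64-bit `demo_long` as a stand-in for `zend_long`; the list type is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for zend_long on a 64-bit build. */
typedef int64_t demo_long;

typedef struct demo_list {
    const int *items;
    demo_long  count;
} demo_list;

/* The index keeps its wide type from the bounds check to the lookup, so
 * the value that was checked is the value that is used. If the index were
 * narrowed to a 32-bit int after the check, a large index such as
 * 4294967297 could turn into a small in-range one. */
const int *demo_list_item(const demo_list *list, demo_long index)
{
    if (index < 0 || index >= list->count) {
        return NULL;
    }
    return &list->items[index];
}
```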
Niels Dossche
367f303efa
Optimize DOM property access (#15626)
For the read and write implementation, store the handler pointer in the
first cache slot.
For the write implementation, use the second cache slot to store the
property info.

For a micro-benchmark that performs a write:
```php
$dom = new DOMDocument;
for ($i=0;$i<9999999;$i++)
        $dom->strictErrorChecking = false;
```

I obtain the following results on an i7-4790:

```
  ./sapi/cli/php ./write.php ran
    1.42 ± 0.08 times faster than ./sapi/cli/php_old ./write.php
```

For a micro-benchmark that performs a read:
```php
$dom = new DOMDocument;
for ($i=0;$i<9999999;$i++)
        $dom->strictErrorChecking;
```

I obtain the following results on the same machine:

```
  ./sapi/cli/php ./read.php ran
    1.29 ± 0.13 times faster than ./sapi/cli/php_old ./read.php
```
2024-08-29 20:13:29 +02:00
Máté Kocsis
7e45e57d8f
Suppress deprecation notices when ext/dom properties are accessed by the get_debug_info handler (#15530)
2024-08-23 10:39:11 +02:00
Máté Kocsis
587110c5bf
Deprecate Soft-deprecated DOMDocument and DOMEntity properties (#15369)
RFC: https://wiki.php.net/rfc/deprecations_php_8_4#formally_deprecate_soft-deprecated_domdocument_and_domentity_properties
2024-08-13 12:39:20 +01:00
Niels Dossche
76ad89ccff
Fix GH-15192: Segmentation fault in dom extension (html5_serializer)
When cloning a document, doc will not be equal to the actual new
document clone->doc. clone->doc will always point to the correct
document so use that instead when comparing document nodes.

Closes GH-15198.
2024-08-02 18:22:17 +02:00
Niels Dossche
551c4a3bf8
Use OBJ_RELEASE instead of ZVAL_OBJ + zval_ptr_dtor in php_dom.c (#15052)
2024-07-21 16:57:05 +02:00
Niels Dossche
80a4783d25
Deduplicate NULL checks in ext/dom (#15015)
This introduces a new helper php_dom_create_nullable_object() that does
the NULL check and puts NULL in return_value. Otherwise it runs
php_dom_create_object(). This deduplicates a bit of code.
2024-07-18 21:20:03 +02:00
Ilija Tovilo
a26ec58fa1
De-duplicate readonly property modification error message (#14972)
2024-07-16 16:29:40 +02:00
Niels Dossche
6980eba863
Support templated content
The template element in HTML 5 is special in the sense that it does not
add its contents into the DOM tree, but instead keeps them in a separate
shadow DOM document fragment. Interacting with the DOM tree cannot touch
the elements in the document fragment.

Closes GH-14906.
2024-07-15 11:10:51 +02:00
Niels Dossche
4ef7539144
Split off private data from the ns mapper
2024-07-15 11:02:52 +02:00
Niels Dossche
92c0db398e
Avoid reconciling when cloning into the same document (#14921)
We don't need to reconcile when we clone into the same document because
the namespace mapper is the same. The namespace mapper differs only when
cloning into another document, and only then do we need a
reconciliation.
2024-07-12 19:23:37 +02:00
Niels Dossche
8825235348
Reapply "Stop using reserved names in dom"
This reverts commit dda96768ec.
2024-07-08 17:27:39 +02:00
Niels Dossche
dda96768ec
Revert "Stop using reserved names in dom"
This reverts commit 013bc53f0c.

This somehow breaks the Windows build. Will investigate later.
2024-07-08 16:07:32 +02:00
Niels Dossche
013bc53f0c Stop using reserved names in dom
2024-07-08 06:09:04 -07:00
Niels Dossche
cf914f4184
Implement PHP-specific extensions to Dom (#14754)
See RFC: https://wiki.php.net/rfc/dom_additions_84
2024-07-04 13:50:19 +02:00
Niels Dossche
fc09f4b2bc
Implement Dom\TokenList (#13664)
Part of RFC: https://wiki.php.net/rfc/dom_additions_84

Closes GH-11688.
2024-07-02 21:34:23 +02:00