Commit graph

389 commits

Author SHA1 Message Date
Niels Dossche
e4e65aa255
Add Dom\Element::$outerHTML setter
Reference: https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#the-outerhtml-property
2024-10-05 23:24:27 +02:00
Niels Dossche
402b1c29b6
Add Dom\Element::$outerHTML getter
Reference: https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#the-outerhtml-property
2024-10-05 23:24:27 +02:00
Niels Dossche
7bf5b7fa78
Use cache slot for dom_property_exists() (#15941) 2024-09-18 17:23:24 +02:00
Niels Dossche
c9a4abadcc
Fix unsetting DOM properties
This never did anything in lower versions, but on master this crashes
because the virtual properties don't have backing storage. Just forbid
it since it was useless to begin with.

Closes GH-15891.
2024-09-17 19:24:49 +02:00
Niels Dossche
82c504fa9c
Fix GH-15670: Polymorphic cache slot issue in DOM (#15676)
A cache slot can be hit with different DOM object types, so we should
check if we're still handling the same type.
2024-08-31 12:13:21 +02:00
Niels Dossche
73b7993b0d
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
2024-08-31 11:56:34 +02:00
Niels Dossche
9cb23a3dec
Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
There's implicit truncation casts from zend_long to int which cause
issues because checks are done against the zend_longs. Since the
iterator infrastructure uses zend_longs, just convert everything to
zend_long.

Closes GH-15669.
2024-08-31 11:47:08 +02:00
Niels Dossche
367f303efa
Optimize DOM property access (#15626)
For the read and write implementation, store the handler pointer in the
first cache slot.
For the write implementation, use the second cache slot to store the
property info.

For a micro-benchmark that performs a write:
```php
$dom = new DOMDocument;
for ($i=0;$i<9999999;$i++)
        $dom->strictErrorChecking = false;
```

I obtain the following results on an i7-4790:

```
  ./sapi/cli/php ./write.php ran
    1.42 ± 0.08 times faster than ./sapi/cli/php_old ./write.php
```

For a micro-benchmark that performs a read:
```php
$dom = new DOMDocument;
for ($i=0;$i<9999999;$i++)
        $dom->strictErrorChecking;
```

I obtain the following results on the same machine:

```
  ./sapi/cli/php ./read.php ran
    1.29 ± 0.13 times faster than ./sapi/cli/php_old ./read.php
```
2024-08-29 20:13:29 +02:00
Máté Kocsis
7e45e57d8f
Suppress deprecation notices when ext/dom properties are accessed by the get_debug_info handler (#15530) 2024-08-23 10:39:11 +02:00
Máté Kocsis
587110c5bf
Deprecate Soft-deprecated DOMDocument and DOMEntity properties (#15369)
RFC: https://wiki.php.net/rfc/deprecations_php_8_4#formally_deprecate_soft-deprecated_domdocument_and_domentity_properties
2024-08-13 12:39:20 +01:00
Niels Dossche
76ad89ccff
Fix GH-15192: Segmentation fault in dom extension (html5_serializer)
When cloning a document, doc will not be equal to the actual new
document clone->doc. clone->doc will always point to the correct
document so use that instead when comparing document nodes.

Closes GH-15198.
2024-08-02 18:22:17 +02:00
Niels Dossche
551c4a3bf8
Use OBJ_RELEASE instead of ZVAL_OBJ + zval_ptr_dtor in php_dom.c (#15052) 2024-07-21 16:57:05 +02:00
Niels Dossche
80a4783d25
Deduplicate NULL checks in ext/dom (#15015)
This introduces a new helper php_dom_create_nullable_object() that does
the NULL check and puts NULL in return_value. Otherwise it runs
php_dom_create_object(). This deduplicates a bit of code.
2024-07-18 21:20:03 +02:00
Ilija Tovilo
a26ec58fa1
De-duplicate readonly property modification error message (#14972) 2024-07-16 16:29:40 +02:00
Niels Dossche
6980eba863
Support templated content
The template element in HTML 5 is special in the sense that it does not
add its contents into the DOM tree, but instead keeps them in a separate
shadow DOM document fragment. Interacting with the DOM tree cannot touch
the elements in the document fragment.

Closes GH-14906.
2024-07-15 11:10:51 +02:00
Niels Dossche
4ef7539144
Split off private data from the ns mapper 2024-07-15 11:02:52 +02:00
Niels Dossche
92c0db398e
Avoid reconciling when cloning into the same document (#14921)
We don't need to reconcile when we clone into the same document because
the namespace mapper is the same. Only when cloning into another
document is the namespace mapper different and do we need a
reconciliation.
2024-07-12 19:23:37 +02:00
Niels Dossche
8825235348
Reapply "Stop using reserved names in dom"
This reverts commit dda96768ec.
2024-07-08 17:27:39 +02:00
Niels Dossche
dda96768ec
Revert "Stop using reserved names in dom"
This reverts commit 013bc53f0c.

This somehow breaks the Windows build. Will investigate later.
2024-07-08 16:07:32 +02:00
Niels Dossche
013bc53f0c Stop using reserved names in dom 2024-07-08 06:09:04 -07:00
Niels Dossche
cf914f4184
Implement PHP-specific extensions to Dom (#14754)
See RFC: https://wiki.php.net/rfc/dom_additions_84
2024-07-04 13:50:19 +02:00
Niels Dossche
fc09f4b2bc
Implement Dom\TokenList (#13664)
Part of RFC: https://wiki.php.net/rfc/dom_additions_84

Closes GH-11688.
2024-07-02 21:34:23 +02:00
Niels Dossche
768900b180 Implement Dom $innerHTML property 2024-07-02 11:15:38 -07:00
Peter Kokot
c44834d8ad
Trim trailing whitespace (#14721) 2024-06-29 18:41:45 +02:00
Niels Dossche
48c9f1e2c3 Implement Dom\HTMLElement class 2024-06-26 12:17:12 -07:00
Niels Dossche
78401ba867 Implement Dom\Document::$title setter 2024-06-26 12:17:12 -07:00
Niels Dossche
04af960397 Implement Dom\Document::$title getter 2024-06-26 12:17:12 -07:00
Niels Dossche
a12db3b656 Implement Dom\Document::$body setter 2024-06-26 12:17:12 -07:00
Niels Dossche
287cf91724 Implement Dom\Document::$head 2024-06-26 12:17:12 -07:00
Niels Dossche
a1485df55a Implement Dom\Document::$body getter 2024-06-26 12:17:12 -07:00
Arnaud Le Blanc
11accb5cdf
Preferably include from build dir (#13516)
* Include from build dir first

This fixes out of tree builds by ensuring that configure artifacts are included
from the build dir.

Before, out of tree builds would preferably include files from the src dir, as
the include path was defined as follows (ignoring includes from ext/ and sapi/) :

    -I$(top_builddir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/main
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM
    -I$(top_builddir)/

As a result, an out of tree build would include configure artifacts such as
`main/php_config.h` from the src dir.

After this change, the include path is defined as follows:

    -I$(top_builddir)/main
    -I$(top_builddir)
    -I$(top_srcdir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM

* Fix extension include path for out of tree builds

* Include config.h with the brackets form

`#include "config.h"` searches in the directory containing the including-file
before any other include path. This can include the wrong config.h when building
out of tree and a config.h exists in the source tree.

Using `#include <config.h>` uses exclusively the include path, and gives
priority to the build dir.
2024-06-26 00:26:43 +02:00
David CARLIER
5c55306a50
Fix GH-14652: segfault on node without document. (#14653)
do not bother trying to clone the inner document if there is none to
begin with.
2024-06-24 22:31:53 +01:00
Niels Dossche
e4250cec79 Introduce Dom\AdjacentPosition and use it in the insert adjacent methods
See https://wiki.php.net/rfc/dom_additions_84#allowing_php-specific_developer_experience_improvements
2024-06-24 12:36:35 -07:00
Niels Dossche
5db05955c8
Move more common code into php_dom_next_in_tree_order() (#14363) 2024-05-29 19:50:41 +02:00
Levi Morrison
c461b60060
refactor: change zend_is_true to return bool (#14301)
Previously this returned `int`. Many functions actually take advantage
of the fact this returns exactly 0 or 1. For instance,
`main/streams/xp_socket.c` does:

    sockopts |= STREAM_SOCKOP_IPV6_V6ONLY_ENABLED * zend_is_true(tmpzval);

And `Zend/zend_compile.c` does:

    child = &ast->child[2 - zend_is_true(zend_ast_get_zval(ast->child[0]))];

I changed a few places trivially from `int` to `bool`, but there are
still many places such as the object handlers which return `int` that
should eventually be `bool`.
2024-05-24 15:16:36 -06:00
Niels Dossche
e95b06c5ad Make some more arguments const 2024-05-13 19:46:51 +02:00
Niels Dossche
1fdbb0aba6 Get rid of unused declarations 2024-05-13 19:46:51 +02:00
Niels Dossche
44485892df Factor out all common code for XML serialization and merge common paths 2024-05-11 18:09:39 +02:00
Niels Dossche
6e7adb3c48
Update ext/dom names after policy change (#14171) 2024-05-09 10:40:53 +02:00
Niels Dossche
fae25ca2df Move dom_attr_value() into ext/libxml 2024-05-05 10:14:40 +02:00
Niels Dossche
6f989cdb75
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:53:48 +02:00
Niels Dossche
461d890f0a
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:38:32 +02:00
Niels Dossche
e878b9f390
Fix crashes when entity declaration is removed while still having entity references
libxml doesn't do reference counting inside its node types. It's
possible to remove an entity declaration out of the document, but then
entity references will keep pointing to that stale declaration. This
will cause crashes.

One idea would be to check when a declaration is removed, to trigger a
hook that updates all references. However this means we have to keep
track of all references somehow, which would be a high-overhead
solution. The solution in this patch makes sure that the fields are
always updated before they are read.

Closes GH-14089.
2024-04-30 22:29:44 +02:00
Niels Dossche
47feb5795c
Support named items in dimension handling for HTMLCollection
Closes GH-13937.
2024-04-14 14:46:04 +02:00
Niels Dossche
b3f820b408
Split off nodelist header components to nodelist.h 2024-04-14 14:45:46 +02:00
Niels Dossche
53f6e5ecd8
Move node list dimension handling to a separate file 2024-04-14 14:45:46 +02:00
Niels Dossche
5c69b2e86f
Factor out reading an attribute value 2024-04-14 14:45:45 +02:00
Niels Dossche
a0da32a42d
Cleanup and optimize attribute value reading (#13897)
When the attribute has a single text child, we can avoid an allocating
call to libxml2 and read the contents directly.

On my i7-4790, I tested the optimization with both the $value and
$nodeValue property.

```
Summary
  ./sapi/cli/php bench_value.php ran
    1.82 ± 0.09 times faster than ./sapi/cli/php_old bench_value.php

Summary
  ./sapi/cli/php bench_nodeValue.php ran
    1.78 ± 0.10 times faster than ./sapi/cli/php_old bench_nodeValue.php
```

Test code:
```
$dom = new DOMDocument;
$dom->loadXML('<root attrib="this is a relatively short text"/>');
$attrib = $dom->documentElement->attributes[0];

for ($i=0; $i<1000*1000; $i++) {
	$attrib->value; // or nodeValue
}
```
2024-04-07 13:08:31 +02:00
Niels Dossche
649394d357 Remove redundant namespace define 2024-03-10 11:08:46 +01:00
Niels Dossche
d57e7a920b Use BAD_CAST consistently 2024-03-10 11:08:46 +01:00