Commit graph

106 commits

Author SHA1 Message Date
Niels Dossche
8526de84a5 Move common obj_map API functions to obj_map.c 2025-06-22 12:30:50 +02:00
Niels Dossche
ff0a2cff05 Refactor implementation of DOM nodelists, named maps, and iterators
The code was really messy with lots of checks and inconsistencies.
This splits everything up into different functions and now everything is
relayed to a handler vtable.
2025-06-21 22:17:33 +02:00
Niels Dossche
91a310e603
Get rid of separate DOM HashPosition member (#18354)
Besides the fact that this is only used for DOM_NODESET and thus makes
no sense of being on the iterator itself, it's also redundant now that
we have the index member.
2025-04-19 17:59:48 +02:00
Niels Dossche
33c4ca36e4
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix weird unpack behaviour in DOM
  Fix GH-17989: mb_output_handler crash with unset http_output_conv_mimetypes
2025-03-09 11:21:34 +01:00
Niels Dossche
aa6e58f82a
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix weird unpack behaviour in DOM
  Fix GH-17989: mb_output_handler crash with unset http_output_conv_mimetypes
2025-03-09 11:21:27 +01:00
Niels Dossche
9be9f70caa
Fix weird unpack behaviour in DOM
Engine pitfall: the iter index is only updated by foreach opcodes, so
the existing code that used it as an index for the nodes w.r.t. the
start did not work properly. Fix it by using our own counter.

Closes GH-18004.
2025-03-09 11:17:03 +01:00
Niels Dossche
2a6122c54a
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix uninitialized memory accesses in DOM iterator
2025-03-08 11:12:40 +01:00
Niels Dossche
8950c241b3
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix uninitialized memory accesses in DOM iterator
2025-03-08 11:12:34 +01:00
Niels Dossche
2634622d3d
Fix uninitialized memory accesses in DOM iterator 2025-03-08 11:12:24 +01:00
Niels Dossche
613c5e626e
Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-17572: getElementsByTagName returns collections with tagName-based indexing, causing loss of elements when converted to arrays
2025-01-26 16:22:04 +01:00
Niels Dossche
fc7c353519
Fix GH-17572: getElementsByTagName returns collections with tagName-based indexing, causing loss of elements when converted to arrays
Only (dtd) named node maps should have string-based indexing.
The ce check is fragile, just check for the presence of an xml hash
table.

Closes GH-17580.
2025-01-26 16:21:54 +01:00
Niels Dossche
28344e0445
Get rid of HASH_OF() in ext/dom (#16767)
The cases where this is used are only where an array is possible, so we
can replace them with Z_ARRVAL_P. This reduces some overhead.
2024-11-13 17:32:27 +01:00
Niels Dossche
73b7993b0d
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
2024-08-31 11:56:34 +02:00
Niels Dossche
9cb23a3dec
Fix GH-15654: Signed integer overflow in ext/dom/nodelist.c
There's implicit truncation casts from zend_long to int which cause
issues because checks are done against the zend_longs. Since the
iterator infrastructure uses zend_longs, just convert everything to
zend_long.

Closes GH-15669.
2024-08-31 11:47:08 +02:00
Niels Dossche
6a07400699
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix GH-15551: Segmentation fault (access null pointer) in ext/dom/xml_common.h
2024-08-23 19:43:32 +02:00
Niels Dossche
8a00faa2bb
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix GH-15551: Segmentation fault (access null pointer) in ext/dom/xml_common.h
2024-08-23 19:42:36 +02:00
Niels Dossche
9af574c26e
Fix GH-15551: Segmentation fault (access null pointer) in ext/dom/xml_common.h
Closes GH-15556.
2024-08-23 19:40:42 +02:00
Niels Dossche
ceca599649
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:07:48 +02:00
Niels Dossche
4049594adf
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix UAF when removing doctype and using foreach iteration
2024-07-30 20:03:30 +02:00
Niels Dossche
b282dd749f
Fix UAF when removing doctype and using foreach iteration
This is an old bug, but this is pretty easy to fix.
It's basically applying the same fix as I did for e878b9f.
Reported by YuanchengJiang.

Closes GH-15143.
2024-07-30 20:01:22 +02:00
Arnaud Le Blanc
11accb5cdf
Preferably include from build dir (#13516)
* Include from build dir first

This fixes out of tree builds by ensuring that configure artifacts are included
from the build dir.

Before, out of tree builds would preferably include files from the src dir, as
the include path was defined as follows (ignoring includes from ext/ and sapi/) :

    -I$(top_builddir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/main
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM
    -I$(top_builddir)/

As a result, an out of tree build would include configure artifacts such as
`main/php_config.h` from the src dir.

After this change, the include path is defined as follows:

    -I$(top_builddir)/main
    -I$(top_builddir)
    -I$(top_srcdir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM

* Fix extension include path for out of tree builds

* Include config.h with the brackets form

`#include "config.h"` searches in the directory containing the including-file
before any other include path. This can include the wrong config.h when building
out of tree and a config.h exists in the source tree.

Using `#include <config.h>` uses exclusively the include path, and gives
priority to the build dir.
2024-06-26 00:26:43 +02:00
Niels Dossche
e7af2bfd5b Get rid of reserved name usage 2024-05-13 19:46:51 +02:00
Niels Dossche
6f989cdb75
Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:53:48 +02:00
Niels Dossche
461d890f0a
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix crash when calling childNodes next() when iterator is exhausted
  Fix references not handled correctly in C14N
  Fix crashes when entity declaration is removed while still having entity references
2024-04-30 22:38:32 +02:00
Niels Dossche
2dbe2d62b3
Fix crash when calling childNodes next() when iterator is exhausted
Closes GH-14091.
2024-04-30 22:30:58 +02:00
Niels Dossche
78ccea4e40
Fix GH-13863: Removal during NodeList iteration breaks loop
The list is live, so upon cache invalidation we should rewalk the tree
to sync the index again with the node list. We keep the legacy behaviour
for the old DOM classes.

Closes GH-13934.
2024-04-10 19:07:59 +02:00
Niels Dossche
2bf9da6e3d
Fix key data for NodeList iterations 2024-04-10 19:07:52 +02:00
Niels Dossche
51ff37b618
Factor out dom_fetch_first_iteration_item() 2024-04-10 19:07:52 +02:00
Niels Dossche
15259a0a6c Factor out common "first container's child" code 2024-04-03 18:15:43 +02:00
Niels Dossche
b7bc41341c Remove random newline 2024-04-03 18:15:43 +02:00
Niels Dossche
fc9b58f602 Remove duplicated code for entity vs notation handling 2024-04-03 18:15:43 +02:00
Niels Dossche
fa4d8cd4fa Cleanup create_notation()
We're already clearing memory, and we can merge the declaration and
assignment.
2024-04-03 18:15:43 +02:00
Niels Dossche
aa19461151 Don't use a heap allocation to track the current item 2024-04-03 18:15:43 +02:00
Niels Dossche
6de0486e19 Express php_dom_libxml_notation_iter() using php_dom_libxml_hash_iter() 2024-04-03 18:15:43 +02:00
Niels Dossche
f02fd58809 Get rid of impossible error branch 2024-04-03 18:15:43 +02:00
Niels Dossche
88449f6798 Simplify control flow for special case of php_dom_iterator_move_forward() 2024-04-03 18:15:43 +02:00
Niels Dossche
899b916f71 Merge declarations and assignments in php_dom_iterator_move_forward() 2024-04-03 18:15:43 +02:00
Niels Dossche
14b6c981c3
[RFC] Add a way to opt-in ext/dom spec compliance (#13031)
RFC: https://wiki.php.net/rfc/opt_in_dom_spec_compliance
2024-03-09 16:56:00 +01:00
David CARLIER
9726721560
general signatures discrepencies fixes (#13122) 2024-01-10 22:19:23 +00:00
Niels Dossche
df89409aba Fix compile error with -Werror=incompatible-function-pointer-types and old libxml2
libxml2 prior to 2.9.8 had a different signature for xmlHashScanner.
This signature changed in e03f0a199a
Use an #if to work around the incompatible signature.

Closes GH-12326.
2023-09-30 00:12:20 +02:00
Niels Dossche
a2fde39169 Remove always-true condition from php_dom_iterator_move_forward()
basenode is already checked above
2023-07-11 11:47:54 +02:00
Niels Dossche
c3f0797385
Implement iteration cache, item cache and length cache for node list iteration (#11330)
* Implement iteration cache, item cache and length cache for node list iteration

The current implementation follows the spec requirement that the list
must be "live". This means that changes in the document must be
reflected in the existing node lists without requiring the user to
refetch the node list.
The consequence is that getting any item, or the length of the list,
always starts searching from the root element of the node list. This
results in O(n) time to get any item or the length. If there's a for
loop over the node list, this means the iterations will take O(n²) time
in total. This causes real-world performance issues with potential for
downtime (see GH-11308 and its references for details).

We fix this by introducing a caching strategy. We cache the last
iterated object in the iterator, the last requested item in the node
list, and the last length computation. To invalidate the cache, we
simply count the number of modifications made to the containing
document. If the modification number does not match what the number was
during caching, we know the document has been modified and the cache is
invalid. If this ever overflows, we saturate the modification number and
don't do any caching anymore. Note that we don't check for overflow on
64-bit systems because it would take hundreds of years to overflow.

Fixes GH-11308.
2023-06-03 00:13:14 +02:00
Patrick Allaert
aff365871a Fixed some spaces used instead of tabs 2021-06-29 11:30:26 +02:00
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
Nikita Popov
3e01f5afb1 Replace zend_bool uses with bool
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.

Of course, zend_bool is retained as an alias.
2021-01-15 12:33:06 +01:00
Nikita Popov
312201dce4 Add get_gc handle for object iterators
Optional handler with the same semantics as the object handler.
2020-07-01 15:17:22 +02:00
Nikita Popov
15846ff115 Add ZVAL_OBJ_COPY macro
For the common ZVAL_OBJ + GC_ADDREF pattern.
This mirrors the existing ZVAL_STR_COPY API.
2020-06-17 16:36:56 +02:00
George Peter Banyard
62b1d2cb69 Fix [-Wundef] warning in DOM extension 2020-05-16 15:31:15 +02:00
Gabriel Caruso
5d6e923d46
Remove mention of PHP major version in Copyright headers
Closes GH-4732.
2019-09-25 14:51:43 +02:00
Dmitry Stogov
83804519df Replace ZVAL_COPY() and ZVAL_COPY_VALUE() for IS_OBJECT by cheaper macros 2019-05-28 20:10:02 +03:00