Commit graph

277 commits

Author SHA1 Message Date
Niels Dossche
fae25ca2df Move dom_attr_value() into ext/libxml 2024-05-05 10:14:40 +02:00
Niels Dossche
cf7c592143 Simplify property check in php_libxml_node_free_resource() 2024-04-30 17:29:19 +02:00
Niels Dossche
974edc7939 Cleanup php_libxml_internal_error_handler_ex() 2024-04-30 17:29:19 +02:00
Niels Dossche
e5e8b193e0 Remove bogus entity reference cleanup code 2024-04-30 17:29:19 +02:00
Niels Dossche
2fab1437f2 Cleanup php_libxml_streams_IO_open_wrapper() 2024-04-30 17:29:19 +02:00
Niels Dossche
a54d63aefe Cleanup php_libxml_unregister_node() 2024-04-30 17:29:19 +02:00
Niels Dossche
30885f3b5f
Implement request #71571: XSLT processor should provide option to change maxDepth (#13731)
There are two depth limiting parameters for XSLT templates.
1) maxTemplateDepth
   This corresponds to the recursion depth of a template. For very
   complicated templates this can be hit.
2) maxTemplateVars
   This is the total number of live variables. When using recursive
   templates with lots of parameters you can hit this limit.

This patch introduces two new properties to XSLTProcessor that
corresponds to the above variables.
2024-03-31 21:21:23 +02:00
Niels Dossche
b955973818 Only register error handling when observable
Closes GH-13702.
2024-03-17 18:24:40 +01:00
Niels Dossche
14b6c981c3
[RFC] Add a way to opt-in ext/dom spec compliance (#13031)
RFC: https://wiki.php.net/rfc/opt_in_dom_spec_compliance
2024-03-09 16:56:00 +01:00
Niels Dossche
a74da53fc4 Remove useless write to LIBXML(stream_context)
The value will always be overwritten.
2024-02-25 00:28:30 +01:00
Niels Dossche
043daeeec1 Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crashes with entity references and predefined entities
2024-01-17 19:41:32 +01:00
Niels Dossche
120bd364aa Fix crashes with entity references and predefined entities
Closes GH-13004.
2024-01-17 19:41:22 +01:00
Niels Dossche
03547f6832
Remove properties field from php_libxml_node_object (#13062)
This shrinks the struct from 80 bytes to 72 bytes.
This was unused internally, I did not find users externally via GitHub
search.
The intention for this was that it could be used for attaching extra
data as a 3rd party to a node. However, there are better mechanisms for
that like using actual objects.
2024-01-03 20:03:56 +01:00
Niels Dossche
76f24d3531 Merge branch 'PHP-8.3'
* PHP-8.3:
  Revert "Fix crashes with entity references and predefined entities"
2023-12-23 17:31:28 +01:00
Niels Dossche
5f69232b53 Revert "Fix crashes with entity references and predefined entities"
This reverts commit 3fa5af8496.
2023-12-23 17:31:18 +01:00
Niels Dossche
c293de792f Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix libxml2 build for 2.12.0-2.12.2
2023-12-23 17:21:34 +01:00
Niels Dossche
bb007438e2 Fix libxml2 build for 2.12.0-2.12.2 2023-12-23 17:20:52 +01:00
Niels Dossche
f420ea84aa Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix crashes with entity references and predefined entities
  Fix crash in adoptNode with attribute references
2023-12-23 17:01:28 +01:00
Niels Dossche
3fa5af8496 Fix crashes with entity references and predefined entities
There's two issues here:
- freeing of predefined entity declaration crashes (unique to 8.3 & master)
- using multiple entity references for a single entity declaration crashes
  (since forever)

The fix for the last issue is fairly easy to do on 8.3, but may require a
slightly different approach on 8.2. Therefore, for now this is 8.3-only.

Closes GH-13004.
2023-12-23 17:00:57 +01:00
Niels Dossche
90eb5679d2
Cleanup libxml_get_external_entity_loader() (#12893)
We can directly put the value into return_value instead of copying
things around.
2023-12-08 18:44:46 +01:00
Niels Dossche
58fc521713 Merge branch 'PHP-8.3'
* PHP-8.3:
  Fix libxml2 2.12 build due to API breaks
2023-12-01 18:07:58 +01:00
Niels Dossche
f61f8d439c Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix libxml2 2.12 build due to API breaks
2023-12-01 18:07:13 +01:00
Niels Dossche
0a39890c96 Fix libxml2 2.12 build due to API breaks
See 1922547860
2023-12-01 18:03:37 +01:00
Niels Dossche
ae83d6ab07
Fix issues related to libxml2 2.12.0 (#12802)
* Avoid passing NULL to xmlSwitchToEncoding

This otherwise switches to UTF-8 on libxml2 2.12.0

* Split tests for different error reporting behaviour in libxml2 2.12.0

* Avoid deprecation warnings for libxml2 2.12.0

We can't fully get rid of the parser globals as there are still APIs
that implicitly use them.

* Temporarily disable part of test for libxml 2.12.0 regression

See https://gitlab.gnome.org/GNOME/libxml2/-/issues/634

* Review fixes

* [ci skip] Update test description
2023-11-29 20:46:35 +01:00
Niels Dossche
1492be5286
[RFC] DOM HTML5 parsing and serialization support (#12111) 2023-11-13 20:18:19 +01:00
George Peter Banyard
52de0950f4 ext/libxml: Use new F ZPP modifier 2023-10-10 13:44:21 +01:00
Niels Dossche
eebc528cbf Fix broken cache invalidation with deallocated and reallocated document node
The original caching implementation had an oversight in combination with
the new lifetime management in DOM for 8.3.
The modification counter is stored on the document object itself, but as
that can get deallocated when all references disappear, stale cache data
can be used. Normally this isn't a problem, unless getElementsByTagName is
called not on the document but on a child node. Fix it by moving caching
data into the ref object, which will outlive all nodes from a document
even if the document object disappears.

Closes GH-12338.
2023-10-01 17:06:02 +02:00
Niels Dossche
df89409aba Fix compile error with -Werror=incompatible-function-pointer-types and old libxml2
libxml2 prior to 2.9.8 had a different signature for xmlHashScanner.
This signature changed in e03f0a199a
Use an #if to work around the incompatible signature.

Closes GH-12326.
2023-09-30 00:12:20 +02:00
Niels Dossche
6a2b885155 Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Restore old namespace reconciliation behaviour
2023-09-27 22:40:37 +02:00
David CARLIER
e648d39e3b
libxml set error structure simplification proposal (#12054) 2023-08-26 12:11:50 +01:00
Niels Dossche
e1cb721679 Improve warning when returning null from the resolver set by libxml_set_external_entity_loader
Fixes GH-11952.
Closes GH-12022.
2023-08-24 21:23:29 +02:00
Niels Dossche
bb092ab4c6 Fix #80927: Removing documentElement after creating attribute node: possible use-after-free
Closes GH-11892.
2023-08-12 18:49:12 +02:00
Niels Dossche
5018dfecdf Remove useless hashmap check
php_libxml_unlink_entity is called from a hashmap iterator, so using
xmlHashLookup to check if it comes from that hashmap will always be
true.
2023-08-07 19:53:20 +02:00
Niels Dossche
75229cb127 Cleanup php_libxml_node_decrement_resource()
obj_node is already checked, so checking it again in the second if is
not necessary.
Merge declarations and assignments while we're at it.
2023-07-11 11:47:54 +02:00
Niels Dossche
003ebdd039 Fix GH-9628: Implicitly removing nodes from \DOMDocument breaks existing references
Change the way lifetime works in ext/libxml and ext/dom

Previously, a node could be freed even when holding a userland reference
to it. This resulted in exceptions when trying to access that node after
it has been implicitly or explicitly removed. After this patch, a node
will only be freed when the last userland reference disappears.

Fixes GH-9628.
Closes GH-11576.
2023-07-03 21:31:57 +02:00
Niels Dossche
50b4df18e0
Get rid of return value for php_libxml_unregister_node() (#11398) 2023-06-08 17:44:55 +02:00
Niels Dossche
c3f0797385
Implement iteration cache, item cache and length cache for node list iteration (#11330)
* Implement iteration cache, item cache and length cache for node list iteration

The current implementation follows the spec requirement that the list
must be "live". This means that changes in the document must be
reflected in the existing node lists without requiring the user to
refetch the node list.
The consequence is that getting any item, or the length of the list,
always starts searching from the root element of the node list. This
results in O(n) time to get any item or the length. If there's a for
loop over the node list, this means the iterations will take O(n²) time
in total. This causes real-world performance issues with potential for
downtime (see GH-11308 and its references for details).

We fix this by introducing a caching strategy. We cache the last
iterated object in the iterator, the last requested item in the node
list, and the last length computation. To invalidate the cache, we
simply count the number of modifications made to the containing
document. If the modification number does not match what the number was
during caching, we know the document has been modified and the cache is
invalid. If this ever overflows, we saturate the modification number and
don't do any caching anymore. Note that we don't check for overflow on
64-bit systems because it would take hundreds of years to overflow.

Fixes GH-11308.
2023-06-03 00:13:14 +02:00
Niels Dossche
82b05373b1 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix GH-11160: Few tests failed building with new libxml 2.11.0
2023-05-06 23:15:57 +02:00
Niels Dossche
dc1a70c244 Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix GH-11160: Few tests failed building with new libxml 2.11.0
2023-05-06 23:10:58 +02:00
Niels Dossche
7c0dfc5cf5 Fix GH-11160: Few tests failed building with new libxml 2.11.0
It's possible to categorise the failures into 2 categories:
  - Changed error message. In this case we either duplicate the test and
    modify the error message. Or if the change in error message is
    small, we use the EXPECTF matchers to make the test compatible with both
    old and new versions of libxml2.
  - Missing warnings. This is caused by a change in libxml2 where the
    parser started using SAX APIs internally [1]. In this case the
    error_type passed to php_libxml_internal_error_handler() changed from
    PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
    started to use the SAX handlers instead of the generic handlers.
    However, for the SAX handlers the current input stack is empty, so
    nothing is actually printed. I fixed this by falling back to a
    regular warning without a filename & line number reference, which
    mimicks the old behaviour. Furthermore, this change now also shows
    an additional warning in a test which was previously hidden.

[1] 9a82b94a94

Closes GH-11162.
2023-05-06 23:10:07 +02:00
Max Kellermann
1287747a9a
ext: make various internal functions static (#10650)
Namely in:
* ext/date
* ext/libxml
* ext/dba
* ext/curl
2023-02-21 15:51:41 +00:00
George Peter Banyard
32d3cae19f
Handle trampolines correctly in new FCC API + usages (#9877) 2022-11-22 17:12:53 +00:00
George Peter Banyard
fb114bf45b Only use FCC for libxml entity loader callback 2022-11-02 14:52:54 +00:00
Tim Starling
11796229f2
Add libxml_get_external_entity_loader()
Add libxml_get_external_entity_loader(), which returns the currently
installed external entity loader, i.e. the value which was passed to
libxml_set_external_entity_loader() or null if no loader was installed
and the default entity loader will be used.

This allows libraries to save and restore the loader, controlling entity
expansion without interfering with the rest of the application.

Add macro Z_PARAM_FUNC_OR_NULL_WITH_ZVAL(). This allows us to get the
zval for a callable parameter without duplicating callable argument
parsing.

The saved zval keeps the object needed for fcc/fci alive, simplifying
memory management.

Fixes #76763.
2022-08-28 12:47:20 +01:00
Christoph M. Becker
145525bc4c
Merge branch 'PHP-8.1'
* PHP-8.1:
  xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
2022-08-25 15:12:40 +02:00
Christoph M. Becker
afc5ab4531
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
2022-08-25 15:11:41 +02:00
Christoph M. Becker
f59754694e
xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
The documentation[1] suggest to call `xmlCleanupParser()` instead, but
we are not doing that for reasons[2].  Thus, we do no longer call
`xmlRelaxNGCleanupTypes()` for libxml2 ≥ 2.10.0.

[1] <https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-relaxng.html#xmlRelaxNGCleanupTypes>
[2] <8742276eb3>

Closes GH-9417.
2022-08-25 15:10:30 +02:00
Máté Kocsis
7601068f3d
Declare ext/libxml constants in stubs (#8721) 2022-06-09 08:18:44 +02:00
Ilija Tovilo
2f5295692f
Optimize stripos/stristr
Closes GH-7847
Closes GH-7852

Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
2022-01-31 21:44:31 +01:00
Stanislav Malyshev
9de4eb9e37
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:29:59 -08:00