Commit graph

255 commits

Author SHA1 Message Date
Niels Dossche
bb007438e2 Fix libxml2 build for 2.12.0-2.12.2 2023-12-23 17:20:52 +01:00
Niels Dossche
3fa5af8496 Fix crashes with entity references and predefined entities
There's two issues here:
- freeing of predefined entity declaration crashes (unique to 8.3 & master)
- using multiple entity references for a single entity declaration crashes
  (since forever)

The fix for the last issue is fairly easy to do on 8.3, but may require a
slightly different approach on 8.2. Therefore, for now this is 8.3-only.

Closes GH-13004.
2023-12-23 17:00:57 +01:00
Niels Dossche
f61f8d439c Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix libxml2 2.12 build due to API breaks
2023-12-01 18:07:13 +01:00
Niels Dossche
0a39890c96 Fix libxml2 2.12 build due to API breaks
See 1922547860
2023-12-01 18:03:37 +01:00
Niels Dossche
eebc528cbf Fix broken cache invalidation with deallocated and reallocated document node
The original caching implementation had an oversight in combination with
the new lifetime management in DOM for 8.3.
The modification counter is stored on the document object itself, but as
that can get deallocated when all references disappear, stale cache data
can be used. Normally this isn't a problem, unless getElementsByTagName is
called not on the document but on a child node. Fix it by moving caching
data into the ref object, which will outlive all nodes from a document
even if the document object disappears.

Closes GH-12338.
2023-10-01 17:06:02 +02:00
Niels Dossche
df89409aba Fix compile error with -Werror=incompatible-function-pointer-types and old libxml2
libxml2 prior to 2.9.8 had a different signature for xmlHashScanner.
This signature changed in e03f0a199a
Use an #if to work around the incompatible signature.

Closes GH-12326.
2023-09-30 00:12:20 +02:00
Niels Dossche
6a2b885155 Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Restore old namespace reconciliation behaviour
2023-09-27 22:40:37 +02:00
David CARLIER
e648d39e3b
libxml set error structure simplification proposal (#12054) 2023-08-26 12:11:50 +01:00
Niels Dossche
e1cb721679 Improve warning when returning null from the resolver set by libxml_set_external_entity_loader
Fixes GH-11952.
Closes GH-12022.
2023-08-24 21:23:29 +02:00
Niels Dossche
bb092ab4c6 Fix #80927: Removing documentElement after creating attribute node: possible use-after-free
Closes GH-11892.
2023-08-12 18:49:12 +02:00
Niels Dossche
5018dfecdf Remove useless hashmap check
php_libxml_unlink_entity is called from a hashmap iterator, so using
xmlHashLookup to check if it comes from that hashmap will always be
true.
2023-08-07 19:53:20 +02:00
Niels Dossche
75229cb127 Cleanup php_libxml_node_decrement_resource()
obj_node is already checked, so checking it again in the second if is
not necessary.
Merge declarations and assignments while we're at it.
2023-07-11 11:47:54 +02:00
Niels Dossche
003ebdd039 Fix GH-9628: Implicitly removing nodes from \DOMDocument breaks existing references
Change the way lifetime works in ext/libxml and ext/dom

Previously, a node could be freed even when holding a userland reference
to it. This resulted in exceptions when trying to access that node after
it has been implicitly or explicitly removed. After this patch, a node
will only be freed when the last userland reference disappears.

Fixes GH-9628.
Closes GH-11576.
2023-07-03 21:31:57 +02:00
Niels Dossche
50b4df18e0
Get rid of return value for php_libxml_unregister_node() (#11398) 2023-06-08 17:44:55 +02:00
Niels Dossche
c3f0797385
Implement iteration cache, item cache and length cache for node list iteration (#11330)
* Implement iteration cache, item cache and length cache for node list iteration

The current implementation follows the spec requirement that the list
must be "live". This means that changes in the document must be
reflected in the existing node lists without requiring the user to
refetch the node list.
The consequence is that getting any item, or the length of the list,
always starts searching from the root element of the node list. This
results in O(n) time to get any item or the length. If there's a for
loop over the node list, this means the iterations will take O(n²) time
in total. This causes real-world performance issues with potential for
downtime (see GH-11308 and its references for details).

We fix this by introducing a caching strategy. We cache the last
iterated object in the iterator, the last requested item in the node
list, and the last length computation. To invalidate the cache, we
simply count the number of modifications made to the containing
document. If the modification number does not match what the number was
during caching, we know the document has been modified and the cache is
invalid. If this ever overflows, we saturate the modification number and
don't do any caching anymore. Note that we don't check for overflow on
64-bit systems because it would take hundreds of years to overflow.

Fixes GH-11308.
2023-06-03 00:13:14 +02:00
Niels Dossche
82b05373b1 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix GH-11160: Few tests failed building with new libxml 2.11.0
2023-05-06 23:15:57 +02:00
Niels Dossche
dc1a70c244 Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix GH-11160: Few tests failed building with new libxml 2.11.0
2023-05-06 23:10:58 +02:00
Niels Dossche
7c0dfc5cf5 Fix GH-11160: Few tests failed building with new libxml 2.11.0
It's possible to categorise the failures into 2 categories:
  - Changed error message. In this case we either duplicate the test and
    modify the error message. Or if the change in error message is
    small, we use the EXPECTF matchers to make the test compatible with both
    old and new versions of libxml2.
  - Missing warnings. This is caused by a change in libxml2 where the
    parser started using SAX APIs internally [1]. In this case the
    error_type passed to php_libxml_internal_error_handler() changed from
    PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
    started to use the SAX handlers instead of the generic handlers.
    However, for the SAX handlers the current input stack is empty, so
    nothing is actually printed. I fixed this by falling back to a
    regular warning without a filename & line number reference, which
    mimicks the old behaviour. Furthermore, this change now also shows
    an additional warning in a test which was previously hidden.

[1] 9a82b94a94

Closes GH-11162.
2023-05-06 23:10:07 +02:00
Max Kellermann
1287747a9a
ext: make various internal functions static (#10650)
Namely in:
* ext/date
* ext/libxml
* ext/dba
* ext/curl
2023-02-21 15:51:41 +00:00
George Peter Banyard
32d3cae19f
Handle trampolines correctly in new FCC API + usages (#9877) 2022-11-22 17:12:53 +00:00
George Peter Banyard
fb114bf45b Only use FCC for libxml entity loader callback 2022-11-02 14:52:54 +00:00
Tim Starling
11796229f2
Add libxml_get_external_entity_loader()
Add libxml_get_external_entity_loader(), which returns the currently
installed external entity loader, i.e. the value which was passed to
libxml_set_external_entity_loader() or null if no loader was installed
and the default entity loader will be used.

This allows libraries to save and restore the loader, controlling entity
expansion without interfering with the rest of the application.

Add macro Z_PARAM_FUNC_OR_NULL_WITH_ZVAL(). This allows us to get the
zval for a callable parameter without duplicating callable argument
parsing.

The saved zval keeps the object needed for fcc/fci alive, simplifying
memory management.

Fixes #76763.
2022-08-28 12:47:20 +01:00
Christoph M. Becker
145525bc4c
Merge branch 'PHP-8.1'
* PHP-8.1:
  xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
2022-08-25 15:12:40 +02:00
Christoph M. Becker
afc5ab4531
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
2022-08-25 15:11:41 +02:00
Christoph M. Becker
f59754694e
xmlRelaxNGCleanupTypes() is deprecated as of libxml2 2.10.0
The documentation[1] suggest to call `xmlCleanupParser()` instead, but
we are not doing that for reasons[2].  Thus, we do no longer call
`xmlRelaxNGCleanupTypes()` for libxml2 ≥ 2.10.0.

[1] <https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-relaxng.html#xmlRelaxNGCleanupTypes>
[2] <8742276eb3>

Closes GH-9417.
2022-08-25 15:10:30 +02:00
Máté Kocsis
7601068f3d
Declare ext/libxml constants in stubs (#8721) 2022-06-09 08:18:44 +02:00
Ilija Tovilo
2f5295692f
Optimize stripos/stristr
Closes GH-7847
Closes GH-7852

Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
2022-01-31 21:44:31 +01:00
Stanislav Malyshev
9de4eb9e37
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:29:59 -08:00
Stanislav Malyshev
9d74c5b40b
Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:29:37 -08:00
Stanislav Malyshev
0ef1dfc9f6
Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:29:27 -08:00
Stanislav Malyshev
ca87d46a3e
Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fix #79971: special character is breaking the path in xml function
2021-11-14 23:28:13 -08:00
Christoph M. Becker
f15f8fc573
Fix #79971: special character is breaking the path in xml function
The libxml based XML functions accepting a filename actually accept
URIs with possibly percent-encoded characters.  Percent-encoded NUL
bytes lead to truncation, like non-encoded NUL bytes would.  We catch
those, and let the functions fail with a respective warning.
2021-11-14 23:24:33 -08:00
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
George Peter Banyard
5caaf40b43
Introduce pseudo-keyword ZEND_FALLTHROUGH
And use it instead of comments
2021-04-07 00:46:29 +01:00
Christoph M. Becker
6adec555db Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #73533: Invalid memory access in php_libxml_xmlCheckUTF8
2021-03-24 11:53:53 +01:00
Christoph M. Becker
5832be768c Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #73533: Invalid memory access in php_libxml_xmlCheckUTF8
2021-03-24 11:52:54 +01:00
Christoph M. Becker
498eb8e052 Fix #73533: Invalid memory access in php_libxml_xmlCheckUTF8
A string passed to `php_libxml_xmlCheckUTF8()` may be longer than
1<<31-1 bytes, so we're better using a `size_t`.

Closes GH-6802.
2021-03-24 11:50:50 +01:00
Christoph M. Becker
9f826e8ce9 Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #51903: simplexml_load_file() doesn't use HTTP headers
2021-03-08 15:16:55 +01:00
Christoph M. Becker
7931956805 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #51903: simplexml_load_file() doesn't use HTTP headers
2021-03-08 15:15:59 +01:00
Christoph M. Becker
f901bec494 Fix #51903: simplexml_load_file() doesn't use HTTP headers
The `encoding` attribute of the XML declaration is optional; it is good
practice to use external encoding information where available if it is
missing.  Thus, we check for `charset` info of `Content-Type` headers,
and see whether the encoding is supported.

We cater to trailing parameters and quoted-strings, but not to escaped
backslashes and quotes in quoted-strings, since no known character
encoding contains these anyway.

Co-authored-by: Michael Wallner <mike@php.net>

Closes GH-6747.
2021-03-08 15:07:01 +01:00
Máté Kocsis
1954e59758
Add support for generating class entries from stubs
Closes GH-6289

Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>
2021-01-26 11:50:36 +01:00
Nikita Popov
3e01f5afb1 Replace zend_bool uses with bool
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.

Of course, zend_bool is retained as an alias.
2021-01-15 12:33:06 +01:00
Nikita Popov
42a6ff4201 Try to fix windows build 2020-09-03 15:07:03 +02:00
Nikita Popov
22be60bb25 Add declared properties to LibXMLError
Partially addresses bug #79804.
2020-07-08 10:41:46 +02:00
Nikita Popov
302933daea Remove no_separation flag 2020-07-07 09:30:24 +02:00
Max Semenik
2b5de6f839
Remove proto comments from C files
Closes GH-5758
2020-07-06 21:13:34 +02:00
Nikita Popov
c9bc7dd110 Don't throw warning if exception thrown during dom validation 2020-06-25 15:24:35 +02:00
Christoph M. Becker
a582931f42 Revert "Revert "Merge branch 'PHP-7.4'""
This reverts commit 28e650a, which reverted commit 046dcfb, which had
to be reverted due to phpdbg issues.  The culprit was that we did not
properly reset `zend_handler_table` to `NULL`, which is required for
SAPIs which may restart the engine after shutdown.

[1] <http://git.php.net/?p=php-src.git;a=commit;h=28e650abf8097a28789a005e5028fee095359583>
[2] <http://git.php.net/?p=php-src.git;a=commit;h=046dcfb531e242d36a7af2942b9b148290c3c7fe>
2020-05-20 14:11:42 +02:00
George Peter Banyard
35e0a91db7 Fix [-Wundef] warnings in libxml extension 2020-05-16 15:31:23 +02:00
George Peter Banyard
197cac65fd Use ZEND_FCI_INITIALIZED macro
Instead of manually checking that the fci.size is different than 0
2020-05-16 00:36:15 +02:00