Commit graph

668 commits

Author SHA1 Message Date
Niels Dossche
dc27acddd6
Fix GH-17122: memory leak in regex
Because the subpattern names are persistent, and the fact that the
symbol table destruction is skipped when using fast_shutdown,
this means the refcounts will not be updated for the destruction of
the arrays that hold the subpattern name keys.
To solve this, detect this situation and duplicate the strings.

Closes GH-17132.
2025-01-06 20:11:36 +01:00
Niels Dossche
5839fc5dd9
Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-16184: UBSan address overflowed in ext/pcre/php_pcre.c
2024-10-03 21:12:42 +02:00
Niels Dossche
ddc7a6b1fc
Merge branch 'PHP-8.2' into PHP-8.3
* PHP-8.2:
  Fix GH-16184: UBSan address overflowed in ext/pcre/php_pcre.c
2024-10-03 21:11:25 +02:00
Niels Dossche
c4bb07552e
Fix GH-16184: UBSan address overflowed in ext/pcre/php_pcre.c
libpcre2 can return the special value -1 for a non-match.
In this case we get pointer overflow, although it doesn't matter in
practice because the pointer will be in bounds and the copy length will
be 0. Still, we should fix the UBSAN warning.

Closes GH-16205.
2024-10-03 21:10:57 +02:00
David Carlier
f5d4781ee0
Merge branch 'PHP-8.3' into PHP-8.4 2024-10-03 12:48:46 +01:00
David Carlier
1aeb70f83c
Merge branch 'PHP-8.2' into PHP-8.3 2024-10-03 12:48:34 +01:00
David Carlier
f453d1ae2a
Fix GH-16189: underflow on preg_match/preg_match_all start_offset.
close GH-16191
2024-10-03 12:48:13 +01:00
Ilija Tovilo
ffb440550c
Use APPLY_STOP in pcre_clean_cache() (GH-15839)
Once num_clean has reached 0, we never remove any more elements anyway.

Closes GH-15839
2024-09-12 22:32:23 +02:00
Niels Dossche
ded8fb79bd
Fix UAF issues with PCRE after request shutdown
There are two related issues, each tested.

First problem:
What happens is that on the CLI SAPI we have a per-request pcre cache,
and on there the request shutdown for the pcre module happens prior to
the remaining live object destruction. So when the SPL object wants to
clean up the regular expression object it gets a use-after-free.

Second problem:
Very similarly, the non-persistent resources are destroyed after request
shutdown, so on the CLI SAPI the pcre request cache is already gone, but
if a userspace stream references a regex in the pcre cache, this breaks.

Two things that come immediately to mind:
  -  We could fix it by no longer treating the CLI SAPI special and just use
     the same lifecycle as the module. This simplifies the pcre module code
     a bit too. I wonder why we even have the separation in the first place.
     The downside here is that we're using more the system allocator
     than Zend's allocator for cache entries.
  -  We could modify the shutdown code to not remove regular expressions
     with a refcount>0 and modify php_pcre_pce_decref code such that it
     becomes php_pcre_pce_decref's job to clean up when the refcount
     becomes 0 during shutdown. However, this gets nasty quickly.

I chose the first solution here as it should be reliable and simple.

Closes GH-15064.
2024-09-11 18:49:19 +02:00
Niels Dossche
259744148e
Cache pcre subpattern table (#14424)
Recreating this over and over is pointless, cache this as well.
Fixes GH-14423.
2024-06-03 21:15:26 +02:00
Niels Dossche
315f2059b7
Remove dead check in pcre
This isn't reachable since ab32d36, because since then the library
itself checks this condition during compilation. The compilation failure
that results of it makes this code not reachable.

This is split off of GH-14424.
2024-06-03 19:58:22 +02:00
Gina Peter Banyard
25a5146180
Clean-up unused headers (#14365)
* ext/mbstring.c: clean-up headers and include intrinsics
2024-06-01 17:12:42 +01:00
Peter Kokot
e45d2d6046
Sync HAVE_BUNDLED_PCRE #if/ifdef/defined (#14354)
Follow up of GH-5526 (-Wundef)
2024-05-29 07:53:36 +02:00
Niels Dossche
d0e15c8502
Fix external pcre2 build (#13662)
PCRE2_EXTRA_CASELESS_RESTRICT is only available as of pcre2 10.43.
Note: no check is necessary for pcre2_set_compile_extra_options because
it is available since pcre2 10.30, which is the minimum version PHP
requires.
2024-03-10 13:15:15 +01:00
Ayesh Karunaratne
7b23470666
ext/pcre: Add "/r" modifier (#13583)
Adds support for "Caseless restricted" matching added in PCRE2lib
10.43 with the "r" modifier.

This is `PCRE2_EXTRA_CASELESS_RESTRICT` in PCRE2. This is an "extra"
option, which means it is not possible to pass this option as
pcre2_compile() function parameter.

This option is passed in a pcre2_set_compile_extra_options() call.
Previously, these extra options are set at php_pcre_init_pcre2(),
but after this change, it is possible to customize the options
by adding bits to `eoptions` in pcre_get_compiled_regex_cache_ex().

The tests for this change are ported from upstream test suite[^1].

[^1]: c13d54f658 (diff-8c8312e4eb2d35bb16485404b7b5cc0eaef0bca1aa95ff5febf6a1890048305c)
2024-03-05 20:51:04 +01:00
Ilija Tovilo
631bc81607
Implement stackless internal function calls
Co-authored-by: Dmitry Stogov <dmitry@zend.com>

Closes GH-12461
2024-02-06 17:42:28 +01:00
Jorg Adam Sowa
73722df439
Improve preg_* functions warnings for NUL byte (#13068)
* Improve error messages for preg_ functions
* Adjusted tests and fixed formatting
* Removed unnecessary strings from preg_* tests
* Removed ZPP tests
2024-01-07 13:40:54 +00:00
Cristian Rodríguez
927adfb1a6
Use a single version of mempcpy(3) (#12257)
While __php_mempcpy is only used by ext/standard/crypt_sha*, the
mempcpy "pattern" is used everywhere.

This commit removes __php_mempcpy, adds zend_mempcpy and transforms
open-coded parts into function calls.
2023-12-20 15:16:32 +00:00
Ilija Tovilo
be46545ee0
Fix pcre out-of-bounds when using closing symbols as opening delimiter (#12946)
Apparently we support using closing symbols )]}> as opening and closing
delimiters.

Fixes oss-fuzz #65021
2023-12-12 21:58:34 +01:00
Niels Dossche
642e11140c
Minor pcre optimizations (#12923)
* Update signature of pcre API

This changes the variables that are bools to actually be bools instead
of ints, which allows some additional optimization by the compiler (e.g.
removing some ternaries and move extensions).

It also gets rid of the use_flags argument because that's just the same
as flags == 0. This reduces the call frame.

* Use zend_string_release_ex where possible

* Remove duplicate symbols from strchr

* Avoid useless value conversions

* Use a raw HashTable* instead of a zval

* Move condition

* Make for loop cheaper by reusing a recently used value as start iteration index

* Remove useless condition

This can't be true if the second condition is true because it would
require the string to occupy the entire address space.

* Upgrading + remark
2023-12-11 19:43:26 +01:00
Michael Voříšek
8d98f7202a
Use ZSTR_IS_VALID_UTF8 macro where possible (#12869) 2023-12-05 05:24:11 +00:00
Niels Dossche
1e2e2f391c
Refactor some ext/pcre code for performance (#12447)
* Always inline populate_match_value and fix argument type

The call overhead of this function is quite large.

* Use _new variant of zend_hash in some places to avoid additional check

* Move allocation of match_sets down to simplify and reduce code size

* Move pcre2_get_ovector_pointer out of the loop

This is allocated together with the match data and stays loop invariant:
the pointer is always the same (the values not however).

* Mark error condition as cold block

* Simplify condition: subpats is already checked

* Move array size preallocation to use allocate the up-to-date size

* Simplify condition

* Rework internal functions to avoid repeated unwrapping

* Remember Z_ARRVAL_P(return_value)

The lookup is loop invariant.

* Mark some pointers as const
2023-10-16 19:39:29 +02:00
Tim Düsterhus
72cac39698
pcre: Stop special-casing /e (#12355)
Support for /e was removed in PHP 7.0, remove the custom error message and stop
special casing it to simplify the logic.
2023-10-06 19:45:14 +02:00
Ilija Tovilo
33e2868f0d
Merge branch 'PHP-8.2'
* PHP-8.2:
  Revert "Mangle PCRE regex cache key with JIT option"
2023-06-22 23:14:37 +02:00
Ilija Tovilo
7f9ad4a83a
Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Revert "Mangle PCRE regex cache key with JIT option"
2023-06-22 23:14:27 +02:00
Ilija Tovilo
4d91665f78
Revert "Mangle PCRE regex cache key with JIT option"
This reverts commit 466fc78d2c.
2023-06-22 23:13:37 +02:00
Ilija Tovilo
99340269c6
Merge branch 'PHP-8.2'
* PHP-8.2:
  Mangle PCRE regex cache key with JIT option
2023-06-22 11:11:09 +02:00
Ilija Tovilo
34a1a1bddb
Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Mangle PCRE regex cache key with JIT option
2023-06-22 11:10:59 +02:00
Michael Voříšek
466fc78d2c
Mangle PCRE regex cache key with JIT option
Closes GH-11396
2023-06-22 11:08:54 +02:00
Ilija Tovilo
32968f8de0
Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix preg_replace_callback_array() pattern validation
2023-05-24 13:42:44 +02:00
Ilija Tovilo
7c7698f754
Fix preg_replace_callback_array() pattern validation
Closes GH-11301
2023-05-24 13:42:16 +02:00
Ilija Tovilo
fced34ee1d
Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix incorrect zval type_flags in preg_replace_callback_array() for immutable arrays
2023-03-31 14:42:44 +02:00
Ilija Tovilo
d1fc88c726
Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix incorrect zval type_flags in preg_replace_callback_array() for immutable arrays
2023-03-31 14:42:35 +02:00
Ilija Tovilo
66ce205718
Fix incorrect zval type_flags in preg_replace_callback_array() for immutable arrays
The ZVAL_ARR macro always set the zval type_info to IS_ARRAY_EX, even if the
hash table is immutable. Since in preg_replace_callback_array() we can return
the passed array directly, and that passed array can be immutable, we need to
reset the type_flags to keep the VM from performing ref-counting on the array.

Fixes GH-10968
Closes GH-10970
2023-03-31 14:41:41 +02:00
Kamil Tekiela
fa1e3f9798
Remove pcre_get_compiled_regex_ex() (#10354) 2023-01-18 14:28:19 +00:00
Christoph M. Becker
c8955c078a
Revert GH-10220
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit ecc880f491.
This reverts commit 588a07f737.
This reverts commit f377e15751.
This reverts commit b4ba16fe18.
This reverts commit 694ec1deea.
This reverts commit 6b34de8eba.
This reverts commit aa1cd02a43.
This reverts commit 308fd311ea.
This reverts commit 16203b53e1.
This reverts commit 738fb5ca54.
This reverts commit 9fdbefacd3.
This reverts commit cd4a7c1d90.
This reverts commit 928685eba2.
This reverts commit 01e5ffc85c.
2023-01-16 12:27:33 +01:00
Max Kellermann
308fd311ea ext/{standard,json,random,...}: add missing includes 2023-01-10 14:19:03 +00:00
George Peter Banyard
1ad59b32c2 Update INI validator and displayers depending on INI type
Closes GH-9451
2022-09-06 10:33:34 +01:00
Máté Kocsis
28944b8fbe
Declare ext/pcre constants in stubs (#9077) 2022-07-21 13:21:02 +02:00
tobil4sk
5bb3e233db
Implement #77726: Allow null character in regex patterns
In 8b3c1a3, this was disallowed to fix #55856, which was a security
issue caused by the /e modifier. The fix that was made was the
"Easier fix" as described in the original report.

With this fix, pattern strings are no longer treated as null terminated,
so null characters can be placed inside and matched against with regex
patterns without security problems, so there is no longer a reason to
give the error. Allowing this is consistent with the behaviour of many
other languages, including JavaScript, and thanks to PCRE2[0], it does
not require manually escaping null characters. Now that we can avoid the
error here without the cost of escaping characters, there is really no
need anymore to stray here from the conventional behaviour.

Currently, null characters are still disallowed before the first
delimiter and in the options section at the end of a regex string, but
these error messages have been updated.

[0] Since PCRE2, pattern strings no longer have to be null terminated,
and raw null characters match as normal.

Closes GH-8114.
2022-06-17 19:30:44 +02:00
Christoph M. Becker
c0d890e918
Merge branch 'PHP-8.1'
* PHP-8.1:
  Fix #74604: Out of bounds in php_pcre_replace_impl
2021-11-29 19:17:49 +01:00
Christoph M. Becker
60717fcd34
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix #74604: Out of bounds in php_pcre_replace_impl
2021-11-29 19:17:16 +01:00
Christoph M. Becker
816aa20391
Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #74604: Out of bounds in php_pcre_replace_impl
2021-11-29 19:15:20 +01:00
Christoph M. Becker
712fc54e85
Fix #74604: Out of bounds in php_pcre_replace_impl
Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow; we make sure that this
doesn't happen by catering to the maximal overhead of a `zend_string`.

Closes GH-7597.
2021-11-29 19:12:55 +01:00
Nikita Popov
2791afbf07 Merge branch 'PHP-8.1'
* PHP-8.1:
  Clarify that preg_match_all() cannot return null
2021-11-18 10:37:36 +01:00
Nikita Popov
3ec55d6cbf Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Clarify that preg_match_all() cannot return null
2021-11-18 10:37:18 +01:00
Nikita Popov
bc6ec0a109 Clarify that preg_match_all() cannot return null 2021-11-18 10:36:35 +01:00
Felipe Pena
e089a50f53 Add support for PCRE n modifier
Add support for /n (NO_AUTO_CAPTURE) modifier, which makes simple
`(xyz)` groups non-capturing.

Closes GH-7583.
2021-11-03 15:17:54 +01:00
Dmitry Stogov
90b7bde615 Use more compact representation for packed arrays.
- for packed arrays we store just an array of zvals without keys.
- the elements of packed array are accessible throuf as ht->arPacked[i]
  instead of ht->arData[i]
- in addition to general ZEND_HASH_FOREACH_* macros, we introduced similar
  familied for packed (ZEND_HASH_PACKED_FORECH_*) and real hashes
  (ZEND_HASH_MAP_FOREACH_*)
- introduced an additional family of macros to access elements of array
  (packed or real hashes) ZEND_ARRAY_ELEMET_SIZE, ZEND_ARRAY_ELEMET_EX,
  ZEND_ARRAY_ELEMET, ZEND_ARRAY_NEXT_ELEMENT, ZEND_ARRAY_PREV_ELEMENT
- zend_hash_minmax() prototype was changed to compare only values

Because of smaller data set, this patch may show performance improvement
on some apps and benchmarks that use packed arrays. (~1% on PHP-Parser)

TODO:
    - sapi/phpdbg needs special support for packed arrays (WATCH_ON_BUCKET).
    - zend_hash_sort_ex() may require converting packed arrays to hash.
2021-11-03 15:18:26 +03:00
Remi Collet
a6f5c2dc8b
fix for pcre2 10.38 2021-10-21 13:37:26 +02:00