Commit graph

605 commits

Author SHA1 Message Date
Remi Collet
dd61002676 fix for pcre2 10.38 2021-10-21 13:34:09 +02:00
Christoph M. Becker
5fb5a739e2 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #81243: Too much memory is allocated for preg_replace()
2021-07-12 18:35:49 +02:00
Christoph M. Becker
a6b43086e6 Fix #81243: Too much memory is allocated for preg_replace()
Trimming a potentially over-allocated string appears to be reasonable,
so we drop the condition altogether.

We also re-allocate twice the size needed in the first place, and not
roughly triple the size.

Closes GH-7231.
2021-07-12 18:33:55 +02:00
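The allocation strategy described in this message can be sketched as follows. This is a minimal illustration, not the php-src code: `result_buf`, `buf_reserve`, `buf_append`, and `buf_finish` are hypothetical names, and a plain `realloc` buffer stands in for the string type the extension uses. Overflow checks on the size arithmetic are omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable result buffer (illustration only). */
typedef struct {
    char  *data;
    size_t len;  /* bytes in use */
    size_t cap;  /* bytes allocated */
} result_buf;

/* Make room for `extra` more bytes. When the buffer is too small,
 * reallocate to twice the size actually needed (not roughly triple).
 * Returns 0 on success, -1 on allocation failure. */
static int buf_reserve(result_buf *b, size_t extra)
{
    assert(b != NULL);
    size_t needed = b->len + extra;
    if (needed <= b->cap) {
        return 0;
    }
    size_t new_cap = 2 * needed;
    char *p = realloc(b->data, new_cap);
    if (p == NULL) {
        return -1;
    }
    b->data = p;
    b->cap = new_cap;
    return 0;
}

/* Append `n` bytes from `src`, growing the buffer if necessary. */
static int buf_append(result_buf *b, const char *src, size_t n)
{
    if (buf_reserve(b, n) != 0) {
        return -1;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Trim unconditionally to the used length plus a NUL terminator,
 * with no "only if much was wasted" condition. */
static int buf_finish(result_buf *b)
{
    assert(b != NULL);
    char *p = realloc(b->data, b->len + 1);
    if (p == NULL) {
        return -1;
    }
    p[b->len] = '\0';
    b->data = p;
    b->cap = b->len + 1;
    return 0;
}
```

Starting from a zero-initialized `result_buf`, each growth step leaves at most as much slack as the data written so far, and `buf_finish` returns that slack once the result is complete.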
Anatol Belski
f7ab7951f1 pcre: Workaround bug #81101
The way to fix it is to disable certain match start optimizations. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.

A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not very broadly used syntax in PHP. Still,
this should be observed and handled, possibly by adding a way to
reverse PCRE2_NO_START_OPTIMIZE on the user side.

One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Unless strict comparison
is used, this should be acceptable.

Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
2021-06-19 15:25:17 +02:00
Anatol Belski
1a1d86d562 pcre: Workaround bug #81101
The way to fix it is to disable certain match start optimizations. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.

A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not very broadly used syntax in PHP. Still,
this should be observed and handled, possibly by adding a way to
reverse PCRE2_NO_START_OPTIMIZE on the user side.

One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Unless strict comparison
is used, this should be acceptable.

Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
2021-06-19 15:23:43 +02:00
Nikita Popov
4dce2f83f5 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix locale switch back to C in pcre
2021-03-18 10:50:57 +01:00
Nikita Popov
4be867e910 Fix locale switch back to C in pcre
The compile context is shared between patterns, so we need to set
the character tables unconditionally in case we switched from
a non-C locale to the C locale.
2021-03-18 10:48:43 +01:00
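The problem with a shared compile context can be illustrated with a small stand-in. This is a sketch under assumptions, not the extension's code: `demo_compile_ctx` and both setter functions are hypothetical, and a `NULL` table pointer is assumed here to mean "use the built-in C locale tables".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a compile context shared by all patterns. */
typedef struct {
    const unsigned char *tables;  /* NULL: built-in C locale tables (assumed) */
} demo_compile_ctx;

/* Problematic shape: tables are only set when a non-C locale is active,
 * so tables from an earlier locale stay in the shared context. */
static void set_tables_conditionally(demo_compile_ctx *ctx,
                                     const unsigned char *tables)
{
    assert(ctx != NULL);
    if (tables != NULL) {
        ctx->tables = tables;
    }
}

/* Fixed shape: always overwrite, so switching back to C resets the context. */
static void set_tables_unconditionally(demo_compile_ctx *ctx,
                                       const unsigned char *tables)
{
    assert(ctx != NULL);
    ctx->tables = tables;
}
```

With the conditional setter, compiling one pattern under a non-C locale and the next under the C locale leaves the first locale's tables in place; the unconditional setter clears them.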
Nikita Popov
50254de0a2 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix bug #80866
2021-03-15 14:48:02 +01:00
Dharman
282355efd5 Fix bug #80866
Closes GH-6774.
2021-03-15 14:47:45 +01:00
Nikita Popov
41b8cdd2e0 Don't leak pcre error_code across requests 2020-10-22 11:20:02 +02:00
Máté Kocsis
6b00196e04 Review parameter names in ext/pcre
Closes GH-6259
2020-10-02 11:55:23 +02:00
Nikita Popov
d81ea5e928 Fix preg_replace_callback_array() with array subject
Apparently this "feature" was completely untested...
2020-09-15 12:03:18 +02:00
Máté Kocsis
c98d47696f Consolidate new union type ZPP macro names
They will now follow the canonical order of types. Older macros are
left intact to maintain BC.

Closes GH-6112
2020-09-11 11:00:18 +02:00
Nikita Popov
f4b2497ad8 Allocate temporary PCRE match data using ZMM
Create a separate general context that uses ZMM as allocator and
use it to allocate temporary PCRE match data (there is still one
global match data). There is no requirement that the match data
and the compiled regex / match context use the same general context.

This makes sure that we do not leak persistent memory on bailout
and fixes oss-fuzz #25296, on which half the libfuzzer runs
currently get stuck.
2020-09-07 12:30:43 +02:00
Máté Kocsis
ea87d0480f Promote warnings to exceptions in ext/pcre
Closes GH-6006
2020-08-25 18:09:50 +02:00
Nikita Popov
d92229d8c7 Implement named parameters
From an engine perspective, named parameters mainly add three
concepts:

 * The SEND_* opcodes now accept a CONST op2, which is the
   argument name. For now, it is looked up by linear scan and
   runtime cached.
 * This may leave UNDEF arguments on the stack. To avoid having
   to deal with them in other places, a CHECK_UNDEF_ARGS opcode
   is used to either replace them with defaults, or error.
 * For variadic functions, EX(extra_named_params) are collected
   and need to be freed based on ZEND_CALL_HAS_EXTRA_NAMED_PARAMS.

RFC: https://wiki.php.net/rfc/named_params

Closes GH-5357.
2020-07-31 15:53:36 +02:00
George Peter Banyard
af1de14802 Use ZPP string|array union check in PCRE extension 2020-07-09 14:17:19 +02:00
Nikita Popov
302933daea Remove no_separation flag 2020-07-07 09:30:24 +02:00
Nikita Popov
632766a561 Disallow separation in a number of callbacks
All of these clearly do not need separation support.
2020-07-07 09:02:24 +02:00
Max Semenik
2b5de6f839 Remove proto comments from C files
Closes GH-5758
2020-07-06 21:13:34 +02:00
George Peter Banyard
1a2732f9a8 Use ZPP callable check for preg_replace_callback() $callback argument 2020-06-22 15:56:36 +02:00
twosee
83a77015ad Add helper APIs for maybe-interned string creation
Add ZVAL_CHAR/RETVAL_CHAR/RETURN_CHAR as a shortcut for using
ZVAL_INTERNED_STRING and ZSTR_CHAR.

Add zend_string_init_fast() as a helper for the empty string /
one char interned string / zend_string_init() pattern.

Also add corresponding ZVAL_STRINGL_FAST etc macros.

Closes GH-5684.
2020-06-08 15:31:52 +02:00
twosee
88355dd338 Constify char * arguments of APIs
Closes GH-5676.
2020-06-08 10:38:45 +02:00
Nikita Popov
2414b3d775 Ensure ctype_string is NULL for C locale
We already document that this is the case, but currently it's only
true if setlocale() has not been called. Make sure ctype_string is
always NULL, even with an explicit "C" locale call, so we can
more efficiently check whether we are in the "C" locale.

Closes GH-5542.
2020-05-07 21:26:13 +02:00
Nikita Popov
3f76947303 Rename locale_string to ctype_string
To make it more obvious that this only refers to the LC_CTYPE
locale.
2020-05-07 18:45:03 +02:00
Nikita Popov
4fb705a03d Add zend_string_concat2 API 2020-04-14 17:18:05 +02:00
Máté Kocsis
21cfa03f17 Generate function entries for another batch of extensions
Closes GH-5352
2020-04-05 21:15:30 +02:00
Máté Kocsis
01b266aac4 Improve error messages of various extensions
Closes GH-5278
2020-03-23 18:59:04 +01:00
Nicolas Oelgart
aa79a22d32 Add preg_last_error_msg() function
Provides the last PCRE error as a human-readable message, similar
to functionality existing in other extensions, such as
json_last_error_msg().

Closes GH-5185.
2020-02-25 10:26:03 +01:00
Nikita Popov
5cf9710ba8 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fixed bug #79257
2020-02-11 17:32:49 +01:00
Nikita Popov
3a51530963 Fixed bug #79257
Replace an existing entry for a given name only if we have a match.
2020-02-11 17:31:48 +01:00
Nikita Popov
d2befbc17d Merge branch 'PHP-7.4'
* PHP-7.4:
  PCRE: Only remember valid UTF-8 if start offset zero
  PCRE: Check whether start offset is on char boundary
2020-02-07 17:02:49 +01:00
Nikita Popov
cd5591a28d PCRE: Only remember valid UTF-8 if start offset zero
PCRE only validates the string starting from the start offset
(minus maximum look-behind, but let's ignore that), so we can
only remember that the string is fully valid UTF-8 if the original
start offset is zero.
2020-02-07 17:01:39 +01:00
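The caching rule in this message can be sketched as a flag that is set only for offset-zero matches. This is an illustration under assumptions: `demo_subject` and `remember_utf8_validity` are hypothetical names, and the flag is a plain struct field rather than whatever the extension stores on its strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subject string with a cached "known valid UTF-8" flag. */
typedef struct {
    const char *val;
    size_t      len;
    int         known_valid_utf8;
} demo_subject;

/* Call after a UTF-mode match attempt finished without a UTF-8 error.
 * The engine validated only from start_offset onward, so the whole
 * string is known to be valid only when start_offset is zero. */
static void remember_utf8_validity(demo_subject *s, size_t start_offset)
{
    assert(s != NULL);
    if (start_offset == 0) {
        s->known_valid_utf8 = 1;
    }
}
```

A later match on the same subject may skip validation only when the flag is set; a successful match from a non-zero offset proves nothing about the bytes before that offset.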
Nikita Popov
c9e78e6d33 PCRE: Check whether start offset is on char boundary
We need not just the whole string to be UTF-8, but the start
position to be on a character boundary as well. Check this by
looking for a continuation byte.
2020-02-07 16:49:28 +01:00
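The boundary check described here relies on the fact that UTF-8 continuation bytes always have the bit pattern `10xxxxxx`. A minimal sketch, with hypothetical function names and assuming the string as a whole is already known to be valid UTF-8:

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int is_utf8_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Returns 1 if `offset` lies on a character boundary of the valid UTF-8
 * string `s` of length `len`, 0 otherwise. The end of the string counts
 * as a boundary; offsets past the end do not. */
static int offset_on_char_boundary(const char *s, size_t len, size_t offset)
{
    assert(s != NULL);
    if (offset >= len) {
        return offset == len;
    }
    return !is_utf8_continuation((unsigned char)s[offset]);
}
```

For the six-byte string `"h\xC3\xA9llo"`, offset 1 points at the lead byte of the two-byte character and is a boundary, while offset 2 points at its continuation byte and is not.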
Nikita Popov
065224190d Merge branch 'PHP-7.4'
* PHP-7.4:
  Fixed bug #79188
2020-02-05 11:21:34 +01:00
Nikita Popov
e30f52b919 Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fixed bug #79188
2020-02-05 11:21:25 +01:00
Nikita Popov
13bfa9f5ac Fixed bug #79188 2020-02-05 11:18:46 +01:00
Máté Kocsis
9099dbd961 Use RETURN_THROWS() after zend_type_error() 2020-01-01 14:23:21 +01:00
Christoph M. Becker
e7e15450ef Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix #78853: preg_match() may return integer > 1
2019-11-22 19:30:43 +01:00
Christoph M. Becker
cfb643ca2b Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fix #78853: preg_match() may return integer > 1
2019-11-22 19:29:11 +01:00
Christoph M. Becker
e1da72bdf1 Fix #78853: preg_match() may return integer > 1
Commit 54ebebd[1] optimized the match loop, but overlooked that in this
case we must loop only if we're doing global matching.

[1] <http://git.php.net/?p=php-src.git;a=commit;h=54ebebd686255c5f124af718c966edb392782d4a>
2019-11-22 19:26:26 +01:00
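The rule that the match loop may repeat only for global matching can be sketched with a simplified stand-in. This is not the extension's match loop: `count_matches` is a hypothetical function, and `strstr` replaces the regex engine.

```c
#include <assert.h>
#include <string.h>

/* Count non-overlapping occurrences of `needle` in `subject`.
 * With global == 0 the loop stops after the first match, so the
 * result is always 0 or 1; with global != 0 all matches are counted. */
static int count_matches(const char *subject, const char *needle, int global)
{
    assert(subject != NULL && needle != NULL && needle[0] != '\0');
    int matched = 0;
    size_t step = strlen(needle);
    const char *pos = subject;

    while ((pos = strstr(pos, needle)) != NULL) {
        matched++;
        if (!global) {
            break;  /* non-global: never look for a second match */
        }
        pos += step;
    }
    return matched;
}
```

Dropping the `if (!global)` guard reproduces the class of bug described above: a non-global call would report more than one match.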
Nikita Popov
ea6d22cfad Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix php_pcre_mutex_free()
2019-11-07 14:32:03 +01:00
Nikita Popov
e19f0e86dc Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fix php_pcre_mutex_free()
2019-11-07 14:31:55 +01:00
Nikita Popov
6dcc0b859f Fix php_pcre_mutex_free()
We should only set the mutex to NULL if we actually freed it.
Due to missing braces, non-main threads may currently set it to
NULL first.
2019-11-07 14:31:19 +01:00
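The effect of the missing braces can be shown with a reconstruction. This is a sketch, not the php-src source: `demo_mutex` and both functions are hypothetical, and `free` stands in for the real mutex teardown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mutex object. */
typedef struct { int unused; } demo_mutex;

/* Buggy shape: without braces only free() is guarded by the condition,
 * so a non-main thread still resets the pointer without freeing it. */
static void demo_mutex_free_buggy(demo_mutex **mt, int is_main_thread)
{
    assert(mt != NULL);
    if (*mt != NULL && is_main_thread)
        free(*mt);
    *mt = NULL;  /* not part of the if: runs for every caller */
}

/* Fixed shape: the pointer is cleared only when the mutex was freed. */
static void demo_mutex_free_fixed(demo_mutex **mt, int is_main_thread)
{
    assert(mt != NULL);
    if (*mt != NULL && is_main_thread) {
        free(*mt);
        *mt = NULL;
    }
}
```

In the buggy version, a non-main thread that runs first clears the pointer, so the main thread later sees `NULL` and the mutex is never released.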
Nikita Popov
571a3bfc6c Merge branch 'PHP-7.4' 2019-10-08 16:14:19 +02:00
Nikita Popov
68b26ff8cf Merge branch 'PHP-7.3' into PHP-7.4 2019-10-08 16:14:06 +02:00
Nikita Popov
736af5f660 Merge branch 'PHP-7.2' into PHP-7.3 2019-10-08 16:13:17 +02:00
Sergei Turchanov
a8f60ac9dd Add pcre_get_compiled_regex_cache_ex() with local_aware flag
A new function `pcre_get_compiled_regex_cache_ex()` is introduced,
which allows compiling a regexp pattern using the "C" locale instead
of the current locale.

This will be needed to replace setlocale() usage in fileinfo,
which is not thread-safe.
2019-10-08 16:11:55 +02:00
Nikita Popov
647b1c7fcf Remove most uses of ZEND_PARSE_PARAMETERS_END_EX()
As ZPP now throws, it makes no sense to specify an explicit return
value.
2019-10-07 10:02:18 +02:00
Nikita Popov
43358cc7b6 Merge branch 'PHP-7.4' 2019-10-04 16:04:42 +02:00