620 commits

Author SHA1 Message Date
Nikita Popov
3ec55d6cbf Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Clarify that preg_match_all() cannot return null
2021-11-18 10:37:18 +01:00
Nikita Popov
bc6ec0a109 Clarify that preg_match_all() cannot return null 2021-11-18 10:36:35 +01:00
Remi Collet
a6f5c2dc8b
fix for pcre2 10.38 2021-10-21 13:37:26 +02:00
Remi Collet
17aae1302e
Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  fix for pcre2 10.38
  fix for pcre2 10.38
2021-10-21 13:34:28 +02:00
Remi Collet
dd61002676
fix for pcre2 10.38 2021-10-21 13:34:09 +02:00
Christoph M. Becker
e80dbd5f38
Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #81243: Too much memory is allocated for preg_replace()
2021-07-12 18:38:24 +02:00
Christoph M. Becker
5fb5a739e2
Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #81243: Too much memory is allocated for preg_replace()
2021-07-12 18:35:49 +02:00
Christoph M. Becker
a6b43086e6
Fix #81243: Too much memory is allocated for preg_replace()
Trimming a potentially over-allocated string appears to be reasonable,
so we drop the condition altogether.

We also re-allocate twice the size needed in the first place, and not
roughly triple the size.

Closes GH-7231.
2021-07-12 18:33:55 +02:00
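
The allocation strategy described in this commit message can be illustrated with a minimal C sketch. This is not the actual php_pcre.c code: `result_buf`, `result_buf_append` and `result_buf_trim` are hypothetical names, and plain `realloc()` stands in for PHP's own string allocation. It only shows the two ideas named above, growing to twice the size needed and trimming unconditionally at the end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable result buffer; names are illustrative only. */
typedef struct {
    char  *data;
    size_t len; /* bytes in use */
    size_t cap; /* bytes allocated */
} result_buf;

/* Append n bytes. When capacity runs out, grow to twice the size
 * needed (not roughly three times). Returns 0 on success, -1 on
 * allocation failure. Overflow of 2 * needed is ignored for brevity. */
static int result_buf_append(result_buf *b, const char *src, size_t n)
{
    assert(b != NULL);
    if (n == 0) {
        return 0;
    }
    size_t needed = b->len + n;
    if (needed > b->cap) {
        size_t new_cap = 2 * needed;
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len = needed;
    return 0;
}

/* Trim the allocation to the used length, with no extra condition
 * deciding whether trimming is "worth it". */
static int result_buf_trim(result_buf *b)
{
    assert(b != NULL);
    if (b->len == 0) {
        free(b->data);
        b->data = NULL;
        b->cap = 0;
        return 0;
    }
    char *p = realloc(b->data, b->len);
    if (p == NULL) {
        return -1;
    }
    b->data = p;
    b->cap = b->len;
    return 0;
}
```

Appending 5 bytes to an empty buffer allocates 10 bytes; appending 6 more needs 11 and grows the buffer to 22; the final trim shrinks it back to 11.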
Patrick Allaert
aff365871a Fixed some spaces used instead of tabs 2021-06-29 11:30:26 +02:00
Anatol Belski
f7ab7951f1
pcre: Workaround bug #81101
The way to fix it is to disable certain match start optimizations. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.

A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.

One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Unless strict comparison
is used, this should be acceptable.

Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
2021-06-19 15:25:17 +02:00
Anatol Belski
1a1d86d562 pcre: Workaround bug #81101
The way to fix it is to disable certain match start optimizations. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.

A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.

One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Unless strict comparison
is used, this should be acceptable.

Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
2021-06-19 15:23:43 +02:00
Anatol Belski
cfec7a4131
pcre: Apply upstream patch for bug #81101 to bundled libpcre
Signed-off-by: Anatol Belski <ab@php.net>
2021-06-06 19:37:55 +02:00
Anatol Belski
d188ca7688
pcre: Workaround bug #81101
The way to fix it is to disable certain match start optimizations. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.

A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.

One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Unless strict comparison
is used, this should be acceptable.

Signed-off-by: Anatol Belski <ab@php.net>
2021-06-06 18:02:53 +02:00
George Peter Banyard
aca6aefd85
Remove 'register' type qualifier (#6980)
The compiler should be smart enough to optimize this on its own
2021-05-14 13:38:01 +01:00
George Peter Banyard
c40231afbf
Mark various functions with void arguments.
This fixes a bunch of [-Wstrict-prototypes] warnings,
because in C func() and func(void) have different semantics.
2021-05-12 14:55:53 +01:00
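
The difference between `func()` and `func(void)` mentioned in this commit message can be seen in a small C sketch. The function `answer` and the helper `check_answer` are hypothetical and are not taken from the PHP source.

```c
#include <assert.h>

/* A real prototype: answer takes no arguments, so a mistaken call
 * such as answer(1) is rejected at compile time. */
static int answer(void);

/* In C before C23, the declaration `static int answer();` would
 * instead leave the parameters unspecified. The compiler could then
 * not diagnose answer(1), and -Wstrict-prototypes warns about such
 * declarations. */

static int answer(void)
{
    return 42;
}

/* Usage: the call site must match the (void) prototype exactly. */
static void check_answer(void)
{
    assert(answer() == 42);
}
```

Declaring the parameter list as `(void)` costs nothing at run time; it only gives the compiler enough information to check call sites.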
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
Nikita Popov
3b88e65265 Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix locale switch back to C in pcre
2021-03-18 10:51:04 +01:00
Nikita Popov
4dce2f83f5 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix locale switch back to C in pcre
2021-03-18 10:50:57 +01:00
Nikita Popov
4be867e910 Fix locale switch back to C in pcre
The compile context is shared between patterns, so we need to set
the character tables unconditionally in case we switched from
a non-C locale to the C locale.
2021-03-18 10:48:43 +01:00
Nikita Popov
b82b85709d Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix bug #80866
2021-03-15 14:48:09 +01:00
Nikita Popov
50254de0a2 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix bug #80866
2021-03-15 14:48:02 +01:00
Dharman
282355efd5 Fix bug #80866
Closes GH-6774.
2021-03-15 14:47:45 +01:00
Nikita Popov
3e01f5afb1 Replace zend_bool uses with bool
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.

Of course, zend_bool is retained as an alias.
2021-01-15 12:33:06 +01:00
Nikita Popov
1b2aba285d Remove Z_PARAM separate params where they don't make sense
Separation can only possibly make sense for array parameters
(or something that can contain arrays, like zval parameters). It
never makes sense to separate a bool.

The deref parameters are also of dubious utility, but leaving them
for now.
2021-01-14 11:58:08 +01:00
Nikita Popov
41b8cdd2e0 Don't leak pcre error_code across requests 2020-10-22 11:20:02 +02:00
Máté Kocsis
6b00196e04
Review parameter names in ext/pcre
Closes GH-6259
2020-10-02 11:55:23 +02:00
Nikita Popov
d81ea5e928 Fix preg_replace_callback_array() with array subject
Apparently this "feature" was completely untested...
2020-09-15 12:03:18 +02:00
Máté Kocsis
c98d47696f
Consolidate new union type ZPP macro names
They will now follow the canonical order of types. Older macros are
left intact due to maintaining BC.

Closes GH-6112
2020-09-11 11:00:18 +02:00
Nikita Popov
f4b2497ad8 Allocate temporary PCRE match data using ZMM
Create a separate general context that uses ZMM as allocator and
use it to allocate temporary PCRE match data (there is still one
global match data). There is no requirement that the match data
and the compiled regex / match context use the same general context.

This makes sure that we do not leak persistent memory on bailout
and fixes oss-fuzz #25296, on which half the libfuzzer runs
currently get stuck.
2020-09-07 12:30:43 +02:00
Máté Kocsis
ea87d0480f
Promote warnings to exceptions in ext/pcre
Closes GH-6006
2020-08-25 18:09:50 +02:00
Nikita Popov
d92229d8c7 Implement named parameters
From an engine perspective, named parameters mainly add three
concepts:

 * The SEND_* opcodes now accept a CONST op2, which is the
   argument name. For now, it is looked up by linear scan and
   runtime cached.
 * This may leave UNDEF arguments on the stack. To avoid having
   to deal with them in other places, a CHECK_UNDEF_ARGS opcode
   is used to either replace them with defaults, or error.
 * For variadic functions, EX(extra_named_params) are collected
   and need to be freed based on ZEND_CALL_HAS_EXTRA_NAMED_PARAMS.

RFC: https://wiki.php.net/rfc/named_params

Closes GH-5357.
2020-07-31 15:53:36 +02:00
George Peter Banyard
af1de14802 Use ZPP string|array union check in PCRE extension 2020-07-09 14:17:19 +02:00
Nikita Popov
302933daea Remove no_separation flag 2020-07-07 09:30:24 +02:00
Nikita Popov
632766a561 Disallow separation in a number of callbacks
All of these clearly do not need separation support.
2020-07-07 09:02:24 +02:00
Max Semenik
2b5de6f839
Remove proto comments from C files
Closes GH-5758
2020-07-06 21:13:34 +02:00
George Peter Banyard
1a2732f9a8 Use ZPP callable check for preg_replace_callback() $callback argument 2020-06-22 15:56:36 +02:00
twosee
83a77015ad Add helper APIs for maybe-interned string creation
Add ZVAL_CHAR/RETVAL_CHAR/RETURN_CHAR as a shortcut for using
ZVAL_INTERNED_STRING and ZSTR_CHAR.

Add zend_string_init_fast() as a helper for the empty string /
one char interned string / zend_string_init() pattern.

Also add corresponding ZVAL_STRINGL_FAST etc macros.

Closes GH-5684.
2020-06-08 15:31:52 +02:00
twosee
88355dd338 Constify char * arguments of APIs
Closes GH-5676.
2020-06-08 10:38:45 +02:00
Nikita Popov
2414b3d775 Ensure ctype_string is NULL for C locale
We already document that this is the case, but currently it's only
true if setlocale() has not been called. Make sure ctype_string is
always NULL, even with an explicit "C" locale call, so we can
more efficiently check whether we are in the "C" locale.

Closes GH-5542.
2020-05-07 21:26:13 +02:00
Nikita Popov
3f76947303 Rename locale_string to ctype_string
To make it more obvious that this only refers to the LC_CTYPE
locale.
2020-05-07 18:45:03 +02:00
Nikita Popov
4fb705a03d Add zend_string_concat2 API 2020-04-14 17:18:05 +02:00
Máté Kocsis
21cfa03f17
Generate function entries for another batch of extensions
Closes GH-5352
2020-04-05 21:15:30 +02:00
Máté Kocsis
01b266aac4
Improve error messages of various extensions
Closes GH-5278
2020-03-23 18:59:04 +01:00
Nicolas Oelgart
aa79a22d32 Add preg_last_error_msg() function
Provides the last PCRE error as a human-readable message, similar
to functionality existing in other extensions, such as
json_last_error_msg().

Closes GH-5185.
2020-02-25 10:26:03 +01:00
Nikita Popov
5cf9710ba8 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fixed bug #79257
2020-02-11 17:32:49 +01:00
Nikita Popov
3a51530963 Fixed bug #79257
Replace an existing entry for a given name only if we have a match.
2020-02-11 17:31:48 +01:00
Nikita Popov
d2befbc17d Merge branch 'PHP-7.4'
* PHP-7.4:
  PCRE: Only remember valid UTF-8 if start offset zero
  PCRE: Check whether start offset is on char boundary
2020-02-07 17:02:49 +01:00
Nikita Popov
cd5591a28d PCRE: Only remember valid UTF-8 if start offset zero
PCRE only validates the string starting from the start offset
(minus maximum look-behind, but let's ignore that), so we can
only remember that the string is fully valid UTF-8 if the original
start offset is zero.
2020-02-07 17:01:39 +01:00
Nikita Popov
c9e78e6d33 PCRE: Check whether start offset is on char boundary
We need not just the whole string to be UTF-8, but the start
position to be on a character boundary as well. Check this by
looking for a continuation byte.
2020-02-07 16:49:28 +01:00
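
The continuation-byte check described in this commit message can be sketched in C as follows. This is a minimal illustration, not the code from the commit: `offset_on_char_boundary` is a hypothetical helper, and it assumes the subject has already been validated as UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when `offset` does not point into
 * the middle of a multi-byte UTF-8 sequence. Assumes `subject` is
 * already known to be valid UTF-8 and that offset <= len. */
static bool offset_on_char_boundary(const char *subject, size_t len, size_t offset)
{
    assert(offset <= len);
    if (offset == len) {
        return true; /* the end of the string is always a boundary */
    }
    /* UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
    return ((unsigned char) subject[offset] & 0xC0) != 0x80;
}
```

For the three-byte string "a" followed by U+00E9 (bytes 0x61 0xC3 0xA9), offsets 0, 1 and 3 are boundaries, while offset 2 lands on the continuation byte 0xA9 and is rejected.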
Nikita Popov
065224190d Merge branch 'PHP-7.4'
* PHP-7.4:
  Fixed bug #79188
2020-02-05 11:21:34 +01:00