In 8b3c1a3, this was disallowed to fix#55856, which was a security
issue caused by the /e modifier. The fix that was made was the
"Easier fix" as described in the original report.
With this fix, pattern strings are no longer treated as null terminated,
so null characters can be placed inside and matched against with regex
patterns without security problems, so there is no longer a reason to
give the error. Allowing this is consistent with the behaviour of many
other languages, including JavaScript, and thanks to PCRE2[0], it does
not require manually escaping null characters. Now that we can avoid the
error here without the cost of escaping characters, there is really no
need anymore to stray here from the conventional behaviour.
Currently, null characters are still disallowed before the first
delimiter and in the options section at the end of a regex string, but
these error messages have been updated.
[0] Since PCRE2, pattern strings no longer have to be null terminated,
and raw null characters match as normal.
Closes GH-8114.
Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow; we make sure that this
doesn't happen by catering to the maximal overhead of a `zend_string`.
Closes GH-7597.
- for packed arrays we store just an array of zvals without keys.
- the elements of packed array are accessible throuf as ht->arPacked[i]
instead of ht->arData[i]
- in addition to general ZEND_HASH_FOREACH_* macros, we introduced similar
familied for packed (ZEND_HASH_PACKED_FORECH_*) and real hashes
(ZEND_HASH_MAP_FOREACH_*)
- introduced an additional family of macros to access elements of array
(packed or real hashes) ZEND_ARRAY_ELEMET_SIZE, ZEND_ARRAY_ELEMET_EX,
ZEND_ARRAY_ELEMET, ZEND_ARRAY_NEXT_ELEMENT, ZEND_ARRAY_PREV_ELEMENT
- zend_hash_minmax() prototype was changed to compare only values
Because of smaller data set, this patch may show performance improvement
on some apps and benchmarks that use packed arrays. (~1% on PHP-Parser)
TODO:
- sapi/phpdbg needs special support for packed arrays (WATCH_ON_BUCKET).
- zend_hash_sort_ex() may require converting packed arrays to hash.
Trimming a potentially over-allocated string appears to be reasonable,
so we drop the condition altogether.
We also re-allocate twice the size needed in the first place, and not
roughly tripple the size.
Closes GH-7231.
The way to fix it is to disable certain match start optimizaions. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.
A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.
One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Except strict comparison
is used, this should be acceptable.
Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
The way to fix it is to disable certain match start optimizaions. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.
A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.
One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Except strict comparison
is used, this should be acceptable.
Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
The way to fix it is to disable certain match start optimizaions. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.
A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.
One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Except strict comparison
is used, this should be acceptable.
Signed-off-by: Anatol Belski <ab@php.net>
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
The compile context is shared between patterns, so we need to set
the character tables unconditionally in case we switched from
a non-C locale to the C locale.
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.
Of course, zend_bool is retained as an alias.
Separation can only possibly make sense for array parameters
(or something that can contain arrays, like zval parameters). It
never makes sense to separate a bool.
The deref parameters are also of dubious utility, but leaving them
for now.
Create a separate general context that uses ZMM as allocator and
use it to allocate temporary PCRE match data (there is still one
global match data). There is no requirement that the match data
and the compiled regex / match context use the same general context.
This makes sure that we do not leak persistent memory on bailout
and fixes oss-fuzz #25296, on which half the libfuzzer runs
currently get stuck.
From an engine perspective, named parameters mainly add three
concepts:
* The SEND_* opcodes now accept a CONST op2, which is the
argument name. For now, it is looked up by linear scan and
runtime cached.
* This may leave UNDEF arguments on the stack. To avoid having
to deal with them in other places, a CHECK_UNDEF_ARGS opcode
is used to either replace them with defaults, or error.
* For variadic functions, EX(extra_named_params) are collected
and need to be freed based on ZEND_CALL_HAS_EXTRA_NAMED_PARAMS.
RFC: https://wiki.php.net/rfc/named_params
Closes GH-5357.
Add ZVAL_CHAR/RETVAL_CHAR/RETURN_CHAR as a shortcut for using
ZVAL_INTERNED_STRING and ZSTR_CHAR.
Add zend_string_init_fast() as a helper for the empty string /
one char interned string / zend_string_init() pattern.
Also add corresponding ZVAL_STRINGL_FAST etc macros.
Closes GH-5684.
We already document that this is the case, but currently it's only
true if setlocale() has not been called. Make sure ctype_string is
always NULL, even with an explicit "C" locale call, so we can
more efficiently check whether we are in the "C" locale.
Closes GH-5542.