When executing a foreach ($ht as &$ref), foreach calls zend_hash_iterator_pos_ex() on every iteration. If the HashTable contained in the $ht variable is not the tracked HashTable, it will reset the position to the internal array pointer of the array currently in $ht.
This behaviour is generally fine, but undesirable for copy-on-write copies of the iterated HashTable. This may trivially occur when the iterated over HashTable is assigned to some variable, then the iterated over variable modified, leading to array separation, changing the HashTable pointer in the variable. Thus foreach happily restarting iteration.
This behaviour (despite existing since PHP 7.0) is considered a bug, if not only for the behaviour being unexpected to the user, also copy-on-write should not have trivially observable side-effects by mere assignment.
The bugfix consists of duplicating HashTableIterators whenever zend_array_dup() is called (the primitive used on array separation).
When a further access to the HashPosition through the HashTableIterators API happens and the HashTable does not match the tracked one, all the duplicates (which are tracked by single linked list) are searched for the wanted HashTable. If found, the HashTableIterator is replaced by the found copy and all other copies are removed.
This ensures that we always end up tracking the correct HashTable.
Fixes GH-11244.
Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
This PR introduces a new way of recursion protection in JSON, var_dump
and friends. It fixes issue in master for __debugInfo and also improves
perf for jsonSerializable in some cases. More info can be found in
GH-10020.
Closes GH-11812
This allows array_merge, array_intersect, array_replace, array_unique
and usort to avoid taking a copy and do the transformation in-place.
** Safety **
There are some array functions which take a copy of the input
array into a temporary C array for sorting purposes.
(e.g. array_unique, array_diff, and array_intersect do this).
Since we no longer take a copy in all cases, we must check if
it's possible that a value is accessed that was already destroyed.
For array_unique: cmpdata will never be removed so that will never reach
refcount 0. And when something is removed, it is the previous value of
cmpdata, not the one user later. So this seems okay.
For array_intersect: a previous pointer (ptr[0] - 1) is accessed.
But this can't be a destroyed value because the pointer is first moved forward.
For array_diff: it's possible a previous pointer is accessed after
destruction. So we can't optimise this case easily.
Commit d835de1993 added support for AVX2 in hash table initialization
code. The same kind of code also occurs for HT_HASH_RESET. However, this
place was forgotten in that patch. That is unfortunate, because a loop
is just when there may be the most benefit from this SIMD sequence.
Furthermore, the NEON special handling exists in the initialization code
but is also missing from HT_HASH_RESET, so add this as well.
`zend_rc_debug` is not a type and does not really belong in
`zend_types.h`; this allows using `ZEND_RC_MOD_CHECK()` without
including the huge `zend_types.h` header and allows decoupling
circular header dependencies.
`zend_uchar` suggests that the value is an ASCII character, but here,
it's about very small integers. This is misleading, so let's use a
C99 integer instead.
On all architectures currently supported by PHP, `zend_uchar` and
`uint8_t` are identical. This change is only about code readability.
Many sources need just `zend_result`, and with this new header, they
only need to include `zend_result.h` instead of `zend_types.h`; the
latter is large and has fat dependencies, which slows down the build.
zend_hash allocates a hash table twice as big as nTableSize
(HT_HASH_SIZE(HT_SIZE_TO_MASK(nTableSize)) == nTableSize*2), so HT_MAX_SIZE
must be half the max table size or less.
Fixes GH-10240
This does a compile time transformation of ``iterable`` into ``Traversable|array`` which simplifies some of the LSP variance handling.
The arginfo generation script from stubs is updated to produce a union type when it encounters the type ``iterable``
Extension functions which do not regenerate the arginfo, or write them manually are still supported by mimicking the compile time transformation while registering the function.
Type Reflection is preserved for single ``iterable`` (and ``?iterable``) to produce a ReflectionNamedType with name ``iterable``, however usage of ``iterable`` in union types will be converted to ``array|Traversable``
- for packed arrays we store just an array of zvals without keys.
- the elements of packed array are accessible throuf as ht->arPacked[i]
instead of ht->arData[i]
- in addition to general ZEND_HASH_FOREACH_* macros, we introduced similar
familied for packed (ZEND_HASH_PACKED_FORECH_*) and real hashes
(ZEND_HASH_MAP_FOREACH_*)
- introduced an additional family of macros to access elements of array
(packed or real hashes) ZEND_ARRAY_ELEMET_SIZE, ZEND_ARRAY_ELEMET_EX,
ZEND_ARRAY_ELEMET, ZEND_ARRAY_NEXT_ELEMENT, ZEND_ARRAY_PREV_ELEMENT
- zend_hash_minmax() prototype was changed to compare only values
Because of smaller data set, this patch may show performance improvement
on some apps and benchmarks that use packed arrays. (~1% on PHP-Parser)
TODO:
- sapi/phpdbg needs special support for packed arrays (WATCH_ON_BUCKET).
- zend_hash_sort_ex() may require converting packed arrays to hash.
Objects reuse the GC_PERSISTENT flag as IS_OBJ_WEAKLY_REFERENCED,
which we did not account for in ZVAL_COPY_OR_DUP. To make things
worse the incorrect zval_copy_ctor_func() invocation silently did
nothing. To avoid that, add an assertion that it should only be
called with arrays and strings (unlike the normal zval_copy_ctor()
which can be safely called on any zval).
Currently, resource IDs are limited to 32-bits. As resource IDs
are not reused, this means that resource ID overflow for
long-running processes is very possible.
This patch switches resource IDs to use zend_long instead, which
means that on 64-bit systems, 64-bit resource IDs will be used.
This makes resource ID overflow practically impossible.
The tradeoff is an 8 byte increase in zend_resource size.
Closes GH-7436.
Currently, CE_CACHE on strings is only used with opcache interned strings. This
patch extends usage to non-opcache interned strings as well. This means that
most type strings can now make use of CE_CACHE even if opcache is not loaded,
which allows us to remove TYPE_HAS_CE kind, and fix some discrepancies
depending on whether a type stores a resolved or non-resolved name.
There are two cases where CE_CACHE will not be used:
* When opcache is not used and a permanent interned string (that is not an
internal class name) is used as a type name during the request. In this case
we can't allocate a map_ptr index for the permanent string, as it would be
not be in the permanent map_ptr index space.
* When opcache is used but the script is not cached (e.g. eval'd code or
opcache full). If opcache is used, we can't allocate additional map_ptr
indexes at runtime, because they may conflict with indexes allocated by
opcache.
In these two cases we would end up not using CE caching for property types
(argument/return types still have the separate cache slot).
This macro is a footgun because it creates an uninitialized array
(only an allocation). This macro is no longer used in php-src,
and we have better alternatives like array_init() or
ZVAL_ARR(arr, zend_new_array(size_hint)).
The never type can be used to indicate that a function never
returns, for example because it always unwinds.
RFC: https://wiki.php.net/rfc/noreturn_type
Closes GH-6761.
This is generalization of idea, that was previously usesd for caching
resolution of class_entries in zend_type. Now very similar mechanizm is
used for general zend_string into zend_class_entry resolution.
Interned zend_string with IS_STR_CLASS_NAME_MAP_PTR GC_FLAG uses its
refcount to adress corresponding zend_class_entry cache slot.
The refcount keeps an offset to this slot from CG(map_ptr_base).
Flag may be checked by ZSTR_HAS_CE_CACHE(str), cache slot may be read by
ZSTR_GET_CE_CACHE(str) and set by ZSTR_SET_CE_CACHE(str, ce).