Adding strings to the worklist is useless, because they never contribute to
cycles. The assembly size on x86_64 does not change. This significantly improves
performance in this synthetic benchmark by 33%.
function test($a) {}
$a = new stdClass();
$a->self = $a;
$a->prop1 = str_repeat('a', 10);
$a->prop2 = str_repeat('a', 10);
$a->prop3 = str_repeat('a', 10);
$a->prop4 = str_repeat('a', 10);
$a->prop5 = str_repeat('a', 10);
$a->prop6 = str_repeat('a', 10);
$a->prop7 = str_repeat('a', 10);
$a->prop8 = str_repeat('a', 10);
$a->prop9 = str_repeat('a', 10);
$a->prop10 = str_repeat('a', 10);
for ($i = 0; $i < 10_000_000; $i++) {
test($a);
gc_collect_cycles();
}
This requires adding IS_TYPE_COLLECTABLE to IS_REFERENCE_EX to ensure these
values continue to be pushed onto the stack. Luckily, IS_TYPE_COLLECTABLE is
currently only used in gc_check_possible_root(), where the checked value cannot
be a reference.
Note that this changes the output of gc_collect_cycles(). Non-cyclic, refcounted
values no longer count towards the total reported values collected.
Also, there is some obvious overlap with GH-17130. This change should be good
nonetheless, especially if we can remove the GC_COLLECTABLE(Z_COUNTED_P(zv))
condition in PHP 9 and rely on Z_COLLECTABLE_P() exclusively, given we can
assume an object doesn't become cyclic at runtime anymore.
Closes GH-17194
GCC produces exactly the same binary with and without this change (without
extensions), which demonstrates two things:
* There is no additional register pressure.
* All usages of the macros were correct in older branches, i.e. the expressions
did not have any side-effects.
This flag was never necessary. We know a static variable is uninitialized (i.e.
the initializer has never been called) iff the zval in the static variable array
does not contain a reference.
Prompted by a related issue in ext-uopz reported by Christoph.
clang 18 is going to be released and in the meantime the counted_by
attribute usage had been constrained to true flexible arrays,
typical cases such as type name[1] ZEND_ELEMENT_COUNT(size) no longer
build.
When executing a foreach ($ht as &$ref), foreach calls zend_hash_iterator_pos_ex() on every iteration. If the HashTable contained in the $ht variable is not the tracked HashTable, it will reset the position to the internal array pointer of the array currently in $ht.
This behaviour is generally fine, but undesirable for copy-on-write copies of the iterated HashTable. This may trivially occur when the iterated over HashTable is assigned to some variable, then the iterated over variable modified, leading to array separation, changing the HashTable pointer in the variable. Thus foreach happily restarting iteration.
This behaviour (despite existing since PHP 7.0) is considered a bug, if not only for the behaviour being unexpected to the user, also copy-on-write should not have trivially observable side-effects by mere assignment.
The bugfix consists of duplicating HashTableIterators whenever zend_array_dup() is called (the primitive used on array separation).
When a further access to the HashPosition through the HashTableIterators API happens and the HashTable does not match the tracked one, all the duplicates (which are tracked by single linked list) are searched for the wanted HashTable. If found, the HashTableIterator is replaced by the found copy and all other copies are removed.
This ensures that we always end up tracking the correct HashTable.
Fixes GH-11244.
Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
This PR introduces a new way of recursion protection in JSON, var_dump
and friends. It fixes issue in master for __debugInfo and also improves
perf for jsonSerializable in some cases. More info can be found in
GH-10020.
Closes GH-11812
This allows array_merge, array_intersect, array_replace, array_unique
and usort to avoid taking a copy and do the transformation in-place.
** Safety **
There are some array functions which take a copy of the input
array into a temporary C array for sorting purposes.
(e.g. array_unique, array_diff, and array_intersect do this).
Since we no longer take a copy in all cases, we must check if
it's possible that a value is accessed that was already destroyed.
For array_unique: cmpdata will never be removed so that will never reach
refcount 0. And when something is removed, it is the previous value of
cmpdata, not the one user later. So this seems okay.
For array_intersect: a previous pointer (ptr[0] - 1) is accessed.
But this can't be a destroyed value because the pointer is first moved forward.
For array_diff: it's possible a previous pointer is accessed after
destruction. So we can't optimise this case easily.
Commit d835de1993 added support for AVX2 in hash table initialization
code. The same kind of code also occurs for HT_HASH_RESET. However, this
place was forgotten in that patch. That is unfortunate, because a loop
is just when there may be the most benefit from this SIMD sequence.
Furthermore, the NEON special handling exists in the initialization code
but is also missing from HT_HASH_RESET, so add this as well.
`zend_rc_debug` is not a type and does not really belong in
`zend_types.h`; this allows using `ZEND_RC_MOD_CHECK()` without
including the huge `zend_types.h` header and allows decoupling
circular header dependencies.
`zend_uchar` suggests that the value is an ASCII character, but here,
it's about very small integers. This is misleading, so let's use a
C99 integer instead.
On all architectures currently supported by PHP, `zend_uchar` and
`uint8_t` are identical. This change is only about code readability.
Many sources need just `zend_result`, and with this new header, they
only need to include `zend_result.h` instead of `zend_types.h`; the
latter is large and has fat dependencies, which slows down the build.
zend_hash allocates a hash table twice as big as nTableSize
(HT_HASH_SIZE(HT_SIZE_TO_MASK(nTableSize)) == nTableSize*2), so HT_MAX_SIZE
must be half the max table size or less.
Fixes GH-10240
This does a compile time transformation of ``iterable`` into ``Traversable|array`` which simplifies some of the LSP variance handling.
The arginfo generation script from stubs is updated to produce a union type when it encounters the type ``iterable``
Extension functions which do not regenerate the arginfo, or write them manually are still supported by mimicking the compile time transformation while registering the function.
Type Reflection is preserved for single ``iterable`` (and ``?iterable``) to produce a ReflectionNamedType with name ``iterable``, however usage of ``iterable`` in union types will be converted to ``array|Traversable``
- for packed arrays we store just an array of zvals without keys.
- the elements of packed array are accessible throuf as ht->arPacked[i]
instead of ht->arData[i]
- in addition to general ZEND_HASH_FOREACH_* macros, we introduced similar
familied for packed (ZEND_HASH_PACKED_FORECH_*) and real hashes
(ZEND_HASH_MAP_FOREACH_*)
- introduced an additional family of macros to access elements of array
(packed or real hashes) ZEND_ARRAY_ELEMET_SIZE, ZEND_ARRAY_ELEMET_EX,
ZEND_ARRAY_ELEMET, ZEND_ARRAY_NEXT_ELEMENT, ZEND_ARRAY_PREV_ELEMENT
- zend_hash_minmax() prototype was changed to compare only values
Because of smaller data set, this patch may show performance improvement
on some apps and benchmarks that use packed arrays. (~1% on PHP-Parser)
TODO:
- sapi/phpdbg needs special support for packed arrays (WATCH_ON_BUCKET).
- zend_hash_sort_ex() may require converting packed arrays to hash.
Objects reuse the GC_PERSISTENT flag as IS_OBJ_WEAKLY_REFERENCED,
which we did not account for in ZVAL_COPY_OR_DUP. To make things
worse the incorrect zval_copy_ctor_func() invocation silently did
nothing. To avoid that, add an assertion that it should only be
called with arrays and strings (unlike the normal zval_copy_ctor()
which can be safely called on any zval).
Currently, resource IDs are limited to 32-bits. As resource IDs
are not reused, this means that resource ID overflow for
long-running processes is very possible.
This patch switches resource IDs to use zend_long instead, which
means that on 64-bit systems, 64-bit resource IDs will be used.
This makes resource ID overflow practically impossible.
The tradeoff is an 8 byte increase in zend_resource size.
Closes GH-7436.
Currently, CE_CACHE on strings is only used with opcache interned strings. This
patch extends usage to non-opcache interned strings as well. This means that
most type strings can now make use of CE_CACHE even if opcache is not loaded,
which allows us to remove TYPE_HAS_CE kind, and fix some discrepancies
depending on whether a type stores a resolved or non-resolved name.
There are two cases where CE_CACHE will not be used:
* When opcache is not used and a permanent interned string (that is not an
internal class name) is used as a type name during the request. In this case
we can't allocate a map_ptr index for the permanent string, as it would be
not be in the permanent map_ptr index space.
* When opcache is used but the script is not cached (e.g. eval'd code or
opcache full). If opcache is used, we can't allocate additional map_ptr
indexes at runtime, because they may conflict with indexes allocated by
opcache.
In these two cases we would end up not using CE caching for property types
(argument/return types still have the separate cache slot).