This does a compile time transformation of ``iterable`` into ``Traversable|array`` which simplifies some of the LSP variance handling.
The arginfo generation script from stubs is updated to produce a union type when it encounters the type ``iterable``
Extension functions which do not regenerate the arginfo, or write them manually are still supported by mimicking the compile time transformation while registering the function.
Type Reflection is preserved for single ``iterable`` (and ``?iterable``) to produce a ReflectionNamedType with name ``iterable``, however usage of ``iterable`` in union types will be converted to ``array|Traversable``
* ext/oci8: use zend_string_equals()
Eliminate duplicate code.
* main/php_variables: use zend_string_equals_literal()
Eliminate duplicate code.
* Zend/zend_string: add zend_string_equals_cstr()
Allows eliminating duplicate code.
* Zend, ext/{opcache,standard}, main/output: use zend_string_equals_cstr()
Eliminate duplicate code.
* Zend/zend_string: add zend_string_starts_with()
* ext/{opcache,phar,spl,standard}: use zend_string_starts_with()
This adds missing length checks to several callers, e.g. in
cache_script_in_shared_memory(). This is important when the
zend_string is shorter than the string parameter, when memcmp()
happens to check backwards; this can result in an out-of-bounds memory
access.
Previously, code such as subclasses of SplFixedArray would check for method
overrides when instantiating the objects.
This optimization was mentioned as a followup to GH-6552
Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow; we make sure that this
doesn't happen by catering to the maximal overhead of a `zend_string`.
Closes GH-7597.
Add a new interned string handler that fetches an interned string
if it exists, but does not create one if it does not (and instead
returns a non-interned string).
This fixes bug #81142, by preventing the creating of new interned
strings for unserialized array keys.
Closes GH-7360.
Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow, so callers may need to
check that explicitly. To make that easy in a portable way, we
introduce `ZSTR_MAX_LEN`.
Closes GH-7294.
The never type can be used to indicate that a function never
returns, for example because it always unwinds.
RFC: https://wiki.php.net/rfc/noreturn_type
Closes GH-6761.
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.
Of course, zend_bool is retained as an alias.
mysqlnd already creates interned zend_strings for us, so let's
make use of them.
This also required updating the PDO case changing code to work
with potentially shared strings. For the lowercasing, use the
optimized zend_string_tolower() implementation.
Do not decrement the refcount before allocating the new string,
as the allocation operation may bail out and cause a use-after-free
lateron. We can only decrement the refcount once the allocation
has succeeded.
Fixes oss-fuzz #25384.
Voidification of Zend API which always succeeded
Use bool argument types instead of int for boolean arguments
Use bool return type for functions which return true/false (1/0)
Use zend_result return type for functions which return SUCCESS/FAILURE as they don't follow normal boolean semantics
Closes GH-6002
Add ZVAL_CHAR/RETVAL_CHAR/RETURN_CHAR as a shortcut for using
ZVAL_INTERNED_STRING and ZSTR_CHAR.
Add zend_string_init_fast() as a helper for the empty string /
one char interned string / zend_string_init() pattern.
Also add corresponding ZVAL_STRINGL_FAST etc macros.
Closes GH-5684.
According to RFC: https://wiki.php.net/rfc/union_types_v2
The type representation now makes use of both the pointer payload
and the type mask at the same time. Additionall, zend_type_list is
introduced as a new kind of pointer payload, which is used to store
multiple class types. Each of the class types is a tagged pointer,
which may be either a class name or class entry. The latter is only
used for typed properties, while arguments/returns will instead use
cache slots. A type list can contain a mix of both names and CEs at
the same time, as not all classes may be resolvable.
One thing this is missing is support for union types in arginfo
and stubs, which I want to handle separately.
I've also dropped the special object code from the JIT implementation
for now -- I plan to add this back in a different form at a later time.
For now I did not want to include non-trivial JIT changes together
with large functional changes.
Another possible piece of follow-up work is to implement "iterable"
as an internal alias for "array|Traversable". I believe this will
eliminate quite a few special-cases that had to be implemented.
Closes GH-4838.
This switches zend_type from storing a single IS_* type code to
storing a MAY_BE_* type mask. Right now most code still assumes
that there is only a single type in the mask (or two together
with MAY_BE_NULL). But this will make it a lot simpler to introduce
union types.
An additional advantage (and why I'm doing this separately), is
that a number of special cases no longer need to be handled
separately: We can do a single mask & (1 << type) check to handle
all simple types, booleans (true|false) and null.
RFC: https://wiki.php.net/rfc/tostring_exceptions
And convert some object to string conversion related recoverable
fatal errors into Error exceptions.
Improve exception safety of internal code performing string
conversions.
We currently have a large performance problem when implementing lexers
working on UTF-8 strings in PHP. This kind of code tends to perform a
large number of matches at different offsets on a single string. This
is generally fast. However, if /u mode is used, the full string will
be UTF-8 validated on each match. This results in quadratic runtime.
This patch fixes the issue by adding a IS_STR_VALID_UTF8 flag, which
is set when we have determined that the string is valid UTF8 and
further validation is skipped.
A limitation of this approach is that we can't set the flag for interned
strings. I think this is not a problem for this use-case which will
generally work on dynamic data. If we want to use this flag for other
purposes as well (mbstring?) then it might be worthwhile to UTF-8 validate
strings during interning. But right now this doesn't seem useful.