Inline the lookup whether a function is observed at all.
This strategy is also used for FRAMELESS calls. If the frameless call is observed, we instead allocate a call frame and push the arguments, to call the the function afterwards.
Doing so is still a performance benefit as opposed to executing individual INIT_FCALL+SEND_VAL ops. Thus, even if the frameless call turns out to be observed, the call overhead is slightly lower than before.
If the internal function is not observed at all, the unavoidable overhead is fetching the FLF zend_function pointer and the run-time cache needs to be inspected.
As part of this work, it turned out to be most viable to put the result operand on the ZEND_OP_DATA instead of ZEND_FRAMELESS_ICALL_3, allowing seamless interoperability with the DO_ICALL opcode.
This is a bit unusual in comparison to all other ZEND_OP_DATA usages, but seems to not pose problems overall.
There is also a small issue resolved: trampolines would always use the ZEND_CALL_TRAMPOLINE_SPEC_OBSERVER function due to zend_observer_fcall_op_array_extension being set to -1 too late.
* zend_compile: Add `zend_compile_rope_finalize()`
This just extracts the implementation as-is into a dedicated function to make
it reusable in preparation of a future commit.
* zend_compile: Use clearer parameter names for `zend_compile_rope_finalize()`
* zend_compile: Fix `zend_compile_rope_finalize()` for ropes containing a single constant string
Without this Opcache will trigger a use-after-free in
`zend_optimizer_compact_literals()`.
Co-authored-by: Ilija Tovilo <ilija.tovilo@me.com>
* zend_compile: Optimize `sprintf()` into a rope
This optimization will compile `sprintf()` using only `%s` placeholders into a
rope at compile time, effectively making those calls equivalent to the use of
string interpolation, with the added benefit of supporting arbitrary
expressions instead of just expressions starting with a `$`.
For a synthetic test using:
<?php
$a = 'foo';
$b = 'bar';
for ($i = 0; $i < 100_000_000; $i++) {
sprintf("%s-%s", $a, $b);
}
This optimization yields a 2.1× performance improvement:
$ hyperfine 'sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php' \
'/tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php'
Benchmark 1: sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php
Time (mean ± σ): 1.869 s ± 0.033 s [User: 1.865 s, System: 0.003 s]
Range (min … max): 1.840 s … 1.945 s 10 runs
Benchmark 2: /tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php
Time (mean ± σ): 4.011 s ± 0.034 s [User: 4.006 s, System: 0.005 s]
Range (min … max): 3.964 s … 4.079 s 10 runs
Summary
sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php ran
2.15 ± 0.04 times faster than /tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php
This optimization comes with a small and probably insignificant behavioral
change: If one of the values cannot be (cleanly) converted to a string, for
example when attempting to insert an object that is not `Stringable`, the
resulting Exception will naturally not show the `sprintf()` call in the
resulting stack trace, because there is no call to `sprintf()`.
Nevertheless it will correctly point out the line of the `sprintf()` call as
the source of the Exception, pointing the user towards the correct location.
* zend_compile: Eagerly handle empty format strings in `sprintf()` optimization
* zend_compile: Add additional explanatory comments to zend_compile_func_sprintf()
* Add zero-argument test to sprintf_rope_optimization_001.phpt
---------
Co-authored-by: Ilija Tovilo <ilija.tovilo@me.com>
These are either undefined or defined (to value 1):
- __DragonFly__
- __FreeBSD__
- HAS_MCAST_EXT
- HAVE_GETCWD
- HAVE_GETWD
- HAVE_GLIBC_ICONV
- HAVE_JIT
- HAVE_LCHOWN
- HAVE_NL_LANGINFO
- HAVE_RL_CALLBACK_READ_CHAR
- HAVE_RL_ON_NEW_LINE
- HAVE_SQL_EXTENDED_FETCH
- HAVE_UTIME
Follow up of GH-5526 (-Wundef)
* Mark many functions as static
Multiple functions are missing the static qualifier.
* remove unused struct sigactions
struct sigaction act, old_term, old_quit, old_int;
all unused.
* optimizer: minXOR and maxXOR are unused
* Include the source location in Closure names
This change makes stack traces involving Closures, especially multiple
different Closures, much more useful, because it's more easily visible *which*
closure was called for a given stack frame.
The implementation is similar to that of anonymous classes which already
include the file name and line number within their generated classname.
* Update scripts/dev/bless_tests.php for closure naming
* Adjust existing tests for closure naming
* Adjust tests for closure naming that were not caught locally
* Drop the namespace from closure names
This is redundant with the included filename.
* Include filename and line number as separate keys in Closure debug info
* Fix test
* Fix test
* Include the surrounding class and function name in closure names
* Fix test
* Relax test expecations
* Fix tests after merge
* NEWS / UPGRADING
I added the function/method name to some compile-time deprecation messages which are related to parameters/return values. Consistently with the other similar error messages, I included the function/method name at the start of the message.
The force_allow_null parameter is always false, therefore it is not needed. This simplification paves the road for deprecating implicitly nullable parameter types.
* Zend: Make zend_strnlen available for use outside zend_compile
* exif: remove local php_strnlen, use zend_strnlen instead
* main: remove local strnlen, use zend_strnlen instead
* phar: remove local strnlen, use zend_strnlen
Evaluating constants at comptime can result in arrays that contain objects. This
is problematic for printing the default value of constant ASTs containing
objects, because we don't actually know what the constructor arguments were.
Avoid this by not propagating array constants.
Fixes GH-11937
Closes GH-11947
opnum_start denotes the start of the ZEND_FREE block of skipped consuming
opcodes. Storing the number before zend_compile_expr(..., label_ast) makes it
seem like it denotes the start of the label block. However, label_ast must only
be a zval string AST, and as such never results in an actual opcode.
We don't want to invoke calls twice, even if they are considered "variables",
i.e. might be writable if returning a reference. Function calls behave the same
in all BP contexts so they don't need to be invoked twice. The singular
exception to this is nullsafe coalesce in isset/empty, because it needs to
return false/true respectively when short-circuited. However, since nullsafe
calls are not allwed in write context we may ignore this problem.
Closes GH-11592
We transform the arrow function by nesting the expression into a return
statement. If we compile the arrow function twice this would be done twice,
leading to a compile assertion.
Fix oss-fuzz #60411
Closes GH-11632
Normally, PHP evaluates all expressions in offsets (property or array), as well
as the right hand side of assignments before actually fetching the offsets. This
is well explained in this blog post.
https://www.npopov.com/2017/04/14/PHP-7-Virtual-machine.html#writes-and-memory-safety
For ??= we have a bit of a problem in that the rhs must only be evaluated if the
lhs is null or undefined. Thus, we have to first compile the lhs with BP_VAR_IS,
conditionally run the rhs and then re-fetch the lhs with BP_VAR_W to to make
sure the offsets are valid if they have been invalidated.
However, we don't want to just re-evaluate the entire lhs because it may contain
side-effects, as in $array[$x++] ??= 42;. In this case, we don't want to
re-evaluate $x++ because it would result in writing to a different offset than
was previously tested. The same goes for function calls, like
$array[foo()] ??= 42;, where the second call to foo() might result in a
different value. PHP behaves correctly in these cases. This is implemented by
memoizing sub-expressions in the lhs of ??= and reusing them when compiling the
lhs for the second time. This is done for any expression that isn't a variable,
i.e. anything that can (potentially) be written to.
Unfortunately, this also means that function calls are considered writable due
to their return-by-reference semantics, and will thus not be memoized. The
expression foo()['bar'] ??= 42; will invoke foo() twice. Even worse,
foo(bar()) ??= 42; will call both foo() and bar() twice, but
foo(bar() + 1) ??= 42; will only call foo() twice. This is likely not by design,
and was just overlooked in the implementation. The RFC does not specify how
function calls in the lhs of the coalesce assignment behaves. This should
probably be improved in the future.
Now, the problem this commit actually fixes is that ??= may memoize expressions
inside assert() function calls that may not actually execute. This is not only
an issue when using the VAR in the second expression (which would usually also
be skipped) but also when freeing the VAR. For this reason, it is not safe to
memoize assert() sub-expressions.
There are two possible solutions:
1. Don't memoize any sub-expressions of assert(), meaning they will execute
twice.
2. Throw a compile error.
Option 2 is not quite simple, because we can't disallow all memoization inside
assert(), as that would break assertions like assert($array[foo()] ??= 'bar');.
Code like this is highly unlikely (and dubious) but possible. In this case, we
would need to make sure that a memoized value could not be used across the
assert boundary it was created in. The complexity for this is not worthwhile. So
we opt for option 1 and disable memoization immediately inside assert().
Fixes GH-11580
Closes GH-11581