Commit graph

715 commits

Author SHA1 Message Date
Niels Dossche
a0476fd32f
Micro-optimize double comparison (#11061)
When using ZEND_NORMALIZE_BOOL(a - b) where a and b are doubles, this
generates the following instruction sequence on x64:
subsd   xmm0, xmm1
pxor    xmm1, xmm1
comisd  xmm0, xmm1
...

whereas if we use ZEND_THREEWAY_COMPARE we get two instructions less:
ucomisd xmm0, xmm1

The only difference is that the threeway compare uses *u*comisd instead
of comisd. The difference is that it will cause a FP signal if a
signaling NAN is used, but as far as I'm aware this doesn't matter for
our use case.

Similarly, the amount of instructions on AArch64 is also quite a bit
lower for this code compared to the old code.

** Results **

Using the benchmark https://gist.github.com/nielsdos/b36517d81a1af74d96baa3576c2b70df
I used hyperfine: hyperfine --runs 25 --warmup 3 './sapi/cli/php sort_double.php'
No extensions such as opcache used during benchmarking.

BEFORE THIS PATCH
-----------------
  Time (mean ± σ):     255.5 ms ±   2.2 ms    [User: 251.0 ms, System: 2.5 ms]
  Range (min … max):   251.5 ms … 260.7 ms    25 runs

AFTER THIS PATCH
----------------
  Time (mean ± σ):     236.2 ms ±   2.8 ms    [User: 228.9 ms, System: 5.0 ms]
  Range (min … max):   231.5 ms … 242.7 ms    25 runs
2023-04-14 18:22:42 +02:00
Ilija Tovilo
8360efde8d
Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix add_function_array() assertion when op2 contains op1
2023-04-03 12:49:43 +02:00
Ilija Tovilo
c4f56c5099
Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix add_function_array() assertion when op2 contains op1
2023-04-03 12:49:33 +02:00
Ilija Tovilo
84b4020eb4
Fix add_function_array() assertion when op2 contains op1
Fixes GH-10085
Closes GH-10975
Co-authored-by: Dmitry Stogov <dmitry@zend.com>
2023-04-03 12:48:46 +02:00
Niels Dossche
2b9d2bcee7 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix undefined behaviour in string uppercasing and lowercasing
2023-03-25 21:28:09 +01:00
Niels Dossche
bf487bde13 Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix undefined behaviour in string uppercasing and lowercasing
2023-03-25 21:22:35 +01:00
Niels Dossche
93e0f6b424 Fix undefined behaviour in string uppercasing and lowercasing
At least on 32-bit, the address computations overflow in running the
test on CI with UBSAN enabled. Fix it by reordering the arithmetic.
Since a part of the expression is already used in the code above the
computation, this should not negatively affect performance.

Closes GH-10936.
2023-03-25 21:17:15 +01:00
Max Kellermann
d5c649b36b
zend_compiler, ...: use uint8_t instead of zend_uchar (#10621)
`zend_uchar` suggests that the value is an ASCII character, but here,
it's about very small integers.  This is misleading, so let's use a
C99 integer instead.

On all architectures currently supported by PHP, `zend_uchar` and
`uint8_t` are identical.  This change is only about code readability.
2023-02-23 14:56:54 +00:00
Max Kellermann
49c1e6eb33
Make various pointers const in Zend/ (#10608)
* Zend/zend_operators: pass const pointers to zend_is_identical()

* Zend/zend_operators: pass const pointers to zend_get_{long,double}()

* Zend/Optimizer/sccp: make pointers const

* Zend/Optimizer/scdf: make pointers const

* Zend/Optimizer/zend_worklist: make pointers const

* Zend/Optimizer/zend_optimizer: make pointers const

* Zend/zend_compile: make pointers const
2023-02-20 14:00:59 +00:00
Niels Dossche
99b86141ae Introduce convenience macros for copying flags that hold when concatenating two strings
This abstracts away, and cleans up, the flag handling for properties of
strings that hold when concatenating two strings if they both hold that
property. (These macros also work with simply copies of strings because
a copy of a string can be considered a concatenation with the empty
string.) This gets rid of some branches and some repetitive code, and
leaves room for adding more flags like these in the future.
2023-02-05 14:32:50 +00:00
Alex Dowad
c02af98ae5 Use AVX2 to accelerate strto{upper,lower} (only on 'AVX2-native' builds for now)
On short strings, there is no difference in performance. However, for
strings around 10,000 bytes long, the AVX2-accelerated function is
about 55% faster than the SSE2-accelerated one.
2023-02-03 16:29:27 +02:00
George Peter Banyard
64127b66c6 Concatenating two valid UTF-8 strings produces a valid UTF-8 string
The UTF-8 valid flag needs to be copied upon interning,
otherwise strings that are concatenated at compile time lose this information.

However, if previously this string was interned without the flag it is not added
E.g. in the case the string is an existing class name.

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
2023-02-02 12:02:36 +00:00
George Peter Banyard
78720e39a6 Mark numeric strings as valid UTF-8 2023-02-02 12:02:36 +00:00
Máté Kocsis
7936c8085e
Fix GH-8329 Print true/false instead of bool in error and debug messages (#8385) 2023-01-23 10:52:14 +01:00
Alex Dowad
0b7986f976 Tweak SSE2-accelerated strtoupper() and strtolower() for speed
I learned this trick for doing a faster bounds check with both upper
and lower bounds by reading a disassembler listing of optimized code
produced by GCC; instead of doing 2 compares to check the upper and the
lower bound, add an immediate value to shift the range you are testing
for to the far low or high end of the range of possible values for the
type in question, and then a single compare will do. Intstead of
compare + compare + AND, you just do ADD + compare.

From microbenchmarking on my development PC, this makes strtoupper()
about 10% faster on long strings (~10,000 bytes).
2023-01-20 08:21:45 +02:00
Christoph M. Becker
c8955c078a
Revert GH-10220
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit ecc880f491.
This reverts commit 588a07f737.
This reverts commit f377e15751.
This reverts commit b4ba16fe18.
This reverts commit 694ec1deea.
This reverts commit 6b34de8eba.
This reverts commit aa1cd02a43.
This reverts commit 308fd311ea.
This reverts commit 16203b53e1.
This reverts commit 738fb5ca54.
This reverts commit 9fdbefacd3.
This reverts commit cd4a7c1d90.
This reverts commit 928685eba2.
This reverts commit 01e5ffc85c.
2023-01-16 12:27:33 +01:00
Max Kellermann
694ec1deea Zend/zend_{operators,variables}: include cleanup 2023-01-10 14:19:03 +00:00
Max Kellermann
a8eb399ca3 Zend/zend_operators: make several pointers const 2023-01-04 12:59:16 +00:00
zeriyoshi
30ed8fb32d Merge remote-tracking branch 'upstream/PHP-8.1' 2022-08-05 00:08:36 +09:00
zeriyoshi
2d777466c0 Merge remote-tracking branch 'upstream/PHP-8.0' into PHP-8.1 2022-08-05 00:06:04 +09:00
Go Kudo
3725717de1
Remove ZEND_DVAL_TO_LVAL_CAST_OK (#9215)
* Remove ZEND_DVAL_TO_LVAL_CAST_OK
As far as I can see, this operation should always use the _slow method, and the results seem to be wrong when ZEND_DVAL_TO_LVAL_CAST_OK is enabled.

* update NEWS
2022-08-04 23:56:19 +09:00
Arnaud Le Blanc
efc8f0ebf8
Deprecate zend_atol() / add zend_ini_parse_quantity() (#7951)
Add zend_ini_parse_quantity() and deprecate zend_atol(), zend_atoi()

zend_atol() and zend_atoi() don't just do number parsing.
They also check for a 'K', 'M', or 'G' at the end of the string,
and multiply the parsed value out accordingly.

Unfortunately, they ignore any other non-numerics between the
numeric component and the last character in the string.
This means that numbers such as the following are both valid
and non-intuitive in their final output.

* "123KMG" is interpreted as "123G" -> 132070244352
* "123G " is interpreted as "123 " -> 123
* "123GB" is interpreted as "123B" -> 123
* "123 I like tacos." is also interpreted as "123." -> 123

Currently, in php-src these functions are used only for parsing ini values.

In this change we deprecate zend_atol(), zend_atoi(), and introduce a new
function with the same behavior, but with the ability to report invalid inputs
to the caller. The function's name also makes the behavior less unexpected:
zend_ini_parse_quantity().

Co-authored-by: Sara Golemon <pollita@php.net>
2022-06-17 14:12:53 +02:00
Max Kellermann
c1a06704da
Add ZEND_THREEWAY_COMPARE() macro to fix casting underflowed unsigned to signed (#8220)
Casting a huge unsigned value to signed is implementation-defined
behavior in C.  By introducing the ZEND_THREEWAY_COMPARE() macro, we
can sidestep this integer overflow/underflow/casting problem.
2022-06-08 13:24:18 +01:00
Dmitry Stogov
e7c2e11ca0 Merge branch 'PHP-8.1'
* PHP-8.1:
  Fix typo (wrong string length)
2022-01-28 11:08:44 +03:00
Dmitry Stogov
e700864055 Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix typo (wrong string length)
2022-01-28 11:06:04 +03:00
Dmitry Stogov
464e725bb5 Fix typo (wrong string length)
Fixes oss-fuzz #44110
2022-01-28 11:04:51 +03:00
Tim Starling
8eee0d6130
Make strtolower() and strtoupper() do ASCII case conversion (#7506)
Implement RFC https://wiki.php.net/rfc/strtolower-ascii
2021-12-15 08:38:35 -05:00
Nikita Popov
e32642c541 Merge branch 'PHP-8.1'
* PHP-8.1:
  Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
2021-12-05 21:04:10 +01:00
Nikita Popov
26e424465c Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
Unfortunately, libedit is locale based and does not accept UTF-8
input when the C locale is used. This patch switches the default
locale to C.UTF-8 instead (if it is available). This makes libedit
work and I believe it shouldn't affect behavior of single-byte
locale-dependent functions that PHP otherwise uses.

Closes GH-7635.
2021-12-05 21:03:27 +01:00
Nikita Popov
ce62a98534 Merge branch 'PHP-8.1'
* PHP-8.1:
  Remove unnecessary assertion
2021-11-04 17:01:04 +01:00
Nikita Popov
e291dcd836 Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Remove unnecessary assertion
2021-11-04 17:00:58 +01:00
Nikita Popov
7e67366a9b Remove unnecessary assertion
zend_class_implements_interface works fine if the "class" is an
interface, so simply drop this assertion. This avoids the need to
special case this situation.
2021-11-04 17:00:17 +01:00
Tim Starling
da0c70508e
Add upper case functions to zend_operators.c and use them (#7521)
Add a family of upper case conversion functions to zend_operators.c,
by analogy with the lower case functions.

Move the single-character conversion macros to the header so that they
can be used as a locale-independent replacement for tolower() and
toupper().

Factor out the ugly bits of the SSE2 case conversion so that the four
functions that use it are easy to read and processor-independent.

Use the new ASCII upper case functions in ext/xml, ext/pdo_dblib and as
an optimization for strtoupper() when the locale is "C".
2021-09-29 09:37:40 +02:00
Nikita Popov
498674058c Remove zend_binary_zval_strcasecmp() APIs
These are thin wrappers ... around the wrong functions. They call
the "_l()" version of the underlying APIs. For clarify, just call
the wrapped API directly.
2021-09-24 09:38:08 +02:00
Nikita Popov
604848188b Add additional double to string APIs
zend_double_to_str() converts a double to string in the way that
(string) would (using %.*H using precision).

smart_str_append_double() provides some more fine control over
the precision, and whether a zero fraction should be appeneded
for whole numbers.

A caveat here is that raw calls to zend_gcvt and going through
s*printf has slightly different behavior for the degenarate
precision=0 case. zend_gcvt will add a dummy E+0 in that case,
while s*printf convert this to precision=1 and will not. I'm
going with the s*printf behavior here, which is more common,
but does result in a minor change to the precision.phpt test.
2021-08-02 16:14:53 +02:00
Christoph M. Becker
9f18bff6b4
Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #74960: Heap buffer overflow via str_repeat
2021-07-21 15:36:16 +02:00
Christoph M. Becker
f03e7c845e
Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #74960: Heap buffer overflow via str_repeat
2021-07-21 15:33:17 +02:00
Christoph M. Becker
760ff841a1
Fix #74960: Heap buffer overflow via str_repeat
Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow, so callers may need to
check that explicitly.  To make that easy in a portable way, we
introduce `ZSTR_MAX_LEN`.

Closes GH-7294.
2021-07-21 15:31:37 +02:00
Nikita Popov
a733b1ada7 Restore zend_atoi()
I dropped this in preparation for changes that I didn't end up
doing. Restore the function for now to avoid unnecessary churn for
extensions.
2021-07-16 14:46:56 +02:00
Nikita Popov
26e8a3ba29 Use unsigned arithmetic in zend_atol
To avoid UB on overflow. I'm not really sure what the correct
overflow behavior here would be.
2021-07-13 11:01:42 +02:00
Nikita Popov
1cba7764b4
Remove zend_atoi() (#7232)
It's the same as (int) zend_atol() -- it doesn't try to do anything
integer size specific. Canonicalize to one function in preparation
for renaming zend_atol() to something less misleading.

FFI test is adjusted to use a zend_test function. It just calls
zend_atol() internally, but could really be anything.

Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>
2021-07-13 09:22:31 +02:00
Nikita Popov
ce3846cd87 Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix use after free on compound division by zero
2021-07-07 09:38:57 +02:00
Nikita Popov
62ecf54f35 Fix use after free on compound division by zero
We can't destroy the result operand early, because the division
might fail, in which case we need to preserve the original value.
Place the division result in a temporary zval, and only copy it
on success.

Fixes oss-fuzz #35876.
2021-07-07 09:38:30 +02:00
Nikita Popov
d3deb8253d Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix leak on div by zero compound assignment with coercion
2021-07-01 14:50:45 +02:00
Nikita Popov
540fed1b36 Fix leak on div by zero compound assignment with coercion
The result == op1 check did not work properly here, because op1
was &op1_copy at this point. Move the division by zero reporting
out of the _base function, so it can check the original op1.
2021-07-01 14:50:18 +02:00
Nikita Popov
65bbd92dca Initialize retval on bitwise_not exception 2021-07-01 13:21:41 +02:00
Patrick Allaert
aff365871a Fixed some spaces used instead of tabs 2021-06-29 11:30:26 +02:00
Ayesh Karunaratne
b8e380ab09 Update deprecation message for incompatible float to int conversion
Updates the deprecation message for implicit incompatible float to int conversion from:

```
Implicit conversion from non-compatible float %.*H to int in %s on line %d
```

to

```
Implicit conversion from float %.*H to int loses precision in %s on line %d
```

Related: #6661
2021-06-07 14:36:11 +02:00
George Peter Banyard
b6958bb847
Implement "Deprecate implicit non-integer-compatible float to int conversions" RFC. (#6661)
RFC: https://wiki.php.net/rfc/implicit-float-int-deprecate

Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>
2021-05-31 15:48:45 +01:00
George Peter Banyard
aca6aefd85
Remove 'register' type qualifier (#6980)
The compiler should be smart enough to optimize this on its own
2021-05-14 13:38:01 +01:00