* Added function for squaring to improve performance of power calculation
* Aligned backslashes
* Removed unnecessary comments
* Extracted common part of multiplication and square functions
* Added comment to bc_fast_square
* Improved wording of bc_mul_finish_from_vector
* Reused new function name
* Replaced macro with function
Added BcMath\Number class. It is an immutable object, has methods that are
equivalent to existing BCMath calculation functions, and can also be calculated
using operators.
The existing BCMath function returned a string for each calculation, but this
class returns an object.
RFC: https://wiki.php.net/rfc/support_object_type_in_bcmath,
https://wiki.php.net/rfc/fix_up_bcmath_number_class
---------
Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Fixed the incorrect scale that should be used when dividing by 1, that is,
comparing the divisor and 1 to confirm equality.
Additionally, have increased the number of test cases for bcdiv_by_pow_10.phpt.
In the original specification, the scale of bc_num was directly changed
and compared.
This becomes a problem when objects are supported, so we will modify it
to compare without changing bc_num.
The original calculation method for prod_arr_size allowed for some error,
which could have increased the number of simple loops without byte tricks
at the end of the calculation when converting to bc_num.
The new method calculates the size accurately, so the number of loops does
not increase unnecessarily.
Multiplication is performed after converting to uint32_t/uint64_t, making calculations faster.
---------
Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Co-authored-by: Gina Peter Banyard <girgias@php.net>
The code for _bc_do_add and _bc_do_sub were written slightly differently for
similar processing (and add was slower than sub), so I changed the code to one
similar to sub.
Also, _bc_do_add has been changed to use SIMD to perform faster calculations
when possible.
Changed to count trailing zeros using SIMD when converting a string to
a bc_num structure if possible.
Removed unnecessary pointer resetting.
Added UNEXPECTED to some branches.
This simplifies the code, and also might indirectly improve performance
due to a decrease in instruction cache pressure. Although the latter is
probably negligible.
This works because 0x30 has no overlapping bits with [0, 9].
Also avoid some memsets where we do call bc_new_num.
After:
```
1.2066178321838
1.5389559268951
1.6050860881805
```
Before:
```
1.3858470916748
1.6806011199951
1.9091980457306
```
Using SIMD to accelerate the validation.
Using the benchmark from #14076.
After:
```
1.3504369258881
1.6206321716309
1.6845638751984
```
Before:
```
1.4750170707703
1.9039781093597
1.9632289409637
```