The reference implementation of the "Drawing Random Floating-Point Numbers from
an Interval" paper contains two mistakes that will result in erroneous values
being returned under certain circumstances:
- For large values of `g` the multiplication of `k * g` might overflow to
infinity.
- The value of `ceilint()` might exceed 2^53, possibly leading to a rounding
error when promoting `k` to double within the multiplication of `k * g`.
This commit updates the implementation based on Prof. Goualard suggestions
after reaching out to him. It will correctly handle inputs larger than 2^-1020
in absolute values. This limitation will be documented and those inputs
possibly be rejected in a follow-up commit depending on performance concerns.
* random: Add `max_offset` local to Randomizer::getBytesFromString()
* random: Use branchless implementation for mask generation in Randomizer::getBytesFromString()
This was benchmarked against clzl with a standalone script with random inputs
and is slightly faster. clzl requires an additional branch to handle the
source_length = 1 / max_offset = 0 case.
* Improve comment for masking in Randomizer::getBytesFromString()
With a single byte we can choose offsets between 0x00 and 0xff, thus 0x100
different offsets. We only need to use the slow path for sources of more than
0x100 bytes.
The previous version was correct with regard to the output expectations, it was
just slower than necessary. Better fix this now while we still can before being
bound by our BC guarantees with regard to emitted sequences.
This also adds a test to verify the behavior: For powers of two we never reject
any values during rejection sampling, we just need to mask off the unneeded
bits. Thus we can specifically verify that the number of calls to the engine
match the expected amount. We also verify that all the possible values are
emitted to make sure the masking does not remove any required bits. For inputs
longer than 0x100 bytes we need trust the `range()` implementation to be
unbiased, but still verify the number of engine calls and perform a basic
output check.
* random: Randomizer::getFloat(): Fix check for empty open intervals
The check for invalid parameters for the IntervalBoundary::OpenOpen variant was
not correct: If two consecutive doubles are passed as parameters, the resulting
interval is empty, resulting in an uint64 underflow in the γ-section
implementation.
Instead of checking whether `$min < $max`, we must check that there is at least
one more double between `$min` and `$max`, i.e. it must hold that:
nextafter($min, $max) != $max
Instead of duplicating the comparatively complicated and expensive `nextafter`
logic for a rare error case we instead return `NAN` from the γ-section
implementation when the parameters result in an empty interval and thus underflow.
This allows us to reliably detect this specific error case *after* the fact,
but without modifying the engine state. It also provides reliable error
reporting for other internal functions that might use the γ-section
implementation.
* random: γ-section: Also check that that min is smaller than max
This extends the empty-interval check in the γ-section implementation with a
check that min is actually the smaller of the two parameters.
* random: Use PHP_FLOAT_EPSILON in getFloat_error.phpt
Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>
* random: Add Randomizer::nextFloat()
* random: Check that doubles are IEEE-754 in Randomizer::nextFloat()
* random: Add Randomizer::nextFloat() tests
* random: Add Randomizer::getFloat() implementing the y-section algorithm
The algorithm is published in:
Drawing Random Floating-Point Numbers from an Interval. Frédéric
Goualard, ACM Trans. Model. Comput. Simul., 32:3, 2022.
https://doi.org/10.1145/3503512
* random: Implement getFloat_gamma() optimization
see https://github.com/php/php-src/pull/9679/files#r994668327
* random: Add Random\IntervalBoundary
* random: Split the implementation of γ-section into its own file
* random: Add tests for Randomizer::getFloat()
* random: Fix γ-section for 32-bit systems
* random: Replace check for __STDC_IEC_559__ by compile-time check for DBL_MANT_DIG
* random: Drop nextFloat_spacing.phpt
* random: Optimize Randomizer::getFloat() implementation
* random: Reject non-finite parameters in Randomizer::getFloat()
* random: Add NEWS/UPGRADING for Randomizer’s float functionality
* Unify structure for ext/random's engine tests (2)
This makes adjustments that were missed in
2d6a883b3a.
* Add `engines.inc` for ext/random tests
* Unify structure for ext/random's randomizer tests
* Emit deprecation warnings when adding dynamic properties to classes during unserialization - this will become an Error in php 9.0.
(Adding dynamic properties in other contexts was already a deprecation warning - the use case of unserialization was overlooked)
* Throw an error when attempting to add a dynamic property to a `readonly` class when unserializing
* Add new serialization methods `__serialize`/`__unserialize` for SplFixedArray to avoid creating deprecated dynamic
properties that would then be added to the backing fixed-size array
* Don't add named dynamic/declared properties (e.g. $obj->foo) of SplFixedArray to the backing array when unserializing
* Update tests to declare properties or to expect the deprecation warning
* Add news entry
Co-authored-by: Tyson Andre <tysonandre775@hotmail.com>
* Fix rand_range32() for umax = UINT32_MAX
This was introduced in the commit that added the random extension:
4d8dd8d258.
Resolves GH-9415
* [ci skip] Rename `$r` to `$randomizer` in gh9415.phpt
* Make gh9415.phpt deterministic
* Make gh9415.phpt compatible with 32-bit
* Add Random\Random{Error,Exception} and Random\BrokenRandomEngineError
* Throw BrokenRandomEngineError
* Throw RandomException on seeding failure
* Throw RandomException when CSPRNG fails
* Remove unused include from ext/random/engine_combinedlcg.c
* Remove unused include from ext/random/engine_secure.c
* Remove unused include from ext/random/random.c
* [ci skip] Add ext/random Exception hierarchy to NEWS
* [ci skip] Add the change of Exception for random_(int|bytes) to UPGRADING
* Verify that the engine doesn't change in construct_twice.phpt
* Clean up the implementation of Randomizer::__construct()
Instead of manually checking whether the constructor was already called, we
rely on the `readonly` modifier of the `$engine` property.
Additionally use `object_init_ex()` instead of manually calling
`->create_object()`.
* Remove exception in Randomizer::shuffleBytes()
The only way that `php_binary_string_shuffle` fails is when the engine itself
fails. With the currently available list of engines we have:
- Mt19937 : Infallible.
- PcgOneseq128XslRr64: Infallible.
- Xoshiro256StarStar : Infallible.
- Secure : Practically infallible on modern systems.
Exception messages were cleaned up in GH-9169.
- User : Error when returning an empty string.
Error when seriously biased (range() fails).
And whatever Throwable the userland developer decides to use.
So the existing engines are either infallible or throw an Exception/Error with
a high quality message themselves, making this exception not a value-add and
possibly confusing.
* Remove exception in Randomizer::shuffleArray()
Same reasoning as in the previous commit applies.
* Remove exception in Randomizer::getInt()
Same reasoning as in the previous commit applies.
* Remove exception in Randomizer::nextInt()
Same reasoning as in the previous commit applies, except that it won't throw on
a seriously biased user engine, as `range()` is not used.
* Remove exception in Randomizer::getBytes()
Same reasoning as in the previous commit applies.
* Remove exception in Mt19937::generate()
This implementation is shared across all native engines. Thus the same
reasoning as the previous commits applies, except that the User engine does not
use this method. Thus is only applicable to the Secure engine, which is the
only fallible native engine.
* [ci skip] Add cleanup of Randomizer exceptions to NEWS
Since argument overloading is not safe for reflection, the method needed
to be split appropriately.
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>
Closes GH-9057.
* Use `php_random_bytes_throw()` in Secure engine's generate()
This exposes the underlying exception, improving debugging:
Fatal error: Uncaught Exception: Cannot open source device in php-src/test.php:5
Stack trace:
#0 php-src/test.php(5): Random\Engine\Secure->generate()
#1 {main}
Next RuntimeException: Random number generation failed in php-src/test.php:5
Stack trace:
#0 php-src/test.php(5): Random\Engine\Secure->generate()
#1 {main}
thrown in php-src/test.php on line 5
* Use `php_random_int_throw()` in Secure engine's range()
This exposes the underlying exception, improving debugging:
Exception: Cannot open source device in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getInt(1, 3)
#1 {main}
Next RuntimeException: Random number generation failed in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getInt(1, 3)
#1 {main}
* Throw exception when a user engine returns an empty string
This improves debugging, because the actual reason for the failure is available
as a previous Exception:
DomainException: The returned string must not be empty in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getBytes(123)
#1 {main}
Next RuntimeException: Random number generation failed in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getBytes(123)
#1 {main}
* Throw exception when the range selector fails to get acceptable numbers in 50 attempts
This improves debugging, because the actual reason for the failure is available
as a previous Exception:
RuntimeException: Failed to generate an acceptable random number in 50 attempts in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getInt(1, 3)
#1 {main}
Next RuntimeException: Random number generation failed in php-src/test.php:17
Stack trace:
#0 php-src/test.php(17): Random\Randomizer->getInt(1, 3)
#1 {main}
* Improve user_unsafe test
Select parameters for ->getInt() that will actually lead to unsafe behavior.
* Fix user_unsafe test
If an engine fails once it will be permanently poisoned by setting
`->last_unsafe`. This is undesirable for the test, because it skews the
results.
Fix this by creating a fresh engine for each "assertion".
* Remove duplication in user_unsafe.phpt
* Catch `Throwable` in user_unsafe.phpt
As we print the full stringified exception we implicitly assert the type of the
exception. No need to be overly specific in the catch block.
* Throw an error if an engine returns an empty string
* Throw an Error if range fails to find an acceptable number in 50 attempts
When Radomizer::__construct() was called with no arguments, Randomizer\Engine\Secure was implicitly instantiate and memory was leaking.
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>
The previous shifting logic is problematic for two reasons:
1. It invokes undefined behavior when the `->last_generated_size` is at least
as large as the target integer in `result`, because the shift is larger than
the target integer. This was reported in GH-9083.
2. It expands the returned bytes in a big-endian fashion: Earlier bytes are
shifting into the most-significant position. As all the other logic in the
random extension treats byte-strings as little-endian numbers this is
inconsistent.
By fixing the second issue, we can implicitly fix the first one: Instead of
shifting the existing bits by the number of "newly added" bits, we shift the
newly added bits by the number of existing bits. As we stop requesting new bits
once the total_size reached the size of the target integer we can be sure to
never invoke undefined behavior during shifting.
The get_int_user.phpt test was adjusted to verify the little-endian behavior.
It generates a single byte per call and we expect the first byte generated to
appear at the start of the resulting number.
see GH-9056 for a previous fix in the same area.
Fixes GH-9083 which reports the undefined behavior.
Resolves GH-9085 which was an alternative attempt to fix GH-9083.
* Fix shift in rand_range??()
The last generated size is in bytes, whereas the shift is in bits. Multiple the
generated size by 8 to correctly handle each byte once.
* Correctly handle user engines returning less than 4 bytes in rand_rangeXX()
We need to loop until we accumulate sufficient bytes, instead of just checking
once. The version in the rejection loop was already correct.
* Clean up some repetition in rand_rangeXX()
This fixes:
==374077== Use of uninitialised value of size 8
==374077== at 0x532B06: generate (engine_user.c:39)
==374077== by 0x533F71: zim_Random_Randomizer_getBytes (randomizer.c:152)
==374077== by 0x7F581D: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1885)
==374077== by 0x8725BE: execute_ex (zend_vm_execute.h:55930)
==374077== by 0x877DB4: zend_execute (zend_vm_execute.h:60253)
==374077== by 0x7B0FD4: zend_execute_scripts (zend.c:1770)
==374077== by 0x6F1647: php_execute_script (main.c:2535)
==374077== by 0x937DA4: do_cli (php_cli.c:964)
==374077== by 0x938C3A: main (php_cli.c:1333)
==374077==
==374077== Invalid read of size 8
==374077== at 0x532B06: generate (engine_user.c:39)
==374077== by 0x533F71: zim_Random_Randomizer_getBytes (randomizer.c:152)
==374077== by 0x7F581D: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1885)
==374077== by 0x8725BE: execute_ex (zend_vm_execute.h:55930)
==374077== by 0x877DB4: zend_execute (zend_vm_execute.h:60253)
==374077== by 0x7B0FD4: zend_execute_scripts (zend.c:1770)
==374077== by 0x6F1647: php_execute_script (main.c:2535)
==374077== by 0x937DA4: do_cli (php_cli.c:964)
==374077== by 0x938C3A: main (php_cli.c:1333)
==374077== Address 0x11 is not stack'd, malloc'd or (recently) free'd