To avoid that `parse_url()` returns an erroneous host, which would be
valid for `FILTER_VALIDATE_URL`, we make sure that only userinfo which
is valid according to RFC 3986 is treated as such.
For consistency with the existing url parsing code, we use ctype
functions, although that is not necessarily correct.
Make strspn($str1, $str2, $offset, $length) behaviorally
equivalent to strspn(substr($str1, $offset, $length), $str2)
by not throwing for out of bounds offset.
There have been two reports that this change cause issues,
including bug #80285.
As noted on https://bugs.php.net/bug.php?id=80073, I don't think
having this limitation makes sense. The similar_text() function
has much worse asymptotic complexity than levenshtein() and does
not enforce such a limitation. levenshtein() does have fairly high
memory requirements, but they are a fixed factor of the string
length (and subject to memory limit).
The 32-bit tests work on both 32-bit and 64-bit. I dropped the
64-bit variants as they only test one additional case that I don't
think adds particular value.
Make the behavior of substr(), mb_substr(), iconv_substr() and
grapheme_substr() consistent when it comes to the handling of
out of bounds offsets. substr() will now always clamp out of
bounds offsets to the string boundary. Cases that previously
returned false will now return an empty string. This means that
substr() itself *always* returns a string now (like mb_substr()
already did before.)
Closes GH-6182.
Cross checking implementations from other languages, empty strings
are always allowed. PHP's output is peculiar due to it's insistence
to encode a trailing \0, but otherwise sensible and does round-trip
as expected.
Return "0000" instead of false to have a consistent return type.
"0000" is already a possible return value if the string doesn't
contain any letters, such as with soundex(" "). We can treat the
case of soundex("") exactly the same.
The implementation here was pretty confused. In reality the only
error condition it has right now is that for a string input,
from & length cannot be arrays.
The fact that the array lengths are the same was probably supposed
to be checked for the case of array input, as it wouldn't matter
otherwise.
crypt() without salt generates a weak $1$ MD5 hash. It has been
throwing a notice since 2013 and we provide a much better alternative
in password_hash() (which can auto-generate salts for strong
password hashes), so keeping this is just a liability.
Based on:
"Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
V. Gopal, E. Ozturk, et al., 2009, http://intel.ly/2ySEwL0
Signed-off-by: Frank Du <frank.du@intel.com>
Closes GH-6018
`tolower()` returns an `int`, so we must not convert to `char` which
may be `signed` and as such may be subject to overflow (actually,
implementation defined behavior).
Closes GH-6007
RFC: https://wiki.php.net/rfc/saner-numeric-strings
This removes the -1 allow_error mode from is_numeric_string functions and replaces it by
a trailing boolean out argument to preserve BC in a couple of places.
Most of the changes can be resumed to "numeric" strings which emitted a E_NOTICE now emit
a E_WARNING and "numeric" strings which emitted a E_WARNING now throw a TypeError.
This mostly affects:
- String offsets
- Arithmetic operations
- Bitwise operations
Closes GH-5762
The `php_serialize` decode function has to return `FAILURE`, if the
unserialization failed on anything but an empty string.
The `php` decode function has also to return `FAILURE`, if there is
trailing garbage in the string.