Mirror of https://github.com/php/php-src.git, synced 2025-08-20 17:34:35 +02:00
This broke one old test (Zend/tests/multibyte_encoding_003.phpt), which uses a PHP script encoded as UTF-16. To terminate the test script, we need the text "\n--EXPECT--". Of that text, the newline (a single 0x0A byte) becomes part of the resulting test script, but a bare 0x0A byte with no accompanying 0x00 is not valid UTF-16. Since we now treat truncated UTF-16 characters as erroneous, an extra '?' is appended to the output as an 'illegal character' marker.

Really, if we are running PHP scripts which are treated as encoded in UTF-16 or some other arbitrary (non-ASCII) text encoding, and the script is not actually a valid string in that encoding, inserting '?' characters into the code which the PHP interpreter runs is a bad thing to do. In such cases, the script shouldn't be treated as UTF-16 (or whatever) at all. I wonder if mbstring's encoding detection is being used in 'non-strict' mode?
- libmbfl
- tests
- ucgendat
- config.m4
- config.w32
- CREDITS
- mb_gpc.c
- mb_gpc.h
- mbstring.c
- mbstring.h
- mbstring.stub.php
- mbstring_arginfo.h
- php_mbregex.c
- php_mbregex.h
- php_onig_compat.h
- php_unicode.c
- php_unicode.h
- unicode_data.h
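
The truncated-character behaviour described in the commit message can be illustrated with a minimal sketch. This is not libmbfl's actual decoder: the names `utf16be_to_utf8` and `put_utf8` are hypothetical, big-endian byte order is assumed, and UTF-8 is used as the output encoding purely for illustration. It shows how a lone trailing byte, such as the 0x0A that ends the test script, produces a '?' marker instead of being dropped silently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of mbstring): write one code point as UTF-8. */
static size_t put_utf8(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Hypothetical sketch: convert UTF-16BE bytes to a NUL-terminated UTF-8 string.
 * Unpaired surrogates and a truncated final code unit each become '?'.
 * `out` must have room for at least 2 * in_len + 1 bytes.
 * Returns the number of bytes written, not counting the terminating NUL.
 */
size_t utf16be_to_utf8(const unsigned char *in, size_t in_len, char *out)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    while (i < in_len) {
        if (in_len - i < 2) {
            /* Only one byte left: a truncated UTF-16 code unit. */
            out[o++] = '?';
            break;
        }

        uint32_t unit = ((uint32_t)in[i] << 8) | in[i + 1];
        i += 2;

        if (unit >= 0xD800 && unit <= 0xDBFF) {
            /* High surrogate: valid only if a low surrogate follows. */
            if (in_len - i >= 2) {
                uint32_t low = ((uint32_t)in[i] << 8) | in[i + 1];
                if (low >= 0xDC00 && low <= 0xDFFF) {
                    i += 2;
                    o += put_utf8(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00), out + o);
                    continue;
                }
            }
            out[o++] = '?';
            continue;
        }

        if (unit >= 0xDC00 && unit <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            out[o++] = '?';
            continue;
        }

        o += put_utf8(unit, out + o);
    }

    out[o] = '\0';
    return o;
}
```

Calling `utf16be_to_utf8` on the five bytes `00 68 00 69 0A` ("hi" in UTF-16BE followed by a bare newline byte) writes `hi?`: the trailing 0x0A cannot form a complete code unit, so it is reported with the error marker, which is the effect the commit message describes for the test script.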