mirror of https://github.com/php/php-src.git, synced 2025-08-15 21:48:51 +02:00
In a GitHub thread, Michael Voříšek and Kamil Tekiela mentioned that the PCRE2 function `pcre_match` can be used to validate UTF-8, and that historically it was more efficient than mbstring's `mb_check_encoding`. `mb_check_encoding` is now much faster on hosts with SSE2, and much faster again on hosts with AVX2. However, while all x86-64 CPUs support at least SSE2, not all PHP users run their code on x86-64 hardware. For example, some use recent Macs with ARM CPUs. Therefore, borrow PCRE2's UTF-8 validation function as a fallback for hosts with no SSE2/AVX2 support. On long UTF-8 strings, this code is 50% faster than mbstring's existing fallback code. |
||
---|---|---|
.. | ||
libmbfl | ||
tests | ||
ucgendat | ||
common_codepoints.txt | ||
config.m4 | ||
config.w32 | ||
CREDITS | ||
gen_rare_cp_bitvec.php | ||
mb_gpc.c | ||
mb_gpc.h | ||
mbstring.c | ||
mbstring.h | ||
mbstring.stub.php | ||
mbstring_arginfo.h | ||
php_mbregex.c | ||
php_mbregex.h | ||
php_onig_compat.h | ||
php_unicode.c | ||
php_unicode.h | ||
rare_cp_bitvec.h | ||
unicode_data.h |
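
The commit message above refers to a scalar (non-SIMD) fallback for UTF-8 validation. As a rough illustration of what such a fallback has to check, here is a minimal byte-by-byte validator in C. It is a sketch only: it is not the code in `mbstring.c`, nor PCRE2's validation routine, and the function name `utf8_is_valid` is hypothetical. It assumes the RFC 3629 rules: no overlong forms, no surrogates, nothing above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not taken from php-src or PCRE2.
 * Returns true if the `len` bytes at `s` form well-formed UTF-8. */
bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;   /* continuation bytes expected after the lead byte */
        uint32_t cp;   /* code point being decoded */
        uint32_t min;  /* smallest code point allowed for this length */

        if (c < 0x80) {            /* ASCII: single byte */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            /* stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF */
            return false;
        }

        /* i < len holds here, so len - i - 1 cannot underflow */
        if (len - i - 1 < need) {
            return false;          /* sequence truncated at end of input */
        }

        for (size_t k = 1; k <= need; k++) {
            unsigned char t = s[i + k];
            if ((t & 0xC0) != 0x80) {
                return false;      /* not a 10xxxxxx continuation byte */
            }
            cp = (cp << 6) | (t & 0x3F);
        }

        if (cp < min) {
            return false;          /* overlong encoding */
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return false;          /* UTF-16 surrogate, not valid in UTF-8 */
        }
        if (cp > 0x10FFFF) {
            return false;          /* beyond the Unicode range */
        }

        i += need + 1;
    }
    return true;
}
```

A call such as `utf8_is_valid((const unsigned char *)str, strlen(str))` would reject truncated, overlong, and surrogate sequences. A loop like this handles one sequence at a time, which is why SIMD paths (SSE2, AVX2) can be much faster on long strings and why the choice of scalar fallback matters on hosts without them.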