php-src

mirror of https://github.com/php/php-src.git synced 2025-08-17 22:48:57 +02:00

Author	SHA1	Message	Date
Alex Dowad	a06c20a17c	Remove useless constant MBFL_ENCTYPE_MBCS This flag indicated that an encoding was 'multi-byte'; it can use a variable number of bytes to encode each character. As it turns out, we don't actually need to check this flag anywhere, so it's better to remove it.	2021-01-15 21:55:41 +02:00
Nikita Popov	3e01f5afb1	Replace zend_bool uses with bool We're starting to see a mix between uses of zend_bool and bool. Replace all usages with the standard bool type everywhere. Of course, zend_bool is retained as an alias.	2021-01-15 12:33:06 +01:00
Alex Dowad	72660c416a	Combine MBFL_ENCTYPE_WCS{2,4}{BE,LE} constants These flags identify text encodings in mbstring which use a constant number of bytes per character. While some parts of the code do use these flags, usually to detect cases which can be optimized due to constant-width encoding, nothing cares whether the encodings are 'LE' (little-endian) or 'BE' (big-endian). So we can simplify things by combining constants.	2020-11-25 19:52:19 +02:00
Alex Dowad	e169ad3b61	Consolidate all single-byte encodings in one source file We can squeeze out a lot of duplicated code in this way.	2020-11-11 11:18:59 +02:00
Alex Dowad	3e7acf901d	Remove mbstring identify filters mbstring had an 'identify filter' for almost every supported text encoding which was used when auto-detecting the most likely encoding for a string. It would run over the string and set a 'flag' if it saw anything which did not appear likely to be the encoding in question. One problem with this scheme was that encodings which merely appeared less likely to be the correct one were completely rejected, even if there was no better candidate. Another problem was that the 'identify filters' had a huge amount of code duplication with the 'conversion filters'. Eliminate the identify filters. Instead, when auto-detecting text encoding, use conversion filters to see whether the input string is valid in candidate encodings or not. At the same time, watch the type of codepoints which the string decodes to and mark the candidate encoding as less likely if non-printable characters (ESC, form feed, bell, etc.) or 'private use area' codepoints are seen. Interestingly, one old test case in which JIS text was misidentified as UTF-8 (and this wrong behavior was enshrined in the test) was 'fixed' and the JIS string is now auto-detected as JIS.	2020-11-09 13:45:17 +02:00
Alex Dowad	be1a215538	Optimize (AND FIX) mb_check_encoding (cut execution time by 50%+) Previously, `mb_check_encoding` did an awful lot of unneeded work. In order to determine whether a string was valid or not, it would convert the whole string into wchar (code points), which required dynamically allocating a (potentially large) buffer. Then it would turn right around and convert that big ol' buffer of code points back to the original encoding again. Finally, it would check whether any invalid bytes were detected during that long and onerous process. The thing is, mbstring _already_ has machinery for detecting whether a string is valid in a certain encoding or not, and it doesn't require copying any data around or allocating buffers. Better yet, it can fail fast when an invalid byte is found. Why not use it? It's sure a lot faster! Further, the legacy code was also badly broken. Why? Because aside from checking whether illegal characters were detected, it would also check whether the conversion to and from wchars was lossless. But, some encodings have more than one valid encoding for the same character. In such cases, it is not possible to make the conversion to and from wchars lossless for every valid character. So `mb_check_encoding` would actually reject good strings in a lot of encodings!	2020-11-02 21:31:06 +02:00
Alex Dowad	7dc16374b4	Remove unused IS_SJIS1 and IS_SJIS2 macros	2020-10-14 08:31:51 +02:00
Nikita Popov	4371a4b241	Merge branch 'PHP-8.0' * PHP-8.0: Fix incorrect zpp parameter count in mb_substr() / mb_strcut()	2020-10-13 17:47:11 +02:00
Nikita Popov	9b4094c3d7	Fix incorrect zpp parameter count in mb_substr() / mb_strcut() These functions only accept 4 params.	2020-10-13 17:46:56 +02:00
Nikita Popov	40e920ebd9	Merge branch 'PHP-8.0' * PHP-8.0: Fix argument nullability in mbstring	2020-10-13 16:03:29 +02:00
Nikita Popov	124bce3c7a	Fix argument nullability in mbstring These arguments were declared nullable in stubs (and should be nullable), but didn't accept null in zpp.	2020-10-13 16:03:04 +02:00
Alex Dowad	0ffc1f55b3	Refactor mbfl_ident.c, mbfl_encoding.c, mbfl_memory_device.c, mbfl_string.c - Make everything less gratuitously verbose - Don't litter the code with lots of unneeded NULL checks (for things which will never be NULL) - Don't return success/failure code from functions which can never fail - For encoding structs, don't use pointers to pointers to pointers for the list of alias strings. Pointers to pointers (2 levels of indirection) is what actually makes sense. This gets rid of some extraneous dereference operations.	2020-10-13 06:12:38 +02:00
Máté Kocsis	e950ca13ea	Consolidate the usage of "either" and "one of" in error messages Closes GH-6173	2020-09-20 19:41:47 +02:00
Máté Kocsis	c37a1cd650	Promote a few remaining errors in ext/standard Closes GH-6110	2020-09-15 14:26:16 +02:00
Máté Kocsis	1c81a34563	Make mb_send_mail() consistent with mail() The $additional_headers parameter shouldn't accept null.	2020-09-14 11:52:33 +02:00
Máté Kocsis	c98d47696f	Consolidate new union type ZPP macro names They will now follow the canonical order of types. Older macros are left intact due to maintaining BC. Closes GH-6112	2020-09-11 11:00:18 +02:00
Nikita Popov	f33fd9b7fe	Throw ValueError on null bytes in mb_send_mail() Instead of silently replacing with spaces.	2020-09-11 10:46:59 +02:00
Alex Dowad	5b78d76ec8	mb_str_split is already documented on php.net So remove TODO comment which implies that it's not.	2020-09-08 20:09:45 +02:00
Nikita Popov	2386f655d8	Always use PCRE for mbstring.http_output_conv_mimetypes Instead of using either oniguruma or pcre depending on which is available. We always have PCRE, so use it. This ensures consistent behavior.	2020-09-08 15:02:15 +02:00
Nikita Popov	623bf96e7e	Throw on invalid mb_http_input() type	2020-09-07 09:59:51 +02:00
Nikita Popov	d57f9e5ea4	Handle null encoding in mb_http_input()	2020-09-04 17:15:35 +02:00
Alex Dowad	409aa20ab0	Refactor mbfl_convert.c	2020-09-03 15:56:29 +02:00
Máté Kocsis	3e800e997b	Move custom type checks to ZPP Closes GH-6034	2020-09-02 11:11:38 +02:00
Alex Dowad	b03fd37677	Code cleanup in mbstring.c	2020-08-31 23:19:43 +02:00
Alex Dowad	7eddcabe2b	Don't guard mbstring code with #ifdef HAVE_MBSTRING This is just a very silly feature of mbstring -- you can compile the source files with HAVE_MBSTRING undefined, and it will all just compile to (almost) nothing. What is the use of this? Why compile the source files and link against them if you don't want the mbstring extension? It doesn't make any kind of sense.	2020-08-31 23:18:13 +02:00
Alex Dowad	62317d592f	Remove redundant includes from mbstring (and make sure correct config.h is used) Very interesting... it turns out that when Valgrind support was enabled, `#include "config.h"` from within mbstring was actually including the file "config.h" from Valgrind, and not the one from mbstring!! This is because -I/usr/include/valgrind was added to the compiler invocation _before_ -Iext/mbstring/libmbfl. Make sure we actually include the file which was intended.	2020-08-31 23:17:58 +02:00
Alex Dowad	ddc76e5abf	Fix typos in comments in mb_send_mail	2020-08-31 23:17:14 +02:00
Alex Dowad	a64241b540	Remove unused functions from mbstring - mbfl_buffer_converter_reset - mbfl_buffer_converter_strncat - mbfl_buffer_converter_getbuffer - mbfl_oddlen - mbfl_filter_output_pipe_flush - mbfl_memory_device_output2 - mbfl_memory_device_output4 - mbfl_is_support_encoding - mbfl_buffer_converter_feed2 - _php_mb_regex_globals_dtor - mime_header_encoder_feed - mime_header_decoder_feed - mbfl_convert_filter_feed	2020-08-31 23:16:57 +02:00
Alex Dowad	8d13348bb5	Separate implementation of mb_{en,de}code_numericentity Rather than using a magic boolean parameter to choose different behavior of the subfunction, inline it. The code size doesn't really grow anyways. And soon these will be trimmed down more.	2020-08-31 23:16:28 +02:00
Alex Dowad	29b02bf290	Use new-style argument parsing macros in mbstring.c	2020-08-31 23:16:21 +02:00
Alex Dowad	d4ef7ef11d	Inline unneeded indirection for mbstring memory management All memory allocation and deallocation for mbstring bounces through a table of function pointers before going to emalloc/efree/etc. But this is unnecessary. The allocators are never swapped out. Better to just call them directly.	2020-08-31 23:16:09 +02:00
George Peter Banyard	fa8d9b1183	Improve type declarations for Zend APIs Voidification of Zend API which always succeeded Use bool argument types instead of int for boolean arguments Use bool return type for functions which return true/false (1/0) Use zend_result return type for functions which return SUCCESS/FAILURE as they don't follow normal boolean semantics Closes GH-6002	2020-08-28 15:41:27 +02:00
Máté Kocsis	ac0da090ae	Fix UNKNOWN default values in ext/mbstring and ext/gd Closes GH-5598	2020-07-28 17:06:25 +02:00
Máté Kocsis	d30cd7d7e7	Review the usage of apostrophes in error messages Closes GH-5590	2020-07-10 21:05:28 +02:00
Max Semenik	2b5de6f839	Remove proto comments from C files Closes GH-5758	2020-07-06 21:13:34 +02:00
Nikita Popov	88021ffe0e	Fix count_commas implementation Ooops, I did not account for the changing length here.	2020-06-12 11:04:35 +02:00
Nikita Popov	f691693ebc	Fix null pointer ub in encoding parsing And do a bit of drive-by cleanup by extracting count_commas and reducing some variable scopes.	2020-06-12 10:08:34 +02:00
Christoph M. Becker	5a04796f76	Fix MSVC level 1 (severe) warnings We fix (hopefully) all instances of: * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4005> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4024> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4028> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4047> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4087> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4090> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4273> * <https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-1-c4312> `zend_llist_add_element()` and `zend_llist_prepend_element()` now explicitly expect a const pointer. We use the macro `ZEND_VOIDP()` instead of a `(void*)` cast to suppress C4090; this should prevent accidental removal of the cast by clarifying the intention, and makes it easier to remove the casts if the issue[1] is ever resolved. [1] <https://developercommunity.visualstudio.com/content/problem/390711/c-compiler-incorrect-propagation-of-const-qualifie.html>	2020-06-05 11:17:05 +02:00
George Peter Banyard	68164f40ce	Fix [-Wundef] warning in MBString extension	2020-05-16 15:31:20 +02:00
George Peter Banyard	7dd332f110	Refactor mb_substitute_character() Using the new Fast ZPP API for string\|int\|null This also fixes Bug #79448 which was too disruptive to fix in PHP 7.x	2020-05-11 17:30:01 +02:00
Nikita Popov	481b7421f3	Throw warning if invalid internal_encoding ini is specified	2020-05-07 14:44:13 +02:00
Nikita Popov	217f6013b3	Remove no_language from mbfl_string This is not actually used for anything and just causes confusion.	2020-05-07 11:36:57 +02:00
Nikita Popov	226d9dd30a	Only allow "pass" as input/output encoding "pass" is not a real encoding, it just means "don't perform any conversion". Using it as an internal encoding or passing it to any of the mbstring functions will not work (and on master commonly assert).	2020-05-07 11:19:14 +02:00
Nikita Popov	5bfa9598f4	Return false from failed mb_convert_variables() If we fail to detect the encoding return false, just like mb_convert_encoding() does, and as the implementation here clearly intended. Previously the "pass" pseudo-encoding was returned.	2020-05-07 10:16:46 +02:00
Nikita Popov	71f48260af	Fix assertion failure when failing to detect encoding Looks like prior to 7.3 this just passed the original string through. Since 7.3 it returns false. Let's stick with that behavior.	2020-05-06 22:56:01 +02:00
Nikita Popov	7d4ff8443e	Remove persistent allocators from libmbfl These functions are not used, and I don't think we have any plans to ever use them.	2020-05-04 23:19:07 +02:00
Máté Kocsis	6111d64cda	Improve a last couple of argument error messages Closes GH-5404	2020-04-20 13:09:00 +02:00
Máté Kocsis	1f48feebb9	Improve some TypeError and ValueError messages Closes GH-5377	2020-04-14 14:38:45 +02:00
George Peter Banyard	01762e56ed	Adapt assertion as mbfl_strwidth returns a size_t	2020-04-12 19:34:05 +02:00
George Peter Banyard	12ec7a2730	Convert if blocks to assertions and adapt stubs accordingly	2020-04-09 13:50:37 +02:00
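
The `mb_check_encoding` commit above (be1a215538) contrasts a convert-and-convert-back check with one that scans the input once, allocates nothing, and stops at the first invalid byte. A minimal sketch of that fail-fast idea, for UTF-8 only, might look like the following. The function name `sketch_utf8_is_valid` is hypothetical and this is not mbstring's actual code, which goes through its own per-encoding filter machinery; the byte ranges follow the well-formed UTF-8 table from the Unicode Standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of mbstring): returns true if the `len`
 * bytes at `s` are well-formed UTF-8. It reads the input in place, never
 * allocates, and returns as soon as the first invalid byte is seen. */
bool sketch_utf8_is_valid(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;                         /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF;  /* range allowed for the first one */

        if (c < 0x80) {                      /* ASCII: single byte */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            need = 1;
        } else if (c == 0xE0) {
            need = 2; lo = 0xA0;             /* reject overlong 3-byte forms */
        } else if (c >= 0xE1 && c <= 0xEC) {
            need = 2;
        } else if (c == 0xED) {
            need = 2; hi = 0x9F;             /* reject UTF-16 surrogates */
        } else if (c == 0xEE || c == 0xEF) {
            need = 2;
        } else if (c == 0xF0) {
            need = 3; lo = 0x90;             /* reject overlong 4-byte forms */
        } else if (c >= 0xF1 && c <= 0xF3) {
            need = 3;
        } else if (c == 0xF4) {
            need = 3; hi = 0x8F;             /* reject values above U+10FFFF */
        } else {
            return false;                    /* 0x80-0xC1 and 0xF5-0xFF never start a sequence */
        }

        if (len - i - 1 < need) {
            return false;                    /* sequence truncated at end of input */
        }
        if (s[i + 1] < lo || s[i + 1] > hi) {
            return false;
        }
        for (size_t k = 2; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) {
                return false;                /* not a continuation byte */
            }
        }
        i += need + 1;
    }
    return true;
}
```

A caller would pass the raw bytes and length, for example `sketch_utf8_is_valid((const unsigned char *)"\xC3\xA9", 2)`. Because the check never decodes and re-encodes, it also cannot reject a valid string merely because a round trip through code points would not reproduce the same bytes, which is the second problem the commit message describes.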

765 commits