Commit graph

107 commits

Author SHA1 Message Date
Alex Dowad
e169ad3b61 Consolidate all single-byte encodings in one source file
We can squeeze out a lot of duplicated code in this way.
2020-11-11 11:18:59 +02:00
Alex Dowad
3e7acf901d Remove mbstring identify filters
mbstring had an 'identify filter' for almost every supported text encoding
which was used when auto-detecting the most likely encoding for a string.
It would run over the string and set a 'flag' if it saw anything which
did not appear likely to be the encoding in question.

One problem with this scheme was that encodings which merely appeared
less likely to be the correct one were completely rejected, even if there
was no better candidate. Another problem was that the 'identify filters'
had a huge amount of code duplication with the 'conversion filters'.

Eliminate the identify filters. Instead, when auto-detecting text
encoding, use conversion filters to see whether the input string is valid
in candidate encodings or not. At the same time, watch the type of
codepoints which the string decodes to and mark it as less likely if
non-printable characters (ESC, form feed, bell, etc.) or 'private use
area' codepoints are seen.

Interestingly, one old test case in which JIS text was misidentified
as UTF-8 (and this wrong behavior was enshrined in the test) was 'fixed'
and the JIS string is now auto-detected as JIS.
2020-11-09 13:45:17 +02:00
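
The detection approach described in 3e7acf901d can be sketched roughly as below. This is an illustration only, not mbstring's code: the `candidate` struct, the `decode_fn` callback type, the two toy decoders, and the one-point-per-codepoint penalty are all hypothetical. It shows only the two rules from the message: an encoding in which the input is invalid is rejected, while an encoding that decodes to suspicious codepoints is merely ranked lower.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder callback: reads one codepoint from in[0..len),
 * stores it in *cp and returns the number of bytes consumed, or 0 if the
 * bytes are not valid in this encoding. */
typedef size_t (*decode_fn)(const unsigned char *in, size_t len, uint32_t *cp);

struct candidate {
    const char *name;
    decode_fn decode;
};

/* Toy decoder: 7-bit ASCII only. */
size_t decode_ascii(const unsigned char *in, size_t len, uint32_t *cp)
{
    if (len == 0 || in[0] > 0x7F) return 0;
    *cp = in[0];
    return 1;
}

/* Toy decoder: ISO-8859-1, where every byte maps to a codepoint. */
size_t decode_latin1(const unsigned char *in, size_t len, uint32_t *cp)
{
    if (len == 0) return 0;
    *cp = in[0];
    return 1;
}

/* Control characters other than tab, LF and CR, plus the BMP private use
 * area, count as "unlikely in real text". */
int is_suspicious(uint32_t cp)
{
    if (cp < 0x20 && cp != '\t' && cp != '\n' && cp != '\r') return 1;
    if (cp == 0x7F) return 1;
    return cp >= 0xE000 && cp <= 0xF8FF;
}

/* Returns -1 if the input is invalid in this encoding, otherwise the number
 * of suspicious codepoints seen (lower means more likely). */
long score(const struct candidate *c, const unsigned char *in, size_t len)
{
    long penalty = 0;
    while (len > 0) {
        uint32_t cp;
        size_t n = c->decode(in, len, &cp);
        if (n == 0) return -1;
        if (is_suspicious(cp)) penalty++;
        in += n;
        len -= n;
    }
    return penalty;
}

/* Returns the index of the valid candidate with the lowest penalty (the
 * earliest one on a tie), or -1 if the input is invalid in all of them. */
int detect(const struct candidate *cands, int ncands,
           const unsigned char *in, size_t len)
{
    int best = -1;
    long best_penalty = 0;
    assert(cands != NULL && ncands >= 0);
    for (int i = 0; i < ncands; i++) {
        long p = score(&cands[i], in, len);
        if (p < 0) continue;            /* invalid: rejected outright */
        if (best < 0 || p < best_penalty) {
            best = i;
            best_penalty = p;
        }
    }
    return best;
}
```

With the candidates ordered ASCII then ISO-8859-1, a string containing an ESC byte is penalized under both but still detected as ASCII instead of being rejected, which is the behavior change the message describes.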
Alex Dowad
cc03c54c36 Remove useless byte{2,4}{be,le} encodings from mbstring
There is no meaningful difference between these and UCS-{2,4}. They are
just a little bit more lax about passing errors silently. They also have
no known use.

Alias to UCS-{2,4} in case someone, somewhere is using them.
2020-11-09 13:45:16 +02:00
Alex Dowad
62317d592f Remove redundant includes from mbstring (and make sure correct config.h is used)
Very interesting... it turns out that when Valgrind support was enabled,
`#include "config.h"` from within mbstring was actually including the file "config.h"
from Valgrind, and not the one from mbstring!!

This is because -I/usr/include/valgrind was added to the compiler invocation _before_
-Iext/mbstring/libmbfl.

Make sure we actually include the file which was intended.
2020-08-31 23:17:58 +02:00
Alex Dowad
d4ef7ef11d Inline unneeded indirection for mbstring memory management
All memory allocation and deallocation for mbstring bounces through a table of
function pointers before going to emalloc/efree/etc. But this is unnecessary.
The allocators are never swapped out. Better to just call them directly.
2020-08-31 23:16:09 +02:00
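
A rough before/after sketch of the change described in d4ef7ef11d. The names `alloc_table`, `dup_indirect` and `dup_direct` are hypothetical, and `malloc`/`free` stand in for `emalloc`/`efree` so the example compiles outside PHP; the real libmbfl identifiers are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Before: every allocation goes through a table of function pointers,
 * even though the table is never replaced at run time. */
struct alloc_table {
    void *(*alloc)(size_t);
    void (*release)(void *);
};

static const struct alloc_table default_allocators = { malloc, free };

char *dup_indirect(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = default_allocators.alloc(n);   /* indirect call */
    assert(p != NULL);
    memcpy(p, s, n);
    return p;
}

/* After: the allocator is called directly, so the compiler sees the real
 * callee and the table can be deleted. */
char *dup_direct(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);                     /* direct call */
    assert(p != NULL);
    memcpy(p, s, n);
    return p;
}
```

Both functions return the same result; the second simply drops a level of indirection that was never used.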
Peter Kokot
75fb74860d Normalize comments in *nix build system m4 files
Normalization includes:
- Use dnl for everything that can be omitted when configure is built in
  favor of the shell comment character # which is visible in the output.
- Line length normalized to 80 columns
- Dots for most of the one line sentences
- Macro definitions include similar pattern header comments now
2019-05-12 18:43:03 +02:00
Peter Kokot
3a4df95793 Simplify ext/mbstring/libmbfl/config.h creation
- The config.h.in is part of the standalone libmbfl library and it is
  forked and bundled.
2019-05-11 19:44:40 +02:00
Dmitry Stogov
7139c381f1 Fixed ZTS cache usage 2019-03-12 16:58:02 +03:00
Peter Kokot
9df6a1e4dd Add AS_HELP_STRING to *nix build configure options
Autoconf's default AS_HELP_STRING macro can properly format help
strings [1], so manually checking whether columns are aligned is no
longer needed.

[1] https://www.gnu.org/software/autoconf/manual/autoconf.html#Pretty-Help-Strings
2019-03-07 20:36:59 +01:00
Nikita Popov
d1c1481081 Unbundle oniguruma
And also switch detection over to pkg-config.
2019-02-11 14:53:19 +01:00
Peter Kokot
92ac598aab Remove local variables
This patch removes the so called local variables defined per
file basis for certain editors to properly show tab width, and
similar settings. These are mainly used by Vim and Emacs editors
yet with recent changes the once working definitions don't work
anymore in Vim without custom plugins or additional configuration.
Neither are these settings synced across the PHP code base.

A simpler and better approach is EditorConfig and fixing code
using some code style fixing tools in the future instead.

This patch also removes the so called modelines for Vim. Modelines
allow Vim editor specifically to set some editor configuration such as
syntax highlighting, indentation style and tab width to be set in the
first line or the last 5 lines per file basis. Since the php test
files have syntax highlighting already set in most editors properly and
EditorConfig takes care of the indentation settings, this patch removes
these as well for the Vim 6.0 and newer versions.

With the removal of local variables for certain editors such as
Emacs and Vim, the footer is also probably not needed anymore when
creating extensions using ext_skel.php script.

Additionally, Vim modelines for setting php syntax and some editor
settings have been removed from some *.phpt files. These are mostly
not relevant for phpt files, nor do they work properly in the
middle of a file.
2019-02-03 21:03:00 +01:00
Peter Kokot
b189c2432a Remove HAVE_STDARG_H
The C89 standard and later defines the `<stdarg.h>` header as part of
the standard headers [1]. On current systems it is always present and
can be included unconditionally.

Checking for presence and functionality of the `<stdarg.h>` header and
variadic function is not relevant anymore on current systems since this
is always available.

Also Autoconf suggests relying on at least C89 or above [2] and [3].

The following files were regenerated with re2c 1.0.3:
- Zend/zend_language_scanner.c
- Zend/zend_language_scanner_defs.h

Refs:
[1] https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2] http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
[3] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-18 05:44:56 +02:00
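
As a small illustration of what b189c2432a relies on: since `<stdarg.h>` is a C89 standard header, a variadic function can include it without any `HAVE_STDARG_H` guard. The helper `sum_ints` is hypothetical and is not taken from the PHP source.

```c
#include <assert.h>
#include <stdarg.h>   /* standard since C89: no HAVE_STDARG_H guard needed */

/* Hypothetical helper: sums `count` arguments of type int. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` evaluates to 6.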
Peter Kokot
d3ca28f569 Remove HAVE_STRING_H
The C89 standard and later defines the `<string.h>` header as part of
the standard headers [1] and on current systems it is always present.

The code also included the `<strings.h>` header as an alternative in some
files. This kind of check was relevant on some older systems where the
`<strings.h>` file included definitions for the C89 compliant
`<string.h>`. Today such alternative check is not required anymore. The
`<strings.h>` file is part of the POSIX definition these days.

Also Autoconf suggests doing this and relying on C89 or above [2] and [3].

This patch also cleans up a few unused `<strings.h>` inclusions in libmbfl.

[1]: https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2]: http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
[3]: https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-18 05:32:08 +02:00
Peter Kokot
7dd62811ce Remove HAVE_STDLIB_H
The C89 and later standard defines the `<stdlib.h>` header as part of
the standard headers [1] and on current systems it is always present
and the `HAVE_STDLIB_H` symbol can be removed.

Also Autoconf suggests doing this and relying on C89 or above [2] and [3].

[1] https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2] http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
[3] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-16 20:53:53 +02:00
Peter Kokot
6db3c105f2 Remove AC_FUNC_MEMCMP
Autoconf 2.59d (released in 2006) [1] started promoting several macros
as not relevant for newer systems anymore, including the `AC_FUNC_MEMCMP`.

On some old systems such as SunOS 4.1.3 (EOL in 2003) and NeXT x86
OpenStep (discontinued) the `memcmp` function wasn't present or it
didn't work properly. [2]

On current systems including at least Solaris 10+ this check is not
relevant anymore.

Refs:
[1] http://git.savannah.gnu.org/cgit/autoconf.git/tree/NEWS
[2] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-04 12:02:47 +02:00
Peter Kokot
f86d3de87f Remove AC_HEADER_TIME
Autoconf 2.59d (released in 2006) [1] started promoting several macros
as not relevant for newer systems anymore, including the `AC_HEADER_TIME`.

This macro checks if both `<sys/time.h>` and `<time.h>` can be included
at the same time and defines the `TIME_WITH_SYS_TIME` and
`HAVE_SYS_TIME_H` symbols. On current systems such a check is not relevant
anymore because in case both headers are present both can be also
included at the same time.

This patch simplifies this checking.

Refs:
[1] http://git.savannah.gnu.org/cgit/autoconf.git/tree/NEWS
[2] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-02 19:24:55 +02:00
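
A sketch of the kind of simplification f86d3de87f describes, assuming a POSIX system. Without `TIME_WITH_SYS_TIME`, code only needs to guard `<sys/time.h>` on its presence and can always include `<time.h>` alongside it. The `#define` below stands in for what configure would write to config.h, and `now_microseconds` is a hypothetical helper; the exact form used in the patch is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed here; normally this comes from the generated config.h. */
#define HAVE_SYS_TIME_H 1

/* Simplified form: no TIME_WITH_SYS_TIME branch, both headers may be
 * included together. */
#ifdef HAVE_SYS_TIME_H
# include <sys/time.h>
#endif
#include <time.h>

/* Hypothetical helper that uses a declaration from each header. */
long long now_microseconds(void)
{
    time_t t;
#ifdef HAVE_SYS_TIME_H
    struct timeval tv;
    if (gettimeofday(&tv, NULL) == 0) {
        return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
    }
#endif
    /* Fallback with one-second resolution from <time.h>. */
    t = time(NULL);
    assert(t != (time_t)-1);
    return (long long)t * 1000000LL;
}
```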
Peter Kokot
8e230d364d Remove AC_C_CONST
Autoconf 2.59d (released in 2006) [1] started promoting several macros
as not relevant for newer systems, including the `AC_C_CONST`.

The `const` keyword is used in C since C89. On old systems some compilers
lacked the `const` and this macro defined it to be empty. This check was
relevant on systems with compilers before C89 and on current systems it
can be omitted. [2]

PHP also requires at least C89 so `const` is always available.

Refs:
[1] http://git.savannah.gnu.org/cgit/autoconf.git/tree/NEWS
[2] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-02 18:55:03 +02:00
Peter Kokot
4371945b8b Replace obsolete AC_TRY_FOO with AC_FOO_IFELSE
Autoconf 2.50 released in 2001 made several macros obsolete including
the AC_TRY_RUN, AC_TRY_COMPILE and AC_TRY_LINK:
http://git.savannah.gnu.org/cgit/autoconf.git/tree/ChangeLog.2

These macros should be replaced with the current AC_FOO_IFELSE instead:
- AC_TRY_RUN with AC_RUN_IFELSE and AC_LANG_SOURCE
- AC_TRY_LINK with AC_LINK_IFELSE and AC_LANG_PROGRAM
- AC_TRY_COMPILE with AC_COMPILE_IFELSE and AC_LANG_PROGRAM

PHP 5.4 to 7.1 require Autoconf 2.59+ version, PHP 7.2 and above require
2.64+ version, and the PHP 7.2 phpize script requires 2.59+ version which
are all greater than above mentioned 2.50 version therefore systems
should be well supported by now.

This patch was created with the help of autoupdate script:
autoupdate <file>

Reference docs:
- https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Obsolete-Macros.html
- https://www.gnu.org/software/autoconf/manual/autoconf-2.59/autoconf.pdf
2018-07-30 02:36:38 +02:00
Peter Kokot
8d3f8ca12a Remove unused Git attributes ident
The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.

In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.

This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
2018-07-25 00:53:25 +02:00
Christoph M. Becker
271ae3eb2b Fix #76574: use of undeclared identifiers INT_MAX and LONG_MAX
As of Oniguruma 6.4.0 <limits.h> is required, so we have to add a check
for this header file to set the respective macro.
2018-07-10 14:28:28 +02:00
Peter Kokot
5c5bd30339 Remove --with-libmbfl configure option
The bundled libmbfl library is no longer API or ABI compatible with
the (currently unmaintained) upstream library. As such, building
against an external libmbfl is no longer possible.
2017-10-28 16:11:30 +02:00
Dmitry Stogov
85a62d54b4 Fixed build 2016-11-28 12:12:54 +03:00
Anatol Belski
b8645ef29e fix oniguruma.h copying 2016-11-25 23:55:27 +01:00
Anatol Belski
2a76d2282a upgrade to Oniguruma 6.1.2 2016-11-25 22:00:53 +01:00
Kalle Sommer Nielsen
2104bea5d7 Remove Netware support
If this does not break the Unix system somehow, I'll be amazed. This should get most of it out, apologies for any errors this may cause on non-Windows ends which I cannot test atm.
2016-11-12 11:20:01 +01:00
Anatol Belski
7a6a3d923b fix arg order, CFLAGS is the fifth arg in m4 2014-10-17 16:03:40 +02:00
Anatol Belski
0490a32249 more exts converted for static tsrm ls pointer
mbstring, pcre, reflection
2014-10-15 19:19:23 +02:00
Christopher Jones
c6d977dd39 Fix long-standing visual pain point: the misalignment of './configure help' text.
Whitespace changes and a couple of grammar fixes.
2013-08-06 11:06:09 -07:00
Rui Hirokawa
4122ef275c added iso2022jp-mobile and emoji unsupported in unicode 6.0. 2011-08-24 15:28:44 +00:00
Rui Hirokawa
c746cf5dc9 updated libmbfl to 1.3.2 (JISX-0213:2004 support). 2011-08-20 07:24:04 +00:00
Rui Hirokawa
484e6b8fb3 added gb18030 encoding to mbstring/libmbfl. 2011-08-14 14:09:11 +00:00
Rui Hirokawa
360d18c479 added UTF-8-Mobile for pictogram support. 2011-08-13 12:44:28 +00:00
Rui Hirokawa
52948b534c added new files of libmbfl 1.3.0. 2011-08-02 02:50:11 +00:00
Moriyoshi Koizumi
d9dda48f8a - Update the bundled libmbfl to the latest on upstream. 2010-03-12 04:55:37 +00:00
Rasmus Lerdorf
3f5a58bd3a Someone strap down Jani and give him a sedative please.
This makes our toolchain work with the latest versions
of autoconf and avoids a lot of end-user grief.
2009-11-25 01:30:06 +00:00
Jani Taskinen
a0f3cf5cc4 MFB: Thanks to the "maintainers" who are too lazy to commit FIRST to HEAD! 2009-04-20 17:06:03 +00:00
Moriyoshi Koizumi
7f5d554cde - MFB (fixes build) 2009-04-16 02:05:00 +00:00
Moriyoshi Koizumi
5b494c0f16 - Add mbstring.http_output_conv_mimetypes that allows common non-text
types such as "application/xhtml+xml" to be converted by
  mb_output_handler().
2008-07-24 12:58:37 +00:00
Moriyoshi Koizumi
a9c4d66340 - Added a new configure option --with-onig=[DIR] that allows the extension
to link to the external oniguruma library.
- Prevent libmbfl files from being installed when --with-libmbfl is specified.
- Fixed mb_ereg_replace() to work with unicode strings.
2008-07-16 02:29:14 +00:00
Moriyoshi Koizumi
e8f4d65f0d - indentation fix & reenable mbregex again. 2008-07-15 18:04:14 +00:00
Rui Hirokawa
1dcec80d7b fixed bug #42502 va_* cannot detect. 2007-09-18 21:35:13 +00:00
Antony Dovgal
24c766b902 fix typo
it would have been much better if the Gentoo people had sent us this patch long ago instead of using their private patches.
2007-07-31 12:23:42 +00:00
Seiji Masugata
0d9e23c38b Synced PHP_5_2 Branch. 2006-12-21 17:37:53 +00:00
Rui Hirokawa
7d47f629e4 fixed --disable-mbregex to disable multibyte-regex. 2006-10-02 23:27:43 +00:00
Rui Hirokawa
adbab589aa fixed bug #37103: libmbfl headers were not installed correctly. 2006-10-02 15:32:48 +00:00
foobar
8bd7796184 Fixed bug #37103 (libmbfl headers not installed) 2006-04-17 22:13:39 +00:00
Marcus Boerger
bb94742080 - Disable mbregex support until someone finds a way to reenable the
required engine stuff
- Fix build
2006-02-23 20:15:36 +00:00
Rui Hirokawa
7c20dce548 fixed #29955 mb_strtoupper() / lower() broken with Turkish encoding.. 2005-12-23 13:50:29 +00:00
Antony Dovgal
0fb9af4eac fix #34977 (Compile failure on MacOSX due to use of varargs.h) 2005-10-26 13:49:19 +00:00
foobar
d12196e575 Fix VPATH build 2005-05-29 23:15:16 +00:00