Commit graph

43 commits

Author SHA1 Message Date
Bob Weinand
a01dd9feda Revert "Port all internally used classes to use default_object_handlers"
This reverts commit 94ee4f9834.

The commit was a bit too late to be included in PHP 8.2 RC1. Given it's a massive ABI break, we decide to postpone the change to PHP 8.3.
2022-09-14 11:13:23 +02:00
Bob Weinand
94ee4f9834 Port all internally used classes to use default_object_handlers
Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
2022-08-31 16:45:27 +02:00
Máté Kocsis
559d5030a8
Declare ext/intl constants in stubs - part 10 (#9280) 2022-08-09 11:19:23 +02:00
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
Máté Kocsis
c6723728df
Generate ext/intl class entries from stubs
Closes GH-6670
2021-02-09 13:37:24 +01:00
Nikita Popov
259d050a39 Clean up BreakIterator create_object handler
Use standard zend_object_alloc() function and fix the
object_init_properties() call (which works out okay because there
are no properties).
2020-08-25 13:11:43 +02:00
Nikita Popov
ff19ec2df3 Introduce InternalIterator
Userland classes that implement Traversable must do so either
through Iterator or IteratorAggregate. The same requirement does
not exist for internal classes: They can implement the internal
get_iterator mechanism, without exposing either the Iterator or
IteratorAggregate APIs. This makes them usable in get_iterator(),
but incompatible with any Iterator based APIs.

A lot of internal classes do this, because exposing the userland
APIs is simply a lot of work. This patch alleviates this issue by
providing a generic InternalIterator class, which acts as an
adapater between get_iterator and Iterator, and can be easily
used by many internal classes. At the same time, we extend the
requirement that Traversable implies Iterator or IteratorAggregate
to internal classes as well.

Closes GH-5216.
2020-06-24 15:31:41 +02:00
Máté Kocsis
f00bcfbb7d
Generate method entries for ext/intl
Closes GH-5370
2020-04-14 13:39:00 +02:00
Nikita Popov
fb5bfcb75b Add a ZEND_UNCOMPARABLE value
To explicitly indicate that objects are uncomparable. For now
this has no functional difference from the usual 1 return value,
but makes intent clearer.
2020-03-31 12:36:48 +02:00
Máté Kocsis
81fee9f29f
Get rid of method mapping of BreakIterator classes 2020-02-25 23:05:24 +01:00
Máté Kocsis
bcb7847139
Add stubs for Intl BreakIterator
Closes GH-5207
2020-02-25 23:05:03 +01:00
Dmitry Stogov
b02b81299c Comparison cleanup:
- introduce zend_compare() that returns -1,0,1 dirctly (without intermediate zval)
- remove compare_objects() object handler, and keep only compare() handler
2019-10-07 17:57:49 +03:00
Gabriel Caruso
5d6e923d46
Remove mention of PHP major version in Copyright headers
Closes GH-4732.
2019-09-25 14:51:43 +02:00
Dmitry Stogov
91ef4124e5 Refactor zend_object_handlers API to pass zend_object* and zend_string* insted of zval(s). 2019-02-04 13:20:25 +03:00
Christoph M. Becker
8a4c2f1621 Require ICU ≥ 50.1
Given that ICU is a set of lively developed libraries, that ICU 50.1
has been released on 2012-11-05, and PHP 7.4 is scheduled to be
released seven years after it, we consider it appropriate to ditch
these legacy versions.

Particularly, that would be a reasonable groundwork to implement part
two of the “Deprecate and remove INTL_IDNA_VARIANT_2003” RFC[1], namely
to default idn_to_ascii()'s and idn_to_utf8()'s $variant parameter to
INTL_IDNA_VARIANT_UTS46, which is not defined in ICU < 4.6.

See also the related discussion on internals@[2].

[1] <https://wiki.php.net/rfc/deprecate-and-remove-intl_idna_variant_2003>
[2] <http://news.php.net/php.internals/101626>ff
2018-09-15 13:59:54 +02:00
Christoph M. Becker
1b61c3b210 Merge branch 'PHP-7.2'
* PHP-7.2:
  Fix #76556: get_debug_info handler for BreakIterator shows wrong type
2018-06-30 23:22:20 +02:00
Christoph M. Becker
1118fca75d Fix #76556: get_debug_info handler for BreakIterator shows wrong type
We use the retrieved type for the "type" element instead of the text.
This has been confused during the PHP 7 upgrade[1].

[1] http://git.php.net/?p=php-src.git;a=commit;h=1d793348067e5769144c0f7efd86428a4137baec
2018-06-30 23:15:02 +02:00
Dmitry Stogov
f2b4ec4bdc Export standard object handlers, to avoid indirect access 2018-05-31 11:57:22 +03:00
Anatol Belski
d8200e4885 Simplify namespace access
The icu namespace is an alias which resolves to the real namespace.
2018-04-01 01:03:40 +02:00
Anatol Belski
8d35a42383 Utilize the recommended way to handle the icu namespace 2018-03-31 18:51:56 +02:00
Dmitry Stogov
44e0b79ac6 Refactored array creation API. array_init() and array_init_size() are converted into macros calling zend_new_array(). They are not functions anymore and don't return any values. 2017-09-20 02:25:56 +03:00
Nikita Popov
29af302395 Remove useless dtor handlers in intl
These are only indirections to the default handler
2016-07-16 23:16:43 +02:00
Dmitry Stogov
4a2e40bb86 Use ZSTR_ API to access zend_string elements (this is just renaming without semantick changes). 2015-06-30 04:05:24 +03:00
Anatol Belski
bdeb220f48 first shot remove TSRMLS_* things 2014-12-13 23:06:14 +01:00
Johannes Schlüter
d0cb715373 s/PHP 5/PHP 7/ 2014-09-19 18:33:14 +02:00
Anatol Belski
c3e3c98ec6 master renames phase 1 2014-08-25 19:24:55 +02:00
Anatol Belski
63d3f0b844 basic macro replacements, all at once 2014-08-19 08:07:31 +02:00
Xinchen Hui
9fb8c16b6c Fixed temporarily un-expected object re-init 2014-06-29 15:28:55 +08:00
Xinchen Hui
1d79334806 Fixed get_debug_info 2014-06-28 23:30:46 +08:00
Xinchen Hui
b6e9c76d67 Refactoring ext/intl (only compilerable now, far to finish :<) 2014-06-28 12:20:35 +08:00
Xinchen Hui
4fbaddb4f8 Refactoring ext/intl (incompleted) 2014-06-28 00:02:50 +08:00
Dmitry Stogov
050d7e38ad Cleanup (1-st round) 2014-04-15 15:40:40 +04:00
Dmitry Stogov
f4cfaf36e2 Use better data structures (incomplete) 2014-02-10 10:04:30 +04:00
Gustavo André dos Santos Lopes
3363e04fb4 intl: remove extra quotes from arginfo params 2013-07-21 03:30:28 +02:00
Gustavo Lopes
b4ba46cb99 Fix arginfo of BreakIterator::getLocale 2013-01-29 19:06:14 +01:00
Gustavo André dos Santos Lopes
d8b067e66f BreakIterator: fix compat with old ICU versions 2012-06-25 12:05:13 +02:00
Gustavo André dos Santos Lopes
77daa3482d BreakIterator::getPartsIterator: new optional arg
Can take one of:
* IntlPartsIterator::KEY_SEQUENTIAL (keys are 0, 1, ...)
* IntlPartsIterator::KEY_LEFT (keys are left boundaries)
* IntlPartsIterator::KEY_LEFT (keys are right boundaries)

The default is IntlPartsIterator::KEY_SEQUENTIAL (the previous behavior).
2012-06-22 18:52:06 +02:00
Gustavo André dos Santos Lopes
0a7ae87e91 Added IntlCodePointBreakIterator.
Objects of this class can be instantiated with

IntlBreakIterator::createCodePointInstance()

The method does not take a locale, as it would not make sense in this
context.

This class has one additional method:

long IntlCodePointIterator::getLastCodePoint()

which returns either -1 or the last code point we moved over, if any
(and discounting any movement before the last call to
IntlBreakIterator::first() or IntlBreakIterator::last()).
2012-06-22 18:19:54 +02:00
Gustavo André dos Santos Lopes
cee31091a9 Add Intl prefix to BreakIterator/RuleBasedBI 2012-06-10 22:42:38 +02:00
Gustavo André dos Santos Lopes
87dd0269ba Remove trailing space 2012-06-10 13:26:28 +02:00
Gustavo André dos Santos Lopes
afed66bb9e BreakIter: Removed getAvailableLocales/getHashCode 2012-06-10 00:05:00 +02:00
Gustavo André dos Santos Lopes
c6593a0e9b BreakIterator: add rules status constants 2012-06-04 23:09:10 +02:00
Gustavo André dos Santos Lopes
f5b421621d BreakIterator and RuleBasedBreakiterator added
This commit adds wrappers for the classes BreakIterator and
RuleBasedbreakIterator. The C++ ICU classes are described here:
<http://icu-project.org/apiref/icu4c/classBreakIterator.html>
<http://icu-project.org/apiref/icu4c/classRuleBasedBreakIterator.html>

Additionally, a tutorial is available at:
<http://userguide.icu-project.org/boundaryanalysis>

This implementation wraps UTF-8 text in a UText. The text is
iterated without any copying or conversion to UTF-16. There is
also no validation that the input is actually UTF-8; where there
are malformed sequences, the UText will simply U+FFFD.

The class BreakIterator cannot be instantiated directly (has a
private constructor). It provides the interface exposed by the ICU
abstract class with the same name. The PHP class is not abstract
because we may use it to wrap native subclasses of BreakIterator
that we don't know how to wrap. This class includes methods to
move the iterator position to the beginning (first()), to the
end (last()), forward (next()), backwards (previous()), to the
boundary preceding a certain position (preceding()) and following
a certain position (following()) and to obtain the current position
(current()). next() can also be used to advance or recede an
arbitrary number of positions.

BreakIterator also exposes other native methods:
getAvailableLocales(), getLocale() and factory methods to build
several predefined types of BreakIterators: createWordInstance()
for word boundaries, createCharacterInstance() for locale
dependent notions of "characters", createSentenceInstance() for
sentences, createLineInstance() and createTitleInstance() -- for
title casing breaks. These factories currently return
RuleBasedbreakIterators where the names of the rule sets are found
in the ICU data, observing the passed locale (although the locale
is taken into considering there are very few exceptions to the
root rules).

The clone and compare_object PHP object handlers are also
implemented, though the comparison does not yield meaningful results
when used with >, <, >= and <=.

Note that BreakIterator is an iterator only in the sense of the
first 'Iterator' in 'IteratorIterator', i.e., it does not
implement the Iterator interface. The reason is that there is
no sensible implementation for Iterator::key(). Using it for
an ordinal of the current boundary is not feasible because
we are allowed to move to any boundary at any time. It we were
to determine the current ordinal when last() is called we'd
have to traverse the whole input text to find out how many
breaks there were before. Therefore, BreakIterator implements
only Traversable. It can be wrapped in an IteratorIterator,
but the usual warnings apply.

Finally, I added a convenience method to BreakIterator:
getPartsIterator(). This provides an IntlIterator, backed
by the BreakIterator PHP object (i.e. moving the pointer or
changing the text in BreakIterator affects the iterator
and also moving the iterator affects the backing BreakIterator),
which allows traversing the text between each boundary.
This iterator uses the original text to retrieve the text
between two positions, not the code points returned by the
wrapping UText. Therefore, if the text includes invalid code
unit sequences, these invalid sequences will be in the output
of this iterator, not U+FFFD code points.

The class RuleBasedIterator exposes a constructor that allows
building an iterator from arbitrary compiled or non-compiled
rules. The form of these rules in described in the tutorial linked
above. The rest of the methods allow retrieving the rules --
getRules() and getCompiledRules() --, a hash code of the rule set
(hashCode()) and the rules statuses (getRuleStatus() and
getRuleStatusVec()).

Because the RuleBasedBreakIterator constructor may return parse
errors, I reuse the UParseError to text function that was in the
transliterator files. Therefore, I move that function to
intl_error.c.

common_enum.cpp was also changed, mainly to expose previously
static functions. This avoided code duplication when implementing
the BreakIterator iterator and the IntlIterator returned by
BreakIterator::getPartsIterator().
2012-06-04 22:25:07 +02:00