Previously the case mapping table was segregated by the type of
the character (upper, lower, title) and always stored the other
two variants (key, other1, other2). Now the table is segregated
by the target type (key, other). As only very few characters have
more than one target this only slightly increases the size of the
table.
The advantage of this layout is that we only need to perform a
single table lookup in the case table. Previously, depending on
the case that was hit, either one lookup in the property table,
or two lookups in the property table and one lookup in the case
table were required.
This changes the layout from libunicode in the OpenLDAP project
-- however, the last commit there was over 10 years ago, so I
don't see value in keeping this in sync.
mb_strtoupper() was converting lowercase characters into
titlecase characters, instead of uppercase characters. Luckily
there are only very few characters with a distinct titlecase
representation, so this mostly worked out okay...
In particular strings now store encoding rather than the
no_encoding.
I've also pruned out libmbfl APIs that existed in two forms, one
using no_encoding and the other using encoding. We were not actually
using any of the former.
As a side-effect mb_strtolower() and mb_strtoupper() now correctly
handle a NULL encoding parameter by using the internal encoding.
This is what caused the two test changes.
Do not try to extract the properties from a bitmask. Instead make
the function variadic and pass all properties individually.
Also add a php_unicode_is_prop1() function to check only a single
property.