703 commits

Author SHA1 Message Date
nobu
d1e2c50a0c Updating casefold.h
* common.mk (lib/unicode_normalize/tables.rb): should not depend
  on Unicode data files unless ALWAYS_UPDATE_UNICODE=yes, to get
  rid of downloading Unicode data unnecessarily.  [ruby-dev:49681]
* common.mk (enc/unicode/casefold.h): update Unicode files in a
  sub-make, not to let the header depend on the files always.
* enc/unicode/case-folding.rb: if gperf is not usable, assume the
  existing file is OK.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-24 00:17:17 +00:00
nobu
2d2b6460f4 iso_8859.h: SHARP_s
* enc/iso_8859.h (SHARP_s): name frequently used codepoint.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-11 02:24:38 +00:00
duerst
9fa8b80550 * enc/iso_8859_1.c: Revert to older version of code.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-11 00:58:49 +00:00
duerst
02f7ad6237 * enc/iso_8859_1.c: Implement non-ASCII case mapping.
* test/ruby/enc/test_case_comprehensive.rb: Tests for above.
* string.c: Add iso-8859-1 to supported encodings.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55373 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-11 00:46:21 +00:00
duerst
fd7925ffa5 * regenc.h/c: Rename onigenc_not_support_case_map to
onigenc_ascii_only_case_map.
* regenc.h: Add definition of onigenc_single_byte_ascii_only_case_map.
* enc/iso_8859_X.c, windows_125X.c, ascii.c, us-ascii.c, koi8_x.c:
  Replace onigenc_not_support_case_map by
  onigenc_single_byte_ascii_only_case_map.
* enc/big5.c, cp949.c, emacs_mule.c, euc_X.c, gbX.c, shift_jis.c,
  windows_31j.c: Replace onigenc_not_support_case_map by
  onigenc_ascii_only_case_map.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55305 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-07 06:05:18 +00:00
duerst
3dd98b2446 * string.c: Raise ArgumentError when invalid string is detected in
case mapping methods.
* enc/unicode.c: Check for invalid string and signal with negative
  length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-02 01:24:52 +00:00
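The behavior described in the commit above (3dd98b2446) can be observed from Ruby code roughly as follows. This is only a sketch: the helper name `case_map_rejects?` is hypothetical, and it assumes a Ruby version that includes this change (2.4 or later), where a case mapping method such as `String#upcase` raises ArgumentError for a string with invalid encoding.

```ruby
# Hypothetical helper (not part of Ruby): reports whether upcasing the given
# bytes, interpreted as UTF-8, is rejected with ArgumentError.
def case_map_rejects?(bytes)
  str = bytes.dup.force_encoding(Encoding::UTF_8)
  # A valid string is never expected to be rejected.
  return false if str.valid_encoding?
  str.upcase
  false
rescue ArgumentError
  # Assumption: raised by the case mapping code for invalid input.
  true
end

puts case_map_rejects?("abc")
puts case_map_rejects?("\xEB")
```

The lone byte 0xEB is the lead byte of a three-byte UTF-8 sequence with no continuation bytes, so the string is invalid as UTF-8.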
duerst
46647ac8df * enc/unicode.c: Handle DOTLESS_i by hand because it isn't involved in folding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-25 10:07:19 +00:00
duerst
ef6405f71c * enc/unicode.c: Fix flag error for switch from titlecase to lowercase.
* test/ruby/enc/test_case_mapping.rb: Tests for above error.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-24 23:01:39 +00:00
duerst
78c5ca7074 * include/ruby/oniguruma.h: Extend OnigEncodingTypeDefine to define a
new encoding primitive 'case_map' for case mapping
* enc/utf-8.c, utf_16be/le.c, utf_32be/le.c:
  add onigenc_unicode_case_map as case_map primitive
* enc/ascii.c, big5.c, cp949.c, emacs_mule.c, euc_jp/kr/tw.c, gb18030.c,
  gbk.c, iso_8859_1/2/3/4/5/6/7/8/9/10/11/13/14/15/16.c, koi8_r/u.c,
  shift_jis.c, us_ascii.c, windows_1250/1251/1252.c:
  add onigenc_not_support_case_map as case_map primitive


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-22 05:57:44 +00:00
duerst
84cd51919b * enc/unicode.h: Additional uses of ONIG_CASE_MAPPING compilation switch
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55020 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 11:00:29 +00:00
svn
3ab0ea8027 * append newline at EOF.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55019 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 10:46:33 +00:00
duerst
65db16de9f * include/ruby/oniguruma.h: Introducing ONIG_CASE_MAPPING compilation
switch
* include/ruby/oniguruma.h, enc/unicode.h: Using ONIG_CASE_MAPPING
  compilation switch


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 10:46:32 +00:00
akr
9d8ef4ea20 Update dependencies.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54544 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-11 11:50:00 +00:00
duerst
5e9d33ad49 * enc/unicode/case-folding.rb, casefold.h: Data generation to implement
swapcase functionality for titlecase characters. Swapcase isn't defined
  by Unicode, because the purpose/usage of swapcase is unclear anyway.
  The implementation follows a proposal from Nobu, swapping the case of
  each component of a titlecase character individually.
  This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-01 11:58:47 +00:00
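The approach described in the commit above (5e9d33ad49), decomposing a titlecase character and swapping the case of each component, can be illustrated with a small pure-Ruby sketch. It does not reproduce the C code in enc/unicode.c or the generated data in casefold.h. The table `TITLECASE_PARTS` and the methods `swap_one` and `swapcase_sketch` are hypothetical names, and the table covers only three digraphs.

```ruby
# Hypothetical decomposition table: titlecase digraph => its two components.
TITLECASE_PARTS = {
  "\u01C5" => ["D", "\u017E"], # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
  "\u01C8" => ["L", "j"],      # LATIN CAPITAL LETTER L WITH SMALL LETTER J
  "\u01CB" => ["N", "j"]       # LATIN CAPITAL LETTER N WITH SMALL LETTER J
}.freeze

# Swap the case of a single character that is not titlecase.
def swap_one(ch)
  up = ch.upcase
  ch == up ? ch.downcase : up
end

# Swap case character by character; a titlecase character is first
# decomposed, then each component is swapped individually.
def swapcase_sketch(str)
  str.each_char.map { |ch|
    parts = TITLECASE_PARTS[ch]
    parts ? parts.map { |part| swap_one(part) }.join : swap_one(ch)
  }.join
end

puts swapcase_sketch("a\u01C8B")
```

In this sketch the titlecase digraph U+01C8 becomes the two separate characters "l" and "J", so the result can be longer than the input.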
kazu
4a58f51a95 fix a typo [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54400 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 12:34:37 +00:00
duerst
78f540019a * enc/unicode/case-folding.rb, casefold.h: Tweaked handling of 6
special cases in CaseUnfold_11_Table.
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in
  test_titlecase activated; test_greek added. A test in test_cherokee fixed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 07:53:43 +00:00
duerst
49f25a1299 * enc/unicode.c: Cleaned up some comments.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54349 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 04:30:58 +00:00
duerst
0e6f8b166d * enc/unicode/case-folding.rb, casefold.h: Removing data for idempotent
titlecasing.
* enc/unicode.c: Adjust code to data removal.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54347 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 04:24:55 +00:00
duerst
2d20a27fb4 * enc/unicode.c: Refactoring in preparation for data reduction for
titlecase.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:54:47 +00:00
duerst
890ce36b79 * enc/unicode.c: Minor refactoring for I WITH DOT ABOVE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:36:35 +00:00
duerst
1582093c77 * enc/unicode.c: Removed code now covered by data from table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:26:23 +00:00
duerst
663fb4dd44 * enc/unicode.c: Adding comments. [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 02:49:20 +00:00
svn
d864828fb4 * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54230 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-22 12:08:31 +00:00
duerst
2f455ceca4 * include/ruby/oniguruma.h: Additional flag for characters that are titlecase.
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is
  already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-22 12:08:30 +00:00
duerst
fdbb82967f * enc/unicode.c: Fixed two macro definitions.
* test/ruby/enc/test_case_mapping.rb: Test cases that detected
  the above bugs.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-17 03:09:00 +00:00
svn
5c725ba9fe * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16 12:42:16 +00:00
naruse
623dde6ce7 * enc/trans/JIS: update Unicode's notice. [Bug #11844]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54129 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16 12:42:15 +00:00
duerst
e89232eb15 * enc/unicode.c: Eliminating common code.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54118 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 07:29:51 +00:00
duerst
8679f113e9 * enc/unicode.c: Expansion of some code repetition in preparation for
elimination of common code pieces.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54117 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 07:17:09 +00:00
svn
10fa31a603 * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 04:49:25 +00:00
duerst
00cc59a054 * enc/unicode.c: Additional macros and code to use mapping data in
CaseMappingSpecials array.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 04:49:24 +00:00
duerst
4b15b54d68 * include/ruby/oniguruma.h, enc/unicode.c: Adjusting flag assignments
and macros to work with unified CaseMappingSpecials array.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-14 09:39:54 +00:00
nobu
8b4448e2e1 unicode.c: off-by-one error
* enc/unicode.c (CodePointListValidP): fix off-by-one error.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12 01:35:54 +00:00
nobu
d48f923648 unicode.c: boundary check
* enc/unicode.c (CodePointListValidP): add pathological boundary
  check, for gcc 4.9.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12 01:15:31 +00:00
duerst
59766643db * enc/unicode/case-folding.rb, casefold.h: Streamlining approach to
case mapping data not available from case folding by unifying all
  three cases (special title, special upper, special lower).
* enc/unicode.c: Adjust macro names for above (macros are currently inactive).
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-11 07:11:27 +00:00
duerst
c4e6964141 * enc/unicode/case-folding.rb, casefold.h: Reducing size of TitleCase
table by eliminating duplicates.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-27 08:06:17 +00:00
duerst
7feb182a08 * enc/unicode/case-folding.rb: Adding possibility for debugging output
for TitleCase table in casefold.h.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53930 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-25 10:04:59 +00:00
duerst
f1f48e6103 * include/ruby/oniguruma.h: Rearranging flag assignments and making
space for titlecase indices; adding additional macros to add or
  extract titlecase index; adding comments for better documentation.
* enc/unicode.c: Moving some macros to include/ruby/oniguruma.h;
  activating use of titlecase indices.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-24 13:32:01 +00:00
duerst
1cc579cb00 * enc/unicode/case-folding.rb, casefold.h: Outputting actual titlecase
data (new table, with indices from other tables).
* enc/unicode.c: Ignoring titlecase data indices for the moment.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-23 12:53:10 +00:00
duerst
8aa8847b7c * enc/unicode/case-folding.rb, casefold.h: Reading casing data from
SpecialCasing.txt.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53904 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-23 06:21:55 +00:00
duerst
4ca9138bac * enc/unicode/case-folding.rb, casefold.h: Adding flag for title-case,
not yet operational.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53891 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-22 09:34:34 +00:00
duerst
5470ce8206 * enc/unicode/case-folding.rb, casefold.h: Fixed bug that prevented inclusion
of compatibility characters in upper-/lower-case mappings.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53890 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-22 09:17:43 +00:00
duerst
6286ff6301 * enc/unicode.c: Activated use of case mapping data in CaseUnfold_11 array.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53870 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-19 03:45:32 +00:00
duerst
6a808bda64 * enc/unicode/case-folding.rb, casefold.h: Used only first element
(rather than all) of target in CaseUnfold_11 array.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-16 10:10:37 +00:00
duerst
c3554cdea6 * enc/unicode/case-folding.rb: Added debugging option
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53833 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-15 05:43:55 +00:00
svn
60c7061770 * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 12:26:36 +00:00
duerst
73ab88994f * enc/unicode/case-folding.rb, enc/unicode/casefold.h: Flags for
upper/lower conversion added (titlecase and SpecialCasing still missing)
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 12:26:35 +00:00
duerst
2ca7569c6d * string.c, enc/unicode.c: Disassociating ONIGENC_CASE_FOLD flag from
ONIGENC_CASE_DOWNCASE.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 11:44:12 +00:00
nobu
584f9e51d6 unicode.c: magic numbers
* enc/unicode.c (I_WITH_DOT_ABOVE, DOTLESS_i, DOT_ABOVE): name
  magic numbers.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53776 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 05:01:00 +00:00
duerst
8f10a72d90 * enc/unicode.c: Shortened macros for enc/unicode/casefold.h to
single-letter; use flags in casefold.h for logic.
* enc/unicode/case-folding.rb: Added flag for case folding.
  Changed parameter passing.
* enc/unicode/casefold.h: New flags added.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 04:00:31 +00:00