* re.c (rb_reg_preprocess): new function for dynamic regexp with
\u{} such as Regexp.new("\\u{6666}").
(rb_reg_prepare_re): preprocess regexp for recompiling.
(read_escaped_byte): new function.
(unescape_escaped_nonascii): new function.
(append_utf8): new function.
(unescape_unicode_list): new function.
(unescape_unicode_bmp): new function.
(unescape_nonascii): new function.
(rb_reg_initialize): preprocess regexp.
* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.
* parse.y (STR_NEW3): take func instead of has8 and hasmb.
(parser_str_new): use default coderange mechanism except for regexp.
(parser_tokadd_utf8): copy regexp source as-is.
(parser_read_escape): UTF-8 stuff removed.
(parser_tokadd_escape): has8bit and hasmb removed.
(parser_tokadd_string): fix 8-bit single byte character with \u.
(parser_parse_string): has8bit and hasmb removed.
(parser_here_document): has8bit and hasmb removed.
(parser_yylex): call parser_tokadd_utf8 instead of read_escape for
UTF-8 character.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
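
As a rough illustration of the rb_reg_preprocess entry above, the sketch below builds a regexp at run time from a string containing \u{6666}, following the Regexp.new("\\u{6666}") example in the entry. It assumes a present-day Ruby, whose behavior may differ in detail from the revision described here; the variable names are illustrative only.

```ruby
# Build a regexp dynamically from a string that contains the escape
# sequence \u{6666} (the backslash is doubled inside the Ruby string literal).
pattern = Regexp.new("\\u{6666}")

# The regexp interprets the escape, so it matches the character U+6666 itself.
target  = "\u{6666}"
matched = (pattern =~ target)

# The regexp source keeps the escape as it was written.
source = pattern.source
```

Here `matched` is the match position in `target`, and `source` is the original pattern text rather than the unescaped character.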
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT.
rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT.
This is because a single-byte 8-bit character, such as Shift_JIS 1-byte
katakana, is represented by ENC_CODERANGE_MULTI even though it is not
multibyte.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
(rb_enc_str_asciicompat_p): defined.
* re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p.
(rb_reg_quote): return ascii-8bit string if the argument is
ascii-only to generate encoding generic regexp if possible.
(rb_reg_s_union): fix encoding handling. [ruby-dev:32094]
* string.c (rb_enc_str_asciionly_p): defined.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
matches to current encoding.
* re.c (char_to_option, rb_char_to_option_kcode): 'n' is no longer a
kcode option.
* re.c (rb_reg_to_s, rb_reg_error_desc): copy encoding rather than
append as an option.
* re.c (make_regexp, rb_reg_prepare_re): use encoding of Regexp and
String instead of kcode.
* re.c (rb_reg_initialize): set fixed option if none is set.
* re.c (rb_reg_regcomp): ditto.
* re.c (rb_reg_equal): check if encodings are equal.
* re.c (rb_reg_initialize_m): encoding option is obsolete.
* re.c (rb_kcode, rb_get_kcode, rb_set_kcode): removed.
* re.c (Init_Regexp): removed Regexp#kcode method.
* ruby.c (proc_options): allow long encoding name.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* re.c (Init_Regexp): new method Regexp#encoding.
* string.c (str_encoding): moved to encoding.c
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
condition statement much shorter, if no else clause is needed.
* string.c (rb_str_match_m): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (rb_str_rindex_m): was confusing character offset and
byte offset.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
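
To make the character offset versus byte offset distinction in the rb_str_rindex_m entry concrete, here is a small sketch. It assumes a present-day Ruby with UTF-8 strings, where String#rindex takes and returns character offsets; the sample text is arbitrary.

```ruby
# Four hiragana characters (a, i, u, i); each takes three bytes in UTF-8.
text = "\u3042\u3044\u3046\u3044"
needle = "\u3044"

char_count = text.length      # counted in characters
byte_count = text.bytesize    # counted in bytes

# rindex reports a character offset, not a byte offset.
last_index = text.rindex(needle)

# The optional start position is also a character offset: the search
# begins at character 2 and moves backward.
bounded_index = text.rindex(needle, 2)
```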
object or nil if it's not the target type. This mechanism is used
to convert types in methods implemented in C.
* hash.c (rb_hash_s_try_convert): ditto.
* io.c (rb_io_s_try_convert): ditto.
* re.c (rb_reg_s_try_convert): ditto.
* string.c (rb_str_s_try_convert): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
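
The try_convert methods listed in the entry above can be exercised from Ruby as in the following sketch. It assumes a present-day Ruby; the `path_like` object is a hypothetical example of something that opts in to implicit String conversion and is not taken from the source.

```ruby
# try_convert returns the converted object when the argument supports the
# implicit conversion (to_str, to_hash, to_regexp, ...), and nil otherwise.
str_ok   = String.try_convert("abc")   # already a String
str_nil  = String.try_convert(42)      # Integer has no to_str
hash_nil = Hash.try_convert([1, 2])    # Array has no to_hash
re_ok    = Regexp.try_convert(/a+/)    # already a Regexp
re_nil   = Regexp.try_convert("a+")    # String has no to_regexp

# Hypothetical object that defines to_str, so String.try_convert accepts it.
path_like = Object.new
def path_like.to_str
  "/tmp/example"
end
converted = String.try_convert(path_like)
```

Returning nil instead of raising lets the caller decide how to handle an argument of the wrong type.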
* re.c (rb_reg_error_desc): make RegexpError for initialization error.
* re.c (rb_reg_compile): return nil and set errinfo if error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
between int and string.
* re.c (rb_reg_compile): append regexp options to error message.
[ruby-dev:31334]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12863 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
literal. [ruby-dev:31336]
* re.c (rb_reg_compile): should not use a regexp that could not be
initialized. [ruby-dev:31333]
return the error message so that the parser knows about it.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e