Merge branch 'PHP-5.4' into PHP-5.5

* PHP-5.4: Upgrade PCRE to 8.36, it fixes some crashes
2025-08-18 15:08:55 +02:00 · 2015-04-27 23:22:44 -07:00
commit 13c32a102c
parent 957aa220aa 23917b451b
65 changed files with 5142 additions and 4382 deletions
--- a/ext/pcre/pcrelib/AUTHORS
+++ b/ext/pcre/pcrelib/AUTHORS
@@ -8,7 +8,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.
-Copyright (c) 1997-2013 University of Cambridge
+Copyright (c) 1997-2014 University of Cambridge
 All rights reserved
@@ -19,7 +19,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu
-Copyright(c) 2010-2013 Zoltan Herczeg
+Copyright(c) 2010-2014 Zoltan Herczeg
 All rights reserved.
@@ -30,7 +30,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu
-Copyright(c) 2009-2013 Zoltan Herczeg
+Copyright(c) 2009-2014 Zoltan Herczeg
 All rights reserved.
--- a/ext/pcre/pcrelib/ChangeLog
+++ b/ext/pcre/pcrelib/ChangeLog
@@ -1,6 +1,224 @@
 ChangeLog for PCRE
 ------------------
 Version 8.36 26-September-2014
 ------------------------------
 1.  Got rid of some compiler warnings in the C++ modules that were shown up by
    -Wmissing-field-initializers and -Wunused-parameter.
 2.  The tests for quantifiers being too big (greater than 65535) were being
    applied after reading the number, and stupidly assuming that integer
    overflow would give a negative number. The tests are now applied as the
    numbers are read.
 3.  Tidy code in pcre_exec.c where two branches that used to be different are
    now the same.
 4.  The JIT compiler did not generate match limit checks for certain
    bracketed expressions with quantifiers. This may lead to exponential
    backtracking, instead of returning with PCRE_ERROR_MATCHLIMIT. This
    issue should be resolved now.
 5.  Fixed an issue, which occurs when nested alternatives are optimized
    with table jumps.
 6.  Inserted two casts and changed some ints to size_t in the light of some
    reported 64-bit compiler warnings (Bugzilla 1477).
 7.  Fixed a bug concerned with zero-minimum possessive groups that could match
    an empty string, which sometimes were behaving incorrectly in the
    interpreter (though correctly in the JIT matcher). This pcretest input is
    an example:
      '\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
      NON QUOTED "QUOT""ED" AFTER "NOT MATCHED
    the interpreter was reporting a match of 'NON QUOTED ' only, whereas the
    JIT matcher and Perl both matched 'NON QUOTED "QUOT""ED" AFTER '. The test
    for an empty string was breaking the inner loop and carrying on at a lower
    level, when possessive repeated groups should always return to a higher
    level as they have no backtrack points in them. The empty string test now
    occurs at the outer level.
 8.  Fixed a bug that was incorrectly auto-possessifying \w+ in the pattern
    ^\w+(?>\s*)(?<=\w) which caused it not to match "test test".
 9.  Give a compile-time error for \o{} (as Perl does) and for \x{} (which Perl
    doesn't).
 10. Change 8.34/15 introduced a bug that caused the amount of memory needed
    to hold a pattern to be incorrectly computed (too small) when there were
    named back references to duplicated names. This could cause "internal
    error: code overflow" or "double free or corruption" or other memory
    handling errors.
 11. When named subpatterns had the same prefixes, back references could be
    confused. For example, in this pattern:
      /(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
    the reference to 'Name' was incorrectly treated as a reference to a
    duplicate name.
 12. A pattern such as /^s?c/mi8 where the optional character has more than
    one "other case" was incorrectly compiled such that it would only try to
    match starting at "c".
 13. When a pattern starting with \s was studied, VT was not included in the
    list of possible starting characters; this should have been part of the
    8.34/18 patch.
 14. If a character class started [\Qx]... where x is any character, the class
    was incorrectly terminated at the ].
 15. If a pattern that started with a caseless match for a character with more
    than one "other case" was studied, PCRE did not set up the starting code
    unit bit map for the list of possible characters. Now it does. This is an
    optimization improvement, not a bug fix.
 16. The Unicode data tables have been updated to Unicode 7.0.0.
 17. Fixed a number of memory leaks in pcregrep.
 18. Avoid a compiler warning (from some compilers) for a function call with
    a cast that removes "const" from an lvalue by using an intermediate
    variable (to which the compiler does not object).
 19. Incorrect code was compiled if a group that contained an internal recursive
    back reference was optional (had quantifier with a minimum of zero). This
    example compiled incorrect code: /(((a\2)|(a*)\g<-1>))*/ and other examples
    caused segmentation faults because of stack overflows at compile time.
 20. A pattern such as /((?(R)a|(?1)))+/, which contains a recursion within a
    group that is quantified with an indefinite repeat, caused a compile-time
    loop which used up all the system stack and provoked a segmentation fault.
    This was not the same bug as 19 above.
 21. Add PCRECPP_EXP_DECL declaration to operator<< in pcre_stringpiece.h.
    Patch by Mike Frysinger.
 Version 8.35 04-April-2014
 --------------------------
 1.  A new flag is set, when property checks are present in an XCLASS.
    When this flag is not set, PCRE can perform certain optimizations
    such as studying these XCLASS-es.
 2.  The auto-possessification of character sets were improved: a normal
    and an extended character set can be compared now. Furthermore
    the JIT compiler optimizes more character set checks.
 3.  Got rid of some compiler warnings for potentially uninitialized variables
    that show up only when compiled with -O2.
 4.  A pattern such as (?=ab\K) that uses \K in an assertion can set the start
    of a match later than the end of the match. The pcretest program was not
    handling the case sensibly - it was outputting from the start to the next
    binary zero. It now reports this situation in a message, and outputs the
    text from the end to the start.
 5.  Fast forward search is improved in JIT. Instead of the first three
    characters, any three characters with fixed position can be searched.
    Search order: first, last, middle.
 6.  Improve character range checks in JIT. Characters are read by an imprecise
    function now, which returns with an unknown value if the character code is
    above a certain threshold (e.g: 256). The only limitation is that the value
    must be bigger than the threshold as well. This function is useful when
    the characters above the threshold are handled in the same way.
 7.  The macros whose names start with RAWUCHAR are placeholders for a future
    mode in which only the bottom 21 bits of 32-bit data items are used. To
    make this more memorable for those maintaining the code, the names have
    been changed to start with UCHAR21, and an extensive comment has been added
    to their definition.
 8.  Add missing (new) files sljitNativeTILEGX.c and sljitNativeTILEGX-encoder.c
    to the export list in Makefile.am (they were accidentally omitted from the
    8.34 tarball).
 9.  The informational output from pcretest used the phrase "starting byte set"
    which is inappropriate for the 16-bit and 32-bit libraries. As the output
    for "first char" and "need char" really means "non-UTF-char", I've changed
    "byte" to "char", and slightly reworded the output. The documentation about
    these values has also been (I hope) clarified.
 10. Another JIT related optimization: use table jumps for selecting the correct
    backtracking path, when more than four alternatives are present inside a
    bracket.
 11. Empty match is not possible, when the minimum length is greater than zero,
    and there is no \K in the pattern. JIT should avoid empty match checks in
    such cases.
 12. In a caseless character class with UCP support, when a character with more
    than one alternative case was not the first character of a range, not all
    the alternative cases were added to the class. For example, s and \x{17f}
    are both alternative cases for S: the class [RST] was handled correctly,
    but [R-T] was not.
 13. The configure.ac file always checked for pthread support when JIT was
    enabled. This is not used in Windows, so I have put this test inside a
    check for the presence of windows.h (which was already tested for).
 14. Improve pattern prefix search by a simplified Boyer-Moore algorithm in JIT.
    The algorithm provides a way to skip certain starting offsets, and usually
    faster than linear prefix searches.
 15. Change 13 for 8.20 updated RunTest to check for the 'fr' locale as well
    as for 'fr_FR' and 'french'. For some reason, however, it then used the
    Windows-specific input and output files, which have 'french' screwed in.
    So this could never have worked. One of the problems with locales is that
    they aren't always the same. I have now updated RunTest so that it checks
    the output of the locale test (test 3) against three different output
    files, and it allows the test to pass if any one of them matches. With luck
    this should make the test pass on some versions of Solaris where it was
    failing. Because of the uncertainty, the script did not used to stop if
    test 3 failed; it now does. If further versions of a French locale ever
    come to light, they can now easily be added.
 16. If --with-pcregrep-bufsize was given a non-integer value such as "50K",
    there was a message during ./configure, but it did not stop. This now
    provokes an error. The invalid example in README has been corrected.
    If a value less than the minimum is given, the minimum value has always
    been used, but now a warning is given.
 17. If --enable-bsr-anycrlf was set, the special 16/32-bit test failed. This
    was a bug in the test system, which is now fixed. Also, the list of various
    configurations that are tested for each release did not have one with both
    16/32 bits and --enable-bsr-anycrlf. It now does.
 18. pcretest was missing "-C bsr" for displaying the \R default setting.
 19. Little endian PowerPC systems are supported now by the JIT compiler.
 20. The fast forward newline mechanism could enter to an infinite loop on
    certain invalid UTF-8 input. Although we don't support these cases
    this issue can be fixed by a performance optimization.
 21. Change 33 of 8.34 is not sufficient to ensure stack safety because it does
    not take account of existing stack usage. There is now a new global
    variable called pcre_stack_guard that can be set to point to an external
    function to check stack availability. It is called at the start of
    processing every parenthesized group.
 22. A typo in the code meant that in ungreedy mode the max/min qualifier
    behaved like a min-possessive qualifier, and, for example, /a{1,3}b/U did
    not match "ab".
 23. When UTF was disabled, the JIT program reported some incorrect compile
    errors. These messages are silenced now.
 24. Experimental support for ARM-64 and MIPS-64 has been added to the JIT
    compiler.
 25. Change all the temporary files used in RunGrepTest to be different to those
    used by RunTest so that the tests can be run simultaneously, for example by
    "make -j check".
 Version 8.34 15-December-2013
 -----------------------------
@@ -5311,7 +5529,7 @@ by an auxiliary program - but can then be edited by hand if required. There are
 now no calls to isalnum(), isspace(), isdigit(), isxdigit(), tolower() or
 toupper() in the code.
-7. Turn the malloc/free functions variables into pcre_malloc and pcre_free and
+7. Turn the malloc/free funtions variables into pcre_malloc and pcre_free and
 make them global. Abolish the function for setting them, as the caller can now
 set them directly.
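
Review note on 8.36 ChangeLog item 2 (quantifier limits): the entry says the "too big" test used to run after the whole number had been read and relied on integer overflow producing a negative value, and that the test is now applied as the digits are read. The sketch below illustrates that idea only; it is not PCRE's actual code. The names read_repeat_count and MAX_REPEAT are hypothetical, and the limit of 65535 is taken from the ChangeLog text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_REPEAT 65535L   /* limit quoted in the ChangeLog entry */

/* Hypothetical helper: read decimal digits starting at *pp.
 * Returns the value, or -1 as soon as it exceeds MAX_REPEAT.
 * The check runs after every digit, so the accumulator never
 * grows beyond MAX_REPEAT * 10 + 9 and cannot overflow a long.
 * On return *pp points just past the last digit consumed. */
long read_repeat_count(const char **pp)
{
    const char *p;
    long value = 0;

    assert(pp != NULL && *pp != NULL);
    p = *pp;

    while (isdigit((unsigned char)*p)) {
        value = value * 10 + (*p - '0');
        p++;
        if (value > MAX_REPEAT) {   /* reject immediately, no overflow */
            *pp = p;
            return -1;
        }
    }

    *pp = p;
    return value;
}
```

Checking inside the loop means an input with an arbitrarily long digit string is rejected after at most six digits, instead of depending on how signed overflow happens to behave.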
--- a/ext/pcre/pcrelib/LICENCE
+++ b/ext/pcre/pcrelib/LICENCE
@@ -24,7 +24,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.
-Copyright (c) 1997-2013 University of Cambridge
+Copyright (c) 1997-2014 University of Cambridge
 All rights reserved.
@@ -35,7 +35,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu
-Copyright(c) 2010-2013 Zoltan Herczeg
+Copyright(c) 2010-2014 Zoltan Herczeg
 All rights reserved.
@@ -46,7 +46,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu
-Copyright(c) 2009-2013 Zoltan Herczeg
+Copyright(c) 2009-2014 Zoltan Herczeg
 All rights reserved.
--- a/ext/pcre/pcrelib/NEWS
+++ b/ext/pcre/pcrelib/NEWS
@@ -1,6 +1,24 @@
 News about PCRE releases
 ------------------------
 Release 8.36 26-September-2014
 ------------------------------
 This is primarily a bug-fix release. However, in addition, the Unicode data
 tables have been updated to Unicode 7.0.0.
 Release 8.35 04-April-2014
 --------------------------
 There have been performance improvements for classes containing non-ASCII
 characters and the "auto-possessification" feature has been extended. Other
 minor improvements have been implemented and bugs fixed. There is a new callout
 feature to enable applications to do detailed stack checks at compile time, to
 avoid running out of stack for deeply nested parentheses. The JIT compiler has
 been extended with experimental support for ARM-64, MIPS-64, and PPC-LE.
 Release 8.34 15-December-2013
 -----------------------------
--- a/ext/pcre/pcrelib/README
+++ b/ext/pcre/pcrelib/README
@@ -45,14 +45,16 @@ the 16-bit library, which processes strings of 16-bit values, and one for the
 32-bit library, which processes strings of 32-bit values. The distribution also
 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
-C++.
+C++. Other C++ wrappers have been created from time to time. See, for example:
 https://github.com/YasserAsmi/regexp, which aims to be simple and similar in
 style to the C API.
-In addition, there is a set of C wrapper functions (again, just for the 8-bit
+The distribution also contains a set of C wrapper functions (again, just for
-library) that are based on the POSIX regular expression API (see the pcreposix
+the 8-bit library) that are based on the POSIX regular expression API (see the
-man page). These end up in the library called libpcreposix. Note that this just
+pcreposix man page). These end up in the library called libpcreposix. Note that
-provides a POSIX calling interface to PCRE; the regular expressions themselves
+this just provides a POSIX calling interface to PCRE; the regular expressions
-still follow Perl syntax and semantics. The POSIX API is restricted, and does
+themselves still follow Perl syntax and semantics. The POSIX API is restricted,
-not give full access to all of PCRE's facilities.
+and does not give full access to all of PCRE's facilities.
 The header file for the POSIX-style functions is called pcreposix.h. The
 official POSIX name is regex.h, but I did not want to risk possible problems
@@ -85,11 +87,12 @@ documentation is supplied in two other forms:
  1. There are files called doc/pcre.txt, doc/pcregrep.txt, and
     doc/pcretest.txt in the source distribution. The first of these is a
     concatenation of the text forms of all the section 3 man pages except
-     those that summarize individual functions. The other two are the text
+     the listing of pcredemo.c and those that summarize individual functions.
-     forms of the section 1 man pages for the pcregrep and pcretest commands.
+     The other two are the text forms of the section 1 man pages for the
-     These text forms are provided for ease of scanning with text editors or
+     pcregrep and pcretest commands. These text forms are provided for ease of
-     similar tools. They are installed in <prefix>/share/doc/pcre, where
+     scanning with text editors or similar tools. They are installed in
-     <prefix> is the installation prefix (defaulting to /usr/local).
+     <prefix>/share/doc/pcre, where <prefix> is the installation prefix
     (defaulting to /usr/local).
  2. A set of files containing all the documentation in HTML form, hyperlinked
     in various ways, and rooted in a file called index.html, is distributed in
@@ -372,12 +375,12 @@ library. They are also documented in the pcrebuild man page.
  Of course, the relevant libraries must be installed on your system.
-. The default size of internal buffer used by pcregrep can be set by, for
+. The default size (in bytes) of the internal buffer used by pcregrep can be
-  example:
+  set by, for example:
-  --with-pcregrep-bufsize=50K
+  --with-pcregrep-bufsize=51200
-  The default value is 20K.
+  The value must be a plain integer. The default is 20480.
 . It is possible to compile pcretest so that it links with the libreadline
  or libedit libraries, by specifying, respectively,
@@ -987,4 +990,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 05 November 2013
+Last updated: 24 October 2014
--- a/ext/pcre/pcrelib/config.h
+++ b/ext/pcre/pcrelib/config.h
@@ -314,7 +314,7 @@ them both to 0; an emulation function will be used. */
 #define PACKAGE_NAME "PCRE"
 /* Define to the full name and version of this package. */
-#define PACKAGE_STRING "PCRE 8.32"
+#define PACKAGE_STRING "PCRE 8.36"
 /* Define to the one symbol short name of this package. */
 #define PACKAGE_TARNAME "pcre"
@@ -323,7 +323,7 @@ them both to 0; an emulation function will be used. */
 #define PACKAGE_URL ""
 /* Define to the version of this package. */
-#define PACKAGE_VERSION "8.32"
+#define PACKAGE_VERSION "8.36"
 /* to make a symbol visible */
 /* #undef PCRECPP_EXP_DECL */
@@ -331,6 +331,13 @@ them both to 0; an emulation function will be used. */
 /* to make a symbol visible */
 /* #undef PCRECPP_EXP_DEFN */
 /* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
   parentheses (of any kind) in a pattern. This limits the amount of system
   stack that is used while compiling a pattern. */
 #ifndef PARENS_NEST_LIMIT
 #define PARENS_NEST_LIMIT 250
 #endif
 /* The value of PCREGREP_BUFSIZE determines the size of buffer used by
   pcregrep to hold parts of the file it is searching. This is also the
   minimum value. The actual amount of memory used by pcregrep is three times
@@ -432,7 +439,7 @@ them both to 0; an emulation function will be used. */
 /* Version number of package */
 #ifndef VERSION
-#define VERSION "8.34"
+#define VERSION "8.36"
 #endif
 /* Define to empty if `const' does not conform to ANSI C. */
@@ -444,3 +451,4 @@ them both to 0; an emulation function will be used. */
 /* Define to `unsigned int' if <sys/types.h> does not define. */
 /* #undef size_t */
--- a/ext/pcre/pcrelib/doc/pcre.txt
+++ b/ext/pcre/pcrelib/doc/pcre.txt
@@ -130,9 +130,11 @@ USER DOCUMENTATION
       The  user  documentation  for PCRE comprises a number of different sec-
       tions. In the "man" format, each of these is a separate "man page".  In
       the  HTML  format, each is a separate page, linked from the index page.
-       In the plain text format, all the sections, except  the  pcredemo  sec-
+       In the plain text format, the descriptions of the pcregrep and pcretest
-       tion, are concatenated, for ease of searching. The sections are as fol-
+       programs  are  in  files  called pcregrep.txt and pcretest.txt, respec-
-       lows:
+       tively. The remaining sections, except for the pcredemo section  (which
       is  a  program  listing),  are  concatenated  in  pcre.txt, for ease of
       searching. The sections are as follows:
         pcre              this document
         pcre-config       show PCRE installation configuration information
@@ -160,8 +162,8 @@ USER DOCUMENTATION
         pcretest          description of the pcretest testing command
         pcreunicode       discussion of Unicode and UTF-8/16/32 support
-       In addition, in the "man" and HTML formats, there is a short  page  for
+       In the "man" and HTML formats, there is also a short page  for  each  C
-       each C library function, listing its arguments and results.
+       library function, listing its arguments and results.
 AUTHOR
@@ -177,8 +179,8 @@ AUTHOR
 REVISION
-       Last updated: 13 May 2013
+       Last updated: 08 January 2014
-       Copyright (c) 1997-2013 University of Cambridge.
+       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
@@ -1674,6 +1676,8 @@ PCRE NATIVE API INDIRECTED FUNCTIONS
       int (*pcre_callout)(pcre_callout_block *);
       int (*pcre_stack_guard)(void);
 PCRE 8-BIT, 16-BIT, AND 32-BIT LIBRARIES
@@ -1809,6 +1813,14 @@ PCRE API OVERVIEW
       specified  points during a matching operation. Details are given in the
       pcrecallout documentation.
       The global variable pcre_stack_guard initially contains NULL. It can be
       set  by  the  caller  to  a function that is called by PCRE whenever it
       starts to compile a parenthesized part of a pattern.  When  parentheses
       are nested, PCRE uses recursive function calls, which use up the system
       stack. This function is provided so that applications  with  restricted
       stacks  can  force a compilation error if the stack runs out. The func-
       tion should return zero if all is well, or non-zero to force an error.
 NEWLINES
@@ -1849,7 +1861,8 @@ MULTITHREADING
       The  PCRE  functions  can be used in multi-threading applications, with
       the  proviso  that  the  memory  management  functions  pointed  to  by
       pcre_malloc, pcre_free, pcre_stack_malloc, and pcre_stack_free, and the
-       callout function pointed to by pcre_callout, are shared by all threads.
+       callout and stack-checking functions pointed  to  by  pcre_callout  and
       pcre_stack_guard, are shared by all threads.
       The  compiled form of a regular expression is not altered during match-
       ing, so the same compiled pattern can safely be used by several threads
@@ -1971,7 +1984,10 @@ CHECKING BUILD-TIME OPTIONS
       The output is a long integer that gives the maximum depth of nesting of
       parentheses  (of  any  kind) in a pattern. This limit is imposed to cap
       the amount of system stack used when a pattern is compiled. It is spec-
-       ified when PCRE is built; the default is 250.
+       ified  when PCRE is built; the default is 250. This limit does not take
       into account the stack that may already be used by the calling applica-
       tion.  For  finer  control  over compilation stack usage, you can set a
       pointer to an external checking function in pcre_stack_guard.
         PCRE_CONFIG_MATCH_LIMIT
@@ -2474,6 +2490,8 @@ COMPILATION ERROR CODES
         81  missing opening brace after \o
         82  parentheses are too deeply nested
         83  invalid range in character class
         84  group name must start with a non-digit
         85  parentheses are too deeply nested (stack check)
       The  numbers  32  and 10000 in errors 48 and 49 are defaults; different
       values may be used if the limits were changed when PCRE was built.
@@ -2714,12 +2732,16 @@ INFORMATION ABOUT A PATTERN
       tion. External callers can cause PCRE to use  its  internal  tables  by
       passing a NULL table pointer.
-         PCRE_INFO_FIRSTBYTE
+         PCRE_INFO_FIRSTBYTE (deprecated)
       Return information about the first data unit of any matched string, for
-       a non-anchored pattern. (The name of this option refers  to  the  8-bit
+       a non-anchored pattern. The name of this option  refers  to  the  8-bit
-       library,  where data units are bytes.) The fourth argument should point
+       library,  where  data units are bytes. The fourth argument should point
-       to an int variable.
+       to an int variable. Negative values are used for  special  cases.  How-
       ever,  this  means  that when the 32-bit library is in non-UTF-32 mode,
       the full 32-bit range of characters cannot be returned. For  this  rea-
       son,  this  value  is deprecated; use PCRE_INFO_FIRSTCHARACTERFLAGS and
       PCRE_INFO_FIRSTCHARACTER instead.
       If there is a fixed first value, for example, the  letter  "c"  from  a
       pattern  such  as (cat|cow|coyote), its value is returned. In the 8-bit
@@ -2739,10 +2761,38 @@ INFORMATION ABOUT A PATTERN
       of  a  subject string or after any newline within the string. Otherwise
       -2 is returned. For anchored patterns, -2 is returned.
-       Since for the 32-bit library using the non-UTF-32 mode,  this  function
+         PCRE_INFO_FIRSTCHARACTER
-       is  unable to return the full 32-bit range of the character, this value
+
-       is   deprecated;   instead   the   PCRE_INFO_FIRSTCHARACTERFLAGS    and
+       Return the value of the first data  unit  (non-UTF  character)  of  any
-       PCRE_INFO_FIRSTCHARACTER values should be used.
+       matched  string  in  the  situation where PCRE_INFO_FIRSTCHARACTERFLAGS
       returns 1; otherwise return 0. The fourth argument should point  to  an
       uint_t variable.
       In  the 8-bit library, the value is always less than 256. In the 16-bit
       library the value can be up to 0xffff. In the 32-bit library in  UTF-32
       mode  the  value  can  be up to 0x10ffff, and up to 0xffffffff when not
       using UTF-32 mode.
         PCRE_INFO_FIRSTCHARACTERFLAGS
       Return information about the first data unit of any matched string, for
       a  non-anchored  pattern.  The  fourth  argument should point to an int
       variable.
       If there is a fixed first value, for example, the  letter  "c"  from  a
       pattern  such  as  (cat|cow|coyote),  1  is returned, and the character
       value can be retrieved using PCRE_INFO_FIRSTCHARACTER. If there  is  no
       fixed first value, and if either
       (a)  the pattern was compiled with the PCRE_MULTILINE option, and every
       branch starts with "^", or
       (b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not
       set (if it were set, the pattern would be anchored),
       2 is returned, indicating that the pattern matches only at the start of
       a subject string or after any newline within the string. Otherwise 0 is
       returned. For anchored patterns, 0 is returned.
         PCRE_INFO_FIRSTTABLE
@@ -2954,39 +3004,6 @@ INFORMATION ABOUT A PATTERN
       option so that it can be saved and  restored  (see  the  pcreprecompile
       documentation for details).
         PCRE_INFO_FIRSTCHARACTERFLAGS
       Return information about the first data unit of any matched string, for
       a non-anchored pattern. The fourth argument  should  point  to  an  int
       variable.
       If  there  is  a  fixed first value, for example, the letter "c" from a
       pattern such as (cat|cow|coyote), 1  is  returned,  and  the  character
       value can be retrieved using PCRE_INFO_FIRSTCHARACTER.
       If there is no fixed first value, and if either
       (a)  the pattern was compiled with the PCRE_MULTILINE option, and every
       branch starts with "^", or
       (b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not
       set (if it were set, the pattern would be anchored),
       2 is returned, indicating that the pattern matches only at the start of
       a subject string or after any newline within the string. Otherwise 0 is
       returned. For anchored patterns, 0 is returned.
         PCRE_INFO_FIRSTCHARACTER
       Return   the  fixed  first  character  value  in  the  situation  where
       PCRE_INFO_FIRSTCHARACTERFLAGS returns 1; otherwise return 0. The fourth
       argument should point to an uint_t variable.
       In  the 8-bit library, the value is always less than 256. In the 16-bit
       library the value can be up to 0xffff. In the 32-bit library in  UTF-32
       mode  the  value  can  be up to 0x10ffff, and up to 0xffffffff when not
       using UTF-32 mode.
         PCRE_INFO_REQUIREDCHARFLAGS
       Returns  1 if there is a rightmost literal data unit that must exist in
@@ -4248,8 +4265,8 @@ AUTHOR
 REVISION
-       Last updated: 12 November 2013
+       Last updated: 09 February 2014
-       Copyright (c) 1997-2013 University of Cambridge.
+       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
@@ -5309,21 +5326,25 @@ BACKSLASH
       Those  that are not part of an identified script are lumped together as
       "Common". The current list of scripts is:
-       Arabic, Armenian, Avestan, Balinese, Bamum, Batak,  Bengali,  Bopomofo,
+       Arabic, Armenian, Avestan, Balinese, Bamum, Bassa_Vah, Batak,  Bengali,
-       Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma,
+       Bopomofo,  Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Car-
-       Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic,  Deseret,
+       ian, Caucasian_Albanian, Chakma, Cham, Cherokee, Common, Coptic, Cunei-
-       Devanagari,   Egyptian_Hieroglyphs,   Ethiopic,  Georgian,  Glagolitic,
+       form, Cypriot, Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hiero-
-       Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew,  Hira-
+       glyphs,  Elbasan,  Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,
-       gana,   Imperial_Aramaic,  Inherited,  Inscriptional_Pahlavi,  Inscrip-
+       Greek,  Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo, Hebrew, Hiragana,
       Imperial_Aramaic,    Inherited,     Inscriptional_Pahlavi,     Inscrip-
       tional_Parthian,   Javanese,   Kaithi,   Kannada,  Katakana,  Kayah_Li,
-       Kharoshthi,  Khmer,  Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian,
+       Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin, Lepcha,  Limbu,  Lin-
-       Lydian,    Malayalam,    Mandaic,    Meetei_Mayek,    Meroitic_Cursive,
+       ear_A,  Linear_B,  Lisu,  Lycian, Lydian, Mahajani, Malayalam, Mandaic,
-       Meroitic_Hieroglyphs,   Miao,  Mongolian,  Myanmar,  New_Tai_Lue,  Nko,
+       Manichaean,     Meetei_Mayek,     Mende_Kikakui,      Meroitic_Cursive,
-       Ogham,   Old_Italic,   Old_Persian,   Old_South_Arabian,    Old_Turkic,
+       Meroitic_Hieroglyphs,  Miao,  Modi, Mongolian, Mro, Myanmar, Nabataean,
-       Ol_Chiki,  Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samari-
+       New_Tai_Lue,  Nko,  Ogham,  Ol_Chiki,  Old_Italic,   Old_North_Arabian,
-       tan, Saurashtra, Sharada, Shavian,  Sinhala,  Sora_Sompeng,  Sundanese,
+       Old_Permic, Old_Persian, Old_South_Arabian, Old_Turkic, Oriya, Osmanya,
-       Syloti_Nagri,  Syriac,  Tagalog,  Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet,
+       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
-       Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh,  Ugaritic,  Vai,
+       Psalter_Pahlavi,  Rejang,  Runic,  Samaritan, Saurashtra, Sharada, Sha-
       vian, Siddham, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri,  Syriac,
       Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham, Tai_Viet, Takri, Tamil, Telugu,
       Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic,  Vai,  Warang_Citi,
       Yi.
       Each character has exactly one Unicode general category property, spec-
@@ -5510,7 +5531,9 @@ BACKSLASH
       Perl  documents  that  the  use  of  \K  within assertions is "not well
       defined". In PCRE, \K is acted upon  when  it  occurs  inside  positive
-       assertions, but is ignored in negative assertions.
+       assertions,  but  is  ignored  in negative assertions. Note that when a
       pattern such as (?=ab\K) matches, the reported start of the  match  can
       be greater than the end of the match.
   Simple assertions
@ -7399,19 +7422,23 @@ BACKTRACKING CONTROL
       Note  that  (*COMMIT)  at  the start of a pattern is not the same as an
       anchor, unless PCRE's start-of-match optimizations are turned  off,  as
-       shown in this pcretest example:
+       shown in this output from pcretest:
           re> /(*COMMIT)abc/
         data> xyzabc
          0: abc
-         xyzabc\Y
+         data> xyzabc\Y
         No match
-       PCRE  knows  that  any  match  must start with "a", so the optimization
-       skips along the subject to "a" before running the first match  attempt,
-       which  succeeds.  When the optimization is disabled by the \Y escape in
-       the second subject, the match starts at "x" and so the (*COMMIT) causes
-       it to fail without trying any other starting points.
+       For this pattern, PCRE knows that any match must start with "a", so the
+       optimization skips along the subject to "a" before applying the pattern
+       to  the first set of data. The match attempt then succeeds. In the sec-
+       ond set of data, the escape sequence \Y is interpreted by the  pcretest
+       program.  It  causes  the  PCRE_NO_START_OPTIMIZE option to be set when
       pcre_exec() is called.  This disables the optimization that skips along
       to the first character. The pattern is now applied starting at "x", and
       so the (*COMMIT) causes the match to  fail  without  trying  any  other
       starting points.
         (*PRUNE) or (*PRUNE:NAME)
@ -7618,8 +7645,8 @@ AUTHOR
 REVISION
-       Last updated: 03 December 2013
+       Last updated: 08 January 2014
-       Copyright (c) 1997-2013 University of Cambridge.
+       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
@ -7754,21 +7781,25 @@ PCRE SPECIAL CATEGORY PROPERTIES FOR \p and \P
 SCRIPT NAMES FOR \p AND \P
-       Arabic, Armenian, Avestan, Balinese, Bamum, Batak,  Bengali,  Bopomofo,
-       Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma,
-       Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic,  Deseret,
-       Devanagari,   Egyptian_Hieroglyphs,   Ethiopic,  Georgian,  Glagolitic,
-       Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew,  Hira-
-       gana,   Imperial_Aramaic,  Inherited,  Inscriptional_Pahlavi,  Inscrip-
+       Arabic, Armenian, Avestan, Balinese, Bamum, Bassa_Vah, Batak,  Bengali,
+       Bopomofo,  Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Car-
+       ian, Caucasian_Albanian, Chakma, Cham, Cherokee, Common, Coptic, Cunei-
+       form, Cypriot, Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hiero-
+       glyphs,  Elbasan,  Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,
+       Greek,  Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo, Hebrew, Hiragana,
       Imperial_Aramaic,    Inherited,     Inscriptional_Pahlavi,     Inscrip-
       tional_Parthian,   Javanese,   Kaithi,   Kannada,  Katakana,  Kayah_Li,
-       Kharoshthi,  Khmer,  Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian,
-       Lydian,    Malayalam,    Mandaic,    Meetei_Mayek,    Meroitic_Cursive,
-       Meroitic_Hieroglyphs,   Miao,  Mongolian,  Myanmar,  New_Tai_Lue,  Nko,
-       Ogham,   Old_Italic,   Old_Persian,   Old_South_Arabian,    Old_Turkic,
-       Ol_Chiki,  Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samari-
-       tan, Saurashtra, Sharada, Shavian,  Sinhala,  Sora_Sompeng,  Sundanese,
-       Syloti_Nagri,  Syriac,  Tagalog,  Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet,
-       Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh,  Ugaritic,  Vai,
+       Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin, Lepcha,  Limbu,  Lin-
+       ear_A,  Linear_B,  Lisu,  Lycian, Lydian, Mahajani, Malayalam, Mandaic,
+       Manichaean,     Meetei_Mayek,     Mende_Kikakui,      Meroitic_Cursive,
+       Meroitic_Hieroglyphs,  Miao,  Modi, Mongolian, Mro, Myanmar, Nabataean,
+       New_Tai_Lue,  Nko,  Ogham,  Ol_Chiki,  Old_Italic,   Old_North_Arabian,
+       Old_Permic, Old_Persian, Old_South_Arabian, Old_Turkic, Oriya, Osmanya,
+       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
+       Psalter_Pahlavi,  Rejang,  Runic,  Samaritan, Saurashtra, Sharada, Sha-
       vian, Siddham, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri,  Syriac,
       Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham, Tai_Viet, Takri, Tamil, Telugu,
       Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic,  Vai,  Warang_Citi,
       Yi.
@ -7840,6 +7871,8 @@ MATCH POINT RESET
         \K          reset start of match
       \K is honoured in positive assertions, but ignored in negative ones.
 ALTERNATION
@ -7877,11 +7910,13 @@ OPTION SETTING
         (?x)            extended (ignore white space)
         (?-...)         unset option(s)
-       The  following  are  recognized only at the start of a pattern or after
-       one of the newline-setting options with similar syntax:
+       The  following  are  recognized  only at the very start of a pattern or
+       after one of the newline or \R options with similar syntax.  More  than
       one of them may appear.
         (*LIMIT_MATCH=d) set the match limit to d (decimal number)
         (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
         (*NO_AUTO_POSSESS) no auto-possessification (PCRE_NO_AUTO_POSSESS)
         (*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE)
         (*UTF8)         set UTF-8 mode: 8-bit library (PCRE_UTF8)
         (*UTF16)        set UTF-16 mode: 16-bit library (PCRE_UTF16)
@ -7893,6 +7928,27 @@ OPTION SETTING
       the limits set by the caller of pcre_exec(), not increase them.
 NEWLINE CONVENTION
       These are recognized only at the very start of  the  pattern  or  after
       option settings with a similar syntax.
         (*CR)           carriage return only
         (*LF)           linefeed only
         (*CRLF)         carriage return followed by linefeed
         (*ANYCRLF)      all three of the above
         (*ANY)          any Unicode newline sequence
 WHAT \R MATCHES
       These  are  recognized  only  at the very start of the pattern or after
       option setting with a similar syntax.
         (*BSR_ANYCRLF)  CR, LF, or CRLF
         (*BSR_UNICODE)  any Unicode newline sequence
 LOOKAHEAD AND LOOKBEHIND ASSERTIONS
         (?=...)         positive look ahead
@ -7975,27 +8031,6 @@ BACKTRACKING CONTROL
         (*THEN:NAME)    equivalent to (*MARK:NAME)(*THEN)
 NEWLINE CONVENTIONS
       These are recognized only at the very start of the pattern or  after  a
       (*BSR_...), (*UTF8), (*UTF16), (*UTF32) or (*UCP) option.
         (*CR)           carriage return only
         (*LF)           linefeed only
         (*CRLF)         carriage return followed by linefeed
         (*ANYCRLF)      all three of the above
         (*ANY)          any Unicode newline sequence
 WHAT \R MATCHES
       These  are  recognized only at the very start of the pattern or after a
       (*...) option that sets the newline convention or a UTF or UCP mode.
         (*BSR_ANYCRLF)  CR, LF, or CRLF
         (*BSR_UNICODE)  any Unicode newline sequence
 CALLOUTS
         (?C)      callout
@ -8016,8 +8051,8 @@ AUTHOR
 REVISION
-       Last updated: 12 November 2013
+       Last updated: 08 January 2014
-       Copyright (c) 1997-2013 University of Cambridge.
+       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
--- a/ext/pcre/pcrelib/pcre.h
+++ b/ext/pcre/pcrelib/pcre.h
@ -5,7 +5,7 @@
 /* This is the public header file for the PCRE library, to be #included by
 applications that call the PCRE functions.
-           Copyright (c) 1997-2013 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
 /* The current PCRE version information. */
 #define PCRE_MAJOR          8
-#define PCRE_MINOR          34
+#define PCRE_MINOR          36
 #define PCRE_PRERELEASE     
-#define PCRE_DATE           2013-12-15
+#define PCRE_DATE           2014-09-26
 /* When an application links to a PCRE DLL in Windows, the symbols that are
 imported have to be identified as such. When building PCRE, the appropriate
@ -491,36 +491,42 @@ PCRE_EXP_DECL void  (*pcre_free)(void *);
 PCRE_EXP_DECL void *(*pcre_stack_malloc)(size_t);
 PCRE_EXP_DECL void  (*pcre_stack_free)(void *);
 PCRE_EXP_DECL int   (*pcre_callout)(pcre_callout_block *);
 PCRE_EXP_DECL int   (*pcre_stack_guard)(void);
 PCRE_EXP_DECL void *(*pcre16_malloc)(size_t);
 PCRE_EXP_DECL void  (*pcre16_free)(void *);
 PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
 PCRE_EXP_DECL void  (*pcre16_stack_free)(void *);
 PCRE_EXP_DECL int   (*pcre16_callout)(pcre16_callout_block *);
 PCRE_EXP_DECL int   (*pcre16_stack_guard)(void);
 PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
 PCRE_EXP_DECL void  (*pcre32_free)(void *);
 PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
 PCRE_EXP_DECL void  (*pcre32_stack_free)(void *);
 PCRE_EXP_DECL int   (*pcre32_callout)(pcre32_callout_block *);
 PCRE_EXP_DECL int   (*pcre32_stack_guard)(void);
 #else   /* VPCOMPAT */
 PCRE_EXP_DECL void *pcre_malloc(size_t);
 PCRE_EXP_DECL void  pcre_free(void *);
 PCRE_EXP_DECL void *pcre_stack_malloc(size_t);
 PCRE_EXP_DECL void  pcre_stack_free(void *);
 PCRE_EXP_DECL int   pcre_callout(pcre_callout_block *);
 PCRE_EXP_DECL int   pcre_stack_guard(void);
 PCRE_EXP_DECL void *pcre16_malloc(size_t);
 PCRE_EXP_DECL void  pcre16_free(void *);
 PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
 PCRE_EXP_DECL void  pcre16_stack_free(void *);
 PCRE_EXP_DECL int   pcre16_callout(pcre16_callout_block *);
 PCRE_EXP_DECL int   pcre16_stack_guard(void);
 PCRE_EXP_DECL void *pcre32_malloc(size_t);
 PCRE_EXP_DECL void  pcre32_free(void *);
 PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
 PCRE_EXP_DECL void  pcre32_stack_free(void *);
 PCRE_EXP_DECL int   pcre32_callout(pcre32_callout_block *);
 PCRE_EXP_DECL int   pcre32_stack_guard(void);
 #endif  /* VPCOMPAT */
 /* User defined callback which provides a stack just before the match starts. */
--- a/ext/pcre/pcrelib/pcre_compile.c
+++ b/ext/pcre/pcrelib/pcre_compile.c
@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
-           Copyright (c) 1997-2013 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -47,8 +47,8 @@ supporting internal functions that are not used by other modules. */
 #endif
 #define NLBLOCK cd             /* Block containing newline information */
-#define PSSTART start_pattern  /* Field containing processed string start */
+#define PSSTART start_pattern  /* Field containing pattern start */
-#define PSEND   end_pattern    /* Field containing processed string end */
+#define PSEND   end_pattern    /* Field containing pattern end */
 #include "pcre_internal.h"
@ -547,6 +547,9 @@ static const char error_texts[] =
  "parentheses are too deeply nested\0"
  "invalid range in character class\0"
  "group name must start with a non-digit\0"
  /* 85 */
  "parentheses are too deeply nested (stack check)\0"
  "digits missing in \\x{} or \\o{}\0"
  ;
 /* Table to identify digits and hex digits. This is used when compiling
@ -1257,6 +1260,7 @@ else
    case CHAR_o:
    if (ptr[1] != CHAR_LEFT_CURLY_BRACKET) *errorcodeptr = ERR81; else
    if (ptr[2] == CHAR_RIGHT_CURLY_BRACKET) *errorcodeptr = ERR86; else
      {
      ptr += 2;
      c = 0;
@ -1326,6 +1330,11 @@ else
      if (ptr[1] == CHAR_LEFT_CURLY_BRACKET)
        {
        ptr += 2;
        if (*ptr == CHAR_RIGHT_CURLY_BRACKET)
          {
          *errorcodeptr = ERR86;
          break;
          }
        c = 0;
        overflow = FALSE;
        while (MAX_255(*ptr) && (digitab[*ptr] & ctype_xdigit) != 0)
@ -1581,30 +1590,30 @@ read_repeat_counts(const pcre_uchar *p, int *minp, int *maxp, int *errorcodeptr)
 int min = 0;
 int max = -1;
-/* Read the minimum value and do a paranoid check: a negative value indicates
-an integer overflow. */
-
-while (IS_DIGIT(*p)) min = min * 10 + (int)(*p++ - CHAR_0);
-if (min < 0 || min > 65535)
+while (IS_DIGIT(*p))
+  {
+  min = min * 10 + (int)(*p++ - CHAR_0);
+  if (min > 65535)
    {
    *errorcodeptr = ERR5;
    return p;
    }
-
+  }
 /* Read the maximum value if there is one, and again do a paranoid on its size.
 Also, max must not be less than min. */
 if (*p == CHAR_RIGHT_CURLY_BRACKET) max = min; else
  {
  if (*(++p) != CHAR_RIGHT_CURLY_BRACKET)
    {
    max = 0;
-    while(IS_DIGIT(*p)) max = max * 10 + (int)(*p++ - CHAR_0);
-    if (max < 0 || max > 65535)
+    while(IS_DIGIT(*p))
+      {
      max = max * 10 + (int)(*p++ - CHAR_0);
      if (max > 65535)
        {
        *errorcodeptr = ERR5;
        return p;
        }
      }
    if (max < min)
      {
      *errorcodeptr = ERR4;
@ -1613,9 +1622,6 @@ if (*p == CHAR_RIGHT_CURLY_BRACKET) max = min; else
    }
  }
 /* Fill in the required variables, and pass back the pointer to the terminating
 '}'. */
 *minp = min;
 *maxp = max;
 return p;
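
The change to read_repeat_counts() moves the 65535 limit test inside the digit loop, so the accumulator is checked after every digit instead of once at the end (where signed overflow could already have happened). A minimal standalone sketch of that pattern follows. The name `read_bounded_count` and the `MAX_REPEAT` macro are hypothetical and are not part of PCRE; only the limit value is taken from the hunk above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

#define MAX_REPEAT 65535  /* same limit as in the hunk above */

/* Hypothetical helper, not PCRE code. Reads decimal digits starting at *pp
   into *out, testing the limit after every digit so the accumulator never
   overflows. Returns 0 on success, -1 if the value exceeds MAX_REPEAT.
   On return, *pp points at the first unread character. */
int read_bounded_count(const char **pp, int *out)
{
  const char *p = *pp;
  int value = 0;

  /* While value <= MAX_REPEAT, appending one more digit always fits in int. */
  assert(MAX_REPEAT <= (INT_MAX - 9) / 10);

  while (isdigit((unsigned char)*p))
    {
    value = value * 10 + (*p++ - '0');
    if (value > MAX_REPEAT)
      {
      *pp = p;
      return -1;   /* stop at once; remaining digits are never accumulated */
      }
    }

  *pp = p;
  *out = value;
  return 0;
}
```

With this shape, a 20-digit input is rejected after its fifth digit, so no test for a negative result is needed.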
@ -2368,6 +2374,7 @@ for (code = first_significant_code(code + PRIV(OP_lengths)[*code], TRUE);
  if (c == OP_RECURSE)
    {
    const pcre_uchar *scode = cd->start_code + GET(code, 1);
    const pcre_uchar *endgroup = scode;
    BOOL empty_branch;
    /* Test for forward reference or uncompleted reference. This is disabled
@ -2382,20 +2389,16 @@ for (code = first_significant_code(code + PRIV(OP_lengths)[*code], TRUE);
      if (GET(scode, 1) == 0) return TRUE;    /* Unclosed */
      }
-    /* If we are scanning a completed pattern, there are no forward references
-    and all groups are complete. We need to detect whether this is a recursive
-    call, as otherwise there will be an infinite loop. If it is a recursion,
-    just skip over it. Simple recursions are easily detected. For mutual
-    recursions we keep a chain on the stack. */
+    /* If the reference is to a completed group, we need to detect whether this
+    is a recursive call, as otherwise there will be an infinite loop. If it is
+    a recursion, just skip over it. Simple recursions are easily detected. For
+    mutual recursions we keep a chain on the stack. */
    else
      {
      recurse_check *r = recurses;
      const pcre_uchar *endgroup = scode;
    do endgroup += GET(endgroup, 1); while (*endgroup == OP_ALT);
    if (code >= scode && code <= endgroup) continue;  /* Simple recursion */
-
+    else
      {
      recurse_check *r = recurses;
      for (r = recurses; r != NULL; r = r->prev)
        if (r->group == scode) break;
      if (r != NULL) continue;   /* Mutual recursion */
@ -3036,7 +3039,7 @@ switch(c)
    end += 1 + 2 * IMM2_SIZE;
    break;
    }
-  list[2] = end - code;
+  list[2] = (pcre_uint32)(end - code);
  return end;
  }
 return NULL;    /* Opcode not accepted */
@ -3070,10 +3073,14 @@ const pcre_uint32 *chr_ptr;
 const pcre_uint32 *ochr_ptr;
 const pcre_uint32 *list_ptr;
 const pcre_uchar *next_code;
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
 const pcre_uchar *xclass_flags;
 #endif
 const pcre_uint8 *class_bitset;
 const pcre_uint8 *set1, *set2, *set_end;
 pcre_uint32 chr;
 BOOL accepted, invert_bits;
 BOOL entered_a_group = FALSE;
 /* Note: the base_list[1] contains whether the current opcode has greedy
 (represented by a non-zero value) quantifier. This is a different from
@ -3127,8 +3134,10 @@ for(;;)
      case OP_ONCE:
      case OP_ONCE_NC:
      /* Atomic sub-patterns and assertions can always auto-possessify their
-      last iterator. */
+      last iterator. However, if the group was entered as a result of checking
-      return TRUE;
+      a previous iterator, this is not possible. */
      return !entered_a_group;
      }
    code += PRIV(OP_lengths)[c];
@ -3147,6 +3156,8 @@ for(;;)
      code = next_code + 1 + LINK_SIZE;
      next_code += GET(next_code, 1);
      }
    entered_a_group = TRUE;
    continue;
    case OP_BRAZERO:
@ -3166,6 +3177,9 @@ for(;;)
    code += PRIV(OP_lengths)[c];
    continue;
    default:
    break;
    }
  /* Check for a supported opcode, and load its properties. */
@ -3220,6 +3234,21 @@ for(;;)
        ((list_ptr == list ? code : base_end) - list_ptr[2]);
      break;
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
      case OP_XCLASS:
      xclass_flags = (list_ptr == list ? code : base_end) - list_ptr[2] + LINK_SIZE;
      if ((*xclass_flags & XCL_HASPROP) != 0) return FALSE;
      if ((*xclass_flags & XCL_MAP) == 0)
        {
        /* No bits are set for characters < 256. */
        if (list[1] == 0) return TRUE;
        /* Might be an empty repeat. */
        continue;
        }
      set2 = (pcre_uint8 *)(xclass_flags + 1);
      break;
 #endif
      case OP_NOT_DIGIT:
      invert_bits = TRUE;
      /* Fall through */
@ -3389,8 +3418,7 @@ for(;;)
           rightop >= FIRST_AUTOTAB_OP && rightop <= LAST_AUTOTAB_RIGHT_OP &&
           autoposstab[leftop - FIRST_AUTOTAB_OP][rightop - FIRST_AUTOTAB_OP];
-    if (!accepted)
+    if (!accepted) return FALSE;
      return FALSE;
    if (list[1] == 0) return TRUE;
    /* Might be an empty repeat. */
@ -3548,7 +3576,9 @@ for(;;)
  if (list[1] == 0) return TRUE;
  }
-return FALSE;
+/* Control never reaches here. There used to be a fail-safe return FALSE; here,
 but some compilers complain about an unreachable statement. */
 }
@ -4059,12 +4089,16 @@ for (c = *cptr; c <= d; c++)
 if (c > d) return -1;  /* Reached end of range */
 /* Found a character that has a single other case. Search for the end of the
 range, which is either the end of the input range, or a character that has zero
 or more than one other cases. */
 *ocptr = othercase;
 next = othercase + 1;
 for (++c; c <= d; c++)
  {
-  if (UCD_OTHERCASE(c) != next) break;
+  if ((co = UCD_CASESET(c)) != 0 || UCD_OTHERCASE(c) != next) break;
  next++;
  }
@ -4102,6 +4136,7 @@ add_to_class(pcre_uint8 *classbits, pcre_uchar **uchardptr, int options,
  compile_data *cd, pcre_uint32 start, pcre_uint32 end)
 {
 pcre_uint32 c;
 pcre_uint32 classbits_end = (end <= 0xff ? end : 0xff);
 int n8 = 0;
 /* If caseless matching is required, scan the range and process alternate
@ -4145,7 +4180,7 @@ if ((options & PCRE_CASELESS) != 0)
  /* Not UTF-mode, or no UCP */
-  for (c = start; c <= end && c < 256; c++)
+  for (c = start; c <= classbits_end; c++)
    {
    SETBIT(classbits, cd->fcc[c]);
    n8++;
@ -4170,22 +4205,21 @@ in all cases. */
 #endif /* COMPILE_PCRE[8|16] */
-/* If all characters are less than 256, use the bit map. Otherwise use extra
+/* Use the bitmap for characters < 256. Otherwise use extra data.*/
 data. */
-if (end < 0x100)
+for (c = start; c <= classbits_end; c++)
  {
-  for (c = start; c <= end; c++)
+  /* Regardless of start, c will always be <= 255. */
    {
    n8++;
  SETBIT(classbits, c);
-    }
+  n8++;
  }
-else
+#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
 if (start <= 0xff) start = 0xff + 1;
 if (end >= start)
  {
  pcre_uchar *uchardata = *uchardptr;
 #ifdef SUPPORT_UTF
  if ((options & PCRE_UTF8) != 0)  /* All UTFs use the same flag bit */
    {
@ -4225,6 +4259,7 @@ else
  *uchardptr = uchardata;   /* Update extra data pointer */
  }
 #endif /* SUPPORT_UTF || !COMPILE_PCRE8 */
 return n8;    /* Number of 8-bit characters */
 }
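
The reworked add_to_class() no longer chooses between "bitmap" and "extra data" for a whole range. It always sets bitmap bits for the part of the range below 256 (bounded by `classbits_end`) and then hands any remainder above 255 to the extra data. A simplified sketch of that split is shown below. The function `split_range` and its output parameters are hypothetical, and the real code writes XCLASS data rather than returning the high sub-range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not PCRE code. Sets bits for the part of
   [start, end] that lies below 256 in a 32-byte bitmap, and reports the part
   above 255 (if any) through *hi_start and *hi_end. Returns the number of
   bitmap bits that were set, like the n8 counter in the hunk above. */
int split_range(uint8_t bitmap[32], uint32_t start, uint32_t end,
                uint32_t *hi_start, uint32_t *hi_end, int *has_hi)
{
  uint32_t c;
  uint32_t bitmap_end = (end <= 0xff) ? end : 0xff;
  int n8 = 0;

  assert(start <= end);

  /* If start is above 255 the loop body never runs. */
  for (c = start; c <= bitmap_end; c++)
    {
    bitmap[c / 8] |= (uint8_t)(1u << (c & 7));
    n8++;
    }

  /* Whatever remains above 255 would go to the extra (XCLASS) data. */
  if (start <= 0xff) start = 0xff + 1;
  *has_hi = (end >= start);
  if (*has_hi)
    {
    *hi_start = start;
    *hi_end = end;
    }
  return n8;
}
```

For the range 0xf0 to 0x120 this sets 16 bitmap bits and reports 0x100 to 0x120 as the high part, which is the case the old `if (end < 0x100) ... else` structure handled entirely through extra data.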
@ -4446,6 +4481,9 @@ for (;; ptr++)
  BOOL reset_bracount;
  int class_has_8bitchar;
  int class_one_char;
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
  BOOL xclass_has_prop;
 #endif
  int newoptions;
  int recno;
  int refsign;
@ -4653,7 +4691,8 @@ for (;; ptr++)
    previous = NULL;
    if ((options & PCRE_MULTILINE) != 0)
      {
-      if (firstcharflags == REQ_UNSET) firstcharflags = REQ_NONE;
+      if (firstcharflags == REQ_UNSET)
        zerofirstcharflags = firstcharflags = REQ_NONE;
      *code++ = OP_CIRCM;
      }
    else *code++ = OP_CIRC;
@ -4780,13 +4819,26 @@ for (;; ptr++)
    should_flip_negation = FALSE;
    /* Extended class (xclass) will be used when characters > 255
    might match. */
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
    xclass = FALSE;
    class_uchardata = code + LINK_SIZE + 2;   /* For XCLASS items */
    class_uchardata_base = class_uchardata;   /* Save the start */
 #endif
    /* For optimization purposes, we track some properties of the class:
    class_has_8bitchar will be non-zero if the class contains at least one <
    256 character; class_one_char will be 1 if the class contains just one
-    character. */
+    character; xclass_has_prop will be TRUE if unicode property checks
    are present in the class. */
    class_has_8bitchar = 0;
    class_one_char = 0;
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
    xclass_has_prop = FALSE;
 #endif
    /* Initialize the 32-char bit map to all zeros. We build the map in a
    temporary bit of memory, in case the class contains fewer than two
@ -4795,12 +4847,6 @@ for (;; ptr++)
    memset(classbits, 0, 32 * sizeof(pcre_uint8));
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
    xclass = FALSE;
    class_uchardata = code + LINK_SIZE + 2;   /* For XCLASS items */
    class_uchardata_base = class_uchardata;   /* Save the start */
 #endif
    /* Process characters until ] is reached. By writing this as a "do" it
    means that an initial ] is taken as a data character. At the start of the
    loop, c contains the first byte of the character. */
@ -4826,7 +4872,7 @@ for (;; ptr++)
      if (lengthptr != NULL && class_uchardata > class_uchardata_base)
        {
        xclass = TRUE;
-        *lengthptr += class_uchardata - class_uchardata_base;
+        *lengthptr += (int)(class_uchardata - class_uchardata_base);
        class_uchardata = class_uchardata_base;
        }
 #endif
@ -4924,6 +4970,7 @@ for (;; ptr++)
            *class_uchardata++ = local_negate? XCL_NOTPROP : XCL_PROP;
            *class_uchardata++ = ptype;
            *class_uchardata++ = 0;
            xclass_has_prop = TRUE;
            ptr = tempptr + 1;
            continue;
@ -5106,6 +5153,7 @@ for (;; ptr++)
                XCL_PROP : XCL_NOTPROP;
              *class_uchardata++ = ptype;
              *class_uchardata++ = pdata;
              xclass_has_prop = TRUE;
              class_has_8bitchar--;                /* Undo! */
              continue;
              }
@ -5274,7 +5322,7 @@ for (;; ptr++)
      whatever repeat count may follow. In the case of reqchar, save the
      previous value for reinstating. */
-      if (class_one_char == 1 && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
+      if (!inescq && class_one_char == 1 && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
        {
        ptr++;
        zeroreqchar = reqchar;
@ -5400,6 +5448,7 @@ for (;; ptr++)
      *code++ = OP_XCLASS;
      code += LINK_SIZE;
      *code = negate_class? XCL_NOT:0;
      if (xclass_has_prop) *code |= XCL_HASPROP;
      /* If the map is required, move up the extra data to make room for it;
      otherwise just move the code pointer to the end of the extra data. */
@ -5409,6 +5458,8 @@ for (;; ptr++)
        *code++ |= XCL_MAP;
        memmove(code + (32 / sizeof(pcre_uchar)), code,
          IN_UCHARS(class_uchardata - code));
        if (negate_class && !xclass_has_prop)
          for (c = 0; c < 32; c++) classbits[c] = ~classbits[c];
        memcpy(code, classbits, 32);
        code = class_uchardata + (32 / sizeof(pcre_uchar));
        }
@ -5966,8 +6017,8 @@ for (;; ptr++)
              while (cd->hwm > cd->start_workspace + cd->workspace_size -
                     WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
                {
-                int save_offset = save_hwm - cd->start_workspace;
+                size_t save_offset = save_hwm - cd->start_workspace;
-                int this_offset = this_hwm - cd->start_workspace;
+                size_t this_offset = this_hwm - cd->start_workspace;
                *errorcodeptr = expand_workspace(cd);
                if (*errorcodeptr != 0) goto FAILED;
                save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
@ -6048,8 +6099,8 @@ for (;; ptr++)
          while (cd->hwm > cd->start_workspace + cd->workspace_size -
                 WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
            {
-            int save_offset = save_hwm - cd->start_workspace;
+            size_t save_offset = save_hwm - cd->start_workspace;
-            int this_offset = this_hwm - cd->start_workspace;
+            size_t this_offset = this_hwm - cd->start_workspace;
            *errorcodeptr = expand_workspace(cd);
            if (*errorcodeptr != 0) goto FAILED;
            save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
@ -6577,7 +6628,10 @@ for (;; ptr++)
        code[1+LINK_SIZE] = OP_CREF;
        skipbytes = 1+IMM2_SIZE;
-        refsign = -1;
+        refsign = -1;     /* => not a number */
        namelen = -1;     /* => not a name; must set to avoid warning */
        name = NULL;      /* Always set to avoid warning */
        recno = 0;        /* Always set to avoid warning */
        /* Check for a test for recursion in a named group. */
@ -6614,7 +6668,6 @@ for (;; ptr++)
        if (refsign >= 0)
          {
          recno = 0;
          while (IS_DIGIT(*ptr))
            {
            recno = recno * 10 + (int)(*ptr - CHAR_0);
@ -6645,7 +6698,8 @@ for (;; ptr++)
            ptr++;
            }
          namelen = (int)(ptr - name);
-          if (lengthptr != NULL) *lengthptr += IMM2_SIZE;
+          if (lengthptr != NULL && (options & PCRE_DUPNAMES) != 0)
            *lengthptr += IMM2_SIZE;
          }
        /* Check the terminator */
@ -6706,9 +6760,11 @@ for (;; ptr++)
          for (; i < cd->names_found; i++)
            {
            slot += cd->name_entry_size;
-            if (STRNCMP_UC_UC(name, slot+IMM2_SIZE, namelen) != 0) break;
+            if (STRNCMP_UC_UC(name, slot+IMM2_SIZE, namelen) != 0 ||
              (slot+IMM2_SIZE)[namelen] != 0) break;
            count++;
            }
          if (count > 1)
            {
            PUT2(code, 2+LINK_SIZE, offset);
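
The added `(slot+IMM2_SIZE)[namelen] != 0` test matters because STRNCMP_UC_UC compares only `namelen` characters, so a name such as `ab` would otherwise also match a table entry `abc`. A small sketch of the same idea using plain `char` strings is given below. `name_matches_entry` is a hypothetical helper, and it assumes the name itself contains no embedded NUL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not PCRE code. 'name' is namelen characters long and
   is NOT NUL-terminated (it points into the pattern); 'entry' is a
   NUL-terminated name-table entry. Returns 1 only if the two are the same
   string, not merely if 'name' is a prefix of 'entry'. */
int name_matches_entry(const char *name, size_t namelen, const char *entry)
{
  assert(name != NULL && entry != NULL);
  /* The second test is only evaluated when the first namelen characters
     agree, so entry[namelen] is at worst the terminating NUL. */
  return strncmp(name, entry, namelen) == 0 && entry[namelen] == '\0';
}
```

Called with the name `ab` (length 2) against the entry `abc`, the prefix comparison succeeds but the terminator check fails, so the entry is not counted as a duplicate.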
@ -7057,6 +7113,12 @@ for (;; ptr++)
          /* Count named back references. */
          if (!is_recurse) cd->namedrefcount++;
          /* If duplicate names are permitted, we have to allow for a named
          reference to a duplicated name (this cannot be determined until the
          second pass). This needs an extra 16-bit data item. */
          if ((options & PCRE_DUPNAMES) != 0) *lengthptr += IMM2_SIZE;
          }
        /* In the real compile, search the name table. We check the name
@ -7103,6 +7165,8 @@ for (;; ptr++)
          for (i++; i < cd->names_found; i++)
            {
            if (STRCMP_UC_UC(slot + IMM2_SIZE, cslot + IMM2_SIZE) != 0) break;
            count++;
            cslot += cd->name_entry_size;
            }
@ -7991,6 +8055,16 @@ unsigned int orig_bracount;
 unsigned int max_bracount;
 branch_chain bc;
 /* If set, call the external function that checks for stack availability. */
 if (PUBL(stack_guard) != NULL && PUBL(stack_guard)())
  {
  *errorcodeptr= ERR85;
  return FALSE;
  }
 /* Miscellaneous initialization */
 bc.outer = bcptr;
 bc.current_branch = code;
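
The new check at the top of compile_regex() calls an optional, application-supplied function on every (recursive) entry and gives up with ERR85 if it returns non-zero. The sketch below imitates that control flow with stand-in names: `stack_guard`, `compile_group`, `counting_guard` and `guard_budget` are all hypothetical and are not the PCRE API. The example guard simply counts calls, whereas a real one would presumably inspect the remaining stack.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_STACK_CHECK 85   /* mirrors ERR85 in the hunk above */

/* Hypothetical stand-in for the pcre_stack_guard pointer: NULL means "no
   check"; otherwise a non-zero return means "not enough stack, stop". */
int (*stack_guard)(void) = NULL;

/* Hypothetical recursive routine. Consumes one parenthesised group level from
   *pp, recursing for each '('. Returns 1 on success, or 0 with *errorcode
   set if the guard refuses. */
int compile_group(const char **pp, int *errorcode)
{
  assert(pp != NULL && *pp != NULL && errorcode != NULL);

  /* Same shape as the check added to compile_regex(). */
  if (stack_guard != NULL && stack_guard())
    {
    *errorcode = ERR_STACK_CHECK;
    return 0;
    }

  while (**pp != '\0')
    {
    char c = *(*pp)++;
    if (c == '(')
      {
      if (!compile_group(pp, errorcode)) return 0;
      }
    else if (c == ')') return 1;
    }
  return 1;
}

/* Example guard: allow guard_budget calls, then report "stack is low". */
int guard_budget = 0;
int counting_guard(void) { return guard_budget-- <= 0; }
```

With `stack_guard` left as NULL the pattern `((((x))))` is accepted. With `counting_guard` installed and a budget of 3, the fourth entry is refused and the error code 85 propagates back through every level.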
@ -8190,12 +8264,16 @@ for (;;)
    /* If it was a capturing subpattern, check to see if it contained any
    recursive back references. If so, we must wrap it in atomic brackets.
-    In any event, remove the block from the chain. */
+    Because we are moving code along, we must ensure that any pending recursive
    references are updated. In any event, remove the block from the chain. */
    if (capnumber > 0)
      {
      if (cd->open_caps->flag)
        {
        *code = OP_END;
        adjust_recurse(start_bracket, 1 + LINK_SIZE,
          (options & PCRE_UTF8) != 0, cd, cd->hwm);
        memmove(start_bracket + 1 + LINK_SIZE, start_bracket,
          IN_UCHARS(code - start_bracket));
        *start_bracket = OP_ONCE;
@ -9200,11 +9278,18 @@ subpattern. */
 if (errorcode == 0 && re->top_backref > re->top_bracket) errorcode = ERR15;
-/* Unless disabled, check whether single character iterators can be
-auto-possessified. The function overwrites the appropriate opcode values. */
+/* Unless disabled, check whether any single character iterators can be
+auto-possessified. The function overwrites the appropriate opcode values, so
 the type of the pointer must be cast. NOTE: the intermediate variable "temp" is
 used in this code because at least one compiler gives a warning about loss of
 "const" attribute if the cast (pcre_uchar *)codestart is used directly in the
 function call. */
 if ((options & PCRE_NO_AUTO_POSSESS) == 0)
-  auto_possessify((pcre_uchar *)codestart, utf, cd);
+  {
  pcre_uchar *temp = (pcre_uchar *)codestart;
  auto_possessify(temp, utf, cd);
  }
 /* If there were any lookbehind assertions that contained OP_RECURSE
 (recursions or subroutine calls), a flag is set for them to be checked here,
--- a/ext/pcre/pcrelib/pcre_exec.c
+++ b/ext/pcre/pcrelib/pcre_exec.c
@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
-           Copyright (c) 1997-2013 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -134,7 +134,7 @@ pcre_uint32 c;
 BOOL utf = md->utf;
 if (is_subject && length > md->end_subject - p) length = md->end_subject - p;
 while (length-- > 0)
-  if (isprint(c = RAWUCHARINCTEST(p))) printf("%c", (char)c); else printf("\\x{%02x}", c);
+  if (isprint(c = UCHAR21INCTEST(p))) printf("%c", (char)c); else printf("\\x{%02x}", c);
 }
 #endif
@ -237,8 +237,8 @@ if (caseless)
      {
      pcre_uint32 cc, cp;
      if (eptr >= md->end_subject) return -2;   /* Partial match */
-      cc = RAWUCHARTEST(eptr);
+      cc = UCHAR21TEST(eptr);
-      cp = RAWUCHARTEST(p);
+      cp = UCHAR21TEST(p);
      if (TABLE_GET(cp, md->lcc, cp) != TABLE_GET(cc, md->lcc, cc)) return -1;
      p++;
      eptr++;
@ -254,7 +254,7 @@ else
  while (length-- > 0)
    {
    if (eptr >= md->end_subject) return -2;   /* Partial match */
-    if (RAWUCHARINCTEST(p) != RAWUCHARINCTEST(eptr)) return -1;
+    if (UCHAR21INCTEST(p) != UCHAR21INCTEST(eptr)) return -1;
    }
  }
@ -1167,11 +1167,16 @@ for (;;)
        if (rrc == MATCH_KETRPOS)
          {
          offset_top = md->end_offset_top;
          eptr = md->end_match_ptr;
          ecode = md->start_code + code_offset;
          save_capture_last = md->capture_last;
          matched_once = TRUE;
          mstart = md->start_match_ptr;    /* In case \K changed it */
          if (eptr == md->end_match_ptr)   /* Matched an empty string */
            {
            do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
            break;
            }
          eptr = md->end_match_ptr;
          continue;
          }
@ -1241,10 +1246,15 @@ for (;;)
      if (rrc == MATCH_KETRPOS)
        {
        offset_top = md->end_offset_top;
-        eptr = md->end_match_ptr;
        ecode = md->start_code + code_offset;
        matched_once = TRUE;
        mstart = md->start_match_ptr;   /* In case \K reset it */
+        if (eptr == md->end_match_ptr)  /* Matched an empty string */
+          {
+          do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
+          break;
+          }
+        eptr = md->end_match_ptr;
        continue;
        }
@ -1979,6 +1989,19 @@ for (;;)
        }
      }
+    /* OP_KETRPOS is a possessive repeating ket. Remember the current position,
+    and return the MATCH_KETRPOS. This makes it possible to do the repeats one
+    at a time from the outer level, thus saving stack. This must precede the
+    empty string test - in this case that test is done at the outer level. */
+    if (*ecode == OP_KETRPOS)
+      {
+      md->start_match_ptr = mstart;    /* In case \K reset it */
+      md->end_match_ptr = eptr;
+      md->end_offset_top = offset_top;
+      RRETURN(MATCH_KETRPOS);
+      }
    /* For an ordinary non-repeating ket, just continue at this level. This
    also happens for a repeating ket if no characters were matched in the
    group. This is the forcible breaking of infinite loops as implemented in
@ -2001,18 +2024,6 @@ for (;;)
      break;
      }
-    /* OP_KETRPOS is a possessive repeating ket. Remember the current position,
-    and return the MATCH_KETRPOS. This makes it possible to do the repeats one
-    at a time from the outer level, thus saving stack. */
-    if (*ecode == OP_KETRPOS)
-      {
-      md->start_match_ptr = mstart;    /* In case \K reset it */
-      md->end_match_ptr = eptr;
-      md->end_offset_top = offset_top;
-      RRETURN(MATCH_KETRPOS);
-      }
    /* The normal repeating kets try the rest of the pattern or restart from
    the preceding bracket, in the appropriate order. In the second case, we can
    use tail recursion to avoid using another stack frame, unless we have an
@ -2103,7 +2114,7 @@ for (;;)
            eptr + 1 >= md->end_subject &&
            NLBLOCK->nltype == NLTYPE_FIXED &&
            NLBLOCK->nllen == 2 &&
-            RAWUCHARTEST(eptr) == NLBLOCK->nl[0])
+            UCHAR21TEST(eptr) == NLBLOCK->nl[0])
          {
          md->hitend = TRUE;
          if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
@ -2147,7 +2158,7 @@ for (;;)
          eptr + 1 >= md->end_subject &&
          NLBLOCK->nltype == NLTYPE_FIXED &&
          NLBLOCK->nllen == 2 &&
-          RAWUCHARTEST(eptr) == NLBLOCK->nl[0])
+          UCHAR21TEST(eptr) == NLBLOCK->nl[0])
        {
        md->hitend = TRUE;
        if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
@ -2290,7 +2301,7 @@ for (;;)
        eptr + 1 >= md->end_subject &&
        NLBLOCK->nltype == NLTYPE_FIXED &&
        NLBLOCK->nllen == 2 &&
-        RAWUCHARTEST(eptr) == NLBLOCK->nl[0])
+        UCHAR21TEST(eptr) == NLBLOCK->nl[0])
      {
      md->hitend = TRUE;
      if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
@ -2444,7 +2455,7 @@ for (;;)
        {
        SCHECK_PARTIAL();
        }
-      else if (RAWUCHARTEST(eptr) == CHAR_LF) eptr++;
+      else if (UCHAR21TEST(eptr) == CHAR_LF) eptr++;
      break;
      case CHAR_LF:
@ -2691,16 +2702,22 @@ for (;;)
      pcre_uchar *slot = md->name_table + GET2(ecode, 1) * md->name_entry_size;
      ecode += 1 + 2*IMM2_SIZE;
+      /* Setting the default length first and initializing 'offset' avoids
+      compiler warnings in the REF_REPEAT code. */
+      length = (md->jscript_compat)? 0 : -1;
+      offset = 0;
      while (count-- > 0)
        {
        offset = GET2(slot, 0) << 1;
-        if (offset < offset_top && md->offset_vector[offset] >= 0) break;
+        if (offset < offset_top && md->offset_vector[offset] >= 0)
+          {
+          length = md->offset_vector[offset+1] - md->offset_vector[offset];
+          break;
+          }
        slot += md->name_entry_size;
        }
-      if (count < 0)
-        length = (md->jscript_compat)? 0 : -1;
-      else
-        length = md->offset_vector[offset+1] - md->offset_vector[offset];
      }
    goto REF_REPEAT;
@ -3212,7 +3229,7 @@ for (;;)
        CHECK_PARTIAL();             /* Not SCHECK_PARTIAL() */
        RRETURN(MATCH_NOMATCH);
        }
-      while (length-- > 0) if (*ecode++ != RAWUCHARINC(eptr)) RRETURN(MATCH_NOMATCH);
+      while (length-- > 0) if (*ecode++ != UCHAR21INC(eptr)) RRETURN(MATCH_NOMATCH);
      }
    else
 #endif
@ -3252,7 +3269,7 @@ for (;;)
      if (fc < 128)
        {
-        pcre_uint32 cc = RAWUCHAR(eptr);
+        pcre_uint32 cc = UCHAR21(eptr);
        if (md->lcc[fc] != TABLE_GET(cc, md->lcc, cc)) RRETURN(MATCH_NOMATCH);
        ecode++;
        eptr++;
@ -3521,7 +3538,7 @@ for (;;)
          SCHECK_PARTIAL();
          RRETURN(MATCH_NOMATCH);
          }
-        cc = RAWUCHARTEST(eptr);
+        cc = UCHAR21TEST(eptr);
        if (fc != cc && foc != cc) RRETURN(MATCH_NOMATCH);
        eptr++;
        }
@ -3539,7 +3556,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHARTEST(eptr);
+          cc = UCHAR21TEST(eptr);
          if (fc != cc && foc != cc) RRETURN(MATCH_NOMATCH);
          eptr++;
          }
@ -3556,7 +3573,7 @@ for (;;)
            SCHECK_PARTIAL();
            break;
            }
-          cc = RAWUCHARTEST(eptr);
+          cc = UCHAR21TEST(eptr);
          if (fc != cc && foc != cc) break;
          eptr++;
          }
@ -3583,7 +3600,7 @@ for (;;)
          SCHECK_PARTIAL();
          RRETURN(MATCH_NOMATCH);
          }
-        if (fc != RAWUCHARINCTEST(eptr)) RRETURN(MATCH_NOMATCH);
+        if (fc != UCHAR21INCTEST(eptr)) RRETURN(MATCH_NOMATCH);
        }
      if (min == max) continue;
@ -3600,7 +3617,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          if (fc != RAWUCHARINCTEST(eptr)) RRETURN(MATCH_NOMATCH);
+          if (fc != UCHAR21INCTEST(eptr)) RRETURN(MATCH_NOMATCH);
          }
        /* Control never gets here */
        }
@ -3614,7 +3631,7 @@ for (;;)
            SCHECK_PARTIAL();
            break;
            }
-          if (fc != RAWUCHARTEST(eptr)) break;
+          if (fc != UCHAR21TEST(eptr)) break;
          eptr++;
          }
        if (possessive) continue;    /* No backtracking */
@ -4369,7 +4386,7 @@ for (;;)
              eptr + 1 >= md->end_subject &&
              NLBLOCK->nltype == NLTYPE_FIXED &&
              NLBLOCK->nllen == 2 &&
-              RAWUCHAR(eptr) == NLBLOCK->nl[0])
+              UCHAR21(eptr) == NLBLOCK->nl[0])
            {
            md->hitend = TRUE;
            if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
@ -4411,7 +4428,7 @@ for (;;)
            default: RRETURN(MATCH_NOMATCH);
            case CHAR_CR:
-            if (eptr < md->end_subject && RAWUCHAR(eptr) == CHAR_LF) eptr++;
+            if (eptr < md->end_subject && UCHAR21(eptr) == CHAR_LF) eptr++;
            break;
            case CHAR_LF:
@ -4521,7 +4538,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHAR(eptr);
+          cc = UCHAR21(eptr);
          if (cc >= 128 || (md->ctypes[cc] & ctype_digit) == 0)
            RRETURN(MATCH_NOMATCH);
          eptr++;
@ -4538,7 +4555,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHAR(eptr);
+          cc = UCHAR21(eptr);
          if (cc < 128 && (md->ctypes[cc] & ctype_space) != 0)
            RRETURN(MATCH_NOMATCH);
          eptr++;
@ -4555,7 +4572,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHAR(eptr);
+          cc = UCHAR21(eptr);
          if (cc >= 128 || (md->ctypes[cc] & ctype_space) == 0)
            RRETURN(MATCH_NOMATCH);
          eptr++;
@ -4572,7 +4589,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHAR(eptr);
+          cc = UCHAR21(eptr);
          if (cc < 128 && (md->ctypes[cc] & ctype_word) != 0)
            RRETURN(MATCH_NOMATCH);
          eptr++;
@ -4589,7 +4606,7 @@ for (;;)
            SCHECK_PARTIAL();
            RRETURN(MATCH_NOMATCH);
            }
-          cc = RAWUCHAR(eptr);
+          cc = UCHAR21(eptr);
          if (cc >= 128 || (md->ctypes[cc] & ctype_word) == 0)
            RRETURN(MATCH_NOMATCH);
          eptr++;
@ -5150,7 +5167,7 @@ for (;;)
              {
              default: RRETURN(MATCH_NOMATCH);
              case CHAR_CR:
-              if (eptr < md->end_subject && RAWUCHAR(eptr) == CHAR_LF) eptr++;
+              if (eptr < md->end_subject && UCHAR21(eptr) == CHAR_LF) eptr++;
              break;
              case CHAR_LF:
@ -5675,8 +5692,6 @@ for (;;)
        switch(ctype)
          {
          case OP_ANY:
-          if (max < INT_MAX)
-            {
          for (i = min; i < max; i++)
            {
            if (eptr >= md->end_subject)
@ -5689,7 +5704,7 @@ for (;;)
                eptr + 1 >= md->end_subject &&
                NLBLOCK->nltype == NLTYPE_FIXED &&
                NLBLOCK->nllen == 2 &&
-                  RAWUCHAR(eptr) == NLBLOCK->nl[0])
+                UCHAR21(eptr) == NLBLOCK->nl[0])
              {
              md->hitend = TRUE;
              if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
@ -5697,33 +5712,6 @@ for (;;)
            eptr++;
            ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
            }
-            }
-          /* Handle unlimited UTF-8 repeat */
-          else
-            {
-            for (i = min; i < max; i++)
-              {
-              if (eptr >= md->end_subject)
-                {
-                SCHECK_PARTIAL();
-                break;
-                }
-              if (IS_NEWLINE(eptr)) break;
-              if (md->partial != 0 &&    /* Take care with CRLF partial */
-                  eptr + 1 >= md->end_subject &&
-                  NLBLOCK->nltype == NLTYPE_FIXED &&
-                  NLBLOCK->nllen == 2 &&
-                  RAWUCHAR(eptr) == NLBLOCK->nl[0])
-                {
-                md->hitend = TRUE;
-                if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
-                }
-              eptr++;
-              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
-              }
-            }
          break;
          case OP_ALLANY:
@ -5772,7 +5760,7 @@ for (;;)
            if (c == CHAR_CR)
              {
              if (++eptr >= md->end_subject) break;
-              if (RAWUCHAR(eptr) == CHAR_LF) eptr++;
+              if (UCHAR21(eptr) == CHAR_LF) eptr++;
              }
            else
              {
@ -5935,8 +5923,8 @@ for (;;)
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);
          eptr--;
          BACKCHAR(eptr);
-          if (ctype == OP_ANYNL && eptr > pp  && RAWUCHAR(eptr) == CHAR_NL &&
+          if (ctype == OP_ANYNL && eptr > pp  && UCHAR21(eptr) == CHAR_NL &&
-              RAWUCHAR(eptr - 1) == CHAR_CR) eptr--;
+              UCHAR21(eptr - 1) == CHAR_CR) eptr--;
          }
        }
      else
@ -6513,7 +6501,7 @@ tables = re->tables;
 if (extra_data != NULL)
  {
-  register unsigned int flags = extra_data->flags;
+  unsigned long int flags = extra_data->flags;
  if ((flags & PCRE_EXTRA_STUDY_DATA) != 0)
    study = (const pcre_study_data *)extra_data->study_data;
  if ((flags & PCRE_EXTRA_MATCH_LIMIT) != 0)
@ -6783,10 +6771,10 @@ for(;;)
      if (first_char != first_char2)
        while (start_match < end_subject &&
-          (smc = RAWUCHARTEST(start_match)) != first_char && smc != first_char2)
+          (smc = UCHAR21TEST(start_match)) != first_char && smc != first_char2)
          start_match++;
      else
-        while (start_match < end_subject && RAWUCHARTEST(start_match) != first_char)
+        while (start_match < end_subject && UCHAR21TEST(start_match) != first_char)
          start_match++;
      }
@ -6818,7 +6806,7 @@ for(;;)
        if (start_match[-1] == CHAR_CR &&
             (md->nltype == NLTYPE_ANY || md->nltype == NLTYPE_ANYCRLF) &&
             start_match < end_subject &&
-             RAWUCHARTEST(start_match) == CHAR_NL)
+             UCHAR21TEST(start_match) == CHAR_NL)
          start_match++;
        }
      }
@ -6829,22 +6817,12 @@ for(;;)
      {
      while (start_match < end_subject)
        {
-        register pcre_uint32 c = RAWUCHARTEST(start_match);
+        register pcre_uint32 c = UCHAR21TEST(start_match);
 #ifndef COMPILE_PCRE8
        if (c > 255) c = 255;
 #endif
-        if ((start_bits[c/8] & (1 << (c&7))) == 0)
+        if ((start_bits[c/8] & (1 << (c&7))) != 0) break;
-          {
        start_match++;
-#if defined SUPPORT_UTF && defined COMPILE_PCRE8
-          /* In non 8-bit mode, the iteration will stop for
-          characters > 255 at the beginning or not stop at all. */
-          if (utf)
-            ACROSSCHAR(start_match < end_subject, *start_match,
-              start_match++);
-#endif
-          }
-        else break;
        }
      }
    }   /* Starting optimizations */
@ -6897,7 +6875,7 @@ for(;;)
          {
          while (p < end_subject)
            {
-            register pcre_uint32 pp = RAWUCHARINCTEST(p);
+            register pcre_uint32 pp = UCHAR21INCTEST(p);
            if (pp == req_char || pp == req_char2) { p--; break; }
            }
          }
@ -6905,7 +6883,7 @@ for(;;)
          {
          while (p < end_subject)
            {
-            if (RAWUCHARINCTEST(p) == req_char) { p--; break; }
+            if (UCHAR21INCTEST(p) == req_char) { p--; break; }
            }
          }
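Note on the start_bits hunk in pcre_exec.c above: the rewritten loop advances start_match one code unit at a time until it reaches a unit whose bit is set in the start_bits map. A minimal standalone sketch of that kind of bitmap scan follows. It is an illustration only: sketch_set_bit and sketch_first_start are hypothetical names that do not exist in PCRE, and the sketch works on plain bytes with no UTF handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set the bit for byte value c in a 32-byte
   (256-bit) map, laid out the same way as the test in the hunk above. */
static void sketch_set_bit(unsigned char *map, unsigned int c)
{
  assert(map != NULL && c < 256);
  map[c / 8] |= (unsigned char)(1u << (c & 7));
}

/* Hypothetical helper: return the offset of the first byte of s whose
   bit is set in map, or len when no byte qualifies. */
static size_t sketch_first_start(const unsigned char *map,
                                 const unsigned char *s, size_t len)
{
  size_t i = 0;
  assert(map != NULL && s != NULL);
  while (i < len)
    {
    unsigned int c = s[i];
    if ((map[c / 8] & (1u << (c & 7))) != 0) break;  /* possible start */
    i++;
    }
  return i;
}
```

With a map that has only the bits for 'x' and 'y' set, scanning "abcxyz" would stop at offset 3, and scanning "abc" would run to the end and return the length.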
--- a/ext/pcre/pcrelib/pcre_globals.c
+++ b/ext/pcre/pcrelib/pcre_globals.c
@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
-           Copyright (c) 1997-2012 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -72,6 +72,7 @@ PCRE_EXP_DATA_DEFN void  (*PUBL(free))(void *) = LocalPcreFree;
 PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = LocalPcreMalloc;
 PCRE_EXP_DATA_DEFN void  (*PUBL(stack_free))(void *) = LocalPcreFree;
 PCRE_EXP_DATA_DEFN int   (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int   (*PUBL(stack_guard))(void) = NULL;
 #elif !defined VPCOMPAT
 PCRE_EXP_DATA_DEFN void *(*PUBL(malloc))(size_t) = malloc;
@ -79,6 +80,7 @@ PCRE_EXP_DATA_DEFN void  (*PUBL(free))(void *) = free;
 PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = malloc;
 PCRE_EXP_DATA_DEFN void  (*PUBL(stack_free))(void *) = free;
 PCRE_EXP_DATA_DEFN int   (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int   (*PUBL(stack_guard))(void) = NULL;
 #endif
 /* End of pcre_globals.c */
--- a/ext/pcre/pcrelib/pcre_internal.h
+++ b/ext/pcre/pcrelib/pcre_internal.h
@ -7,7 +7,7 @@
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
-           Copyright (c) 1997-2013 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -316,8 +316,8 @@ start/end of string field names are. */
       &(NLBLOCK->nllen), utf)) \
    : \
    ((p) <= NLBLOCK->PSEND - NLBLOCK->nllen && \
-     RAWUCHARTEST(p) == NLBLOCK->nl[0] && \
+     UCHAR21TEST(p) == NLBLOCK->nl[0] && \
-     (NLBLOCK->nllen == 1 || RAWUCHARTEST(p+1) == NLBLOCK->nl[1])       \
+     (NLBLOCK->nllen == 1 || UCHAR21TEST(p+1) == NLBLOCK->nl[1])       \
    ) \
  )
@ -330,8 +330,8 @@ start/end of string field names are. */
       &(NLBLOCK->nllen), utf)) \
    : \
    ((p) >= NLBLOCK->PSSTART + NLBLOCK->nllen && \
-     RAWUCHARTEST(p - NLBLOCK->nllen) == NLBLOCK->nl[0] &&              \
+     UCHAR21TEST(p - NLBLOCK->nllen) == NLBLOCK->nl[0] &&              \
-     (NLBLOCK->nllen == 1 || RAWUCHARTEST(p - NLBLOCK->nllen + 1) == NLBLOCK->nl[1]) \
+     (NLBLOCK->nllen == 1 || UCHAR21TEST(p - NLBLOCK->nllen + 1) == NLBLOCK->nl[1]) \
    ) \
  )
@ -582,12 +582,27 @@ changed in future to be a fixed number of bytes or to depend on LINK_SIZE. */
 #define MAX_MARK ((1u << 8) - 1)
 #endif
+/* There is a proposed future special "UTF-21" mode, in which only the lowest
+21 bits of a 32-bit character are interpreted as UTF, with the remaining 11
+high-order bits available to the application for other uses. In preparation for
+the future implementation of this mode, there are macros that load a data item
+and, if in this special mode, mask it to 21 bits. These macros all have names
+starting with UCHAR21. In all other modes, including the normal 32-bit
+library, the macros all have the same simple definitions. When the new mode is
+implemented, it is expected that these definitions will be varied appropriately
+using #ifdef when compiling the library that supports the special mode. */
+#define UCHAR21(eptr)        (*(eptr))
+#define UCHAR21TEST(eptr)    (*(eptr))
+#define UCHAR21INC(eptr)     (*(eptr)++)
+#define UCHAR21INCTEST(eptr) (*(eptr)++)
 /* When UTF encoding is being used, a character is no longer just a single
-byte. The macros for character handling generate simple sequences when used in
+byte in 8-bit mode or a single short in 16-bit mode. The macros for character
-character-mode, and more complicated ones for UTF characters. GETCHARLENTEST
+handling generate simple sequences when used in the basic mode, and more
-and other macros are not used when UTF is not supported, so they are not
+complicated ones for UTF characters. GETCHARLENTEST and other macros are not
-defined. To make sure they can never even appear when UTF support is omitted,
+used when UTF is not supported. To make sure they can never even appear when
-we don't even define them. */
+UTF support is omitted, we don't even define them. */
 #ifndef SUPPORT_UTF
@ -600,10 +615,6 @@ we don't even define them. */
 #define GETCHARINC(c, eptr) c = *eptr++;
 #define GETCHARINCTEST(c, eptr) c = *eptr++;
 #define GETCHARLEN(c, eptr, len) c = *eptr;
-#define RAWUCHAR(eptr) (*(eptr))
-#define RAWUCHARINC(eptr) (*(eptr)++)
-#define RAWUCHARTEST(eptr) (*(eptr))
-#define RAWUCHARINCTEST(eptr) (*(eptr)++)
 /* #define GETCHARLENTEST(c, eptr, len) */
 /* #define BACKCHAR(eptr) */
 /* #define FORWARDCHAR(eptr) */
@ -776,30 +787,6 @@ do not know if we are in UTF-8 mode. */
  c = *eptr; \
  if (utf && c >= 0xc0) GETUTF8LEN(c, eptr, len);
-/* Returns the next uchar, not advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHAR(eptr) \
-  (*(eptr))
-/* Returns the next uchar, advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHARINC(eptr) \
-  (*((eptr)++))
-/* Returns the next uchar, testing for UTF mode, and not advancing the
-pointer. */
-#define RAWUCHARTEST(eptr) \
-  (*(eptr))
-/* Returns the next uchar, testing for UTF mode, advancing the
-pointer. */
-#define RAWUCHARINCTEST(eptr) \
-  (*((eptr)++))
 /* If the pointer is not at the start of a character, move it back until
 it is. This is called only in UTF-8 mode - we don't put a test within the macro
 because almost all calls are already within a block of UTF-8 only code. */
@ -895,30 +882,6 @@ we do not know if we are in UTF-16 mode. */
  c = *eptr; \
  if (utf && (c & 0xfc00) == 0xd800) GETUTF16LEN(c, eptr, len);
-/* Returns the next uchar, not advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHAR(eptr) \
-  (*(eptr))
-/* Returns the next uchar, advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHARINC(eptr) \
-  (*((eptr)++))
-/* Returns the next uchar, testing for UTF mode, and not advancing the
-pointer. */
-#define RAWUCHARTEST(eptr) \
-  (*(eptr))
-/* Returns the next uchar, testing for UTF mode, advancing the
-pointer. */
-#define RAWUCHARINCTEST(eptr) \
-  (*((eptr)++))
 /* If the pointer is not at the start of a character, move it back until
 it is. This is called only in UTF-16 mode - we don't put a test within the
 macro because almost all calls are already within a block of UTF-16 only
@ -980,30 +943,6 @@ This is called when we do not know if we are in UTF-32 mode. */
 #define GETCHARLENTEST(c, eptr, len) \
  GETCHARTEST(c, eptr)
-/* Returns the next uchar, not advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHAR(eptr) \
-  (*(eptr))
-/* Returns the next uchar, advancing the pointer. This is called when
-we know we are in UTF mode. */
-#define RAWUCHARINC(eptr) \
-  (*((eptr)++))
-/* Returns the next uchar, testing for UTF mode, and not advancing the
-pointer. */
-#define RAWUCHARTEST(eptr) \
-  (*(eptr))
-/* Returns the next uchar, testing for UTF mode, advancing the
-pointer. */
-#define RAWUCHARINCTEST(eptr) \
-  (*((eptr)++))
 /* If the pointer is not at the start of a character, move it back until
 it is. This is called only in UTF-32 mode - we don't put a test within the
 macro because almost all calls are already within a block of UTF-32 only
@ -1876,6 +1815,7 @@ contain characters with values greater than 255. */
 #define XCL_NOT       0x01    /* Flag: this is a negative class */
 #define XCL_MAP       0x02    /* Flag: a 32-byte map is present */
+#define XCL_HASPROP   0x04    /* Flag: property checks are present. */
 #define XCL_END       0    /* Marks end of individual items */
 #define XCL_SINGLE    1    /* Single item (one multibyte char) follows */
@ -2341,7 +2281,7 @@ enum { ERR0,  ERR1,  ERR2,  ERR3,  ERR4,  ERR5,  ERR6,  ERR7,  ERR8,  ERR9,
       ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
       ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
       ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
-       ERR80, ERR81, ERR82, ERR83, ERR84, ERRCOUNT };
+       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT };
 /* JIT compiling modes. The function list is indexed by them. */
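Note on the UCHAR21 macros in pcre_internal.h above: in this diff they are all plain loads, and the accompanying comment says that a future "UTF-21" mode is expected to mask each loaded item to its lowest 21 bits. A rough sketch of what such a masked load might look like, assuming a 32-bit code unit, is given here. SKETCH_MASK21, SKETCH_LOAD21 and sketch_equal21 are hypothetical names chosen for illustration; this diff defines no masked variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask covering the lowest 21 bits of a 32-bit unit. */
#define SKETCH_MASK21 0x1fffffu

/* Hypothetical masked load; the real UCHAR21 macros in this diff are
   plain (*(eptr)) loads with no masking. */
#define SKETCH_LOAD21(p) (*(p) & SKETCH_MASK21)

/* Compare n code units while ignoring the 11 high-order bits of each.
   Returns 1 when the masked sequences are equal, 0 otherwise. */
static int sketch_equal21(const uint32_t *a, const uint32_t *b, size_t n)
{
  assert(a != NULL && b != NULL);
  while (n-- > 0)
    {
    if (SKETCH_LOAD21(a) != SKETCH_LOAD21(b)) return 0;
    a++;
    b++;
    }
  return 1;
}
```

Under this assumption two subjects that differ only in the application-owned high bits would compare as equal, which is the behaviour the comment describes for the proposed mode.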
--- a/ext/pcre/pcrelib/pcre_printint.src
+++ b/ext/pcre/pcrelib/pcre_printint.src
@ -1,572 +0,0 @@
 /*************************************************
 *      Perl-Compatible Regular Expressions       *
 *************************************************/
 /* PCRE is a library of functions to support regular expressions whose syntax
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
           Copyright (c) 1997-2010 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the University of Cambridge nor the names of its
      contributors may be used to endorse or promote products derived from
      this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE.
 -----------------------------------------------------------------------------
 */
 /* This module contains a PCRE private debugging function for printing out the
 internal form of a compiled regular expression, along with some supporting
 local functions. This source file is used in two places:
 (1) It is #included by pcre_compile.c when it is compiled in debugging mode
 (PCRE_DEBUG defined in pcre_internal.h). It is not included in production
 compiles.
 (2) It is always #included by pcretest.c, which can be asked to print out a
 compiled regex for debugging purposes. */
 /* Macro that decides whether a character should be output as a literal or in
 hexadecimal. We don't use isprint() because that can vary from system to system
 (even without the use of locales) and we want the output always to be the same,
 for testing purposes. This macro is used in pcretest as well as in this file. */
 #ifdef EBCDIC
 #define PRINTABLE(c) ((c) >= 64 && (c) < 255)
 #else
 #define PRINTABLE(c) ((c) >= 32 && (c) < 127)
 #endif
 /* The table of operator names. */
 static const char *OP_names[] = { OP_NAME_LIST };
 /*************************************************
 *       Print single- or multi-byte character    *
 *************************************************/
 static int
 print_char(FILE *f, uschar *ptr, BOOL utf8)
 {
 int c = *ptr;
 #ifndef SUPPORT_UTF8
 utf8 = utf8;  /* Avoid compiler warning */
 if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x%02x", c);
 return 0;
 #else
 if (!utf8 || (c & 0xc0) != 0xc0)
  {
  if (PRINTABLE(c)) fprintf(f, "%c", c); else fprintf(f, "\\x%02x", c);
  return 0;
  }
 else
  {
  int i;
  int a = _pcre_utf8_table4[c & 0x3f];  /* Number of additional bytes */
  int s = 6*a;
  c = (c & _pcre_utf8_table3[a]) << s;
  for (i = 1; i <= a; i++)
    {
    /* This is a check for malformed UTF-8; it should only occur if the sanity
    check has been turned off. Rather than swallow random bytes, just stop if
    we hit a bad one. Print it with \X instead of \x as an indication. */
    if ((ptr[i] & 0xc0) != 0x80)
      {
      fprintf(f, "\\X{%x}", c);
      return i - 1;
      }
    /* The byte is OK */
    s -= 6;
    c |= (ptr[i] & 0x3f) << s;
    }
  if (c < 128) fprintf(f, "\\x%02x", c); else fprintf(f, "\\x{%x}", c);
  return a;
  }
 #endif
 }
 /*************************************************
 *          Find Unicode property name            *
 *************************************************/
 static const char *
 get_ucpname(int ptype, int pvalue)
 {
 #ifdef SUPPORT_UCP
 int i;
 for (i = _pcre_utt_size - 1; i >= 0; i--)
  {
  if (ptype == _pcre_utt[i].type && pvalue == _pcre_utt[i].value) break;
  }
 return (i >= 0)? _pcre_utt_names + _pcre_utt[i].name_offset : "??";
 #else
 /* It gets harder and harder to shut off unwanted compiler warnings. */
 ptype = ptype * pvalue;
 return (ptype == pvalue)? "??" : "??";
 #endif
 }
 /*************************************************
 *         Print compiled regex                   *
 *************************************************/
 /* Make this function work for a regex with integers either byte order.
 However, we assume that what we are passed is a compiled regex. The
 print_lengths flag controls whether offsets and lengths of items are printed.
 They can be turned off from pcretest so that automatic tests on bytecode can be
 written that do not depend on the value of LINK_SIZE. */
 static void
 pcre_printint(pcre *external_re, FILE *f, BOOL print_lengths)
 {
 real_pcre *re = (real_pcre *)external_re;
 uschar *codestart, *code;
 BOOL utf8;
 unsigned int options = re->options;
 int offset = re->name_table_offset;
 int count = re->name_count;
 int size = re->name_entry_size;
 if (re->magic_number != MAGIC_NUMBER)
  {
  offset = ((offset << 8) & 0xff00) | ((offset >> 8) & 0xff);
  count = ((count << 8) & 0xff00) | ((count >> 8) & 0xff);
  size = ((size << 8) & 0xff00) | ((size >> 8) & 0xff);
  options = ((options << 24) & 0xff000000) |
            ((options <<  8) & 0x00ff0000) |
            ((options >>  8) & 0x0000ff00) |
            ((options >> 24) & 0x000000ff);
  }
 code = codestart = (uschar *)re + offset + count * size;
 utf8 = (options & PCRE_UTF8) != 0;
 for(;;)
  {
  uschar *ccode;
  int c;
  int extra = 0;
  if (print_lengths)
    fprintf(f, "%3d ", (int)(code - codestart));
  else
    fprintf(f, "    ");
  switch(*code)
    {
 /* ========================================================================== */
      /* These cases are never obeyed. This is a fudge that causes a compile-
      time error if the vectors OP_names or _pcre_OP_lengths, which are indexed
      by opcode, are not the correct length. It seems to be the only way to do
      such a check at compile time, as the sizeof() operator does not work in
      the C preprocessor. We do this while compiling pcretest, because that
      #includes pcre_tables.c, which holds _pcre_OP_lengths. We can't do this
      when building pcre_compile.c with PCRE_DEBUG set, because it doesn't then
      know the size of _pcre_OP_lengths. */
 #ifdef COMPILING_PCRETEST
      case OP_TABLE_LENGTH:
      case OP_TABLE_LENGTH +
        ((sizeof(OP_names)/sizeof(const char *) == OP_TABLE_LENGTH) &&
        (sizeof(_pcre_OP_lengths) == OP_TABLE_LENGTH)):
      break;
 #endif
 /* ========================================================================== */
    case OP_END:
    fprintf(f, "    %s\n", OP_names[*code]);
    fprintf(f, "------------------------------------------------------------------\n");
    return;
    case OP_OPT:
    fprintf(f, " %.2x %s", code[1], OP_names[*code]);
    break;
    case OP_CHAR:
    fprintf(f, "    ");
    do
      {
      code++;
      code += 1 + print_char(f, code, utf8);
      }
    while (*code == OP_CHAR);
    fprintf(f, "\n");
    continue;
    case OP_CHARNC:
    fprintf(f, " NC ");
    do
      {
      code++;
      code += 1 + print_char(f, code, utf8);
      }
    while (*code == OP_CHARNC);
    fprintf(f, "\n");
    continue;
    case OP_CBRA:
    case OP_SCBRA:
    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
      else fprintf(f, "    ");
    fprintf(f, "%s %d", OP_names[*code], GET2(code, 1+LINK_SIZE));
    break;
    case OP_BRA:
    case OP_SBRA:
    case OP_KETRMAX:
    case OP_KETRMIN:
    case OP_ALT:
    case OP_KET:
    case OP_ASSERT:
    case OP_ASSERT_NOT:
    case OP_ASSERTBACK:
    case OP_ASSERTBACK_NOT:
    case OP_ONCE:
    case OP_COND:
    case OP_SCOND:
    case OP_REVERSE:
    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
      else fprintf(f, "    ");
    fprintf(f, "%s", OP_names[*code]);
    break;
    case OP_CLOSE:
    fprintf(f, "    %s %d", OP_names[*code], GET2(code, 1));
    break;
    case OP_CREF:
    case OP_NCREF:
    fprintf(f, "%3d %s", GET2(code,1), OP_names[*code]);
    break;
    case OP_RREF:
    c = GET2(code, 1);
    if (c == RREF_ANY)
      fprintf(f, "    Cond recurse any");
    else
      fprintf(f, "    Cond recurse %d", c);
    break;
    case OP_NRREF:
    c = GET2(code, 1);
    if (c == RREF_ANY)
      fprintf(f, "    Cond nrecurse any");
    else
      fprintf(f, "    Cond nrecurse %d", c);
    break;
    case OP_DEF:
    fprintf(f, "    Cond def");
    break;
    case OP_STAR:
    case OP_MINSTAR:
    case OP_POSSTAR:
    case OP_PLUS:
    case OP_MINPLUS:
    case OP_POSPLUS:
    case OP_QUERY:
    case OP_MINQUERY:
    case OP_POSQUERY:
    case OP_TYPESTAR:
    case OP_TYPEMINSTAR:
    case OP_TYPEPOSSTAR:
    case OP_TYPEPLUS:
    case OP_TYPEMINPLUS:
    case OP_TYPEPOSPLUS:
    case OP_TYPEQUERY:
    case OP_TYPEMINQUERY:
    case OP_TYPEPOSQUERY:
    fprintf(f, "    ");
    if (*code >= OP_TYPESTAR)
      {
      fprintf(f, "%s", OP_names[code[1]]);
      if (code[1] == OP_PROP || code[1] == OP_NOTPROP)
        {
        fprintf(f, " %s ", get_ucpname(code[2], code[3]));
        extra = 2;
        }
      }
    else extra = print_char(f, code+1, utf8);
    fprintf(f, "%s", OP_names[*code]);
    break;
    case OP_EXACT:
    case OP_UPTO:
    case OP_MINUPTO:
    case OP_POSUPTO:
    fprintf(f, "    ");
    extra = print_char(f, code+3, utf8);
    fprintf(f, "{");
    if (*code != OP_EXACT) fprintf(f, "0,");
    fprintf(f, "%d}", GET2(code,1));
    if (*code == OP_MINUPTO) fprintf(f, "?");
      else if (*code == OP_POSUPTO) fprintf(f, "+");
    break;
    case OP_TYPEEXACT:
    case OP_TYPEUPTO:
    case OP_TYPEMINUPTO:
    case OP_TYPEPOSUPTO:
    fprintf(f, "    %s", OP_names[code[3]]);
    if (code[3] == OP_PROP || code[3] == OP_NOTPROP)
      {
      fprintf(f, " %s ", get_ucpname(code[4], code[5]));
      extra = 2;
      }
    fprintf(f, "{");
    if (*code != OP_TYPEEXACT) fprintf(f, "0,");
    fprintf(f, "%d}", GET2(code,1));
    if (*code == OP_TYPEMINUPTO) fprintf(f, "?");
      else if (*code == OP_TYPEPOSUPTO) fprintf(f, "+");
    break;
    case OP_NOT:
    c = code[1];
    if (PRINTABLE(c)) fprintf(f, "    [^%c]", c);
      else fprintf(f, "    [^\\x%02x]", c);
    break;
    case OP_NOTSTAR:
    case OP_NOTMINSTAR:
    case OP_NOTPOSSTAR:
    case OP_NOTPLUS:
    case OP_NOTMINPLUS:
    case OP_NOTPOSPLUS:
    case OP_NOTQUERY:
    case OP_NOTMINQUERY:
    case OP_NOTPOSQUERY:
    c = code[1];
    if (PRINTABLE(c)) fprintf(f, "    [^%c]", c);
      else fprintf(f, "    [^\\x%02x]", c);
    fprintf(f, "%s", OP_names[*code]);
    break;
    case OP_NOTEXACT:
    case OP_NOTUPTO:
    case OP_NOTMINUPTO:
    case OP_NOTPOSUPTO:
    c = code[3];
    if (PRINTABLE(c)) fprintf(f, "    [^%c]{", c);
      else fprintf(f, "    [^\\x%02x]{", c);
    if (*code != OP_NOTEXACT) fprintf(f, "0,");
    fprintf(f, "%d}", GET2(code,1));
    if (*code == OP_NOTMINUPTO) fprintf(f, "?");
      else if (*code == OP_NOTPOSUPTO) fprintf(f, "+");
    break;
    case OP_RECURSE:
    if (print_lengths) fprintf(f, "%3d ", GET(code, 1));
      else fprintf(f, "    ");
    fprintf(f, "%s", OP_names[*code]);
    break;
    case OP_REF:
    fprintf(f, "    \\%d", GET2(code,1));
    ccode = code + _pcre_OP_lengths[*code];
    goto CLASS_REF_REPEAT;
    case OP_CALLOUT:
    fprintf(f, "    %s %d %d %d", OP_names[*code], code[1], GET(code,2),
      GET(code, 2 + LINK_SIZE));
    break;
    case OP_PROP:
    case OP_NOTPROP:
    fprintf(f, "    %s %s", OP_names[*code], get_ucpname(code[1], code[2]));
    break;
    /* OP_XCLASS can only occur in UTF-8 mode. However, there's no harm in
    having this code always here, and it makes it less messy without all those
    #ifdefs. */
    case OP_CLASS:
    case OP_NCLASS:
    case OP_XCLASS:
      {
      int i, min, max;
      BOOL printmap;
      fprintf(f, "    [");
      if (*code == OP_XCLASS)
        {
        extra = GET(code, 1);
        ccode = code + LINK_SIZE + 1;
        printmap = (*ccode & XCL_MAP) != 0;
        if ((*ccode++ & XCL_NOT) != 0) fprintf(f, "^");
        }
      else
        {
        printmap = TRUE;
        ccode = code + 1;
        }
      /* Print a bit map */
      if (printmap)
        {
        for (i = 0; i < 256; i++)
          {
          if ((ccode[i/8] & (1 << (i&7))) != 0)
            {
            int j;
            for (j = i+1; j < 256; j++)
              if ((ccode[j/8] & (1 << (j&7))) == 0) break;
            if (i == '-' || i == ']') fprintf(f, "\\");
            if (PRINTABLE(i)) fprintf(f, "%c", i);
              else fprintf(f, "\\x%02x", i);
            if (--j > i)
              {
              if (j != i + 1) fprintf(f, "-");
              if (j == '-' || j == ']') fprintf(f, "\\");
              if (PRINTABLE(j)) fprintf(f, "%c", j);
                else fprintf(f, "\\x%02x", j);
              }
            i = j;
            }
          }
        ccode += 32;
        }
      /* For an XCLASS there is always some additional data */
      if (*code == OP_XCLASS)
        {
        int ch;
        while ((ch = *ccode++) != XCL_END)
          {
          if (ch == XCL_PROP)
            {
            int ptype = *ccode++;
            int pvalue = *ccode++;
            fprintf(f, "\\p{%s}", get_ucpname(ptype, pvalue));
            }
          else if (ch == XCL_NOTPROP)
            {
            int ptype = *ccode++;
            int pvalue = *ccode++;
            fprintf(f, "\\P{%s}", get_ucpname(ptype, pvalue));
            }
          else
            {
            ccode += 1 + print_char(f, ccode, TRUE);
            if (ch == XCL_RANGE)
              {
              fprintf(f, "-");
              ccode += 1 + print_char(f, ccode, TRUE);
              }
            }
          }
        }
      /* Indicate a non-UTF8 class which was created by negation */
      fprintf(f, "]%s", (*code == OP_NCLASS)? " (neg)" : "");
      /* Handle repeats after a class or a back reference */
      CLASS_REF_REPEAT:
      switch(*ccode)
        {
        case OP_CRSTAR:
        case OP_CRMINSTAR:
        case OP_CRPLUS:
        case OP_CRMINPLUS:
        case OP_CRQUERY:
        case OP_CRMINQUERY:
        fprintf(f, "%s", OP_names[*ccode]);
        extra += _pcre_OP_lengths[*ccode];
        break;
        case OP_CRRANGE:
        case OP_CRMINRANGE:
        min = GET2(ccode,1);
        max = GET2(ccode,3);
        if (max == 0) fprintf(f, "{%d,}", min);
        else fprintf(f, "{%d,%d}", min, max);
        if (*ccode == OP_CRMINRANGE) fprintf(f, "?");
        extra += _pcre_OP_lengths[*ccode];
        break;
        /* Do nothing if it's not a repeat; this code stops picky compilers
        warning about the lack of a default code path. */
        default:
        break;
        }
      }
    break;
    case OP_MARK:
    case OP_PRUNE_ARG:
    case OP_SKIP_ARG:
    fprintf(f, "    %s %s", OP_names[*code], code + 2);
    extra += code[1];
    break;
    case OP_THEN:
    if (print_lengths)
      fprintf(f, "    %s %d", OP_names[*code], GET(code, 1));
    else
      fprintf(f, "    %s", OP_names[*code]);
    break;
    case OP_THEN_ARG:
    if (print_lengths)
      fprintf(f, "    %s %d %s", OP_names[*code], GET(code, 1),
        code + 2 + LINK_SIZE);
    else
      fprintf(f, "    %s %s", OP_names[*code], code + 2 + LINK_SIZE);
    extra += code[1+LINK_SIZE];
    break;
    /* Anything else is just an item with no data*/
    default:
    fprintf(f, "    %s", OP_names[*code]);
    break;
    }
  code += _pcre_OP_lengths[*code] + extra;
  fprintf(f, "\n");
  }
 }
 /* End of pcre_printint.src */
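
Review note on the class printing loop above: the 256-bit class map is stored as 32 bytes, and character `i` is tested with `ccode[i/8] & (1 << (i&7))`. The following is a minimal standalone sketch of that bit test and of the run-collapsing logic, not the PCRE code itself. The helper names (`map_set`, `map_has`, `append`, `put_char`, `format_class`) are hypothetical, and `isprint` stands in for the `PRINTABLE` macro.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers for illustration; none of these names exist in PCRE. */

/* Set bit c in a 32-byte (256-bit) class map. */
static void map_set(unsigned char *map, int c)
{
  assert(c >= 0 && c < 256);
  map[c / 8] |= (unsigned char)(1 << (c & 7));
}

/* Test bit c with the same expression the printing loop uses. */
static int map_has(const unsigned char *map, int c)
{
  return (map[c / 8] & (1 << (c & 7))) != 0;
}

/* Append s to out (current length len), truncating to fit size bytes.
   Requires size >= 1 and len < size. Returns the new length. */
static size_t append(char *out, size_t size, size_t len, const char *s)
{
  size_t n = strlen(s);
  if (len + n >= size) n = size - 1 - len;
  memcpy(out + len, s, n);
  out[len + n] = '\0';
  return len + n;
}

/* Append one class member, escaping '-' and ']' and writing \xhh for
   non-printable values. */
static size_t put_char(char *out, size_t size, size_t len, int c)
{
  char tmp[8];
  if (c == '-' || c == ']') snprintf(tmp, sizeof tmp, "\\%c", c);
  else if (isprint(c)) snprintf(tmp, sizeof tmp, "%c", c);
  else snprintf(tmp, sizeof tmp, "\\x%02x", (unsigned)c);
  return append(out, size, len, tmp);
}

/* Write the members of map into out, collapsing runs of three or more
   consecutive characters into "first-last". */
static void format_class(const unsigned char *map, char *out, size_t size)
{
  size_t len = 0;
  int i, j;
  if (size == 0) return;
  out[0] = '\0';
  for (i = 0; i < 256; i++)
    {
    if (!map_has(map, i)) continue;
    for (j = i + 1; j < 256; j++)
      if (!map_has(map, j)) break;
    j--;                              /* j is now the last member of the run */
    len = put_char(out, size, len, i);
    if (j > i)
      {
      if (j != i + 1) len = append(out, size, len, "-");
      len = put_char(out, size, len, j);
      }
    i = j;                            /* resume after the run */
    }
}
```

As in the original loop, a run of exactly two characters is written without a hyphen, so a map containing only `a` and `b` is rendered as `ab` rather than `a-b`.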
--- a/ext/pcre/pcrelib/pcre_study.c
+++ b/ext/pcre/pcrelib/pcre_study.c
@ -863,7 +863,6 @@ do
      case OP_NOTUPTOI:
      case OP_NOT_HSPACE:
      case OP_NOT_VSPACE:
      case OP_PROP:
      case OP_PRUNE:
      case OP_PRUNE_ARG:
      case OP_RECURSE:
@ -879,11 +878,33 @@ do
      case OP_SOM:
      case OP_THEN:
      case OP_THEN_ARG:
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
      case OP_XCLASS:
 #endif
      return SSB_FAIL;
      /* A "real" property test implies no starting bits, but the fake property
      PT_CLIST identifies a list of characters. These lists are short, as they
      are used for characters with more than one "other case", so there is no
      point in recognizing them for OP_NOTPROP. */
      case OP_PROP:
      if (tcode[1] != PT_CLIST) return SSB_FAIL;
        {
        const pcre_uint32 *p = PRIV(ucd_caseless_sets) + tcode[2];
        while ((c = *p++) < NOTACHAR)
          {
 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
          if (utf)
            {
            pcre_uchar buff[6];
            (void)PRIV(ord2utf)(c, buff);
            c = buff[0];
            }
 #endif
          if (c > 0xff) SET_BIT(0xff); else SET_BIT(c);
          }
        }
      try_next = FALSE;
      break;
      /* We can ignore word boundary tests. */
      case OP_WORD_BOUNDARY:
@ -1109,24 +1130,17 @@ do
      try_next = FALSE;
      break;
-      /* The cbit_space table has vertical tab as whitespace; we have to
-      ensure it is set as not whitespace. Luckily, the code value is the same
-      (0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate bit. */
+      /* The cbit_space table has vertical tab as whitespace; we no longer
+      have to play fancy tricks because Perl added VT to its whitespace at
+      release 5.18. PCRE added it at release 8.34. */
      case OP_NOT_WHITESPACE:
      set_nottype_bits(start_bits, cbit_space, table_limit, cd);
      start_bits[1] |= 0x08;
      try_next = FALSE;
      break;
      /* The cbit_space table has vertical tab as whitespace; we have to not
      set it from the table. Luckily, the code value is the same (0x0b) in
      ASCII and EBCDIC, so we can just adjust the appropriate bit. */
      case OP_WHITESPACE:
      c = start_bits[1];    /* Save in case it was already set */
      set_type_bits(start_bits, cbit_space, table_limit, cd);
      start_bits[1] = (start_bits[1] & ~0x08) | c;
      try_next = FALSE;
      break;
@ -1257,6 +1271,16 @@ do
      with a value >= 0xc4 is a potentially valid starter because it starts a
      character with a value > 255. */
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
      case OP_XCLASS:
      if ((tcode[1 + LINK_SIZE] & XCL_HASPROP) != 0)
        return SSB_FAIL;
      /* All bits are set. */
      if ((tcode[1 + LINK_SIZE] & XCL_MAP) == 0 && (tcode[1 + LINK_SIZE] & XCL_NOT) != 0)
        return SSB_FAIL;
 #endif
      /* Fall through */
      case OP_NCLASS:
 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
      if (utf)
@ -1273,8 +1297,21 @@ do
      case OP_CLASS:
        {
        pcre_uint8 *map;
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
        map = NULL;
        if (*tcode == OP_XCLASS)
          {
          if ((tcode[1 + LINK_SIZE] & XCL_MAP) != 0)
            map = (pcre_uint8 *)(tcode + 1 + LINK_SIZE + 1);
          tcode += GET(tcode, 1);
          }
        else
 #endif
          {
          tcode++;
          map = (pcre_uint8 *)tcode;
          tcode += 32 / sizeof(pcre_uchar);
          }
        /* In UTF-8 mode, the bits in a bit map correspond to character
        values, not to byte values. However, the bit map we are constructing is
@ -1282,6 +1319,10 @@ do
        value is > 127. In fact, there are only two possible starting bytes for
        characters in the range 128 - 255. */
 #if defined SUPPORT_UTF || !defined COMPILE_PCRE8
        if (map != NULL)
 #endif
          {
 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
          if (utf)
            {
@ -1302,11 +1343,11 @@ do
            /* In non-UTF-8 mode, the two bit maps are completely compatible. */
            for (c = 0; c < 32; c++) start_bits[c] |= map[c];
            }
          }
        /* Advance past the bit map, and act on what follows. For a zero
        minimum repeat, continue; otherwise stop processing. */
        tcode += 32 / sizeof(pcre_uchar);
        switch (*tcode)
          {
          case OP_CRSTAR:
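
Review note on the pcre_study.c hunks above: the starting-bits table is a 256-bit map held in 32 bytes, so vertical tab (0x0b) lives in byte 1 under mask 0x08, which is the mask seen in the whitespace cases. The `PT_CLIST` loop folds any code point above 0xff onto bit 0xff in the non-UTF path. The sketch below illustrates only that arithmetic. The function names are hypothetical, and the UTF-8 path (which uses the first byte of the encoded character) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SET_BIT macro plus the "c > 0xff" clamp
   used in the PT_CLIST loop (non-UTF path only). */
static void set_start_bit(uint8_t *start_bits, uint32_t c)
{
  assert(start_bits != NULL);
  if (c > 0xff) c = 0xff;             /* wide characters share the last bit */
  start_bits[c / 8] |= (uint8_t)(1u << (c & 7));
}

/* Report whether a byte value is marked as a possible first byte. */
static int has_start_bit(const uint8_t *start_bits, uint32_t c)
{
  assert(start_bits != NULL && c <= 0xff);
  return (start_bits[c / 8] & (1u << (c & 7))) != 0;
}
```

For example, setting the bit for 0x0b leaves `start_bits[1] == 0x08`, and setting the bit for U+212A (KELVIN SIGN, one of the characters with more than one other case) marks 0xff in this simplified model.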
--- a/ext/pcre/pcrelib/pcre_tables.c
+++ b/ext/pcre/pcrelib/pcre_tables.c
@ -213,6 +213,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Avestan0 STR_A STR_v STR_e STR_s STR_t STR_a STR_n "\0"
 #define STRING_Balinese0 STR_B STR_a STR_l STR_i STR_n STR_e STR_s STR_e "\0"
 #define STRING_Bamum0 STR_B STR_a STR_m STR_u STR_m "\0"
 #define STRING_Bassa_Vah0 STR_B STR_a STR_s STR_s STR_a STR_UNDERSCORE STR_V STR_a STR_h "\0"
 #define STRING_Batak0 STR_B STR_a STR_t STR_a STR_k "\0"
 #define STRING_Bengali0 STR_B STR_e STR_n STR_g STR_a STR_l STR_i "\0"
 #define STRING_Bopomofo0 STR_B STR_o STR_p STR_o STR_m STR_o STR_f STR_o "\0"
@ -223,6 +224,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_C0 STR_C "\0"
 #define STRING_Canadian_Aboriginal0 STR_C STR_a STR_n STR_a STR_d STR_i STR_a STR_n STR_UNDERSCORE STR_A STR_b STR_o STR_r STR_i STR_g STR_i STR_n STR_a STR_l "\0"
 #define STRING_Carian0 STR_C STR_a STR_r STR_i STR_a STR_n "\0"
 #define STRING_Caucasian_Albanian0 STR_C STR_a STR_u STR_c STR_a STR_s STR_i STR_a STR_n STR_UNDERSCORE STR_A STR_l STR_b STR_a STR_n STR_i STR_a STR_n "\0"
 #define STRING_Cc0 STR_C STR_c "\0"
 #define STRING_Cf0 STR_C STR_f "\0"
 #define STRING_Chakma0 STR_C STR_h STR_a STR_k STR_m STR_a "\0"
@ -238,11 +240,14 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Cyrillic0 STR_C STR_y STR_r STR_i STR_l STR_l STR_i STR_c "\0"
 #define STRING_Deseret0 STR_D STR_e STR_s STR_e STR_r STR_e STR_t "\0"
 #define STRING_Devanagari0 STR_D STR_e STR_v STR_a STR_n STR_a STR_g STR_a STR_r STR_i "\0"
 #define STRING_Duployan0 STR_D STR_u STR_p STR_l STR_o STR_y STR_a STR_n "\0"
 #define STRING_Egyptian_Hieroglyphs0 STR_E STR_g STR_y STR_p STR_t STR_i STR_a STR_n STR_UNDERSCORE STR_H STR_i STR_e STR_r STR_o STR_g STR_l STR_y STR_p STR_h STR_s "\0"
 #define STRING_Elbasan0 STR_E STR_l STR_b STR_a STR_s STR_a STR_n "\0"
 #define STRING_Ethiopic0 STR_E STR_t STR_h STR_i STR_o STR_p STR_i STR_c "\0"
 #define STRING_Georgian0 STR_G STR_e STR_o STR_r STR_g STR_i STR_a STR_n "\0"
 #define STRING_Glagolitic0 STR_G STR_l STR_a STR_g STR_o STR_l STR_i STR_t STR_i STR_c "\0"
 #define STRING_Gothic0 STR_G STR_o STR_t STR_h STR_i STR_c "\0"
 #define STRING_Grantha0 STR_G STR_r STR_a STR_n STR_t STR_h STR_a "\0"
 #define STRING_Greek0 STR_G STR_r STR_e STR_e STR_k "\0"
 #define STRING_Gujarati0 STR_G STR_u STR_j STR_a STR_r STR_a STR_t STR_i "\0"
 #define STRING_Gurmukhi0 STR_G STR_u STR_r STR_m STR_u STR_k STR_h STR_i "\0"
@ -262,12 +267,15 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Kayah_Li0 STR_K STR_a STR_y STR_a STR_h STR_UNDERSCORE STR_L STR_i "\0"
 #define STRING_Kharoshthi0 STR_K STR_h STR_a STR_r STR_o STR_s STR_h STR_t STR_h STR_i "\0"
 #define STRING_Khmer0 STR_K STR_h STR_m STR_e STR_r "\0"
 #define STRING_Khojki0 STR_K STR_h STR_o STR_j STR_k STR_i "\0"
 #define STRING_Khudawadi0 STR_K STR_h STR_u STR_d STR_a STR_w STR_a STR_d STR_i "\0"
 #define STRING_L0 STR_L "\0"
 #define STRING_L_AMPERSAND0 STR_L STR_AMPERSAND "\0"
 #define STRING_Lao0 STR_L STR_a STR_o "\0"
 #define STRING_Latin0 STR_L STR_a STR_t STR_i STR_n "\0"
 #define STRING_Lepcha0 STR_L STR_e STR_p STR_c STR_h STR_a "\0"
 #define STRING_Limbu0 STR_L STR_i STR_m STR_b STR_u "\0"
 #define STRING_Linear_A0 STR_L STR_i STR_n STR_e STR_a STR_r STR_UNDERSCORE STR_A "\0"
 #define STRING_Linear_B0 STR_L STR_i STR_n STR_e STR_a STR_r STR_UNDERSCORE STR_B "\0"
 #define STRING_Lisu0 STR_L STR_i STR_s STR_u "\0"
 #define STRING_Ll0 STR_L STR_l "\0"
@ -278,18 +286,24 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Lycian0 STR_L STR_y STR_c STR_i STR_a STR_n "\0"
 #define STRING_Lydian0 STR_L STR_y STR_d STR_i STR_a STR_n "\0"
 #define STRING_M0 STR_M "\0"
 #define STRING_Mahajani0 STR_M STR_a STR_h STR_a STR_j STR_a STR_n STR_i "\0"
 #define STRING_Malayalam0 STR_M STR_a STR_l STR_a STR_y STR_a STR_l STR_a STR_m "\0"
 #define STRING_Mandaic0 STR_M STR_a STR_n STR_d STR_a STR_i STR_c "\0"
 #define STRING_Manichaean0 STR_M STR_a STR_n STR_i STR_c STR_h STR_a STR_e STR_a STR_n "\0"
 #define STRING_Mc0 STR_M STR_c "\0"
 #define STRING_Me0 STR_M STR_e "\0"
 #define STRING_Meetei_Mayek0 STR_M STR_e STR_e STR_t STR_e STR_i STR_UNDERSCORE STR_M STR_a STR_y STR_e STR_k "\0"
 #define STRING_Mende_Kikakui0 STR_M STR_e STR_n STR_d STR_e STR_UNDERSCORE STR_K STR_i STR_k STR_a STR_k STR_u STR_i "\0"
 #define STRING_Meroitic_Cursive0 STR_M STR_e STR_r STR_o STR_i STR_t STR_i STR_c STR_UNDERSCORE STR_C STR_u STR_r STR_s STR_i STR_v STR_e "\0"
 #define STRING_Meroitic_Hieroglyphs0 STR_M STR_e STR_r STR_o STR_i STR_t STR_i STR_c STR_UNDERSCORE STR_H STR_i STR_e STR_r STR_o STR_g STR_l STR_y STR_p STR_h STR_s "\0"
 #define STRING_Miao0 STR_M STR_i STR_a STR_o "\0"
 #define STRING_Mn0 STR_M STR_n "\0"
 #define STRING_Modi0 STR_M STR_o STR_d STR_i "\0"
 #define STRING_Mongolian0 STR_M STR_o STR_n STR_g STR_o STR_l STR_i STR_a STR_n "\0"
 #define STRING_Mro0 STR_M STR_r STR_o "\0"
 #define STRING_Myanmar0 STR_M STR_y STR_a STR_n STR_m STR_a STR_r "\0"
 #define STRING_N0 STR_N "\0"
 #define STRING_Nabataean0 STR_N STR_a STR_b STR_a STR_t STR_a STR_e STR_a STR_n "\0"
 #define STRING_Nd0 STR_N STR_d "\0"
 #define STRING_New_Tai_Lue0 STR_N STR_e STR_w STR_UNDERSCORE STR_T STR_a STR_i STR_UNDERSCORE STR_L STR_u STR_e "\0"
 #define STRING_Nko0 STR_N STR_k STR_o "\0"
@ -298,12 +312,17 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Ogham0 STR_O STR_g STR_h STR_a STR_m "\0"
 #define STRING_Ol_Chiki0 STR_O STR_l STR_UNDERSCORE STR_C STR_h STR_i STR_k STR_i "\0"
 #define STRING_Old_Italic0 STR_O STR_l STR_d STR_UNDERSCORE STR_I STR_t STR_a STR_l STR_i STR_c "\0"
 #define STRING_Old_North_Arabian0 STR_O STR_l STR_d STR_UNDERSCORE STR_N STR_o STR_r STR_t STR_h STR_UNDERSCORE STR_A STR_r STR_a STR_b STR_i STR_a STR_n "\0"
 #define STRING_Old_Permic0 STR_O STR_l STR_d STR_UNDERSCORE STR_P STR_e STR_r STR_m STR_i STR_c "\0"
 #define STRING_Old_Persian0 STR_O STR_l STR_d STR_UNDERSCORE STR_P STR_e STR_r STR_s STR_i STR_a STR_n "\0"
 #define STRING_Old_South_Arabian0 STR_O STR_l STR_d STR_UNDERSCORE STR_S STR_o STR_u STR_t STR_h STR_UNDERSCORE STR_A STR_r STR_a STR_b STR_i STR_a STR_n "\0"
 #define STRING_Old_Turkic0 STR_O STR_l STR_d STR_UNDERSCORE STR_T STR_u STR_r STR_k STR_i STR_c "\0"
 #define STRING_Oriya0 STR_O STR_r STR_i STR_y STR_a "\0"
 #define STRING_Osmanya0 STR_O STR_s STR_m STR_a STR_n STR_y STR_a "\0"
 #define STRING_P0 STR_P "\0"
 #define STRING_Pahawh_Hmong0 STR_P STR_a STR_h STR_a STR_w STR_h STR_UNDERSCORE STR_H STR_m STR_o STR_n STR_g "\0"
 #define STRING_Palmyrene0 STR_P STR_a STR_l STR_m STR_y STR_r STR_e STR_n STR_e "\0"
 #define STRING_Pau_Cin_Hau0 STR_P STR_a STR_u STR_UNDERSCORE STR_C STR_i STR_n STR_UNDERSCORE STR_H STR_a STR_u "\0"
 #define STRING_Pc0 STR_P STR_c "\0"
 #define STRING_Pd0 STR_P STR_d "\0"
 #define STRING_Pe0 STR_P STR_e "\0"
@ -313,6 +332,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Pi0 STR_P STR_i "\0"
 #define STRING_Po0 STR_P STR_o "\0"
 #define STRING_Ps0 STR_P STR_s "\0"
 #define STRING_Psalter_Pahlavi0 STR_P STR_s STR_a STR_l STR_t STR_e STR_r STR_UNDERSCORE STR_P STR_a STR_h STR_l STR_a STR_v STR_i "\0"
 #define STRING_Rejang0 STR_R STR_e STR_j STR_a STR_n STR_g "\0"
 #define STRING_Runic0 STR_R STR_u STR_n STR_i STR_c "\0"
 #define STRING_S0 STR_S "\0"
@ -321,6 +341,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Sc0 STR_S STR_c "\0"
 #define STRING_Sharada0 STR_S STR_h STR_a STR_r STR_a STR_d STR_a "\0"
 #define STRING_Shavian0 STR_S STR_h STR_a STR_v STR_i STR_a STR_n "\0"
 #define STRING_Siddham0 STR_S STR_i STR_d STR_d STR_h STR_a STR_m "\0"
 #define STRING_Sinhala0 STR_S STR_i STR_n STR_h STR_a STR_l STR_a "\0"
 #define STRING_Sk0 STR_S STR_k "\0"
 #define STRING_Sm0 STR_S STR_m "\0"
@ -341,8 +362,10 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
 #define STRING_Thai0 STR_T STR_h STR_a STR_i "\0"
 #define STRING_Tibetan0 STR_T STR_i STR_b STR_e STR_t STR_a STR_n "\0"
 #define STRING_Tifinagh0 STR_T STR_i STR_f STR_i STR_n STR_a STR_g STR_h "\0"
 #define STRING_Tirhuta0 STR_T STR_i STR_r STR_h STR_u STR_t STR_a "\0"
 #define STRING_Ugaritic0 STR_U STR_g STR_a STR_r STR_i STR_t STR_i STR_c "\0"
 #define STRING_Vai0 STR_V STR_a STR_i "\0"
 #define STRING_Warang_Citi0 STR_W STR_a STR_r STR_a STR_n STR_g STR_UNDERSCORE STR_C STR_i STR_t STR_i "\0"
 #define STRING_Xan0 STR_X STR_a STR_n "\0"
 #define STRING_Xps0 STR_X STR_p STR_s "\0"
 #define STRING_Xsp0 STR_X STR_s STR_p "\0"
@ -361,6 +384,7 @@ const char PRIV(utt_names)[] =
  STRING_Avestan0
  STRING_Balinese0
  STRING_Bamum0
  STRING_Bassa_Vah0
  STRING_Batak0
  STRING_Bengali0
  STRING_Bopomofo0
@ -371,6 +395,7 @@ const char PRIV(utt_names)[] =
  STRING_C0
  STRING_Canadian_Aboriginal0
  STRING_Carian0
  STRING_Caucasian_Albanian0
  STRING_Cc0
  STRING_Cf0
  STRING_Chakma0
@ -386,11 +411,14 @@ const char PRIV(utt_names)[] =
  STRING_Cyrillic0
  STRING_Deseret0
  STRING_Devanagari0
  STRING_Duployan0
  STRING_Egyptian_Hieroglyphs0
  STRING_Elbasan0
  STRING_Ethiopic0
  STRING_Georgian0
  STRING_Glagolitic0
  STRING_Gothic0
  STRING_Grantha0
  STRING_Greek0
  STRING_Gujarati0
  STRING_Gurmukhi0
@ -410,12 +438,15 @@ const char PRIV(utt_names)[] =
  STRING_Kayah_Li0
  STRING_Kharoshthi0
  STRING_Khmer0
  STRING_Khojki0
  STRING_Khudawadi0
  STRING_L0
  STRING_L_AMPERSAND0
  STRING_Lao0
  STRING_Latin0
  STRING_Lepcha0
  STRING_Limbu0
  STRING_Linear_A0
  STRING_Linear_B0
  STRING_Lisu0
  STRING_Ll0
@ -426,18 +457,24 @@ const char PRIV(utt_names)[] =
  STRING_Lycian0
  STRING_Lydian0
  STRING_M0
  STRING_Mahajani0
  STRING_Malayalam0
  STRING_Mandaic0
  STRING_Manichaean0
  STRING_Mc0
  STRING_Me0
  STRING_Meetei_Mayek0
  STRING_Mende_Kikakui0
  STRING_Meroitic_Cursive0
  STRING_Meroitic_Hieroglyphs0
  STRING_Miao0
  STRING_Mn0
  STRING_Modi0
  STRING_Mongolian0
  STRING_Mro0
  STRING_Myanmar0
  STRING_N0
  STRING_Nabataean0
  STRING_Nd0
  STRING_New_Tai_Lue0
  STRING_Nko0
@ -446,12 +483,17 @@ const char PRIV(utt_names)[] =
  STRING_Ogham0
  STRING_Ol_Chiki0
  STRING_Old_Italic0
  STRING_Old_North_Arabian0
  STRING_Old_Permic0
  STRING_Old_Persian0
  STRING_Old_South_Arabian0
  STRING_Old_Turkic0
  STRING_Oriya0
  STRING_Osmanya0
  STRING_P0
  STRING_Pahawh_Hmong0
  STRING_Palmyrene0
  STRING_Pau_Cin_Hau0
  STRING_Pc0
  STRING_Pd0
  STRING_Pe0
@ -461,6 +503,7 @@ const char PRIV(utt_names)[] =
  STRING_Pi0
  STRING_Po0
  STRING_Ps0
  STRING_Psalter_Pahlavi0
  STRING_Rejang0
  STRING_Runic0
  STRING_S0
@ -469,6 +512,7 @@ const char PRIV(utt_names)[] =
  STRING_Sc0
  STRING_Sharada0
  STRING_Shavian0
  STRING_Siddham0
  STRING_Sinhala0
  STRING_Sk0
  STRING_Sm0
@ -489,8 +533,10 @@ const char PRIV(utt_names)[] =
  STRING_Thai0
  STRING_Tibetan0
  STRING_Tifinagh0
  STRING_Tirhuta0
  STRING_Ugaritic0
  STRING_Vai0
  STRING_Warang_Citi0
  STRING_Xan0
  STRING_Xps0
  STRING_Xsp0
@ -509,146 +555,169 @@ const ucp_type_table PRIV(utt)[] = {
  {  20, PT_SC, ucp_Avestan },
  {  28, PT_SC, ucp_Balinese },
  {  37, PT_SC, ucp_Bamum },
-  {  43, PT_SC, ucp_Batak },
+  {  43, PT_SC, ucp_Bassa_Vah },
-  {  49, PT_SC, ucp_Bengali },
+  {  53, PT_SC, ucp_Batak },
-  {  57, PT_SC, ucp_Bopomofo },
+  {  59, PT_SC, ucp_Bengali },
-  {  66, PT_SC, ucp_Brahmi },
+  {  67, PT_SC, ucp_Bopomofo },
-  {  73, PT_SC, ucp_Braille },
+  {  76, PT_SC, ucp_Brahmi },
-  {  81, PT_SC, ucp_Buginese },
+  {  83, PT_SC, ucp_Braille },
-  {  90, PT_SC, ucp_Buhid },
+  {  91, PT_SC, ucp_Buginese },
-  {  96, PT_GC, ucp_C },
+  { 100, PT_SC, ucp_Buhid },
-  {  98, PT_SC, ucp_Canadian_Aboriginal },
+  { 106, PT_GC, ucp_C },
-  { 118, PT_SC, ucp_Carian },
+  { 108, PT_SC, ucp_Canadian_Aboriginal },
-  { 125, PT_PC, ucp_Cc },
+  { 128, PT_SC, ucp_Carian },
-  { 128, PT_PC, ucp_Cf },
+  { 135, PT_SC, ucp_Caucasian_Albanian },
-  { 131, PT_SC, ucp_Chakma },
+  { 154, PT_PC, ucp_Cc },
-  { 138, PT_SC, ucp_Cham },
+  { 157, PT_PC, ucp_Cf },
-  { 143, PT_SC, ucp_Cherokee },
+  { 160, PT_SC, ucp_Chakma },
-  { 152, PT_PC, ucp_Cn },
+  { 167, PT_SC, ucp_Cham },
-  { 155, PT_PC, ucp_Co },
+  { 172, PT_SC, ucp_Cherokee },
-  { 158, PT_SC, ucp_Common },
+  { 181, PT_PC, ucp_Cn },
-  { 165, PT_SC, ucp_Coptic },
+  { 184, PT_PC, ucp_Co },
-  { 172, PT_PC, ucp_Cs },
+  { 187, PT_SC, ucp_Common },
-  { 175, PT_SC, ucp_Cuneiform },
+  { 194, PT_SC, ucp_Coptic },
-  { 185, PT_SC, ucp_Cypriot },
+  { 201, PT_PC, ucp_Cs },
-  { 193, PT_SC, ucp_Cyrillic },
+  { 204, PT_SC, ucp_Cuneiform },
-  { 202, PT_SC, ucp_Deseret },
+  { 214, PT_SC, ucp_Cypriot },
-  { 210, PT_SC, ucp_Devanagari },
+  { 222, PT_SC, ucp_Cyrillic },
-  { 221, PT_SC, ucp_Egyptian_Hieroglyphs },
+  { 231, PT_SC, ucp_Deseret },
-  { 242, PT_SC, ucp_Ethiopic },
+  { 239, PT_SC, ucp_Devanagari },
-  { 251, PT_SC, ucp_Georgian },
+  { 250, PT_SC, ucp_Duployan },
-  { 260, PT_SC, ucp_Glagolitic },
+  { 259, PT_SC, ucp_Egyptian_Hieroglyphs },
-  { 271, PT_SC, ucp_Gothic },
+  { 280, PT_SC, ucp_Elbasan },
-  { 278, PT_SC, ucp_Greek },
+  { 288, PT_SC, ucp_Ethiopic },
-  { 284, PT_SC, ucp_Gujarati },
+  { 297, PT_SC, ucp_Georgian },
-  { 293, PT_SC, ucp_Gurmukhi },
+  { 306, PT_SC, ucp_Glagolitic },
-  { 302, PT_SC, ucp_Han },
+  { 317, PT_SC, ucp_Gothic },
-  { 306, PT_SC, ucp_Hangul },
+  { 324, PT_SC, ucp_Grantha },
-  { 313, PT_SC, ucp_Hanunoo },
+  { 332, PT_SC, ucp_Greek },
-  { 321, PT_SC, ucp_Hebrew },
+  { 338, PT_SC, ucp_Gujarati },
-  { 328, PT_SC, ucp_Hiragana },
+  { 347, PT_SC, ucp_Gurmukhi },
-  { 337, PT_SC, ucp_Imperial_Aramaic },
+  { 356, PT_SC, ucp_Han },
-  { 354, PT_SC, ucp_Inherited },
+  { 360, PT_SC, ucp_Hangul },
-  { 364, PT_SC, ucp_Inscriptional_Pahlavi },
+  { 367, PT_SC, ucp_Hanunoo },
-  { 386, PT_SC, ucp_Inscriptional_Parthian },
+  { 375, PT_SC, ucp_Hebrew },
-  { 409, PT_SC, ucp_Javanese },
+  { 382, PT_SC, ucp_Hiragana },
-  { 418, PT_SC, ucp_Kaithi },
+  { 391, PT_SC, ucp_Imperial_Aramaic },
-  { 425, PT_SC, ucp_Kannada },
+  { 408, PT_SC, ucp_Inherited },
-  { 433, PT_SC, ucp_Katakana },
+  { 418, PT_SC, ucp_Inscriptional_Pahlavi },
-  { 442, PT_SC, ucp_Kayah_Li },
+  { 440, PT_SC, ucp_Inscriptional_Parthian },
-  { 451, PT_SC, ucp_Kharoshthi },
+  { 463, PT_SC, ucp_Javanese },
-  { 462, PT_SC, ucp_Khmer },
+  { 472, PT_SC, ucp_Kaithi },
-  { 468, PT_GC, ucp_L },
+  { 479, PT_SC, ucp_Kannada },
-  { 470, PT_LAMP, 0 },
+  { 487, PT_SC, ucp_Katakana },
-  { 473, PT_SC, ucp_Lao },
+  { 496, PT_SC, ucp_Kayah_Li },
-  { 477, PT_SC, ucp_Latin },
+  { 505, PT_SC, ucp_Kharoshthi },
-  { 483, PT_SC, ucp_Lepcha },
+  { 516, PT_SC, ucp_Khmer },
-  { 490, PT_SC, ucp_Limbu },
+  { 522, PT_SC, ucp_Khojki },
-  { 496, PT_SC, ucp_Linear_B },
+  { 529, PT_SC, ucp_Khudawadi },
-  { 505, PT_SC, ucp_Lisu },
+  { 539, PT_GC, ucp_L },
-  { 510, PT_PC, ucp_Ll },
+  { 541, PT_LAMP, 0 },
-  { 513, PT_PC, ucp_Lm },
+  { 544, PT_SC, ucp_Lao },
-  { 516, PT_PC, ucp_Lo },
+  { 548, PT_SC, ucp_Latin },
-  { 519, PT_PC, ucp_Lt },
+  { 554, PT_SC, ucp_Lepcha },
-  { 522, PT_PC, ucp_Lu },
+  { 561, PT_SC, ucp_Limbu },
-  { 525, PT_SC, ucp_Lycian },
+  { 567, PT_SC, ucp_Linear_A },
-  { 532, PT_SC, ucp_Lydian },
+  { 576, PT_SC, ucp_Linear_B },
-  { 539, PT_GC, ucp_M },
+  { 585, PT_SC, ucp_Lisu },
-  { 541, PT_SC, ucp_Malayalam },
+  { 590, PT_PC, ucp_Ll },
-  { 551, PT_SC, ucp_Mandaic },
+  { 593, PT_PC, ucp_Lm },
-  { 559, PT_PC, ucp_Mc },
+  { 596, PT_PC, ucp_Lo },
-  { 562, PT_PC, ucp_Me },
+  { 599, PT_PC, ucp_Lt },
-  { 565, PT_SC, ucp_Meetei_Mayek },
+  { 602, PT_PC, ucp_Lu },
-  { 578, PT_SC, ucp_Meroitic_Cursive },
+  { 605, PT_SC, ucp_Lycian },
-  { 595, PT_SC, ucp_Meroitic_Hieroglyphs },
+  { 612, PT_SC, ucp_Lydian },
-  { 616, PT_SC, ucp_Miao },
+  { 619, PT_GC, ucp_M },
-  { 621, PT_PC, ucp_Mn },
+  { 621, PT_SC, ucp_Mahajani },
-  { 624, PT_SC, ucp_Mongolian },
+  { 630, PT_SC, ucp_Malayalam },
-  { 634, PT_SC, ucp_Myanmar },
+  { 640, PT_SC, ucp_Mandaic },
-  { 642, PT_GC, ucp_N },
+  { 648, PT_SC, ucp_Manichaean },
-  { 644, PT_PC, ucp_Nd },
+  { 659, PT_PC, ucp_Mc },
-  { 647, PT_SC, ucp_New_Tai_Lue },
+  { 662, PT_PC, ucp_Me },
-  { 659, PT_SC, ucp_Nko },
+  { 665, PT_SC, ucp_Meetei_Mayek },
-  { 663, PT_PC, ucp_Nl },
+  { 678, PT_SC, ucp_Mende_Kikakui },
-  { 666, PT_PC, ucp_No },
+  { 692, PT_SC, ucp_Meroitic_Cursive },
-  { 669, PT_SC, ucp_Ogham },
+  { 709, PT_SC, ucp_Meroitic_Hieroglyphs },
-  { 675, PT_SC, ucp_Ol_Chiki },
+  { 730, PT_SC, ucp_Miao },
-  { 684, PT_SC, ucp_Old_Italic },
+  { 735, PT_PC, ucp_Mn },
-  { 695, PT_SC, ucp_Old_Persian },
+  { 738, PT_SC, ucp_Modi },
-  { 707, PT_SC, ucp_Old_South_Arabian },
+  { 743, PT_SC, ucp_Mongolian },
-  { 725, PT_SC, ucp_Old_Turkic },
+  { 753, PT_SC, ucp_Mro },
-  { 736, PT_SC, ucp_Oriya },
+  { 757, PT_SC, ucp_Myanmar },
-  { 742, PT_SC, ucp_Osmanya },
+  { 765, PT_GC, ucp_N },
-  { 750, PT_GC, ucp_P },
+  { 767, PT_SC, ucp_Nabataean },
-  { 752, PT_PC, ucp_Pc },
+  { 777, PT_PC, ucp_Nd },
-  { 755, PT_PC, ucp_Pd },
+  { 780, PT_SC, ucp_New_Tai_Lue },
-  { 758, PT_PC, ucp_Pe },
+  { 792, PT_SC, ucp_Nko },
-  { 761, PT_PC, ucp_Pf },
+  { 796, PT_PC, ucp_Nl },
-  { 764, PT_SC, ucp_Phags_Pa },
+  { 799, PT_PC, ucp_No },
-  { 773, PT_SC, ucp_Phoenician },
+  { 802, PT_SC, ucp_Ogham },
-  { 784, PT_PC, ucp_Pi },
+  { 808, PT_SC, ucp_Ol_Chiki },
-  { 787, PT_PC, ucp_Po },
+  { 817, PT_SC, ucp_Old_Italic },
-  { 790, PT_PC, ucp_Ps },
+  { 828, PT_SC, ucp_Old_North_Arabian },
-  { 793, PT_SC, ucp_Rejang },
+  { 846, PT_SC, ucp_Old_Permic },
-  { 800, PT_SC, ucp_Runic },
+  { 857, PT_SC, ucp_Old_Persian },
-  { 806, PT_GC, ucp_S },
+  { 869, PT_SC, ucp_Old_South_Arabian },
-  { 808, PT_SC, ucp_Samaritan },
+  { 887, PT_SC, ucp_Old_Turkic },
-  { 818, PT_SC, ucp_Saurashtra },
+  { 898, PT_SC, ucp_Oriya },
-  { 829, PT_PC, ucp_Sc },
+  { 904, PT_SC, ucp_Osmanya },
-  { 832, PT_SC, ucp_Sharada },
+  { 912, PT_GC, ucp_P },
-  { 840, PT_SC, ucp_Shavian },
+  { 914, PT_SC, ucp_Pahawh_Hmong },
-  { 848, PT_SC, ucp_Sinhala },
+  { 927, PT_SC, ucp_Palmyrene },
-  { 856, PT_PC, ucp_Sk },
+  { 937, PT_SC, ucp_Pau_Cin_Hau },
-  { 859, PT_PC, ucp_Sm },
+  { 949, PT_PC, ucp_Pc },
-  { 862, PT_PC, ucp_So },
+  { 952, PT_PC, ucp_Pd },
-  { 865, PT_SC, ucp_Sora_Sompeng },
+  { 955, PT_PC, ucp_Pe },
-  { 878, PT_SC, ucp_Sundanese },
+  { 958, PT_PC, ucp_Pf },
-  { 888, PT_SC, ucp_Syloti_Nagri },
+  { 961, PT_SC, ucp_Phags_Pa },
-  { 901, PT_SC, ucp_Syriac },
+  { 970, PT_SC, ucp_Phoenician },
-  { 908, PT_SC, ucp_Tagalog },
+  { 981, PT_PC, ucp_Pi },
-  { 916, PT_SC, ucp_Tagbanwa },
+  { 984, PT_PC, ucp_Po },
-  { 925, PT_SC, ucp_Tai_Le },
+  { 987, PT_PC, ucp_Ps },
-  { 932, PT_SC, ucp_Tai_Tham },
+  { 990, PT_SC, ucp_Psalter_Pahlavi },
-  { 941, PT_SC, ucp_Tai_Viet },
+  { 1006, PT_SC, ucp_Rejang },
-  { 950, PT_SC, ucp_Takri },
+  { 1013, PT_SC, ucp_Runic },
-  { 956, PT_SC, ucp_Tamil },
+  { 1019, PT_GC, ucp_S },
-  { 962, PT_SC, ucp_Telugu },
+  { 1021, PT_SC, ucp_Samaritan },
-  { 969, PT_SC, ucp_Thaana },
+  { 1031, PT_SC, ucp_Saurashtra },
-  { 976, PT_SC, ucp_Thai },
+  { 1042, PT_PC, ucp_Sc },
-  { 981, PT_SC, ucp_Tibetan },
+  { 1045, PT_SC, ucp_Sharada },
-  { 989, PT_SC, ucp_Tifinagh },
+  { 1053, PT_SC, ucp_Shavian },
-  { 998, PT_SC, ucp_Ugaritic },
+  { 1061, PT_SC, ucp_Siddham },
-  { 1007, PT_SC, ucp_Vai },
+  { 1069, PT_SC, ucp_Sinhala },
-  { 1011, PT_ALNUM, 0 },
+  { 1077, PT_PC, ucp_Sk },
-  { 1015, PT_PXSPACE, 0 },
+  { 1080, PT_PC, ucp_Sm },
-  { 1019, PT_SPACE, 0 },
+  { 1083, PT_PC, ucp_So },
-  { 1023, PT_UCNC, 0 },
+  { 1086, PT_SC, ucp_Sora_Sompeng },
-  { 1027, PT_WORD, 0 },
+  { 1099, PT_SC, ucp_Sundanese },
-  { 1031, PT_SC, ucp_Yi },
+  { 1109, PT_SC, ucp_Syloti_Nagri },
-  { 1034, PT_GC, ucp_Z },
+  { 1122, PT_SC, ucp_Syriac },
-  { 1036, PT_PC, ucp_Zl },
+  { 1129, PT_SC, ucp_Tagalog },
-  { 1039, PT_PC, ucp_Zp },
+  { 1137, PT_SC, ucp_Tagbanwa },
-  { 1042, PT_PC, ucp_Zs }
+  { 1146, PT_SC, ucp_Tai_Le },
  { 1153, PT_SC, ucp_Tai_Tham },
  { 1162, PT_SC, ucp_Tai_Viet },
  { 1171, PT_SC, ucp_Takri },
  { 1177, PT_SC, ucp_Tamil },
  { 1183, PT_SC, ucp_Telugu },
  { 1190, PT_SC, ucp_Thaana },
  { 1197, PT_SC, ucp_Thai },
  { 1202, PT_SC, ucp_Tibetan },
  { 1210, PT_SC, ucp_Tifinagh },
  { 1219, PT_SC, ucp_Tirhuta },
  { 1227, PT_SC, ucp_Ugaritic },
  { 1236, PT_SC, ucp_Vai },
  { 1240, PT_SC, ucp_Warang_Citi },
  { 1252, PT_ALNUM, 0 },
  { 1256, PT_PXSPACE, 0 },
  { 1260, PT_SPACE, 0 },
  { 1264, PT_UCNC, 0 },
  { 1268, PT_WORD, 0 },
  { 1272, PT_SC, ucp_Yi },
  { 1275, PT_GC, ucp_Z },
  { 1277, PT_PC, ucp_Zl },
  { 1280, PT_PC, ucp_Zp },
  { 1283, PT_PC, ucp_Zs }
 };
 const int PRIV(utt_size) = sizeof(PRIV(utt)) / sizeof(ucp_type_table);
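
Review note on the renumbered `utt` table above: the first field of each entry is the byte offset of the name inside the concatenated, NUL-separated `utt_names` string, so inserting one script name shifts every later offset by its length plus one. For instance, adding `Bassa_Vah` (9 characters plus NUL) moves `Batak` from 43 to 53. The sketch below recomputes such offsets for a small blob and looks a name up by binary search. The helper names are hypothetical, offsets here are relative to the start of the sample blob, and the lookup assumes the names are sorted in `strcmp` order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Record the start offset of each NUL-terminated name in blob[0..len).
   Returns the number of names found (at most max). Hypothetical helper. */
static size_t name_offsets(const char *blob, size_t len,
                           size_t *offsets, size_t max)
{
  size_t pos = 0, n = 0;
  assert(blob != NULL && offsets != NULL);
  while (pos < len && n < max)
    {
    offsets[n++] = pos;
    pos += strlen(blob + pos) + 1;    /* skip the name and its NUL */
    }
  return n;
}

/* Binary search for name; assumes the names are sorted in strcmp order.
   Returns the table index, or -1 if the name is not present. */
static int find_name(const char *blob, const size_t *offsets, size_t n,
                     const char *name)
{
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
    size_t mid = lo + (hi - lo) / 2;
    int c = strcmp(name, blob + offsets[mid]);
    if (c == 0) return (int)mid;
    if (c < 0) hi = mid; else lo = mid + 1;
    }
  return -1;
}
```

With the blob `"Bamum\0" "Bassa_Vah\0" "Batak"` the computed offsets are 0, 6 and 16, matching the gaps between 37, 43 and 53 in the new table.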
--- a/ext/pcre/pcrelib/pcre_ucd.c
+++ b/ext/pcre/pcrelib/pcre_ucd.c
--- a/ext/pcre/pcrelib/pcre_xclass.c
+++ b/ext/pcre/pcrelib/pcre_xclass.c
@ -81,6 +81,11 @@ additional data. */
 if (c < 256)
  {
  if ((*data & XCL_HASPROP) == 0)
    {
    if ((*data & XCL_MAP) == 0) return negated;
    return (((pcre_uint8 *)(data + 1))[c/8] & (1 << (c&7))) != 0;
    }
  if ((*data & XCL_MAP) != 0 &&
    (((pcre_uint8 *)(data + 1))[c/8] & (1 << (c&7))) != 0)
    return !negated; /* char found */
--- a/ext/pcre/pcrelib/pcreposix.c
+++ b/ext/pcre/pcrelib/pcreposix.c
@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.
                       Written by Philip Hazel
-           Copyright (c) 1997-2012 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@ -170,7 +170,10 @@ static const int eint[] = {
  REG_BADPAT,  /* missing opening brace after \o */
  REG_BADPAT,  /* parentheses too deeply nested */
  REG_BADPAT,  /* invalid range in character class */
-  REG_BADPAT   /* group name must start with a non-digit */
+  REG_BADPAT,  /* group name must start with a non-digit */
  /* 85 */
  REG_BADPAT,  /* parentheses too deeply nested (stack check) */
  REG_BADPAT   /* missing digits in \x{} or \o{} */
 };
 /* Table of texts corresponding to POSIX error codes */
--- a/ext/pcre/pcrelib/testdata/saved16BE-1
+++ b/ext/pcre/pcrelib/testdata/saved16BE-1
--- a/ext/pcre/pcrelib/testdata/saved16LE-1
+++ b/ext/pcre/pcrelib/testdata/saved16LE-1
--- a/ext/pcre/pcrelib/testdata/saved32BE-1
+++ b/ext/pcre/pcrelib/testdata/saved32BE-1
--- a/ext/pcre/pcrelib/testdata/saved32LE-1
+++ b/ext/pcre/pcrelib/testdata/saved32LE-1
--- a/ext/pcre/pcrelib/testdata/testinput1
+++ b/ext/pcre/pcrelib/testdata/testinput1
@ -111,7 +111,7 @@
    bababbc
    babababc
-/^\ca\cA\c[\c{\c:/
+/^\ca\cA\c[;\c:/
    \x01\x01\e;z
 /^[ab\]cde]/
@ -4938,6 +4938,12 @@ however, we need the complication for Perl. ---/
 /((?(R1)a+|(?1)b))/
    aaaabcde
 /((?(R)a|(?1)))*/
    aaa
 /((?(R)a|(?1)))+/
    aaa
 /a(*:any 
 name)/K
    abc
@ -5666,4 +5672,52 @@ AbcdCBefgBhiBqz
 /(a\Kb)*/+
    ababc
 /(?:x|(?:(xx|yy)+|x|x|x|x|x)|a|a|a)bc/
    acb
 '\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")++\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 '\A([^\"1]++|[\"2]([^\"3]*+|[\"4][\"5])*+[\"6])++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 /^\w+(?>\s*)(?<=\w)/
  test test
 /(?P<same>a)(?P<same>b)/gJ
    abbaba
 /(?P<same>a)(?P<same>b)(?P=same)/gJ
    abbaba
 /(?P=same)?(?P<same>a)(?P<same>b)/gJ
    abbaba
 /(?:(?P=same)?(?:(?P<same>a)|(?P<same>b))(?P=same))+/gJ
    bbbaaabaabb
 /(?:(?P=same)?(?:(?P=same)(?P<same>a)(?P=same)|(?P=same)?(?P<same>b)(?P=same)){2}(?P=same)(?P<same>c)(?P=same)){2}(?P<same>z)?/gJ
    bbbaaaccccaaabbbcc
 /(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
    acl
    bdl
    adl
    bcl    
 /\sabc/
    \x{0b}abc
 /[\Qa]\E]+/
    aa]]
 /[\Q]a\E]+/
    aa]]
 /-- End of testinput1 --/
--- a/ext/pcre/pcrelib/testdata/testinput11
+++ b/ext/pcre/pcrelib/testdata/testinput11
@ -132,4 +132,6 @@ is required for these tests. --/
 /abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B
 /(((a\2)|(a*)\g<-1>))*a?/B
 /-- End of testinput11 --/
--- a/ext/pcre/pcrelib/testdata/testinput16
+++ b/ext/pcre/pcrelib/testdata/testinput16
@ -32,4 +32,10 @@
 /[[:blank:]]/WBZ
 /\x{212a}+/i8SI
    KKkk\x{212a}
 /s+/i8SI
    SSss\x{17f}
 /-- End of testinput16 --/
--- a/ext/pcre/pcrelib/testdata/testinput18
+++ b/ext/pcre/pcrelib/testdata/testinput18
@ -207,7 +207,7 @@ correctly, but that messes up comparisons). --/
    CDBABC
    \x{2000}ABC 
-/\R*A/SI8
+/\R*A/SI8<bsr_unicode>
    CDBABC
    \x{2028}A  
--- a/ext/pcre/pcrelib/testdata/testinput19
+++ b/ext/pcre/pcrelib/testdata/testinput19
@ -19,4 +19,10 @@
 /[[:blank:]]/WBZ
 /\x{212a}+/i8SI
    KKkk\x{212a}
 /s+/i8SI
    SSss\x{17f}
 /-- End of testinput19 --/ 
--- a/ext/pcre/pcrelib/testdata/testinput2
+++ b/ext/pcre/pcrelib/testdata/testinput2
@ -907,6 +907,9 @@
 /\U/I
 /a{1,3}b/U
    ab
 /[/I
 /[a-/I
@ -4032,6 +4035,8 @@ backtracking verbs. --/
 /(?(R&6yh)abc)/
 /(((a\2)|(a*)\g<-1>))*a?/BZ
 /-- Test the ugly "start or end of word" compatibility syntax --/
 /[[:<:]]red[[:>:]]/BZ
@ -4045,4 +4050,32 @@ backtracking verbs. --/
 /[a[:<:]] should give error/ 
 /(?=ab\K)/+
    abcd
 /abcd/f<lf>
    xx\nxabcd
 / -- Test stack check external calls --/ 
 /(((((a)))))/Q0
 /(((((a)))))/Q1
 /(((((a)))))/Q
 /^\w+(?>\s*)(?<=\w)/BZ
 /\othing/
 /\o{}/
 /\o{whatever}/
 /\xthing/
 /\x{}/
 /\x{whatever}/
 /-- End of testinput2 --/
--- a/ext/pcre/pcrelib/testdata/testinput25
+++ b/ext/pcre/pcrelib/testdata/testinput25
@ -1,6 +1,6 @@
 /-- Tests for the 32-bit library only */
-< forbid 8w
+< forbid 8W
 /-- Check maximum character size --/
--- a/ext/pcre/pcrelib/testdata/testinput3
+++ b/ext/pcre/pcrelib/testdata/testinput3
@ -1,6 +1,9 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale. 
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
-    It is not Perl-compatible. There is different version called wintestinput3
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
-  f  or use on Windows, where the locale is called "french". --/
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
    different version of this file called wintestinput3 for use on Windows,
    where the locale is called "french" and the tests are run using
    RunTest.bat. --/
 < forbid 8W 
--- a/ext/pcre/pcrelib/testdata/testinput4
+++ b/ext/pcre/pcrelib/testdata/testinput4
@ -716,4 +716,10 @@
 /^a+[a\x{200}]/8
    aa
 /^.\B.\B./8
    \x{10123}\x{10124}\x{10125}
 /^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/8
    #\x{10000}#\x{100}#\x{10ffff}#
 /-- End of testinput4 --/
--- a/ext/pcre/pcrelib/testdata/testinput5
+++ b/ext/pcre/pcrelib/testdata/testinput5
@ -788,4 +788,6 @@
 /^a+[a\x{200}]/8BZ
    aa
 /[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/8BZ
 /-- End of testinput5 --/
--- a/ext/pcre/pcrelib/testdata/testinput6
+++ b/ext/pcre/pcrelib/testdata/testinput6
@ -421,8 +421,8 @@
 /^[\p{Arabic}]/8
    \x{06e9}
    \x{060b}
    \x{061c}
    ** Failers
    \x{061c}
    X\x{06e9}   
 /^[\P{Yi}]/8
@ -1484,4 +1484,16 @@
    \x{a1}\x{a7}  
    \x{37e} 
 /[RST]+/8iW
    Ss\x{17f}
 /[R-T]+/8iW 
    Ss\x{17f}
 /[q-u]+/8iW 
    Ss\x{17f}
 /^s?c/mi8
    scat
 /-- End of testinput6 --/
--- a/ext/pcre/pcrelib/testdata/testinput7
+++ b/ext/pcre/pcrelib/testdata/testinput7
@ -829,4 +829,13 @@ of case for anything other than the ASCII letters. --/
 /\d+\s{0,5}=\s*\S?=\w{0,4}\W*/8WBZ
 /[RST]+/8iWBZ
 /[R-T]+/8iWBZ 
 /[Q-U]+/8iWBZ 
 /^s?c/mi8I
    scat
 /-- End of testinput7 --/
--- a/ext/pcre/pcrelib/testdata/testinput8
+++ b/ext/pcre/pcrelib/testdata/testinput8
@ -4831,4 +4831,10 @@
 /[ab]{2,}?/
    aaaa    
 '\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 /-- End of testinput8 --/
--- a/ext/pcre/pcrelib/testdata/testoutput1
+++ b/ext/pcre/pcrelib/testdata/testoutput1
@ -223,7 +223,7 @@ No match
    babababc
 No match
-/^\ca\cA\c[\c{\c:/
+/^\ca\cA\c[;\c:/
    \x01\x01\e;z
 0: \x01\x01\x1b;z
@ -8235,6 +8235,16 @@ MK: M
 0: aaaab
 1: aaaab
 /((?(R)a|(?1)))*/
    aaa
 0: aaa
 1: a
 /((?(R)a|(?1)))+/
    aaa
 0: aaa
 1: a
 /a(*:any 
 name)/K
    abc
@ -9313,4 +9323,92 @@ No match
 0+ c
 1: ab
 /(?:x|(?:(xx|yy)+|x|x|x|x|x)|a|a|a)bc/
    acb
 No match
 '\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")++\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 '\A([^\"1]++|[\"2]([^\"3]*+|[\"4][\"5])*+[\"6])++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 1:  AFTER 
 2: 
 /^\w+(?>\s*)(?<=\w)/
  test test
 0: tes
 /(?P<same>a)(?P<same>b)/gJ
    abbaba
 0: ab
 1: a
 2: b
 0: ab
 1: a
 2: b
 /(?P<same>a)(?P<same>b)(?P=same)/gJ
    abbaba
 0: aba
 1: a
 2: b
 /(?P=same)?(?P<same>a)(?P<same>b)/gJ
    abbaba
 0: ab
 1: a
 2: b
 0: ab
 1: a
 2: b
 /(?:(?P=same)?(?:(?P<same>a)|(?P<same>b))(?P=same))+/gJ
    bbbaaabaabb
 0: bbbaaaba
 1: a
 2: b
 0: bb
 1: <unset>
 2: b
 /(?:(?P=same)?(?:(?P=same)(?P<same>a)(?P=same)|(?P=same)?(?P<same>b)(?P=same)){2}(?P=same)(?P<same>c)(?P=same)){2}(?P<same>z)?/gJ
    bbbaaaccccaaabbbcc
 No match
 /(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
    acl
 0: acl
 1: a
    bdl
 0: bdl
 1: <unset>
 2: b
    adl
 0: dl
    bcl    
 0: l
 /\sabc/
    \x{0b}abc
 0: \x0babc
 /[\Qa]\E]+/
    aa]]
 0: aa]]
 /[\Q]a\E]+/
    aa]]
 0: aa]]
 /-- End of testinput1 --/
--- a/ext/pcre/pcrelib/testdata/testoutput11-16
+++ b/ext/pcre/pcrelib/testdata/testoutput11-16
@ -709,4 +709,28 @@ Memory allocation (code space): 14
 62     End
 ------------------------------------------------------------------
 /(((a\2)|(a*)\g<-1>))*a?/B
 ------------------------------------------------------------------
  0  39 Bra
  2     Brazero
  3  32 SCBra 1
  6  27 Once
  8  12 CBra 2
 11   7 CBra 3
 14     a
 16     \2
 18   7 Ket
 20  11 Alt
 22   5 CBra 4
 25     a*
 27   5 Ket
 29  22 Recurse
 31  23 Ket
 33  27 Ket
 35  32 KetRmax
 37     a?+
 39  39 Ket
 41     End
 ------------------------------------------------------------------
 /-- End of testinput11 --/
--- a/ext/pcre/pcrelib/testdata/testoutput11-32
+++ b/ext/pcre/pcrelib/testdata/testoutput11-32
@ -709,4 +709,28 @@ Memory allocation (code space): 28
 62     End
 ------------------------------------------------------------------
 /(((a\2)|(a*)\g<-1>))*a?/B
 ------------------------------------------------------------------
  0  39 Bra
  2     Brazero
  3  32 SCBra 1
  6  27 Once
  8  12 CBra 2
 11   7 CBra 3
 14     a
 16     \2
 18   7 Ket
 20  11 Alt
 22   5 CBra 4
 25     a*
 27   5 Ket
 29  22 Recurse
 31  23 Ket
 33  27 Ket
 35  32 KetRmax
 37     a?+
 39  39 Ket
 41     End
 ------------------------------------------------------------------
 /-- End of testinput11 --/
--- a/ext/pcre/pcrelib/testdata/testoutput11-8
+++ b/ext/pcre/pcrelib/testdata/testoutput11-8
@ -709,4 +709,28 @@ Memory allocation (code space): 10
 76     End
 ------------------------------------------------------------------
 /(((a\2)|(a*)\g<-1>))*a?/B
 ------------------------------------------------------------------
  0  57 Bra
  3     Brazero
  4  48 SCBra 1
  9  40 Once
 12  18 CBra 2
 17  10 CBra 3
 22     a
 24     \2
 27  10 Ket
 30  16 Alt
 33   7 CBra 4
 38     a*
 40   7 Ket
 43  33 Recurse
 46  34 Ket
 49  40 Ket
 52  48 KetRmax
 55     a?+
 57  57 Ket
 60     End
 ------------------------------------------------------------------
 /-- End of testinput11 --/
--- a/ext/pcre/pcrelib/testdata/testoutput12
+++ b/ext/pcre/pcrelib/testdata/testoutput12
@ -8,7 +8,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
 JIT study was successful
 /(?(?C1)(?=a)a)/S+I
@ -27,7 +27,7 @@ No options
 No first char
 No need char
 Subject length lower bound = -1
-No set of starting bytes
+No starting char list
 JIT study was not successful
 /abc/S+I>testsavedregex
@ -36,7 +36,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
 JIT study was successful
 Compiled pattern written to testsavedregex
 Study data written to testsavedregex
@ -165,7 +165,7 @@ No options
 First char = 'a'
 Need char = 'd'
 Subject length lower bound = 4
-No set of starting bytes
+No starting char list
 JIT study was successful
 /(*NO_START_OPT)a(*:m)b/KS++
--- a/ext/pcre/pcrelib/testdata/testoutput13
+++ b/ext/pcre/pcrelib/testdata/testoutput13
@ -8,7 +8,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
 JIT support is not available in this version of PCRE
 /a*/SI
--- a/ext/pcre/pcrelib/testdata/testoutput14
+++ b/ext/pcre/pcrelib/testdata/testoutput14
@ -361,7 +361,7 @@ Options: extended
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
+Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
  9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e 
  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f 
@ -388,7 +388,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 
+Starting chars: \x09 \x20 \xa0 
 /\H/SI
 Capturing subpattern count = 0
@ -396,7 +396,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\v/SI
 Capturing subpattern count = 0
@ -404,7 +404,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
+Starting chars: \x0a \x0b \x0c \x0d \x85 
 /\V/SI
 Capturing subpattern count = 0
@ -412,7 +412,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\R/SI
 Capturing subpattern count = 0
@ -420,7 +420,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
+Starting chars: \x0a \x0b \x0c \x0d \x85 
 /[\h]/BZ
 ------------------------------------------------------------------
--- a/ext/pcre/pcrelib/testdata/testoutput15
+++ b/ext/pcre/pcrelib/testdata/testoutput15
@ -481,7 +481,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
@ -519,7 +519,7 @@ Options: utf
 First char = \x{c4}
 Need char = \x{80}
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
  \x{100}\x{100}\x{100}\x{100\x{100}
 0: \x{100}\x{100}\x{100}
@ -539,7 +539,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xc4 
+Starting chars: x \xc4 
 /(\x{100}*a|x)/8SDZ
 ------------------------------------------------------------------
@ -558,7 +558,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xc4 
+Starting chars: a x \xc4 
 /(\x{100}{0,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -577,7 +577,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xc4 
+Starting chars: a x \xc4 
 /(\x{100}{1,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -597,7 +597,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xc4 
+Starting chars: x \xc4 
 /\x{100}/8DZ
 ------------------------------------------------------------------
@ -799,7 +799,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3 
+Starting chars: \x09 \x20 \xc2 \xe1 \xe2 \xe3 
    ABC\x{09}
 0: \x{09}
    ABC\x{20}
@ -825,7 +825,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2 
    ABC\x{0a}
 0: \x{0a}
    ABC\x{0b}
@ -845,7 +845,7 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 
+Starting chars: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 
    CDBABC
 0: A
@ -855,7 +855,7 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2 
 /\s?xxx\s/8SI
 Capturing subpattern count = 0
@ -863,7 +863,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x 
 /\sxxx\s/I8ST1
 Capturing subpattern count = 0
@ -871,7 +871,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \xc2 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@ -883,15 +883,15 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 \xc4 
-  \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 
+  \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 
-  \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 
+  \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 
-  \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 
+  \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 
-  \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
+  \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
    \x{a2} \x{84} 
 0: \x{a2} \x{84}
    A Z 
@ -917,7 +917,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \xe1 
+Starting chars: \xe1 
 /\x{1234}+?/iS8I
 Capturing subpattern count = 0
@ -925,7 +925,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \xe1 
+Starting chars: \xe1 
 /\x{1234}++/iS8I
 Capturing subpattern count = 0
@ -933,7 +933,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \xe1 
+Starting chars: \xe1 
 /\x{1234}{2}/iS8I
 Capturing subpattern count = 0
@ -941,7 +941,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: \xe1 
+Starting chars: \xe1 
 /[^\x{c4}]/8DZ
 ------------------------------------------------------------------
@ -974,7 +974,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 
+Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2 
 /\777/8DZ
 ------------------------------------------------------------------
--- a/ext/pcre/pcrelib/testdata/testoutput16
+++ b/ext/pcre/pcrelib/testdata/testoutput16
@ -64,7 +64,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 17
-Starting byte set: \xd0 \xd1 
+Starting chars: \xd0 \xd1 
    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
@ -92,7 +92,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 
+Starting chars: \x09 \x20 \xa0 
 /\v/SI
 Capturing subpattern count = 0
@ -100,7 +100,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
+Starting chars: \x0a \x0b \x0c \x0d \x85 
 /\R/SI
 Capturing subpattern count = 0
@ -108,7 +108,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 
+Starting chars: \x0a \x0b \x0c \x0d \x85 
 /[[:blank:]]/WBZ
 ------------------------------------------------------------------
@ -118,4 +118,24 @@ Starting byte set: \x0a \x0b \x0c \x0d \x85
        End
 ------------------------------------------------------------------
 /\x{212a}+/i8SI
 Capturing subpattern count = 0
 Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
 Starting chars: K k \xe2 
    KKkk\x{212a}
 0: KKkk\x{212a}
 /s+/i8SI
 Capturing subpattern count = 0
 Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
 Starting chars: S s \xc5 
    SSss\x{17f}
 0: SSss\x{17f}
 /-- End of testinput16 --/
--- a/ext/pcre/pcrelib/testdata/testoutput17
+++ b/ext/pcre/pcrelib/testdata/testoutput17
@ -228,7 +228,7 @@ Options: extended
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
+Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 
  9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e 
  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff 
@ -274,7 +274,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff 
+Starting chars: \x09 \x20 \xa0 \xff 
    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
 0: \x{1680}\x{2000}\x{202f}\x{3000}
    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
@ -292,7 +292,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x09 \x20 \xa0 \xff 
    \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
 0: \x{1680}\x{2000}\x{202f}\x{3000}
    \x{3001}\x{2fff}\x{200a}\xa0\x{2000}
@ -304,7 +304,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
    \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
 0: \x{167f}\x{1681}\x{180d}\x{180f}
    \x{2000}\x{200a}\x{1fff}\x{200b}
@ -330,7 +330,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
    \x{2027}\x{2030}\x{2028}\x{2029}
 0: \x{2028}\x{2029}
    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
@ -348,7 +348,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
    \x{2027}\x{2030}\x{2028}\x{2029}
 0: \x{2028}\x{2029}
    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
@ -360,7 +360,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
    \x{2028}\x{2029}\x{2027}\x{2030}
 0: \x{2027}\x{2030}
    \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86
@ -378,7 +378,7 @@ Options: bsr_unicode
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
    \x{2027}\x{2030}\x{2028}\x{2029}
 0: \x{2028}\x{2029}
    \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d
@ -534,18 +534,18 @@ MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789AB
 ------------------------------------------------------------------
        Bra
        a*
-        [b-\x{200}]?+
+        [b-\xff\x{100}-\x{200}]?+
        a#
        a*+
-        [b-\x{200}]?
+        [b-\xff\x{100}-\x{200}]?
        b#
-        [a-f]*
+        [a-f]*+
-        [g-\x{200}]*+
+        [g-\xff\x{100}-\x{200}]*+
        #
-        [g-\x{200}]*
+        [g-\xff\x{100}-\x{200}]*+
        [a-c]*+
        #
-        [g-\x{200}]*
+        [g-\xff\x{100}-\x{200}]*
        [a-h]*+
        Ket
        End
--- a/ext/pcre/pcrelib/testdata/testoutput18-16
+++ b/ext/pcre/pcrelib/testdata/testoutput18-16
@ -339,7 +339,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
@ -378,7 +378,7 @@ Options: utf
 First char = \x{100}
 Need char = \x{100}
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
  \x{100}\x{100}\x{100}\x{100\x{100}
 0: \x{100}\x{100}\x{100}
@ -398,7 +398,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xff 
+Starting chars: x \xff 
 /(\x{100}*a|x)/8SDZ
 ------------------------------------------------------------------
@ -417,7 +417,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xff 
+Starting chars: a x \xff 
 /(\x{100}{0,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -436,7 +436,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xff 
+Starting chars: a x \xff 
 /(\x{100}{1,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -456,7 +456,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xff 
+Starting chars: x \xff 
 /\x{100}/8DZ
 ------------------------------------------------------------------
@ -666,7 +666,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff 
+Starting chars: \x09 \x20 \xa0 \xff 
    ABC\x{09}
 0: \x{09}
    ABC\x{20}
@ -692,7 +692,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
    ABC\x{0a}
 0: \x{0a}
    ABC\x{0b}
@ -712,19 +712,19 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xa0 \xff 
+Starting chars: \x09 \x20 A \xa0 \xff 
    CDBABC
 0: A
    \x{2000}ABC 
 0: \x{2000}A
-/\R*A/SI8
+/\R*A/SI8<bsr_unicode>
 Capturing subpattern count = 0
-Options: utf
+Options: bsr_unicode utf
 No first char
 Need char = 'A'
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d A \x85 \xff 
    CDBABC
 0: A
    \x{2028}A  
@ -736,7 +736,7 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
 /\s?xxx\s/8SI
 Capturing subpattern count = 0
@ -744,7 +744,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x 
 /\sxxx\s/I8ST1
 Capturing subpattern count = 0
@ -752,7 +752,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@ -764,20 +764,20 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 
-  \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 
+  \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 
-  \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 
+  \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4 
-  \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 
+  \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 
-  \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 
+  \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 
-  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
+  \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 
-  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
+  \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 
-  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
+  \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef 
-  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
+  \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe 
-  \xfe \xff 
+  \xff 
    \x{a2} \x{84}
 0: \x{a2} \x{84}
    A Z
@ -803,7 +803,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}+?/iS8I
 Capturing subpattern count = 0
@ -811,7 +811,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}++/iS8I
 Capturing subpattern count = 0
@ -819,7 +819,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}{2}/iS8I
 Capturing subpattern count = 0
@ -827,7 +827,7 @@ Options: caseless utf
 First char = \x{1234}
 Need char = \x{1234}
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 /[^\x{c4}]/8DZ
 ------------------------------------------------------------------
@ -860,7 +860,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
 /-- Check bad offset --/
--- a/ext/pcre/pcrelib/testdata/testoutput18-32
+++ b/ext/pcre/pcrelib/testdata/testoutput18-32
@ -337,7 +337,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 
  5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y 
@ -376,7 +376,7 @@ Options: utf
 First char = \x{100}
 Need char = \x{100}
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
  \x{100}\x{100}\x{100}\x{100\x{100}
 0: \x{100}\x{100}\x{100}
@ -396,7 +396,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xff 
+Starting chars: x \xff 
 /(\x{100}*a|x)/8SDZ
 ------------------------------------------------------------------
@ -415,7 +415,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xff 
+Starting chars: a x \xff 
 /(\x{100}{0,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -434,7 +434,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a x \xff 
+Starting chars: a x \xff 
 /(\x{100}{1,2}a|x)/8SDZ
 ------------------------------------------------------------------
@ -454,7 +454,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x \xff 
+Starting chars: x \xff 
 /\x{100}/8DZ
 ------------------------------------------------------------------
@ -663,7 +663,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 \xa0 \xff 
+Starting chars: \x09 \x20 \xa0 \xff 
    ABC\x{09}
 0: \x{09}
    ABC\x{20}
@ -689,7 +689,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
    ABC\x{0a}
 0: \x{0a}
    ABC\x{0b}
@ -709,19 +709,19 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 1
-Starting byte set: \x09 \x20 A \xa0 \xff 
+Starting chars: \x09 \x20 A \xa0 \xff 
    CDBABC
 0: A
    \x{2000}ABC 
 0: \x{2000}A
-/\R*A/SI8
+/\R*A/SI8<bsr_unicode>
 Capturing subpattern count = 0
-Options: utf
+Options: bsr_unicode utf
 No first char
 Need char = 'A'
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d A \x85 \xff 
    CDBABC
 0: A
    \x{2028}A  
@ -733,7 +733,7 @@ Options: utf
 No first char
 Need char = 'A'
 Subject length lower bound = 2
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
 /\s?xxx\s/8SI
 Capturing subpattern count = 0
@ -741,7 +741,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 4
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x 
 /\sxxx\s/I8ST1
 Capturing subpattern count = 0
@ -749,7 +749,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@ -761,20 +761,20 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 
-  \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 
+  \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 
-  \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 
+  \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4 
-  \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 
+  \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 
-  \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 
+  \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 
-  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
+  \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 
-  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
+  \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 
-  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
+  \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef 
-  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
+  \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe 
-  \xfe \xff 
+  \xff 
    \x{a2} \x{84}
 0: \x{a2} \x{84}
    A Z
@ -800,7 +800,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}+?/iS8I
 Capturing subpattern count = 0
@ -808,7 +808,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}++/iS8I
 Capturing subpattern count = 0
@ -816,7 +816,7 @@ Options: caseless utf
 First char = \x{1234}
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /\x{1234}{2}/iS8I
 Capturing subpattern count = 0
@ -824,7 +824,7 @@ Options: caseless utf
 First char = \x{1234}
 Need char = \x{1234}
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 /[^\x{c4}]/8DZ
 ------------------------------------------------------------------
@ -857,7 +857,7 @@ Options: utf
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff 
+Starting chars: \x0a \x0b \x0c \x0d \x85 \xff 
 /-- Check bad offset --/
--- a/ext/pcre/pcrelib/testdata/testoutput19
+++ b/ext/pcre/pcrelib/testdata/testoutput19
@ -55,7 +55,7 @@ Options: caseless utf
 First char = \x{401} (caseless)
 Need char = \x{42f} (caseless)
 Subject length lower bound = 17
-No set of starting bytes
+No starting char list
    \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}
    \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f}
@ -85,4 +85,24 @@ No set of starting bytes
        End
 ------------------------------------------------------------------
 /\x{212a}+/i8SI
 Capturing subpattern count = 0
 Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
 Starting chars: K k \xff 
    KKkk\x{212a}
 0: KKkk\x{212a}
 /s+/i8SI
 Capturing subpattern count = 0
 Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
 Starting chars: S s \xff 
    SSss\x{17f}
 0: SSss\x{17f}
 /-- End of testinput19 --/ 
--- a/ext/pcre/pcrelib/testdata/testoutput2
+++ b/ext/pcre/pcrelib/testdata/testoutput2
@ -178,7 +178,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: c d e 
+Starting chars: c d e 
    this sentence eventually mentions a cat
 0: cat
    this sentences rambles on and on for a while and then reaches elephant
@ -190,7 +190,7 @@ Options: caseless
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: C D E c d e 
+Starting chars: C D E c d e 
    this sentence eventually mentions a CAT cat
 0: CAT
    this sentences rambles on and on for a while to elephant ElePhant
@ -202,7 +202,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(a|[^\dZ])/IS
 Capturing subpattern count = 1
@ -210,7 +210,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a 
  \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 
  \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = > 
  ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y [ \ ] ^ _ ` a b c d 
@ -231,7 +231,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 a b 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 a b 
 /(ab\2)/
 Failed: reference to non-existent subpattern at offset 6
@ -512,7 +512,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(?i)[abcd]/IS
 Capturing subpattern count = 0
@ -520,7 +520,7 @@ Options: caseless
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: A B C D a b c d 
+Starting chars: A B C D a b c d 
 /(?m)[xy]|(b|c)/IS
 Capturing subpattern count = 1
@ -528,7 +528,7 @@ Options: multiline
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: b c x y 
+Starting chars: b c x y 
 /(^a|^b)/Im
 Capturing subpattern count = 1
@ -591,7 +591,7 @@ No options
 First char = 'b' (caseless)
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /(a*b|(?i:c*(?-i)d))/IS
 Capturing subpattern count = 1
@ -599,7 +599,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: C a b c d 
+Starting chars: C a b c d 
 /a$/I
 Capturing subpattern count = 0
@ -666,7 +666,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 /(?<!foo)(alpha|omega)/IS
 Capturing subpattern count = 1
@ -675,7 +675,7 @@ No options
 No first char
 Need char = 'a'
 Subject length lower bound = 5
-Starting byte set: a o 
+Starting chars: a o 
 /(?!alphabet)[ab]/IS
 Capturing subpattern count = 0
@ -683,7 +683,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 /(?<=foo\n)^bar/Im
 Capturing subpattern count = 0
@ -1642,7 +1642,7 @@ Options: anchored
 No first char
 Need char = 'd'
 Subject length lower bound = 4
-No set of starting bytes
+No starting char list
 /\(             # ( at start
  (?:           # Non-capturing bracket
@ -1875,7 +1875,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
+Starting chars: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
  _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
 /^[[:ascii:]]/DZ
@ -1937,7 +1937,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 
 /^[[:cntrl:]]/DZ
 ------------------------------------------------------------------
@ -3178,6 +3178,10 @@ Failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 1
 /\U/I
 Failed: PCRE does not support \L, \l, \N{name}, \U, or \u at offset 1
 /a{1,3}b/U
    ab
 0: ab
 /[/I
 Failed: missing terminating ] for character class at offset 1
@ -3434,7 +3438,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 /[^a]/I
 Capturing subpattern count = 0
@ -3454,7 +3458,7 @@ No options
 No first char
 Need char = '6'
 Subject length lower bound = 4
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 
 /a^b/I
 Capturing subpattern count = 0
@ -3488,7 +3492,7 @@ Options: caseless
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: A B a b 
+Starting chars: A B a b 
 /[ab](?i)cd/IS
 Capturing subpattern count = 0
@ -3496,7 +3500,7 @@ No options
 No first char
 Need char = 'd' (caseless)
 Subject length lower bound = 3
-Starting byte set: a b 
+Starting chars: a b 
 /abc(?C)def/I
 Capturing subpattern count = 0
@ -3537,7 +3541,7 @@ No options
 No first char
 Need char = 'f'
 Subject length lower bound = 7
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 
    1234abcdef
 --->1234abcdef
  1 ^              \d
@ -3856,7 +3860,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 /(?R)/I
 Failed: recursive call could loop indefinitely at offset 3
@ -4637,7 +4641,7 @@ Options: caseless
 No first char
 Need char = 'g' (caseless)
 Subject length lower bound = 8
-No set of starting bytes
+No starting char list
     Baby Bjorn Active Carrier - With free SHIPPING!!
 0: Baby Bjorn Active Carrier - With free SHIPPING!!
 1: Baby Bjorn Active Carrier - With free SHIPPING!!
@ -4656,7 +4660,7 @@ No options
 No first char
 Need char = 'b'
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /(a|b)*.?c/ISDZ
 ------------------------------------------------------------------
@ -4677,7 +4681,7 @@ No options
 No first char
 Need char = 'c'
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /abc(?C255)de(?C)f/DZ
 ------------------------------------------------------------------
@ -4750,7 +4754,7 @@ Options:
 No first char
 Need char = 'b'
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
  ab
 --->ab
 +0 ^      a*
@ -4893,7 +4897,7 @@ Options:
 No first char
 Need char = 'x'
 Subject length lower bound = 4
-Starting byte set: a d 
+Starting chars: a d 
  abcx
 --->abcx
 +0 ^        (abc|def)
@ -5127,7 +5131,7 @@ Options:
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b x 
+Starting chars: a b x 
    Note: that { does NOT introduce a quantifier
 --->Note: that { does NOT introduce a quantifier
 +0         ^                                        ([ab]{,4}c|xy)
@ -5607,7 +5611,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
 Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
@ -5642,7 +5646,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
 Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
@ -5677,7 +5681,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
@ -5716,7 +5720,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b 
+Starting chars: a b 
 Compiled pattern written to testsavedregex
 Study data written to testsavedregex
 <testsavedregex
@ -5817,13 +5821,13 @@ No match
 No match
 /a{11111111111111111111}/I
-Failed: number too big in {} quantifier at offset 22
+Failed: number too big in {} quantifier at offset 8
 /(){64294967295}/I
-Failed: number too big in {} quantifier at offset 14
+Failed: number too big in {} quantifier at offset 9
 /(){2,4294967295}/I
-Failed: number too big in {} quantifier at offset 15
+Failed: number too big in {} quantifier at offset 11
 "(?i:a)(?i:b)(?i:c)(?i:d)(?i:e)(?i:f)(?i:g)(?i:h)(?i:i)(?i:j)(k)(?i:l)A\1B"I
 Capturing subpattern count = 1
@ -6431,7 +6435,7 @@ No options
 No first char
 Need char = ','
 Subject length lower bound = 1
-Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 , 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 , 
    \x0b,\x0b
 0: \x0b,\x0b
    \x0c,\x0d
@ -6738,7 +6742,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: C a b c d 
+Starting chars: C a b c d 
 /()[ab]xyz/IS
 Capturing subpattern count = 1
@ -6746,7 +6750,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b 
+Starting chars: a b 
 /(|)[ab]xyz/IS
 Capturing subpattern count = 1
@ -6754,7 +6758,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b 
+Starting chars: a b 
 /(|c)[ab]xyz/IS
 Capturing subpattern count = 1
@ -6762,7 +6766,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b c 
+Starting chars: a b c 
 /(|c?)[ab]xyz/IS
 Capturing subpattern count = 1
@ -6770,7 +6774,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b c 
+Starting chars: a b c 
 /(d?|c?)[ab]xyz/IS
 Capturing subpattern count = 1
@ -6778,7 +6782,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(d?|c)[ab]xyz/IS
 Capturing subpattern count = 1
@ -6786,7 +6790,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 4
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /^a*b\d/DZ
 ------------------------------------------------------------------
@ -6879,7 +6883,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(a+|b*)[cd]/IS
 Capturing subpattern count = 1
@ -6887,7 +6891,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(a*|b+)[cd]/IS
 Capturing subpattern count = 1
@ -6895,7 +6899,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /(a+|b+)[cd]/IS
 Capturing subpattern count = 1
@ -6903,7 +6907,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
 ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
@ -9307,7 +9311,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: x y z 
+Starting chars: x y z 
 /(?(?=.*b)b|^)/CI
 Capturing subpattern count = 0
@ -10096,7 +10100,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /(a|bc)\1{2,3}/SI
 Capturing subpattern count = 1
@ -10105,7 +10109,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: a b 
+Starting chars: a b 
 /(a|bc)(?1)/SI
 Capturing subpattern count = 1
@ -10113,7 +10117,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /(a|b\1)(a|b\1)/SI
 Capturing subpattern count = 2
@ -10122,7 +10126,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /(a|b\1){2}/SI
 Capturing subpattern count = 1
@ -10131,7 +10135,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /(a|bbbb\1)(a|bbbb\1)/SI
 Capturing subpattern count = 2
@ -10140,7 +10144,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /(a|bbbb\1){2}/SI
 Capturing subpattern count = 1
@ -10149,7 +10153,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 2
-Starting byte set: a b 
+Starting chars: a b 
 /^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/SI
 Capturing subpattern count = 1
@ -10157,7 +10161,7 @@ Options: anchored
 No first char
 Need char = ':'
 Subject length lower bound = 22
-No set of starting bytes
+No starting char list
 /<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/isIS
 Capturing subpattern count = 11
@ -10165,7 +10169,7 @@ Options: caseless dotall
 First char = '<'
 Need char = '>'
 Subject length lower bound = 47
-No set of starting bytes
+No starting char list
 "(?>.*/)foo"SI
 Capturing subpattern count = 0
@ -10173,7 +10177,7 @@ No options
 No first char
 Need char = 'o'
 Subject length lower bound = 4
-No set of starting bytes
+No starting char list
 /(?(?=[^a-z]+[a-z])  \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} ) /xSI
 Capturing subpattern count = 0
@ -10181,7 +10185,7 @@ Options: extended
 No first char
 Need char = '-'
 Subject length lower bound = 8
-No set of starting bytes
+No starting char list
 /(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/iSI
 Capturing subpattern count = 1
@ -10189,7 +10193,7 @@ Options: caseless
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: A B C a b c 
+Starting chars: A B C a b c 
 /(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/SI
 Capturing subpattern count = 0
@ -10197,7 +10201,7 @@ No options
 No first char
 Need char = 'b'
 Subject length lower bound = 41
-Starting byte set: c d 
+Starting chars: c d 
 /<a[\s]+href[\s]*=[\s]*          # find <a href=
 ([\"\'])?                       # find single or double quote
@ -10210,7 +10214,7 @@ Options: caseless extended dotall
 First char = '<'
 Need char = '='
 Subject length lower bound = 9
-No set of starting bytes
+No starting char list
 /^(?!:)                       # colon disallowed at start
  (?:                         # start of item
@ -10226,7 +10230,7 @@ Options: anchored caseless extended
 No first char
 Need char = ':'
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 /(?|(?<a>A)|(?<a>B))/I
 Capturing subpattern count = 1
@ -10450,7 +10454,7 @@ Options:
 No first char
 Need char = 'a'
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
    cat
 0: a
 1: 
@ -10464,7 +10468,7 @@ No options
 No first char
 Need char = 'a'
 Subject length lower bound = 3
-No set of starting bytes
+No starting char list
    cat
 No match
@ -10476,7 +10480,7 @@ No options
 First char = 'i'
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
    i
 0: i
@ -10486,7 +10490,7 @@ No options
 No first char
 Need char = 'i'
 Subject length lower bound = 1
-Starting byte set: i 
+Starting chars: i 
    ia
 0: ia
 1: 
@ -11080,7 +11084,7 @@ No options
 First char = 'a'
 Need char = '4'
 Subject length lower bound = 5
-No set of starting bytes
+No starting char list
 /([abc])++1234/SI
 Capturing subpattern count = 1
@ -11088,7 +11092,7 @@ No options
 No first char
 Need char = '4'
 Subject length lower bound = 5
-Starting byte set: a b c 
+Starting chars: a b c 
 /(?<=(abc)+)X/
 Failed: lookbehind assertion is not fixed length at offset 10
@ -11369,7 +11373,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /(a(?2)|b)(b(?1)|a)(?:(?1)|(?2))/SI
 Capturing subpattern count = 2
@ -11377,7 +11381,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: a b 
+Starting chars: a b 
 /(a(?2)|b)(b(?1)|a)(?1)(?2)/SI
 Capturing subpattern count = 2
@ -11385,7 +11389,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 4
-Starting byte set: a b 
+Starting chars: a b 
 /(abc)(?1)/SI
 Capturing subpattern count = 1
@ -11393,7 +11397,7 @@ No options
 First char = 'a'
 Need char = 'c'
 Subject length lower bound = 6
-No set of starting bytes
+No starting char list
 /^(?>a)++/
    aa\M
@ -11711,7 +11715,7 @@ No options
 First char = 't'
 Need char = 't'
 Subject length lower bound = 18
-No set of starting bytes
+No starting char list
 /\btype\b\W*?\btext\b\W*?\bjavascript\b|\burl\b\W*?\bshell:|<input\b.*?\btype\b\W*?\bimage\b|\bonkeyup\b\W*?\=/IS
 Capturing subpattern count = 0
@ -11720,7 +11724,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 8
-Starting byte set: < o t u 
+Starting chars: < o t u 
 /a(*SKIP)c|b(*ACCEPT)|/+S!I
 Capturing subpattern count = 0
@ -11729,7 +11733,7 @@ No options
 No first char
 No need char
 Subject length lower bound = -1
-No set of starting bytes
+No starting char list
    a
 0: 
 0+ 
@ -11740,7 +11744,7 @@ No options
 No first char
 No need char
 Subject length lower bound = -1
-Starting byte set: a b x 
+Starting chars: a b x 
    ax
 0: x
@ -12436,7 +12440,7 @@ No options
 No first char
 No need char
 Subject length lower bound = -1
-No set of starting bytes
+No starting char list
 /(?:(a)+(?C1)bb|aa(?C2)b)/
    aab\C+
@ -12722,7 +12726,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 2
-Starting byte set: a z 
+Starting chars: a z 
    aaaaaaaaaaaaaz
 Error -21 (recursion limit exceeded)
    aaaaaaaaaaaaaz\Q1000
@ -12735,7 +12739,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 2
-Starting byte set: a z 
+Starting chars: a z 
    aaaaaaaaaaaaaz
 Error -21 (recursion limit exceeded)
@ -12746,7 +12750,7 @@ No options
 No first char
 Need char = 'z'
 Subject length lower bound = 2
-Starting byte set: a z 
+Starting chars: a z 
    aaaaaaaaaaaaaz
 No match
    aaaaaaaaaaaaaz\Q10
@ -12790,7 +12794,7 @@ Options: dupnames
 First char = 'a'
 Need char = 'z'
 Subject length lower bound = 5
-No set of starting bytes
+No starting char list
 /a*[bcd]/BZ
 ------------------------------------------------------------------
@ -13902,7 +13906,7 @@ No options
 No first char
 Need char = 'd'
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /[a-c]+d/DZS
 ------------------------------------------------------------------
@ -13917,7 +13921,7 @@ No options
 No first char
 Need char = 'd'
 Subject length lower bound = 2
-Starting byte set: a b c 
+Starting chars: a b c 
 /[a-c]?d/DZS
 ------------------------------------------------------------------
@ -13932,7 +13936,7 @@ No options
 No first char
 Need char = 'd'
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /[a-c]{4,6}d/DZS
 ------------------------------------------------------------------
@ -13947,7 +13951,7 @@ No options
 No first char
 Need char = 'd'
 Subject length lower bound = 5
-Starting byte set: a b c 
+Starting chars: a b c 
 /[a-c]{0,6}d/DZS
 ------------------------------------------------------------------
@ -13962,7 +13966,7 @@ No options
 No first char
 Need char = 'd'
 Subject length lower bound = 1
-Starting byte set: a b c d 
+Starting chars: a b c d 
 /-- End of special auto-possessive tests --/
@ -14089,6 +14093,30 @@ Failed: malformed number or name after (?( at offset 4
 /(?(R&6yh)abc)/
 Failed: group name must start with a non-digit at offset 5
 /(((a\2)|(a*)\g<-1>))*a?/BZ
 ------------------------------------------------------------------
        Bra
        Brazero
        SCBra 1
        Once
        CBra 2
        CBra 3
        a
        \2
        Ket
        Alt
        CBra 4
        a*
        Ket
        Recurse
        Ket
        Ket
        KetRmax
        a?+
        Ket
        End
 ------------------------------------------------------------------
 /-- Test the ugly "start or end of word" compatibility syntax --/
 /[[:<:]]red[[:>:]]/BZ
@ -14125,4 +14153,57 @@ No match
 /[a[:<:]] should give error/ 
 Failed: unknown POSIX class name at offset 4
 /(?=ab\K)/+
    abcd
 Start of matched string is beyond its end - displaying from end to start.
 0: ab
 0+ abcd
 /abcd/f<lf>
    xx\nxabcd
 No match
 / -- Test stack check external calls --/ 
 /(((((a)))))/Q0
 /(((((a)))))/Q1
 Failed: parentheses are too deeply nested (stack check) at offset 0
 /(((((a)))))/Q
 ** Missing 0 or 1 after /Q
 /^\w+(?>\s*)(?<=\w)/BZ
 ------------------------------------------------------------------
        Bra
        ^
        \w+
        Once_NC
        \s*+
        Ket
        AssertB
        Reverse
        \w
        Ket
        Ket
        End
 ------------------------------------------------------------------
 /\othing/
 Failed: missing opening brace after \o at offset 1
 /\o{}/
 Failed: digits missing in \x{} or \o{} at offset 1
 /\o{whatever}/
 Failed: non-octal character in \o{} (closing brace missing?) at offset 3
 /\xthing/
 /\x{}/
 Failed: digits missing in \x{} or \o{} at offset 3
 /\x{whatever}/
 Failed: non-hex character in \x{} (closing brace missing?) at offset 3
 /-- End of testinput2 --/
--- a/ext/pcre/pcrelib/testdata/testoutput21-16
+++ b/ext/pcre/pcrelib/testdata/testoutput21-16
@ -50,7 +50,7 @@ Options: anchored extended
 No first char
 No need char
 Subject length lower bound = 6
-No set of starting bytes
+No starting char list
 <!testsaved16BE-1
 Compiled pattern loaded from testsaved16BE-1
@ -83,7 +83,7 @@ Options: anchored extended
 No first char
 No need char
 Subject length lower bound = 6
-No set of starting bytes
+No starting char list
 <!testsaved32LE-1
 Compiled pattern loaded from testsaved32LE-1
--- a/ext/pcre/pcrelib/testdata/testoutput21-32
+++ b/ext/pcre/pcrelib/testdata/testoutput21-32
@ -62,7 +62,7 @@ Options: anchored extended
 No first char
 No need char
 Subject length lower bound = 6
-No set of starting bytes
+No starting char list
 <!testsaved32BE-1
 Compiled pattern loaded from testsaved32BE-1
@ -95,6 +95,6 @@ Options: anchored extended
 No first char
 No need char
 Subject length lower bound = 6
-No set of starting bytes
+No starting char list
 /-- End of testinput21 --/
--- a/ext/pcre/pcrelib/testdata/testoutput22-16
+++ b/ext/pcre/pcrelib/testdata/testoutput22-16
@ -37,7 +37,7 @@ Options: extended utf
 No first char
 No need char
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 <!testsaved16BE-2
 Compiled pattern loaded from testsaved16BE-2
@ -64,7 +64,7 @@ Options: extended utf
 No first char
 No need char
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 <!testsaved32LE-2
 Compiled pattern loaded from testsaved32LE-2
--- a/ext/pcre/pcrelib/testdata/testoutput22-32
+++ b/ext/pcre/pcrelib/testdata/testoutput22-32
@ -49,7 +49,7 @@ Options: extended utf
 No first char
 No need char
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 <!testsaved32BE-2
 Compiled pattern loaded from testsaved32BE-2
@ -76,6 +76,6 @@ Options: extended utf
 No first char
 No need char
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 /-- End of testinput22 --/
--- a/ext/pcre/pcrelib/testdata/testoutput23
+++ b/ext/pcre/pcrelib/testdata/testoutput23
@ -18,7 +18,7 @@ Failed: character value in \x{} or \o{} is too large at offset 8
 /[\H]/BZSI
 ------------------------------------------------------------------
        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}]
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -27,12 +27,25 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b 
  \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a 
  \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 
  : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ 
  _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 
  \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f 
  \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e 
  \x9f \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae 
  \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd 
  \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc 
  \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb 
  \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea 
  \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 
  \xfa \xfb \xfc \xfd \xfe \xff 
 /[\V]/BZSI
 ------------------------------------------------------------------
        Bra
-        [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{ffff}]
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -41,6 +54,19 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e 
  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
  \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > 
  ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c 
  d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 
  \x83 \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 
  \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 
  \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 
  \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf 
  \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce 
  \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd 
  \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec 
  \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb 
  \xfc \xfd \xfe \xff 
 /-- End of testinput23 --/
--- a/ext/pcre/pcrelib/testdata/testoutput25
+++ b/ext/pcre/pcrelib/testdata/testoutput25
@ -1,6 +1,6 @@
 /-- Tests for the 32-bit library only */
-< forbid 8w
+< forbid 8W
 /-- Check maximum character size --/
@ -65,7 +65,7 @@ Need char = \x{800000}
 /[\H]/BZSI
 ------------------------------------------------------------------
        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffffffff}]
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{ffffffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -74,12 +74,25 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b 
  \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a 
  \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 
  : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ 
  _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 
  \x81 \x82 \x83 \x84 \x85 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f 
  \x90 \x91 \x92 \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e 
  \x9f \xa1 \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae 
  \xaf \xb0 \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd 
  \xbe \xbf \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc 
  \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb 
  \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea 
  \xeb \xec \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 
  \xfa \xfb \xfc \xfd \xfe \xff 
 /[\V]/BZSI
 ------------------------------------------------------------------
        Bra
-        [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{ffffffff}]
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{ffffffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -88,6 +101,19 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e 
  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
  \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > 
  ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c 
  d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 
  \x83 \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 
  \x93 \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa0 \xa1 
  \xa2 \xa3 \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 
  \xb1 \xb2 \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf 
  \xc0 \xc1 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce 
  \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd 
  \xde \xdf \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec 
  \xed \xee \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb 
  \xfc \xfd \xfe \xff 
 /-- End of testinput25 --/
--- a/ext/pcre/pcrelib/testdata/testoutput3
+++ b/ext/pcre/pcrelib/testdata/testoutput3
@ -1,6 +1,9 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale. 
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
-    It is not Perl-compatible. There is different version called wintestinput3
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
-    for use on Windows, where the locale is called "french". --/
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
    different version of this file called wintestinput3 for use on Windows,
    where the locale is called "french" and the tests are run using
    RunTest.bat. --/
 < forbid 8W 
@ -90,7 +93,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
 /\w/ISLfr_FR
@ -99,7 +102,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
  ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â 
  ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ 
--- a/ext/pcre/pcrelib/testdata/testoutput4
+++ b/ext/pcre/pcrelib/testdata/testoutput4
@ -1263,4 +1263,12 @@ No match
    aa
 0: aa
 /^.\B.\B./8
    \x{10123}\x{10124}\x{10125}
 0: \x{10123}\x{10124}\x{10125}
 /^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/8
    #\x{10000}#\x{100}#\x{10ffff}#
 0: #\x{10000}#\x{100}#\x{10ffff}#
 /-- End of testinput4 --/
--- a/ext/pcre/pcrelib/testdata/testoutput5
+++ b/ext/pcre/pcrelib/testdata/testoutput5
@ -270,7 +270,7 @@ No match
 /[z-\x{100}]/8DZ
 ------------------------------------------------------------------
        Bra
-        [z-\x{100}]
+        [z-\xff\x{100}]
        Ket
        End
 ------------------------------------------------------------------
@ -812,7 +812,7 @@ No match
 /[\H]/8BZ
 ------------------------------------------------------------------
        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -820,7 +820,7 @@ No match
 /[\V]/8BZ
 ------------------------------------------------------------------
        Bra
-        [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}]
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}]
        Ket
        End
 ------------------------------------------------------------------
@ -1536,7 +1536,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /[^\x{1234}]+?/iS8I   
 Capturing subpattern count = 0
@ -1544,7 +1544,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /[^\x{1234}]++/iS8I   
 Capturing subpattern count = 0
@ -1552,7 +1552,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 1
-No set of starting bytes
+No starting char list
 /[^\x{1234}]{2}/iS8I
 Capturing subpattern count = 0
@ -1560,7 +1560,7 @@ Options: caseless utf
 No first char
 No need char
 Subject length lower bound = 2
-No set of starting bytes
+No starting char list
 //<bsr_anycrlf><bsr_unicode>
 Failed: inconsistent NEWLINE options at offset 0
@ -1620,7 +1620,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
 /[\H\x{d7ff}]+/8BZ
 ------------------------------------------------------------------
        Bra
-        [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]++
+        [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]++
        Ket
        End
 ------------------------------------------------------------------
@ -1660,7 +1660,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
 /[\V\x{d7ff}]+/8BZ
 ------------------------------------------------------------------
        Bra
-        [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]++
+        [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]++
        Ket
        End
 ------------------------------------------------------------------
@ -1882,4 +1882,19 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 5
    aa
 0: aa
 /[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/8BZ
 ------------------------------------------------------------------
        Bra
        [b-d\x{200}-\x{250}]*+
        [ae-h]?+
        #
        [\x{200}-\x{250}]{0,8}+
        [\x00-\xff]*
        #
        [\x{200}-\x{250}]++
        [a-z]
        Ket
        End
 ------------------------------------------------------------------
 /-- End of testinput5 --/
--- a/ext/pcre/pcrelib/testdata/testoutput6
+++ b/ext/pcre/pcrelib/testdata/testoutput6
@ -719,9 +719,9 @@ No match
 0: \x{6e9}
    \x{060b}
 0: \x{60b}
    \x{061c}
 0: \x{61c}
    ** Failers
 No match
    \x{061c}
 No match
    X\x{06e9}   
 No match
@ -2445,4 +2445,20 @@ No match
    \x{37e} 
 No match
 /[RST]+/8iW
    Ss\x{17f}
 0: Ss\x{17f}
 /[R-T]+/8iW 
    Ss\x{17f}
 0: Ss\x{17f}
 /[q-u]+/8iW 
    Ss\x{17f}
 0: Ss\x{17f}
 /^s?c/mi8
    scat
 0: sc
 /-- End of testinput6 --/
--- a/ext/pcre/pcrelib/testdata/testoutput7
+++ b/ext/pcre/pcrelib/testdata/testoutput7
@ -124,7 +124,7 @@ No match
 /[z-\x{100}]/8iDZ 
 ------------------------------------------------------------------
        Bra
-        [Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
+        [Zz-\xff\x{39c}\x{3bc}\x{212b}\x{1e9e}\x{212b}\x{178}\x{100}-\x{101}]
        Ket
        End
 ------------------------------------------------------------------
@ -162,7 +162,7 @@ No match
 /[z-\x{100}]/8DZi
 ------------------------------------------------------------------
        Bra
-        [Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
+        [Zz-\xff\x{39c}\x{3bc}\x{212b}\x{1e9e}\x{212b}\x{178}\x{100}-\x{101}]
        Ket
        End
 ------------------------------------------------------------------
@ -2263,4 +2263,36 @@ No match
        End
 ------------------------------------------------------------------
 /[RST]+/8iWBZ
 ------------------------------------------------------------------
        Bra
        [R-Tr-t\x{17f}]++
        Ket
        End
 ------------------------------------------------------------------
 /[R-T]+/8iWBZ 
 ------------------------------------------------------------------
        Bra
        [R-Tr-t\x{17f}]++
        Ket
        End
 ------------------------------------------------------------------
 /[Q-U]+/8iWBZ 
 ------------------------------------------------------------------
        Bra
        [Q-Uq-u\x{17f}]++
        Ket
        End
 ------------------------------------------------------------------
 /^s?c/mi8I
 Capturing subpattern count = 0
 Options: caseless multiline utf
 First char at start or follows newline
 Need char = 'c' (caseless)
    scat
 0: sc
 /-- End of testinput7 --/
--- a/ext/pcre/pcrelib/testdata/testoutput8
+++ b/ext/pcre/pcrelib/testdata/testoutput8
@ -7232,7 +7232,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 3
-Starting byte set: a d x 
+Starting chars: a d x 
    terhjk;abcdaadsfe
 0: abc
    the quick xyz brown fox 
@ -7777,4 +7777,12 @@ Matched, but offsets vector is too small to show all matches
 1: aaa
 2: aa
 '\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 
 /-- End of testinput8 --/
--- a/ext/pcre/pcrelib/testdata/wintestoutput3
+++ b/ext/pcre/pcrelib/testdata/wintestoutput3
@ -84,7 +84,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
 /\w/ISLfrench
@ -93,7 +93,7 @@ No options
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
  ƒ Š Œ Ž š œ ž Ÿ ª ² ³ µ ¹ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö 
  Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý 
--- a/ext/pcre/pcrelib/ucp.h
+++ b/ext/pcre/pcrelib/ucp.h
@ -192,7 +192,31 @@ enum {
  ucp_Miao,
  ucp_Sharada,
  ucp_Sora_Sompeng,
-  ucp_Takri
+  ucp_Takri,
  /* New for Unicode 7.0.0: */
  ucp_Bassa_Vah,
  ucp_Caucasian_Albanian,
  ucp_Duployan,
  ucp_Elbasan,
  ucp_Grantha,
  ucp_Khojki,
  ucp_Khudawadi,
  ucp_Linear_A,
  ucp_Mahajani,
  ucp_Manichaean,
  ucp_Mende_Kikakui,
  ucp_Modi,
  ucp_Mro,
  ucp_Nabataean,
  ucp_Old_North_Arabian,
  ucp_Old_Permic,
  ucp_Pahawh_Hmong,
  ucp_Palmyrene,
  ucp_Psalter_Pahlavi,
  ucp_Pau_Cin_Hau,
  ucp_Siddham,
  ucp_Tirhuta,
  ucp_Warang_Citi
 };
 #endif
--- a/ext/pcre/upgrade-pcre.php
+++ b/ext/pcre/upgrade-pcre.php
@ -68,11 +68,11 @@ function recurse($path)
 		// always include the config.h file
 		$content    = file_get_contents($newfile);
-		$newcontent = preg_replace('/#\s*ifdef HAVE_CONFIG_H\s*(.+)\s*#\s*endif/', '$1', $content);
+		//$newcontent = preg_replace('/#\s*ifdef HAVE_CONFIG_H\s*(.+)\s*#\s*endif/', '$1', $content);
-		if ($content !== $newcontent) {
+		//if ($content !== $newcontent) {
-			file_put_contents($file, $newcontent);
+		//	file_put_contents($file, $newcontent);
-		}
+		//}
 		echo "OK\n";
 	}