8341923: java.util.Locale class specification improvements

Reviewed-by: liach, naoto
2025-08-26 22:34:27 +02:00 · 2024-12-04 20:57:26 +00:00 · 2024-12-04 20:57:26 +00:00 · ee0f88c901
commit ee0f88c901
parent a72cab8c47
1 changed files with 236 additions and 220 deletions
--- a/src/java.base/share/classes/java/util/Locale.java
+++ b/src/java.base/share/classes/java/util/Locale.java
@ -45,7 +45,7 @@ import java.io.ObjectInputStream;
 import java.io.ObjectOutputStream;
 import java.io.ObjectStreamField;
 import java.io.Serializable;
-import java.text.DateFormat;
+import java.text.NumberFormat;
 import java.text.MessageFormat;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.function.Function;
@ -70,131 +70,151 @@ import sun.util.locale.provider.LocaleServiceProviderPool;
 import sun.util.locale.provider.TimeZoneNameUtility;

 /**
- * A {@code Locale} object represents a specific geographical, political,
- * or cultural region. An operation that requires a {@code Locale} to perform
- * its task is called <em>locale-sensitive</em> and uses the {@code Locale}
- * to tailor information for the user. For example, displaying a number
- * is a locale-sensitive operation&mdash; the number should be formatted
- * according to the customs and conventions of the user's native country,
- * region, or culture.
+ * A {@code Locale} represents a specific geographical, political,
+ * or cultural region. An API that requires a {@code Locale} to perform
+ * its task is <dfn>{@index "locale-sensitive"}</dfn> and uses the {@code Locale}
+ * to tailor information for the user. These <em>locale-sensitive</em> APIs
+ * are principally in the <i>java.text</i> and <i>java.util</i> packages.
+ * For example, displaying a number is a <em>locale-sensitive</em> operation&mdash;
+ * the number should be formatted according to the customs and conventions of the
+ * user's native country, region, or culture.
 *
 * <p>The {@code Locale} class implements IETF BCP 47 which is composed of
 * <a href="https://tools.ietf.org/html/rfc4647">RFC 4647 "Matching of Language
 * Tags"</a> and <a href="https://tools.ietf.org/html/rfc5646">RFC 5646 "Tags
 * for Identifying Languages"</a> with support for the LDML (UTS#35, "Unicode
 * Locale Data Markup Language") BCP 47-compatible extensions for locale data
- * exchange.
+ * exchange. Each {@code Locale} is associated with locale data which is provided
+ * by the Java runtime environment or any deployed {@link
+ * java.util.spi.LocaleServiceProvider LocaleServiceProvider} implementations.
+ * The locale data provided by the Java runtime environment may vary by release.
 *
- * <p> A {@code Locale} object logically consists of the fields
- * described below.
+ * <h2 id="loc_comp">Locale Composition</h2>
+ * <p> A {@code Locale} is composed of the bolded fields described below; note that a
+ * {@code Locale} need not have all such fields. For example, {@link
+ * Locale#ENGLISH Locale.ENGLISH} is only comprised of the <em>language</em> field.
+ * In contrast, a {@code Locale} such as the one returned by {@code
+ * Locale.forLanguageTag("en-Latn-US-POSIX-u-nu-latn")} would be comprised of all
+ * the fields below. This particular {@code Locale} would represent English in
+ * the United States using the Latin script and numerics for use in POSIX
+ * environments.
+ * <p>
+ * {@code Locale} implements IETF BCP 47 and any deviations should be observed
+ * by the comments prefixed by <em>"BCP 47 deviation:"</em>.
+ * <a href="https://tools.ietf.org/html/rfc5646">RFC 5646</a>
+ * combines subtags from various ISO (639, 3166, 15924) standards which are also
+ * included in the composition of {@code Locale}.
+ * Additionally, the full list of valid codes for each field can be found in the
+ * <a href="https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry">
+ * IANA Language Subtag Registry</a> (e.g. search for "Type: region").
 *
 * <dl>
 *   <dt><a id="def_language"><b>language</b></a></dt>
- *
- *   <dd>ISO 639 alpha-2 or alpha-3 language code, or registered
- *   language subtags up to 8 alpha letters (for future enhancements).
+ *   <dd> ISO 639 alpha-2/alpha-3 language code or a registered
+ *   language subtag up to 8 alpha letters (for future enhancements).
 *   When a language has both an alpha-2 code and an alpha-3 code, the
- *   alpha-2 code must be used.  You can find a full list of valid
- *   language codes in the IANA Language Subtag Registry (search for
- *   "Type: language").  The language field is case insensitive, but
+ *   alpha-2 code must be used.</dd>
+ *
+ *   <dd> <em>Case convention:</em> {@code language} is case insensitive, but
 *   {@code Locale} always canonicalizes to lower case.</dd>
 *
- *   <dd>Well-formed language values have the form
- *   <code>[a-zA-Z]{2,8}</code>.  Note that this is not the full
- *   BCP47 language production, since it excludes extlang.  They are
- *   not needed since modern three-letter language codes replace
- *   them.</dd>
+ *   <dd> <em>Syntax:</em> Well-formed {@code language} values have the form {@code [a-zA-Z]{2,8}}.</dd>
+ *   <dd> <em> BCP 47 deviation:</em> this is not the full BCP 47 language production, since it excludes
+ *   <a href="https://datatracker.ietf.org/doc/html/rfc5646#section-2.2.2">extlang</a>
+ *   (as modern three-letter language codes are preferred).</dd>
 *
- *   <dd>Example: "en" (English), "ja" (Japanese), "kok" (Konkani)</dd>
+ *   <dd> <em>Example:</em> "en" (English), "ja" (Japanese), "kok" (Konkani)</dd>
 *
 *   <dt><a id="def_script"><b>script</b></a></dt>
 *
- *   <dd>ISO 15924 alpha-4 script code.  You can find a full list of
- *   valid script codes in the IANA Language Subtag Registry (search
- *   for "Type: script").  The script field is case insensitive, but
+ *   <dd> ISO 15924 alpha-4 script code.</dd>
+ *
+ *   <dd> <em>Case convention:</em> {@code script} is case insensitive, but
 *   {@code Locale} always canonicalizes to title case (the first
 *   letter is upper case and the rest of the letters are lower
 *   case).</dd>
 *
- *   <dd>Well-formed script values have the form
- *   <code>[a-zA-Z]{4}</code></dd>
+ *   <dd> <em>Syntax:</em> Well-formed {@code script} values have the form {@code
+ *   [a-zA-Z]{4}}</dd>
 *
- *   <dd>Example: "Latn" (Latin), "Cyrl" (Cyrillic)</dd>
+ *   <dd> <em>Example:</em> "Latn" (Latin), "Cyrl" (Cyrillic)</dd>
 *
 *   <dt><a id="def_region"><b>country (region)</b></a></dt>
 *
- *   <dd>ISO 3166 alpha-2 country code or UN M.49 numeric-3 area code.
- *   You can find a full list of valid country and region codes in the
- *   IANA Language Subtag Registry (search for "Type: region").  The
- *   country (region) field is case insensitive, but
+ *   <dd> ISO 3166 alpha-2 country code or UN M.49 numeric-3 area code.</dd>
+ *
+ *   <dd> <em>Case convention:</em> {@code country (region)} is case insensitive, but
 *   {@code Locale} always canonicalizes to upper case.</dd>
 *
- *   <dd>Well-formed country/region values have
- *   the form <code>[a-zA-Z]{2} | [0-9]{3}</code></dd>
+ *   <dd> <em>Syntax:</em> Well-formed {@code country (region)} values have the form {@code
+ *   [a-zA-Z]{2} | [0-9]{3}}</dd>
 *
- *   <dd>Example: "US" (United States), "FR" (France), "029"
+ *   <dd> <em>Example:</em> "US" (United States), "FR" (France), "029"
 *   (Caribbean)</dd>
 *
 *   <dt><a id="def_variant"><b>variant</b></a></dt>
 *
 *   <dd> Any arbitrary value used to indicate a variation of a
- *   {@code Locale}.  Where there are two or more variant values
- *   each indicating its own semantics, these values should be ordered
- *   by importance, with most important first, separated by
- *   underscore('_').  The variant field is case sensitive.</dd>
- *
- *   <dd>Note: IETF BCP 47 places syntactic restrictions on variant
- *   subtags.  Also BCP 47 subtags are strictly used to indicate
+ *   {@code Locale}. When multiple variants exist, they should be separated by
+ *   {@code ('_'|'-')}. Variants of higher importance should precede the others.</dd>
+ *   <dd> <em>BCP 47 deviation:</em> BCP 47 subtags are strictly used to indicate
 *   additional variations that define a language or its dialects that
 *   are not covered by any combinations of language, script and
- *   region subtags.  You can find a full list of valid variant codes
- *   in the IANA Language Subtag Registry (search for "Type: variant").
- *
- *   <p>However, the variant field in {@code Locale} has
+ *   region subtags. However, the variant field in {@code Locale} has
 *   historically been used for any kind of variation, not just
 *   language variations.  For example, some supported variants
 *   available in Java SE Runtime Environments indicate alternative
 *   cultural behaviors such as calendar type or number script.  In
- *   BCP 47 this kind of information, which does not identify the
+ *   BCP 47, this kind of information which does not identify the
 *   language, is supported by extension subtags or private use
 *   subtags.</dd>
 *
- *   <dd>Well-formed variant values have the form <code>SUBTAG
- *   (('_'|'-') SUBTAG)*</code> where <code>SUBTAG =
- *   [0-9][0-9a-zA-Z]{3} | [0-9a-zA-Z]{5,8}</code>. (Note: BCP 47 only
- *   uses hyphen ('-') as a delimiter, this is more lenient).</dd>
+ *   <dd> <em>Case convention:</em> {@code variant} is case sensitive. BCP 47
+ *   deviation: BCP 47 treats the variant field as case insensitive.</dd>
 *
- *   <dd>Example: "polyton" (Polytonic Greek), "POSIX"</dd>
+ *   <dd> <em>Syntax:</em> Well-formed {@code variant} values have the form {@code
+ *   SUBTAG (('_'|'-') SUBTAG)*} where {@code SUBTAG =
+ *   [0-9][0-9a-zA-Z]{3} | [0-9a-zA-Z]{5,8}}.</dd>
+ *   <dd> <em>BCP 47 deviation:</em> BCP 47 only
+ *   uses hyphen ('-') as a delimiter, {@code Locale} is more lenient.</dd>
+ *
+ *   <dd> <em>Example:</em> "polyton" (Polytonic Greek), "POSIX"</dd>
 *
 *   <dt><a id="def_extensions"><b>extensions</b></a></dt>
 *
 *   <dd> A map from single character keys to string values, indicating
- *   extensions apart from language identification.  The extensions in
- *   {@code Locale} implement the semantics and syntax of BCP 47
- *   extension subtags and private use subtags. The extensions are
- *   case insensitive, but {@code Locale} canonicalizes all
- *   extension keys and values to lower case. Note that extensions
- *   cannot have empty values.</dd>
+ *   extensions apart from language identification.</dd>
+ *   <dd> <em> BCP 47 deviation:</em> The {@code
+ *   extensions} in {@code Locale} implement the semantics and syntax of BCP 47
+ *   extension subtags <em>and</em> private use subtags. The {@code extensions}
+ *   field cannot have empty values. </dd>
 *
- *   <dd>Well-formed keys are single characters from the set
+ *   <dd> <em>Case convention:</em> {@code extensions} are
+ *   case insensitive, but {@code Locale} canonicalizes all
+ *   extension keys and values to lower case.</dd>
+ *
+ *   <dd> <em>Syntax:</em> Well-formed keys are single characters from the set
 *   {@code [0-9a-zA-Z]}.  Well-formed values have the form
 *   {@code SUBTAG ('-' SUBTAG)*} where for the key 'x'
- *   <code>SUBTAG = [0-9a-zA-Z]{1,8}</code> and for other keys
- *   <code>SUBTAG = [0-9a-zA-Z]{2,8}</code> (that is, 'x' allows
+ *   {@code SUBTAG = [0-9a-zA-Z]{1,8}} and for other keys
+ *   {@code SUBTAG = [0-9a-zA-Z]{2,8}} (that is, 'x' allows
 *   single-character subtags).</dd>
 *
- *   <dd>Example: key="u"/value="ca-japanese" (Japanese Calendar),
+ *   <dd> <em>Example:</em> key="u"/value="ca-japanese" (Japanese Calendar),
 *   key="x"/value="java-1-7"</dd>
 * </dl>
 *
- * <b>Note:</b> Although BCP 47 requires field values to be registered
+ * <b>BCP 47 deviation:</b> Although BCP 47 requires field values to be registered
 * in the IANA Language Subtag Registry, the {@code Locale} class
- * does not provide any validation features.  The {@code Builder}
+ * does not validate this requirement. For example, the variant code <em>"foobar"</em>
+ * is well-formed since it is composed of 5 to 8 alphanumerics, but is not defined
+ * the IANA Language Subtag Registry. The {@link Builder}
 * only checks if an individual field satisfies the syntactic
 * requirement (is well-formed), but does not validate the value
- * itself.  See {@link Builder} for details.
+ * itself. Conversely, {@link #of(String, String, String) Locale::of} and its
+ * overloads do not make any syntactic checks on the input.
 *
- * <h2><a id="def_locale_extension">Unicode locale/language extension</a></h2>
+ * <h3><a id="def_locale_extension">Unicode BCP 47 U Extension</a></h3>
 *
 * <p>UTS#35, "Unicode Locale Data Markup Language" defines optional
 * attributes and keywords to override or refine the default behavior
@ -213,7 +233,7 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * String representing this information, for example, "nu-thai".  The
 * {@code Locale} class also provides {@link
 * #getUnicodeLocaleAttributes}, {@link #getUnicodeLocaleKeys}, and
- * {@link #getUnicodeLocaleType} which allow you to access Unicode
+ * {@link #getUnicodeLocaleType(String)} which provides access to the Unicode
 * locale attributes and key/type pairs directly.  When represented as
 * a string, the Unicode Locale Extension lists attributes
 * alphabetically, followed by key/type sequences with keys listed
@ -221,11 +241,11 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * fixed when the type is defined)
 *
 * <p>A well-formed locale key has the form
- * <code>[0-9a-zA-Z]{2}</code>.  A well-formed locale type has the
- * form <code>"" | [0-9a-zA-Z]{3,8} ('-' [0-9a-zA-Z]{3,8})*</code> (it
+ * {@code [0-9a-zA-Z]{2}}.  A well-formed locale type has the
+ * form {@code "" | [0-9a-zA-Z]{3,8} ('-' [0-9a-zA-Z]{3,8})*} (it
 * can be empty, or a series of subtags 3-8 alphanums in length).  A
 * well-formed locale attribute has the form
- * <code>[0-9a-zA-Z]{3,8}</code> (it is a single subtag with the same
+ * {@code [0-9a-zA-Z]{3,8}} (it is a single subtag with the same
 * form as a locale type subtag).
 *
 * <p>The Unicode locale extension specifies optional behavior in
@ -234,36 +254,11 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * implementations in a Java Runtime Environment might not support any
 * particular Unicode locale attributes or key/type pairs.
 *
- * <h3><a id="ObtainingLocale">Obtaining a Locale</a></h3>
+ * <h2><a id="default_locale">Default Locale</a></h2>
 *
- * <p>There are several ways to obtain a {@code Locale}
- * object.
- *
- * <h4>Builder</h4>
- *
- * <p>Using {@link Builder} you can construct a {@code Locale} object
- * that conforms to BCP 47 syntax.
- *
- * <h4>Factory Methods</h4>
- *
- * <p>The method {@link #forLanguageTag} obtains a {@code Locale}
- * object for a well-formed BCP 47 language tag. The method
- * {@link #of(String, String, String)} and its overloads obtain a
- * {@code Locale} object from given {@code language}, {@code country},
- * and/or {@code variant} defined above.
- *
- * <h4>Locale Constants</h4>
- *
- * <p>The {@code Locale} class provides a number of convenient constants
- * that you can use to obtain {@code Locale} objects for commonly used
- * locales. For example, {@code Locale.US} is the {@code Locale} object
- * for the United States.
- *
- * <h3><a id="default_locale">Default Locale</a></h3>
- *
- * <p>The default Locale is provided for any locale-sensitive methods if no
+ * <p>The default Locale is provided for any <em>locale-sensitive</em> methods if no
 * {@code Locale} is explicitly specified as an argument, such as
- * {@link DateFormat#getInstance()}. The default Locale is determined at startup
+ * {@link NumberFormat#getInstance()}. The default Locale is determined at startup
 * of the Java runtime and established in the following three phases:
 * <ol>
 * <li>The locale-related system properties listed below are established from the
@ -316,6 +311,7 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * system properties are not altered. It is not recommended that applications read
 * these system properties and parse or interpret them as their values may be out of date.
 *
+ * <h3>Locale Category</h3>
 * <p>There are finer-grained default Locales specific for each {@link Locale.Category}.
 * These category specific default Locales can be queried by {@link #getDefault(Category)},
 * and set by {@link #setDefault(Category, Locale)}. Construction of these category
@ -327,19 +323,87 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * category. In the absence of category specific system properties, the "category-less"
 * system properties are used, such as {@code user.language} in the previous example.
 *
- * <h3><a id="LocaleMatching">Locale Matching</a></h3>
+ * <h2><a id="ObtainingLocale">Obtaining a Locale</a></h2>
 *
- * <p>If an application or a system is internationalized and provides localized
+ * <p>There are several ways to obtain a {@code Locale} object.
+ * It is advised against using the deprecated {@code Locale} constructors.
+ *
+ * <dl>
+ *  <dt><b>Locale Constants</b></dt>
+ *  <dd>A number of convenient constants are provided that return {@code Locale}
+ *  objects for commonly used locales. For example, {@link #US Locale.US} is the
+ *  {@code Locale} object for the United States.</dd>
+ *  <dt><b>Factory Methods</b></dt>
+ *  <dd>{@link #of(String, String, String) Locale::of} and its overloads obtain a
+ *  {@code Locale} object from the given {@code language}, {@code country},
+ *  and/or {@code variant}. {@link #forLanguageTag(String)} obtains a {@code Locale}
+ *  object for a well-formed BCP 47 language tag.</dd>
+ *  <dt><b>Builder</b></dt>
+ *  <dd>{@link Builder} is used to construct a {@code Locale} object that conforms
+ *  to BCP 47 syntax. Use a builder to enforce syntactic restrictions on the input.</dd>
+ * </dl>
+ * <p>The following invocations produce Locale objects that are all equivalent:
+ * {@snippet lang=java :
+ *     Locale.US;
+ *     Locale.of("en", "US");
+ *     Locale.forLanguageTag("en-US");
+ *     new Locale.Builder().setLanguage("en").setRegion("US").build();
+ * }
+ *
+ * <h2>Usage Examples</h2>
+ *
+ * <p>Once a {@code Locale} is {@linkplain ##ObtainingLocale obtained},
+ * it can be queried for information about itself. For example, use {@link
+ * #getCountry} to get the country (or region) code and {@link #getLanguage} to
+ * get the language. {@link #getDisplayCountry} can be used to get the
+ * name of the country suitable for displaying to the user. Similarly,
+ * use {@link #getDisplayLanguage()} to get the name of
+ * the language suitable for displaying to the user. The {@code getDisplayXXX}
+ * methods are themselves <em>locale-sensitive</em> and have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#DISPLAY DISPLAY} locale, so the following are equivalent :
+ * {@snippet lang=java :
+ *     Locale.getDefault().getDisplayCountry();
+ *     Locale.getDefault().getDisplayCountry(Locale.getDefault(Locale.Category.DISPLAY));
+ * }
+ *
+ * <p>The Java Platform provides a number of classes that perform locale-sensitive
+ * operations. For example, the {@code NumberFormat} class formats
+ * numbers, currency, and percentages in a <em>locale-sensitive</em> manner. Classes such
+ * as {@code NumberFormat} have several factory methods for creating a default object
+ * of that type. These methods generally have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#FORMAT FORMAT} locale, so the following are equivalent :
+ * {@snippet lang=java :
+ *     NumberFormat.getCurrencyInstance();
+ *     NumberFormat.getCurrencyInstance(Locale.getDefault(Locale.Category.FORMAT));
+ * }
+ *
+ * <p>
+ * The following example demonstrates <em>locale-sensitive</em> currency and
+ * date related operations under different locales :
+ * {@snippet lang = java:
+ *     var number = 1000;
+ *     NumberFormat.getCurrencyInstance(Locale.US).format(number); // returns "$1,000.00"
+ *     NumberFormat.getCurrencyInstance(Locale.JAPAN).format(number); // returns "\u00A51,000""
+ *     var date = LocalDate.of(2024, 1, 1);
+ *     DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.US).format(date); // returns "January 1, 2024"
+ *     DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.JAPAN).format(date); // returns "2024\u5e741\u67081\u65e5"
+ * }
+ *
+ * <h2><a id="LocaleMatching">Locale Matching</a></h2>
+ *
+ * <p>If an application is internationalized and provides localized
 * resources for multiple locales, it sometimes needs to find one or more
 * locales (or language tags) which meet each user's specific preferences. Note
- * that a term "language tag" is used interchangeably with "locale" in this
- * locale matching documentation.
+ * that the term "<dfn>{@index "language tag"}</dfn>" is used interchangeably
+ * with "locale" in the following locale matching documentation.
 *
- * <p>In order to do matching a user's preferred locales to a set of language
+ * <p>In order to match a user's preferred locales to a set of language
 * tags, <a href="https://tools.ietf.org/html/rfc4647">RFC 4647 Matching of
 * Language Tags</a> defines two mechanisms: filtering and lookup.
 * <em>Filtering</em> is used to get all matching locales, whereas
- * <em>lookup</em> is to choose the best matching locale.
+ * <em>lookup</em> is to select the best matching locale.
 * Matching is done case-insensitively. These matching mechanisms are described
 * in the following sections.
 *
@ -348,7 +412,8 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * language ranges: basic and extended. See
 * {@link Locale.LanguageRange Locale.LanguageRange} for details.
 *
- * <h4>Filtering</h4>
+ *
+ * <h3>Filtering</h3>
 *
 * <p>The filtering operation returns all matching language tags. It is defined
 * in RFC 4647 as follows:
@ -366,7 +431,7 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * {@link Locale.FilteringMode} is a parameter to specify how filtering should
 * be done.
 *
- * <h4>Lookup</h4>
+ * <h3>Lookup</h3>
 *
 * <p>The lookup operation returns the best matching language tags. It is
 * defined in RFC 4647 as follows:
@ -398,76 +463,50 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * an {@link Iterator} over a {@link Collection} of language tags is treated as
 * the best matching one.
 *
- * <h3>Use of Locale</h3>
+ * <h3>Serialization</h3>
 *
- * <p>Once you've obtained a {@code Locale} you can query it for information
- * about itself. Use {@code getCountry} to get the country (or region)
- * code and {@code getLanguage} to get the language code.
- * You can use {@code getDisplayCountry} to get the
- * name of the country suitable for displaying to the user. Similarly,
- * you can use {@code getDisplayLanguage} to get the name of
- * the language suitable for displaying to the user. Interestingly,
- * the {@code getDisplayXXX} methods are themselves locale-sensitive
- * and have two versions: one that uses the default
- * {@link Locale.Category#DISPLAY DISPLAY} locale and one
- * that uses the locale specified as an argument.
+ * <p>During serialization, writeObject writes all fields to the output
+ * stream, including extensions.
 *
- * <p>The Java Platform provides a number of classes that perform locale-sensitive
- * operations. For example, the {@code NumberFormat} class formats
- * numbers, currency, and percentages in a locale-sensitive manner. Classes
- * such as {@code NumberFormat} have several convenience methods
- * for creating a default object of that type. For example, the
- * {@code NumberFormat} class provides these three convenience methods
- * for creating a default {@code NumberFormat} object:
- * {@snippet lang=java :
- *     NumberFormat.getInstance();
- *     NumberFormat.getCurrencyInstance();
- *     NumberFormat.getPercentInstance();
- * }
- * Each of these methods has two variants; one with an explicit locale
- * and one without; the latter uses the default
- * {@link Locale.Category#FORMAT FORMAT} locale:
- * {@snippet lang=java :
- *     NumberFormat.getInstance(myLocale);
- *     NumberFormat.getCurrencyInstance(myLocale);
- *     NumberFormat.getPercentInstance(myLocale);
- * }
- * A {@code Locale} is the mechanism for identifying the kind of object
- * ({@code NumberFormat}) that you would like to get. The locale is
- * <STRONG>just</STRONG> a mechanism for identifying objects,
- * <STRONG>not</STRONG> a container for the objects themselves.
+ * <p>During deserialization, readResolve adds extensions as described
+ * in {@linkplain ##special_cases_constructor Special Cases}, only
+ * for the two cases th_TH_TH and ja_JP_JP.
 *
- * <h3>Compatibility</h3>
- *
- * <p>In order to maintain compatibility, Locale's
- * constructors retain their behavior prior to the Java Runtime
- * Environment version 1.7.  The same is largely true for the
- * {@code toString} method. Thus Locale objects can continue to
- * be used as they were. In particular, clients who parse the output
- * of toString into language, country, and variant fields can continue
- * to do so (although this is strongly discouraged), although the
- * variant field will have additional information in it if script or
- * extensions are present.
+ * @implNote
+ * <h2>Compatibility</h2>
+ * <p> The following commentary is provided for apps that want to ensure
+ * interoperability with older releases of {@code Locale} provided by the
+ * reference implementation.
+ * <h3><a id="locale_behavior">Locale Behavior</a></h3>
+ * In order to maintain compatibility, Locale's (deprecated) constructors,
+ * {@link #of(String, String, String)}, and its overloads retain their behavior prior to the Java Runtime
+ * Environment version 1.7. That is, a length constraint is not imposed on any of
+ * the input parameters. Similarly, the same preservation of past behavior is largely true
+ * for the {@link #toString()} method.
+ * Apps that previously parsed the output of {@link #toString()} into language,
+ * country, and variant fields can continue to do so (although this is strongly
+ * discouraged). A caveat is that the variant field will have additional
+ * information in it if script or extensions are present.
 *
 * <p>In addition, BCP 47 imposes syntax restrictions that are not
 * imposed by Locale's constructors. This means that conversions
 * between some Locales and BCP 47 language tags cannot be made without
- * losing information. Thus {@code toLanguageTag} cannot
+ * losing information. Thus {@link #toLanguageTag} cannot
 * represent the state of locales whose language, country, or variant
 * do not conform to BCP 47.
 *
- * <p>Because of these issues, it is recommended that clients migrate
+ * <p>Because of these issues, it is recommended that apps migrate
 * away from constructing non-conforming locales and use the
- * {@code forLanguageTag} and {@code Locale.Builder} APIs instead.
- * Clients desiring a string representation of the complete locale can
- * then always rely on {@code toLanguageTag} for this purpose.
+ * {@link #forLanguageTag(String)} and {@link Locale.Builder} APIs instead.
+ * Apps desiring a string representation of the complete locale can
+ * then always rely on {@link #toLanguageTag} for this purpose.
 *
- * <h4><a id="special_cases_constructor">Special cases</a></h4>
+ * <h3><a id="special_cases_constructor">Special cases</a></h3>
 *
 * <p>For compatibility reasons, two
 * non-conforming locales are treated as special cases.  These are
 * <b>{@code ja_JP_JP}</b> and <b>{@code th_TH_TH}</b>. These are ill-formed
- * in BCP 47 since the variants are too short. To ease migration to BCP 47,
+ * in BCP 47 since the {@linkplain ##def_variant variants} are too short. To ease migration to BCP 47,
 * these are treated specially during construction.  These two cases (and only
 * these) cause a constructor to generate an extension, all other values behave
 * exactly as they did prior to Java 7.
@ -487,25 +526,16 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * constructor is called with the arguments "th", "TH", "TH", the
 * extension "u-nu-thai" is automatically added.
 *
- * <h4>Serialization</h4>
+ * <h3><a id="legacy_language_codes">Legacy language codes</a></h3>
 *
- * <p>During serialization, writeObject writes all fields to the output
- * stream, including extensions.
- *
- * <p>During deserialization, readResolve adds extensions as described
- * in {@linkplain ##special_cases_constructor Special Cases}, only
- * for the two cases th_TH_TH and ja_JP_JP.
- *
- * <h4><a id="legacy_language_codes">Legacy language codes</a></h4>
- *
- * <p>Locale's constructor has always converted three language codes to
+ * <p>Locale's constructors have always converted three language codes to
 * their earlier, obsoleted forms: {@code he} maps to {@code iw},
 * {@code yi} maps to {@code ji}, and {@code id} maps to
 * {@code in}. Since Java SE 17, this is no longer the case. Each
 * language maps to its new form; {@code iw} maps to {@code he}, {@code ji}
 * maps to {@code yi}, and {@code in} maps to {@code id}.
 *
- * <p>For the backward compatible behavior, the system property
+ * <p>For backwards compatible behavior, the system property
 * {@systemProperty java.locale.useOldISOCodes} reverts the behavior
 * back to that of before Java SE 17. If the system property is set to
 * {@code true}, those three current language codes are mapped to their
@ -524,22 +554,12 @@ import sun.util.locale.provider.TimeZoneNameUtility;
 * lookup mechanism also implements this mapping, so that resources
 * can be named using either convention, see {@link ResourceBundle.Control}.
 *
- * <h4>Three-letter language/country(region) codes</h4>
- *
- * <p>The Locale constructors have always specified that the language
- * and the country param be two characters in length, although in
- * practice they have accepted any length.  The specification has now
- * been relaxed to allow language codes of two to eight characters and
- * country (region) codes of two to three characters, and in
- * particular, three-letter language codes and three-digit region
- * codes as specified in the IANA Language Subtag Registry.  For
- * compatibility, the implementation still does not impose a length
- * constraint.
- *
 * @spec https://www.rfc-editor.org/info/rfc4647
 *      RFC 4647: Matching of Language Tags
 * @spec https://www.rfc-editor.org/info/rfc5646
 *      RFC 5646: Tags for Identifying Languages
+ * @spec https://unicode.org/reports/tr35/
+ *      Unicode Locale Data Markup Language
 * @see Builder
 * @see ResourceBundle
 * @see java.text.Format
@ -1229,28 +1249,28 @@ public final class Locale implements Cloneable, Serializable {
    }

    /**
-     * {@return an array of installed locales}
+     * {@return an array of available locales}
     *
     * The returned array represents the union of locales supported
-     * by the Java runtime environment and by installed
+     * by the Java runtime environment and by deployed
     * {@link java.util.spi.LocaleServiceProvider LocaleServiceProvider}
     * implementations. At a minimum, the returned array must contain a
-     * {@code Locale} instance equal to {@link Locale#ROOT Locale.ROOT} and
-     * a {@code Locale} instance equal to {@link Locale#US Locale.US}.
+     * {@code Locale} instance equal to {@link #ROOT Locale.ROOT} and
+     * a {@code Locale} instance equal to {@link #US Locale.US}.
     */
    public static Locale[] getAvailableLocales() {
        return LocaleServiceProviderPool.getAllAvailableLocales();
    }

    /**
-     * {@return a stream of installed locales}
+     * {@return a stream of available locales}
     *
     * The returned stream represents the union of locales supported
-     * by the Java runtime environment and by installed
+     * by the Java runtime environment and by deployed
     * {@link java.util.spi.LocaleServiceProvider LocaleServiceProvider}
     * implementations. At a minimum, the returned stream must contain a
-     * {@code Locale} instance equal to {@link Locale#ROOT Locale.ROOT} and
-     * a {@code Locale} instance equal to {@link Locale#US Locale.US}.
+     * {@code Locale} instance equal to {@link #ROOT Locale.ROOT} and
+     * a {@code Locale} instance equal to {@link #US Locale.US}.
     *
     * @implNote Unlike {@code getAvailableLocales()}, this method does
     * not create a defensive copy of the Locale array.
@ -1532,8 +1552,8 @@ public final class Locale implements Cloneable, Serializable {
     * Java 6 and prior.
     *
     * <p>If both the language and country fields are missing, this function will return
-     * the empty string, even if the variant, script, or extensions field is present (you
-     * can't have a locale with just a variant, the variant must accompany a well-formed
+     * the empty string, even if the variant, script, or extensions field is present
+     * (a locale with just a variant is not allowed, the variant must accompany a well-formed
     * language or country code).
     *
     * <p>If script or extensions are present and variant is missing, no underscore is
@ -1613,7 +1633,7 @@ public final class Locale implements Cloneable, Serializable {
     * (delimited by '-' or '_') is emitted as a subtag.  Otherwise:
     * <ul>
     *
-     * <li>if all sub-segments match <code>[0-9a-zA-Z]{1,8}</code>
+     * <li>if all sub-segments match {@code [0-9a-zA-Z]{1,8}}
     * (for example "WIN" or "Oracle_JDK_Standard_Edition"), the first
     * ill-formed sub-segment and all following will be appended to
     * the private use subtag.  The first appended subtag will be
@ -1622,7 +1642,7 @@ public final class Locale implements Cloneable, Serializable {
     * "Oracle-x-lvariant-JDK-Standard-Edition".
     *
     * <li>if any sub-segment does not match
-     * <code>[0-9a-zA-Z]{1,8}</code>, the variant will be truncated
+     * {@code [0-9a-zA-Z]{1,8}}, the variant will be truncated
     * and the problematic sub-segment and all following sub-segments
     * will be omitted.  If the remainder is non-empty, it will be
     * emitted as a private use subtag as above (even if the remainder
@ -1774,7 +1794,7 @@ public final class Locale implements Cloneable, Serializable {
     *
     * <p>If the specified language tag contains any ill-formed subtags,
     * the first such subtag and all following subtags are ignored.  Compare
-     * to {@link Locale.Builder#setLanguageTag} which throws an exception
+     * to {@link Locale.Builder#setLanguageTag(String)} which throws an exception
     * in this case.
     *
     * <p>The following <b>conversions</b> are performed:<ul>
@ -1994,7 +2014,6 @@ public final class Locale implements Cloneable, Serializable {
     * getDisplayLanguage() will return "anglais".
     * If the name returned cannot be localized for the default
     * {@link Locale.Category#DISPLAY DISPLAY} locale,
-     * (say, we don't have a Japanese name for Croatian),
     * this function falls back on the English name, and uses the ISO code as a last-resort
     * value.  If the locale doesn't specify a language, this function returns the empty string.
     *
@ -2012,7 +2031,6 @@ public final class Locale implements Cloneable, Serializable {
     * is en_US, getDisplayLanguage() will return "French"; if the locale is en_US and
     * inLocale is fr_FR, getDisplayLanguage() will return "anglais".
     * If the name returned cannot be localized according to inLocale,
-     * (say, we don't have a Japanese name for Croatian),
     * this function falls back on the English name, and finally
     * on the ISO code as a last-resort value.  If the locale doesn't specify a language,
     * this function returns the empty string.
@ -2067,7 +2085,6 @@ public final class Locale implements Cloneable, Serializable {
     * getDisplayCountry() will return "Etats-Unis".
     * If the name returned cannot be localized for the default
     * {@link Locale.Category#DISPLAY DISPLAY} locale,
-     * (say, we don't have a Japanese name for Croatia),
     * this function falls back on the English name, and uses the ISO code as a last-resort
     * value.  If the locale doesn't specify a country, this function returns the empty string.
     *
@ -2084,8 +2101,7 @@ public final class Locale implements Cloneable, Serializable {
     * For example, if the locale is fr_FR and inLocale
     * is en_US, getDisplayCountry() will return "France"; if the locale is en_US and
     * inLocale is fr_FR, getDisplayCountry() will return "Etats-Unis".
-     * If the name returned cannot be localized according to inLocale.
-     * (say, we don't have a Japanese name for Croatia),
+     * If the name returned cannot be localized according to inLocale,
     * this function falls back on the English name, and finally
     * on the ISO code as a last-resort value.  If the locale doesn't specify a country,
     * this function returns the empty string.
@ -2697,7 +2713,7 @@ public final class Locale implements Cloneable, Serializable {
     * alphanumerics.  The method {@code setVariant} throws
     * {@code IllformedLocaleException} for a variant that does not satisfy
     * this restriction. If it is necessary to support such a variant, use
-     * {@link Locale#of(String, String, String)}.  However, keep in mind that a {@code Locale}
+     * {@link #of(String, String, String)}.  However, keep in mind that a {@code Locale}
     * object obtained this way might lose the variant information when
     * transformed to a BCP 47 language tag.
     *
@ -2710,7 +2726,7 @@ public final class Locale implements Cloneable, Serializable {
     * <p>Builders can be reused; {@code clear()} resets all
     * fields to their default values.
     *
-     * @see Locale#forLanguageTag
+     * @see Locale#forLanguageTag(String)
     * @see Locale#of(String, String, String)
     * @since 1.7
     */
@ -2760,7 +2776,7 @@ public final class Locale implements Cloneable, Serializable {
         * language tag.  Discards the existing state.  Null and the
         * empty string cause the builder to be reset, like {@link
         * #clear}.  Legacy tags (see {@link
-         * Locale#forLanguageTag}) are converted to their canonical
+         * Locale#forLanguageTag(String)}) are converted to their canonical
         * form before being processed.  Otherwise, the language tag
         * must be well-formed (see {@link Locale}) or an exception is
         * thrown (unlike {@code Locale.forLanguageTag}, which
@ -2862,7 +2878,7 @@ public final class Locale implements Cloneable, Serializable {
         * the {@code Locale} class does not impose any syntactic
         * restriction on variant, and the variant value in
         * {@code Locale} is case sensitive.  To set such a variant,
-         * use {@link Locale#of(String, String, String)}.
+         * use {@link #of(String, String, String)}.
         *
         * @param variant the variant
         * @return This builder.
@ -2884,12 +2900,12 @@ public final class Locale implements Cloneable, Serializable {
         * must be {@linkplain Locale##def_extensions well-formed} or an exception
         * is thrown.
         *
-         * <p><b>Note:</b> The key {@link Locale#UNICODE_LOCALE_EXTENSION
+         * <p><b>Note:</b> The key {@link #UNICODE_LOCALE_EXTENSION
         * UNICODE_LOCALE_EXTENSION} ('u') is used for the Unicode locale extension.
         * Setting a value for this key replaces any existing Unicode locale key/type
         * pairs with those defined in the extension.
         *
-         * <p><b>Note:</b> The key {@link Locale#PRIVATE_USE_EXTENSION
+         * <p><b>Note:</b> The key {@link #PRIVATE_USE_EXTENSION
         * PRIVATE_USE_EXTENSION} ('x') is used for the private use code. To be
         * well-formed, the value for this key needs only to have subtags of one to
         * eight alphanumeric characters, not two to eight as in the general case.
@ -2918,7 +2934,7 @@ public final class Locale implements Cloneable, Serializable {
         *
         * <p>Keys and types are converted to lower case.
         *
-         * <p><b>Note</b>:Setting the 'u' extension via {@link #setExtension}
+         * <p><b>Note</b>:Setting the 'u' extension via {@link #setExtension(char, String)}
         * replaces all Unicode locale keywords with those defined in the
         * extension.
         *
@ -3010,9 +3026,9 @@ public final class Locale implements Cloneable, Serializable {
         * Returns an instance of {@code Locale} obtained from the fields set
         * on this builder.
         *
-         * <p>This applies the conversions listed in {@link Locale#forLanguageTag}
+         * <p>This applies the conversions listed in {@link #forLanguageTag(String)}
         * when constructing a Locale. (Legacy tags are handled in
-         * {@link #setLanguageTag}.)
+         * {@link #setLanguageTag(String)}.)
         *
         * @return A Locale.
         */
@ -3193,10 +3209,10 @@ public final class Locale implements Cloneable, Serializable {
     *
     * @spec https://www.rfc-editor.org/info/rfc4234 RFC 4234: Augmented BNF for Syntax Specifications: ABNF
     * @spec https://www.rfc-editor.org/info/rfc4647 RFC 4647: Matching of Language Tags
-     * @see #filter
-     * @see #filterTags
-     * @see #lookup
-     * @see #lookupTag
+     * @see #filter(List, Collection, FilteringMode)
+     * @see #filterTags(List, Collection, FilteringMode)
+     * @see #lookup(List, Collection)
+     * @see #lookupTag(List, Collection)
     *
     * @since 1.8
     */
@ -3413,7 +3429,7 @@ public final class Locale implements Cloneable, Serializable {
         *     found in the given {@code ranges} is ill-formed
         * @spec https://www.rfc-editor.org/info/rfc2616 RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1
         * @see #parse(String)
-         * @see #mapEquivalents
+         * @see #mapEquivalents(List, Map)
         */
        public static List<LanguageRange> parse(String ranges,
                                                Map<String, List<String>> map) {