8329222: java.text.NumberFormat (and subclasses) spec updates

Reviewed-by: naoto
Justin Lu 2024-04-23 21:10:46 +00:00
parent 2555166247
commit f60798a30e
3 changed files with 421 additions and 391 deletions


@@ -56,45 +56,104 @@ import sun.util.locale.provider.ResourceBundleBasedAdapter;
/**
* {@code DecimalFormat} is a concrete subclass of
* {@code NumberFormat} that formats decimal numbers. It has a variety of
* features designed to make it possible to parse and format numbers in any
* locale, including support for Western, Arabic, and Indic digits. It also
* supports different kinds of numbers, including integers (123), fixed-point
* {@code NumberFormat} that formats decimal numbers in a localized manner.
* It has a variety of features designed to make it possible to parse and format
* numbers in any locale, including support for Western, Arabic, and Indic digits.
* It also supports different kinds of numbers, including integers (123), fixed-point
* numbers (123.4), scientific notation (1.23E4), percentages (12%), and
* currency amounts ($123). All of these can be localized.
* currency amounts ($123).
*
* <p>To obtain a {@code NumberFormat} for a specific locale, including the
* default locale, call one of {@code NumberFormat}'s factory methods, such
* as {@code getInstance()}. In general, do not call the
* {@code DecimalFormat} constructors directly, since the
* {@code NumberFormat} factory methods may return subclasses other than
* {@code DecimalFormat}. If you need to customize the format object, do
* something like this:
* <h2>Getting a DecimalFormat</h2>
*
* <blockquote>{@snippet lang=java :
* NumberFormat numFormat = NumberFormat.getInstance(loc);
* if (numFormat instanceof DecimalFormat decFormat) {
* decFormat.setDecimalSeparatorAlwaysShown(true);
* To obtain a standard decimal format for a specific locale, including the default locale,
* it is recommended to call one of the {@code NumberFormat}
* {@link NumberFormat##factory_methods factory methods}, such as {@link NumberFormat#getInstance()}.
* These factory methods may not always return a {@code DecimalFormat}
* depending on the locale-service provider implementation
* installed. Thus, to use an instance method defined by {@code DecimalFormat},
* the {@code NumberFormat} returned by the factory method should be
* type checked before being converted to {@code DecimalFormat}. If the installed locale-sensitive
* service implementation does not support the given {@code Locale}, the parent
* locale chain will be looked up, and a {@code Locale} used that is supported.
*
* <p>If the factory methods are not desired, use one of the constructors such
* as {@link #DecimalFormat(String) DecimalFormat(String pattern)}. See the {@link
* ##patterns Pattern} section for more information on the {@code pattern} parameter.
*
* <h2>Using DecimalFormat</h2>
* The following is an example of formatting and parsing,
* {@snippet lang=java :
* NumberFormat nFmt = NumberFormat.getCurrencyInstance(Locale.US);
* if (nFmt instanceof DecimalFormat dFmt) {
* // pattern match to DecimalFormat to use setPositiveSuffix(String)
* dFmt.setPositiveSuffix(" dollars");
* dFmt.format(100000); // returns "$100,000.00 dollars"
* dFmt.parse("$100,000.00 dollars"); // returns 100000
* }
* }
* }</blockquote>
*
* <p>A {@code DecimalFormat} comprises a <em>pattern</em> and a set of
* <em>symbols</em>. The pattern may be set directly using
* {@code applyPattern()}, or indirectly using the API methods. The
* symbols are stored in a {@code DecimalFormatSymbols} object. When using
* the {@code NumberFormat} factory methods, the pattern and symbols are
* read from localized {@code ResourceBundle}s.
*
* <h2 id="patterns">Patterns</h2>
* <h2 id="formatting">Formatting and Parsing</h2>
* <h3 id="rounding">Rounding</h3>
*
* Note: For any given {@code DecimalFormat} pattern, if the pattern is not
* in scientific notation, the maximum number of integer digits will not be
* derived from the pattern, and instead set to {@link Integer#MAX_VALUE}.
* Otherwise, if the pattern is in scientific notation, the maximum number of
* integer digits will be derived from the pattern. This derivation is detailed
* in the {@link ##scientific_notation Scientific Notation} section. This behavior
* is the typical end-user desire; {@link #setMaximumIntegerDigits(int)} can be
* used to manually adjust the maximum integer digits.
* When formatting, {@code DecimalFormat} can adjust its rounding using {@link
* #setRoundingMode(RoundingMode)}. By default, it uses
* {@link java.math.RoundingMode#HALF_EVEN RoundingMode.HALF_EVEN}.
*
* <h3>Digits</h3>
*
* When formatting, {@code DecimalFormat} uses the ten consecutive
* characters starting with the localized zero digit defined in the
* {@code DecimalFormatSymbols} object as digits.
* <p>When parsing, these digits as well as all Unicode decimal digits, as
* defined by {@link Character#digit Character.digit}, are recognized.
*
* <h3 id="digit_limits"> Integer and Fraction Digit Limits </h3>
* @implSpec
* When formatting a {@code Number} other than {@code BigInteger} and
* {@code BigDecimal}, {@code 309} is used as the upper limit for integer digits,
* and {@code 340} as the upper limit for fraction digits. This occurs, even if
* one of the {@code DecimalFormat} getter methods, for example, {@link #getMinimumFractionDigits()}
* returns a numerically greater value.
*
* <h3>Special Values</h3>
* <ul>
* <li><p><b>Not a Number</b> ({@code NaN}) is formatted as a string,
* which is typically given as "NaN". This string is determined by {@link
* DecimalFormatSymbols#getNaN()}. This is the only value for which the prefixes
* and suffixes are not attached.
*
* <li><p><b>Infinity</b> is formatted as a string, which is typically given as
* "&#8734;" ({@code U+221E}), with the positive or negative prefixes and suffixes
* attached. This string is determined by {@link DecimalFormatSymbols#getInfinity()}.
*
* <li><p><b>Negative zero</b> ({@code "-0"}) parses to
* <ul>
* <li>{@code BigDecimal(0)} if {@code isParseBigDecimal()} is
* true
* <li>{@code Long(0)} if {@code isParseBigDecimal()} is false
* and {@code isParseIntegerOnly()} is true
* <li>{@code Double(-0.0)} if both {@code isParseBigDecimal()}
* and {@code isParseIntegerOnly()} are false
* </ul>
* </ul>
*
* <h2><a id="synchronization">Synchronization</a></h2>
*
* <p>
* Decimal formats are generally not synchronized.
* It is recommended to create separate format instances for each thread.
* If multiple threads access a format concurrently, it must be synchronized
* externally.
*
* <h2 id="patterns">DecimalFormat Pattern</h2>
*
* A {@code DecimalFormat} comprises a <em>pattern</em> and a set of
* <em>symbols</em>. The pattern may be set directly using {@code applyPattern()},
* or indirectly using the various API methods. The symbols are stored in a {@code
* DecimalFormatSymbols} object. When using the {@code NumberFormat} factory
* methods, the pattern and symbols are created from the locale-sensitive service
* implementation installed.
*
* <p> {@code DecimalFormat} patterns have the following syntax:
* <blockquote><pre>
@@ -135,11 +194,115 @@ import sun.util.locale.provider.ResourceBundleBasedAdapter;
* 0 <i>MinimumExponent<sub>opt</sub></i>
* </pre></blockquote>
*
* <p>A {@code DecimalFormat} pattern contains a positive and negative
* <h3><a id="special_pattern_character">Special Pattern Characters</a></h3>
*
* <p>The special characters in the table below are interpreted syntactically when
* used in the DecimalFormat pattern.
* They must be quoted, unless noted otherwise, if they are to appear in the
* prefix or suffix as literals.
*
* <p> The characters in the {@code Symbol} column are used in non-localized
* patterns. The corresponding characters in the {@code Localized Symbol} column are used
* in localized patterns, with the characters in {@code Symbol} losing their
* syntactical meaning. Two exceptions are the currency sign ({@code U+00A4}) and
* quote ({@code U+0027}), which are not localized.
* <p>
* Non-localized patterns should be used when calling {@link #applyPattern(String)}.
* Localized patterns should be used when calling {@link #applyLocalizedPattern(String)}.
*
* <blockquote>
* <table class="striped">
* <caption style="display:none">Chart showing symbol, location, localized, and meaning.</caption>
* <thead>
* <tr>
* <th scope="col" style="text-align:left">Symbol
* <th scope="col" style="text-align:left">Localized Symbol
* <th scope="col" style="text-align:left">Location
* <th scope="col" style="text-align:left;width:50%">Meaning
* </thead>
* <tbody>
* <tr>
* <th scope="row">{@code 0}
* <td>{@link DecimalFormatSymbols#getZeroDigit()}
* <td>Number
* <td>Digit
* <tr>
* <th scope="row">{@code #}
* <td>{@link DecimalFormatSymbols#getDigit()}
* <td>Number
* <td>Digit, zero shows as absent
* <tr>
* <th scope="row">{@code .}
* <td>{@link DecimalFormatSymbols#getDecimalSeparator()}
* <td>Number
* <td>Decimal separator or monetary decimal separator
* <tr>
* <th scope="row">{@code - (U+002D)}
* <td>{@link DecimalFormatSymbols#getMinusSign()}
* <td>Number
* <td>Minus sign
* <tr>
* <th scope="row">{@code ,}
* <td>{@link DecimalFormatSymbols#getGroupingSeparator()}
* <td>Number
* <td>Grouping separator or monetary grouping separator
* <tr>
* <th scope="row">{@code E}
* <td>{@link DecimalFormatSymbols#getExponentSeparator()}
* <td>Number
* <td>Separates mantissa and exponent in scientific notation. This value
* is case sensitive. <em>Need not be quoted in prefix or suffix.</em>
* <tr>
* <th scope="row">{@code ;}
* <td>{@link DecimalFormatSymbols#getPatternSeparator()}
* <td>Subpattern boundary
* <td>Separates positive and negative subpatterns
* <tr>
* <th scope="row">{@code %}
* <td>{@link DecimalFormatSymbols#getPercent()}
* <td>Prefix or suffix
* <td>Multiply by 100 and show as percentage
* <tr>
* <th scope="row">&permil; ({@code U+2030})
* <td>{@link DecimalFormatSymbols#getPerMill()}
* <td>Prefix or suffix
* <td>Multiply by 1000 and show as per mille value
* <tr>
* <th scope="row">&#164; ({@code U+00A4})
* <td> n/a (not localized)
* <td>Prefix or suffix
* <td>Currency sign, replaced by currency symbol. If
* doubled, replaced by international currency symbol.
* If present in a pattern, the monetary decimal/grouping separators
* are used instead of the decimal/grouping separators.
* <tr>
* <th scope="row">{@code ' (U+0027)}
* <td> n/a (not localized)
* <td>Prefix or suffix
* <td>Used to quote special characters in a prefix or suffix,
* for example, {@code "'#'#"} formats 123 to
* {@code "#123"}. To create a single quote
* itself, use two in a row: {@code "# o''clock"}.
* </tbody>
* </table>
* </blockquote>
*
* <h3>Maximum Digits Derivation</h3>
* For any given {@code DecimalFormat} pattern, if the pattern is not
* in scientific notation, the maximum number of integer digits will not be
* derived from the pattern, and instead set to {@link Integer#MAX_VALUE}.
* Otherwise, if the pattern is in scientific notation, the maximum number of
* integer digits will be derived from the pattern. This derivation is detailed
* in the {@link ##scientific_notation Scientific Notation} section. {@link
* #setMaximumIntegerDigits(int)} can be used to manually adjust the maximum
* integer digits.
*
* <h3>Negative Subpatterns</h3>
* A {@code DecimalFormat} pattern contains a positive and negative
* subpattern, for example, {@code "#,##0.00;(#,##0.00)"}. Each
* subpattern has a prefix, numeric part, and suffix. The negative subpattern
* is optional; if absent, then the positive subpattern prefixed with the
* minus sign ({@code '-' U+002D HYPHEN-MINUS}) is used as the
* minus sign {@code '-' (U+002D HYPHEN-MINUS)} is used as the
* negative subpattern. That is, {@code "0.00"} alone is equivalent to
* {@code "0.00;-0.00"}. If there is an explicit negative subpattern, it
* serves only to specify the negative prefix and suffix; the number of digits,
@@ -158,105 +321,15 @@ import sun.util.locale.provider.ResourceBundleBasedAdapter;
* specified.) Another example is that the decimal separator and grouping
* separator should be distinct characters, or parsing will be impossible.
*
* <h3>Grouping Separator</h3>
* <p>The grouping separator is commonly used for thousands, but in some
* countries it separates ten-thousands. The grouping size is a constant number
* locales it separates ten-thousands. The grouping size is a constant number
* of digits between the grouping characters, such as 3 for 100,000,000 or 4 for
* 1,0000,0000. If you supply a pattern with multiple grouping characters, the
* 1,0000,0000. If you supply a pattern with multiple grouping characters, the
* interval between the last one and the end of the integer is the one that is
* used. So {@code "#,##,###,####"} == {@code "######,####"} ==
* used. For example, {@code "#,##,###,####"} == {@code "######,####"} ==
* {@code "##,####,####"}.
*
* <h3><a id="special_pattern_character">Special Pattern Characters</a></h3>
*
* <p>Many characters in a pattern are taken literally; they are matched during
* parsing and output unchanged during formatting. Special characters, on the
* other hand, stand for other characters, strings, or classes of characters.
* They must be quoted, unless noted otherwise, if they are to appear in the
* prefix or suffix as literals.
*
* <p>The characters listed here are used in non-localized patterns. Localized
* patterns use the corresponding characters taken from this formatter's
* {@code DecimalFormatSymbols} object instead, and these characters lose
* their special status. Two exceptions are the currency sign and quote, which
* are not localized.
*
* <blockquote>
* <table class="striped">
* <caption style="display:none">Chart showing symbol, location, localized, and meaning.</caption>
* <thead>
* <tr>
* <th scope="col" style="text-align:left">Symbol
* <th scope="col" style="text-align:left">Location
* <th scope="col" style="text-align:left">Localized?
* <th scope="col" style="text-align:left">Meaning
* </thead>
* <tbody>
* <tr style="vertical-align:top">
* <th scope="row">{@code 0}
* <td>Number
* <td>Yes
* <td>Digit
* <tr style="vertical-align: top">
* <th scope="row">{@code #}
* <td>Number
* <td>Yes
* <td>Digit, zero shows as absent
* <tr style="vertical-align:top">
* <th scope="row">{@code .}
* <td>Number
* <td>Yes
* <td>Decimal separator or monetary decimal separator
* <tr style="vertical-align: top">
* <th scope="row">{@code -}
* <td>Number
* <td>Yes
* <td>Minus sign
* <tr style="vertical-align:top">
* <th scope="row">{@code ,}
* <td>Number
* <td>Yes
* <td>Grouping separator or monetary grouping separator
* <tr style="vertical-align: top">
* <th scope="row">{@code E}
* <td>Number
* <td>Yes
* <td>Separates mantissa and exponent in scientific notation.
* <em>Need not be quoted in prefix or suffix.</em>
* <tr style="vertical-align:top">
* <th scope="row">{@code ;}
* <td>Subpattern boundary
* <td>Yes
* <td>Separates positive and negative subpatterns
* <tr style="vertical-align: top">
* <th scope="row">{@code %}
* <td>Prefix or suffix
* <td>Yes
* <td>Multiply by 100 and show as percentage
* <tr style="vertical-align:top">
* <th scope="row">{@code U+2030}
* <td>Prefix or suffix
* <td>Yes
* <td>Multiply by 1000 and show as per mille value
* <tr style="vertical-align: top">
* <th scope="row">&#164; ({@code U+00A4})
* <td>Prefix or suffix
* <td>No
* <td>Currency sign, replaced by currency symbol. If
* doubled, replaced by international currency symbol.
* If present in a pattern, the monetary decimal/grouping separators
* are used instead of the decimal/grouping separators.
* <tr style="vertical-align:top">
* <th scope="row">{@code '}
* <td>Prefix or suffix
* <td>No
* <td>Used to quote special characters in a prefix or suffix,
* for example, {@code "'#'#"} formats 123 to
* {@code "#123"}. To create a single quote
* itself, use two in a row: {@code "# o''clock"}.
* </tbody>
* </table>
* </blockquote>
*
* <h3 id="scientific_notation">Scientific Notation</h3>
*
* <p>Numbers in scientific notation are expressed as the product of a mantissa
@@ -339,95 +412,13 @@ import sun.util.locale.provider.ResourceBundleBasedAdapter;
* <li>Exponential patterns may not contain grouping separators.
* </ul>
*
* <h3>Rounding</h3>
*
* {@code DecimalFormat} provides rounding modes defined in
* {@link java.math.RoundingMode} for formatting. By default, it uses
* {@link java.math.RoundingMode#HALF_EVEN RoundingMode.HALF_EVEN}.
*
* <h3>Digits</h3>
*
* For formatting, {@code DecimalFormat} uses the ten consecutive
* characters starting with the localized zero digit defined in the
* {@code DecimalFormatSymbols} object as digits. For parsing, these
* digits as well as all Unicode decimal digits, as defined by
* {@link Character#digit Character.digit}, are recognized.
*
* <h3 id="digit_limits"> Integer and Fraction Digit Limits </h3>
*
* @implSpec
* When formatting a {@code Number} other than {@code BigInteger} and
* {@code BigDecimal}, {@code 309} is used as the upper limit for integer digits,
* and {@code 340} as the upper limit for fraction digits. This occurs, even if
* one of the {@code DecimalFormat} getter methods, for example, {@link #getMinimumFractionDigits()}
* returns a numerically greater value.
*
* <h4>Special Values</h4>
*
* <p>Not a Number({@code NaN}) is formatted as a string, which typically has a
* single character {@code U+FFFD}. This string is determined by the
* {@code DecimalFormatSymbols} object. This is the only value for which
* the prefixes and suffixes are not used.
*
* <p>Infinity is formatted as a string, which typically has a single character
* {@code U+221E}, with the positive or negative prefixes and suffixes
* applied. The infinity string is determined by the
* {@code DecimalFormatSymbols} object.
*
* <p>Negative zero ({@code "-0"}) parses to
* <ul>
* <li>{@code BigDecimal(0)} if {@code isParseBigDecimal()} is
* true,
* <li>{@code Long(0)} if {@code isParseBigDecimal()} is false
* and {@code isParseIntegerOnly()} is true,
* <li>{@code Double(-0.0)} if both {@code isParseBigDecimal()}
* and {@code isParseIntegerOnly()} are false.
* </ul>
*
* <h3><a id="synchronization">Synchronization</a></h3>
*
* <p>
* Decimal formats are generally not synchronized.
* It is recommended to create separate format instances for each thread.
* If multiple threads access a format concurrently, it must be synchronized
* externally.
*
* <h3>Example</h3>
*
* <blockquote>{@snippet lang=java :
* // Print out a number using the localized number, integer, currency,
* // and percent format for each locale
* Locale[] locales = NumberFormat.getAvailableLocales();
* double myNumber = -1234.56;
* NumberFormat form;
* for (int j = 0; j < 4; ++j) {
* System.out.println("FORMAT");
* for (Locale locale : locales) {
* if (locale.getCountry().length() == 0) {
* continue; // Skip language-only locales
* }
* System.out.print(locale.getDisplayName());
* form = switch (j) {
* case 0 -> NumberFormat.getInstance(locale);
* case 1 -> NumberFormat.getIntegerInstance(locale);
* case 2 -> NumberFormat.getCurrencyInstance(locale);
* default -> NumberFormat.getPercentInstance(locale);
* };
* if (form instanceof DecimalFormat decForm) {
* System.out.print(": " + decForm.toPattern());
* }
* System.out.print(" -> " + form.format(myNumber));
* try {
* System.out.println(" -> " + form.parse(form.format(myNumber)));
* } catch (ParseException e) {}
* }
* }
* }</blockquote>
*
* @spec https://www.unicode.org/reports/tr35
* Unicode Locale Data Markup Language (LDML)
* @see <a href="http://docs.oracle.com/javase/tutorial/i18n/format/decimalFormat.html">Java Tutorial</a>
* @see NumberFormat
* @see DecimalFormatSymbols
* @see ParsePosition
* @see Locale
* @author Mark Davis
* @author Alan Liu
* @since 1.1
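
Reviewer sketch (not part of the commit): the pattern rules described in the updated class description can be exercised with a small standalone program. The class name `DecimalFormatPatternSketch` and the helpers `fmt` and `fmtHalfUp` are hypothetical names introduced only for illustration. The sketch pins the symbols to `Locale.US` so the output does not depend on the default locale, and it assumes the JDK's default (CLDR) locale data. It covers the implicit negative subpattern, the grouping-size rule, quoting, the default `HALF_EVEN` rounding, and the prefix/suffix handling of `NaN` and infinity.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class DecimalFormatPatternSketch {
    // Fixed US symbols so results do not vary with the default locale (an assumption of this sketch).
    static final DecimalFormatSymbols US = DecimalFormatSymbols.getInstance(Locale.US);

    // Hypothetical helper: format a value with a non-localized pattern and US symbols.
    static String fmt(String pattern, double value) {
        return new DecimalFormat(pattern, US).format(value);
    }

    // Hypothetical helper: same as fmt, but with the rounding mode switched to HALF_UP.
    static String fmtHalfUp(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, US);
        df.setRoundingMode(RoundingMode.HALF_UP);
        return df.format(value);
    }

    public static void main(String[] args) {
        // Negative subpattern absent: the positive subpattern prefixed with '-' is used.
        System.out.println(fmt("0.00", -1234.5));                // -1234.50
        System.out.println(fmt("0.00;-0.00", -1234.5));          // -1234.50

        // Explicit negative subpattern supplies only the negative prefix and suffix.
        System.out.println(fmt("#,##0.00;(#,##0.00)", -1234.5)); // (1,234.50)

        // Only the last grouping interval counts, so both patterns group by 4.
        System.out.println(fmt("#,##,###,####", 123456789));     // 1,2345,6789
        System.out.println(fmt("######,####", 123456789));       // 1,2345,6789

        // Quoting a special character makes it a literal in the prefix.
        System.out.println(fmt("'#'#", 123));                    // #123

        // Default rounding is HALF_EVEN; 2.5 and 3.5 are exact ties in binary.
        System.out.println(fmt("0", 2.5));                       // 2
        System.out.println(fmt("0", 3.5));                       // 4
        System.out.println(fmtHalfUp("0", 2.5));                 // 3

        // NaN gets no prefix or suffix; infinity does. The strings come from the symbols object.
        System.out.println(fmt("#,##0.00;(#,##0.00)", Double.NaN));
        System.out.println(fmt("#,##0.00;(#,##0.00)", Double.NEGATIVE_INFINITY));
    }
}
```

The last two lines print whatever `DecimalFormatSymbols#getNaN()` and `DecimalFormatSymbols#getInfinity()` return for the chosen locale, which is why no expected output is shown for them.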