[DOC] Tweaks for String#grapheme_clusters

BurdetteLamar 2025-08-01 09:19:18 -05:00 committed by Peter Zhu
parent a6aaeb9acf
commit b7f65f01ee

@@ -1,6 +1,19 @@
 Returns an array of the grapheme clusters in +self+
 (see {Unicode Grapheme Cluster Boundaries}[https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries]):
 
-  s = "\u0061\u0308-pqr-\u0062\u0308-xyz-\u0063\u0308" # => "ä-pqr-b̈-xyz-c̈"
+  s = "ä-pqr-b̈-xyz-c̈"
+  s.size # => 16
+  s.bytesize # => 19
+  s.grapheme_clusters.size # => 13
   s.grapheme_clusters
   # => ["ä", "-", "p", "q", "r", "-", "b̈", "-", "x", "y", "z", "-", "c̈"]
+
+Details:
+
+  s = "ä"
+  s.grapheme_clusters            # => ["ä"]            # One grapheme cluster.
+  s.bytes                        # => [97, 204, 136]   # Three bytes.
+  s.chars                        # => ["a", "̈"]        # Two characters.
+  s.chars.map {|char| char.ord } # => [97, 776]        # Their values.
+
+Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString].
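
The difference between characters and grapheme clusters matters most when a string is cut or reordered. The following is a minimal sketch, not part of the commit: it rebuilds the example string from explicit escapes (so the combining marks survive copy and paste) and uses a hypothetical helper, truncate_clusters, which is an assumption for illustration and not a String method.

```ruby
# Same string as in the documentation, written with explicit escapes:
# each base letter (a, b, c) is followed by U+0308 COMBINING DIAERESIS.
s = "\u0061\u0308-pqr-\u0062\u0308-xyz-\u0063\u0308"

# Hypothetical helper (an assumption, not part of String):
# keep the first n grapheme clusters instead of the first n characters.
def truncate_clusters(str, n)
  str.grapheme_clusters.first(n).join
end

# Slicing by character splits the first cluster and drops its diaeresis.
by_chars = s[0, 1]

# Slicing by cluster keeps the base letter and its combining mark together.
by_clusters = truncate_clusters(s, 1)

# String#reverse works on characters, so each combining mark ends up
# in front of its base letter; reversing the clusters keeps each pair intact.
reversed_by_chars    = s.reverse
reversed_by_clusters = s.grapheme_clusters.reverse.join

p by_chars.chars.map(&:ord)     # code points kept by the character slice
p by_clusters.chars.map(&:ord)  # code points kept by the cluster slice
p reversed_by_clusters == reversed_by_chars
```

Working on the array returned by grapheme_clusters is the simplest approach for short strings; for long input, String#each_grapheme_cluster avoids building the whole array at once.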