mirror of
https://github.com/ruby/ruby.git
synced 2025-08-23 13:04:13 +02:00
353 lines
14 KiB
Text
== Recipes for Parsing \CSV
|
|
|
|
For other recipes, see {Recipes for CSV}[./recipes_rdoc.html].
|
|
|
|
All code snippets on this page assume that the following has been executed:
|
|
require 'csv'
|
|
|
|
=== Contents
|
|
|
|
- {Source Formats}[#label-Source+Formats]
|
|
- {Parsing from a String}[#label-Parsing+from+a+String]
|
|
- {Recipe: Parse from String with Headers}[#label-Recipe-3A+Parse+from+String+with+Headers]
|
|
- {Recipe: Parse from String Without Headers}[#label-Recipe-3A+Parse+from+String+Without+Headers]
|
|
- {Parsing from a File}[#label-Parsing+from+a+File]
|
|
- {Recipe: Parse from File with Headers}[#label-Recipe-3A+Parse+from+File+with+Headers]
|
|
- {Recipe: Parse from File Without Headers}[#label-Recipe-3A+Parse+from+File+Without+Headers]
|
|
- {Parsing from an IO Stream}[#label-Parsing+from+an+IO+Stream]
|
|
- {Recipe: Parse from IO Stream with Headers}[#label-Recipe-3A+Parse+from+IO+Stream+with+Headers]
|
|
- {Recipe: Parse from IO Stream Without Headers}[#label-Recipe-3A+Parse+from+IO+Stream+Without+Headers]
|
|
- {Converting Fields}[#label-Converting+Fields]
|
|
- {Converting Fields to Objects}[#label-Converting+Fields+to+Objects]
|
|
- {Recipe: Convert Fields to Integers}[#label-Recipe-3A+Convert+Fields+to+Integers]
|
|
- {Recipe: Convert Fields to Floats}[#label-Recipe-3A+Convert+Fields+to+Floats]
|
|
- {Recipe: Convert Fields to Numerics}[#label-Recipe-3A+Convert+Fields+to+Numerics]
|
|
- {Recipe: Convert Fields to Dates}[#label-Recipe-3A+Convert+Fields+to+Dates]
|
|
- {Recipe: Convert Fields to DateTimes}[#label-Recipe-3A+Convert+Fields+to+DateTimes]
|
|
- {Recipe: Convert Assorted Fields to Objects}[#label-Recipe-3A+Convert+Assorted+Fields+to+Objects]
|
|
- {Recipe: Convert Fields to Other Objects}[#label-Recipe-3A+Convert+Fields+to+Other+Objects]
|
|
- {Recipe: Filter Field Strings}[#label-Recipe-3A+Filter+Field+Strings]
|
|
- {Recipe: Register Field Converters}[#label-Recipe-3A+Register+Field+Converters]
|
|
- {Using Multiple Field Converters}[#label-Using+Multiple+Field+Converters]
|
|
- {Recipe: Specify Multiple Field Converters in Option :converters}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+Option+-3Aconverters]
|
|
- {Recipe: Specify Multiple Field Converters in a Custom Converter List}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+a+Custom+Converter+List]
|
|
- {Converting Headers}[#label-Converting+Headers]
|
|
- {Recipe: Convert Headers to Lowercase}[#label-Recipe-3A+Convert+Headers+to+Lowercase]
|
|
- {Recipe: Convert Headers to Symbols}[#label-Recipe-3A+Convert+Headers+to+Symbols]
|
|
- {Recipe: Filter Header Strings}[#label-Recipe-3A+Filter+Header+Strings]
|
|
- {Recipe: Register Header Converters}[#label-Recipe-3A+Register+Header+Converters]
|
|
- {Using Multiple Header Converters}[#label-Using+Multiple+Header+Converters]
|
|
- {Recipe: Specify Multiple Header Converters in Option :header_converters}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+Option+-3Aheader_converters]
|
|
- {Recipe: Specify Multiple Header Converters in a Custom Header Converter List}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+a+Custom+Header+Converter+List]
|
|
|
|
=== Source Formats
|
|
|
|
You can parse \CSV data from a \String, from a \File (via its path), or from an \IO stream.
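As a quick orientation, the following sketch parses the same data from each of the three kinds of source and compares the results. The temporary file and the \StringIO object are stand-ins chosen for this illustration (they are not part of the recipes below); any readable path or \IO-like object should behave the same way.

```ruby
require 'csv'
require 'stringio'
require 'tempfile'

data = "foo,0\nbar,1\nbaz,2\n"

# From a String: pass the data itself.
from_string = CSV.parse(data)

# From a File: pass a path. A temporary file stands in for a real one here.
from_file = Tempfile.create(['recipes', '.csv']) do |file|
  file.write(data)
  file.flush
  CSV.read(file.path)
end

# From an IO stream: pass an IO-like object. StringIO avoids touching disk.
from_io = CSV.new(StringIO.new(data)).read

# All three sources yield the same Array of Arrays.
p from_string == from_file && from_file == from_io
```

The recipes that follow cover each source in turn, with and without headers.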
|
|
|
|
==== Parsing from a \String
|
|
|
|
You can parse \CSV data from a \String, with or without headers.
|
|
|
|
===== Recipe: Parse from \String with Headers
|
|
|
|
Use class method CSV.parse with option +headers+ to read a source \String all at once
|
|
(may have memory resource implications):
|
|
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
CSV.parse(string, headers: true) # => #<CSV::Table mode:col_or_row row_count:4>
|
|
|
|
Use instance method CSV#each with option +headers+ to read a source \String one row at a time:
|
|
CSV.new(string, headers: true).each do |row|
|
|
  p row
|
|
end
|
|
Output:
|
|
#<CSV::Row "Name":"foo" "Value":"0">
|
|
#<CSV::Row "Name":"bar" "Value":"1">
|
|
#<CSV::Row "Name":"baz" "Value":"2">
|
|
|
|
===== Recipe: Parse from \String Without Headers
|
|
|
|
Use class method CSV.parse without option +headers+ to read a source \String all at once
|
|
(may have memory resource implications):
|
|
string = "foo,0\nbar,1\nbaz,2\n"
|
|
CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
|
|
|
|
Use instance method CSV#each without option +headers+ to read a source \String one row at a time:
|
|
CSV.new(string).each do |row|
|
|
  p row
|
|
end
|
|
Output:
|
|
["foo", "0"]
|
|
["bar", "1"]
|
|
["baz", "2"]
|
|
|
|
==== Parsing from a \File
|
|
|
|
You can parse \CSV data from a \File, with or without headers.
|
|
|
|
===== Recipe: Parse from \File with Headers
|
|
|
|
Use class method CSV.read with option +headers+ to read a file all at once:
|
|
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
path = 't.csv'
|
|
File.write(path, string)
|
|
CSV.read(path, headers: true) # => #<CSV::Table mode:col_or_row row_count:4>
|
|
|
|
Use class method CSV.foreach with option +headers+ to read one row at a time:
|
|
CSV.foreach(path, headers: true) do |row|
|
|
  p row
|
|
end
|
|
Output:
|
|
#<CSV::Row "Name":"foo" "Value":"0">
|
|
#<CSV::Row "Name":"bar" "Value":"1">
|
|
#<CSV::Row "Name":"baz" "Value":"2">
|
|
|
|
===== Recipe: Parse from \File Without Headers
|
|
|
|
Use class method CSV.read without option +headers+ to read a file all at once:
|
|
string = "foo,0\nbar,1\nbaz,2\n"
|
|
path = 't.csv'
|
|
File.write(path, string)
|
|
CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
|
|
|
|
Use class method CSV.foreach without option +headers+ to read one row at a time:
|
|
CSV.foreach(path) do |row|
|
|
  p row
|
|
end
|
|
Output:
|
|
["foo", "0"]
|
|
["bar", "1"]
|
|
["baz", "2"]
|
|
|
|
==== Parsing from an \IO Stream
|
|
|
|
You can parse \CSV data from an \IO stream, with or without headers.
|
|
|
|
===== Recipe: Parse from \IO Stream with Headers
|
|
|
|
Use class method CSV.parse with option +headers+ to read an \IO stream all at once:
|
|
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
path = 't.csv'
|
|
File.write(path, string)
|
|
File.open(path) do |file|
|
|
  CSV.parse(file, headers: true)
|
|
end # => #<CSV::Table mode:col_or_row row_count:4>
|
|
|
|
Use class method CSV.foreach with option +headers+ to read one row at a time:
|
|
File.open(path) do |file|
|
|
  CSV.foreach(file, headers: true) do |row|
|
|
    p row
|
|
  end
|
|
end
|
|
Output:
|
|
#<CSV::Row "Name":"foo" "Value":"0">
|
|
#<CSV::Row "Name":"bar" "Value":"1">
|
|
#<CSV::Row "Name":"baz" "Value":"2">
|
|
|
|
===== Recipe: Parse from \IO Stream Without Headers
|
|
|
|
Use class method CSV.parse without option +headers+ to read an \IO stream all at once:
|
|
string = "foo,0\nbar,1\nbaz,2\n"
|
|
path = 't.csv'
|
|
File.write(path, string)
|
|
File.open(path) do |file|
|
|
  CSV.parse(file)
|
|
end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
|
|
|
|
Use class method CSV.foreach without option +headers+ to read one row at a time:
|
|
File.open(path) do |file|
|
|
  CSV.foreach(file) do |row|
|
|
    p row
|
|
  end
|
|
end
|
|
Output:
|
|
["foo", "0"]
|
|
["bar", "1"]
|
|
["baz", "2"]
|
|
|
|
=== Converting Fields
|
|
|
|
You can use field converters to change parsed \String fields into other objects,
|
|
or to otherwise modify the \String fields.
|
|
|
|
==== Converting Fields to Objects
|
|
|
|
Use field converters to change parsed \String objects into other, more specific, objects.
|
|
|
|
There are built-in field converters for converting to objects of certain classes:
|
|
- \Float
|
|
- \Integer
|
|
- \Date
|
|
- \DateTime
|
|
|
|
Other built-in field converters include:
|
|
- <tt>:numeric</tt>: converts to \Integer and \Float.
|
|
- <tt>:all</tt>: converts to \DateTime, \Integer, \Float.
|
|
|
|
You can also define field converters to convert to objects of other classes.
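The built-in field converters listed above are registered in the \Hash CSV::Converters, so one way to see what a given installation offers is to inspect that \Hash. This is a minimal sketch; the exact set of keys may differ between versions of the csv library, so no particular key list is assumed here.

```ruby
require 'csv'

# Names of the field converters registered in this version of the library.
p CSV::Converters.keys

# A composite converter is stored as a list of other converter names.
p CSV::Converters[:numeric]
p CSV::Converters[:all]
```

Composite entries such as <tt>:numeric</tt> and <tt>:all</tt> simply name other converters, which is the same mechanism a custom converter list uses.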
|
|
|
|
===== Recipe: Convert Fields to Integers
|
|
|
|
Convert fields to \Integer objects using built-in converter <tt>:integer</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :integer)
|
|
parsed.map {|row| row['Value'].class} # => [Integer, Integer, Integer]
|
|
|
|
===== Recipe: Convert Fields to Floats
|
|
|
|
Convert fields to \Float objects using built-in converter <tt>:float</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :float)
|
|
parsed.map {|row| row['Value'].class} # => [Float, Float, Float]
|
|
|
|
===== Recipe: Convert Fields to Numerics
|
|
|
|
Convert fields to \Integer and \Float objects using built-in converter <tt>:numeric</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1.1\nbaz,2.2\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :numeric)
|
|
parsed.map {|row| row['Value'].class} # => [Integer, Float, Float]
|
|
|
|
===== Recipe: Convert Fields to Dates
|
|
|
|
Convert fields to \Date objects using built-in converter <tt>:date</tt>:
|
|
source = "Name,Date\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2001-02-03\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :date)
|
|
parsed.map {|row| row['Date'].class} # => [Date, Date, Date]
|
|
|
|
===== Recipe: Convert Fields to DateTimes
|
|
|
|
Convert fields to \DateTime objects using built-in converter <tt>:date_time</tt>:
|
|
source = "Name,DateTime\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :date_time)
|
|
parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]
|
|
|
|
===== Recipe: Convert Assorted Fields to Objects
|
|
|
|
Convert assorted fields to objects using built-in converter <tt>:all</tt>:
|
|
source = "Type,Value\nInteger,0\nFloat,1.0\nDateTime,2001-02-04\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :all)
|
|
parsed.map {|row| row['Value'].class} # => [Integer, Float, DateTime]
|
|
|
|
===== Recipe: Convert Fields to Other Objects
|
|
|
|
Define a custom field converter to convert \String fields into other objects.
|
|
This example defines and uses a custom field converter
|
|
that converts each column-1 value to a \Rational object:
|
|
rational_converter = proc do |field, field_context|
|
|
  field_context.index == 1 ? field.to_r : field
|
|
end
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, converters: rational_converter)
|
|
parsed.map {|row| row['Value'].class} # => [Rational, Rational, Rational]
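A possible variation on the converter above selects the column by header name instead of by position, which may be easier to maintain if columns are reordered. This sketch assumes that the second block argument (a CSV::FieldInfo) responds to +header+ when option +headers+ is in effect; the name +rational_by_header+ is invented for this illustration.

```ruby
require 'csv'

# Hypothetical variation: choose the column by header name, not by index.
rational_by_header = proc do |field, field_context|
  field_context.header == 'Value' ? field.to_r : field
end

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: rational_by_header)

# Only the 'Value' column is converted; 'Name' fields remain Strings.
p parsed['Value']
p parsed['Name']
```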
|
|
|
|
==== Recipe: Filter Field Strings
|
|
|
|
Define a custom field converter to modify \String fields.
|
|
This example defines and uses a custom field converter
|
|
that strips whitespace from each field value:
|
|
strip_converter = proc {|field| field.strip }
|
|
source = "Name,Value\n foo , 0 \n bar , 1 \n baz , 2 \n"
|
|
parsed = CSV.parse(source, headers: true, converters: strip_converter)
|
|
parsed['Name'] # => ["foo", "bar", "baz"]
|
|
parsed['Value'] # => ["0", "1", "2"]
|
|
|
|
==== Recipe: Register Field Converters
|
|
|
|
Register a custom field converter, assigning it a name;
|
|
then refer to the converter by its name:
|
|
rational_converter = proc do |field, field_context|
|
|
  field_context.index == 1 ? field.to_r : field
|
|
end
|
|
CSV::Converters[:rational] = rational_converter
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, converters: :rational)
|
|
parsed['Value'] # => [(0/1), (1/1), (2/1)]
|
|
|
|
==== Using Multiple Field Converters
|
|
|
|
You can use multiple field converters in either of these ways:
|
|
- Specify converters in option <tt>:converters</tt>.
|
|
- Specify converters in a custom converter list.
|
|
|
|
===== Recipe: Specify Multiple Field Converters in Option <tt>:converters</tt>
|
|
|
|
Apply multiple field converters by specifying them in option <tt>:converters</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
|
|
parsed = CSV.parse(source, headers: true, converters: [:integer, :float])
|
|
parsed['Value'] # => [0, 1.0, 2.0]
|
|
|
|
===== Recipe: Specify Multiple Field Converters in a Custom Converter List
|
|
|
|
Apply multiple field converters by defining and registering a custom converter list:
|
|
strip_converter = proc {|field| field.strip }
|
|
CSV::Converters[:strip] = strip_converter
|
|
CSV::Converters[:my_converters] = [:integer, :float, :strip]
|
|
source = "Name,Value\n foo , 0 \n bar , 1.0 \n baz , 2.0 \n"
|
|
parsed = CSV.parse(source, headers: true, converters: :my_converters)
|
|
parsed['Name'] # => ["foo", "bar", "baz"]
|
|
parsed['Value'] # => [0, 1.0, 2.0]
|
|
|
|
=== Converting Headers
|
|
|
|
You can use header converters to modify parsed \String headers.
|
|
|
|
Built-in header converters include:
|
|
- <tt>:symbol</tt>: converts \String header to \Symbol.
|
|
- <tt>:downcase</tt>: converts \String header to lowercase.
|
|
|
|
You can also define header converters to otherwise modify header \Strings.
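To make the difference between the two built-in header converters concrete, the sketch below applies each to headers that contain a space and punctuation. The sample headers are invented for this illustration; the point is that <tt>:symbol</tt> appears to do more than change the class, also downcasing, dropping punctuation, and replacing whitespace with underscores.

```ruby
require 'csv'

source = "First Name,Total Value!\nfoo,0\n"

# :downcase only changes the case; the headers remain Strings.
downcased = CSV.parse(source, headers: true, header_converters: :downcase).headers
p downcased

# :symbol normalizes each header before converting it to a Symbol.
symbolized = CSV.parse(source, headers: true, header_converters: :symbol).headers
p symbolized
```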
|
|
|
|
==== Recipe: Convert Headers to Lowercase
|
|
|
|
Convert headers to lowercase using built-in converter <tt>:downcase</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: :downcase)
|
|
parsed.headers # => ["name", "value"]
|
|
|
|
==== Recipe: Convert Headers to Symbols
|
|
|
|
Convert headers to downcased Symbols using built-in converter <tt>:symbol</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: :symbol)
|
|
parsed.headers # => [:name, :value]
|
|
parsed.headers.map {|header| header.class} # => [Symbol, Symbol]
|
|
|
|
==== Recipe: Filter Header Strings
|
|
|
|
Define a custom header converter to modify \String headers.
|
|
This example defines and uses a custom header converter
|
|
that capitalizes each header \String:
|
|
capitalize_converter = proc {|header| header.capitalize }
|
|
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: capitalize_converter)
|
|
parsed.headers # => ["Name", "Value"]
|
|
|
|
==== Recipe: Register Header Converters
|
|
|
|
Register a custom header converter, assigning it a name;
|
|
then refer to the converter by its name:
|
|
capitalize_converter = proc {|header| header.capitalize }
|
|
CSV::HeaderConverters[:capitalize] = capitalize_converter
|
|
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: :capitalize)
|
|
parsed.headers # => ["Name", "Value"]
|
|
|
|
==== Using Multiple Header Converters
|
|
|
|
You can use multiple header converters in either of these ways:
|
|
- Specify header converters in option <tt>:header_converters</tt>.
|
|
- Specify header converters in a custom header converter list.
|
|
|
|
===== Recipe: Specify Multiple Header Converters in Option :header_converters
|
|
|
|
Apply multiple header converters by specifying them in option <tt>:header_converters</tt>:
|
|
source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: [:downcase, :symbol])
|
|
parsed.headers # => [:name, :value]
|
|
|
|
===== Recipe: Specify Multiple Header Converters in a Custom Header Converter List
|
|
|
|
Apply multiple header converters by defining and registering a custom header converter list:
|
|
CSV::HeaderConverters[:my_header_converters] = [:symbol, :downcase]
|
|
source = "NAME,VALUE\nfoo,0\nbar,1.0\nbaz,2.0\n"
|
|
parsed = CSV.parse(source, headers: true, header_converters: :my_header_converters)
|
|
parsed.headers # => [:name, :value]
|