Importing and Exporting (I/O)
Importing data from tabular data files
To read data from a CSV-like file, use the readtable function:
DataTables.readtable - Function.
Read data from a tabular-file format (CSV, TSV, ...)
readtable(filename, [keyword options])
Arguments
filename::AbstractString: the filename to be read
Keyword Arguments
header::Bool – Use the information from the file's header line to determine column names. Defaults to true.
separator::Char – Assume that fields are split by the separator character. If not specified, it will be guessed from the filename: .csv defaults to ',', .tsv defaults to '\t', .wsv defaults to ' '.
quotemark::Vector{Char} – Assume that fields contained inside of two quotemark characters are quoted, which disables processing of separators and linebreaks. Set to Char[] to disable this feature and slightly improve performance. Defaults to ['"'].
decimal::Char – Assume that the decimal place in numbers is written using the decimal character. Defaults to '.'.
nastrings::Vector{String} – Translate any of the strings in this vector into a NULL value. Defaults to ["", "NULL", "NA"].
truestrings::Vector{String} – Translate any of the strings in this vector into a Boolean true. Defaults to ["T", "t", "TRUE", "true"].
falsestrings::Vector{String} – Translate any of the strings in this vector into a Boolean false. Defaults to ["F", "f", "FALSE", "false"].
makefactors::Bool – Convert string columns into CategoricalVectors for use as factors. Defaults to false.
nrows::Int – Read only nrows from the file. Defaults to -1, which indicates that the entire file should be read.
names::Vector{Symbol} – Use the values in this array as the names for all columns instead of the names in the file's header. Defaults to [], which indicates that the header should be used if present or that numeric names should be invented if there is no header.
eltypes::Vector – Specify the types of all columns. Defaults to [].
allowcomments::Bool – Ignore all text inside comments. Defaults to false.
commentmark::Char – Specify the character that starts comments. Defaults to '#'.
ignorepadding::Bool – Ignore all whitespace on left and right sides of a field. Defaults to true.
skipstart::Int – Specify the number of initial rows to skip. Defaults to 0.
skiprows::Vector{Int} – Specify the indices of lines in the input to ignore. Defaults to [].
skipblanks::Bool – Skip any blank lines in input. Defaults to true.
encoding::Symbol – Specify the file's encoding as either :utf8 or :latin1. Defaults to :utf8.
normalizenames::Bool – Ensure that column names are valid Julia identifiers. For instance, this renames a column named "a b" to "a_b", which can then be accessed with :a_b instead of Symbol("a b"). Defaults to true.
Result
::DataTable
Examples
dt = readtable("data.csv")
dt = readtable("data.tsv")
dt = readtable("data.wsv")
dt = readtable("data.txt", separator = ' ')
dt = readtable("data.txt", header = false)
readtable requires that you specify the path of the file that you would like to read as a String. To read data from a non-file source, you may also supply an IO object. readtable also supports many additional keyword arguments, which are documented in the section on advanced I/O operations.
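As an illustration of the IO form, here is a minimal sketch that reads from an in-memory buffer instead of a file. It assumes DataTables is installed; the CSV text is made up for the example, and only keyword arguments listed above are used.

```julia
using DataTables

# Made-up CSV text standing in for a socket, pipe, or other non-file source.
io = IOBuffer("""
name,age,score
Alice,36,3.14
Bob,NA,2.71
""")

# The separator is passed explicitly because there is no filename to guess it from.
# Fields equal to "" or "NA" are translated into NULL values.
dt = readtable(io, separator = ',', nastrings = ["", "NA"])

println(names(dt))
println(size(dt))
```

Passing separator explicitly is a defensive choice here: the filename-based guess described above cannot apply to an IO object.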
Exporting data to a tabular data file
To write data to a CSV file, use the writetable function:
DataTables.writetable - Function.
Write data to a tabular-file format (CSV, TSV, ...)
writetable(filename, dt, [keyword options])
Arguments
filename::AbstractString: the filename to be created
dt::AbstractDataTable: the AbstractDataTable to be written
Keyword Arguments
separator::Char – The separator character that you would like to use. Defaults to the output of getseparator(filename), which uses commas for files that end in .csv, tabs for files that end in .tsv and a single space for files that end in .wsv.
quotemark::Char – The character used to delimit string fields. Defaults to '"'.
header::Bool – Whether the file should contain a header that specifies the column names from dt. Defaults to true.
nastring::AbstractString – What to write in place of missing data. Defaults to "NULL".
Result
::DataTable
Examples
dt = DataTable(A = 1:10)
writetable("output.csv", dt)
writetable("output.dat", dt, separator = ',', header = false)
writetable("output.dat", dt, quotemark = '\'', separator = ',')
writetable("output.dat", dt, header = false)
Supplying DataTables inline with non-standard string literals
You can also provide CSV-like tabular data in a non-standard string literal to construct a new DataTable, as in the following:
dt = csv"""
name, age, squidPerWeek
Alice, 36, 3.14
Bob, 24, 0
Carol, 58, 2.71
Eve, 49, 7.77
"""
The csv string literal prefix indicates that the data are supplied in standard comma-separated value format. Common alternative formats are also available as string literals. For semicolon-separated values, with comma as a decimal, use csv2:
dt = csv2"""
name; age; squidPerWeek
Alice; 36; 3,14
Bob; 24; 0
Carol; 58; 2,71
Eve; 49; 7,77
"""
For whitespace-separated values, use wsv:
dt = wsv"""
name age squidPerWeek
Alice 36 3.14
Bob 24 0
Carol 58 2.71
Eve 49 7.77
"""
And for tab-separated values, use tsv:
dt = tsv"""
name	age	squidPerWeek
Alice	36	3.14
Bob	24	0
Carol	58	2.71
Eve	49	7.77
"""