Strapping.jl

This guide documents the Strapping.construct and Strapping.deconstruct functions. The package was born from a desire for straightforward, not-too-magical ORM capabilities in Julia: for example, transforming 2D SQL query results from a database into a Vector of custom application objects without having to write your own adapter code. Strapping.jl integrates with the StructTypes.jl package, which allows customizing how Julia structs and their fields are handled.

If anything isn't clear or you find a bug, don't hesitate to open a new issue, even just for a question, or come chat with us on the #data Slack channel.

`Strapping.construct`

Strapping.construct — Function.

Strapping.construct(T, tbl)
Strapping.construct(Vector{T}, tbl)

Given a Tables.jl-compatible input table source tbl, construct an instance of T (single object, first method) or a Vector{T} (list of objects, second method).

The first method throws an error if the input table is empty, and warns if there are more rows than necessary to construct a single T.

The second method returns an empty list for an empty input source, and otherwise constructs as many T as are found until the input table is exhausted.
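As a rough sketch of the two methods, the following uses a hypothetical Person struct (not part of Strapping.jl) and a Vector of NamedTuples as the row table. The idproperty definition is included as a cautious assumption; it may not be required for a struct without ArrayType fields.

```julia
using Strapping, StructTypes

# Hypothetical type, for illustration only.
struct Person
    id::Int
    name::String
end
StructTypes.StructType(::Type{Person}) = StructTypes.Struct()
StructTypes.idproperty(::Type{Person}) = :id  # assumption: defined defensively

# A Vector of NamedTuples is a Tables.jl-compatible row table.
rows = [(id=1, name="Ada"), (id=2, name="Grace")]

# Second method: one Person per row.
people = Strapping.construct(Vector{Person}, rows)

# First method: pass exactly one row to avoid the "extra rows" warning.
first_person = Strapping.construct(Person, rows[1:1])

# Second method on an empty table: an empty list rather than an error.
nobody = Strapping.construct(Vector{Person}, empty(rows))
```

Passing a one-row slice to the first method is a design choice here: it keeps the call quiet, since the first method warns when extra rows are present.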

Strapping.construct uses the StructTypes.jl package to determine the StructTypes.StructType trait of T and construct an instance appropriately:

* StructTypes.Struct/StructTypes.Mutable: field reflection will be used to retrieve values from the input table row, with field customizations respected, like excluded fields, field-specific keyword args, etc.
* StructTypes.DictType: each column name/value of the table row will be used as a key/value pair to be passed to the DictType constructor
* StructTypes.ArrayType: column values will be "collected" as an array to be passed to the ArrayType constructor
* StructTypes.StringType/StructTypes.NumberType/StructTypes.BoolType/StructTypes.NullType: only the first value of the row will be passed to the scalar type constructor

Note that for StructTypes.DictType and StructTypes.ArrayType, "aggregate" value types and eltypes are not allowed, since the entire row is treated as key/value pairs or array elements. For example, given a table with rows like tbl = [(a=1, b=2)], calling Strapping.construct(Dict{Symbol, Dict{Int, Int}}, tbl) fails: Strapping first maps the column names to the outer Dict keys (a and b), but then tries to map the values 1 and 2 to Dict{Int, Int}, which cannot work.
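By contrast, a DictType with a scalar value type should fit the row-as-key/value-pairs model. A minimal sketch, assuming the same one-row table and that the column names are used as Symbol keys as described in the text:

```julia
using Strapping

# One row; each column name/value is expected to become a key/value pair.
tbl = [(a=1, b=2)]

# Scalar value type (Int), so no aggregate values are involved.
d = Strapping.construct(Dict{Symbol, Int}, tbl)
```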

For structs with ArrayType fields, the first row values will be used for other scalar fields, and subsequent rows will be iterated for the ArrayType field values. For example, I may wish to construct a type like:

using Strapping, StructTypes

struct TestResult
    id::Int
    values::Vector{Float64}
end
StructTypes.StructType(::Type{TestResult}) = StructTypes.Struct()
StructTypes.idproperty(::Type{TestResult}) = :id

and my input table would look something like, tbl = (id=[1, 1, 1], values=[3.14, 3.15, 3.16]). I can then construct my type like:

julia> Strapping.construct(TestResult, tbl)
TestResult(1, [3.14, 3.15, 3.16])

Note that along with defining the StructTypes.StructType trait for TestResult, I also needed to define StructTypes.idproperty to signal which field of my struct is a "unique key" identifier. This enables Strapping to distinguish which rows belong to a particular instance of TestResult. This allows the slightly more complicated example of returning multiple TestResults from a single table:

julia> tbl = (id=[1, 1, 1, 2, 2, 2], values=[3.14, 3.15, 3.16, 40.1, 0.01, 2.34])
(id = [1, 1, 1, 2, 2, 2], values = [3.14, 3.15, 3.16, 40.1, 0.01, 2.34])

julia> Strapping.construct(Vector{TestResult}, tbl)
2-element Array{TestResult,1}:
 TestResult(1, [3.14, 3.15, 3.16])
 TestResult(2, [40.1, 0.01, 2.34])

Here, we actually have two TestResult objects in our tbl, and Strapping uses the id field to identify object owners for a row. Note that currently the table rows need to be sorted on the idproperty field, i.e. rows belonging to the same object must appear consecutively in the input table rows.
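If the input rows are not already grouped by id, one option is to sort them first. The sketch below assumes a row table held as a Vector of NamedTuples and uses Base's sort, which is stable, so the order of values within each id is preserved. The variable names are illustrative only.

```julia
using Strapping, StructTypes

struct TestResult
    id::Int
    values::Vector{Float64}
end
StructTypes.StructType(::Type{TestResult}) = StructTypes.Struct()
StructTypes.idproperty(::Type{TestResult}) = :id

# Rows for ids 1 and 2 are interleaved, which violates the ordering requirement.
rows = [(id=2, values=40.1), (id=1, values=3.14), (id=2, values=0.01), (id=1, values=3.15)]

# Sort on the idproperty field so rows of the same object are consecutive.
sorted_rows = sort(rows; by = r -> r.id)

results = Strapping.construct(Vector{TestResult}, sorted_rows)
```

When the data comes from a SQL query, an ORDER BY clause on the id column would achieve the same grouping at the source.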

Now let's discuss "aggregate" type fields. Let's say I have a struct like:

struct Experiment
    id::Int
    name::String
    testresults::TestResult
end
StructTypes.StructType(::Type{Experiment}) = StructTypes.Struct()
StructTypes.idproperty(::Type{Experiment}) = :id

So my Experiment type also has an id field, in addition to a name field and an "aggregate" field, testresults. How should the input table source account for testresults, which is itself a struct made up of its own id and values fields? The key here is "flattening" nested structs into a single set of table column names, and utilizing the StructTypes.fieldprefix function, which allows specifying a Symbol prefix to identify an aggregate field's columns in the table row. So, in the case of our Experiment, we can do:

StructTypes.fieldprefix(::Type{Experiment}, nm::Symbol) = nm == :testresults ? :testresults_ : :_

Note that for the :testresults field this matches the default behavior (by default, the prefix is the field name followed by an underscore), so we don't really need to define it; we walk through it for illustration. We're saying that for the :testresults field name, we should expect its column names in the table row to start with :testresults_. So the table data for an Experiment instance would look something like:

tbl = (id=[1, 1, 1], name=["exp1", "exp1", "exp1"], testresults_id=[1, 1, 1], testresults_values=[3.14, 3.15, 3.16])

This pattern generalizes to structs with multiple aggregate fields, or aggregate fields that themselves have aggregate fields (nested aggregates); in the nested case, the prefixes are concatenated, like testresults_specifictestresult_id.
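Putting the pieces together, a sketch of constructing an Experiment from the flattened table might look like the following. It relies on the default field prefix rather than defining StructTypes.fieldprefix, and repeats the struct definitions so the block stands alone.

```julia
using Strapping, StructTypes

struct TestResult
    id::Int
    values::Vector{Float64}
end
StructTypes.StructType(::Type{TestResult}) = StructTypes.Struct()
StructTypes.idproperty(::Type{TestResult}) = :id

struct Experiment
    id::Int
    name::String
    testresults::TestResult
end
StructTypes.StructType(::Type{Experiment}) = StructTypes.Struct()
StructTypes.idproperty(::Type{Experiment}) = :id

# Columns of the nested TestResult carry the testresults_ prefix.
tbl = (id=[1, 1, 1],
       name=["exp1", "exp1", "exp1"],
       testresults_id=[1, 1, 1],
       testresults_values=[3.14, 3.15, 3.16])

experiment = Strapping.construct(Experiment, tbl)
```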

`Strapping.deconstruct`

Strapping.deconstruct — Function.

Strapping.deconstruct(x::T)
Strapping.deconstruct(x::Vector{T})

The inverse of Strapping.construct: an object instance x::T or a Vector of objects x::Vector{T} is "deconstructed" into a Tables.jl-compatible row iterator. This follows the same patterns outlined in Strapping.construct with regard to ArrayType and aggregate fields. Specifically, ArrayType fields cause multiple rows to be output, one row per collection element, with the other scalar fields repeated in each row. Similarly, for aggregate fields, the field prefix is used (StructTypes.fieldprefix), and nested aggregates are all flattened into a single list of column names with aggregate prefixes.

In general, this allows outputting any "object" as a 2D table structure that can be stored in any Tables.jl-compatible sink format, e.g. a CSV file, SQLite table, MySQL database table, Feather file, etc.
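As an illustration, the sketch below deconstructs a TestResult and materializes the row iterator with Tables.columntable from Tables.jl, standing in for a real sink. The choice of a column table as the sink is an assumption made for the example.

```julia
using Strapping, StructTypes, Tables

struct TestResult
    id::Int
    values::Vector{Float64}
end
StructTypes.StructType(::Type{TestResult}) = StructTypes.Struct()
StructTypes.idproperty(::Type{TestResult}) = :id

result = TestResult(1, [3.14, 3.15, 3.16])

# One row per element of the ArrayType field, with the scalar id repeated.
rows = Strapping.deconstruct(result)

# Materialize into a NamedTuple of column vectors.
tbl = Tables.columntable(rows)

# Round trip back into an object.
roundtrip = Strapping.construct(TestResult, tbl)
```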
