Calling `columntable(df)` is not a good idea
Because `NamedTuple`s are typed (the column names and column types are part of the type), users should be careful about calling `Tables.columntable` on a large data frame.
For the survey data I work with, I usually have approximately 5000 observations and 1000 columns. Because the data is so wide, a flexible join incurs a delay of about 1 second, and that second is paid again every time I alter the data frame:
julia> df = DataFrame(rand(5000, 1000), :auto);
julia> @time Tables.columntable(df);
0.993039 seconds (248.40 k allocations: 11.678 MiB, 99.71% compilation time)
julia> @time Tables.columntable(df);
0.000238 seconds (20 allocations: 95.281 KiB)
julia> df2 = DataFrame(rand(5000, 1001), :auto);
julia> @time Tables.columntable(df2);
0.961908 seconds (248.53 k allocations: 11.682 MiB, 99.70% compilation time)
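My understanding of why the cost comes back after adding a column: the field names and field types are parameters of the `NamedTuple` type, so every distinct schema is a distinct type and needs fresh compilation. A minimal sketch in plain Julia (the names `nt1`, `nt2`, and `nt3` are just for illustration):

```julia
# Two column tables that differ by a single column.
nt1 = (x1 = [1.0, 2.0], x2 = [3.0, 4.0])
nt2 = (x1 = [1.0, 2.0], x2 = [3.0, 4.0], x3 = [5.0, 6.0])

# The column names live in the type itself.
names1 = fieldnames(typeof(nt1))
names2 = fieldnames(typeof(nt2))

# Adding a column produces a different concrete type,
# so code specialized for nt1 cannot be reused for nt2.
types_differ = typeof(nt1) != typeof(nt2)

# Changing only the values (same names, same element types) keeps the type.
nt3 = (x1 = [9.0], x2 = [8.0])
same_type = typeof(nt1) == typeof(nt3)
```

That matches the timings above: the second call on `df` is fast because the type is unchanged, while `df2` has one more column and triggers compilation again.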
I'm not sure this is something your package can fix. What's needed is an agreed-upon package that provides an untyped object supporting only the Tables.jl interface.
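To make the idea concrete, here is a rough sketch of what I mean by an untyped object. `UntypedColumns` and `untyped` are hypothetical names I made up, not part of any existing package, and I am assuming that implementing the column-access methods below is enough for consumers that only go through `Tables.columns`:

```julia
using Tables

# Hypothetical wrapper: names and columns are stored as plain values,
# so the wrapper's type is the same for every schema.
struct UntypedColumns
    names::Vector{Symbol}
    columns::Vector{AbstractVector}
end

# Declare the wrapper as a column-accessible table.
Tables.istable(::Type{UntypedColumns}) = true
Tables.columnaccess(::Type{UntypedColumns}) = true
Tables.columns(t::UntypedColumns) = t

# Column lookup by name or position, without going through getproperty.
Tables.columnnames(t::UntypedColumns) = getfield(t, :names)
Tables.getcolumn(t::UntypedColumns, i::Int) = getfield(t, :columns)[i]
function Tables.getcolumn(t::UntypedColumns, nm::Symbol)
    i = findfirst(==(nm), getfield(t, :names))
    i === nothing && throw(ArgumentError("column $nm not found"))
    return getfield(t, :columns)[i]
end

# Hypothetical constructor: wrap any Tables.jl source without copying columns.
function untyped(tbl)
    cols = Tables.columns(tbl)
    nms = collect(Symbol, Tables.columnnames(cols))
    return UntypedColumns(nms, AbstractVector[Tables.getcolumn(cols, n) for n in nms])
end

narrow = untyped((a = [1, 2], b = [3.0, 4.0]))
wide = untyped((a = [1, 2], b = [3.0, 4.0], c = ["x", "y"]))
```

Here `narrow` and `wide` have the same type even though their schemas differ, so nothing is recompiled when a column is added. The trade-off is that column access is no longer type-stable, so a consumer would need a function barrier around any hot loop.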