Use pandas DataFrames as the default container format

Currently ndarrays are used as the container (for example, when datasets are read). I think we should use pandas DataFrames by default, and people can then convert to ndarray if needed. One argument for this: when serializing with arrow, an ndarray with object dtype gets pickled as a whole, whereas with pandas only the columns with object dtype would be pickled.
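To illustrate the dtype difference behind that argument, here is a minimal sketch. The column names and values are made up for illustration, and it does not exercise the serialization path itself; it only shows which parts of the data end up with object dtype in each container.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type dataset: two numeric columns and one string column.
ids = np.array([1, 2, 3], dtype=np.int64)
scores = np.array([0.5, 1.5, 2.5], dtype=np.float64)
labels = np.array(["a", "b", "c"], dtype=object)

# A single 2-D ndarray has one dtype for all elements, so holding the
# string column forces the whole array to object dtype.
as_ndarray = np.empty((3, 3), dtype=object)
as_ndarray[:, 0] = ids
as_ndarray[:, 1] = scores
as_ndarray[:, 2] = labels

# A DataFrame keeps a dtype per column. The label column is pinned to
# object dtype explicitly so the result does not depend on the pandas
# version's string inference.
as_frame = pd.DataFrame(
    {
        "id": ids,
        "score": scores,
        "label": pd.Series(labels, dtype=object),
    }
)

# Only these columns carry object dtype.
object_columns = [
    name
    for name, dtype in as_frame.dtypes.items()
    if pd.api.types.is_object_dtype(dtype)
]

# Converting back to ndarray when needed, here for the numeric columns only.
numeric = as_frame[["id", "score"]].to_numpy()

print(as_ndarray.dtype)
print(object_columns)
print(numeric.dtype)
```

With the DataFrame, the `id` and `score` columns stay as typed numeric data and only `label` is object dtype, while the equivalent ndarray is object dtype throughout. Anyone who needs an ndarray can still get one with `to_numpy()`.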