Commit 6f495cc9 authored by Jozef Hajnala

Add vroom::vroom with multiple files

parent 69e820bb
@@ -38,6 +38,12 @@ bash bench/bench.sh rscripts/03_readr.R &> results/out_readr.txt
bash bench/bench.sh rscripts/06_vroom_purrr.R &> results/out_vroom_purrr.txt
```
### For vroom::vroom on multiple files
```
bash bench/bench.sh rscripts/07_vroom.R &> results/out_vroom.txt
```
### For data.table::fread with grep
```
@@ -58,6 +64,7 @@ bash bench/bench.sh rscripts/05_readr_grep.R &> results/out_readr_grep.txt
| `readr::read_csv` + `purrr::map_dfr` | 27.02 GB | 3.43 m |
| `vroom::vroom` + `purrr::map_dfr` * ** | 25.70 GB | 1.67 m |
| `data.table::fread` + `rbindlist` | 15.25 GB | 1.40 m |
| `vroom::vroom` (multiple files) * ** | 31.18 GB | 1.30 m |
| `data.table::fread` from `grep` | 1.68 GB | 0.34 m |
| `readr::read_csv`+ `pipe()` from `grep`| 1.70 GB | 0.88 m |
rscripts/07_vroom.R
Maximum resident set size (kbytes): 32691740
real 13m5.144s
user 64m57.651s
sys 1m20.970s
suppressPackageStartupMessages({
  library(vroom)
})
dataDir <- path.expand("~/dataexpo")
dataFiles <- dir(dataDir, pattern = "csv$", full.names = TRUE)
col_types <- vroom::cols(
  .default = vroom::col_double(),
  UniqueCarrier = vroom::col_character(),
  TailNum = vroom::col_character(),
  Origin = vroom::col_character(),
  Dest = vroom::col_character(),
  CancellationCode = vroom::col_character(),
  CarrierDelay = vroom::col_double(),
  WeatherDelay = vroom::col_double(),
  NASDelay = vroom::col_double(),
  SecurityDelay = vroom::col_double(),
  LateAircraftDelay = vroom::col_double()
)
df <- vroom::vroom(dataFiles, col_types = col_types, progress = FALSE)
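
The 31.18 GB figure in the results table appears to correspond to the `Maximum resident set size (kbytes): 32691740` line in the benchmark output. A minimal sketch of that conversion follows, assuming 1 GB is taken as 1024 * 1024 kbytes. The helper name `kb_to_gb` is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not part of the repository): convert a
# "Maximum resident set size (kbytes)" value to GB, assuming
# 1 GB = 1024 * 1024 kbytes, and print it with two decimals.
kb_to_gb() {
  awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / (1024 * 1024) }'
}

# Value reported for rscripts/07_vroom.R
kb_to_gb 32691740   # prints 31.18
```

The same helper could be applied to the peak memory values of the other benchmark outputs, provided they report the value in kbytes as well.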