Commit 22bd5694 authored by Jon Tavernier

explore gnu parallel

parent f9687510
# GNU Parallel
GNU Parallel is new to me. It looks like a great way to take a script that's written to process one thing and have it process many things on a larger box (e.g. one thing per core).
This "Hello World" exercise shows how to pass multiple arguments to each job. Judging by the options I see, `parallel` is a lot more powerful than this. I'm just scratching the surface here.
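As a smaller warm-up before the city scripts, here is a minimal sketch of what "passing multiple arguments" can look like. It feeds two tab-separated lines to `parallel` and uses `--colsep` to split each line into the positional placeholders `{1}` and `{2}`. The function names `has_gnu_parallel` and `demo_pairs` and the sample rows are made up for illustration. `--keep-order` is added only so the output order is predictable, and the guard skips the demo when GNU Parallel isn't installed (some systems ship an unrelated `parallel` from moreutils).

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the function names and sample rows are hypothetical.

has_gnu_parallel() {
  # moreutils ships an unrelated `parallel`, so check the version banner.
  parallel --version 2>/dev/null | grep -q 'GNU parallel'
}

demo_pairs() {
  # Each input line becomes one job; --colsep splits it into {1} and {2}.
  printf 'Chicago\tIL\nSeattle\tWA\n' |
    parallel --keep-order --colsep '\t' echo '{1} is in {2}'
}

if has_gnu_parallel; then
  demo_pairs
else
  echo "GNU Parallel not found; skipping demo" >&2
fi
```

Swapping `echo` for a real script is the only change needed to turn this into actual work.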
## Process One City
The `./` script simulates processing one city. The script takes two positional parameters: `CITY_NAME` and `STATE_ABBREVIATION`.
### Usage
# crunch data for Chicago, IL
./ Chicago IL
# output
Chicago, IL - Processing in PID 57675
Chicago, IL - Finished Processing in PID 57675 in 6 seconds
## Processing All Cities
The `./` script reads `./cities.tsv` and passes each row's columns as arguments to `parallel`, which runs `./` once per row, several rows at a time.
Chicago IL
Anaheim CA
Buffalo NY
Boulder CO
Atlanta GA
Seattle WA
Ventura CA
Bozeman MT
Hammond IN
Lincoln MO
Orlando FL
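#!/usr/bin/env bash
# Source data is tab-delimited: one city per line (name, then state).
# --bar shows a progress bar; --colsep splits each line into {1} and {2}.
parallel \
  --bar \
  --colsep '\t' \
  ./ {1} {2} \
  < ./cities.tsv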
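For comparison, the same fan-out can be sketched sequentially in plain bash, which makes it easier to see what `parallel` and `--colsep` are doing for each row. This is only a baseline under assumptions: `run_sequentially` is a hypothetical name, and the worker command is passed in as an argument rather than hard-coded.

```shell
#!/usr/bin/env bash
# Sequential baseline, for comparison with the parallel version.
# run_sequentially is a hypothetical helper, not part of the original scripts.

run_sequentially() {
  local tsv_file="$1"
  shift
  local city state
  # IFS set to a tab splits each line into the two columns.
  # File descriptor 3 keeps the worker from consuming the input file on stdin.
  while IFS=$'\t' read -r -u 3 city state || [[ -n "${city}" ]]; do
    "$@" "${city}" "${state}"
  done 3< "${tsv_file}"
}
```

Calling `run_sequentially ./cities.tsv echo` prints one "city state" pair per line. With a real worker in place of `echo`, each row waits for the previous one to finish, which is the wait that `parallel` removes.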
#!/usr/bin/env bash
set -euo pipefail
# Parameters are passed positionally, e.g. `./ Chicago IL`.
CITY_NAME="$1"
STATE_ABBREVIATION="$2"
# Pretend to process data:
# sleep a random number of seconds (1 to 10).
sleep_seconds=$(( (RANDOM % 10) + 1 ))
echo "${CITY_NAME}, ${STATE_ABBREVIATION} - Processing in PID $$"
sleep "${sleep_seconds}"
echo "${CITY_NAME}, ${STATE_ABBREVIATION} - Finished Processing in PID $$ in ${sleep_seconds} seconds"
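Since the script expects exactly two parameters, a friendlier failure than a bare "unbound variable" error from `set -u` may be useful when a row is missing a column. The sketch below is one possible check. `parse_city_args` is a hypothetical helper that is not part of the original script, and exit status 64 is borrowed from the `EX_USAGE` convention in `sysexits.h`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse_city_args is not part of the original script.

parse_city_args() {
  if [[ $# -ne 2 ]]; then
    echo "usage: CITY_NAME STATE_ABBREVIATION" >&2
    return 64  # EX_USAGE: the command was used incorrectly
  fi
  # Echo the pair in the same "City, ST" form the script logs.
  printf '%s, %s\n' "$1" "$2"
}
```

For example, `parse_city_args Chicago IL` prints `Chicago, IL`, while `parse_city_args Chicago` prints the usage line to stderr and returns a non-zero status.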