Make merging with Spark work in containers
(Py)Spark poses some challenges in Docker. There are two ways to make it work in containers:
- Run Spark inside the main process like we do locally.
- Run a Spark cluster in Docker containers, and submit our job there.
Either way, once this is done, we can move merging and news processing into Docker.
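
As a rough sketch of how the same merging code could support both options: in PySpark the choice mostly comes down to the master URL passed to the session builder. The environment variable name `SPARK_MASTER_URL`, the hostname `spark-master`, and the function names below are assumptions for illustration, not part of the existing code.

```python
import os


def resolve_spark_master(env=None):
    """Pick the Spark master URL.

    Defaults to "local[*]" (Spark runs inside the main process).
    If SPARK_MASTER_URL (hypothetical variable name) is set, e.g. to
    "spark://spark-master:7077", the job is submitted to that cluster.
    """
    env = os.environ if env is None else env
    return env.get("SPARK_MASTER_URL", "local[*]")


def build_session(app_name="merge"):
    """Create a SparkSession for whichever option is configured."""
    # Imported lazily so the module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(resolve_spark_master())
        .appName(app_name)
        .getOrCreate()
    )
```

With this shape, the container for the first option would leave the variable unset, while the second option would set it to the address of the Spark master container. Either way the container still needs a Java runtime for PySpark, which is likely one of the challenges mentioned above.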