Explore projects
containalytics / containalytics
GNU Affero General Public License v3.0
"Cloud container data analytics, statistical modeling, and machine learning on distributed databases". "A free opensource alternative to SPSS, SAS, MATLAB, PowerBI, Tableau and Alteryx". Runs on Linux, Windows, macOS, and in the cloud via containers.
Daniel Silva / Spark in Swarm
GNU General Public License v3.0 or later
Spark on Docker, running on Docker Compose or Docker Swarm.
This web app finds the best configuration for a Spark application given the hardware of the cluster.
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.