Explore projects
ufscar / hpc / Exemplos / spark
BSD 3-Clause "New" or "Revised" License
Apache Spark with a master-slave setup that works out of the box using OpenHPC and Slurm.
Updated -
containalytics / containalytics
GNU Affero General Public License v3.0
"Cloud container data analytics, statistical modeling, and machine learning on distributed databases". "A free opensource alternative to SPSS, SAS, MATLAB, PowerBI, Tableau and Alteryx". Runs on Linux, Windows, macOS, and in the cloud via containers.
Updated -
This web app finds the best configuration for a Spark application given the hardware of the cluster.
Updated -
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is both an example and a framework for building ETL pipelines for this data with Apache Spark and Java.
Updated -
Luis Miguel Mejía Suárez / documents-cluster
Apache License 2.0
This repo presents performance comparisons between serial, MPI-based, and Spark-based implementations of a document clustering algorithm.
Updated -
thomas lörtsch / trice
BSD 3-Clause "New" or "Revised" License
Analytics suite for 'Tor Project' metrics data
Updated