Explore projects
containalytics / containalytics
GNU Affero General Public License v3.0
"Cloud container data analytics, statistical modeling, and machine learning on distributed databases." "A free open-source alternative to SPSS, SAS, MATLAB, Power BI, Tableau, and Alteryx." Runs on Linux, Windows, macOS, and in the cloud via containers.
Elypia / Webhooker
Apache License 2.0
Web server with a single endpoint for managing webhooks and receiving payloads from clients.
Coursework for the Big Data Processing module (Spark and Scala) of Keepcoding's V Bootcamp BD & ML.
ufscar / hpc / Exemplos / spark
BSD 3-Clause "New" or "Revised" License
Apache Spark with a master-slave setup that works out of the box with OpenHPC and Slurm.
Big Data workshop led by Jimmy Farfán, instructor of the online course "Desarrollo de Aplicaciones de Big Data en Hadoop". For more information or any questions, you can find us on Facebook as Data Hack Formation.
Daniel Silva / Spark in Swarm
GNU General Public License v3.0 or later
Spark on Docker, running on composer or Docker Swarm.
Roffild / RoffildLibrary
Apache License 2.0
Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, and AWS. https://roffild.com/
Thesis project that builds a dataset and uses it for machine learning, with the goal of clustering.
DH-TINF15AIBC-DBII-Cassandra / TestApplication
MIT License
This project contains a sample application to test Apache Spark and Cassandra. This collection contains the Scala source code together with the CQL code for the data schema.
Learning Spark in Java/Scala
Rob Agnese / Unit Converter Service
MIT License
This was done as a code challenge for a job application. The goal was to create a REST endpoint that would take a combination of measurement units and return the SI equivalent as well as the conversion factor.
Darko Britvec / Geospatial Distributed Index - Spark Streaming
Apache License 2.0
Spatial join of geospatial data from Kafka streams using Apache Spark (Spark Streaming).
This project is a real-time streaming data collection system based on Kafka and Spark.
LibreHealth / LibreHealth Toolkit / LibreHealth Toolkit FHIR Analytics Module
Mozilla Public License 2.0
Mirrored to GitHub at https://github.com/LibreHealthIO/lh-toolkit-fhir-analytics
This web app finds the best configuration for a Spark application given the cluster's hardware.
Scala and Spark example code for my machine learning course.
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.