Explore projects
This code parses the spark-defaults.conf file and modifies the Spark driver and executor extra classpath settings (spark.driver.extraClassPath and spark.executor.extraClassPath). It was written for an Amazon cluster running Spark.
Updated -
thomas lörtsch / trice
BSD 3-Clause "New" or "Revised" License
Analytics suite for 'Tor Project' metrics data.
Updated -
Iván Károly / dokit
MIT License
Free and open-source documentation sharing website, based on the Java Spark framework.
Updated -
Luis Miguel Mejía Suárez / documents-cluster
Apache License 2.0
This repo presents performance comparisons between a serial implementation, an MPI-based implementation, and a Spark-based implementation of a document clustering algorithm.
Updated -
Christoph Görn / tweest
GNU General Public License v3.0 only
This is my take on 'get Twitter stream, send it over to Apache Kafka as JSON objects, receive it in Apache Spark and process the stream'.
Updated -
DH-TINF15AIBC-DBII-Cassandra / TestApplication
MIT License
This project contains a sample application to test Apache Spark and Cassandra. This collection contains the Scala source code together with the CQL code for the data schema.
Updated -
Rob Agnese / Unit Converter Service
MIT License
This was done as a code challenge for a job application. The goal was to create a REST endpoint that would take a combination of measurement units and return the SI equivalent as well as the conversion factor.
Updated -
Daniel Silva / Spark in Swarm
GNU General Public License v3.0 or later
Spark on Docker, running on composer or Docker Swarm.
Updated -
This project is a real-time streaming data collection system based on Kafka and Spark.
Updated -
LibreHealth / LibreHealth Toolkit / LibreHealth Toolkit FHIR Analytics Module
Mozilla Public License 2.0
Mirrored to GitHub at https://github.com/LibreHealthIO/lh-toolkit-fhir-analytics
Updated -
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.
Updated -
Scala and Spark example code for my machine learning course.
Updated -
Learning Spark in Java/Scala
Updated -
Darko Britvec / Geospatial Distributed Index - Spark Streaming
Apache License 2.0
Spatial join of geospatial data from Kafka streams using Apache Spark (Spark Streaming).
Updated -
Practical assignment for the Big Data Processing module (Spark and Scala) of Keepcoding's V Bootcamp BD & ML
Updated