spark
Projects with this topic
Learning Spark in Java/Scala
Scala and Spark example code for my machine learning course.
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is an example and a framework for building ETL for this data with Apache Spark and Java.
Mirrored to GitHub at https://github.com/LibreHealthIO/lh-toolkit-fhir-analytics
This project is a real-time streaming data collection system based on Kafka and Spark.
Python notebooks to demonstrate working with Apache Spark
Spark on Docker, running on composer or Docker Swarm.
This was done as a code challenge for a job application. The goal was to create a REST endpoint that would take a combination of measurement units and return the SI equivalent as well as the conversion factor.
This project contains a sample application to test Apache Spark and Cassandra. This collection contains the Scala source code together with the CQL code for the data schema.
This is my take on 'get a Twitter stream, send it to Apache Kafka as JSON objects, receive it in Apache Spark, and process the stream'
This repo presents performance comparisons between serial, MPI-based, and Spark-based implementations of a document clustering algorithm
Free and open-source documentation-sharing website, based on the Java Spark framework.
Analytics suite for 'Tor Project' metrics data
This code parses the spark-defaults.conf file and changes the Spark driver and Spark executor extraClassPath settings. It was written for an Amazon cluster running Spark.
Apache Spark Streaming code snippets