Projects with this topic
Hybrid Filtering with Spark MLlib
Performance benchmarking tool for ClickHouse Key-Value workloads, featuring a synthetic data generator and a multithreaded runner.
Real-time inference simulator using the MVTec dataset, with an approach based on DINOv2 and Claude Sonnet 3.5 (via Amazon Bedrock's API).
End-to-end AWS data lake pipeline for fleet telemetry data using S3, Spark, and Athena. Includes partitioned Parquet ETL, vehicle safety analytics, and SQL queries for overspeed and harsh braking detection.
Distributed geospatial data pipeline using Apache Spark and Apache Sedona to analyze NYC taxi demand hotspots from millions of trip records.
Introduction to Cassandra and a benchmark of its main mechanisms. Developed for the final exam of the master's course "Architetture Dati".
A project built around a simulated music streaming app (similar to Spotify) that generates live user events, for practicing real-time streaming of large amounts of data.
The SalesStream dashboard is an application for monitoring and analyzing revenue data in real time. Built on Apache Spark and Apache Kafka, it processes financial data efficiently and promptly, giving companies up-to-date insight into their revenue streams.
This project aims to predict the outcome of future presidential elections using artificial intelligence.
Notebooks for Pandas, Spark and Python experiments.
Implementation of Geo-Temporally Weighted Regression (GTWR) using Apache Spark, Spark ML and Apache Sedona.
Implementation of Geographically Weighted Regression (GWR) using Apache Spark, Spark ML and Apache Sedona.
The "Stage Metrics" plugin for Apache Spark, which creates metrics by stage status.
Sentiment analysis using the Spark ML library. Implements classic ML models (SVM, Logistic Regression, Naive Bayes, and Random Forest) and text representations (Word2Vec and TF-IDF), along with ensemble and hybrid (ML plus lexicon-based) methods.