Projects with this topic
This project was done on a text dataset labeled by topic. Its goal is to serve as a small personal reference on clustering, clustering metrics, and dimensionality reduction techniques. It was not intended for anyone else, but if you find it useful, feel free to learn from it and share it with your friends.
Deployment link: https://orthodox-clusteringfr-gb3ygqya9aicctlx5qdwft.streamlit.app/
pt_kmeans is a high-performance, pure PyTorch K-Means implementation for CPU/GPU, featuring K-Means++ initialization, hierarchical clustering, and cluster splitting, optimized for large-scale datasets.
A regular expression generator for arbitrary sets of strings. Returns the patterns with exact or generalised character sets, depending on the choice of the user, and facilitates clustering over patterns to create superpatterns.
A program that utilizes cluster computing and parallel programming to simulate trading strategies on the Nordic stock market.
My attempts to solve homework from the Moscow Institute of Physics and Technology course
Collection of completed coursework for a university data-mining course, in Python
A simple spectral clustering example made with Python.
Cloud container data analytics, statistical modeling, and machine learning on distributed databases. A free open-source alternative to SPSS, SAS, MATLAB, PowerBI, Tableau and Alteryx. Runs on Linux, Windows, macOS, and in the cloud via containers.
This repository contains the implementation of the DPC-based algorithm described in Russo, E.T., Laio, A. & Punta, M. Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation. BMC Bioinformatics 22, 121 (2021). https://doi.org/10.1186/s12859-021-04013-x. Note that the implementation was written for the purpose of analysing, on a traditional workstation (8 GB RAM, 4-8 cores), query datasets with up to 5000 proteins, such as those analysed in the reference paper.
Python libraries for Principal Component Analysis (PCA)-based model order reduction, clustering and data analysis.
pyAMNESIA: a Python pipeline for analysing the Activity and Morphology of NEurons using Skeletonization and other Image Analysis techniques.
Clustering documents to get an overview of a corpus
Machine learning. Analysis of clustering algorithms
Testing functions and features of a music production utility app.
This repo presents performance comparisons between serial, MPI-based, and Spark-based implementations of a document clustering algorithm