Projects with this topic
SIMNL is a Python tool that generates federatively consistent synthetic datasets for the three main Dutch government registers: BRP (persons), BAG (addresses), and HR (companies). The generated data is GDPR (AVG) compliant, demographically accurate according to CBS figures, and contains correct cross-references between registers, which makes it well suited for testing applications that work with government data without using real personal data. With support for reproducible output via seed parameters and 30+ validated business rules, SIMNL provides realistic test data for development and integration environments.
Machine learning-based fraud detection system using Logistic Regression, Random Forest, and a Voting Classifier, achieving ~99% accuracy on a synthetic financial dataset.
Bahn-Vorhersage - The best Train Delay Prediction System.
Recommend.Games blog: https://blog.recommend.games/
Academic Master 2 (second-year master's) project in Python on the analysis of air quality in France and Île-de-France. The project combines data scraping, processing of environmental data, statistical analysis of atmospheric pollutant concentrations, and cartographic visualization of air quality indices. The analyses focus in particular on the pollutants NO₂, O₃, and PM10, with a temporal study of Airparif data from 2014 to 2017.
Keywords:
python data-science web-scraping data-analysis data-visualization air-quality environmental-data geospatial-analysis time-series pandas
A modular Clinical NLP Pipeline built to process and analyze unstructured medical text using both traditional machine learning and transformer-based approaches.
The project combines multiple components including OCR, text preprocessing, feature engineering, classification, named entity recognition, and visualization into a single end-to-end pipeline. It supports extracting clinical insights from raw documents and predicting medical categories using both TF-IDF + SVM and BERT-based models.
The system was designed and implemented as a structured Python project, with each stage separated into independent modules for scalability and maintainability.
Key Highlights
Built an end-to-end NLP pipeline for clinical text processing.
Implemented SVM (≈51% accuracy) and BERT (≈77% accuracy) models.
Integrated OCR for extracting text from medical documents.
Performed Named Entity Recognition (NER) on clinical data.
Designed modular architecture (src/) for clean code organization.
Exported outputs for visualization and dashboard integration.
A LARA python-django app for managing projects and experiments in lab automation systems and scientific laboratories.
Documents for the course Introduction to Biostatistics and Programming:
Spreadsheets
Biostatistics
Bioinformatics software
Biomedical and health databases
Programming languages
The R programming language
Personal explorations in ML and statistics for quant trading
Analysis of Kilter Board data, along with predictive models for V-grades based on holds and angle.
Analysis of Tension Board 2 data, along with predictive models for V-grades based on holds and angle.
Fundamental theory and practice in Data Science (DS).
🧮 data analysis AI ML DL machine lear... deep learning data science data-enginee... artificial i... data-science data preproc... Python C C++ NumPy pandas mathematics Algorithm algorithms Data Enginee... big data scipy scikit-learn xgboost lightgbm catboost TensorFlow keras PyTorch matplotlib seaborn plotly nltk opencv dask linear-algebra calculus probability statistics Discrete Mat... R
Exploring the world of data and ML through a personal project. Based on weather readings from various sources, using tools such as Pandas, Dask, Spark, Polars, ..., along with ML and DL. A visualization layer is provided via Power BI.
Data Science / Machine Learning Pipeline component for training and deploying ML models using CI
This repository holds code for the AskAnna Backend. Our backend stack primarily uses Django and the Django REST Framework.