Projects with this topic
Solving 4chan captcha
Updated -
Advanced enterprise Free Open Source DMS (document management system).
Updated -
DocuMind is an automatic document organization system for the Linux desktop, powered by local AI (Ollama/Llama3 or HuggingFace). It processes PDFs, images, video, audio, and code: it extracts text/OCR, transcribes, analyzes content, and classifies/archives according to ISO 15489 (invoices, legal, work, personal, multimedia). It detects duplicates, records an audit log in SQLite, and prioritizes offline privacy.
Developed in Python 3.10+ with PyMuPDF, Tesseract, Vosk/Whisper, multiprocessing, and optimizations (xxHash, caching, GPU), it demonstrates expertise in integrating local/multimodal LLMs, parallel processing, scalable modular architecture, and evolution toward a PyQt6 GUI with drag-and-drop, full-text search, and RPM/Flatpak packaging.
Updated -
Event-driven system built on Kafka that transforms unstructured documents into complete software specifications. It extracts text with OCR, performs NER with transformers, classifies sentences, and generates an SRS in multiple formats.
Updated -
(Design WIP) External tool adding a transcription (OCR) workflow to the EmuHawk (BizHawk) emulator, allowing retro games to be translated partially or fully automatically
Updated -
Jochre OCR training corpus for Yiddish in Alto4 format
Updated -
Processing of Italian newspaper articles in C++ (via RapidOCROnnx) as part of a research thesis in history. Categorization to come.
Updated -
Jochre3 OCR engine with default implementation for Yiddish - completely new version of https://github.com/urieli/jochre
Updated -
Process UrT gameplay to gather distance stats for Game Life Balance: https://game-life-balance.com
Updated -
A libre smart powered comic book reader for Android.
❗ Note: This is a mirror. Check GitHub repository.
Updated -
Document Management Platform (DMP) for preserving the musical heritage of "El Sistema", using:
Papra DMP: metadata management. Audiveris OMR: OMR for sheet music.
Updated -
This project focuses on developing a prototype application for extracting headlines and content from digitized newspaper images stored in the SIDAK (Sistem Informasi Database Koleksi) system of the Monumen Pers Nasional, utilizing computer vision and deep learning techniques.
The prototype aims to overcome the limitations of standard OCR tools by integrating YOLOv8 object detection to precisely identify and separate newspaper headlines and article content before text extraction.
Updated -
Graphical browser-based Alto4 editor, for the construction of OCR training corpora.
Updated -
Project9: Character Recognition (Input an image, Print single digits one-by-one)
Updated -
Android app to extract text from images using tesseract OCR
Updated -
(Design WIP) A shorthand script/alphabet designed for touchscreen + stylus with "omniscient" OCR. Inspired by Palm OS' Graffiti script and some niche DS titles.
Updated -
MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.
Updated -
MiniAiLive's Complete Document Liveness Detection Solution for Digital Onboarding
Updated