Feature request: clean up scanned documents
I have tons of badly scanned articles and books in PDF form. Some are OCRed, many are not (and they are in different languages too). Is there a way to clean these up, like unpaper does on the command line?
Sometimes I even have two pages of a book scanned as one 'page', so splitting those up would be great too.
(Personally, I am not interested in keeping the original file intact, only the content.)
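
To make the two-pages-in-one-scan case concrete, here is a rough sketch of the kind of split I mean. It only computes the crop rectangles for the left and right halves of a scanned image. The function name `split_boxes` and the `gutter` parameter are made up for illustration, and it assumes the book's spine sits exactly in the middle of the scan, which is often not true for real scans.

```python
def split_boxes(width, height, gutter=0):
    """Return crop boxes (left, upper, right, lower) for the left and right page.

    Assumption: the spine is at the horizontal centre of the scan.
    `gutter` is a hypothetical strip width (in pixels) around the spine
    that is dropped from both halves.
    """
    mid = width // 2
    half_gutter = gutter // 2
    left_page = (0, 0, mid - half_gutter, height)
    right_page = (mid + half_gutter, 0, width, height)
    return left_page, right_page


# Example: a 3000x2000 pixel scan with a 100 pixel strip removed at the spine.
left, right = split_boxes(3000, 2000, gutter=100)
print(left)   # (0, 0, 1450, 2000)
print(right)  # (1550, 0, 3000, 2000)
```

The boxes use the (left, upper, right, lower) layout that most image libraries accept for cropping, so each one could be handed to a crop call to produce a separate page.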