feature idea: Use "textract" lib to extract text from documents

Hi, I found this lib: http://textract.readthedocs.org/en/latest/ Maybe it would be useful for Mayan.

It is mentioned in this talk: http://pyvideo.org/video/3526/cleaning-confused-collections-of-characters BTW, it is worth the time to watch :-)
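
To give an idea of what using it could look like, here is a rough sketch, not a proposal for the actual integration. `textract.process` is the library's entry point and returns bytes; the `extract_text` wrapper, its `extractor` parameter, and the UTF-8 decoding are only my assumptions for illustration.

```python
from typing import Callable, Optional


def extract_text(path: str, extractor: Optional[Callable[[str], bytes]] = None) -> str:
    """Return the text content of the document at `path` as a str.

    `extractor` is a hypothetical hook that lets the backend be swapped out;
    when omitted, textract is used.
    """
    if extractor is None:
        # Imported lazily so this module still loads when textract is not installed.
        import textract

        extractor = textract.process
    raw = extractor(path)  # bytes, assumed here to be UTF-8 encoded
    # Undecodable bytes are replaced instead of raising an error.
    return raw.decode("utf-8", errors="replace").strip()
```

With textract installed, something like `extract_text("some_document.pdf")` (a made-up file name) should return the document text, with the file type picked from the extension.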

br Matthias

Edited Feb 08, 2021 by Roberto Rosario