OCR error with the Mandarin language option
Environment: built with Docker Compose. I added the following to docker-compose.yml:
MAYAN_APT_INSTALLS: "tesseract-ocr-chi-sim tesseract-ocr-chi-tra"
The following OCR error occurs when adding a document with "Mandarin language" selected as the language option:
Exception calling Tesseract with language option: cmn;
RAN: /usr/bin/tesseract - - -l cmn
STDOUT:
STDERR:
Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/cmn.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'cmn'
Tesseract couldn't load any languages!
Could not initialize tesseract.
The requested OCR language "cmn" is not available and needs to be installed.
I have double-checked that "chi_sim.traineddata" and "chi_tra.traineddata" are installed:
root@addc443e70b7:/# ls /usr/share/tesseract-ocr/4.00/tessdata/
chi_sim.traineddata configs osd.traineddata tessconfigs
chi_tra.traineddata eng.traineddata pdf.ttf
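
The listing above suggests a naming mismatch: the application passes the code "cmn" to Tesseract, while the installed Chinese data files are named chi_sim and chi_tra, so no cmn.traineddata exists. As a minimal diagnostic sketch (not Mayan's own code), the following Python checks which languages a tessdata directory provides and whether a requested code can be satisfied. The ALIASES table and both function names are hypothetical and exist only for illustration.

```python
from pathlib import Path

# Hypothetical alias table: requested code -> Tesseract data file names.
# This mapping is an assumption for illustration; it is not part of
# Mayan EDMS or Tesseract.
ALIASES = {"cmn": ["chi_sim", "chi_tra"]}


def installed_languages(tessdata_dir):
    """Return the language names that have a .traineddata file in tessdata_dir."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def resolve_language(code, installed, aliases=ALIASES):
    """Return the installed Tesseract language names usable for `code`.

    An exact match wins; otherwise fall back to any installed aliases.
    An empty list means nothing suitable is installed.
    """
    if code in installed:
        return [code]
    return [name for name in aliases.get(code, []) if name in installed]
```

Calling installed_languages("/usr/share/tesseract-ocr/4.00/tessdata") inside the container and passing the result to resolve_language("cmn", ...) would show whether only the aliased names are present. Tesseract accepts several languages joined with "+" (for example -l chi_sim+chi_tra), so the returned names could be combined that way when testing by hand.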