value too long for type character varying - Metadata not saved at all if one value errors
When uploading a document I noticed the following exception:
postgresql_1 | 2020-01-06 15:56:17.623 UTC [386] ERROR: value too long for type character varying(255)
postgresql_1 | 2020-01-06 15:56:17.623 UTC [386] STATEMENT: INSERT INTO "file_metadata_filemetadataentry" ("document_version_driver_entry_id", "key", "value") VALUES (5, 'ManifestReferenceFilePath', '[''/Users/bfanney/Edits/rh-enterprise-open-source-ebook-f16984bf-201904/resources/RH_BRAND_005964_01_SRC_EnterpriseOpenSourcePDF_tn_Folder/Links/JIM.tif'', ''/Users/bfanney/Edits/rh-enterprise-open-source-ebook-f16984bf-201904/resources/RH_BRAND_005964_01_SRC_EnterpriseOpenSourcePDF_tn_Folder/Links/Jim_Whitehurst_Headshot_high_res.jpg'']') RETURNING "file_metadata_filemetadataentry"."id"
app_1 | [2020-01-06 15:56:17,643: ERROR/ForkPoolWorker-4] Task mayan.apps.file_metadata.tasks.task_process_document_version[77f99f56-5323-4d84-b27f-d8601e04e551] raised unexpected: DataError('value too long for type character varying(255)\n')
Clearly one item from the exif driver is too long to fit into the database. However, no items at all appear in the file metadata section for the document, and manual file metadata re-generation produces the same error, as expected. Worse, it's not apparent anywhere in Mayan that this part of the process failed, as there's no "file metadata processing errors" section like there is for Parsing and OCR.
This could result in a document missing from indexes if metadata is relied upon for an index template.
In an ideal world, Mayan would be able to insert the other items produced by the exif driver into the DB and discard only the error-producing one. If that's impossible because the whole task is atomic, then notifying the user somehow would help mitigate the issue, as the error isn't shown in the events section or anywhere else inside Mayan that I can see.
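To illustrate the behaviour I have in mind, here is a rough sketch. It is not Mayan's actual code: it uses the standard library's sqlite3 with a CHECK constraint as a stand-in for PostgreSQL's character varying(255) limit, and the table name, the function names and the sample keys are all made up for the example. The idea is just that each entry is saved in its own transaction, so one oversized value is reported instead of discarding everything.

```python
import sqlite3

# Mirrors the character varying(255) limit from the log above.
MAX_VALUE_LENGTH = 255


def make_demo_connection():
    """Create an in-memory table whose CHECK constraint imitates varchar(255)."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE file_metadata_entry ("
        "driver_entry_id INTEGER, key TEXT, "
        f"value TEXT CHECK (length(value) <= {MAX_VALUE_LENGTH}))"
    )
    return connection


def save_entries(connection, driver_entry_id, entries):
    """Insert each key/value pair on its own and return the pairs that failed."""
    failed = []
    for key, value in entries.items():
        try:
            # "with connection" commits on success and rolls back on error,
            # so a single bad row does not undo the rows saved before it.
            with connection:
                connection.execute(
                    "INSERT INTO file_metadata_entry "
                    "(driver_entry_id, key, value) VALUES (?, ?, ?)",
                    (driver_entry_id, key, str(value)),
                )
        except sqlite3.IntegrityError as exception:
            # Keep the failure so it can be shown to the user later.
            failed.append((key, str(exception)))
    return failed


if __name__ == "__main__":
    demo = make_demo_connection()
    problems = save_entries(
        demo, 5, {"MIMEType": "image/tiff", "ManifestReferenceFilePath": "x" * 300}
    )
    for key, message in problems:
        print(f"Could not save {key}: {message}")
```

The list returned by save_entries is what could feed a "file metadata processing errors" view or an event, similar to what already exists for Parsing and OCR.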