File attachments are not anonymized
Many file types carry detailed metadata that can be used to de-anonymize users. Common formats for this metadata include EXIF data in JPEG files and XMP data in PDF, PNG, and many other file types.
EXIF is a fairly simple metadata block (in JPEG files it is stored in an APP1 segment near the start of the file). Its format is described here: https://www.media.mit.edu/pia/Research/deepview/exif.html
XMP is a gross embeddable XML format designed by Adobe: https://en.wikipedia.org/wiki/Extensible_Metadata_Platform
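To make the exposure concrete, here is a minimal Go sketch that checks whether a JPEG carries EXIF or XMP data. It assumes the usual JPEG layout (an SOI marker followed by length-prefixed marker segments) and only looks at APP1 segments before the image data. The function name findMetadata is hypothetical and not part of Tokumei.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
)

// Identifiers that prefix the payload of a JPEG APP1 segment.
var (
	exifID = []byte("Exif\x00\x00")
	xmpID  = []byte("http://ns.adobe.com/xap/1.0/\x00")
)

// findMetadata (hypothetical helper) walks the marker segments of a JPEG
// and reports whether an EXIF or XMP APP1 segment appears before the
// compressed image data.
func findMetadata(jpeg []byte) (hasExif, hasXMP bool, err error) {
	// Every JPEG starts with the SOI marker 0xFFD8.
	if len(jpeg) < 2 || jpeg[0] != 0xFF || jpeg[1] != 0xD8 {
		return false, false, errors.New("not a JPEG: missing SOI marker")
	}
	pos := 2
	for pos+4 <= len(jpeg) {
		if jpeg[pos] != 0xFF {
			return hasExif, hasXMP, errors.New("malformed JPEG: expected a marker")
		}
		marker := jpeg[pos+1]
		if marker == 0xDA { // SOS: compressed image data follows, stop here
			break
		}
		// The 2-byte big-endian length counts itself plus the payload.
		segLen := int(binary.BigEndian.Uint16(jpeg[pos+2 : pos+4]))
		end := pos + 2 + segLen
		if segLen < 2 || end > len(jpeg) {
			return hasExif, hasXMP, errors.New("malformed JPEG: bad segment length")
		}
		if marker == 0xE1 { // APP1 carries both EXIF and XMP
			payload := jpeg[pos+4 : end]
			if bytes.HasPrefix(payload, exifID) {
				hasExif = true
			}
			if bytes.HasPrefix(payload, xmpID) {
				hasXMP = true
			}
		}
		pos = end
	}
	return hasExif, hasXMP, nil
}
```

A check like this could run on upload to flag attachments that still carry metadata. It says nothing about PDF or PNG, which embed XMP differently.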
Implications for v1.x
- Tokumei can probably leverage ImageMagick to strip EXIF data (its -strip option removes embedded profiles and comments)
- XMP data is more difficult to tackle and I haven't assessed any options for dealing with it
Implications for v2.x
- Go does not have any good libraries for manipulating EXIF data
- we may need to write a quick library that allows reading/writing arbitrary EXIF fields
- it might be possible to just read the size of the EXIF header and "zero" it out by writing null bytes
- it might be safe to remove the EXIF header entirely. I'm not sure whether image renderers ever look for this data as a prerequisite to rendering. This approach only requires reading the size of the header and removing that many bytes from the file, and splicing and concatenating is easy
- XMP data is more difficult to tackle and I haven't assessed any options for dealing with it
Edited by Keefer Rourke