Crash when trying to remove metadata from PDF files with pdftk-java
Hi.
I am going through my collection of PDF files with some tools, and I found at least one file that makes pdftk crash when I try to manipulate its metadata.
When I run the following command:
pdftk foo.pdf dump_data | sed -e 's/\(InfoValue:\)\s.*/\ /g' | pdftk foo.pdf update_info - output foo.clean.pdf
I get the following output:
pdftk Warning: data bookmark record not valid -- skipped; data:
BookmarkBegin
BookmarkTitle: HANDBOOK
BookmarkLevel: -1
BookmarkPageNumber: -1
pdftk Warning: unexpected case 1 in LoadDataFile(); continuing
Unhandled Java Exception in create_output():
java.lang.StringIndexOutOfBoundsException: begin 0, end 8, length 2
    at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3319)
    at java.base/java.lang.String.substring(String.java:1874)
    at com.gitlab.pdftk_java.data_import.LoadDataFile(data_import.java:182)
    at com.gitlab.pdftk_java.data_import.UpdateInfo(data_import.java:198)
    at com.gitlab.pdftk_java.TK_Session.create_output(TK_Session.java:3126)
    at com.gitlab.pdftk_java.pdftk.main(pdftk.java:177)
There was a problem with pdftk-java. Please report it at
https://gitlab.com/pdftk-java/pdftk/issues
including the message above, the version of pdftk-java (3.0.2), and if possible steps to reproduce the error.
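In case it helps to narrow this down, below is a minimal sketch for inspecting the filtered data before it reaches update_info. The exception reports a substring(0, 8) call on a string of length 2, so the sketch lists every line of the data that is shorter than 8 characters. It rests on the assumption that the offending string is a whole line of the data file, which has not been checked against data_import.java. The names short_lines and foo.info are made up for illustration and are not part of pdftk.

```shell
#!/bin/sh
# short_lines: hypothetical helper (not part of pdftk) that prints every
# input line shorter than 8 characters, prefixed with its line number.
# Assumption: the "length 2" string in the exception is one line of the
# data file; this has not been verified against data_import.java.
short_lines() {
    # Brackets around the line make whitespace-only lines visible.
    awk 'length($0) < 8 { printf "%d: [%s]\n", NR, $0 }'
}

# Self-contained demonstration on made-up data (no pdftk needed):
printf 'InfoBegin\nInfoKey: Title\n \nab\nBookmarkBegin\n' | short_lines

# Intended use on the real file (foo.info is a placeholder name):
#   pdftk foo.pdf dump_data | sed -e 's/\(InfoValue:\)\s.*/\ /g' > foo.info
#   short_lines < foo.info
#   pdftk foo.pdf update_info foo.info output foo.clean.pdf
```

Any line that it prints would be a candidate for the record that LoadDataFile() fails on. Splitting the pipeline into separate steps also leaves foo.info on disk, so the exact input that triggers the crash can be kept for later inspection.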
Unfortunately, the file in question is copyrighted, so I cannot attach it here, but I can send it privately.
Thanks in advance,
Rogério Brito.