Commit f6a2a17b authored by ScottWNesbitt

Deleted unused post

parent a4e651a9
title: Taking a look at pdftk
layout: post
published: false
<p align="center"><img src="{{ site.baseurl }}img/adobe-27964_320.png" alt="PDF logo" /></p>
I don't know how many ways you can create PDF files in Linux. Most applications let you save documents directly to PDF, and you can convert files to PDF quite easily. But manipulating those PDFs is a bit trickier. Various applications let you fiddle with PDFs in one or two ways. But if you're a command line junkie, an app called pdftk (PDF Toolkit) is practically an all-in-one solution. It's the closest thing to Adobe Acrobat that I've found for Linux.
pdftk's developer describes it as the PDF equivalent of an *electronic staple remover, hole punch, binder, secret decoder ring, and X-ray glasses.* That's pretty close to the truth. Pdftk can:
* Join and split PDFs
* Pull single pages from a file
* Encrypt and decrypt PDF files
* Add, update, and export a PDF's metadata
* Export bookmarks to a text file
* Add or remove attachments
* Fill out PDF forms
You can download pdftk either as source code or as packages for various Unix-like systems, for example Debian, RPM-based distributions, FreeBSD, or Gentoo.
As I mentioned earlier, pdftk is a command line tool. Its options can be complicated, especially for complex operations. You'll be doing quite a bit of typing, but that shouldn't put you off using pdftk. When I started working with pdftk, I found myself using only a few of its functions: joining and splitting PDF files, adding metadata, and password-protecting files.
## Combining PDF files
pdftk can combine two or more PDF files. To do that, open a terminal window and change to the directory containing the PDF files that you want to combine. Then, type the following command:
**pdftk file1.pdf file2.pdf cat output newFile.pdf**
**cat** is short for _concatenate_ (join together, in plain English), and **output** tells pdftk to write the combined PDFs to a new file; in this case, newFile.pdf.
Pdftk doesn't retain the bookmarks that might have been in any of the files you're combining, but it does keep hyperlinks, both to destinations within the PDF and to external files or Web sites. Where some other applications point hyperlinks to the wrong destinations, the links in the PDFs I combined with pdftk all reached their targets.
## Splitting files
Splitting PDF files with pdftk can be ... interesting. The **burst** option breaks a PDF into multiple files. How many? How about one file for each page. To use it, type:
**pdftk style_guide.pdf burst**
With larger documents you wind up with a lot of files with names corresponding to their page numbers, like pg_0001 and pg_0013. It's not very intuitive or useful, especially if you want only a few pages.
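As a rough illustration of that naming scheme: pdftk's default burst output pattern is `pg_%04d.pdf`, so the file that holds a given page can be predicted with the shell's `printf`. The `burst_name` helper below is a hypothetical convenience for this post, not part of pdftk.

```shell
#!/bin/sh
# burst_name PAGE
# Print the file name that "pdftk ... burst" gives to page PAGE under
# its default naming pattern, pg_%04d.pdf.
# Hypothetical helper for illustration; it does not call pdftk.
burst_name() {
    printf 'pg_%04d.pdf\n' "$1"
}

# List the files that would hold pages 1 and 13 after a burst.
for page in 1 13; do
    burst_name "$page"
done
# prints:
# pg_0001.pdf
# pg_0013.pdf
```

If you would rather choose the names yourself, burst also accepts an output pattern, for example `pdftk style_guide.pdf burst output page_%02d.pdf`.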
Of course, pdftk can also remove specific pages from a PDF file. For example, to remove pages 10 to 25 from a PDF file, type the following command:
**pdftk myDocument.pdf cat 1-9 26-end output removedPages.pdf**
The page ranges **1-9** and **26-end** tell pdftk to copy pages 1 through 9 and page 26 through the last page to the file removedPages.pdf, which leaves out pages 10 to 25.
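The ranges passed to **cat** are the pages to keep, so dropping a block of pages means listing what comes before it and what comes after it. Here is a minimal sketch of that arithmetic in plain shell. The `keep_ranges` function is a hypothetical helper, not a pdftk feature, and it assumes the last page you remove is not the final page of the document.

```shell
#!/bin/sh
# keep_ranges FIRST LAST
# Print the page ranges to hand to "pdftk ... cat" so that pages FIRST
# through LAST are left out of the result.
# Hypothetical helper; assumes LAST is before the document's final page.
keep_ranges() {
    first=$1
    last=$2
    if [ "$first" -gt 1 ]; then
        # Keep everything before the removed block.
        printf '1-%d ' "$((first - 1))"
    fi
    # Keep everything after the removed block, up to the last page.
    printf '%d-end\n' "$((last + 1))"
}

keep_ranges 10 25
# prints: 1-9 26-end
```

You could then write the command as `pdftk myDocument.pdf cat $(keep_ranges 10 25) output removedPages.pdf`.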
I've used this feature quite a bit -- mainly to trim pages from work samples that I have posted on my company's Web site, and to extract articles from back issues of a magazine to which I contribute. The resulting files are small, and the PDFs are clear and easy to read.
## Adding attachments to a PDF
To be honest, I used to miss Adobe Acrobat's ability to attach files to a PDF. When working with PDFs on Windows, I regularly used this feature to include addenda, surveys, or additional information with a published PDF. Until I found pdftk, I was forced to move my PDF documents to a computer running Windows whenever I needed to attach a file.
Why attach a file to a PDF instead of sending an archive? Mainly convenience. If you move a PDF from one computer to another, and don't move the archive along with it, you won't have access to the attachments. And instead of pulling a file from an archive to view it, you just double-click on the attachment's icon to open the file from your PDF viewer.
Using pdftk, you can easily attach binary and text files to a PDF. You can even specify what page of the PDF you want the attachment to appear on. Just type the following command:
**pdftk html_tidy.pdf attach_files command_ref.html to_page 24 output html_tidy_book.pdf**
Obviously, **attach_files** is the option to attach files. **to_page 24** tells pdftk to attach the file command_ref.html to page 24 of the resulting PDF.
I've attached Writer documents, tar.gz and zip archives, and text and HTML files to various PDF documents. Apart from a noticeable increase in the size of the PDF file, there were no nasty side effects.
How do you know a PDF contains an attachment? Look for the thumbtack icon in the PDF. This only works in Adobe's Acrobat Reader, though. Attachments don't appear in applications like Xpdf, Evince, KPDF, or gv.
## Adding metadata and passwords to a PDF
Pdftk has a number of options that you might use infrequently, but that are very useful when you need them. Two of them are _update\_info_ and _user\_pw_.
When you create a PDF, it might contain incomplete _metadata_ (information that describes the PDF), or none at all. Metadata can come in handy when you or your users need to organize or index a set of PDF files. Using pdftk and a text file, you can change or add metadata to the PDF by typing the following command:
**pdftk DocBook_Overview.pdf update_info data.txt output DocBookOverview.pdf**
In this case, the file data.txt contains an InfoKey and InfoValue pair, like this:
InfoKey: Keywords
InfoValue: DocBook,writing,documentation,background
You can change only the following metadata items with pdftk: title, author, subject, producer, and keywords.
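If you don't want to open an editor, one way to create the data.txt file described above is a shell here-document. This is only a sketch that uses the same key and value pair as the example; adjust the contents to suit your own PDF.

```shell
#!/bin/sh
# Write the metadata file that update_info reads.
# The quoted 'EOF' stops the shell from expanding anything in the text.
cat > data.txt <<'EOF'
InfoKey: Keywords
InfoValue: DocBook,writing,documentation,background
EOF

# Show the file so you can check it before running pdftk.
cat data.txt
```

With the file in place, the update_info command shown earlier can be run as it is.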
If you're working with PDFs that contain sensitive information, you may want to make sure that only certain people can view a PDF by applying a password to it with the _user\_pw_ option:
**pdftk sales_report.pdf output SalesReport.pdf user_pw PROMPT**
You will be prompted for a password of up to 32 characters. When someone tries to open the PDF, they will be asked to enter the password.
## Conclusion
pdftk is one of the most useful tools for manipulating PDF files that I've found for Linux. It's not the easiest software to work with, but you'll get the hang of it after a bit of practice.