I would like to vent a bit of my frustration with these wonderful things known as PDF printers. Back when my ignorance got the better of me (which is like 15 minutes ago), I thought these things were the end-all-be-all solution for creating PDF documents from your Word/Excel/HTML/etc. applications. I have since come to an abrupt halt on that, and here is why they aren’t as awesome as I thought they were.
Many years ago I ran across two very wonderful and simple tools: CutePDF Writer and BullZip PDF Printer. Both of these, along with many others (PrimoPDF, PDFill, etc.), are tools for creating PDFs from various other sources.
Getting back to why these aren’t the best, and why they have a special place, one word needs to be said: hyperlinks. Without getting into a whole explanation of how printing actually works on a computer, here is a simple flow, which honestly I never really thought about until I had to troubleshoot why one person’s hyperlinks weren’t being created properly in the PDF.
Word -> Print -> Spools into Virtual Queue -> Gets sent off to the PORT (either an IP address on a network, or your USB cable, or in the case of a PDF printer, the specialized port that was created to monitor for stuff being sent to it). Once the spooled file lands in the PDF printer’s queue, the printer reads the text and images in it and generates a PDF file from them. That is what lets you save anything that can be printed into a PDF.
However, here is the flaw. The spooled file sent to the virtual printer is nothing more than a description of what goes on the page: text, fonts, positions, and images. It does not contain the metadata for the hyperlinks. Granted, bold still comes out bold and italic still comes out italic, because that is part of how the text is drawn, but for hyperlinks, all it gets is what you see on your screen.
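To make that a bit more concrete, here is a rough sketch of the idea in Python. None of this is how Word or any real print driver stores things; the `Run` and `DrawText` classes, the font names, and `to_print_job` are all made up for illustration. The point is only that the print job has no slot for the link target, so it gets dropped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """A piece of text as the word processor might store it (hypothetical model)."""
    text: str
    bold: bool = False
    underline: bool = False
    color: str = "black"
    link_target: Optional[str] = None  # where the hyperlink actually goes


@dataclass
class DrawText:
    """What a print job carries: how the text looks, not what it does."""
    text: str
    font: str
    underline: bool
    color: str


def to_print_job(runs: List[Run]) -> List[DrawText]:
    """Flatten document runs into draw commands, roughly as printing would."""
    job = []
    for run in runs:
        # Bold survives because it is part of the font used to draw the text.
        font = "Body-Bold" if run.bold else "Body-Regular"
        # link_target has no field to land in, so it is silently lost here.
        job.append(DrawText(run.text, font, run.underline, run.color))
    return job


doc = [
    Run("I like to search on "),
    Run("Google", underline=True, color="blue",
        link_target="http://www.google.com"),
    Run("."),
]
job = to_print_job(doc)
```

After the conversion, `job[1]` still says "Google", still blue, still underlined, but there is nothing left that says where it points.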
EXAMPLE: Let’s compare two sentences.
Sentence 1: I like to search on Google.
Sentence 2: I like to search on http://www.google.com.
Now as silly as Sentence 2 is, that is the only way to make your hyperlinks work properly when printing to a PDF printer. It will read the words, and if one starts with http:// or one of those protocols, it will know to make a hyperlink out of the text that follows. So what happens if you have some ridiculously long hyperlink and it wraps across two or more lines in your document? Answer: You’re screwed. Not only are you screwed for that, but as far as the printer is concerned, the word “Google” in Sentence 1 is just blue and underlined; there is no link. In addition to this, it seems that if there is a left or right parenthesis anywhere in your hyperlink, it will break all words beyond the hyperlink, even if it’s not word wrapped.
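Here is a small sketch of the kind of guessing I believe is going on. This is my assumption about the heuristic, not code from any actual PDF printer: a link starts at a known protocol and runs until whitespace or a parenthesis, one printed line at a time. The `example.com` addresses are placeholders.

```python
import re

# Assumed heuristic: a link starts at a known scheme and stops at the first
# whitespace or parenthesis. Real converters may use different rules.
URL_PATTERN = re.compile(r"(?:https?|ftp)://[^\s()]+")


def detect_links(lines):
    """Return the URLs a naive scanner would find, line by printed line."""
    found = []
    for line in lines:
        for match in URL_PATTERN.finditer(line):
            # Drop sentence punctuation that got glued onto the end.
            found.append(match.group(0).rstrip(".,;"))
    return found


sentence_1 = ["I like to search on Google."]
sentence_2 = ["I like to search on http://www.google.com."]
wrapped = [
    "See http://www.example.com/a/very/long/",
    "path/that/wraps.html for details.",
]
with_parens = ["Read http://www.example.com/page_(draft).html today."]

links_1 = detect_links(sentence_1)
links_2 = detect_links(sentence_2)
links_wrapped = detect_links(wrapped)
links_parens = detect_links(with_parens)
```

Sentence 1 yields nothing, Sentence 2 yields the Google address, the wrapped link stops at the end of its first line, and the one with parentheses gets cut off at the first parenthesis. That lines up with what I was seeing, at least under this guess at the rules.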
*deep breath*
Wow Etch, that was way too long and my eyes bleed. So PDF printers don’t work for hyperlinks…what do I do?!
Well, I’m glad I asked. There are a few different possibilities here. For those of you who currently have Microsoft Office 2007, you’re in luck! You can actually go to File -> Save As… and pick PDF! (Depending on your install, you may first need Microsoft’s free “Save as PDF or XPS” add-in, which was folded into Office 2007 with Service Pack 2.) This works perfectly, with no worries about breaking hyperlinks. Now for the rest of the world that doesn’t have this new and fancy version of Microsoft Office, you’re in quite a bind. Well…for the most part. OpenOffice works wonders, as it has had built-in PDF export for at least as long as I can remember. I haven’t used it in a while, but I’m fairly positive this works with hyperlinks natively (although I could be wrong?).
If you are using Office 2003 or something else, you are going to have to invest in a piece of PDF software. However, don’t fall into the trap of just printing to the PDF printer that these editors provide. You’re just going to chase your tail on that one. Primarily I use Adobe Acrobat for this. It installs a plug-in in Word that lets you “Send To” PDF, or at least adds an icon somewhere in your toolbar. What this does is send the Word document into the PDF editor directly, instead of printing it. As stated above, printing loses some of the data, but sending the Word document directly into the editor is essentially the same as “Creating New” from within those editors, which usually starts from an existing file or a blank sheet.
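If you want to double check whether the links actually made it into the finished PDF without clicking every one of them, here is a quick and dirty sketch. In a PDF, a web link is stored as an action with a `/URI` entry holding the address, so this just scans the raw bytes for those. It is a rough check under assumptions, not a real PDF parser: it will miss links tucked inside compressed object streams or written as hex strings, and the function name and the file name in the comment are made up.

```python
import re

# Matches the "/URI (http://...)" entry of a link action.
# The preceding "/S /URI" action type does not match, since no "(" follows it.
URI_ENTRY = re.compile(rb"/URI\s*\(([^)]*)\)")


def find_link_targets(pdf_bytes):
    """Return link addresses found in the uncompressed parts of a PDF."""
    return [m.group(1).decode("latin-1") for m in URI_ENTRY.finditer(pdf_bytes)]


# Tiny hand-written fragments standing in for real files.
exported = (b"<< /Type /Annot /Subtype /Link "
            b"/A << /S /URI /URI (http://www.google.com) >> >>")
printed = b"BT /F1 12 Tf (Google) Tj ET"

exported_links = find_link_targets(exported)
printed_links = find_link_targets(printed)

# With a real file (hypothetical name):
# with open("report.pdf", "rb") as f:
#     print(find_link_targets(f.read()))
```

An empty list from a PDF that is supposed to have links is a decent hint that it went through a printer instead of a proper export, though given the caveats above it is only a hint.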
Hopefully this will help anyone who has run into this problem.
And I love how I ended up explaining printing anyway…
Initial Research Sources:
1. Archived Technote from Adobe: http://kb2.adobe.com/cps/330/330729.html
2. Hyperlinks Preserved in Word 2007: http://askville.amazon.com/Hyperlinks-word-doc-lost-file-converted-pdf-hyperlinks-work/AnswerViewer.do?requestId=10206775