Repair Scanned Documents With gscan2pdf

I had the pleasure last week of tracking down an article locked behind a digital paywall. It arrived through inter-library loan in the form of a book, with all of that year’s issues of the journal bound together. I felt a little disappointed, as it meant I’d be left with lower-quality scans. (You know what I’m talking about if you’ve ever placed a book on a copy machine.) I turned to the internet for a solution and discovered the tool gscan2pdf.

There are two ways to begin using gscan2pdf: 1) scan your document using the application or 2) open an existing document, for instance a multi-page PDF produced by your department’s copy machine. The built-in tools allow you to reorder, crop and rotate pages and perform a few other adjustments. The “Clean Up” tool offers a GUI panel for unpaper, a post-processor for fixing bad scans. Running unpaper after basic editing worked very well, correcting subtle alignment and border issues with the scans. I didn’t try them, but gscan2pdf can also make use of three optical character recognition packages if they are installed on your machine.

Gscan2pdf is handy for repairing scanned documents. It is open source with Debian packages available (only an apt-get away if you run Ubuntu). Unfortunately, I couldn’t find any OS X or Windows builds. I did come across an application called Scan Tailor which works on Windows and GNU/Linux and appears to offer similar functionality to gscan2pdf. I am not aware of any OS X apps; if you know of any please offer suggestions in the comments below.
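Because unpaper is a standalone command-line program, the clean-up step can also be scripted outside the GUI when there are many pages to fix. The following is only a minimal sketch of that idea, not how gscan2pdf itself calls unpaper. The helper names `build_unpaper_command` and `clean_scans` are hypothetical, the pages are assumed to already exist as PGM images in one folder, and the `--layout` and `--overwrite` options should be checked against the unpaper version you have installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_unpaper_command(src, dst, layout="single"):
    """Return the argument list for one unpaper run (hypothetical helper).

    Assumes unpaper's usual "unpaper [options] input output" form.
    """
    return ["unpaper", "--layout", layout, "--overwrite", str(src), str(dst)]


def clean_scans(scan_dir, out_dir):
    """Run unpaper on every .pgm page in scan_dir, writing into out_dir."""
    # Fail early with a clear message if unpaper is not on the PATH.
    if shutil.which("unpaper") is None:
        raise RuntimeError("unpaper is not installed or not on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Process pages in name order so the output mirrors the input.
    for page in sorted(Path(scan_dir).glob("*.pgm")):
        command = build_unpaper_command(page, out_dir / page.name)
        subprocess.run(command, check=True)
```

Calling `clean_scans("scans", "cleaned")` would then write a cleaned copy of each page into a `cleaned` folder, leaving the original scans untouched.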

After going through the hassle of requesting the articles, waiting for their retrieval and scanning them… you might as well spend another five minutes fixing up the results with one of these applications. Learning gscan2pdf takes only a few minutes, and the cleaned-up pages are noticeably easier to read.
