Scanning Those Public Domain Magazines With FreeOCR

I’ve been searching for a good, free OCR program that has the best features for scanning books and magazines but is still really easy to use. I believe I’ve found the perfect choice for you. The program is called FreeOCR and can be downloaded here: V2.3 Free OCR Software

I installed the program and proceeded to put it through its paces. It doesn’t have a lot of bells and whistles (which is a GOOD thing), so the scanning and OCR process is pretty straightforward. There are a few tricks you should know about though, which is why I’m writing this. To test the software, I decided to scan and convert a multi-page article from a Public Domain magazine that I recently received. The article was laid out in three columns, so I was curious to see if the software could keep the columns separate from a single scan. It couldn’t. All the lines ran together across the columns. But honestly, this isn’t a big deal. All I had to do was scan each column individually, and it worked perfectly!

The next issue I encountered was exactly how to handle multi-page scans. There wasn’t a prompt asking for the next scan. What I discovered, however, was that this isn’t a problem either. Here’s what I did: I did a preview scan of the first page of the magazine article and then adjusted the scan area to include only the first column of the article. Then I scanned that column into FreeOCR (use at least 200 dpi; I used 300 dpi). When the scan was complete, it appeared in the program’s left panel. I clicked on the OCR button in the program and the text was processed and shown in the right panel.
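If you’re wondering what the dpi setting actually buys you, here’s a little back-of-the-envelope sketch in Python. It just multiplies the scan area in inches by the dpi to get the image size in pixels. The 2.5 x 10 inch column is my own made-up example size, not a measurement from the magazine, and the function name is just something I picked for illustration.

```python
def scan_size(width_in, height_in, dpi):
    """Return the (width, height) in pixels of a scan area measured in inches."""
    return round(width_in * dpi), round(height_in * dpi)


# Hypothetical column size: 2.5 inches wide by 10 inches tall.
low = scan_size(2.5, 10, 200)    # (500, 2000)
high = scan_size(2.5, 10, 300)   # (750, 3000)

# Going from 200 to 300 dpi gives 2.25 times as many pixels for the same column.
ratio = (high[0] * high[1]) / (low[0] * low[1])
print(low, high, ratio)
```

More pixels per letter generally gives the OCR more to work with, at the cost of a slower scan and a bigger image.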

Next, I clicked the “scan” button again and re-adjusted the scan area to include just the 2nd column. When the scan was complete, the 2nd column appeared in the left panel as before. When I clicked on the OCR button for THIS scan, it processed the text and added it to the end of the previous OCR text. Very cool. I repeated the process until I had scanned all the columns from the magazine article. Once the last column of text was processed, I copied all the text from the right panel (there is a button to copy the text to the clipboard) and pasted it into Microsoft Word for final formatting. You can also click on the Microsoft Word button within the program, and if you have Word installed, it will automatically launch Word and open a file with the scanned text. All that remained was to format the text.
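For the curious, the column-by-column routine above boils down to two small ideas: carve the page into one scan area per column, then stick the OCR text together in reading order. Here’s a rough Python sketch of just that logic. To be clear, this is NOT how FreeOCR works inside and it doesn’t talk to FreeOCR or your scanner at all. The function names, the equal-width columns, the gutter value, and the letter-size page at 300 dpi are all my own assumptions for illustration.

```python
def column_boxes(page_width, page_height, columns=3, gutter=0):
    """Split a page into equal-width column boxes, left to right.

    Each box is (left, top, right, bottom) in pixels. Assumes the columns
    are evenly spaced, which a real magazine page may not be.
    """
    usable = page_width - gutter * (columns - 1)
    col_width = usable / columns
    boxes = []
    for i in range(columns):
        left = i * (col_width + gutter)
        boxes.append((round(left), 0, round(left + col_width), page_height))
    return boxes


def join_columns(texts):
    """Append each column's text to the end of the previous one, skipping blanks."""
    return "\n\n".join(t.strip() for t in texts if t.strip())


# Hypothetical page: 8.5 x 11 inches scanned at 300 dpi = 2550 x 3300 pixels.
boxes = column_boxes(2550, 3300, columns=3)

# Stand-in strings for the text you would get from OCR on each column.
article = join_columns(["First column text. ", "Second column text.", "  Third column text."])
print(boxes)
print(article)
```

In practice you’d still drag the scan area by eye in the preview, since real columns rarely line up this neatly, but the order of operations is the same: one column at a time, left to right, with each result added to the end of the last.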

What impressed me was the accuracy of the OCR. The article I chose wasn’t just a simple, basic article (you know me better than that). I picked one that included French and Spanish in addition to English. The OCR correctly rendered every word in the article except for three, and it even included all the quotation marks and all the diacritic marks for the foreign words. Overall, I was pleased with the software and have concluded that, if you do not currently have scanning software installed (such as OmniPage, Acrobat Pro or Microsoft Document Imaging), FreeOCR is a perfect choice for scanning all those Public Domain magazines and books that you’ve been buying off of eBay!

One Response to “Scanning Those Public Domain Magazines With FreeOCR”

  • Vicki:

    Thanks for the tip! I just purchased a public domain book on and was scratching my head wondering how to go about doing this. I read somewhere that Microsoft Word already has voice to text built in, so I was also hoping I would be able to just read the book aloud and have it create the text document for me in MS Word, but also record my reading at the same time somehow? What works in conjunction with this voice to text feature in Word, so I can create a word doc and an mp3 at the same time? (I really really really don’t want to have to read aloud this 400+ page book *twice* if I don’t have to!). Just color me “newbie”, and thanks for all the great tips on your blog!

