
May 28, 2004

A Sample Of My Less-Paper Office

As promised, I've uploaded a sample PDF from my scanner. This was scanned in on my (Visioneer) Xerox Documate 510 scanner using the included PaperPort software. The page was scanned in at 300dpi and saved as a PDF.

Here are the documents:

PDF file (requires Acrobat Reader)
text file

I used the OCR functionality that is part of Adobe Acrobat 6 Standard to convert the PDF to text. If you look at the text file, you'll see that most of the text was converted correctly. Most of what wasn't converted was lost to the fold in the original document. The OCR isn't perfect, but it usually gets the majority of the text. It has more trouble distinguishing text when the background color isn't white.

Open the PDF and use the Select Text tool to select some text. See how you can just copy and paste text from your documents?

The coolest thing is the search. Open the PDF and click the Search button (the binoculars). Search for "your" and scroll through the results to see how it works. I have scanned in all my old credit card statements. If I can't remember when I bought my router, I just search for "Best Buy" and I can immediately find all the purchases I made there. Adobe also lets you search a whole directory of documents.
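If you also export text files like the one above, the same directory-wide search idea can be sketched in a few lines of Python. This is only an illustration and not how Acrobat's search works; the function name `search_text_files` and the `statements` folder are made up for the example.

```python
from pathlib import Path


def search_text_files(folder, phrase):
    """Return (file name, line number, line) for each line containing phrase.

    The match is case-insensitive, so "Best Buy" also finds "BEST BUY".
    """
    needle = phrase.lower()
    hits = []
    # Walk the folder and its subfolders, looking only at .txt files.
    for path in sorted(Path(folder).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    # "statements" is a hypothetical folder of OCR'd text exports.
    for name, number, line in search_text_files("statements", "Best Buy"):
        print(f"{name}:{number}: {line}")
```

Like any search over OCR output, it only finds what the OCR recognized, so text lost to a fold or a colored background won't turn up.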

When are YOU going to start having less paper?

Related links:
My review of the Xerox Documate 510 Scanner

Comments:

Anonymous's comment on 3:50 PM

Hi -

How good is the driver for the Documate 510 at remembering settings? For instance, I scan into PaperPort using a Brother Multi-function or an HP multifunction and on each scan I have to explicitly pick 8.5x11 and other scanning parameters - kind of a nuisance, haven't figured out how to save my "favorite" scan configuration. Any thoughts? I appreciated your review of the 510 - email me at efendler@yahoo.com if that's ok, if not I'll check your blog later on. Thanks!

Reece's comment on 9:50 PM

Hi,

The software remembers the last settings that were entered. I don't think there is a way to save a specific configuration (other than the last settings that you set).

-Reece

Anonymous's comment on 8:00 AM

I am thinking of buying this Xerox scanner. One of the reviewers on Amazon.com said this scanner pulls to the left a little. Have you run into this problem?

I have an HP scanner. It has problems when scanning and sometimes misses parts of pages. I think it is more of a software problem. Have you run into this on this scanner?

Thanks for your help.

Reece's comment on 8:16 AM

I haven't noticed the pulling to the left with 8 1/2 x 11 pages. But you need to be careful with small pieces of paper (such as photos) to make sure they are lined up straight - but this is true for most ADF scanners.

I'm just scanning in bank statements and such so having it 100% straight isn't so important for me. I believe you can fix this in Adobe Acrobat anyway.

Overall it does a decent job of keeping the paper straight.

Anonymous's comment on 6:04 PM

Hi,
I have a similar setup. Is there a way to make Adobe automatically OCR the documents as it scans them, or do you have to manually run the OCR every time?

Thanks

Matt.

Reece's comment on 10:16 PM

I haven't figured out a way to do automatic OCR. I'm sure there must be a command-line option to Acrobat that does this.

This link http://www.pdfforlawyers.com/2004/04/ocr_tutorial_fo.html mentions how to convert multiple files at a time.

jason's comment on 9:14 PM

I have a couple of questions for you. First and most importantly, what software comes with this scanner that allows you to scan multi-page documents that are double-sided and then rearranges them for you? I already have a scanner and need to be able to do this.

Second, how do you make a scanned Adobe file searchable without using the OCR? The example you posted is searchable and it is a scanned image. I did several searches. One of the searches I tried was for "First". This word appears in the header of your document, but Adobe did not find it.

Reece's comment on 9:56 PM

The PaperPort 9 software that is bundled with the Documate scanner does the double-sided magic. I don't know which scanners it supports.

To make a searchable PDF, don't use the scanning software's OCR functionality. Have PaperPort (or whatever scanning software you use) save to a PDF file. Then open the PDF in Adobe Acrobat and use Acrobat's OCR functionality. It keeps the original image and stores the text.
