Index everything…

I’ve just had a revelation while searching through my expenses.

It seems that Copernicus Desktop Search not only indexes Word, PDF, Excel and similar documents, but will also find the OCR’d text stored in Microsoft Document Imaging MDI and TIF files.

Now, it’s well known that you can run a Find within Microsoft Document Imaging to search for OCR’d text, and that you can also use Windows search to find the same text in a folder or its subfolders.

MDI screenshot

However, the fact that third-party indexing software like Copernicus supports it is pretty powerful. Any document you scan and save as a compressed image (as I do for all documents over 1 year old) can also be searched instantly, which makes finding a particular amount on an old receipt or statement 5 seconds’ work. No more Sundays spent on the living room floor surrounded by boxes of receipts and folders.

As an aside: a few years ago I ran a personal project to find the most efficient, reliable way to store my old documents. I found that while Adobe Acrobat 6 Full Image PDF provides better lossless compression of document images, Microsoft MDI wasn’t far behind. And while Acrobat 6 Full Version performed better OCR on well-structured typewritten text, Microsoft Imaging OCR was more effective on random snippets of text (e.g. scanned receipts).
