Index everything…

I’ve just had a revelation while searching through my expenses.

It seems that Copernic Desktop Search not only indexes Word, PDF, Excel and similar documents, but also finds the OCR’d text stored as metadata in Microsoft Document Imaging MDI and TIFF files.

Now, it’s well known that you can run a Find within Microsoft Document Imaging to search for OCR’d text, and that Windows search can find the same text across a folder and its subfolders.

MDI screenshot

However, the fact that third-party indexing software like Copernic supports it is pretty powerful. Every document you scan and save as a compressed image (as I do for all documents over 1 year old) can also be searched instantly, which makes finding a particular amount on an old receipt or statement 5 seconds’ work. No more Sundays spent on the living room floor surrounded by boxes of receipts and folders.
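For the curious, here is a rough idea of why an indexed search comes back instantly. This is only a toy sketch in Python, not how any particular desktop search product actually works: it assumes the OCR’d text has already been pulled out of each scanned file, and the file names, sample text and the `build_index` and `search` helpers are all made up for illustration.

```python
import re
from collections import defaultdict

# Words, plus amounts such as "42.17" kept together as a single token.
TOKEN = re.compile(r"[a-z0-9]+(?:\.[0-9]+)?")

def build_index(docs):
    """Map each lowercase token to the set of file names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in TOKEN.findall(text.lower()):
            index[token].add(name)
    return index

def search(index, query):
    """Return the sorted file names that contain every token in the query."""
    tokens = TOKEN.findall(query.lower())
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)

# Hypothetical OCR output, keyed by hypothetical file names.
ocr_text = {
    "receipt_2004_03.mdi": "Office Supplies Ltd total 42.17 paid by card",
    "statement_2003_11.tif": "Closing balance 1310.50 interest 4.02",
}
index = build_index(ocr_text)

print(search(index, "total 42.17"))
```

The slow part (reading every file) happens once, when the index is built. After that, a search is just a couple of dictionary lookups, however many scans are in the archive.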

As an aside: a few years ago I ran a personal project to test the most efficient, reliable way to store my old documents. I found that while Adobe Acrobat 6 Full Image PDF provides better lossless compression of document images, Microsoft MDI wasn’t far behind. And while Acrobat 6 Full Version performed better OCR on typewritten text with good structure, Microsoft Imaging OCR was more effective on random snippets of text (e.g. scanned receipts).
