
Find and Fix PDF Files That Lack Searchable Text

Archiving paper documents as PDF files is a great way to save shelf space and preserve essential records.

However, simply scanning the documents is not enough. You should also process the scans with Optical Character Recognition (OCR). Once OCR has processed a PDF scan, the file contains an invisible text version in addition to the scanned image of the document. macOS Spotlight can then index the content, and you can use HoudahSpot to search your document archive.

But what if some of your PDF files lack OCR text?


Video Tour of HoudahSpot 6.0

Finding files on your Mac can be a difficult and sometimes frustrating task because of their sheer number. We all have so many!

HoudahSpot is an application that allows you to do refined searches to get just the files you are looking for and includes several useful presets right out of the box.

Todd Olthoff of ScreenCastsOnline revisits HoudahSpot and shows you all that is new in version 6.0.

Don McAllister of ScreenCastsOnline has covered HoudahSpot in the past, so this screencast from Todd focuses on the updates that version 6 brings to the search party.

Below is a short preview. Watch the full show for free with a 7-day trial ScreenCastsOnline subscription.

Customize Your HoudahSpot Search Setup

In HoudahSpot, you can choose between hundreds of criteria to search for files. HoudahSpot also lets you specify in which folders to search and how to sort results.

There are lots of options to choose from – and settings you don’t want to make over and over again.

Search criteria, results display, and sort order are a matter of personal preference and habits. You may, for example, find yourself frequently searching for files by file extension. You may prefer to search your full hard drive rather than just your home folder. You may want search results always to list file size.

Let’s see how you can set up HoudahSpot so that a new search window matches your preferred way of searching.


Quick Access to HoudahSpot Search Templates

In HoudahSpot, search templates serve as starting points for searches that you perform repeatedly.

HoudahSpot comes with a set of sample templates. These include, for example, a template for finding photos. This template is set up to search for image files having a resolution of at least 7 MP. Search results show image previews. The Refine pane is pre-configured for searches by camera make and model, ISO speed, and flash settings. To find photos you just need to fill in the blanks and start the search.
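As a rough illustration of the 7 MP threshold, the sketch below computes a megapixel count from pixel dimensions. It assumes that resolution means width times height divided by one million. The function names are hypothetical, and this is not a description of how HoudahSpot itself evaluates the criterion.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Return the pixel count in megapixels (assumed: 1 MP = 1,000,000 pixels)."""
    return (width_px * height_px) / 1_000_000


def meets_minimum_resolution(
    width_px: int, height_px: int, minimum_mp: float = 7.0
) -> bool:
    """Hypothetical helper: True when the image has at least `minimum_mp` megapixels."""
    return megapixels(width_px, height_px) >= minimum_mp
```

Under these assumptions, a 3264 × 2448 image (about 8 MP) passes the check, while a 3000 × 2000 image (6 MP) does not.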

Templates can also act as dynamic file lists. The Recent Files template, for example, lists files used or modified within the last 7 days.
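To make the idea of a dynamic file list concrete, here is a minimal sketch of a "modified within the last 7 days" filter. It only looks at the file system modification time and does not cover the "used" part of the template. The function names are hypothetical, and HoudahSpot relies on Spotlight rather than on code like this.

```python
from datetime import datetime, timedelta
from pathlib import Path


def is_recent(modified: datetime, now: datetime, days: int = 7) -> bool:
    """True when `modified` falls within the last `days` days before `now`."""
    return timedelta(0) <= now - modified <= timedelta(days=days)


def recently_modified(folder: Path, days: int = 7) -> list[Path]:
    """Hypothetical helper: list files under `folder` modified in the last `days` days."""
    now = datetime.now()
    return [
        path
        for path in folder.rglob("*")
        if path.is_file()
        and is_recent(datetime.fromtimestamp(path.stat().st_mtime), now, days)
    ]
```

Calling `recently_modified(Path.home() / "Documents")` would return a list that changes from day to day, which is what makes such a template behave like a live file list.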

You will certainly create your own templates or customize the sample templates to fit your needs. You may, for example, want to update the My Photos template to search only for files matching your camera make and model.

As templates become a central part of your workflow, you will want quick access to your favorite ones.


How-to: Customize the HoudahSpot Alfred Workflow

Alfred by Running with Crayons Ltd. is an award-winning app for macOS which boosts your efficiency with hotkeys, keywords, text expansion and more.

HoudahSpot is a powerful file search tool. It takes the guesswork out of Spotlight searches and helps you find files hidden deep in the “haystack” of files accumulated over the years.

HoudahSpot 5 includes an Alfred workflow that provides an elegant way to start a HoudahSpot search using Alfred. You can install this workflow from HoudahSpot > Preferences > Extensions.

Note: Workflows are part of the extended Alfred feature set, which needs to be unlocked by purchasing the Alfred Powerpack.

Screenshot: Alfred in action
Start a file search from Alfred

The workflow is triggered using the hs keyword in Alfred. You may need to use the arrow keys to select the workflow. Press the space key to activate the workflow.


Search DEVONthink 3 Databases using HoudahSpot

This is a follow-up to an older post. It has been updated for the new DEVONthink 3. Users of the older DEVONthink Pro (Office) can refer to the original post.

DEVONthink is a smart document management solution for Mac. It lets you organize and work with all your documents — bookmarks, email messages, text files, images, PDFs — in one place, regardless of where they come from.

Now that you have all your documents stored and organized in DEVONthink, you can rely on both DEVONthink and HoudahSpot to always find the piece of information you need.


How-to: Customize the HoudahSpot LaunchBar Action

Objective Development’s LaunchBar is an adaptive app launcher, document browser, and much more.

HoudahSpot is a powerful file search tool. It takes the guesswork out of Spotlight searches and helps you find files hidden deep in the “haystack” of files accumulated over the years.

HoudahSpot 5 includes a LaunchBar action that provides an elegant way to start a HoudahSpot search using LaunchBar. You can install this action from HoudahSpot > Preferences > Extensions.

Screenshot: LaunchBar in action
Start a file search from LaunchBar 6

The action is selected by typing the hs abbreviation in LaunchBar.


Customize Your HoudahSpot Search Setup

In HoudahSpot, you can choose between hundreds of criteria to search for files. HoudahSpot also lets you specify in which folders to search and how to sort results.

There are lots of options to choose from – and settings you don’t want to make over and over again.

Search criteria, results display, and sort order are a matter of personal preference and habits. You may, for example, find yourself frequently searching for files by file name extension. You may prefer to search your full hard drive rather than just your home folder. You may want search results to always list file size.

Let’s see how you can set up HoudahSpot so that a fresh search window matches your preferred way of searching.


Search DEVONthink Databases using HoudahSpot

This post applies to DEVONthink Pro (Office). Users of the new DEVONthink 3 should refer to the updated version of this post.

DEVONthink is a smart document management solution for Mac. It lets you organize and work with all your documents (bookmarks, email messages, text files, images, PDFs) in one place, regardless of where they come from.

Now that you have all your documents stored and organized in DEVONthink, you can rely on both DEVONthink and HoudahSpot to always find the piece of information you need.


Find PDF Files That Need OCR Processing

There is an updated version of this post.

Scanning paper documents to PDF files lets you archive important (and not so important) documents without filling up cabinets.

Optical Character Recognition (OCR) makes these scanned documents much more useful than their paper originals. Once a scan has been processed by OCR, the PDF file contains both an image of the document and an invisible text version. The text can then be searched using HoudahSpot.

Unfortunately, you will find that not all of your PDF files have text content. You may have forgotten to run them through OCR. Or you may have received the scanned document from someone else.

How can you find these files and rectify this?

With a little trick, HoudahSpot can find PDF files that lack text content. It is safe to assume that any text contains either a space or a period. Thus, we will be looking for any PDF file that contains neither a space nor a period.

This translates to the following search:

Screenshot: Find PDF files that lack OCR text. The first search field contains a “space” character.
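The same heuristic can be sketched in a few lines of code. This is only an illustration of the "neither a space nor a period" test. It assumes the text of each PDF has already been extracted by some other tool, and the function names and the sample mapping are hypothetical.

```python
def lacks_text_content(extracted_text: str) -> bool:
    """True when the text contains neither a space nor a period."""
    return " " not in extracted_text and "." not in extracted_text


def files_needing_ocr(text_by_file: dict[str, str]) -> list[str]:
    """Hypothetical helper: names of files whose extracted text fails the test."""
    return [
        name for name, text in text_by_file.items() if lacks_text_content(text)
    ]


# Example input: file names mapped to whatever text was extracted from them.
sample = {"scan.pdf": "", "letter.pdf": "Dear Sir. Thank you for your letter."}
candidates = files_needing_ocr(sample)
```

Here `candidates` contains only `scan.pdf`, since the other file's text includes both spaces and periods.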
