Note: This blog post strays from our usual focus on tips & tricks. It does not provide a solution or workaround for Mail searches on macOS Catalina. Rather, it discusses technical background and ethical considerations.
Spotlight vs. Core Spotlight
Recent versions of macOS use two indexing technologies to power local searches in the Spotlight window: Spotlight and Core Spotlight.
Spotlight was introduced with OS X 10.4 Tiger. It indexes user files. Whenever a file is modified, the Spotlight engine calls upon the appropriate importer plug-in to read metadata and text content from the file. That data is then indexed for searching.
The fact that Spotlight works only with files can be a problem for some applications. For “shoebox” applications, it is often more natural to store data items in a single file or database rather than use one file per data item. Such data items cannot be indexed by Spotlight. Thus such applications either have to change their data storage to fit Spotlight’s requirements or resort to tricks to get their data into Spotlight.
Core Spotlight is a more recent addition. Core Spotlight does not watch for data or files to appear. Instead, applications actively submit data to Core Spotlight for indexing. This reversal of roles allows Core Spotlight to index any kind of data.
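To make the reversal of roles concrete, here is a minimal Swift sketch of an application pushing one record to Core Spotlight. The `Note` type, the `com.example.notes` domain, and the function names are hypothetical and serve only as illustration. The sketch assumes macOS 11 or later (for `UTType`) and code running inside an application.

```swift
import CoreSpotlight
import UniformTypeIdentifiers

// Hypothetical record that lives in the app's own database, not in a file.
struct Note {
    let id: String
    let title: String
    let body: String
}

// Describe one note in terms Core Spotlight understands.
func makeSearchableItem(for note: Note) -> CSSearchableItem {
    let attributes = CSSearchableItemAttributeSet(contentType: .plainText)
    attributes.title = note.title
    attributes.contentDescription = note.body
    return CSSearchableItem(
        uniqueIdentifier: note.id,
        domainIdentifier: "com.example.notes", // hypothetical domain
        attributeSet: attributes
    )
}

// The application, not the system, decides when the note gets indexed.
func submitToCoreSpotlight(_ note: Note) {
    let item = makeSearchableItem(for: note)
    CSSearchableIndex.default().indexSearchableItems([item]) { error in
        if let error = error {
            print("Indexing failed: \(error.localizedDescription)")
        } else {
            print("Submitted note \(note.id) for indexing")
        }
    }
}
```

An application would call `submitToCoreSpotlight(_:)` whenever it saves or changes a note. Nothing reaches the index until the application makes that call, which is the opposite of Spotlight noticing a modified file on its own.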
Migration to Core Spotlight
In recent years, Apple has migrated a few system applications to use monolithic storage rather than individual files. A few years back, Safari history items and Apple Notes were saved as individual files. We were sad to see this information moved to databases. With that, a file search tool can at best find the file that holds all notes. This is of much less interest than the individual notes.
Such items still appear in the Spotlight window by way of Core Spotlight.
Some third-party applications have also found it easier to add support for Core Spotlight than to adapt their data structures to work with Spotlight.
HoudahSpot and Tembo
Up until recently, Core Spotlight was of no interest to HoudahSpot and Tembo. Core Spotlight is typically used to index items not available as individual files. HoudahSpot and Tembo, however, are designed as file search and organization tools. These applications expect to work with files that have a name and a path, can be tagged, can be copied, etc.
The other reason HoudahSpot and Tembo have not added support for Core Spotlight is that Apple has yet to provide a public API that allows third-party applications to search Core Spotlight. The documented API allows only for searching data owned by the application that does the searching.
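For reference, this is roughly what the documented search path looks like. The sketch below uses `CSSearchQuery`, which returns only items that the calling application itself has indexed. The function names and the choice to match on `title` are our own. The `init(queryString:attributes:)` initializer is used for brevity; as far as we can tell, newer SDKs favor a context-based initializer instead.

```swift
import Foundation
import CoreSpotlight

// Build a Core Spotlight query string that matches titles containing `text`,
// ignoring case (c) and diacritics (d).
func titleQueryString(containing text: String) -> String {
    let escaped = text
        .replacingOccurrences(of: "\\", with: "\\\\")
        .replacingOccurrences(of: "\"", with: "\\\"")
    return "title == \"*\(escaped)*\"cd"
}

// Search the items this application has submitted to Core Spotlight.
// Items indexed by other applications, such as Mail, are not returned.
func searchOwnItems(titleContaining text: String) -> CSSearchQuery {
    let query = CSSearchQuery(queryString: titleQueryString(containing: text),
                              attributes: ["title"])
    // Called zero or more times with batches of matching items.
    query.foundItemsHandler = { items in
        for item in items {
            print(item.uniqueIdentifier, item.attributeSet.title ?? "(untitled)")
        }
    }
    // Called once when the search ends or fails.
    query.completionHandler = { error in
        if let error = error {
            print("Search failed: \(error.localizedDescription)")
        } else {
            print("Search finished")
        }
    }
    query.start()
    return query // keep a reference so the caller can cancel the search
}
```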
This became a problem with the release of macOS 10.15 Catalina. Even though the Apple Mail application still stores mail messages as individual files, it has moved indexing from Spotlight to Core Spotlight. This puts searching for Apple Mail messages off-limits to third-party applications, scripts, and automation tools.
The removal of Mail messages from the Spotlight index does not only affect file search tools.
Spotlight was the de facto API for accessing Mail messages. It gave access to messages, their subject, sender and recipient names, as well as a wealth of other well-documented metadata. Spotlight also provided notifications when new mail was downloaded.
This allowed applications and scripts to work with mail without duplicating the effort of connecting to mail servers. Automation tools could set up actions to run upon receiving email messages. For example, a mail to oneself saying “turn on screen sharing” could trigger that very action.
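As an illustration of the kind of code this made possible, here is a sketch of a Spotlight query for Mail messages using `NSMetadataQuery`. It reflects the behavior before Catalina. The content type `com.apple.mail.emlx`, the `kMDItemSubject` attribute, and the helper names are our assumptions for the purpose of illustration.

```swift
import Foundation

// Predicate for Mail message files whose subject contains `text`.
func mailPredicate(subjectContaining text: String) -> NSPredicate {
    return NSPredicate(
        format: "%K == %@ AND %K LIKE[cd] %@",
        "kMDItemContentType", "com.apple.mail.emlx",
        "kMDItemSubject", "*\(text)*"
    )
}

// Start a live query. Call this on the main thread of a process
// that has a running run loop.
func startMailQuery(subjectContaining text: String) -> NSMetadataQuery {
    let query = NSMetadataQuery()
    query.predicate = mailPredicate(subjectContaining: text)
    query.searchScopes = [NSMetadataQueryUserHomeScope]

    let center = NotificationCenter.default
    // Fires once the initial set of results has been gathered.
    center.addObserver(forName: .NSMetadataQueryDidFinishGathering,
                       object: query, queue: .main) { notification in
        guard let query = notification.object as? NSMetadataQuery else { return }
        query.disableUpdates() // freeze results while reading them
        for case let item as NSMetadataItem in query.results {
            let subject = item.value(forAttribute: "kMDItemSubject") as? String
            print(subject ?? "(no subject)")
        }
        query.enableUpdates()
    }
    // Fires when the set of matches changes, e.g. when new mail arrives.
    center.addObserver(forName: .NSMetadataQueryDidUpdate,
                       object: query, queue: .main) { notification in
        guard let query = notification.object as? NSMetadataQuery else { return }
        print("Matches now: \(query.resultCount)")
    }

    query.start()
    return query // the caller must keep this reference alive
}
```

A script or automation tool would call `startMailQuery(subjectContaining: "screen sharing")`, keep the returned query around, and react in the update handler. A command-line tool would additionally need to keep its run loop going, for example with `RunLoop.main.run()`.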
Apple has removed this public API without prior notice of deprecation and without providing a replacement.
The decision not to allow third-party applications access to Mail searches steps out of line with Apple’s current efforts on privacy. On macOS, Apple likes to put the user in the driver’s seat. At every turn, the system asks the user whether to allow or deny access to potentially sensitive data.
macOS Mojave and Catalina have an option to allow applications to access application data like Mail, Messages, etc. The user can thus decide to trust an application with access to their Mail messages. Yet Apple all but overrides the user’s choice by still not allowing that application access to the search index for Mail messages.
This may appear to be a cautious approach that favors security and privacy over application interoperability and productivity. In truth, the new situation is likely to undo privacy benefits provided by the “Full Disk Access” protection introduced with macOS Mojave.
Power users and third-party applications are likely to create their own search indexes. These additional copies of the private data contained in mail messages will not benefit from SIP / “Full Disk Access” protection.
It appears that from a technical point of view, Spotlight and Core Spotlight are not all that different. Both share a common index file format. Spotlight typically stores its index at the root of the drive. The Core Spotlight index is found in the user home folder.
Further digging left the impression that Spotlight and Core Spotlight also share the same search API. The very same API that HoudahSpot uses to work with Spotlight seems to be capable of bringing in search results from Core Spotlight. It appears it does so when used by Apple’s Spotlight window application. Yet it refuses to do so when called from a third-party application.
The Apple Spotlight application has two Apple private codesigning entitlements: com.apple.private.corespotlight.internal and com.apple.private.corespotlight.search.internal. It seems reasonable to assume that these entitlements trigger the change in behavior of the Spotlight search API. With these entitlements present, the default Spotlight application gets access to search results that are withheld from third-party applications.
One has to applaud the fact that the existing API did not simply start bringing in Core Spotlight results when that new technology was introduced. An application that expects search results to contain only files would have been caught off guard had the API thrown in data items from shoebox databases.
Unfortunately, Apple chose not to provide an option for third-party applications to opt into Core Spotlight search results. This puts more and more user data (Apple Notes, Safari bookmarks, third-party “shoebox” data, etc.) off-limits to search and automation tools.
What is frustrating about the Catalina Mail situation is that there is no obvious reason for either the switch to Core Spotlight or the refusal to let third-party applications search for Mail messages. Even after the switch to Core Spotlight, Apple could have allowed the Spotlight API to include Mail messages in search results.
There is, however, a much bigger problem. Since there is no way for third-party applications to search Core Spotlight, no third party can offer a full-featured alternative to the Spotlight window. Apple has purposely limited API access so that no third party can build upon the existing search indexes to provide better tools.
With all its flaws and limitations, the Spotlight window is made to be the only user interface that can dig up both files and data stashed in shoebox applications. Power users who need to work with more files than the anemic Spotlight can handle are forced to juggle Spotlight, Finder, and third-party tools to get a complete picture of their data.
For example, when one tries to find all interactions with a particular client, it would be most useful to find project files, invoices, mail messages, calendar entries, notes, etc. in one place. This used to be the case. As more and more data moves to shoeboxes and indexing shifts to Core Spotlight, one has to check in many places to piece together a complete picture again. The Spotlight window shows only the top few matches. Finder and HoudahSpot provide the remaining files. The Spotlight window also finds a few notes and Mail messages. Yet one has to visit each application separately to get a full set of search results.
Automation and application integration – formerly a hallmark of the Mac – are pushed out the door as more and more user data is hidden away in iOS-style data silos. It is time for a change of course that once again favors productivity.
The solution is obvious: Apple needs to again allow third-party applications full access to the search engines it builds into macOS. This includes both Spotlight and Core Spotlight.
We encourage readers to submit feedback to Apple.