When I start a search review project I never ask people what information they need to find — I ask what decisions they have to make.
Knowing the decisions gives me a clear picture of information requirements within the context of a business process. And the one factor that comes up repeatedly during these interviews is how valuable it is to search by a date or date range.
“I often have to give presentations to prospective clients and I remembered seeing an excellent diagram in an internal presentation I attended earlier this year. But when I searched I couldn’t find it.”
When we repeated the search together (always instructive) the file did not appear as a PowerPoint file and the author was clearly not the person who gave the presentation. My interviewee knew the date as it was down in her Outlook calendar, but searching by the date also failed. Only later did we find that it had been saved as a PDF and the Modified By date was a month or more after the presentation. Because it was an internal presentation, no "Event" information provided a date stamp.
Defining a Date
Google provides only relative dates on the main search page but Google Scholar offers search by publication date, which for academic content is of considerable value. In a typical enterprise search environment there is a mixture of actual and relative date queries, and that mixture gives rise to a significant amount of frustration on the part of users.
My least favorite date field is Modified By, because there is no accepted definition of what "modified" means. Someone could have opened a 2014 document a week ago, corrected the spelling of the name of the author, and saved it. The document would now have a 2016 Modified By date.
While in principle finding the most recent version of a document could be valuable, that's not necessarily the same as finding the document that was modified most recently.
Another issue arises when moving a collection of content items to a new server (for example during a migration project). This can result in all of the content items having the same date — the date of the migration — even though they may stretch over several years.
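One rough way to spot this after a migration is to count how many items share the same date. The sketch below is only an illustration in Python: it assumes each content item is a dictionary with a hypothetical "modified" field, and the 50 percent threshold is an arbitrary assumption rather than a recommendation.

```python
from collections import Counter
from datetime import date


def dominant_date(items, threshold=0.5):
    """Return the date shared by more than `threshold` of items, or None.

    `items` is a list of dicts with a hypothetical "modified" key holding
    a datetime.date. A single dominant date suggests that a bulk move
    overwrote the original dates.
    """
    if not items:
        return None
    counts = Counter(item["modified"] for item in items)
    day, count = counts.most_common(1)[0]
    return day if count / len(items) > threshold else None


# Illustrative data only: three of four items carry the same date.
sample = [
    {"title": "Plan 2011", "modified": date(2016, 3, 1)},
    {"title": "Plan 2012", "modified": date(2016, 3, 1)},
    {"title": "Plan 2013", "modified": date(2016, 3, 1)},
    {"title": "Plan 2015", "modified": date(2015, 6, 9)},
]
suspect = dominant_date(sample)
```

A collection that returns a date here is a candidate for restoring the original dates from another source before users are asked to filter on them.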
For example, the Syngenta website shows the current date for all search results, which is not helpful. A search for [Q1 2014] to find a trading statement has Q1 2009 information as the second most relevant item. To be fair, Syngenta is by no means the only example of date delirium. Dropbox has also been causing some challenges over date management, with broader implications for cloud search.
People search is also rife with date delirium: “There was a chap down in South Africa when I was there in 2013 who was brilliant at SERS but I can’t remember his name!” SERS stands for surface-enhanced Raman spectroscopy and no one will use that as a query term. The point is that it can be helpful to use search as a roll-back to find people.
It Is Not a Search Tech Issue …
Search software can help to a certain extent by normalizing date formats (US versus UK), but then the business needs to decide what to present to the user. Ideally the format of the date shown should be unambiguous.
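To see why normalization alone cannot settle the matter, consider a date written with slashes. The sketch below is a minimal illustration using Python's standard datetime module, not how any particular search product works. The function name parse_candidates is my own; it returns every reading a string could have under UK and US ordering.

```python
from datetime import datetime


def parse_candidates(text):
    """Return every ISO date that `text` could mean under UK or US order."""
    formats = {"UK (day/month)": "%d/%m/%Y", "US (month/day)": "%m/%d/%Y"}
    results = {}
    for label, fmt in formats.items():
        try:
            results[label] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            # The string is not a valid date under this convention.
            pass
    return results


# Valid under both conventions, so the source of the content must decide.
ambiguous = parse_candidates("03/04/2014")
# Only valid as a UK date, because there is no month 25.
unambiguous = parse_candidates("25/04/2014")
```

When both readings are valid, the software has to fall back on something else, such as the locale of the originating system, and that choice is a business decision rather than a technical one.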
Search software also excels at entity extraction, bringing out date references and indexing them. But again, it's up to the business to decide what ranking to set in place, as the weighting of extracted dates may be lower than for attributed (i.e. tagged in the metadata) dates. I have only seen one search application where a person could specifically request a list ordered by extracted dates.
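The weighting decision can be made concrete with a small sketch. Everything here is an assumption for illustration: the weights, the field names "metadata_date" and "body", and the use of a simple regular expression as a stand-in for real entity extraction, which would handle far more date formats.

```python
import re
from datetime import date

# Hypothetical weights: attributed (metadata) dates count for more than
# dates extracted from the body text. The values are assumptions.
WEIGHTS = {"attributed": 1.0, "extracted": 0.4}

ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_dates(text):
    """Find ISO-style dates (YYYY-MM-DD) in text, skipping invalid ones."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue
    return found


def date_score(doc, start, end):
    """Score a document by its best-weighted date falling in [start, end]."""
    score = 0.0
    if doc.get("metadata_date") and start <= doc["metadata_date"] <= end:
        score = WEIGHTS["attributed"]
    if any(start <= d <= end for d in extract_dates(doc["body"])):
        score = max(score, WEIGHTS["extracted"])
    return score


doc = {
    "metadata_date": date(2016, 5, 20),  # e.g. the day the PDF was saved
    "body": "Slides from the internal presentation held on 2016-04-12.",
}
april_score = date_score(doc, date(2016, 4, 1), date(2016, 4, 30))
may_score = date_score(doc, date(2016, 5, 1), date(2016, 5, 31))
```

With these assumed weights the document still surfaces for an April query, but it ranks below items whose metadata date falls in April. Whether that is the right behavior depends on how far the business trusts its metadata.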
Advanced search (how I dislike that term!) is usually not much help here. While people can usually specify date ranges, it's unclear how the search software defines the dates falling within the specified range.
A feature that would be helpful — and that I have yet to see — would give users the option of extending a date range in both directions when they click on a document's date. This could help users find associated documents when only one document from a project is highlighted. Just a thought for search vendors.
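As a rough sketch of what such a feature might do behind the scenes, the Python below widens a range by a fixed number of days on either side of a clicked document's date. The function names, the "date" field and the 30-day default are all hypothetical.

```python
from datetime import date, timedelta


def widen_range(anchor, days=30):
    """Return (start, end) extending `days` either side of `anchor`."""
    span = timedelta(days=days)
    return anchor - span, anchor + span


def documents_near(docs, anchor, days=30):
    """Return documents whose date lies inside the widened range."""
    start, end = widen_range(anchor, days)
    return [d for d in docs if start <= d["date"] <= end]


# Illustrative data only.
docs = [
    {"title": "Project kickoff notes", "date": date(2016, 3, 20)},
    {"title": "Project presentation", "date": date(2016, 4, 12)},
    {"title": "Unrelated memo", "date": date(2015, 1, 5)},
]
nearby = documents_near(docs, date(2016, 4, 12), days=30)
```

The value of any such feature still depends on which date is stored against each document, which brings us back to content quality.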
… It Is a Content Quality Issue
People search for information in date-specific ways. Businesses need to acknowledge this and view date management as part of the broader enterprise content quality standards and guidelines.
Have your search team search by date for content where date management is critical. Identify individual items or groups of items where additional date information could usefully be added, and consider whether the weighting of these needs to be changed (this will not apply to all content). But the first step is removing any Modified By filter until you've reached a unique and agreed-upon definition of what it signifies.