Creating Secure Search for Your Big Data Goldmine

Kamran Khan

In modern business and government, your data is your goldmine. 

It adds value to your organization, shows your history of expertise, and is used to make insightful decisions on critical matters. Like an actual goldmine, your data must be both secure and accessible. Without the former, anyone can walk away with your wealth of knowledge. Without the latter, your data does nothing more than take up server space.

There are ways to balance the need to keep your data safe with the need to make it searchable and accessible to the correct people within your organization.

Involve Data Managers

The first step in delivering secure yet accessible data is to conduct a comprehensive assessment of your data landscape. Most organizations have multiple data repositories, controlled by different departmental groups, and it is critical to involve the content or repository owners from each area.

Whoever oversees SharePoint, Documentum, file shares and other secure repositories should be involved in this process. Each will approach security differently, and will have their own set of needs and level of paranoia about access security.

Once you have the repository owners on board, it is time to devise a strategy. This includes deciding what documents to make searchable, understanding how often access rights change, and creating processes to ensure that content owners remain comfortable as the data landscape expands and evolves. If they are not comfortable with the plan, they will find ways, be they political or physical, to block access.

This takes time upfront, which busy managers are loath to give, but skipping it will cost the company far more in lost time and productivity down the road.

Embed Permissions

The next step is indexing data into the search engine in a way that captures not only document content and metadata, but also document-level security information such as Access Control Lists (ACLs).

This “pre-binding” security approach is the most effective way to ensure the correct person can access the relevant data at the exact moment they need it. It ensures that access via search results abides strictly by the security rules set by the repository owners. By having permissions embedded in the index, when Sarah runs a search query, the search engine knows exactly what data Sarah has permission to view and only presents her with those specific results.
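The idea of embedding permissions in the index can be illustrated with a minimal sketch. It is not the API of any particular search engine: the `IndexedDoc` and `User` types, the group names and the sample documents are all hypothetical, and a real index would use an inverted index rather than a linear scan. The point it shows is that each document carries its ACL from index time, so the permission check happens inside the search itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    # Hypothetical ACL: groups allowed to read the document,
    # captured from the source repository when it was indexed.
    allowed_groups: frozenset


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset


def search(index, user, query):
    """Return IDs of documents that match the query and that the user may see."""
    terms = query.lower().split()
    results = []
    for doc in index:
        # Security filter first: skip documents none of the user's groups can read.
        if not (doc.allowed_groups & user.groups):
            continue
        if all(term in doc.text.lower() for term in terms):
            results.append(doc.doc_id)
    return results


# Illustrative data only.
INDEX = [
    IndexedDoc("hr-001", "Salary bands by grade", frozenset({"hr"})),
    IndexedDoc("eng-007", "Tooling budget and salary costs", frozenset({"engineering"})),
    IndexedDoc("pub-100", "Company holiday calendar", frozenset({"everyone"})),
]

sarah = User("Sarah", frozenset({"engineering", "everyone"}))
print(search(INDEX, sarah, "salary"))
```

In this sketch Sarah's query for "salary" matches two documents by content, but only the one her groups are allowed to read is returned. The HR document never appears in her results at all.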

Organizations often err on the side of safety by keeping content out of the index, especially where repository owners are not completely confident that the search system will fully adhere to their security protocols. This restricts the scope of search, prevents the organization from fully mining its content and frustrates search users. Diligence with security builds confidence among repository owners, who are then more likely to allow the indexing of the content they control, leading to a more robust search system for all involved.

Create Rule Sets

Sometimes, fully respecting ACLs is not enough to provide the desired level of access control. This is especially so when dealing with multiple data collections owned by different departments. Additional security can be provided by looking for sensitive words and concepts in document metadata, or even within the content of the document, and further limiting access based on this information.

For example, words like “salary” might cause access to be further limited to just members of the HR department, or mentions of “Project X” can be limited to a specific engineering group.
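A keyword rule of this kind might be sketched as a second check layered on top of the ACL. Everything here is an assumption made for illustration: the `SENSITIVE_TERMS` table, the group names and the simple substring matching. A production system would more likely match on analyzed tokens or concepts than on raw substrings.

```python
# Hypothetical rule table: sensitive term -> groups allowed to see documents containing it.
SENSITIVE_TERMS = {
    "salary": {"hr"},
    "project x": {"engineering-x"},
}


def required_groups(text):
    """Return one set of permitted groups for each sensitive term found in the text."""
    lowered = text.lower()
    return [groups for term, groups in SENSITIVE_TERMS.items() if term in lowered]


def may_view(text, acl_groups, user_groups):
    """Apply the repository ACL first, then every matching sensitive-term rule."""
    if not (set(acl_groups) & set(user_groups)):
        return False
    # The user must belong to a permitted group for every sensitive term present.
    return all(groups & set(user_groups) for groups in required_groups(text))


print(may_view("Salary review notes", {"everyone"}, {"everyone"}))
print(may_view("Salary review notes", {"everyone"}, {"everyone", "hr"}))
```

The rules only ever narrow access. A user who fails the ACL check is rejected before any keyword rule is consulted, so a rule can never reveal a document the repository owner has restricted.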

In a well-implemented search system, administration involves a set of “business rules” which control these exceptions. These can be simple or complex. For example, if only a select group of employees is permitted to access documents pertaining to an impending merger, you can create a rule set that allows others to see documents containing either of the two companies’ names, but not documents where the two companies are mentioned in the same paragraph. This way, employees could still find reports on their “competitor” without knowing they will soon be on the same team.
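The merger rule can be sketched in a few lines, under some loudly stated assumptions: paragraphs are separated by blank lines, company names are matched as plain case-insensitive substrings, and "Acme", "Globex" and the "merger-team" group are invented names used only for this example.

```python
import re


def mentions_both_in_one_paragraph(text, name_a, name_b):
    """True if any single paragraph mentions both company names."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    a, b = name_a.lower(), name_b.lower()
    return any(a in p.lower() and b in p.lower() for p in paragraphs)


def visible_to(text, user_groups, name_a, name_b, cleared_group="merger-team"):
    """Hide co-mention documents from everyone outside the cleared group."""
    if cleared_group in user_groups:
        return True
    return not mentions_both_in_one_paragraph(text, name_a, name_b)


# Illustrative documents with hypothetical company names.
market_report = "Acme grew its market share.\n\nGlobex launched a rival product."
merger_memo = "Integration plan: Acme and Globex will combine sales teams."

print(visible_to(market_report, {"sales"}, "Acme", "Globex"))
print(visible_to(merger_memo, {"sales"}, "Acme", "Globex"))
```

A report that discusses each company in its own paragraph stays visible to everyone, while a memo that names both in one paragraph is shown only to the cleared group.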

What is important to remember with these rule sets, or any aspect of maintaining search access security, is that systems need to be designed and maintained by humans. Enterprise search systems contain highly sophisticated technology, but ongoing human oversight and the processes that support it are the key to success.

Different organizations maintain different data security protocols and have varying tolerance levels for data sensitivity. The meshing of search and security should be customized to fit each situation. This individualized attention will keep your organization’s data goldmine under lock and key, yet easily accessible to those with the correct key.

Image courtesy of optimarc (Shutterstock)

Editor's Note: To read more of Kamran's thoughts on search, check out Structuring the Unstructured: Why Big Data is Suddenly Interested in Enterprise Search

About the author

Kamran Khan

As an investor and president and CEO of Pureinsights, Kamran provides the company with clear, pragmatic leadership. With over 25 years in the search industry, Kamran was a key executive at Convera before co-founding Search Technologies in 2005.