Any outage or lost document in a business-critical application is going to be costly. The average cost of datacenter downtime is around $300,000 per hour, so having at least a modest data protection plan in place moves from a nice-to-have to a must-have.
Almost every business has had this conversation; the pain comes in when trying to balance the cost of protection with the probability of loss. An overly broad strategy means paying to protect worthless content, while an overly narrow one risks missing something important. But who really knows what’s valuable, and who should be responsible for setting the expectations?
Framing the data protection plan in terms of information governance places the focus on the content and access to that content, instead of worrying about where the content lives. This allows you to break down the planning into manageable pieces, creating a more flexible plan that can be quickly adapted if a company moves to cloud hosted infrastructure or a Software-as-a-Service (SaaS) environment. With an understanding of content and its value, you can plan around any scenario.
So how does this apply to information governance? Information governance is about getting the right information to the right people at the right time (or putting the right information in the right place at the right time). For this reason, many governance, risk and compliance conversations focus on the lifecycle management of content, asking questions such as:
- How long do I need to retain this document?
- Where should it be stored?
- Who should or should not see it?
A lot of these conversations impact disaster recovery. For example, if a flood knocks out your primary datacenter, does the failover site provide full access or is it read-only? Do users have access to offline copies of documents?
Data protection strategies come in many forms — it can be hard to decide where to start, as there are many different factors at play. However, viewing disaster recovery through the lens of information governance can ease the planning process.
Approaches to Identification and Classification
Before any plan can be successfully implemented, get a handle on the current environment through a process of identification and classification. In general, there are two approaches to this process — reactive and proactive — which are by no means mutually exclusive.
Reactive: This approach focuses on reporting and can be quite cumbersome due to the level of research required. It depends on someone in the business gathering data on usage statistics and last accessed times. It also means the audit trail must be accurate and available. Once the information has been collected, the business can make decisions about how to apply a consistent policy or industry standard. If the classification is successful, stale content can be archived and low traffic items can be treated as low priority for backup.
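A reactive pass over collected usage data can be sketched as a simple bucketing script. The `ContentItem` record, the field names, and the age thresholds below are illustrative assumptions, not a prescribed standard; a real pass would read from audit logs or usage reports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical record built from audit-trail or usage-report data.
@dataclass
class ContentItem:
    path: str
    last_accessed: datetime


def classify(items, now=None, stale_days=365, low_traffic_days=90):
    """Bucket items for archiving or low-priority backup.

    The day thresholds are example values; each business would set its own.
    """
    now = now or datetime.now()
    buckets = {"archive": [], "low_priority": [], "active": []}
    for item in items:
        age = now - item.last_accessed
        if age > timedelta(days=stale_days):
            buckets["archive"].append(item.path)       # stale: candidate for archiving
        elif age > timedelta(days=low_traffic_days):
            buckets["low_priority"].append(item.path)  # low traffic: back up less often
        else:
            buckets["active"].append(item.path)        # recently used: protect first
    return buckets
```

The output buckets map directly onto the policy decisions described above: archive the stale content, deprioritize the low-traffic items, and focus protection on what is actively used.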
Proactive: The key to this approach is putting the correct controls and policies in place so that priority is applied as content is generated. This ties in directly with consistently managing the lifecycle of the content — the hallmark of information governance. This approach allows considerations such as acceptable downtime, backup frequency and scope to be considered every time an object is provisioned.
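Applying priority as content is generated can be as simple as attaching policy metadata at provisioning time. The site types, tier names, and downtime numbers in this sketch are invented for illustration; the point is that the classification happens when the object is created, not after the fact.

```python
# Illustrative policy table: site type -> backup tier and acceptable downtime.
# These categories and values are assumptions, not a standard.
POLICY = {
    "finance":   {"tier": "precision", "max_downtime_hours": 1},
    "team-site": {"tier": "component", "max_downtime_hours": 24},
    "archive":   {"tier": "wide-net",  "max_downtime_hours": 168},
}


def provision(name, site_type):
    """Attach backup priority and acceptable-downtime metadata at creation time."""
    policy = POLICY.get(site_type, POLICY["team-site"])  # assumed safe default
    return {"name": name, "site_type": site_type, **policy}
```

Because every object carries its tier from day one, the backup scope question never needs to be researched retroactively.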
Either approach should yield similar results, but when time matters most, it is a relief knowing you have a proactive plan in place rather than assessing the damage after the fact. Proactive plans can yield collateral benefits for compliance and risk management by maintaining up-to-date ownership and classification of content.
Three Backup Strategies
Once you've completed classification, you can begin backup planning. There are three backup strategies to consider, each matching the level of protection to specific business needs.
Precision Backup: For high-traffic sites and high-value, rapidly changing content, the backup window needs to be as small as possible to ensure the most up-to-date information is protected. The biggest challenge with this approach is properly classifying content, because its reliance on granular, file-level backup processes does not scale to a broad scope of objects. As the number of items increases, backup time grows and your recovery point starts slipping. By focusing precision backups on only the highest-value content, administrators can maintain a tight recovery point objective and avoid costly outages. In terms of frequency, this type of backup should run anywhere from every few minutes to every few hours, depending on the size, complexity and change rate of the objects.
Component Backup: By necessity, SharePoint components and databases demand a longer backup window, as there is simply more data to back up than with file-level backups. Because the change rate is lower, these components do not require the same frequency as high-priority items. A plan for this type of backup should fall in the daily-to-weekly range.
Wide Net Backup: For servers or virtual machines, changes are even less frequent, but in a total loss situation, rebuilding a host from scratch takes time. If these components are protected and a failover farm is available, the environment can be restored more quickly, though that requires an investment in standby hardware. A plan for this type of backup should fall in the weekly-to-monthly range.
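The three tiers above can be captured in a small schedule table, along with the scaling caveat noted for precision backups: the recovery point can never be tighter than the longer of the schedule interval and the backup run itself. The specific intervals here are examples taken from the ranges above, not recommendations.

```python
from datetime import timedelta

# Example schedule intervals for the three tiers described above
# (chosen from the stated ranges purely for illustration).
SCHEDULES = {
    "precision": timedelta(minutes=15),  # minutes-to-hours range
    "component": timedelta(days=1),      # daily-to-weekly range
    "wide-net":  timedelta(weeks=1),     # weekly-to-monthly range
}


def worst_case_rpo(interval, backup_duration):
    """Rough lower bound on the achievable recovery point objective:
    if a backup run takes longer than its scheduled interval, the
    effective recovery point slips to the run duration instead."""
    return max(interval, backup_duration)
```

This makes the classification/scalability tradeoff concrete: adding too many items to the precision tier lengthens the run until it, not the schedule, dictates the recovery point.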
Beyond the wide net approach, business continuity strategies begin to rely on highly available instances or multiple farm replication. The same tradeoffs between cost and time are still at play here, and deciding replication schedules around content priorities can still provide value.
For successful information governance, all of these approaches rely on consistent application of policies. Knowing what to protect is important, but knowing the relative value of the information can help build smarter disaster recovery plans.