The first week of 2021 was so 2020, and the business productivity world was not immune. Hours into the first business day of the New Year on Monday, Jan. 4, Slack went down. The outage lasted from around 9 a.m. to 12:15 p.m. ET, according to Slack’s status report.
Slack’s Monday-morning downtime sent the business world into a frenzy, leaving employees to wonder how to possibly connect with colleagues digitally while revving up the Twitter GIF engine with illustrations of chaos and business paralysis.
Three-hour outages of communications tools likely won't bring down NASDAQ-listed enterprises, but they raise the question: how should businesses respond when central collaboration and communications tools go down and usually reliable cloud computing services demonstrate their frailty like an unplugged cable box? Is a backup plan needed for these times, and how extensive should it be?
“A contingency plan means the impact is lessened,” said Sam Marshall, owner and lead consultant at ClearBox Consulting, a digital workplace consultancy. “You don’t want people trying to figure out what to do — or spending time debating different ideas about what to do. For example, sending emails arguing the merits of Mural over Google Docs.”
Single Point of Contact
Of course, it’s 2021 and human beings tend to be resourceful when connecting digitally. Employees don't have to be reminded that when Slack goes down they can reach colleagues by text, email, phone, other collaboration tools and real-time document collaboration. But should there be a more formal connection compass for businesses when tools like Slack go down?
After all, tools like Slack in business are like text and IM for consumers, and this was true even before the COVID-19 pandemic sent most of us into our living rooms and kitchens to do work. Slack has 12 million daily active users. Microsoft Teams has 115 million daily active users.
Slack going down Monday, Jan. 4 kinda felt like a business blackout, right? What’s the immediate next step?
Keep communication to a single point of contact/source of truth, said Kristina Podnar, digital policy and privacy consultant. “When something goes down how will you know it is down and who will let everyone know — and how — when it is back online?” she asked. “Much like with smoke in the building, you need to have a designated ‘Fire Marshall’ for when systems go down. Someone who can assess how big of a deal the outage is and direct people to a safe solution.”
The point person directs everyone via whatever has been designated in the policy as the correct channel to communicate in an outage. This could even be a text message. They communicate reminders on what the backup plan is and what to do while the outage is impacting the organization.
Approved Backup Communications Tools
Having a point of contact, and a policy to back things up, can also help avoid costly shadow IT as employees scramble. Policies for operating without a tool like Slack should include a list of approved backup tools and distinct guidelines on how to use those tools, according to Morten Brøgger, CEO of Wire, a collaboration platform.
Without a clear understanding of which tools are appropriate, employees may turn to what is most accessible or comfortable instead of what is the right fit, Brøgger said. If Slack goes down, do you feel comfortable with your company’s data and information being discussed on WhatsApp, and therefore shared with Facebook? Probably not. Or would you prefer your employees to use secure enterprise-grade tools?
“To make an informed decision about which tools should make it onto your approved list, it’s important to decide which factors — security, privacy, ease-of-use, integrations and time-to-implementation — are most crucial for your business,” Brøgger added.
Teams often need to share information asynchronously, from innocuous/fun messages to serious messages containing important IP or sensitive customer data. Each of these communication categories carries a different type of significance for your business, and, more importantly, a different threat level if leaked.
“Because of this,” Brøgger said, “it is important to clearly define which collaboration tools are appropriate for which types of communication. This is especially important in light of the pandemic’s shelter-in-place orders, as most employees are now isolated in their homes, and have increased their use of consumer-facing tools that lack proper cybersecurity protocols. Not addressing this could result in a serious shadow IT problem, where employees create unnecessary cyber risks by using systems without explicit company approval.”
The critical point of a collaboration-outage policy would be to clearly state that non-corporate platforms (personal email, consumer chat programs, etc.) must never be used to share sensitive company material, according to Dan Nadir, chief product officer at Theta Lake, which provides security for collaboration platforms. “This ensures,” he said, “that corporate information, sensitive or not, is not transmitted in an insecure manner.”
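For teams that keep their outage policy somewhere machine-readable, the idea of matching tools to types of communication could be sketched roughly as follows. This is only an illustration: the sensitivity levels, the tool names and the helper functions are all hypothetical placeholders, not anything the sources quoted here prescribe.

```python
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical communication categories, from least to most sensitive."""
    CASUAL = 1      # innocuous or fun messages
    INTERNAL = 2    # routine work discussion
    RESTRICTED = 3  # important IP or sensitive customer data


# Assumed approved-backup list. The tool names are placeholders that a
# real policy would replace with the company's own sanctioned tools.
APPROVED_BACKUPS = {
    Sensitivity.CASUAL: ["corporate_email", "conference_call", "secondary_enterprise_chat"],
    Sensitivity.INTERNAL: ["corporate_email", "conference_call"],
    Sensitivity.RESTRICTED: ["corporate_email"],
}


def allowed_backup_tools(level: Sensitivity) -> list[str]:
    """Return a copy of the backup tools approved for this sensitivity level."""
    return list(APPROVED_BACKUPS[level])


def is_allowed(tool: str, level: Sensitivity) -> bool:
    """True only if the tool is on the approved list for this level."""
    return tool in APPROVED_BACKUPS[level]
```

Anything not on the list, such as a personal chat app, is rejected at every level, which mirrors the "approved tools only" stance described above.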
Make Sure Collaboration Tools Are Part of Your Disaster Recovery Plan
Most businesses likely have some kind of disaster recovery and business continuity policy documents. And those could cover collaboration-tool outages but may need to be updated.
“With the rapid shift to remote work, a lot of organizations made changes out of pure necessity and did not have time to change these plans and procedures,” said Fernando Castanheira, chief information officer at Aternity, a digital employee experience management provider. “Check to make sure that collaboration tools are even included. Outline alternative methods and company approved tools that can be utilized if the main tool is down.”
No Backup Plan Is Perfect
Backup plans do not come without caveats. The problem with a backup plan in these situations is that a properly deployed collaboration tool will be the main conduit for information and workflows at a company, according to Carrie Basham Marshall, founder of Talk Social to Me, which provides digital workplace and employee experience consulting.
Redundancy may be temporarily achieved by duplicate functionality in other tools when, for instance, users revert to an old version of Skype to chat, or hop on a Zoom call for a face-to-face conversation, according to Basham Marshall.
“But aside from email,” she said, “IT leaders don't typically sanction a secondary collaboration platform in the event of an outage. That would result in too many information silos, which is one of the problems a tool like Slack attempts to remedy. We find that companies simply deal with the inconvenience of a major outage, and in many cases, employees enjoy the respite from Slack's constant stream of information and pings for attention.”
What About Freemiums as Backup Tools?
Is it worth considering freemium collaboration tools as a backup? After all, we’ve all become pretty good at “jumping on” something quickly at no cost to our companies. And there’s no shortage of these tools.
However, Nadir feels that free consumer products don't have the appropriate level of security, and authentication on such a platform would likely be a problem. Further, he said, paid "backup" products won't likely have a good ROI as outages will likely be very short-lived. “The last Slack outage lasted about three to four hours,” he said. “It would likely take two hours to get people migrated to the backup system, and then even more time to get them to move back.”
Most freemium models have an expiration date, so it’s likely you would need to start your free trial when an outage happens, Brøgger noted. “This opens up a whole other host of issues,” he added, such as getting employees access to that tool during an outage, making sure employees are familiar with that tool and more.
Many companies use Slack but also have Microsoft Teams available through their Microsoft 365 package, an example of a situation where a backup collaboration tool is common and functional, according to Castanheira.
However, he added, the challenge is to get your employees to quickly shift to alternate tools. "The coordination and communication of these shifts is something that should not be overlooked,” Castanheira said.
Adding another collaboration tool in the event of an outage is likely to cause confusion, according to Marshall. For one thing, he said, it will be less familiar. For another, once people get back online with the regular collaboration tool, they will forget about the fragments of conversation that took place in the backup tool. Those fragments won’t appear in searches, for example.
“Instead, the policy should direct people to the ‘next best’ tool, and if that’s email and conference calls then in the short term that’s fine,” Marshall said. “For transient collaboration it is less of an issue. For example, replacing a Slack call with a Zoom call leaves no lasting trace.”
5 Key Components of Collaboration Contingency Planning
Your employees and their specific needs will go a long way toward determining the tenets of your backup plan, according to Podnar. Consider how urgently people need to communicate. If their jobs can’t wait, you need to have a backup plan.
“Otherwise,” Podnar added, “it turns into a one-off solution by each employee and pure chaos. Not to mention loss of knowledge. Remember that if you go to a different tool as an interim solution, you want to bring that knowledge back into a central place — like Slack — once it is back up and if it is necessary. So who does that? And what qualifies for being put back into that central collaboration tool when it is back online?”
The makeup of your company will be a determining factor in your policy, too. Large enterprises are naturally going to want some backup plan/policy for times like these, Podnar said. Small organizations should have one too, though the policy does not have to be super formal: bullet points on a Confluence page will do, according to Podnar. “But you do need something that is as formalized as necessary to keep the holes in the boat patched so you don’t take on water when you hit the disruption iceberg,” she added.
Podnar cited five key components of such a policy:
- Tolerable points of failure. If your business relies on an online system and it is a single system, component or service, you do not have high availability. The single point of failure may not necessarily be the application infrastructure itself. You may have multiple web, application and database servers constituted in a redundant, highly available configuration, but if these servers are hosted behind a lone router or firewall, you have a single point of failure for the entire solution.
- The need (if any) for backup channels. If you are not going to have redundancy, think about backups and whether you can recover from them when things are down (e.g., a backup can be an export of the data that can easily be imported or accessed in a nearline/offline manner, depending on how critical it is to have access).
- Key points of contact. Who is in charge of determining outages and how they are impacting the business? Who will alert everyone to switch to WhatsApp for communications during an outage or to use, for instance, Confluence for doc exchange?
- What is the plan if none of the plans work. Things will happen that nobody can predict. So what is the plan when there is no plan because we didn’t foresee the issue?
- Monitoring. Do a dry run once a quarter or year (choose your timing) and test your plan. See how long it takes everyone to recover from an outage. See how long it takes for an outage to impact your business. What have you missed? What are the metrics/thresholds to change the policy?
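The first component in the list, spotting single points of failure, lends itself to a quick check. The sketch below assumes you can list how many redundant instances sit at each layer of a system; the function name, the layer names and the counts are made up for illustration.

```python
def single_points_of_failure(instances_per_layer: dict[str, int]) -> list[str]:
    """Return the layers that have fewer than two instances.

    A layer with only one instance (or none) cannot fail over, so it is a
    single point of failure for everything that depends on it.
    """
    return [layer for layer, count in instances_per_layer.items() if count < 2]


# Hypothetical deployment: redundant servers sitting behind a lone firewall.
deployment = {
    "web_server": 3,
    "application_server": 2,
    "database_server": 2,
    "firewall": 1,
}

flagged = single_points_of_failure(deployment)
print(flagged)
```

In this made-up deployment the servers are all redundant, yet the lone firewall is still flagged, which is the scenario Podnar describes.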
Preparing for Legal, Compliance Risks
Besides downtime, miscommunications, delays in getting messages out to your partners and consumers, potential loss of sales and loss of knowledge, companies also need to consider the bigger picture in the legal/regulatory arena. Podnar referred to a case in which a company that was being sued normally used two collaboration tools but switched to another as an alternate for two days during an outage. The company’s legal team was not aware of the material in that alternate collaboration tool, did not present the material during discovery and therefore lost the case.
“So it is about collaborating during an outage,” Podnar said, “but it is about records management that impacts business collaboration as well and limits risks.”