Salesforce customers lost nearly four hours' worth of CRM data due to a system failure and an outage that spanned two business days.
The outage set off a social media firestorm from customers of the world’s largest Software-as-a-Service (SaaS) customer relationship management provider. The timing of the acknowledgement is worth noting: It occurred during Salesforce Connections, a three-day digital marketing conference in Atlanta, where chatter about the outage was conspicuously absent among the 6,000 attendees.
A Salesforce representative, when asked for details on the outage, referred CMSWire to the company’s system status page.
How Many Affected?
It is unclear how many customers were affected or how much data was lost. The affected Salesforce “instance” is called “NA14.” A Salesforce instance refers to the server that Salesforce organizations live on, according to cloud computing service provider Cloud for Good.
Many Salesforce instances live on one server. NA would mean “North America” for this instance. Salesforce lists 97 instances on its service performance website.
Salesforce would not release the number of customers or organizations affected by the outage.
But if Twitter is any metric, the number is significant. Customers took to Twitter under the hashtag #NA14 to lament the outage.
The outcry even prompted the CEO of the 20,000-employee company, which has $6.7 billion in yearly revenue, to apologize to a user.
Drew Rothe, a sales account executive who tweeted about the outage, told CMSWire his company had no access to Salesforce for 24 hours and could not retrieve phone numbers, emails and more.
"Also," he added, "we weren't able to prospect into any new accounts and no calls were being recorded. Deals also had to wait because we create our quotes through Salesforce as well. You really don't expect something like this from a $49 billion company."
Salesforce reported an initial performance degradation issue around 8:41 a.m. ET Tuesday followed by a service disruption about 50 minutes later.
Salesforce did not resolve the disruption until 5:30 a.m. ET Wednesday, about 20 hours later. Salesforce said data written to the NA14 instance between 5:53 a.m. ET and 9:29 a.m. ET on Tuesday cannot be restored.
“The NA14 instance continues to operate in a degraded state,” Salesforce officials said on their performance website. “Customers can access the Salesforce service, but we have temporarily suspended some functionality such as weekly exports and sandbox copy functionality.”
Sales agents across the Salesforce ecosystem remained stifled during the outage. Salesforce reported that the service disruption was caused by a database failure on the NA14 instance, which, officials said, introduced a file integrity issue. The company resolved the issue by restoring NA14 from a prior backup that was not affected by the file integrity issue.
“We sincerely regret any inconvenience this disruption has caused you or your organization,” Salesforce officials said. “Ensuring your success is our top priority at Salesforce, and we’re focused on learning from this issue and preventing any recurrences.”
"We believe this makes sense as, owing to its age, the company has a complicated infrastructure with significant legacy components that are difficult to modernize," Piper Jaffray research analysts Alex Zukin and Scott Wilson wrote in a brief they shared with CMSWire. "With the continued strong growth of the company, this infrastructure continues to be put under pressure, as evidenced by this week's much publicized outage."
User Wants Better Response
Darrell DeVeaux, president and founder of Atlanta-based HealthDetail, a healthcare SaaS company, said his company uses NA14 and is a Salesforce ISV partner. It uses Salesforce for sales and lead tracking. The 10-year Salesforce user said that in his first couple of years such outages were more frequent and lasted longer, but he has seen the service "improve greatly since then."
However, the response this time around could have been better, DeVeaux said.
"Once Salesforce knew it would have to go to a backup and it would be a few hours they could do much better letting us know that," he said. "At this point all the Trust site says is that the instance is down. Yeah, we know that. That type of information is fine initially, but they should then be more proactive in the outlook of things vs. retrospective. On Twitter they just told me to check Trust site. So yes, they could be more proactive once they knew it would be a while."
DeVeaux said the events of the Salesforce outage this week reinforce that "nothing is ever safe" in the cloud.
"It makes me question how often these instances are backed up and the need to ensure we are paying more attention to what the vulnerabilities are and mitigating those as best we can," he said. "I took this for granted and, like losing my computer last year to a power surge because the surge protector was just basic, I had to quickly educate myself on surge protectors, something I took for granted in past."
The cloud, DeVeaux added, is computers, and computers break. They always will.
"People that don't understand that are in the wrong industry," he added. "The cloud has far more safeguards and fewer problems than on-premise. We don't hear about on-premise issues since they are localized."