As the pandemic causes an increase in online purchasing activity, ecommerce platforms are collapsing in droves, unable to keep up with the expanding volume of visitors and transactions.

While this may sound like a good problem to have, it's completely preventable. Yet companies are continuing to release platforms without accounting for an increase in users, which will cause long-term issues in customer experience and service delivery.

An Ecommerce Warning Fable

Here in Colombia, we experienced this issue first-hand when the government attempted to reignite the economy by eliminating commodities tax on all purchases for three days.

On day one, people swarmed shopping malls and department stores trying to buy large items. Not exactly an ideal scenario during a pandemic. To counter this, the government restricted large purchases to online platforms only, which led many of the country’s largest ecommerce platforms to collapse.

The result? Extensive delays, bad customer experiences, and millions in lost revenue: the exact opposite of the government’s initial objective.

This example is just one instance of a worldwide challenge that companies face as more people move online to shop. In this piece, we’ll share some of our advice for overcoming this issue and preparing ecommerce systems for an increase in demand before it happens.

Breaking Up Coupled Components

It’s critical to understand where the most traffic is hitting the system and how the system behaves under different scenarios. But when a system isn’t segregated by responsibilities, traffic is much harder to monitor and every surge puts unnecessary stress on the whole architecture.

For example, during a surge, one user might be making a purchase while the inventory is being updated, or many customers might be attempting the same action at once, which can prevent them from processing their payment or finalizing the purchase. In many cases, these problems are the result of coupled responsibilities. Separating these components should be a priority when strengthening architecture.

One method is an architectural quantum approach, which involves segregating responsibilities into the smallest deployable piece, but this brings its own set of problems and challenges, especially for existing platforms. 

Instead, try to look for a middle ground and separate components based on domain. For example, a surge in people writing product reviews shouldn’t impact the system’s ability to show a catalog or process payments, so keep these responsibilities separate.
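One way to picture domain separation, even inside a single process, is a per-domain concurrency cap (often called a bulkhead), so a flood of review writes cannot use up capacity reserved for the catalog or payments. The sketch below is illustrative only: the `Bulkhead` class, the domain names and the limits are all assumptions made for the example.

```python
import threading


class Bulkhead:
    """Caps concurrent work per domain so one domain cannot starve another."""

    def __init__(self, limits):
        # One independent semaphore per domain (names and limits are hypothetical).
        self._slots = {
            name: threading.BoundedSemaphore(limit) for name, limit in limits.items()
        }

    def try_acquire(self, domain):
        # Non-blocking: reject immediately when the domain is at capacity.
        return self._slots[domain].acquire(blocking=False)

    def release(self, domain):
        self._slots[domain].release()


bulkhead = Bulkhead({"reviews": 2, "catalog": 5, "payments": 5})

# Simulate a surge of four review writes: only two are admitted.
admitted_reviews = [bulkhead.try_acquire("reviews") for _ in range(4)]

# Payments keep their own capacity during the review surge.
payment_admitted = bulkhead.try_acquire("payments")
```

In a deployed system the same idea usually shows up as separate services or worker pools per domain; the in-process version just makes the isolation easy to see.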

While component separation is key, be careful not to end up with so many individual components that your team can’t add new features without touching several codebases or having to maintain too many services.

Stateless and Stateful Categorization

Once your components are uncoupled, they need to be analyzed and broken into two categories: Stateful or Stateless.

Stateful components, like a database, persist information and require a lot of planning and thought when being scaled in order to avoid consistency problems and, in many cases, downtime. Stateless components, on the other hand, don’t “remember” anything, treating every request as if it were made for the first time. This makes them easy to scale horizontally, which can be done automatically at runtime.

Stateful components are much harder and more expensive to scale than their counterparts, so aim to minimize their numbers. One of the ways you can do this is by pulling the state out of your components via centralized storage options, which would act as the software equivalent of a network file system.
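As a rough illustration of pulling state out of a component, the handler below keeps nothing between calls and reads and writes the shopping cart through a shared store. `SessionStore` is a hypothetical in-memory stand-in for whatever centralized storage you use, and `add_to_cart` is an invented example handler.

```python
class SessionStore:
    """In-memory stand-in for a centralized store (hypothetical; a real one is networked)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def add_to_cart(store, session_id, item):
    """Stateless handler: everything it needs comes from the request or the store."""
    cart = list(store.get(session_id, []))
    cart.append(item)
    store.set(session_id, cart)
    return cart


store = SessionStore()

# These two calls could be served by two different instances of the handler,
# because neither instance holds the cart in its own memory.
add_to_cart(store, "session-1", "laptop")
cart = add_to_cart(store, "session-1", "mouse")
```

Because the handler holds no state of its own, instances can be added or removed without losing a customer's cart.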

Moreover, querying a database too often can cause extensive delays, so use caching mechanisms that store the most queried information, and then develop a caching strategy to make sure you’re leveraging them.
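One common caching strategy is cache-aside with a time-to-live: check the cache first, fall back to the database on a miss, and let entries expire so stale data is eventually refreshed. This is a minimal sketch; `TTLCache`, `load_product` and the 30-second TTL are assumptions for illustration, and the clock is injected so the behavior can be shown without waiting.

```python
import time


class TTLCache:
    """Cache-aside helper with per-entry expiry (illustrative, not thread-safe)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database query
        value = loader(key)  # miss or expired: query the source of truth
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_product(product_id):
    # Hypothetical database lookup; records each call so hits and misses are visible.
    db_calls.append(product_id)
    return {"id": product_id, "name": "Example product"}


fake_now = [0.0]
cache = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])

cache.get_or_load("sku-1", load_product)  # miss: goes to the "database"
cache.get_or_load("sku-1", load_product)  # hit: served from the cache
fake_now[0] = 31.0                        # advance the fake clock past the TTL
cache.get_or_load("sku-1", load_product)  # expired: reloaded
```

The right TTL depends on how stale each kind of data is allowed to be; a product description tolerates a longer TTL than a stock count.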

Along those same lines, make sure to understand the limits of any payment processing systems. If you’re using a third-party solution, you’ll have to carefully determine its limits and control how you send it requests to avoid overwhelming it.
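A token bucket is one simple way to keep outbound calls within a provider's limit. The sketch below assumes a made-up limit of two requests per second with a burst of two; the real figures have to come from your payment provider. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allows short bursts up to `capacity`, then a steady `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, now=0.0):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical provider limit: 2 requests per second, burst of 2.
bucket = TokenBucket(rate_per_second=2, capacity=2)

burst = [bucket.allow(now=0.0) for _ in range(3)]  # third call exceeds the burst
after_wait = bucket.allow(now=0.5)                 # half a second refills one token
```

Requests that are not allowed can be queued and retried rather than dropped, so customers see a short wait instead of a failed payment.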

Monitoring for Success

Monitoring doesn’t just help you avoid over-provisioning, and therefore lower your costs, during spikes; it can help you lower costs across the board and prepare you for the worst in the event of an influx in demand.

Completely elastic response to demand is difficult to achieve, but you’ll only ever get somewhat close if you are monitoring. I can’t stress this enough, especially since your profit is directly tied to the system’s ability to handle load. 

Clarity around when to expect spikes in demand allows you to create viable surge scenarios, giving you clearer insight into how to build an elastic response to that demand. Involve your marketing or business development team when creating those surge scenarios, as they should have a thorough understanding of the market, the number of users and what surges might be coming down the pipeline.

When monitoring, you’re looking for a balance between optimal performance and spend. First, find the areas of stress, and depending on the results, define your horizontal scalability. Major cloud providers offer a way to collect metrics and tie events to them, which makes it easier to act on a threshold. If you are ramping up for a user spike, consider which metric value is going to reflect that peak, then place an event at that threshold level to scale up the components that need it.
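The idea of tying an event to a threshold can be sketched in a few lines. The function below is not any cloud provider's API; `breaches_threshold`, the sample values and the 400 requests-per-second threshold are assumptions. It fires only when several consecutive samples exceed the threshold, which avoids reacting to a single noisy reading.

```python
def breaches_threshold(samples, threshold, consecutive):
    """True when the last `consecutive` samples all exceed `threshold`."""
    if len(samples) < consecutive:
        return False
    return all(value > threshold for value in samples[-consecutive:])


# Hypothetical requests-per-second readings, one per minute.
rps_samples = [120, 180, 410, 450, 470]

# Only two of the last three readings are above 400 at minute four...
early = breaches_threshold(rps_samples[:4], threshold=400, consecutive=3)
# ...but by minute five the last three all are, so the scale-up event would fire.
fired = breaches_threshold(rps_samples, threshold=400, consecutive=3)
```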

Try to avoid simple metrics like CPU load or memory usage on their own, and instead aim for what Google’s Site Reliability Engineers have labeled the “Four Golden Signals.” These metrics are considered the standard for user-facing systems, and include: latency, traffic, errors and saturation. Keep in mind, these are high-level suggestions and the bare minimum. You’ll have to do the work to translate them to your system.  
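To make the four signals concrete, here is a minimal sketch that derives them from a window of request records. The `Request` fields, the choice of the 95th percentile for latency and the use of in-flight requests as the saturation measure are all assumptions; each system has to pick the definitions that fit it.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool


def golden_signals(requests, window_seconds, in_flight, max_in_flight):
    """Summarize one monitoring window; `requests` must be non-empty."""
    if not requests:
        raise ValueError("need at least one request in the window")
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    return {
        "latency_p95_ms": latencies[rank - 1],
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": sum(not r.ok for r in requests) / len(requests),
        "saturation": in_flight / max_in_flight,
    }


# Hypothetical 10-second window: 18 fast successes and 2 slow failures.
window = [Request(100, True)] * 18 + [Request(900, False)] * 2
signals = golden_signals(window, window_seconds=10, in_flight=40, max_in_flight=50)
```

Note how the average latency here would look healthy while the 95th percentile exposes the slow, failing requests.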

Once you have monitoring in place, you need to design your system to auto-scale based on the results. Start by creating a scaling decision tree based on the metrics and then implement it. Plenty of automation tools can help with this and allow you to test features against a simulated load, preparing you to test against production.
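A scaling decision tree can start as a small, explicit function that is easy to review with the whole team. Every threshold and step size below is a placeholder, and `scaling_decision` with its metric names is hypothetical; the values have to come from your own load tests.

```python
def scaling_decision(signals, current, minimum, maximum):
    """Return the desired instance count from a small, explicit decision tree."""
    if signals["error_rate"] > 0.05 or signals["saturation"] > 0.80:
        desired = current * 2  # urgent: double capacity
    elif signals["latency_p95_ms"] > 500:
        desired = current + 1  # degraded: add one instance
    elif signals["saturation"] < 0.30 and signals["latency_p95_ms"] < 200:
        desired = current - 1  # idle: release one instance
    else:
        desired = current      # healthy: hold steady
    # Never go below the floor or above the budget ceiling.
    return max(minimum, min(maximum, desired))


surge = {"error_rate": 0.10, "saturation": 0.85, "latency_p95_ms": 900}
quiet = {"error_rate": 0.00, "saturation": 0.10, "latency_p95_ms": 80}

scale_out = scaling_decision(surge, current=4, minimum=2, maximum=6)
scale_in = scaling_decision(quiet, current=2, minimum=2, maximum=6)
```

The `maximum` bound is where the balance between performance and spend becomes explicit: during the surge the tree asks for eight instances, but the ceiling holds it at six.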

Continue to Provide the Solution 

The last point related to the technical side is continuously strengthening architecture and monitoring processes. The above points are not one-time events. Every time monitoring fails, every time an operation fails, you need to revisit the architecture and revisit your processes.

Companies pay thousands of dollars to do right-sizing exercises, but then never do them again, expecting everything to remain constant. Right-sizing means being able to clearly map out the desired service capacity and the computing capacity required to serve the user base, so it’s an ever-changing necessity. 
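The arithmetic behind a basic right-sizing estimate is simple enough to rerun whenever the inputs change. In this sketch the peak of 1,200 requests per second, the 150 requests per second each instance can serve and the 30% headroom are invented figures; real ones come from monitoring and load tests.

```python
import math


def instances_needed(peak_rps, rps_per_instance, headroom=0.3):
    """Instances required to serve the expected peak plus a safety margin."""
    return math.ceil(peak_rps * (1 + headroom) / rps_per_instance)


# Hypothetical inputs: 1200 * 1.3 / 150 = 10.4, rounded up to 11 instances.
with_headroom = instances_needed(peak_rps=1200, rps_per_instance=150)

# The same peak with no margin needs exactly 8.
without_headroom = instances_needed(peak_rps=1200, rps_per_instance=150, headroom=0)
```

When a release changes how much load one instance can handle, or marketing revises the expected peak, the estimate changes with it, which is why the exercise has to be repeated.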

It takes a cohesive, multidisciplinary team to accomplish this. From my experience, you should strongly consider bringing UX professionals into the mix to provide a clear picture of how the user is interacting with and reacting to the system. Also, ensure all team members are proficient in distributed systems and architecture so they can effectively execute most of the tasks listed above.

Production-experienced cloud or DevOps engineers are best for supporting the environment, while performance professionals are necessary to run performance tests and monitoring. On the non-technical side, you’ll need people who can think in terms of business needs, plan for market share, and be specific about spikes and dips in demand in order to help the team make the appropriate provisions.

Take Back a Little Control: Strengthen Your Ecommerce Platform

At the end of the day, there’s no magic formula to design experiences that can meet demand without breaking the bank. Don’t be fooled into thinking the solution is easy or quick. Instead, focus on the resulting benefits from a technical and revenue perspective to realize why the initial pain and expenditure are worth it.

The world is pretty unpredictable right now, and experiences are everything, so take back some control by strengthening your ecommerce platform and giving your customers a reliable, consistent place to purchase your goods and services.