Preventing server downtime

Downtime refers to any period when a system is unavailable and fails to perform its primary function. As discussed in our previous blog article, server downtime is not only bad for productivity; it also causes related problems such as a damaged brand image, financial losses and data loss.

So how do we prevent downtime?

There are several things you as a business owner can do to prevent server downtime, and in turn, avoid the aforementioned problems.

Predict issues before they happen

The biggest problem with server downtime is that it tends to happen at the worst possible time, such as during a highly advertised sale or a big presentation (because Murphy’s law says so). In reality, downtime is usually caused either by a failure that could have been predicted or by a surge in traffic.

So instead of waiting for an issue to happen, you should plan ahead and make sure:

  • Your system has enough resources available to handle a spike in visitors, or can scale easily when it needs more
  • No heavy background processes are putting additional load on the system
  • You’re using lightweight applications instead of their heavier alternatives wherever possible
  • Your system has some form of redundancy, so it can keep working even if the primary system is down
  • You’re using a load balancer to guide your traffic where you want it to be
  • You’re properly utilizing browser and web application cache
  • You’re monitoring your application errors/logs to detect and resolve any application-related issues as soon as possible
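As an illustration of how redundancy and load balancing work together, here is a minimal Python sketch of a round-robin balancer that skips backends marked as unhealthy. The class name, the backend names (app-1 and so on) and the way a backend gets marked down are all assumptions made for the example; a real deployment would normally rely on a dedicated load balancer rather than code like this.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # Called when a (hypothetical) health check fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when the backend passes its health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Inspect each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")  # simulate a failed health check
picks = [balancer.next_backend() for _ in range(4)]
print(picks)
```

With app-2 marked down, traffic alternates between the two remaining backends, so visitors never notice that one server is gone.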

Aside from good planning, you should also have a good monitoring system in place. Monitoring lets you see issues before your end users do, and react accordingly. A good monitoring system also collects data in which you can spot patterns: by analyzing how your system behaves in the run-up to an incident, you’ll be able to predict problems before they happen.
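To make the idea of predicting problems from monitoring data more concrete, here is a rough Python sketch that fits a straight line to recent measurements and estimates how long it will take for a metric to reach a critical threshold. The function name, the sample values and the 90% threshold are made up for illustration, and a simple linear trend is only one possible way to model such data.

```python
def hours_until_threshold(samples, threshold):
    """Estimate the hours left until a metric reaches `threshold`.

    `samples` is a list of (hour, value) pairs. A least-squares line is
    fitted through them. Returns None if the trend is flat or falling,
    or if there is not enough data to fit a line.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_hour = (threshold - intercept) / slope
    last_hour = samples[-1][0]
    return max(0.0, crossing_hour - last_hour)


# Hypothetical disk usage in percent, sampled once per hour.
disk_usage = [(0, 70.0), (1, 72.0), (2, 74.0), (3, 76.0)]
eta = hours_until_threshold(disk_usage, threshold=90.0)
print(f"Disk expected to reach 90% in about {eta:.1f} hours")
```

An estimate like this could feed an alert that fires while there is still time to add capacity, rather than after the disk is already full.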

[Figure: server downtime graphs]

Pay attention to security

An outdated system with security holes can cause many problems. Whether your system is exposed to attacks (DDoS, for example) or attackers are able to install unwanted software (cryptolockers and similar), any kind of security vulnerability can cause server downtime.

To minimize the possibility of downtime caused by security issues:

  • Make sure your system is up to date. Most software providers release patches for security issues soon after they are discovered, which makes your system much safer. Remember Meltdown? We do.
  • Use a restrictive firewall ruleset with an implicit deny policy, explicitly allowing only the necessary ports from/to certain locations
  • Don’t expose unnecessary information that can help attackers identify the software you use (e.g. web server or programming language name and version)
  • Regularly perform infrastructure and application penetration testing
  • Regularly review audit logs and set alerts for critical events
  • Restrict access to critical parts of your infrastructure (e.g. database servers or backend applications) with a strict firewall policy and VPN access
  • Wherever possible, use multi-factor authentication (e.g. with a software authenticator or a hardware authentication device)
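The implicit deny idea from the list above can be sketched in a few lines of Python: a connection is accepted only if an explicit rule allows it, and everything else is rejected. The ports, address ranges and function name below are hypothetical and only model the decision logic; an actual firewall would be configured in the firewall product itself.

```python
from ipaddress import ip_address, ip_network

# Hypothetical allow-list: each rule opens one port to one source range.
ALLOW_RULES = [
    {"port": 443, "source": "0.0.0.0/0"},     # public HTTPS
    {"port": 22, "source": "10.8.0.0/24"},    # SSH only from the VPN range
    {"port": 5432, "source": "10.0.1.0/24"},  # database only from the app subnet
]


def is_allowed(source_ip, port, rules=ALLOW_RULES):
    """Return True only if an explicit rule matches the connection."""
    addr = ip_address(source_ip)
    for rule in rules:
        if port == rule["port"] and addr in ip_network(rule["source"]):
            return True
    return False  # implicit deny: no matching rule means no access


print(is_allowed("203.0.113.7", 443))  # public visitor on HTTPS
print(is_allowed("203.0.113.7", 22))   # public visitor trying SSH
print(is_allowed("10.8.0.5", 22))      # VPN user on SSH
```

The important design choice is the final return statement: access is never granted by default, so a forgotten rule results in a blocked connection instead of an open door.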

Don’t forget backups

Even if you take all of the necessary precautions and your system is top-notch, there is always a small chance something will go wrong (e.g. accidental data loss). With that in mind, it’s always a good idea to have a backup available that can get your system up and running again as soon as possible. This applies both to the machine you work on every day and to any other IT system, be it big or small.
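As a small illustration, the following Python sketch creates a timestamped archive of a directory and then checks that the archive really contains the files you expect, since a backup is only useful if it can be restored. The directory names and both function names are assumptions made for the example, and the demo runs entirely inside a temporary folder.

```python
import tarfile
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup(source_dir, backup_dir):
    """Pack source_dir into a timestamped .tar.gz inside backup_dir."""
    source = Path(source_dir)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = Path(backup_dir) / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    return archive


def verify_backup(archive, expected_files):
    """Return True if every expected path is present in the archive."""
    with tarfile.open(archive, "r:gz") as tar:
        names = set(tar.getnames())
    return all(name in names for name in expected_files)


# Demo with throwaway data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "site-data"
    data.mkdir()
    (data / "index.html").write_text("<h1>hello</h1>")
    backups = Path(tmp) / "backups"
    backups.mkdir()
    archive = create_backup(data, backups)
    ok = verify_backup(archive, ["site-data/index.html"])
    print(archive.name, "verified:", ok)
```

In practice you would also copy the archive to a separate location and test a full restore from time to time; this sketch covers only the create-and-check step.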

Read more about the importance of backup in one of our previous posts here.

Invest in good infrastructure

This goes without saying. When we worked with shared infrastructure, we encountered many customers who were more than happy to host their data in a shared environment. And that’s OK if your business doesn’t depend on the availability of your website or your infrastructure in general. However, when problems did happen (usually because other customers were overusing their resources), we’d hear sentences like “I’ve lost millions because of this downtime!”. We understand that downtime is a nuisance for everyone involved, but if your business loses millions while your website is down, you shouldn’t host your data on low-end infrastructure. You wouldn’t keep your valuables in a cardboard box, so why would you keep your valuable data on a shared hosting package?

If you value your business and your online presence matters to you, make sure you’re using the best possible infrastructure for your needs. It can be on-premises (albeit a bit more expensive) or in the cloud; we believe both can be equally good. Either way, invest only in high-quality providers.

[Figure: infrastructure design]

Conclusion

As we discussed in our previous post, downtime can cause many problems for everyone involved. To help you keep your system up and running, we’ve listed several tips we use when we set up our customers’ infrastructure. If you need additional help, or you’d like us to take over and improve your system’s stability, contact us and we’ll be more than happy to help.
