The Art of Disaster Recovery – Saving Your Business

Imagine a disaster scenario. What’s the first thing that comes to mind? Natural disasters like tornadoes or floods, right? In the technology world, however, the term “disaster” is defined more broadly. There are two types of disasters: natural and artificial. Artificial, or man-made, disasters include fires, burglary, power loss, hardware failure, software malfunction, malware and ransomware. Basically, any event that results in downtime, disrupts the flow of data or brings processes to a complete halt is classified as a “disaster” in the technology world.

A number of surveys and reports indicate that disasters are costly for businesses of every scale, from SMBs (small-to-medium-sized businesses) to large enterprises. According to a report by WCA Technologies, the direct cost of downtime for an SMB ranges from $8,000 to $74,000 per hour.
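To put that range in perspective, a rough estimate multiplies the hourly cost by the length of the outage. The sketch below uses the hourly figures quoted above; the function name and the four-hour outage are illustrative assumptions, not part of the report.

```python
def downtime_cost(hourly_cost_usd: float, outage_hours: float) -> float:
    """Direct cost of an outage: hourly cost multiplied by its duration."""
    if hourly_cost_usd < 0 or outage_hours < 0:
        raise ValueError("cost and duration must be non-negative")
    return hourly_cost_usd * outage_hours


# Hourly range quoted in the article; the 4-hour outage is a hypothetical example.
LOW_HOURLY_USD = 8_000
HIGH_HOURLY_USD = 74_000
outage_hours = 4

low = downtime_cost(LOW_HOURLY_USD, outage_hours)
high = downtime_cost(HIGH_HOURLY_USD, outage_hours)
print(f"A {outage_hours}-hour outage costs between ${low:,.0f} and ${high:,.0f}")
# prints: A 4-hour outage costs between $32,000 and $296,000
```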

Suffice it to say, disasters tend to be very costly for businesses, which is why Disaster Recovery as a Service (DRaaS) and disaster recovery planning are so important.

Before we dive deeper into disaster recovery planning, let’s define what Disaster Recovery as a Service (DRaaS) is.

What is Disaster Recovery as a Service (DRaaS)?

DRaaS comprises two processes: failover and failback. In the event of a disaster, let’s say a hardware failure, you’re no longer able to access your data or continue your operations. The longer you take to resume operations, the greater the cost and the damage to your reputation. With DRaaS, you have a replicated VM to which you can fail over. The moment your primary system fails, you can fail over to a secondary system that replication services keep in sync with the same environment. This enables you to continue your operations with minimum downtime.

Once your primary system recovers, you can fail back to it and resume business as usual. Depending on the DRaaS acquired, you can reduce your downtime to less than 15 minutes. This greatly improves the resilience of your business and its ability to withstand disasters.
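The failover and failback cycle can be pictured as a small state machine. The following is a minimal sketch, not how any particular DRaaS product works: the Site and DRPair classes and their fields are hypothetical names chosen for illustration, and a real service would also handle replication and data resynchronization.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical system that can host the workload."""
    name: str
    healthy: bool = True


@dataclass
class DRPair:
    """A primary system and its replicated secondary (illustrative only)."""
    primary: Site
    secondary: Site
    active: Site = field(init=False)

    def __post_init__(self) -> None:
        # Operations start on the primary system.
        self.active = self.primary

    def failover(self) -> Site:
        """Move operations to the secondary when the primary has failed."""
        if self.primary.healthy:
            return self.active  # primary is fine, nothing to do
        if not self.secondary.healthy:
            raise RuntimeError("no healthy system to fail over to")
        self.active = self.secondary
        return self.active

    def failback(self) -> Site:
        """Return operations to the primary once it has recovered."""
        if not self.primary.healthy:
            raise RuntimeError("primary has not recovered yet")
        self.active = self.primary
        return self.active


pair = DRPair(Site("primary"), Site("secondary-replica"))
pair.primary.healthy = False       # simulate a hardware failure
print(pair.failover().name)        # prints: secondary-replica
pair.primary.healthy = True        # primary has been repaired
print(pair.failback().name)        # prints: primary
```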

 

Disaster Recovery Planning – Effectively Leveraging Disaster Recovery as a Service (DRaaS)

Disaster recovery planning relies on analysis: analysis of workload, analysis of processes and identification of mission-critical and non-essential data. In order to understand disaster recovery planning, let’s divide it into different steps.

Step 1: Setting Objectives and Defining Elements

First, the business needs to identify the types of data running through its processes. Data is divided into three major types:

• Hot or Mission critical data.
• Cold or Infrequently Accessed data.
• Archive data.

Hot or mission-critical data is the data that’s necessary for day-to-day operations. In other words, without this data, business operations come to a standstill.

Cold or infrequently accessed data is data that isn’t needed for day-to-day operations; the business can keep running without it. However, businesses cannot afford to lose this data either. Examples include backup repositories, records, images and videos.

Archive data is data retained for long, or sometimes indefinite, periods of time due to compliance reasons or for future reference purposes.
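As a rough sketch, the three definitions above can be turned into a simple classification rule. The two yes/no questions, the order in which they are checked and the sample data set names are assumptions made for illustration; a real classification exercise would weigh many more factors.

```python
from enum import Enum


class DataClass(Enum):
    HOT = "hot (mission-critical)"
    COLD = "cold (infrequently accessed)"
    ARCHIVE = "archive"


def classify(needed_for_operations: bool, kept_for_compliance_or_reference: bool) -> DataClass:
    """Apply the three definitions in order (the ordering is an assumption)."""
    if needed_for_operations:
        # Operations stop without this data.
        return DataClass.HOT
    if kept_for_compliance_or_reference:
        # Retained for long periods for compliance or future reference.
        return DataClass.ARCHIVE
    # Not needed to operate, but still too valuable to lose.
    return DataClass.COLD


# Hypothetical data sets: (needed for operations, kept for compliance or reference)
datasets = {
    "order database": (True, False),
    "backup repository": (False, False),
    "closed-year tax records": (False, True),
}

for name, answers in datasets.items():
    print(f"{name}: {classify(*answers).value}")
```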

Disaster recovery (DR) services target hot or mission-critical data because their purpose is to resume operations with minimal RTOs (Recovery Time Objectives) and RPOs (Recovery Point Objectives). This makes it all the more important to properly identify the data running through the business’ IT infrastructure.

After identifying the necessary elements, it’s also important to set objectives: how fast do you want certain applications recovered, and how much data can you afford to lose? The answers to these questions determine your RTOs and RPOs.

RTOs and RPOs, or RTPOs, determine the shape of your disaster recovery solution. If you’re wondering what RTOs and RPOs are, here’s a brief explanation:

Recovery Time Objective: The maximum acceptable time to recover an object or data set after a disruption.
Recovery Point Objective: The point in time to which you wish to recover that object; in other words, how much recent data, measured in time, you can afford to lose. Ideally, this should be zero, but that makes the disaster recovery solution very costly.
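A short worked example may help. If recovery points (snapshots or replicas) are taken at a fixed interval, the worst case is a failure just before the next one, so the data lost can be as large as the interval itself. The sketch below checks an interval against an RPO and a measured recovery time against an RTO; the function names and the sample figures are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_data_loss(recovery_point_interval: timedelta) -> timedelta:
    """If a failure hits just before the next recovery point, a full interval is lost."""
    return recovery_point_interval


def meets_rpo(recovery_point_interval: timedelta, rpo: timedelta) -> bool:
    """The interval between recovery points must not exceed the RPO."""
    return worst_case_data_loss(recovery_point_interval) <= rpo


def meets_rto(measured_recovery_time: timedelta, rto: timedelta) -> bool:
    """The time actually taken to recover must not exceed the RTO."""
    return measured_recovery_time <= rto


rpo = timedelta(minutes=15)   # hypothetical objective
rto = timedelta(hours=1)      # hypothetical objective

print(meets_rpo(timedelta(hours=1), rpo))     # prints: False (hourly snapshots lose up to 1 hour)
print(meets_rpo(timedelta(minutes=5), rpo))   # prints: True
print(meets_rto(timedelta(minutes=40), rto))  # prints: True
```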

As part of the disaster recovery plan, businesses need to assign RTOs and RPOs according to the different tiers of data that they’ve set.

For instance, suppose a business has divided its data, based on importance, into four tiers ranging from tier 0 to tier 3, with tier 0 being mission-critical data and tier 3 being archival data. Each tier would then be assigned its own RTO and RPO, with the tightest targets for tier 0 and the most relaxed for tier 3.
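As a purely illustrative sketch, such an assignment could be written down as follows. Every RTO and RPO value here is hypothetical, as are the descriptions of tiers 1 and 2; only the meaning of tier 0 (mission critical) and tier 3 (archival) comes from the example above.

```python
from datetime import timedelta
from typing import NamedTuple


class Objectives(NamedTuple):
    description: str
    rto: timedelta
    rpo: timedelta


# All RTO/RPO values are hypothetical and should be replaced with your own targets.
TIERS = {
    0: Objectives("mission critical", timedelta(minutes=15), timedelta(minutes=5)),
    1: Objectives("important (assumed)", timedelta(hours=4), timedelta(hours=1)),
    2: Objectives("infrequently accessed (assumed)", timedelta(hours=24), timedelta(hours=12)),
    3: Objectives("archival", timedelta(days=7), timedelta(days=1)),
}


def objectives_loosen_with_tier(tiers: dict) -> bool:
    """Sanity check: a less important tier never gets a tighter RTO or RPO."""
    ordered = [tiers[number] for number in sorted(tiers)]
    return all(
        higher.rto <= lower.rto and higher.rpo <= lower.rpo
        for higher, lower in zip(ordered, ordered[1:])
    )


for number, objectives in sorted(TIERS.items()):
    print(f"Tier {number} ({objectives.description}): "
          f"RTO {objectives.rto}, RPO {objectives.rpo}")

print("Consistent ordering:", objectives_loosen_with_tier(TIERS))
```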

With the analysis done and the RTPOs determined and assigned, the next step in DR planning can begin.

Step 2: Analyzing Disaster Recovery Solutions and Technology

To determine what kind of technology would best suit your needs, you first need to identify your data requirements properly, which you did in the previous step.

There are several disaster recovery solutions on the market. They fall into two major types: on-premises DR solutions and cloud disaster recovery (DR) solutions.

On-premises DR solutions are available in the form of disaster recovery appliances, while some vendors deliver a single DR site in a box. Businesses can opt to set up these appliances on-site or off-site, depending on their preferences. Businesses that want to minimize RTPOs tend to set up both on-site and off-site DR, which leaves them well equipped to deal with both localized and natural disasters.

Cloud disaster recovery solutions, as the name suggests, rely on cloud-based services. The capability of this setup depends on the capabilities of the network. With variables like bandwidth limitations involved, it’s important to test the disaster recovery solution a number of times after it has been set up. This ensures that the disaster recovery process isn’t interrupted by bandwidth or file size limitations when it’s needed.
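Before running such tests, a back-of-the-envelope calculation shows whether the available bandwidth can move the required data within the recovery window at all. The sketch below is an estimate only: the 80% link efficiency, the data size and the link speed are assumptions, and real transfers are also affected by latency, compression and provider limits.

```python
def transfer_hours(data_gb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move data_gb over a link of bandwidth_mbps.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, contention and so on).
    """
    if data_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth or efficiency")
    megabits = data_gb * 8_000                  # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


# Hypothetical scenario: restore 500 GB over a 100 Mbps link.
hours = transfer_hours(data_gb=500, bandwidth_mbps=100)
print(f"Estimated transfer time: {hours:.1f} hours")   # prints: Estimated transfer time: 13.9 hours
```

If the estimate already exceeds the RTO of the data in question, no amount of testing will fix it; the options are more bandwidth, less data to move or a different DR design.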

It’s important to pay attention to the data services that come with the DR solution. Data services like replication and snapshots ensure better data redundancy and deliver an enhanced DR experience.

Step 3: Execution and Testing of DR plan

Once the elements are identified, the objectives are set and the market has been analyzed for appropriate disaster recovery solutions, it’s time to execute the plan: acquire the DR solution and set it up.

After the setup is complete, the first thing to do is test the disaster recovery plan. Businesses should promptly and thoroughly test the plan by simulating a disaster situation. Until they have been properly tested, a disaster recovery plan and solution are as good as nothing.

The tests should analyze the time it takes to resume operations after the failure of primary systems. How long does the secondary system take to respond and when can the IT infrastructure recover after a total failure? The tests should be conducted extensively to provide detailed answers to these questions.
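A drill of this kind boils down to starting a timer, triggering the failover and measuring how long the secondary system takes to respond. The following is a minimal sketch under stated assumptions: run_failover_drill and FakeSecondary are hypothetical names, and the fake system merely simulates a boot delay so that the example can run anywhere. In a real drill, the two callables would trigger your actual failover and probe your actual service.

```python
import time
from typing import Callable


def run_failover_drill(
    trigger_failover: Callable[[], None],
    is_service_up: Callable[[], bool],
    rto_seconds: float,
    poll_interval: float = 0.01,
) -> dict:
    """Time how long the service takes to come back and compare it with the RTO."""
    start = time.monotonic()
    trigger_failover()
    while not is_service_up():
        if time.monotonic() - start > rto_seconds:
            # Stop waiting once the RTO has been missed.
            return {"recovered": False, "within_rto": False,
                    "elapsed_seconds": time.monotonic() - start}
        time.sleep(poll_interval)
    elapsed = time.monotonic() - start
    return {"recovered": True, "within_rto": elapsed <= rto_seconds,
            "elapsed_seconds": elapsed}


class FakeSecondary:
    """Stand-in for a secondary system that needs boot_seconds to respond."""

    def __init__(self, boot_seconds: float) -> None:
        self.boot_seconds = boot_seconds
        self.started_at = None

    def start(self) -> None:
        self.started_at = time.monotonic()

    def is_up(self) -> bool:
        return (self.started_at is not None
                and time.monotonic() - self.started_at >= self.boot_seconds)


secondary = FakeSecondary(boot_seconds=0.05)
result = run_failover_drill(secondary.start, secondary.is_up, rto_seconds=1.0)
print(f"Recovered: {result['recovered']}, within RTO: {result['within_rto']}, "
      f"took {result['elapsed_seconds']:.2f} s")
```

Recording the elapsed time from every drill, rather than just pass or fail, gives the detailed answers the questions above call for.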

Step 4: Training and Assigning Appropriate Roles

All of the analysis, planning and technology amount to nothing if staff don’t have proper training and pre-defined roles to fulfill in the event of a disaster. Businesses sometimes take a casual view of staff training. In disaster recovery planning, this can backfire badly.

It’s very important to define roles and assign them accordingly. In the event of a disaster, everyone should know exactly what to do. Let’s say the primary system fails due to a power failure. If the relevant staff aren’t appropriately trained, the restoration process will take far longer than necessary. By contrast, if each person is well trained for the scenario, the roles have already been determined and assigned, and the process has been tested a few times, the end result will be very different. Each member of the IT staff will know what to do, which station to man and how to react to the disaster. The business can then be up and running with less than 5% variation from the time span set in the disaster recovery plan.
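Both points can be checked with very little tooling: that every role has an owner before a disaster strikes, and how far a drill deviated from the planned recovery time. In the sketch below, the role names, the people and the timings are all hypothetical placeholders.

```python
# Hypothetical roles; replace them with the ones in your own DR plan.
REQUIRED_ROLES = {"incident lead", "failover operator", "communications"}


def unassigned_roles(assignments: dict[str, str]) -> set[str]:
    """Return the required roles that nobody has been assigned to."""
    return {role for role in REQUIRED_ROLES if not assignments.get(role)}


def deviation_percent(planned_minutes: float, actual_minutes: float) -> float:
    """How far the actual recovery time was from the planned one, in percent."""
    if planned_minutes <= 0:
        raise ValueError("planned time must be positive")
    return abs(actual_minutes - planned_minutes) / planned_minutes * 100


assignments = {
    "incident lead": "Alex",
    "failover operator": "Sam",
    "communications": "",          # nobody assigned yet
}
print("Unassigned roles:", sorted(unassigned_roles(assignments)))
# prints: Unassigned roles: ['communications']

# Hypothetical drill: the plan said 60 minutes, the drill took 62.
print(f"Deviation from plan: {deviation_percent(60, 62):.1f}%")
# prints: Deviation from plan: 3.3%
```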

Summary

In the technology world, a disaster means the discontinuation of operations, whether due to a natural disaster or an artificial, man-made one like ransomware, malware or fire. Disasters tend to be very costly for businesses, SMBs and large enterprises alike. That’s why it’s very important to devise an efficient disaster recovery plan and set up a reliable disaster recovery solution to support it.

Disaster Recovery as a Service (DRaaS) comprises two techniques: failover and failback. Failover is when your primary system fails and you move your processes and applications to a replicated secondary system. After the primary system is restored, the process of moving your applications back to the primary system from the secondary system is called failback.

An efficient disaster recovery plan comprises four steps:
• Setting objectives and defining elements.
• Analyzing disaster recovery solutions and technology.
• Execution and testing.
• Training and assigning appropriate roles.

Businesses first need to identify their data types and then assign RTPOs based on their significance. After this, businesses need to analyze DR solutions and technologies to select the one that suits them best. In this selection, it’s recommended to keep an eye out for enterprise-level data services like replication and snapshots.

Once the preferred technology has been selected, it’s time to execute the DR plan: acquire the solution and set it up. After setting up the DR solution, it’s important to test it by simulating disaster scenarios.

Another thing to pay attention to is the training of staff and the assignment of appropriate roles ahead of a disaster. This drastically improves the business’ ability to react in the event of an actual disaster.

In conclusion, disaster recovery is an art form which, like any other kind of art, requires careful planning and preparation. It simply cannot be rushed. Only a business well prepared and equipped for disasters can create a masterpiece.

Bio: This article is provided by StoneFly Inc, a leading service provider that facilitates backup in the cloud, enterprise cloud storage, on-premises data storage and efficient disaster recovery both on-premises and in the cloud.