What does Disaster Recovery mean, and is Business Continuity the same?
The short answer to that question would be, “No, it isn’t.”
As someone who has been in the industry for almost 30 years, I’ve come across this question more than a few times. I hope this post helps explain the difference between disaster recovery and business continuity, and how you can best prepare.
Business Continuity is about getting the whole company back to business. That encompasses People, Product, Processes and everything related to them. Disaster Recovery (DR) is one part of that. DR covers IT and the business-critical systems needed to get your Data Centre up and running again, including the IT people and processes required to achieve this goal.
A common misconception is that DR means, “I have data backups and I’m good.”
If only that were true!
So let’s talk about Disaster Recovery.
Most companies don’t have an effective and rehearsed disaster recovery plan. Data backups are a fantastic start, but those alone won’t save the company in the event of a disaster. A lot of companies that suffer a disaster never recover and go out of business, either because they took too long to rebuild their systems or because they lost all their valuable data.
Disaster Recovery must include the systems and people who can build or rebuild them, as well as the data that needs to reside on these systems.
So what does this mean?
Let’s take a look at just one possible system and what might be involved. I’ll use the example of a database that is used for billing customers for their orders of your product.
So, we have a server that runs a database that will need to pull information about the customer, their order, order history, pricing, and latest shipment information.
So what do we need for Disaster Recovery?
Well, let’s consider everything that allows that database to work properly for you.
- Virus Scanner Software
- Security Settings & Hardening Settings
The above is just the physical environment you’ll need, whether you are virtualized or not.
So what else is there?
Well, then we’ll need people:
- System administrator
- Network administrator
- Security administrator
- SAN and storage administrator
- Database administrator
- Operations staff
OK…so now we have the people too. Well, not really. Who is empowered to make the tough decisions that these folks will need someone to make? Managers! You’ll also need an overall co-ordinator for all these administrators, so that everything can be choreographed properly.
On top of everything you have above, you are going to need input from other systems. The database contains information that has to be pulled into it from somewhere else, for example, the shipping systems, CRM systems and inventory management systems. So we now have three more systems to recover, before we can bill for that latest shipment to the customer.
How do these additional systems communicate with the billing database? Let’s assume there is some sort of messaging system to do that. Wow! We need to recover five systems just to generate an invoice.
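To make that count concrete, here is a minimal sketch that walks a dependency map and lists everything that has to come back before an invoice can be generated. The system names and the dependency map itself are assumptions based on the billing example; your own environment will look different.

```python
# Hypothetical system names; the direct dependencies are an assumption
# drawn from the billing example in this post.
DEPENDS_ON = {
    "billing_db": ["shipping", "crm", "inventory", "messaging"],
    "shipping": ["messaging"],
    "crm": ["messaging"],
    "inventory": ["messaging"],
    "messaging": [],
}


def systems_to_recover(target, depends_on):
    """Return the target plus everything it depends on, directly or indirectly."""
    needed = set()
    stack = [target]
    while stack:
        system = stack.pop()
        if system in needed:
            continue  # already counted, skip to avoid loops
        needed.add(system)
        stack.extend(depends_on.get(system, []))
    return needed


required = systems_to_recover("billing_db", DEPENDS_ON)
print(sorted(required))
print(len(required))
```

With this map the walk finds five systems, matching the count above. The useful part is that adding one more upstream system to the map shows up in the total straight away.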
OK…we have a list of hardware, infrastructure, software, and people.
What about a plan?
A plan is the well-choreographed, rehearsed procedure that will help you identify all the interactions these systems have and when they need to interact. For example, if you bring the database up before the CRM and other systems, you’re going to get a lot of errors when it goes looking for the input data.
So this is the time to look at what needs to come first: which systems, software, storage, and so on?
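One way to work out that order is a topological sort of the dependencies. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The system names and the "must be up before" relationships are assumptions taken from the billing example, not a statement about any real environment.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps to the systems that must be up before it can start.
# These relationships are assumed for illustration.
must_start_after = {
    "messaging": set(),
    "crm": {"messaging"},
    "shipping": {"messaging"},
    "inventory": {"messaging"},
    "billing_db": {"crm", "shipping", "inventory", "messaging"},
}


def recovery_order(dependencies):
    """Return one valid start-up order.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = recovery_order(must_start_after)
for position, system in enumerate(order, start=1):
    print(position, system)
```

Under these assumptions the messaging system comes first and the billing database comes last. The three source systems in the middle can come up in any order, or in parallel if you have the staff to do it.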
Now that you have the order of operations, you can start to fill in the blanks.
Each system should have a “mini” plan…a step-by-step guide that would allow another experienced IT person to perform the recovery if the primary administrators are offline or unable to reach the alternate site. By step by step, I mean “write everything down”…every command, every checkpoint, every “wait till this comes up” point in the recovery.
Do this for each system you are recovering. Then put it all in order for the overall recovery. Some parts of the recovery of a system may have to wait on others. For example, you can bring up the operating systems, but you’ll have to wait for the network cut-over before continuing with the next steps.
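A mini plan with wait points can be written down as plain data. The sketch below is one possible shape for it, not a standard format: the `Step` fields, the `network_cutover` event name, and the wording of each step are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    system: str
    action: str                      # the exact command or instruction to carry out
    checkpoint: str                  # what the operator should see before moving on
    wait_for: Optional[str] = None   # event that must have happened first, if any


# Hypothetical mini plan for one server; the steps are illustrative only.
billing_plan = [
    Step("billing_db", "Boot the operating system", "Login prompt appears"),
    Step("billing_db", "Mount the database volumes", "Volumes listed as mounted",
         wait_for="network_cutover"),
    Step("billing_db", "Start the database service", "Service reports ready"),
]


def next_steps(plan, done_count, events):
    """Return the steps that can run now, stopping at the first unmet wait point."""
    ready = []
    for step in plan[done_count:]:
        if step.wait_for is not None and step.wait_for not in events:
            break  # blocked until the named event has happened
        ready.append(step)
    return ready


before = next_steps(billing_plan, 0, events=set())
after = next_steps(billing_plan, 1, events={"network_cutover"})
print([s.action for s in before])
print([s.action for s in after])
```

Before the network cut-over only the operating system boot is available. Once the cut-over is recorded as done, the remaining two steps open up. Writing the checkpoint next to every action is what lets someone other than the primary administrator follow the plan.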
OK…now I can recover!
YES! But now you need to practice.
The key to a successful recovery is practice. Every time you run a mock DR, you’ll discover gaps. Fix those gaps as a top priority. It would be a shame to do all the work above and never rehearse it, only to then find out that you forgot a critical bit and can’t recover everything you need, or you would have to scramble to rebuild that critical bit.
When should you practice?
Well, to be really good at it, you need to run through a disaster recovery every time you make changes to your environment that could affect the recovery. If you upgrade operating systems or hardware, or apply significant patches in production, you’ll need to mirror those changes on your DR systems and run through the plan. This is really the only way to ensure that, at the time of DR, everything works the same as it does in production.
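A simple way to notice when production and DR have drifted apart is to compare what is installed on each side. This is only a sketch: the component names and version strings are made up, and in practice the two inventories would come from your own records or tooling.

```python
# Hypothetical inventories, hard-coded here for illustration.
production = {"os": "release-12", "database": "9.4", "firmware": "2.1"}
dr_site = {"os": "release-11", "database": "9.4"}


def find_drift(prod, dr):
    """Return {component: (production_version, dr_version)} for every mismatch."""
    drift = {}
    for component in sorted(set(prod) | set(dr)):
        prod_version = prod.get(component)  # None means "not present"
        dr_version = dr.get(component)
        if prod_version != dr_version:
            drift[component] = (prod_version, dr_version)
    return drift


drift = find_drift(production, dr_site)
for component, (prod_version, dr_version) in drift.items():
    print(f"{component}: production={prod_version} dr={dr_version}")
if drift:
    print("Mirror these changes to DR and rehearse the plan again.")
```

Any component that shows up in the result is a reason to update the DR side and schedule another run-through.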
Now you can recover from a disaster and ensure that the IT infrastructure can function to allow the business continuity plan to be fully implemented.
Did I miss something? Please comment.
P.S. Kanatek can help you with this. Interested in learning more about our services? Book a free 15-minute overview call with one of our specialists.