Disaster Recovery – Testing your business’ resilience

Article by Nitin Mishra



You know the value of uptime for your business, and with that in mind you have taken pains to devise what you believe is an effective business continuity and disaster recovery (DR) plan. Yet you would be surprised how many loopholes a DR plan can have, even after thorough risk assessment and planning. The key to developing a surefire DR plan is testing it. Creating different disaster scenarios and testing how effectively your plan gets you through them will reveal its potential weak spots.

The RTO and RPO Factor

To begin with, you need to understand the precise requirements of your business. Here is a question I suggest you ask yourself: what does the business need, and can it address that need in terms of both capability and cost? You will need to determine the RTO (Recovery Time Objective), which is essentially the maximum amount of time your business can afford to be down, as well as the RPO (Recovery Point Objective), which is the acceptable level of data loss. Once you understand these metrics, you will need to invest in and build the capabilities to support them, and eventually bridge the disaster recovery gap between business and IT. Bridging this gap requires IT to meet with business and application owners to understand their recovery needs, so that the financial impact of outages can be quantified and then weighed against the cost of providing the necessary service level. This may require some negotiation, but without this conversation, DR success is impossible.
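To make the weighing of outage impact against service cost concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `ServiceTier` class, the tier names, and every figure (hourly costs, annual costs, incident frequency) are hypothetical, and the worst-case loss model is deliberately simple.

```python
from dataclasses import dataclass


@dataclass
class ServiceTier:
    """A hypothetical DR service level that IT could offer."""
    name: str
    rto_hours: float     # longest outage this tier allows
    rpo_hours: float     # largest window of lost data this tier allows
    annual_cost: float   # yearly cost of providing this tier


def expected_annual_loss(tier, downtime_cost_per_hour,
                         data_loss_cost_per_hour, incidents_per_year):
    """Worst-case loss per incident (full RTO and RPO consumed), scaled by frequency."""
    per_incident = (tier.rto_hours * downtime_cost_per_hour
                    + tier.rpo_hours * data_loss_cost_per_hour)
    return per_incident * incidents_per_year


def cheapest_tier(tiers, downtime_cost_per_hour,
                  data_loss_cost_per_hour, incidents_per_year):
    """Pick the tier with the lowest sum of service cost and expected loss."""
    return min(
        tiers,
        key=lambda t: t.annual_cost + expected_annual_loss(
            t, downtime_cost_per_hour, data_loss_cost_per_hour,
            incidents_per_year),
    )


# Illustrative numbers only; real figures come from the business owners.
tiers = [
    ServiceTier("tape restore", rto_hours=48, rpo_hours=24, annual_cost=10_000),
    ServiceTier("warm standby", rto_hours=4, rpo_hours=1, annual_cost=60_000),
    ServiceTier("hot site", rto_hours=0.25, rpo_hours=0, annual_cost=250_000),
]

best = cheapest_tier(tiers, downtime_cost_per_hour=5_000,
                     data_loss_cost_per_hour=2_000, incidents_per_year=0.5)
print(best.name)
```

With these made-up inputs the middle tier comes out cheapest overall, which is the kind of result the negotiation between IT and the business is meant to surface: neither the cheapest service nor the tightest RTO/RPO is automatically the right answer.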

Testing the Basics and Intricacies

Your disaster recovery plan should incorporate the actions that need to be performed before, during and after a disaster is declared. Basic elements include defining the criteria under which a disaster is declared, who can declare it, and how individuals are notified. A good plan should include contingencies; you can’t assume your email will work, or even that cell phone service will be available. Make it a rule to integrate disaster recovery into the standard change management process, so that whenever systems are modified, the software is automatically updated and/or additional storage is assigned. Ensure that every time a reorganization occurs, the disaster recovery plan is revisited to check its resilience.

When testing a DR plan for resilience, keep these fundamentals in mind:

  1. Go beyond data recovery – test application recovery
  2. Get non-primary personnel to conduct the recovery to validate procedures and documentation
  3. Get your staff to role-play imaginary disaster situations
  4. Eradicate the negativity towards disaster recovery. DR is good for the business and good for personnel
  5. Create metrics to chart and measure development
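
One way to chart progress is to record each test run against the RTO and RPO targets and track the share of runs that meet both. The sketch below assumes a hypothetical `DrTestResult` record and invented sample measurements; it is not tied to any particular DR tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DrTestResult:
    """Outcome of one hypothetical DR test run for one application."""
    application: str
    recovery_time: timedelta   # how long the application took to come back
    data_loss: timedelta       # how much recent data could not be recovered


def meets_objectives(result, rto, rpo):
    """True when the run stayed within both the RTO and the RPO."""
    return result.recovery_time <= rto and result.data_loss <= rpo


def pass_rate(results, rto, rpo):
    """Fraction of test runs that met both objectives (0.0 for no runs)."""
    if not results:
        return 0.0
    passed = sum(meets_objectives(r, rto, rpo) for r in results)
    return passed / len(results)


# Invented sample data: four successive test runs of the same application.
rto = timedelta(hours=4)
rpo = timedelta(hours=1)
runs = [
    DrTestResult("billing", timedelta(hours=6), timedelta(minutes=30)),
    DrTestResult("billing", timedelta(hours=3, minutes=30), timedelta(hours=2)),
    DrTestResult("billing", timedelta(hours=3), timedelta(minutes=45)),
    DrTestResult("billing", timedelta(hours=2), timedelta(minutes=15)),
]

rate = pass_rate(runs, rto, rpo)
print(f"{rate:.0%} of test runs met both RTO and RPO")
```

Tracking this figure from one test cycle to the next gives a simple trend line, and the failing runs point to whether recovery time or data loss is the weak spot.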

And Stop Worrying About the Testing Costs

Whenever I have asked managers why they do not test more extensively, the most common reason given is cost. An effective way to address this issue and justify the cost is to link the testing process closely to RTO/RPO service-level objectives. The message should be that comprehensive testing is an integral part of the disaster recovery process and essential to ensuring that those metrics can actually be met.