How to perform a disaster recovery business impact analysis

The next step in the planning process is to perform a business impact analysis (BIA). The BIA becomes the foundation of the plan you will build for your recovery. This is the process that will determine what needs to be recovered and how quickly. It is one of the most difficult tasks to perform and one of the most critical to get right. The more time you have to bring a business function back in service following a disaster, the more your recovery options increase. The BIA is invaluable for identifying what is at stake following a disaster and for justifying spending on protection and recovery capability. Nobody but you will mind your own business.

Why Business Impact Is About Time Sensitivity, Not Criticality

I dislike the use of the terms “critical” or “essential” in defining the processes or people involved in this phase of the planning. I prefer to use the term “time-sensitive.” Generally speaking, organizations do not hire staff to perform non-essential tasks. Every function has a purpose, but some are more time-sensitive than others when there is limited time or resources available to perform them. A bank that has suffered a building fire could easily stop its marketing campaign but would not be able to stop processing deposits and checks written by its customers. The bank’s marketing campaign is essential to its growth in the long term, but in the middle of a disaster it will take a backseat, not because it is not critical but because it is not time-sensitive.

The organization needs to look at every function in this same light. How long can the company go without performing this function before it suffers significant financial losses, significant customer unhappiness, or significant penalties or fines from regulators or from lawsuits?

How To Do This and Get It Right

It is all about impact. It is all about what keeps the business running and what can wait till later. When I was doing mid-range and client-server DR for a company, I had to speak to the business unit that managed the general ledger for the company. The general ledger is concerned with accounts payable and receivable. It is just like your checkbook. It is where a business keeps track of the monies coming in for payment of goods or services and those going out to pay for expenses such as payroll. In this company, the general ledger ran on an AS/400, and my job was to figure out how long I had before I needed to bring the system back. When I met with the business unit, the first response was that it had to be back by day one after a disaster.

My response was that I was willing to build whatever recovery strategy the business needed and was willing to pay for, but before I priced this strategy, I wanted the team to think about something. This is a financial-services firm. If we did not run the general-ledger system for 30 days, it would be ugly. There is no question that we would have to cut manual checks to keep critical services going and have to maintain a manual general ledger until the system was brought back. I would not want to be the accountant who had to reconcile all the manual-ledger entries into the application once it was restored, but the firm would survive as a business if it did not run the general ledger for a month. How long do you think we would survive as a business if we did not answer our phones? Price our mutual funds? Process our customers’ transactions?

It is not about being important. When business is normal, the general ledger is very important. It is about what keeps us in business. It is about surviving. Disasters are not about business as usual. Management metric reporting is very important when business is normal. My CEO expects his management reports on his desk at 7:00 a.m. every business day. But if the home office burnt to the ground, I know he would be willing to forgo seeing them for a few days!

All business functions and the technology that supports them need to be classified based on their recovery priority. Recovery time frames for business operations are driven by the consequences of not performing the functions. The consequences may include business lost during the down period, contractual commitments that are not met and result in fines or lawsuits, and lost goodwill with customers. Impacts generally fall into one or more of these categories: financial, regulatory, or customer retention. Remember, these were the same categories we talked about in Chapter 2.

What steps can you give your planning team to conduct a business impact analysis? It starts with simply identifying the processes or functions performed in their area. Working with the management team, list everything that is done by that group. Once the business processes are understood, each one must be analyzed against three areas: financial risk of not performing that function, regulatory risk of not performing that function, and customer or reputational risk of not performing that function.

Financial risks may include loss of revenue from sales, loss of interest on bank balances, the cost of borrowing to meet cash flow, interest value on deferred billings, penalties from not meeting contractual commitments or service levels, opportunity lost during the downtime, and losses from processing transactions at market risk as of the date received.

Regulatory risk may include penalties for not filing financial reports or tax returns on time, fines or penalties for noncompliance with regulatory requirements in place for your business, or the need to pull products off shelves because of lost product-testing information.

Customer or reputational risk includes loss of customer confidence and market share, liability claims, customer dissatisfaction with service, media coverage of customer complaints, loss of goodwill, and loss of competitive advantage.

It is all about impact. What happens to the company if we do not do this?
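One way to keep the planning team honest about covering all three areas is to record the answers in a simple structure. What follows is only a minimal sketch in Python: the class name, the field names, and the sample entries are hypothetical, and a spreadsheet would serve just as well.

```python
from dataclasses import dataclass, field

# The three areas named in the text; the identifiers themselves are hypothetical.
IMPACT_AREAS = ("financial", "regulatory", "customer")


@dataclass
class FunctionImpact:
    """One business function and the consequences of not performing it."""
    name: str
    # Maps an impact area to the consequences the team has identified so far.
    consequences: dict[str, list[str]] = field(default_factory=dict)

    def add(self, area: str, consequence: str) -> None:
        """Record one consequence under one of the three impact areas."""
        if area not in IMPACT_AREAS:
            raise ValueError(f"unknown impact area: {area}")
        self.consequences.setdefault(area, []).append(consequence)

    def unanalyzed_areas(self) -> list[str]:
        """Areas for which nothing has been recorded yet."""
        return [a for a in IMPACT_AREAS if not self.consequences.get(a)]


# Sample entries are illustrative only.
deposits = FunctionImpact("Process deposits and checks")
deposits.add("financial", "penalties for missed service levels")
deposits.add("customer", "loss of customer confidence")
print(deposits.unanalyzed_areas())  # ['regulatory']
```

An empty area here means only that nobody has written anything down yet. If the team concludes that a function truly carries no regulatory risk, it is worth recording that conclusion explicitly so the gap is not mistaken for an oversight.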

Once your planning team has a list of functions and what happens if they are not performed, the next question to be answered is, how soon do we start to see the impact? Is it as soon as we stop doing something? A customer call center that has been evacuated due to a fire stops performing its function immediately. Unless there is another call center someplace else that is fully equipped and staffed to take calls, the impact to your customers is immediate. How significant this impact is depends entirely on your business: how many calls you get and what the calls are about.

Let’s say your call center receives an average of 1,200 calls per hour and, on average, 72 percent of those calls result in a sale with an average value of $57. Do the math: 1,200 × 0.72 × $57 = $49,248, the potential loss for every hour that the call center is not operational.
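The same arithmetic is easy to put in a small function so each business unit can plug in its own numbers. This is a sketch, not a prescribed method: the function and parameter names are my own, and I use Python's Decimal type simply to keep the dollar figure exact.

```python
from decimal import Decimal


def hourly_sales_at_risk(calls_per_hour: int,
                         sale_rate: Decimal,
                         average_sale: Decimal) -> Decimal:
    """Potential sales lost for each hour the call center is down."""
    return calls_per_hour * sale_rate * average_sale


# The figures from the call center example above.
loss_per_hour = hourly_sales_at_risk(1200, Decimal("0.72"), Decimal("57"))
print(f"${loss_per_hour:,.2f} per hour")  # $49,248.00 per hour
```

Multiplying this hourly figure by the length of an outage assumes calls arrive evenly around the clock, which is rarely true. Treat the result as a rough starting estimate.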

If your customers or potential customers find your product or service and place their orders on your website and it goes down, you have an immediate impact. Again, how significant the impact is depends on your business—how many orders you take, how much each order is for, and whether the customers will wait and order from you later or take their business elsewhere.
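For a website, the question of whether customers come back later changes the estimate. One rough way to account for it, assuming you can guess what share of customers will wait and reorder, is sketched below. The function name, the parameters, and the sample figures are all hypothetical and do not come from any real business.

```python
def hourly_order_loss(orders_per_hour: float,
                      average_order_value: float,
                      share_who_return: float) -> float:
    """Estimated order value lost per hour of website downtime.

    share_who_return is the assumed fraction (0.0 to 1.0) of customers who
    wait and place their order later instead of taking their business elsewhere.
    """
    if not 0.0 <= share_who_return <= 1.0:
        raise ValueError("share_who_return must be between 0 and 1")
    return orders_per_hour * average_order_value * (1.0 - share_who_return)


# Hypothetical figures: 200 orders an hour, $80 each, a quarter of customers return.
print(hourly_order_loss(200, 80.0, 0.25))  # 12000.0
```

The share of customers who return is usually the hardest number to pin down, so it is worth running the estimate at a few different values to see how much the answer moves.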

After your planning team has a list of functions, an idea of what happens when they stop, and how quickly you start to see the impact, the next question to be answered is, how much impact? You can use quantitative measures such as actual dollars per minute, hour, or day of downtime or qualitative measures, which predict certain outcomes based on the knowledge or experience of the individual.

Once all that information is pulled together, you have a view of everything the company does, what impact it would have if the function could not be performed, how quickly that impact would be felt, and how significant the impact will be. This information is the start of what we need to develop the appropriate recovery strategies for each site we do business in.
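Pulled together, that view can be as simple as a list of functions sorted by how soon the impact is felt. The sketch below shows one possible shape for it. The field names are hypothetical, and the hour values are illustrative guesses loosely based on the examples in this chapter, not measured figures.

```python
from dataclasses import dataclass


@dataclass
class BiaEntry:
    function: str
    hours_until_impact: float   # how soon the impact is first felt
    loss_per_hour: float        # quantitative estimate; 0 if only qualitative
    notes: str = ""             # qualitative judgment from the business unit


def by_time_sensitivity(entries: list[BiaEntry]) -> list[BiaEntry]:
    """Soonest impact first; ties go to the larger hourly loss."""
    return sorted(entries, key=lambda e: (e.hours_until_impact, -e.loss_per_hour))


# Illustrative values only.
entries = [
    BiaEntry("General ledger", 30 * 24, 0.0, "manual ledger is workable for a month"),
    BiaEntry("Customer call center", 0, 49_248.0),
    BiaEntry("Management metric reports", 72, 0.0, "CEO can wait a few days"),
]
for entry in by_time_sensitivity(entries):
    print(entry.function)
```

Sorting on time to impact first, rather than on dollar value, reflects the point of this chapter: the ranking is about time sensitivity, not importance.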