Archive for February, 2012
Having two data centers, especially when they are separated by a significant distance, brings so many advantages that it’s difficult to name them all. Here are just a few:
- The ability to increase the frequency and quality of disaster recovery testing
- The ability to perform site maintenance and upgrades, while maintaining application availability
- The ability to rapidly restore applications and continue operations in the event of a regional disaster
Some organizations have eliminated tape and migrated to disk-based backup methods, leveraging various techniques for creating application-consistent snapshots. This approach can dramatically improve recovery times but, again, requires that the 3rd-party recovery location have all the equipment and software necessary to run the applications once the applications and data are restored. And, again, the location must be unoccupied.
Organizations use 3rd-party disaster recovery service providers, in part, because they don’t want to absorb the full cost of having a second location sitting idle, just in case a disaster happens. It is cost-prohibitive for most organizations. But forward-thinking companies have recognized that application development and test environments can be re-purposed for production applications when a disaster occurs. In this way, no infrastructure is wasted, and no systems are sitting idle. A two-data-center architecture, with development, test, and disaster recovery in one location, and production in the other, provides the ideal approach for both resource efficiency and resiliency.
The biggest challenge for organizations may be to determine the best way to get all of the current application data from the primary production location to the development, test, and disaster recovery location. Asynchronous replication is clearly the approach of choice, in terms of cost and flexibility for locating the secondary site, but it means that any data not yet replicated when a disaster strikes will be lost. Many of you saw our recent announcement about Animal Health International becoming an Axxana customer. The approach that Animal Health took, combining asynchronous replication with disaster-proof protection of the data in the asynchronous replication lag, is precisely the approach that organizations should take. The combination gives organizations a complete solution that is both affordable and flexible.
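To make the replication lag concrete, here is a minimal sketch in Python. It is a toy model, not a description of any vendor's product: the class `AsyncReplicator` and its methods `write`, `replicate`, and `unreplicated` are hypothetical names invented for this illustration. It assumes the primary site acknowledges each write immediately and ships it to the secondary site later, in order.

```python
from collections import deque


class AsyncReplicator:
    """Toy model of asynchronous replication between two sites (hypothetical)."""

    def __init__(self):
        self.primary = []        # writes acknowledged at the primary site
        self.secondary = []      # writes that have reached the secondary site
        self.pending = deque()   # writes acknowledged but not yet replicated

    def write(self, record):
        # The primary acknowledges immediately; replication happens later.
        self.primary.append(record)
        self.pending.append(record)

    def replicate(self, max_records):
        # Ship up to max_records queued writes to the secondary site, in order.
        for _ in range(min(max_records, len(self.pending))):
            self.secondary.append(self.pending.popleft())

    def unreplicated(self):
        # Writes that would be lost if the primary site were destroyed now.
        return list(self.pending)


replicator = AsyncReplicator()
for i in range(5):
    replicator.write(f"txn-{i}")
replicator.replicate(max_records=3)   # the link keeps up with only 3 of 5 writes

at_risk = replicator.unreplicated()
print(at_risk)  # ['txn-3', 'txn-4']
```

In this model, the two writes still sitting in `pending` are the lag. If the primary site is lost at that moment, the secondary site has only the first three transactions, which is why the lag itself needs protection at the primary site.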
On 1 May 2011, French investigators recovered the flight data recorder from Air France Flight 447, 23 months after the Airbus A330 plunged into the Atlantic Ocean on a flight from Rio de Janeiro to Paris. At the time of the tragic crash, and for months following, there was a great deal of speculation regarding what caused the crash. Was it mechanical failure, pilot error, or some combination of both? With the release of the information from the flight data recorder, including the full transcripts of the cockpit voice recordings, investigators now have a clear picture of what occurred. And from analysis of the retrieved data, they can make recommendations to airplane manufacturers and to pilot training programs on how to reduce or eliminate these kinds of tragedies.
In the case of Flight 447, investigators did have some data from the automatic transmissions, but the data was incomplete. Over the past several decades, as storage media has advanced from magnetic tape to solid state disk, the airline industry has been able to increase the amount of data that flight data recorders store and protect. And with the information from Flight 447’s flight data recorder, which was retrieved from the ocean floor, two miles below the surface, the picture is now complete.
Eye-witness accounts are notoriously unreliable, as this Stanford Journal of Legal Studies article and this APA Monitor article attest. And the stress that comes during and after a disaster only serves to make eye-witness testimony less reliable. When disasters strike a business and data is lost, it is sometimes possible to reconstruct data from source documents, but source documents are sometimes lost. Data can also be reconstructed from memory, but, as research shows, memories can be flawed.
For the airline industry, the capture and protection of data in flight data recorders before and during disasters, and the analysis of data after disasters, have been critical to making air travel the safest mode of travel. Still, the industry is constantly looking to improve. Imagine if, rather than 23 months, the data in the flight data recorder had been recoverable immediately. Imagine that rather than having to be found, the data could have been extracted automatically. Then the analysis of the cause and the development and modification of procedures that might prevent future tragedies could have begun almost immediately.
I just read an article in the Disaster Recovery Journal entitled Business Continuity’s Role in Supply Chain Resilience. The article reminded me that, even when entire business processes are outsourced, the business continuity planners at your company are still responsible for oversight of the business continuity plan. The recent floods in Thailand, and the resulting supply-chain problems in the disk storage industry, are another great reminder.
Leading up to the year 2000, most forward-thinking companies required their suppliers to certify Y2K compliance. As a result, the Y2K bug impacted companies’ operations very little. The risk was real, but managed. Now that companies are outsourcing more and more of their supply chain to contract manufacturers and more of their business processes to BPO specialists, they risk losing both visibility and control over business continuity.
Whether your supplier provides components, such as disk drives, or services, such as payroll and accounting, it’s critical to understand your suppliers’ capabilities to withstand and recover from disasters. So here are six questions to ask your suppliers:
- Who is responsible for disaster recovery planning?
- What disasters are you prepared to withstand from this location?
- When will the supply chain or service be restored, after a disaster strikes?
- Where is your disaster recovery site located?
- Why did you choose your particular approach to disaster recovery?
- How frequently do you test your disaster recovery plan?
This is in no way a complete list, but it is at least a start in helping to focus not only on the quality and price of outsourced goods and services, but also on the reliability of the supply chain and how your suppliers think about and plan for business continuity.