According to Nate Silver, author of The Signal and the Noise, weather forecasting has improved dramatically over the past 30 years. So why was the U.S. National Oceanic and Atmospheric Administration (NOAA) so wrong in forecasting the 2012 hurricane season? As I wrote in my post, Dodging Bullets in Disaster Recovery, NOAA stated in May of this year that “Conditions in the atmosphere and the ocean favor a near-normal hurricane season in the Atlantic Basin this season.” According to NOAA, that translates to “12 named storms with six hurricanes, including three major hurricanes.” Instead, assuming no more tropical storms and hurricanes, the U.S. will end the season with 19 named storms and 10 hurricanes, tying with 1887, 1995, 2010, and 2011 as the “third most active Atlantic hurricane season in recorded history,” according to a Wikipedia post. Whether this is a long-term trend remains to be seen, but we seem, at least for now, to be in a period of dangerous weather.
Trends are not the same as forecasts, and weather forecasts have improved. But, not surprisingly, Mr. Silver reports that “the further out in time these models go, the less accurate they turn out to be.” Forecasts for a season may not be very reliable, but organizations and individuals should pay close attention to near-term forecasts for a specific event. Forecasts for when and where a specific hurricane will make landfall, for example, have improved dramatically. Mr. Silver wrote, “Just twenty-five years ago, when the National Hurricane Center tried to forecast where a hurricane would hit three days in advance of landfall, it missed by an average of 350 miles,” and “Today, however, the average miss is only about one hundred miles.”
Even better, predictions can be highly accurate when making a probabilistic estimate over longer periods of time, such as:
- What’s the probability of a magnitude 6.0 earthquake in the eastern United States in the next 100 years?
- What’s the probability that the Lincoln Tunnel will flood again in the next 50 years?
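To illustrate how such long-horizon estimates can be made, here is a minimal sketch using a simple Poisson recurrence model — my assumption for illustration, not a method the forecasters above are described as using. The recurrence rate and the resulting number are hypothetical:

```python
import math

def prob_at_least_one(annual_rate, years):
    """P(at least one event in `years`) under a Poisson model with a
    constant average of `annual_rate` independent events per year."""
    return 1 - math.exp(-annual_rate * years)

# Hypothetical rate: a magnitude-6.0 quake on average once every 500
# years (0.002 per year) gives roughly an 18% chance over a century.
p_quake = prob_at_least_one(0.002, 100)
```

The point of the model is that even when no single event can be predicted, a stable long-run rate turns the question into simple arithmetic.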
These probabilistic predictions improve further still if you look at conditional probabilities, such as:
- What’s the probability of a magnitude 6.0 aftershock within 2 days of a magnitude 7.0 earthquake?
- What’s the probability of the Lincoln Tunnel flooding if ocean temperatures increase 2 degrees?
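In the simplest frequentist terms, a conditional probability like the aftershock question can be estimated by counting: of the historical records that satisfy the condition, what fraction also show the outcome? A minimal sketch — the records, magnitudes, and result below are entirely made up for illustration:

```python
def conditional_probability(records, condition, outcome):
    """Estimate P(outcome | condition) as the fraction of records
    satisfying `condition` that also satisfy `outcome`."""
    qualifying = [r for r in records if condition(r)]
    if not qualifying:
        return None  # no data under this condition
    return sum(1 for r in qualifying if outcome(r)) / len(qualifying)

# Made-up records: (mainshock magnitude, 6.0+ aftershock within 2 days?)
history = [(7.1, True), (7.3, True), (7.0, False), (6.8, False), (6.5, True)]
p = conditional_probability(
    history,
    condition=lambda r: r[0] >= 7.0,  # given a magnitude-7.0+ mainshock
    outcome=lambda r: r[1],           # ...did a large aftershock follow?
)
# Two of the three qualifying mainshocks were followed by one: p = 2/3.
```

Conditioning narrows the data to the scenario you actually face, which is exactly why these estimates are useful for planning.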
These conditional probabilities enable us to evaluate scenarios and to plan and prepare. As organizations, we need to spend more time evaluating scenarios and looking at approaches that will mitigate the impact of dangerous events. I’ll write more on this in a later post, but, in the meantime, I’ll say it once again, “The greater the distance between your primary and disaster recovery data centers, the greater the probability that your organization can survive a catastrophic event.”
Too often, business continuity planning and disaster recovery planning are treated as the same function. Unfortunately, they are not. Business continuity planning helps organizations ensure that applications and processes continue through the myriad day-to-day disruptions that might occur: IT component failures such as a disk-drive failure, a server failure, a dropped network link, or an application bug. Disaster recovery planning helps organizations recover operations after less frequent but far more devastating events, such as fires, floods, hurricanes, earthquakes, and a variety of man-made disasters. While the data center strategy is only one component of business continuity and disaster recovery planning, it is a key component. And while business continuity and disaster recovery planning are different functions, they must often be considered together because of budget limitations.
There are plenty of advantages to having a business continuity data center in region, a short distance from the production data center. If the data centers are close, there will be little impact on transaction latency for the always-important two-phase database commit. Failover times from the production data center to the business continuity data center can be very short. Staff who normally work at the primary data center can easily show up for work at the in-region business continuity data center. And WAN charges between the primary and business continuity data centers will be relatively low.
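The latency point can be made concrete with back-of-the-envelope arithmetic: light in fiber covers roughly 200 km per millisecond, and a synchronous two-phase commit needs at least two network round trips (prepare, then commit). The distances below are hypothetical examples, not figures from any particular deployment:

```python
def round_trip_ms(distance_km, fiber_km_per_ms=200.0):
    """Approximate one network round trip over fiber at the given
    distance; real paths add routing and switching overhead."""
    return 2 * distance_km / fiber_km_per_ms

# Added commit latency from two round trips, at two example distances:
in_region = 2 * round_trip_ms(30)        # ~0.6 ms extra per commit
out_of_region = 2 * round_trip_ms(1500)  # ~30 ms extra per commit
```

This is why synchronous replication is comfortable in region and painful across a continent — the physics alone adds tens of milliseconds to every committed transaction.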
The problem with an in-region business continuity data center is that it can’t replace an out-of-region disaster recovery data center. The two are simply too close for comfort. And few organizations can afford three data centers. Following are a few of the types of disasters that can prevent an in-region business continuity data center from acting as a disaster recovery data center:
- Electrical-grid failure
- Telecommunications failure
- Transportation systems failure
- Chemical spills
- Radiation leaks
- War, terrorism, and civil unrest
For these types of disasters, it is much more likely that both in-region data centers will be affected, and much more challenging to recover applications and data. One of the trade-offs organizations must make is between how quickly they can recover and how certain they are that they can recover from the full range of disasters that could strike them. We believe that a slight increase in recovery time is well worth the additional assurance that you can actually recover applications after a disaster. Using an in-region business continuity data center as a disaster recovery data center is a little like doing a tandem skydive. It’s fine, as long as nothing goes wrong.
Tim is the VP of IT. His company’s data center is more than a thousand miles from the nearest ocean, so it’s not going to be impacted by a tsunami or a hurricane. It’s in an area that has very little seismic activity, so it’s not likely to be affected by an earthquake. There are no active volcanoes nearby. It’s not near any rivers or near a flood plain. There are no other major buildings nearby, and, even though his area has experienced a major drought over the past year, the risk from fire, at least from somewhere outside the data center, is very low.
Tim’s disaster recovery plan calls for a full backup of the application and data files of the critical applications once a week and an incremental backup nightly. The backups usually complete without error, but not always. Some applications are considered more critical than others, so the less critical ones are backed up less frequently. Tim runs a disaster recovery test twice a year to make sure that everything in the DR plan is working. Usually it is. There are other risks to his data center. He could lose power or network communication. He could have a fire that starts inside the data center. His area does occasionally have tornadoes, but not very often. There could be a chemical spill that would require the area to be evacuated, but none of these are very likely.
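One way to see what Tim is betting is to put a number on his worst-case data loss. With nightly incrementals, the exposure is the time since the last successful backup, and each unnoticed failed run extends it. The schedule mirrors the story above, but the arithmetic and framing are mine:

```python
def worst_case_loss_hours(backup_interval_hours, failed_runs_since_success):
    """Worst-case data loss: the time from the last *successful* backup
    to the disaster, assuming failures go unnoticed until the next run."""
    return backup_interval_hours * (1 + failed_runs_since_success)

# Nightly backups that all succeed expose up to a day of data; two
# unnoticed failed nights in a row triple that exposure.
clean_night = worst_case_loss_hours(24, 0)   # 24 hours at risk
two_failures = worst_case_loss_hours(24, 2)  # 72 hours at risk
```

“Backups usually complete without error, but not always” is precisely the term in this equation that Tim cannot control.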
Like every IT director, Tim has a limited budget, and he is constantly under pressure to keep IT costs low. Tim has made a series of guesses, bets really, in developing his disaster recovery plan. He’s bet that he’s covered for most of the risk associated with natural disasters, he’s bet that the applications that he deemed critical are the right priorities, that he’s got all of the program files and data together in the proper groups, and that nothing has changed since he last revisited the plan. He’s betting that the backup process is working, that the tapes are readable and the applications recoverable.
Those are only some of the bets that Tim has made. Each bet has a consequence. Sometimes he’ll win. Sometimes he’ll lose. But what happens, if Tim guesses wrong?
I’m a fan of the movie “The Princess Bride.” If you haven’t seen it, click on the link below to see a short clip of what can happen when you guess wrong.
The Princess Bride: The Man in Black in a battle of wits with Vizzini.
Actually, like the movie, the story I just told you is fantasy. Tim is real, but I made up the rest. In reality, Tim made a very different bet. He bet on Axxana. With one very good bet, he avoided making hundreds of bad ones.
This year, 2011, has been a year of tremendous natural disasters. It began with heavy rainfall in January in Queensland, Australia, and Rio de Janeiro, Brazil, causing flooding, landslides, and crop losses. An earthquake in New Zealand followed in February, collapsing buildings and causing an estimated $12 billion in damages. Japan’s earthquake and tsunami in March resulted in the loss of an estimated 20,000 lives, massive destruction of buildings, loss of power, and disruptions to transportation systems. A hurricane in the eastern United States left 7 million people without power for days, and many without power for weeks. Floods in Thailand began in the summer and continued into December, inundating the capital, killing more than 500 residents, and disrupting the lives of millions. In August, an earthquake rocked Virginia and shook buildings as far away as Massachusetts. And a rare October snowstorm hit the northeastern United States, leaving millions without power for days.
In all of this tragedy, there are some important observations:
- Disasters will strike where they are expected, such as the earthquake in Japan, and where they are not, such as the earthquake in Virginia.
- Disasters will strike when they are expected, such as hurricanes in the late summer, and when they are not, such as massive snow storms in the fall.
- Localized disasters, such as the floods in Thailand, can have far-reaching effects, such as the global disruption of the supply chain for disk drives.
The science that enables prediction of the location, size, and effect of natural disasters is improving, but it is far from perfect. The local impact of natural disasters is increasing because people and businesses are migrating into massive urban areas. The global impact of natural disasters is increasing because the supply chain is highly specialized into centers of expertise, yet at the same time is globally interconnected and interdependent. Because of this specialization, a flood in a relatively small country can affect the global availability and price of products for which that country provides a single but critical component.
Perhaps the most valuable lesson in all of this tragedy is that a highly efficient global operation that concentrates capabilities into unique centers of expertise leaves itself exposed to massive disruption from localized disasters and their impact on infrastructure and the workforce. One of our customers has reduced this risk by creating dual centers of expertise, separated not by hundreds of miles but by half the globe. With the help of Axxana, these dual centers will operate not only highly efficiently, but 100% in sync. Perhaps it is time to rethink your strategy as well.