Archive for April, 2012
Last week I posted a blog: Protecting Consistency Groups Against Human Error. I decided to see what other people were saying, so I did a little browsing around message boards, user groups, and forums. Back in 2010, W. Curtis Preston of Backup Central got into a lively debate with Scott Waterhouse of EMC, with Curtis stating emphatically, “Crash Consistent Backups Aren’t Good Enough,” and with Scott responding that they work, but “wouldn’t it be ideal if you could do better.”
There’s plenty of concern about the ability to reliably recover applications from crash-consistent copies of data. Of course, it’s different for every application and every environment. Here’s some advice from an EMC message board on how to ensure recovery using RecoverPoint with Oracle:
Many customers and field personnel use RecoverPoint to constantly and successfully access and bring up both application-consistent and crash-consistent copies of Oracle every day.
If the Oracle database is set up correctly in RecoverPoint in terms of consistency groups, the target volumes are accessible to only ONE mount host, and the other host-level best practices are followed, then I would expect you to have no issues.
Click here for the full discussion.
There’s also a helpful discussion by Mike Rothouse on recovering Oracle data from NetApp storage using NetApp’s crash-consistent Snapshot copies.
If a database has all of its files (control files, data files, online redo logs, and archived logs) contained within a single NetApp volume, then the task is straightforward. A Snapshot copy of that single volume will provide a crash-consistent copy.
Click here for the full discussion.
My key observations are:
1. If you depend on crash-consistent copies of data, then it may work some of the time for some of your applications, but it won’t work all of the time for all of your applications.
2. Best practices for recovering from crash-consistent snapshots restrict your options for data placement and volume management.
3. Applications, systems, and IT processes are in constant flux, so if you need to set up consistency groups “correctly” in order to ensure recovery, you are creating an inherent human-factor risk.
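The risk behind these observations can be made concrete. The following is a minimal Python sketch (toy dictionaries standing in for volumes; all names are hypothetical, not any vendor’s API) of why a crash-consistent copy is only trustworthy when every volume an application depends on is captured at the same instant:

```python
# Toy write-ahead protocol: a transaction is recorded in the log
# volume first, then applied to the data volume. Recovery is only
# possible if every applied change is also present in the log.

def write_transaction(log_vol, data_vol, txn_id, value):
    log_vol[txn_id] = value      # step 1: record intent in the log
    data_vol[txn_id] = value     # step 2: apply to the data files

def is_recoverable(log_vol, data_vol):
    """A copy is recoverable if the log explains every data change."""
    return all(txn in log_vol and log_vol[txn] == v
               for txn, v in data_vol.items())

log, data = {}, {}
write_transaction(log, data, "t1", "order-100")

# Coordinated copy: both volumes captured at the same instant.
snap_log, snap_data = dict(log), dict(data)
assert is_recoverable(snap_log, snap_data)

# Uncoordinated copy: the log volume is captured, a write lands,
# then the data volume is captured. The data volume now holds a
# change the captured log never saw -- a torn, unrecoverable copy.
snap_log = dict(log)
write_transaction(log, data, "t2", "order-101")
snap_data = dict(data)
assert not is_recoverable(snap_log, snap_data)
```

This is exactly why the best practices above push all of a database’s files into a single volume, and why spreading them across volumes reintroduces the timing risk.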
I don’t know about you, but with the pace of change, the cost of training, and the strain on staffing in today’s IT shops, I would opt for solutions that work the same way across all applications, automatically adapt to changes in the environment, and reduce the risk of human error.
The Datacentre Solutions Awards will hold their annual ceremony on 23 May 2012 at the Millennium Gloucester Hotel and Conference Centre in London. We’re very excited that Axxana was nominated in not one, but two categories: Datacentre Storage Hardware Product of the Year and Datacentre Storage Software Product of the Year.
All of the nominees have worked hard to bring innovative solutions to the market. We hope that you will take the time to read through the descriptions and vote for your favorite in each category.
Before voting, I want to take you back about 25 years, to a time when the only way to reduce the risk of data loss from disk drive failures was to buy higher and higher quality disk drives, and then back up the data to tape each night. Even with that massive investment in high-quality drives, the risk of data loss from a drive failure during the production day was still very real. And only the largest companies could afford the very expensive drives; everyone else had to settle for much less reliable, much riskier storage. RAID technology, first defined and described in 1987, changed that by making it possible to protect data on lower quality drives. Today even the smallest companies can achieve very high levels of protection against data loss from disk drive failures by using RAID-based storage systems. RAID transformed computing.
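The core idea that made RAID work is simple enough to show in a few lines. This is a toy Python sketch of the XOR parity scheme behind parity-based RAID levels such as RAID 4/5 (real arrays stripe data and rotate parity across drives; this illustration uses one stripe and one parity block):

```python
# XOR parity: the parity block is the byte-wise XOR of all data
# blocks. Because XOR is its own inverse, any single missing block
# can be rebuilt by XOR-ing the parity with the surviving blocks.

def parity(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

drives = [b"AAAA", b"BBBB", b"CCCC"]   # data drives in one stripe
p = parity(drives)                     # the parity drive

# Drive 1 fails; rebuild its contents from parity plus survivors.
rebuilt = parity([drives[0], drives[2], p])
assert rebuilt == b"BBBB"
```

One extra drive’s worth of parity protects the whole stripe, which is what let inexpensive drives deliver the reliability that once required premium hardware.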
Until recently, data center disaster recovery approaches looked much like those pre-RAID data centers of 25 years ago. Even today, only the largest, wealthiest organizations can afford the very expensive three-site, synchronous/asynchronous disaster recovery architecture used by most large, multinational banks. But with Axxana’s Phoenix System RP, every medium-sized or larger organization can afford to protect all of its data through a disaster. Axxana is to disaster recovery what RAID was to protection against drive failures. We are transforming disaster recovery.
Please click on the link below and vote today.
DCS Awards: Vote here!
What’s worse than losing your data?
Losing your data and having no backup.
What’s worse than having no backup?
Having a backup that restores inconsistent data.
That’s precisely the concern that Josh Kirsher raised on the April 10 Wikibon Peer Incite. A lot of people are buying insurance, in the form of snapshots of application data, and they leverage consistency groups, thinking this will ensure that the data is application-consistent. It’s the application-consistent snapshot that companies use as source volumes for off-site backups and asynchronous replication, and as on-premise application recovery points. And it’s consistency groups that enable applications to be restored in minutes rather than hours or days. Unfortunately, consistency groups only work when procedures are perfectly designed, perfectly followed, and constantly maintained, and when no one makes an error.
In today’s dynamic environment, where the servers on which applications run are virtualized, where applications are frequently moved from one physical server to another, and where LUNs are quickly created and volumes are added and removed on a daily basis, the probability of developing a perfect consistency-group process that is precisely followed and continuously maintained, without any human error, is very low. That means that when you need to call upon your insurance, the snapshot or backup that you assume is application-consistent, the probability is very high that the data will in fact be inconsistent, and the time to restore consistent application data from paper source documents will be measured in days, not minutes or hours. Companies that primarily transact business electronically may not be able to reconstruct the data at all. This is the scenario that Tim Hays, of Animal Health International, avoided when he made the decision to protect everything. After all, if he could affordably protect everything, he didn’t have to worry about what he might miss.
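The failure mode described above is rarely dramatic; it is quiet configuration drift. Here is a minimal Python sketch (all names hypothetical, sets standing in for storage configuration, not any vendor’s API) of how a consistency group silently falls out of sync with the application it is supposed to protect:

```python
# A consistency group is only as good as its membership list.
# If a volume is added to the application but not to the group,
# every snapshot from that point on quietly misses data.

app_volumes = {"db_data", "db_logs"}
consistency_group = {"db_data", "db_logs"}   # configured correctly today

def snapshot(group):
    """Capture the group's members atomically (toy model)."""
    return set(group)

# Months later, under deadline pressure, an admin provisions a new
# volume for the application...
app_volumes.add("db_archive")
# ...but nobody updates the consistency group.

snap = snapshot(consistency_group)
missing = app_volumes - snap
assert missing == {"db_archive"}   # the snapshot succeeds, yet is incomplete
```

Nothing fails loudly here: the snapshot job still reports success, and the gap is only discovered at restore time, which is exactly the wrong moment.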
Market research firm IDC recently released revenue estimates for the disk storage systems market. In 2011, the market grew over 8% and exceeded $31B. Given that the price we pay for a terabyte of storage continues to decline, the growth rate of data must be much higher still; for many organizations it exceeds 40% per year. Even with the flood in Thailand disrupting disk drive manufacturers’ supply chains, manufacturing innovation has allowed suppliers to keep up with companies’ almost insatiable demand for more storage.
Time is a very different resource. There’s no factory that makes time, so we can’t make more. We can only decide how we use time. In the world of business, we are increasingly deciding to use our time to be open and available for our customers. That means that there’s less time to protect the data that we use to process orders, run our factories, and communicate with our suppliers.
Faced with a challenge like this, we have had to transform our thinking. In computing, over the past couple of decades we have gone from doing one thing at a time to doing everything at once. We used to turn on our accounting and order-entry systems in the morning, bring up terminal services, process orders, shut down systems at the end of the work day, bring in an evening shift to run analysis, print reports, and then back up the data. Now we process orders all day and all night, seven days a week. Terminals have been at least partially replaced by PCs, mobile device applications, and browser-based applications, but they need to be available throughout the day as well. The reports still need to be run and the data still needs to be backed up, and though reports and data backups may still occur at night, they are no longer being done in “off hours,” because there are no off hours.
This challenge is just one of the challenges that Tim Hays, VP of IT at Animal Health International, solved when he installed the Axxana Phoenix System RP and EMC RecoverPoint. Because RecoverPoint provides application-consistent snapshots of data, which can then be used for processing reports and as sources for backups, restores, and data replication, there’s no need to worry about the limited supply of time. Thanks to RecoverPoint, production systems can continue to operate, with only a brief pause, while the snapshot is taken. And thanks to Axxana, the data that is changed or created between the application-consistent snapshots is maintained and protected. This is critically important, because as Tim Hays said in his recent presentation on Wikibon’s Peer Incite, in a world where most transactions are electronic, if you lose your data, there’s no way to reconstruct the transactions.
Let’s face facts. Data backup, data protection, and disaster recovery are difficult. There are more data and more applications to protect and less time to do it. And there are a growing number of risks against which you have to protect your data and applications. Thanks to application and data growth and the integrated nature of applications that support today’s business processes, old data protection and disaster recovery methods simply won’t work. There’s too much complexity and too little time.
Thankfully there are new approaches and new technologies to solve data protection and disaster recovery challenges. Innovative suppliers, like Axxana, are eager to win your trust and win your business. That’s the good news. The bad news is that you don’t have the time to evaluate all of the new technologies and all of the new suppliers to figure out what works and what doesn’t and who to trust and who not to trust.
Faced with a challenge like that, what do you do? If you are like most people, you talk to the people you trust the most. Your larger, long-time suppliers are a logical choice. They have a lot to lose if they guide you down the wrong path. That’s one of the reasons we chose to partner with the leading information infrastructure supplier, EMC. EMC stands behind the rigorous tests conducted in its E-Lab, so you don’t have to wonder whether our Phoenix System works.
Another logical choice is your peers. What makes peer groups so helpful is that your peers have no financial interest in your decision. They’re not the incumbent supplier, and they’re not the new kid on the block. Their interest is in preserving and enhancing their reputation, which will only be damaged if they steer you down the wrong path.
We are very pleased to tell you that one of your peers, Tim Hays, VP of IT at Animal Health International, will be talking on a Wikibon Peer Incite on Tuesday, April 10 at noon Eastern Daylight Time. His topic is how he implemented an affordable zero-data-loss disaster recovery solution, one that eliminates the need to classify data and instead protects all of the company’s production data. Not only will he be presenting, but he’ll be available to answer your questions. If you are thinking about re-architecting disaster recovery or building a second data center, are concerned about the cost of ensuring data protection, or simply can’t figure out how to affordably protect all of your production data, I encourage you to attend.
Dial-in instructions are below. No registration is required.
Date: Tuesday, Apr 10, 2012
Time: 12:00pm – 1:00pm ET (9:00am – 10:00am PT)