This post isn’t meant as a criticism of Google. They have a tremendous platform. I know many people who use Google not only for advertising and search, but also for blogging, collaboration applications, and email. But I’ve been watching the continuing problems with Google’s Gmail service. On Sunday, February 27th, a software bug caused some Gmail user data to be deleted. As reported by Google, only 0.02% of users were affected by the data loss, down from earlier estimates of 0.08%. It turns out, though, that 0.02% of the Gmail user base is still a big number. By some estimates, it’s about 35,000 people. It’s now five days later. The latest update from Google, posted yesterday, reports that:
We have restored the majority of the affected accounts, and will continue to restore the remaining accounts as quickly as possible. Accounts with more mail are taking more time.
Why would it take Google so long to restore data? Because Google has to restore the data from tape. Google has an interesting perspective on tape:
To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs. But restoring data from them also takes longer than transferring your requests to another data center, which is why it’s taken us hours to get the email back instead of milliseconds.
Hours instead of milliseconds? Actually, for some users, it’s days instead of milliseconds.
Google is getting a pretty good thrashing in the press because of the outage, not just in the trade press, like ComputerWorld’s IT Blog Watch, but also in business press blogs, like the Wall Street Journal’s. Google is a very strong and profitable company, and I suspect that Gmail, which is largely a free service, has little impact on the company’s profitability. Still, the outage will cause some potential customers to wonder about the advisability of putting any applications, even email, in the Google cloud.
I always find it interesting to see what choices companies make, how those choices create problems, and how those problems affect public perception of quality and reliability. And I wonder how much companies concern themselves with the “reputational risk” that goes with their technology choices. There are ways to protect data from software bugs that don’t involve tape. Application-consistent point-in-time snapshots are one way. That’s what EMC’s RecoverPoint provides. With RecoverPoint, data that has been accidentally deleted could actually be restored in minutes, if not milliseconds.
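To make the point-in-time idea concrete, here is a minimal sketch in Python. It is not RecoverPoint’s API or implementation. The class `JournaledStore`, its methods, and the mailbox keys are hypothetical names invented for illustration, and the toy ignores application consistency, replication, and storage efficiency. It only shows the general principle: if every write is journaled in order, the state before a bad deletion can be rebuilt by replaying the journal up to a chosen point, without going to an offline copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class JournaledStore:
    """Toy key-value store that journals every write (hypothetical example)."""
    live: Dict[str, str] = field(default_factory=dict)
    # Each entry is (sequence number, key, new value); None marks a delete.
    journal: List[Tuple[int, str, Optional[str]]] = field(default_factory=list)
    seq: int = 0

    def put(self, key: str, value: str) -> int:
        """Write a value and return the sequence number of this write."""
        self.seq += 1
        self.journal.append((self.seq, key, value))
        self.live[key] = value
        return self.seq

    def delete(self, key: str) -> int:
        """Delete a key and return the sequence number of this delete."""
        self.seq += 1
        self.journal.append((self.seq, key, None))
        self.live.pop(key, None)
        return self.seq

    def view_at(self, point: int) -> Dict[str, str]:
        """Rebuild the state as of sequence number `point` by replaying the journal."""
        state: Dict[str, str] = {}
        for seq, key, value in self.journal:
            if seq > point:
                break
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state

    def restore_to(self, point: int) -> None:
        """Roll the live data back to `point` and drop later journal entries."""
        self.live = self.view_at(point)
        self.journal = [entry for entry in self.journal if entry[0] <= point]


# Usage: two messages arrive, a simulated bug deletes both, and we roll back.
store = JournaledStore()
store.put("inbox/1", "Hello")
last_good = store.put("inbox/2", "Quarterly report")
store.delete("inbox/1")  # simulated buggy deletion
store.delete("inbox/2")  # simulated buggy deletion
store.restore_to(last_good)
print(store.live)
```

The design choice worth noticing is that the restore works from data that is already online and ordered, so recovery time depends on how much journal has to be replayed rather than on how long it takes to fetch and read offline media.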
When choosing data protection approaches, some companies don’t think enough about restore time. And when bad things happen, as they always will, the company suffers damage to its reputation. It’s the same way with the last bit of data. Some companies don’t think enough about protecting all of the data, down to the last transaction. Eventually, that decision will catch up with them. Not right away, maybe, but eventually. With EMC RecoverPoint and Axxana’s Phoenix System RP, you won’t have to apologize when data gets accidentally deleted by a software bug. We will get it back for you quickly, down to the last byte. And your reputation will remain intact.