As part of my role as VP of Sales at Axxana I get to visit many of our customers and prospects, from different segments of the Enterprise market and from around the world.
I recently visited several large organizations in Europe that traditionally use storage-based replication between their Production and DR sites to protect their data and critical applications.
All of these enterprises have stringent data protection requirements, which often lead them to implement synchronous storage-based replication between their data centers.
Synchronous replication is used to achieve RPO=0 (zero data loss), and this is its main advantage. On the other hand, every write must be acknowledged by the remote site before it completes, so synchronous replication affects application performance. The DR site therefore needs to be in close proximity to the production site, which increases the risk of losing both sites in a regional disaster. Until now, this topology was the only option for achieving zero data loss. Now, with changes in technology, growing awareness, and new business needs, there are new solutions that should be considered.
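To put rough numbers on the distance penalty of synchronous replication, here is a minimal back-of-the-envelope sketch. It assumes light travels through fiber at about 200,000 km/s and that each write needs one round trip to the DR site; the function names and the sample figures are illustrative assumptions, not measurements from any customer environment, and real links add switching and protocol overhead on top.

```python
# Rough estimate only: assumes ~200,000 km/s propagation in fiber,
# i.e. about 200 km per millisecond, and ignores equipment overhead.
FIBER_KM_PER_MS = 200.0


def sync_write_penalty_ms(distance_km, round_trips=1):
    """Latency added to each write by waiting for the remote acknowledgment."""
    return 2 * distance_km / FIBER_KM_PER_MS * round_trips


def max_serial_writes_per_sec(local_write_ms, distance_km, round_trips=1):
    """Upper bound on strictly sequential writes per second for one stream."""
    total_ms = local_write_ms + sync_write_penalty_ms(distance_km, round_trips)
    return 1000.0 / total_ms


if __name__ == "__main__":
    # Hypothetical example: 0.5 ms local write time at several site distances.
    for km in (10, 100, 500):
        penalty = sync_write_penalty_ms(km)
        rate = max_serial_writes_per_sec(0.5, km)
        print(f"{km:>4} km: +{penalty:.2f} ms per write, ~{rate:.0f} serial writes/s")
```

Under these assumptions, the added latency grows linearly with distance, which is why synchronous topologies tend to keep the two sites close together.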
New business requirements call for:
• Better performance of OLTP applications without affecting the tight SLAs the organization is used to.
• Shorter RTO: fast recovery of critical applications in case of a disaster or hardware failure.
The questions that many infrastructure architects are facing and debating revolve around the following topics:
– How to achieve zero data loss without impacting application performance?
– What is more important, RPO or RTO, and how do we decide which of them to compromise on?
– Is active-active topology the new way to go?
I would like to share some of my insights regarding these challenges:
More and more organizations today use a topology with two local sites at the same data center in an active-active configuration for high availability, plus a third copy in a distant DR data center. The data is replicated asynchronously to the DR site. This topology can answer the first challenge of improving OLTP application performance. Nevertheless, when disaster strikes, the organization will suffer data loss and application inconsistency, leading to a long recovery and reconstruction process. This replication method therefore provides only a partial solution to the failover challenge.
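The data loss in the asynchronous case can be sized with a simple estimate: whatever was committed at production during the replication lag has not yet reached the DR site. The sketch below assumes a steady transaction rate and a known lag; both figures and the function name are hypothetical and serve only to illustrate the order of magnitude.

```python
def exposed_transactions(replication_lag_s, tx_per_s):
    """Transactions committed at production but not yet present at the DR site.

    Assumes a steady transaction rate over the lag window (a simplification).
    """
    return replication_lag_s * tx_per_s


if __name__ == "__main__":
    # Hypothetical figures: 5 seconds of lag at 2,000 transactions per second.
    lost = exposed_transactions(replication_lag_s=5, tx_per_s=2000)
    print(f"Transactions at risk if production is lost now: {lost}")
```

Even a lag of a few seconds can translate into thousands of transactions that must be identified and reconstructed after failover, which is what stretches the recovery process.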
Another common practice is to move from storage-based replication to host-based replication, such as Data Guard in the case of Oracle databases. Moving to host-based replication improves RTO dramatically, since the hosts and databases are active at the DR site, whereas with storage-based replication both the hosts and the databases are usually dormant. This topology improves RTO but affects RPO: in most cases I have encountered, Data Guard is configured not in Maximum Protection mode but in Maximum Performance mode, which creates a tradeoff between RTO and RPO.
The real questions are:
• Why implement a costly solution with three data copies at three different sites that, in a real-life scenario, will not protect the business from data loss, even though it improves OLTP performance? and
• Why compromise on an RPO>0 (guaranteed data loss) for a seemingly shorter RTO (which at the end of the day, due to the data loss, may actually be longer…) when implementing asynchronous replication?
I invite you to watch our latest video, which explains the value of our solution and why it can overcome the challenges organizations are facing today, and to read our latest post, The Is and the Ought… What a swift and complete recovery of mission critical applications really looks like!