The Halloween Scare of Replication Environments – Lag Size “Surprises”

Before installing the Axxana Phoenix System, we need to analyze the lag size in our customer’s asynchronous replication environment. We use this parameter to determine the size of the Solid State Disk (SSD) that stores and protects the lag inside our Black Box.

We developed special-purpose software tools to analyze the lag.

When we report the lag size to our customers, we often find them very surprised, to say the least. The lag is typically much larger than they expected.

Why is this the case? Why aren’t IT managers aware of the actual data lag they have between their two data centers and across their replication environments?

Having examined and discussed this with many IT managers over the years, I have gathered the following:

In the very beginning, when the disaster recovery team designs the replication system, management determines how much data their company is willing to lose in a disaster. This lag can be defined in units of size (MBs) or in units of time (seconds or minutes). Taking this into consideration, the team analyzes the replication solution and calculates parameters, so that it fulfills the maximum lag requirement. These parameters include the replication communication line, the processing capabilities of the local and remote storage sub-systems, and the size of the storage cache.
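To make the two units of lag concrete, here is a minimal sketch of how a time-based limit and a size-based limit relate. It assumes a steady average write rate, which real workloads rarely have, and the function names and figures are purely illustrative; they are not taken from any Axxana tool.

```python
def lag_size_mb(write_rate_mb_per_s: float, lag_seconds: float) -> float:
    """Size of the unreplicated data (MB) for a given time lag.

    Assumes the application writes at a steady average rate.
    """
    return write_rate_mb_per_s * lag_seconds


def lag_time_s(lag_mb: float, write_rate_mb_per_s: float) -> float:
    """Time lag (seconds) that a given amount of unreplicated data represents."""
    if write_rate_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return lag_mb / write_rate_mb_per_s


# Illustrative numbers only: a 30-second lag limit at 50 MB/s of writes
# corresponds to 1,500 MB of data at risk.
size_at_risk = lag_size_mb(write_rate_mb_per_s=50.0, lag_seconds=30.0)
time_at_risk = lag_time_s(lag_mb=1500.0, write_rate_mb_per_s=50.0)
```

The same time limit translates into a larger size limit as the write rate grows, which is one reason a lag requirement stated only in seconds can hide how much data is actually at stake.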

The question is, once the team designs the system, can management guarantee it won’t exceed the agreed-upon maximum lag in the future?

Not really…

And why is that?

Often, the theoretical sizing exercise doesn’t accurately represent the behavior of the system once it is implemented. This means that even on the very first day after installation, the replication system may exceed the maximum allowable lag size.

Even if the system meets expectations in the beginning, over time things change. Data increases. Transaction volumes increase. Network bandwidth becomes insufficient and overloaded. When these things happen, and they will, the data lag between the primary and DR systems increases. Surprisingly, customers are often unaware that this is happening. So when the Axxana team performs its own lag sizing exercise, let’s just say we might create an unpleasant surprise… and in the spirit of Halloween… somewhat of a “scare.”
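The effect of growth on lag can be sketched with a simple backlog model. This is a rough illustration under stated assumptions, not a description of how our sizing tools work: writes are sampled once per second, the replication link drains a fixed number of MB per second, and whatever cannot be sent accumulates as lag. All names and numbers below are hypothetical.

```python
def simulate_lag(writes_mb_per_s, link_mb_per_s):
    """Return the replication backlog (MB) at the end of each one-second interval.

    Simplified model: each second the application writes some MB, the link
    ships up to link_mb_per_s, and the remainder stays behind as lag.
    """
    backlog = 0.0
    history = []
    for written in writes_mb_per_s:
        # The backlog can never go below zero: an idle link sends nothing extra.
        backlog = max(0.0, backlog + written - link_mb_per_s)
        history.append(backlog)
    return history


def peak_lag(writes_mb_per_s, link_mb_per_s):
    """Largest backlog (MB) observed over the whole trace."""
    return max(simulate_lag(writes_mb_per_s, link_mb_per_s), default=0.0)


# Hypothetical workload with a two-second burst, on a 40 MB/s link.
original = [20, 20, 80, 80, 20, 20]
# The same workload after transaction volumes grow by 50%.
grown = [w * 1.5 for w in original]

peak_before = peak_lag(original, link_mb_per_s=40)  # 80.0 MB
peak_after = peak_lag(grown, link_mb_per_s=40)      # 160.0 MB
```

In this toy trace, a 50% increase in write volume doubles the peak lag, because the link was already saturated during the burst. Lag does not grow in proportion to load, so a system sized comfortably at design time can drift well past its limit without anyone noticing.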

There is only one way to avoid surprises, and that is the Zero Data Loss way. With Axxana, we don’t care if your lag size grows. We have enough SSD space in our box to guarantee you will lose no data in a disaster.

We don’t scare easily. The line quality can fluctuate, your applications may spike, and your data may grow. Our system absorbs those data peaks and still provides you with zero loss, full consistency and minimum downtime at failover.

Talk to us… before it’s too late!