Azure Storage Replication Explained

In the previous post about Storage Accounts, we talked about various types of accounts and their associated data services. Since that post, the Premium File storage option that was in preview has now gone into General Availability. (https://azure.microsoft.com/en-us/blog/announcing-the-general-availability-of-azure-premium-files/)

Alongside the data access tiers (Hot, Cool, and Archive) and data services (Blob, File, Queue, and Table), Azure also has options around the data redundancy. In this post, we are going to look at the four types of data replication Azure provides.

LRS (Locally redundant storage)

The most ubiquitously available option is Locally Redundant Storage (LRS); this is the default and only replication type available for all storage account types.

LRS ensure your data is replicated three times within a single data centre. These datastores are updated using synchronous writes to guarantee all three copies are kept up to date.

LRS does have downsides, predominately due to a single data centre in a single azure region containing the data replication. This issue exposes the data to a single point of failure if the Data Centre is entirely offline. Microsoft does commit to a 99.9% SLA for read and writes operations for data stored in LRS datastores. The SLA is not to be confused with the 11 9’s (99.999999999%) guarantee they offer for data durability which is just a commitment to ensure a level of data integrity against data loss or corruption.

ZRS (Zone redundant storage)

Introduced for General Purpose v2 storage accounts Zone Redundant Storage (ZRS) brings enhancements to the replication seen in LRS.

As with LRS, ZRS has synchronous writes between 3 copies of your data to ensure data integrity. Although still within the same Azure region, additional resilience is introduced due to the use of availability zones within the region. Two or three availability zones contain copies of the data. The increased resilience removes the issues of a single data centre outage causing data access issues. Although the data is spread across multiple availability zones, these zones are all within a single region, so data available is still susceptible to region-wide outages.

ZRS has the same SLA levels for read and write operations as LRS, but they increase the integrity of the durability of the data objects by an additional 9 to 99.9999999999% (12 9’s).

GRS (Geo-redundant storage)

Geo-redundant storage (GRS) brings additional redundancy to the data storage over both LRS or ZRS. Along with the three copies of your data stored within a single region, a further three copies are stored in the twinned Azure region. So using GRS means you get all the features of the LRS storage within your primary zone, but you also get a second LRS data storage in a neighbouring Azure region. This data is updated asynchronously, so there is a small lag better the 2 data sets, but for most cases this is acceptable.

Although using GRS means you are using two different datacentres in conjunction, there is a drawback of GRS, which is the secondary data storage is not accessible to read unless the storage account fails over. Due to all read and write operations still being managed by via a single data centre, Microsoft offers the same read and write SLA’s as ZRS and LRS datastores.

RA-GRS (Read-Access Geo-redundant)

The final replication type available is Read Access Geo-redundant storage, it has all the benefits and redundancy of the standard GRS replication but also allows for the secondary copy stored in the paired sister Azure region to be readable. This means you have multiple endpoints that are readable that your applications can use if they are configured to handle this configuration.

The additional read endpoint also means that for RA-GRS the read operation SLA is increased to 99.99% availability for Hot datastores. Write operations are left at the 99.9% SLA due to the single region still being in control of the write and update operations.

Both types of GRS replication do have a slight delay in the replication due to its asynchronous behaviour. This is where checks of the LastSyncTime come in useful to ensure you are reading the most up to date copy of the data. This asynchronous replication needs to be checked against your Recovery Point Objectives if you are planning to use GRS / RA-GRS as part of your DR planning.

Currently, Microsoft controls the failover of storage accounts from one Azure region to another region, so the scope for using the failover on your terms is limited. Microsoft has recently, however, brought into preview an option for a customer-initiated failover themselves. This is currently only available in the US West 2 / US West Central region pair and while in preview not suitable for production environments.

Summary

So to summarise the permutation of options available to you when creating new storage accounts are as follows:

StorageTable.PNG

We are now going to look at how the GRS replication works and what happens when you failover to the secondary data store using the new preview option.

The first thing to note about the customer initiated failover is the potential requirement to enable it.

C:\>Connect-AzAccount -SubscriptionId <subscription-id> 
C:\>Register-AzProviderFeature -FeatureName CustomerControlledFailover -ProviderNamespace Microsoft.Storage

The component registration can take a couple of days to be completed. While the request is in a pending state, you are unable to initiate a fail-over of the storage account yourself. You can check to make sure the feature is enabled using the following command.

C:\>Get-AzProviderFeature -FeatureName CustomerControlledFailover -ProviderNamespace Microsoft.Storage

When the feature is ready, the RegistrationState changes from Pending to Registered

RegistrationState.png

Now that the new Azure feature is enabled, we can start to look at how the failover works. We begin with creating a new storage account with a single blob container. This can be done either in the UI or via PowerShell. 

First, we create a new resource group, so all the test resources are grouped together.

C:\>$resourceGroup = "StorageTest" 
C:\>$location = "westus2” 
C:\>New-AzResourceGroup -Name $resourceGroup -Location $location
resourcegroup.png

Now we can create the Storage Account, for the preview we need to use either US West 2 or US West Central and be using either GRS or RA-GRS. In my example, I have created my account in US West 2 as a General Purpose V2 account using RA-GRS.

C:\>New-AzStorageAccount -ResourceGroupName $resourceGroup -Name skylinesteststorageacc -Location $location -SkuName Standard_RAGRS -Kind StorageV2
storage.png

Now to create the Blob container.

C:\>$storageAccount = Get-AzStorageAccount -ResourceGroupName $resourceGroup -Name skylinesteststorageacc 
C:\>$storageAccountContext = $storageAccount.Context 
C:\>$containerName = "testblob" 
C:\>New-AzStorageContainer -Name $containerName -Context $storageAccountContext -Permission blob
storageacct.png

Now to initiate the failover, this can be done of the Geo-replication pane of the storage account or again via a PowerShell account.

getazstorage.png
portal2.png
C:\>Invoke-AzStorageAccountFailover -ResourceGroupName $resourceGroup -Name skylinesteststorageacc
invoke.png

After the failover has been completed, the storage account metadata is modified to show the correct data location and replication type. My storage account is now showing LRS replication and is now located within the secondary datacentre.

getazstorage2.png

If you use unmanaged disks with GRS / RA-GRS replication, you may want to repeat this task with a datastore which contains the VM’s virtual disk. Depending on the types of data being stored, you may find issues failing over when files require exclusive access.

-Matt Boyd