Linux Azure VM Scale Sets with shared storage using Lustre

Update from November 21, 2018: The information below is dated. Please review a more recent article from the Azure CAT team describing Parallel File Systems for HPC Storage on Azure and download the PDF of the whitepaper.

An Azure Virtual Machine Scale Set is a compute resource you can use to deploy and manage an elastic collection of identical and usually stateless VMs. Mark Russinovich announced and demoed the public preview of Azure VM Scale Sets (VMSS) in his November 11, 2015 blog post.

As of December 2015, Azure VM Scale Sets do not yet support attached data disks. This means that virtual machine instances created by the scale set provide only two drives: the OS disk (C: on Windows and /dev/sda on Linux) and the temporary/ephemeral local disk (D: on Windows and /dev/sdb on Linux, often mounted as /mnt/resource).

But what are your options if the elastic and stateless application that you are planning to deploy using a VM Scale Set needs shared storage as part of its data layer? For data that can be stored in a relational or NoSQL database, you can use Azure SQL Database, Azure DocumentDB, or Azure Tables. For storage of blobs (e.g. image files, PDFs, etc.), you can use Azure Blobs. For shared configuration data or other similar limited-scale files that require a shared filesystem, you can mount an Azure Files share.

However, some Linux workloads require shared storage with higher performance than Azure Files currently provides (i.e. current limits of 5TB per share, 1TB per file, and throughput of up to 60MB/s per share). For example, many classic tools (e.g. genomic processing, video encoding, etc.) require a shared POSIX-compliant filesystem with high throughput, large file size support, and the ability to access the same large file from multiple client nodes simultaneously, randomly, or sequentially. For some of these Linux-based (specifically CentOS 6.6 or 7.0) applications, we can combine the elasticity of Azure VM Scale Sets with the higher performance and scalability of the Lustre parallel filesystem deployed on Azure IaaS VMs with multiple attached data disks, preferably using Premium Storage.

The high level steps to create this deployment are:

  1. First, deploy Intel Cloud Edition for Lustre* from Azure Marketplace, which will create a virtual network with two subnets, deploy multiple servers into the server subnet, and leave the client subnet empty.
  2. Next, deploy an Azure VM Scale Set consisting of Lustre clients using a sample Azure Resource Manager template, into a new resource group but using the empty client subnet in the existing virtual network.

Deploy Lustre Servers

The same article also links to a sample template in the Azure Quickstart Templates GitHub repo that shows how to deploy Lustre clients (i.e. VMs that will use the shared storage) from the Azure gallery CentOS 6.6 or 7.0 image, using a “copy” loop and the CustomScript extension. The bash script executed via the CustomScript extension downloads and compiles the proper Lustre client kernel modules and mounts the pre-deployed Lustre filesystem on each of the client VMs.

Instead of using the “copy” loop to create separate VMs, we can apply the same approach to an Azure VM Scale Set and create 1–100 VM instances running CentOS 6.6 or 7.0 with a mount point (e.g. /mnt/scratch) to the pre-deployed shared Lustre parallel filesystem. As mentioned above, the actual stateful Lustre servers need to be deployed first, outside of the VM Scale Set, into a virtual network that also has a subnet into which the VM Scale Set nodes are deployed later.
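As a rough illustration of what each client ultimately has to run, the sketch below builds a Lustre mount command from the MGS address. It is not the quickstart template's actual script. The IP address 10.1.0.4 and the filesystem name “scratch” are hypothetical placeholders, and the helper name is made up for this example.

```python
def lustre_mount_command(mgs_ip, fs_name, mount_point="/mnt/scratch"):
    """Build the argument list for mounting a Lustre filesystem over TCP.

    The device string follows the usual Lustre form <MGS NID>:/<fsname>.
    """
    device = f"{mgs_ip}@tcp:/{fs_name}"
    return ["mount", "-t", "lustre", device, mount_point]


# Hypothetical values: substitute your own MGS private IP and filesystem name.
cmd = lustre_mount_command("10.1.0.4", "scratch")
print(" ".join(cmd))
```

The command is returned as a list so that it can be passed to subprocess.run without shell quoting concerns; it needs root privileges and the Lustre client modules to actually succeed.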

After deploying the Lustre filesystem, we can use the following quickstart template to deploy the Azure VM Scale Set that will mount the shared filesystem.

On the GitHub page of the template, click the “Deploy to Azure” button to be redirected to the Azure Portal “Microsoft Template” deployment screen and fill in the required parameters.

We will need to know the private IP address of the Lustre MGS node, which we can obtain from the MGS SSH session via “ifconfig” or “sudo lctl list_nids”. We will also need the names of the Lustre servers resource group and the virtual network.
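If you want to script this step, the following minimal sketch extracts the IP address from text in the form that “lctl list_nids” prints, where each network identifier looks like address@network. The sample output and the function name are assumptions made for illustration.

```python
import ipaddress


def mgs_ip_from_nids(lctl_output):
    """Return the first IP address that appears in a TCP NID."""
    for line in lctl_output.splitlines():
        nid = line.strip()
        if "@" not in nid:
            continue  # not a NID line
        address, _, network = nid.partition("@")
        if not network.startswith("tcp"):
            continue  # skip non-TCP networks
        try:
            return str(ipaddress.ip_address(address))
        except ValueError:
            continue  # address part is not a plain IP
    raise ValueError("no TCP NID with an IP address found")


# Hypothetical sample of what the MGS node might print.
sample_output = "10.1.0.4@tcp\n"
print(mgs_ip_from_nids(sample_output))
```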

Once the deployment is complete, we can SSH into any of the VM Scale Set nodes via the Load Balancer’s public IP and the inbound NAT rules that map public ports 50000–50099 to private port 22 on each of the VM instances. You can view the Outputs section to find the domain name of the public IP to use for SSH.
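The port mapping can be sketched as a small helper that turns a NAT rule index into an SSH command. This assumes the rules are numbered consecutively from 50000; which instance sits behind which port is decided by Azure, so treat the index as a NAT slot and not as an instance ID. The host name and user name below are hypothetical.

```python
BASE_PORT = 50000  # first public port in the inbound NAT range
MAX_SLOTS = 100    # ports 50000 through 50099


def ssh_command(fqdn, user, nat_index):
    """Return the ssh command for the instance behind the given NAT slot."""
    if not 0 <= nat_index < MAX_SLOTS:
        raise ValueError("nat_index must be between 0 and 99")
    return f"ssh -p {BASE_PORT + nat_index} {user}@{fqdn}"


# Hypothetical domain name and user; take the real ones from the deployment.
print(ssh_command("myvmss.westus.cloudapp.azure.com", "azureuser", 1))
```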

On one of the instances (after connecting via SSH), use the “df -h” command to view the currently mounted file systems and confirm that Lustre is mounted at /mnt/scratch.
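The same check can be automated by scanning the mount table instead of reading “df -h” output by eye. The sketch below takes text in the /proc/mounts format (device, mount point, filesystem type, options) and reports whether a Lustre filesystem is mounted at the given path. The sample lines are invented for illustration.

```python
def lustre_mounted(mounts_text, mount_point="/mnt/scratch"):
    """Return True if a filesystem of type 'lustre' is mounted at mount_point."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "lustre":
            return True
    return False


# Hypothetical mount table; on a real client, read the text from /proc/mounts.
sample_mounts = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "10.1.0.4@tcp:/scratch /mnt/scratch lustre rw 0 0\n"
)
print(lustre_mounted(sample_mounts))
```

On a client VM, the input would come from open("/proc/mounts").read().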

In addition, we can list the contents of the shared Lustre directory to confirm that both VM instances were able to write their 200MB test files.

Scale the VM Scale Set Up and Down

The source of the scale.json template is here, and the “Deploy to Azure” button is at the bottom of that page. Click the “Deploy to Azure” button and fill in the required parameters.

For example, after deploying the “scale.json” template with a newClientCount of 10, the /mnt/scratch folder now contains 10 test files, one from each of the instances.
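A quick way to verify this after each scale operation is to count the files in the shared directory and compare the result with the expected instance count. The sketch below is only an illustration: it runs against a temporary directory standing in for /mnt/scratch, and the test file names are hypothetical because the naming used by the template's script is not shown here.

```python
import tempfile
from pathlib import Path


def count_test_files(shared_dir):
    """Count the regular files directly inside shared_dir."""
    return sum(1 for entry in Path(shared_dir).iterdir() if entry.is_file())


def all_instances_wrote(shared_dir, expected_instances):
    """Return True if there is exactly one file per expected instance."""
    return count_test_files(shared_dir) == expected_instances


# Demo with a temporary directory in place of the real Lustre mount.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(10):
        (Path(demo_dir) / f"testfile-{i}").write_bytes(b"")  # hypothetical names
    print(all_instances_wrote(demo_dir, 10))
```

On a scale set node, the call would be all_instances_wrote("/mnt/scratch", 10), assuming the directory holds nothing but the test files.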


I’m looking forward to your feedback and questions via Twitter.

Originally published on December 30, 2015.

Principal Engineer / Architect, FastTrack for Azure at Microsoft
