Windows Azure - Disk Performance Surprises Explained
A few months ago, we blogged about some surprising metrics coming from Azure Disks. Notably, we found instances where Standard Storage was faster than Premium Storage and instances where Premium Storage was dreadfully slow (as in 2MB/s). A great guy from the Azure storage team reached out and helped me better understand what was going on.
Here’s my take on what’s going on. Caveat: I’m not a storage expert, just a guy trying to make sense of this stuff.
First off - measured disk throughput is governed by the “transfer request size” and the amount of concurrency (“number of threads” in CrystalDiskMark or “Outstanding I/Os” in Iometer). Azure disks (and likely all cloud disks) depend more heavily on request size and concurrency to achieve high throughput than a local drive does. We’ll see how that plays out below.
Before we update the benchmarks, let’s update/explain some of our findings from the original post.
1. In the initial post, CrystalDiskMark showed a dreadful 2.207MB/s write speed for the Premium Disk using a 4k transfer request size and 1 thread (this is the bottom right cell of the test) whereas the Standard Disk showed 61.82MB/s. The reason is that the Standard Disk was also the o/s disk, and this disk has read/write caching enabled - so it was the write cache that “sped up” this test. As of yet, you cannot enable the write cache on data disks, and using the write cache does expose some risk of data loss. So, the conclusion that the Standard Disk is faster in some cases than the Premium Disk is flawed - it’s the “Standard Disk with read/write caching” that may be faster.
2. Regardless of #1 above, we’re still left with the question: “Why does Premium Storage only deliver 2.207MB/s write speed, especially when local SSD is about 40x faster (82.75MB/s) for the same test?” The answer turns out to be pretty straightforward. The cloud drive is limited to 500 IOPS per thread - so with a 4k transfer size and 1 thread, you get about 2MB/s (4k * 500/s = 2MB/s). If you change this to 2 threads, you get 4MB/s. This is the simplest example of how concurrency affects cloud drives. You could also get 4MB/s by doubling the transfer size from 4k to 8k.
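The per-thread arithmetic above can be sketched in a few lines of Python. This is a back-of-the-envelope model, not a measurement. The function name is made up for illustration, 4k is assumed to mean 4,096 bytes, and MB is assumed to mean 1,000,000 bytes.

```python
# Rough model: each thread (outstanding I/O) is assumed to be limited to
# 500 IOPS, as described above. No disk or VM caps are modelled here.
def throughput_mb_s(request_bytes, threads, iops_per_thread=500):
    """Expected throughput in MB/s (1 MB = 1,000,000 bytes, an assumption)."""
    return request_bytes * threads * iops_per_thread / 1_000_000


print(f"4k, 1 thread : {throughput_mb_s(4096, 1):.2f} MB/s")
print(f"4k, 2 threads: {throughput_mb_s(4096, 2):.2f} MB/s")
print(f"8k, 1 thread : {throughput_mb_s(8192, 1):.2f} MB/s")
```

Doubling either the thread count or the request size doubles the result, which is the whole point: for a cloud disk, the two knobs are interchangeable until some other limit kicks in.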
Now, let’s update some benchmarks using Iometer (the Azure storage expert showed me this tool, which provides more explicit control over transfer size and concurrency).
I’m going to use a 4k transfer size, as this is my best guess as to what is likely for ElasticSearch (I’m still very unclear on how transfer size is affected in “real life” between the application stack and the o/s). This corresponds to the built-in “4 KiB; 0% Read; 0% random” ‘Access Specification’ in Iometer - with 0% Read, I believe this means it’s a write-only test. When I use the term “threads” below, it corresponds to the “# of Outstanding I/Os” in Iometer.
I’ll just provide the numbers for the Premium Disk (P30 disk on a DS12 Azure VM - 4 cores, 28GB RAM) compared to local SSD (4x 800GB SSD in RAID 10 - 6 cores, 64GB RAM). I realize the machine sizes are different and may skew the results (but hey, you’ve got to work with what you’ve got!).
The maximum IOPS for the DS12 (12,800) and the P30 disk (5,000) are taken from this article, as are the maximum throughputs for the DS12 (128MB/s) and the P30 (200MB/s).
Premium Storage - 4k, 1 Thread
Notes: the highlighted portions show the key metrics - IOPS (474) and Throughput (1.94MB/s). This is exactly what we expect - a single thread is limited to 500 IOPS and with a transfer size of 4k, we get the expected throughput (4k * 500 IOPS = 2MB/s).
Premium Storage - 4k, 8 Threads
Notes: again, we get the expected result. IOPS goes to ~4,000 (8 threads * 500 / thread = 4,000) and throughput scales accordingly.
Premium Storage - 4k, 16 Threads
Notes: here we get something interesting: the IOPS is not 16 * 500 = 8,000 but rather 5,000, which is the limit for a P30 disk. We can conclude that the maximum throughput on a single P30 disk with a 4k transfer size is 20MB/s (4k * 5,000 IOPS). This is a limit of the disk, not of the VM. In theory, you could stripe 3 P30 disks on a DS12 and increase the max IOPS to 12,800, which is the limit for the VM.
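To make the striping arithmetic concrete, here is a small sketch. It assumes striping scales IOPS linearly across disks, which is an idealisation I haven’t tested, and the helper names are hypothetical.

```python
import math

P30_IOPS = 5_000    # per-disk limit quoted above
DS12_IOPS = 12_800  # per-VM limit quoted above


def disks_to_saturate_vm(vm_iops, disk_iops):
    """Smallest number of striped disks whose combined IOPS reaches the VM limit."""
    return math.ceil(vm_iops / disk_iops)


def striped_iops(disks, disk_iops, vm_iops):
    """Combined IOPS of a stripe set, capped by the VM (assumes linear scaling)."""
    return min(disks * disk_iops, vm_iops)


n = disks_to_saturate_vm(DS12_IOPS, P30_IOPS)
print(n)                                           # number of P30 disks needed
print(striped_iops(2, P30_IOPS, DS12_IOPS))        # two disks: still disk-bound
print(striped_iops(n, P30_IOPS, DS12_IOPS))        # n disks: now VM-bound
```

Under these assumptions two disks top out at 10,000 IOPS, and a third disk only buys another 2,800 before the DS12 limit takes over.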
Premium Storage - 32k, 16 Threads
For kicks, I ran this test which has a very high transfer size (IMO) and a very high degree of concurrency (16 threads). But, this does provide the maximum throughput as advertised on the DS12 (128MB/s).
Notes: here you can see that while we haven’t reached the maximum IOPS for the P30 disk (5,000), we have reached the maximum throughput for the DS12 (128MB/s).
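Putting the limits together, the four Premium Storage results above are roughly what you get from a simple “smallest cap wins” model. The sketch below is my own simplification, not anything published by Azure: the function name is hypothetical, 4k and 32k are assumed to be 4,096 and 32,768 bytes, and MB is assumed to be 1,000,000 bytes.

```python
def expected_disk_perf(request_bytes, threads,
                       per_thread_iops=500,
                       disk_iops=5_000, disk_mb_s=200,   # P30 limits
                       vm_iops=12_800, vm_mb_s=128):     # DS12 limits
    """Return (iops, mb_per_s) after applying the IOPS and throughput caps."""
    # IOPS is bounded by concurrency, then by the disk, then by the VM.
    iops = min(threads * per_thread_iops, disk_iops, vm_iops)
    mb_s = iops * request_bytes / 1_000_000
    # If the throughput cap is hit first, IOPS drops to whatever fits under it.
    cap = min(disk_mb_s, vm_mb_s)
    if mb_s > cap:
        mb_s = cap
        iops = cap * 1_000_000 / request_bytes
    return iops, mb_s


for size, threads in [(4096, 1), (4096, 8), (4096, 16), (32768, 16)]:
    iops, mb_s = expected_disk_perf(size, threads)
    print(f"{size // 1024}k, {threads:2d} threads: {iops:7.0f} IOPS, {mb_s:6.1f} MB/s")
```

In the 32k case the model lands at about 3,900 IOPS and 128MB/s, i.e. the VM throughput cap is reached before the disk’s 5,000 IOPS limit, which is consistent with the note above.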
Local SSD - 4K, 8 Threads
Local SSD - 4K, 16 Threads
Notes - Looks like we’re getting close to the max IOPS for this disk array. Even when I move to 64 threads, it only drives IOPS to 26,000 (minor increase from the 24,685 in the 16 thread test).
Local SSD - 32k, 16 Threads
Notes - What? 1GB/s? Now that’s absolutely blazing fast!
---------------------------------------------------------------
This is a pretty complex scenario, dependent on the number of cores, Windows, ElasticSearch, Lucene, Java, document size, etc. I feel pretty good that these tests capture a “typical” ElasticSearch scenario, but obviously there are a lot of variables involved. Regardless, for this test setup, bastardized though it may be, I feel OK in making this conclusion - and it matches my ‘empirical’ evidence from our applications having run on both DS12 with Premium Storage and physical machines with local SSD.
SSD is significantly faster (5x to 30x) than Premium Storage, regardless of transfer size or concurrency. For “typical” ElasticSearch scenarios (see below), SSD is roughly 4x to 6x faster than Premium Storage.
“Typical write” ElasticSearch scenario (4k transfer, 8 threads): SSD is about 6x faster than Premium Storage (92MB/s vs 15MB/s). By default, ElasticSearch uses the number of processors as the number of bulk operation threads - this means only 4 threads on the DS12. A good optimization if running ElasticSearch on a DS12 may be to increase the size of many or all of the thread pools.
“Typical read” ElasticSearch scenario (4k transfer, 8 threads): SSD is about 4x faster than Premium Storage (281MB/s vs 67MB/s). [These tests are not shown above.]
Of course, everything is based on particular scenarios and workloads - there can be no steadfast “rules” about what environment is optimal for every scenario. Disclaimer aside, I hope this helps expand the understanding of disk speed, both on and off Azure.