Briefing: cloud storage performance metrics
About 50% of business data is now stored in the cloud, and the volume stored using cloud technologies is even higher when private and hybrid clouds are taken into account.
Cloud storage is flexible and potentially cost-effective. Organizations can choose from hyperscalers – Amazon Web Services, Google’s GCP, and Microsoft Azure – as well as local or more specialized cloud providers.
But how do you measure the performance of cloud storage services? When storage is on-premises, there are many well-established metrics for tracking storage performance. In the cloud, things can be less clear.
In part, that’s because when it comes to cloud storage, choice brings complexity. Cloud storage is available in a range of formats, capacities and performance levels, including file, block and object storage, hard drive-based systems, VM storage, NVMe, SSDs and even tape, as well as technology that works on a “cloud-like” basis on-premises.
This can make comparing and monitoring cloud storage instances more difficult than for on-premises storage. As well as conventional storage performance metrics, such as IOPS and throughput, IT professionals specifying cloud systems must consider criteria such as cost, service availability, and even security.
Conventional storage metrics
Conventional storage metrics also apply in the cloud. But they can be a bit more difficult to unpick.
Enterprise storage systems have two main measures of “speed”: throughput and IOPS. Throughput is the rate of data transfer to and from storage media, measured in bytes per second; IOPS measures the number of reads and writes (input/output, or I/O, operations) per second.
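The two measures are linked by the size of each I/O operation: throughput is roughly IOPS multiplied by block size. The sketch below illustrates that relationship. The function name and the example workloads are illustrative assumptions, not figures from any vendor.

```python
def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Approximate throughput in MiB/s for a given IOPS rate and block size."""
    return iops * block_size_kib / 1024


# Assumed example workloads (hypothetical numbers):
# many small random I/Os versus fewer large sequential I/Os.
small_random = throughput_mib_per_s(iops=10_000, block_size_kib=4)
large_sequential = throughput_mib_per_s(iops=500, block_size_kib=1024)

print(f"4 KiB blocks at 10,000 IOPS: {small_random:.1f} MiB/s")
print(f"1 MiB blocks at 500 IOPS:    {large_sequential:.1f} MiB/s")
```

This is why a system with a high IOPS figure can still deliver modest throughput, and why both numbers are usually quoted together with the block size used in the test.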
In these measurements, hardware manufacturers distinguish between read speeds and write speeds, with read speeds generally being faster.
Manufacturers of hard drives, SSDs, and arrays also distinguish between sequential and random reads or writes.
These metrics are affected by things like the movement of read/write heads across disk platters and the need to erase existing data on flash storage. Random read and write performance is usually the best guide to real-world performance.
Hard drive manufacturers quote revolutions per minute (rpm) for spinning drives: typically 7,200 rpm for mainstream storage, sometimes 12,000 rpm for higher-grade enterprise systems, and 5,400 rpm for lower-performance hardware. However, these measures are not applicable to solid state storage.
In general, the higher the IOPS, the better the system performs. Spinning hard drives typically reach 50 IOPS to 200 IOPS.
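A common rule of thumb ties spindle speed to random IOPS: each random I/O costs roughly one average seek plus half a rotation. The sketch below applies that approximation to the spindle speeds mentioned above. The 8.5 ms average seek time is an assumed value for illustration; real drives vary, and the estimate ignores caching and command queuing.

```python
def hdd_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rule-of-thumb random IOPS for a single spinning drive."""
    # On average the platter must turn half a revolution to reach the data.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


ASSUMED_SEEK_MS = 8.5  # hypothetical average seek time, held constant

for rpm in (5_400, 7_200, 12_000):
    iops = hdd_random_iops(rpm, ASSUMED_SEEK_MS)
    print(f"{rpm:>6} rpm: about {iops:.0f} random IOPS")
```

Under these assumptions every result lands inside the 50 to 200 IOPS range, which shows why faster spindles help only modestly: seek time, not rotation, dominates.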
Solid state systems are significantly faster. On paper, a high-performance flash drive can achieve 25,000 IOPS or even more. However, the real-world performance differences will be smaller once the storage controller, network, and other overheads such as RAID and cache usage are taken into account.
Latency is the third key performance metric to consider. Latency is how quickly each I/O request is executed. For a hard drive-based system, this will be 10ms to 20ms. For SSDs, it’s a few milliseconds. Latency is often the most important metric in determining whether storage can support an application.
But translating conventional storage metrics to the cloud is rarely straightforward.
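Latency also puts a ceiling on IOPS. By Little's law, the achievable rate is roughly the number of I/Os in flight divided by the time each one takes. The sketch below is a simplified model under that assumption; the queue depths and the 1 ms SSD latency are illustrative values, not measurements.

```python
def max_iops(latency_ms: float, queue_depth: int = 1) -> float:
    """Upper bound on IOPS given per-request latency and outstanding requests."""
    return queue_depth * 1000 / latency_ms


# A hard drive at 15 ms per request, one request at a time.
print(f"HDD, 15 ms, queue depth 1: {max_iops(15, 1):.0f} IOPS")

# An SSD at an assumed 1 ms per request, 32 requests in flight.
print(f"SSD, 1 ms, queue depth 32: {max_iops(1, 32):.0f} IOPS")
```

An application that issues one request at a time and waits for each to finish cannot exceed the queue-depth-1 figure, however high the headline IOPS number is. That is one reason latency often decides whether storage suits an application.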
Usually, buyers of cloud storage won’t know exactly how their capacity is provisioned. The exact mix of flash, spinning drives, and even tape or optical media depends on the cloud provider and their service levels.
Most large-scale cloud providers use a mix of storage hardware, caching, and load balancing technologies, making raw hardware performance data less useful. Cloud providers also offer different storage formats – mostly block, file, and object – which makes performance metrics even harder to compare.
The metrics will also vary based on the types of storage an organization purchases, as hyperscalers now offer multiple tiers of storage, depending on performance and price.
Then there are service-oriented offerings, such as backup and restore, and archiving, which have their own metrics, such as recovery time objective (RTO) or retrieval times.
The easiest area for comparisons, at least between the big cloud providers, is block storage.
Google’s cloud platform, for example, lists maximum sustained IOPS and maximum sustained throughput (in MBps) for its block storage. This breaks down into read and write IOPS, and throughput per GB of data and per instance. But as Google puts it: “Persistent disk IOPS and throughput performance is dependent on disk size, number of vCPUs in the instance, and I/O block size, among other factors.”
Google also lists a useful comparison of how its infrastructure performs against a 7200 RPM physical drive.
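When limits are quoted both per GB and per instance, the figure a buyer actually gets is the smaller of the two. The sketch below models that idea in its simplest form. The function and every number in it are hypothetical placeholders, not Google's published limits, and real services apply further factors such as vCPU count and block size.

```python
def effective_iops(size_gb: float, iops_per_gb: float, instance_cap: float) -> float:
    """Provisioned IOPS scale with volume size until the per-instance cap applies."""
    return min(size_gb * iops_per_gb, instance_cap)


# Hypothetical tier: 30 IOPS per GB, capped at 15,000 IOPS per instance.
for size_gb in (100, 500, 1_000):
    iops = effective_iops(size_gb, iops_per_gb=30, instance_cap=15_000)
    print(f"{size_gb:>5} GB volume: {iops:,.0f} IOPS")
```

In this model the cap is reached at 500 GB, so buying a larger volume adds capacity but no further IOPS. Checking where that point falls for a given tier is a useful step when comparing providers.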
AWS has similar advice around its Elastic Block Store (EBS) offering. Again, this can guide buyers through the different tiers of storage, from high-performance SSDs to cold hard drive storage.
Cost, service availability … and other useful measures
Because cloud storage is a pay-as-you-go service, cost is always a key metric.
Again, all of the major cloud providers have tiers based on cost and performance. AWS, for example, offers general purpose gp2 and gp3 SSD volumes, performance-optimized io1 and io2 volumes, and throughput-focused st1 hard drive volumes for “large sequential workloads.” Buyers will want to compile their own cost and performance analysis in order to make like-for-like comparisons.
But cloud storage metrics aren’t just about cost and performance. The cost per GB or per instance should be weighed alongside other charges, including data ingress charges and, in particular, data egress or retrieval costs. Some very cheap long-term storage offerings can become very expensive when it comes to retrieving data.
Another metric is usable capacity: How much of the purchased storage is actually available to the client application, and at what point will utilization begin to affect real-world performance? Again, this may differ from the numbers for on-premises technology.
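A simple way to see the effect of egress and retrieval charges is to total all three cost components for a month. The sketch below does this for two tiers. All prices are hypothetical placeholders chosen for illustration, not any provider's actual rates, and the function name is an assumption.

```python
def monthly_cost(stored_gb: float, storage_price: float,
                 read_gb: float, egress_price: float,
                 retrieval_price: float = 0.0) -> float:
    """Total monthly cost: storage, plus egress and retrieval for data read back."""
    storage = stored_gb * storage_price
    egress = read_gb * egress_price
    retrieval = read_gb * retrieval_price
    return storage + egress + retrieval


STORED_GB = 10_000
READ_GB = 10_000  # assumed scenario: the full data set is read back once

# Hypothetical per-GB prices for a standard tier and an archive tier.
standard = monthly_cost(STORED_GB, 0.023, READ_GB, egress_price=0.09)
archive = monthly_cost(STORED_GB, 0.001, READ_GB, egress_price=0.09,
                       retrieval_price=0.03)

print(f"standard tier: ${standard:,.2f}")
print(f"archive tier:  ${archive:,.2f}")
```

With these assumed prices, the archive tier is far cheaper to store in but ends up costing more in a month when the whole data set is read back. The comparison flips again if little or no data is retrieved, so the expected access pattern belongs in any cost analysis.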
CIOs will also want to review the availability of services. The reliability of storage components and subsystems has traditionally been measured by mean time between failures (MTBF) or, for SSDs, the newer terabytes written (TBW) metric.
But for large-scale cloud delivery, availability is a more common and useful metric. Cloud providers are increasingly using data center or telecom-style availability or uptime metrics, with “five nines” often the best and most expensive SLA.
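An availability percentage is easier to judge when converted into permitted downtime. The sketch below does that conversion, assuming a 365-day year and treating the percentage as applying to the whole year. How a provider actually measures and credits downtime depends on its SLA terms.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year allowed by a given availability percentage."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


for label, pct in (("two nines", 99.0), ("three nines", 99.9),
                   ("four nines", 99.99), ("five nines", 99.999)):
    minutes = downtime_minutes_per_year(pct)
    print(f"{label:<12} ({pct}%): {minutes:,.1f} minutes per year")
```

On this basis “five nines” allows a little over five minutes of downtime a year, against roughly 8.8 hours for “three nines”.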
Even then, these metrics aren’t the only factors to consider. Buyers of cloud storage will also need to consider the geographic location, redundancy, data protection and compliance, security, and even financial strength of the cloud provider.
While these are not performance metrics in the conventional sense, if a provider falls short on any of them, it could be a barrier to using its service at all.