
Storage Efficiency, Data Reduction, and Performance with Compression

  • 18 May 2017

Nutanix Tech Marketing Engineer Andy Daniel recently wrote a short overview of the storage efficiency features, including compression. Share what you are doing for the different workloads in your environments (Nutanix or otherwise).



--

To optimize storage capacity and accelerate application performance, the Acropolis Distributed Storage Fabric (DSF) uses data efficiency techniques such as deduplication, compression, and erasure coding. They are intelligent and adaptive and, in most cases, require little or no fine-tuning. In fact, two levels of post-process compression are enabled in conjunction with cold data classification by default on new shipping clusters. Because these features are entirely software driven, existing customers can also take advantage of new capabilities and enhancements by upgrading AOS.



DSF provides both inline and post-process compression to maximize capacity. Customers often associate compression with reduced performance, but this isn’t the case with Nutanix. In fact, as of AOS 5.0, all random writes are compressed inline before being written to the OpLog (write cache), no matter the chosen configuration. Because compressed writes take up less OpLog space, the OpLog can absorb sustained random writes for a longer duration, which improves burst handling.
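To make the burst-handling argument concrete, here is a rough back-of-the-envelope sketch in Python. The function name `burst_seconds` and every number in it (cache size, compression ratio, write rate) are hypothetical values chosen for illustration, not Nutanix specifications, and the model ignores the cache draining to the capacity tier during the burst.

```python
def burst_seconds(cache_gib: float, compression_ratio: float, write_mib_per_s: float) -> float:
    """Seconds a write burst can be absorbed before the cache fills.

    Simplified model (an assumption): nothing drains from the cache during
    the burst, and every write compresses by the same ratio.
    """
    if compression_ratio <= 0 or write_mib_per_s <= 0:
        raise ValueError("compression_ratio and write_mib_per_s must be positive")
    # A 2:1 ratio means each MiB of cache holds 2 MiB of logical writes.
    effective_mib = cache_gib * 1024 * compression_ratio
    return effective_mib / write_mib_per_s


# Hypothetical figures: 6 GiB of write cache, 200 MiB/s of incoming random writes.
uncompressed = burst_seconds(6, 1.0, 200)
compressed = burst_seconds(6, 2.0, 200)
print(f"no compression: {uncompressed:.2f} s, 2:1 compression: {compressed:.2f} s")
```

Under these assumptions the absorbable burst length scales linearly with the compression ratio, which is the effect the paragraph above describes.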



Large and sequential reads and writes also see a performance benefit from compression, so there are very few workloads where inline compression (compression delay = 0) isn’t appropriate. It’s even recommended for Tier-1 workloads such as Oracle, Microsoft SQL Server and Exchange. Inline compression additionally improves performance within the capacity tier (Extent Store) while maximizing total available storage capacity.



With potentially dramatic performance improvements and the ability to significantly increase your cluster’s effective storage capacity, there’s no reason you shouldn’t enable compression on your containers today!
I am confused by this statement - -



"Large and sequential reads and writes also see a performance benefit from compression, so there are very few workloads where inline compression (compression delay = 0) isn’t appropriate. It’s even recommended for Tier-1 workloads such as Oracle, Microsoft SQL Server and Exchange. Inline compression additionally improves performance within the capacity tier (Extent Store) while maximizing total available storage capacity."



This statement seems to be a direct contradiction to - -



"Compression Best Practices" - - https://portal.nutanix.com/#/page/docs/details?targetId=Web_Console_Guide-Prism_v4_7:sto_compression_c.html - -



"Compressing data is computationally expensive, while decompressing data is less so. From this fact it follows that workloads where data is written once and read frequently, such as user data storage, are most suitable for compression. Examples of such workloads are file servers, archiving, and backup.

Because compressed data cannot be compressed further, attempting to do so consumes system resources for no benefit. Neither user data that comprises mainly natively compressed data (such as JPEG or MPEG) nor data in a system that has native compression (such as SQL Server or Oracle databases) should be stored on a compressed container."
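The point about already-compressed data can be checked with a small Python sketch. It uses the standard-library `zlib` module purely as a stand-in codec (an assumption; it is not necessarily what the storage layer or the database uses), and the row data is synthetic.

```python
import random
import zlib


def ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (higher means more savings)."""
    return len(data) / len(zlib.compress(data, 6))


# Synthetic, database-like rows: id, a random number, and a word from a small set.
rng = random.Random(0)
words = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel"]
rows = "".join(f"{i},{rng.randint(0, 9999)},{rng.choice(words)}\n" for i in range(20000))
raw = rows.encode("ascii")

compressed_once = zlib.compress(raw, 6)

first_pass = ratio(raw)               # compressing the raw rows
second_pass = ratio(compressed_once)  # compressing the already-compressed bytes

print(f"first pass ratio:  {first_pass:.2f}")
print(f"second pass ratio: {second_pass:.2f}")
```

The first pass should shrink the data noticeably, while the second pass should land close to 1.0 (no savings), even though the CPU work is spent both times.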



This is ambiguous, because you may or may not be using native compression in SQL Server, but there seems to be one takeaway: compressing data is computationally expensive. I have been researching this topic lately because we have a SQL Server instance running on a single host, a monster VM that takes up an entire node (896 GB RAM/20 vCPU), and we're noticing contention with the CVM on this host.

This is mostly an application problem that we're throwing hardware at, but since we moved it from our HPE environment, which used an SSD SAN, to Nutanix, we've been seeing higher and higher disk latency. Nutanix recommended we enable inline compression, which we did about 6 months ago, but now the CVM's CPU is spiking and often running at 100%. We plan to give the CVM even more vCPUs, but I'm wondering whether disabling inline compression would reduce the CVM contention on this host. Thoughts?
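For anyone who wants a feel for the CPU asymmetry mentioned in the best-practices quote, here is a minimal Python timing sketch. It uses the standard-library `zlib` as a stand-in codec (an assumption; it says nothing about the codec or CPU cost inside a CVM), and the helper name `time_codec` and the sample data are made up for illustration.

```python
import time
import zlib


def time_codec(data: bytes, level: int = 6) -> tuple[float, float]:
    """Return (compress_seconds, decompress_seconds) for one zlib round trip."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_s = time.perf_counter() - start

    # Sanity check: the round trip must return the original bytes.
    if unpacked != data:
        raise ValueError("round trip did not reproduce the input")
    return compress_s, decompress_s


# Synthetic log-like sample, a few megabytes in size.
sample = b"".join(b"row %d: status=OK latency_ms=%d\n" % (i, i % 97) for i in range(200_000))

compress_s, decompress_s = time_codec(sample)
print(f"compress:   {compress_s * 1000:.1f} ms")
print(f"decompress: {decompress_s * 1000:.1f} ms")
```

Absolute numbers depend entirely on the machine and the data, so treat the output only as a relative comparison between the two directions.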
Does anyone have an opinion on which compression is preferred when running a large SQL data warehouse: Nutanix or native SQL Server 2016? Personally, I like having it enabled at the VM/SQL Server 2016 level because, if you only have one data pool in your Nutanix environment as we do, you would have to enable it for all systems that access that storage.