Use Cases of Compression, Deduplication and Erasure coding | Nutanix Community

Hey Guys,

Please share the use cases for the features mentioned above (compression, deduplication, and erasure coding).

 

 

Hello @Shield07 

The table below describes, at a high level, which optimizations apply to which workloads:

| Data Transform | Best suited Application(s) | Comments |
|---|---|---|
| Erasure Coding (EC-X) | Most, Ideal for Nutanix Files/Objects | Provides higher availability with reduced overheads than traditional RF. No impact to normal write or read I/O performance. Does have some read overhead in the case of a disk / node / block failure where data must be decoded. |
| Inline Compression | All | No impact to random I/O, helps increase storage tier utilization. Benefits large or sequential I/O performance by reducing data to replicate and read from disk. |
| Offline Compression | None | Given inline compression will compress only large or sequential writes inline and do random or small I/Os post-process, that should be used instead. |
| Perf Tier Dedup | P2V/V2V, Hyper-V (ODX), Cross-container clones | Greater cache efficiency for data which wasn't cloned or created using efficient AOS clones |
| Capacity Tier Dedup | Same as perf tier dedup | Benefits of above with reduced overhead on disk |

Generally, inline compression works very well for almost any workload. It can be enabled pretty much always.

Deduplication (both cache and capacity) should only be enabled for VDI with persistent desktops. Another possible use case is a large number of similar application servers that do little data processing. Other workloads see little benefit from deduplication, and it can actually hurt performance: it creates fingerprints for all deduplicated data and stores them in the metadata, which adds overhead. I recommend keeping it turned off unless you are running VDI with persistent desktops. Never use deduplication for SQL, Oracle, or any other databases, or for any applications that need high I/O performance.
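To make the fingerprint overhead a bit more concrete, here is a rough sketch of how fingerprint-based deduplication works in general. It is not how AOS implements it; the 16 KiB chunk size, the SHA-1 hash, and the function names `dedup_store` and `summarize` are all assumptions picked for illustration.

```python
import hashlib

# Hypothetical chunk size, chosen only for illustration.
CHUNK_SIZE = 16 * 1024


def dedup_store(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep each unique chunk once."""
    store = {}  # fingerprint -> chunk bytes (the data actually kept)
    refs = []   # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Every chunk is fingerprinted, whether or not it turns out to be a duplicate.
        fingerprint = hashlib.sha1(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)
        refs.append(fingerprint)
    return store, refs


def summarize(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Report how many fingerprints were computed versus how much data was saved."""
    store, refs = dedup_store(data, chunk_size)
    return {
        "chunks": len(refs),
        "unique_chunks": len(store),
        "logical_bytes": len(data),
        "stored_bytes": sum(len(c) for c in store.values()),
    }


# Highly repetitive data (think many near-identical persistent desktops).
similar = b"A" * (CHUNK_SIZE * 8)
# Data with no repeated chunks (think a busy database).
distinct = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))

print(summarize(similar))
print(summarize(distinct))
```

In both cases eight fingerprints are computed and tracked, but only the repetitive data gets any space back. That is the trade-off described above: on workloads with little duplicate data you pay the fingerprint and metadata cost for no saving.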

EC-X is good for some backup storage, archival solutions, and Files/Objects. As with deduplication, don't use it for databases or high I/O performance apps.
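As a quick illustration of the "reduced overheads than traditional RF" point from the table, the arithmetic can be sketched like this. The 4 data + 1 parity strip and the 100 TiB raw figure are assumptions for the example only; the strip size you actually get depends on the cluster, and the helper names are hypothetical. Other overheads (metadata, reserved space, and so on) are ignored.

```python
def rf_overhead(replication_factor: int) -> float:
    """Raw bytes written per logical byte with plain replication."""
    return float(replication_factor)


def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw bytes written per logical byte for a data+parity strip."""
    return (data_blocks + parity_blocks) / data_blocks


def usable_capacity(raw_tib: float, overhead: float) -> float:
    """Logical capacity left after paying the redundancy overhead."""
    return raw_tib / overhead


raw = 100.0  # hypothetical raw capacity in TiB

rf2 = rf_overhead(2)      # two full copies of every block
ec41 = ec_overhead(4, 1)  # assumed 4 data + 1 parity strip

print(f"RF2:    {rf2:.2f}x overhead, {usable_capacity(raw, rf2):.1f} TiB usable")
print(f"EC 4+1: {ec41:.2f}x overhead, {usable_capacity(raw, ec41):.1f} TiB usable")
```

Under these assumptions the same raw capacity holds noticeably more cold data with EC-X than with RF2, which is why it suits backup and archival data. The cost is the decode work on reads after a failure, as noted in the table.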