Hi all! Is anyone out there running Cloudera in a Nutanix environment? Not just Hadoop, but specifically the Enterprise Cloudera Hadoop implementation. We're in the process of migrating our Production Cloudera cluster to Nutanix VMs and are having a host of issues related to disk latency. I'm aware of the Nutanix Hadoop reference documents, but our configuration doesn't quite fit because it started on physical devices, and the existing production configuration would be challenging to change at this point. For example, we're not running YARN yet and are still on MapReduce V1, though we're in the process of moving in that direction in our dev environment.
We have a storage container devoted to our HDFS directories, with virtual disks carved out of it in VMware, but we're seeing vdisk latencies *averaging* 300ms with frequent spikes to several seconds. Our I/O is very write heavy; Nutanix advised us to disable EC-X, so we did, but it didn't help. We're running RF2 on that container, no deduplication, inline compression. The vdisks are partitioned, with ext4 filesystems on the partitions, mounted with "noatime".
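For anyone who wants to double-check the mount side of a setup like this, here's a minimal sketch that flags ext4 mounts lacking "noatime". It assumes a Linux guest, where `/proc/mounts` lists one mount per line as device, mount point, filesystem type, and options. The function name and the sample paths are made up for illustration.

```python
def ext4_mounts_missing_noatime(mounts_text):
    """Return mount points of ext4 filesystems whose options lack 'noatime'.

    `mounts_text` is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "ext4" and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample; on a real guest, pass open("/proc/mounts").read() instead.
sample = (
    "/dev/sdb1 /data/1 ext4 rw,noatime 0 0\n"
    "/dev/sdc1 /data/2 ext4 rw,relatime 0 0\n"
)
print(ext4_mounts_missing_noatime(sample))
```

This only confirms what the kernel reports for each mount; it says nothing about latency on the storage side.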
If anyone's successfully spun up a Cloudera environment on Nutanix, what gotchas have you found (and solved)? It's entirely possible I've configured something foolishly, but I can't for the life of me figure out what it might be. Thanks!
How many vdisks are you attaching to the VM? Not Cloudera, but I just worked through a SAS-Grid POC and we saw improvements going from 4 to 8 vdisks.
We've got four 2TB vDisks on each VM. I'm not sure I could go to 8 because I don't have enough space in the container; I might have to remove some and add them back in. Do you have any guesses why that change fixed your issue? Thanks!!!
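To compare a vdisk layout before and after a change like that, a rough guest-side baseline can help. The sketch below times small write-plus-fsync calls in a given directory. It's only an illustration: the function names and defaults are hypothetical, and it measures latency as seen through the guest filesystem, which won't necessarily match the vdisk latency the hypervisor or storage layer reports.

```python
import os
import statistics
import tempfile
import time


def fsync_write_latencies_ms(directory, samples=20, block_size=1024 * 1024):
    """Time `samples` write+fsync calls of `block_size` bytes in `directory`.

    Returns a list of per-call latencies in milliseconds.
    """
    payload = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)  # scratch file, removed afterwards
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before stopping the clock
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return mean and max latency in milliseconds."""
    return {"mean_ms": statistics.mean(latencies), "max_ms": max(latencies)}
```

Running `summarize(fsync_write_latencies_ms("/some/hdfs/data/dir"))` once per mount point (the path here is a placeholder) gives numbers that can be compared across layouts, ideally while the cluster is otherwise idle.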