Question

2 home labs, experience and list of issues/questions

  • 8 June 2020

Hi all,

Looks like 2020 is a quiet year for Nutanix. I’m rather new to Nutanix. I just created 2 clusters; here are my experiences, and I’m seeking knowledge and advice from others.

Setup 1: 3 x SFF PCs

2 x Lenovo M93p, i7-4770, 32GB

One with Evo 850 250GB and Evo 850 500GB SSDs

Another with 2 x Evo 860 500GB

 

1 x HP Z230, i7-4770, 32GB, Transcend 512GB and Evo 860 500GB SSD.

 

2 x Samsung USB3 Bar 32GB  (thumb drive)

1 x Sandisk 32GB SD card in a USB reader  (thumb drive)

 

The PCs are connected to a single TP-Link 1 Gbit/s switch.

 

Setup 2: a thin HP 14s laptop with Windows 10 Pro, VMware Workstation 15.5.2, i7-10510U (4 cores/8 threads), 64GB RAM, 512GB NVMe, 1TB SSD

3 x VMs, each with 4 vCPUs, 16GB RAM, and 200GB and 500GB VMDKs.

 

Steps:

  1. Download the CE image and the ce.vmdk onto the laptop.
  2. Create a VM and test-boot the image, then add “<pmu state='off'/>” to the file “/home/install/phx_iso/phoenix/svm_template/kvm/default.xml”. Edit the “/var/cache/libvirt/qemu/capabilities/3c76bc41d59c0c73xxxxxx...xml” file by deleting the “pc-i440fx-rhel7.2.0…” entry and changing one line from “pc-i440fx-rhel7.3.0” to “pc-i440fx-rhel7.2.0”.
  3. Create a CentOS 7 VM, mount the tested image file in it, then use fdisk and resize2fs to grow the image to 15GB. (Reason: I found the default too small; usage reached 80%+ during testing. I chose 15GB so that it fits on a 16GB USB drive.) Test the image by creating a single-node cluster VM.
  4. Use Rufus 3.5p (3.10 sometimes fails) to create the bootable USB drives.
  5. Set the laptop and the SFF PCs to UTC time. (Reason: the default installation does not ask for a timezone, and I got timing issues when using local time.)
  6. Install CE on all SFF PCs via USB, then run “cluster -s IPaddresses create”.
  7. Install another cluster on the laptop with VMware Workstation.
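
The “<pmu state='off'/>” edit from step 2 could be scripted instead of done by hand. Below is a minimal Python sketch, assuming the template is ordinary libvirt domain XML, where the pmu element belongs under features. The sample document and the function name disable_pmu are illustrative only; they are not taken from the real default.xml.

```python
import xml.etree.ElementTree as ET


def disable_pmu(domain_xml: str) -> str:
    """Return the domain XML with <pmu state='off'/> present under <features>."""
    root = ET.fromstring(domain_xml)

    # libvirt expects <pmu> inside <features>; create the parent if it is missing.
    features = root.find("features")
    if features is None:
        features = ET.SubElement(root, "features")

    # Reuse an existing <pmu> element so running this twice does not duplicate it.
    pmu = features.find("pmu")
    if pmu is None:
        pmu = ET.SubElement(features, "pmu")
    pmu.set("state", "off")

    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for the template; the real file has many more elements.
SAMPLE = """<domain type='kvm'>
  <name>example-svm</name>
  <features>
    <acpi/>
  </features>
</domain>"""

if __name__ == "__main__":
    print(disable_pmu(SAMPLE))
```

To apply it to a real file, read the file into a string, pass it through disable_pmu, and write the result back after keeping a backup copy. Note that ElementTree may change quoting and whitespace compared with the original file.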

 

Issues:

  1. Even after I set the hardware clock to UTC, an NTP configuration issue is reported. Checking with “ntpq -pn” shows that it did get the time from the default external NTP servers. I got this in the single-node cluster setup too. I’m in timezone UTC+8, so setting the clock to UTC should have resolved this, but it did not. After a reboot the alarm did not reappear, so I left it. Which file should I change in the pre-installation image to set the correct timezone?
  2. An Ubuntu 18.04 VM failed to boot after installation. I need to try again.
  3. Windows 2016 needs NGT installed for the network to work. Does this issue appear in the current version of Nutanix? It took me some time to figure out that NGT (with the NIC driver) needed to be installed; it never occurred to me, as the other OSes worked “out of the box”.
  4. Once, I had to change the file “/var/cache/libvirt/qemu/capabilities/xxxxxx...xml” twice. Another file with the old settings was created after boot, and VMs can’t be created without changing this file. This was during the single-node cluster test.
  5. During the initial setup, I can’t log in to the NEXT account to complete the cluster creation via Prism if the browser is in a different timezone (time) from the cluster. There is no error or message indicating that this is the issue.
  6. If I connect to Prism via one of the CVM IPs, I can’t then connect to Prism via the cluster IP. I believe this is due to the IP redirection.
  7. Running on the laptop, CPU utilization is very high, even after running overnight in the hope that it would stabilize. Prism loads slowly. The password sync was still waiting for a response from the other nodes after 10+ hours. On the same laptop I installed nested ESXi servers, vCenter, a Windows DNS/NTP server and FreeNAS (for iSCSI), and the CPU load is much lower. I don’t run both setups at the same time. It is not practical to run a Nutanix cluster on my laptop; maybe 8 cores and 128GB RAM could handle it.
  8. Sometimes the SFF PCs can’t boot from a USB 3 port; USB 2 has no issue.
  9. Three times I noticed an SFF PC boot to the Nutanix splash screen without letting me log in via the console, although AHV had actually booted and I was able to SSH in and start the cluster.
  10. What is svm? Is it the old name for CVM?
  11. With the CE version, should we shut down the CVM using “cvm_shutdown -P” or “shutdown -P”? I have used both: if the first fails with a “can’t get token” error, I use the other command.
  12. I got a “Latency between CVMs is higher than 15ms” alarm while trying to install 2 VMs, CentOS and Windows 2016. The CentOS installation was very slow, still at 242/308 packages after 2 hours.
  13. I got this error on the SFF PC cluster upon the first reboot of the setup: “Possible degraded node 192.168.0.## with Controller VM ID 6 reported by component zookeeper-monitor.” It asks me to contact Nutanix support.
  14. I’m not able to get into the VM BIOS or access the boot screen, for example to interrupt the boot and get into WinPE for troubleshooting.
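
For issue 1, the “ntpq -pn” check can be turned into a small script that reports which peer is actually selected and how far off the clock is. This is a rough Python sketch that assumes the standard ntpq peers layout (a tally character, then ten whitespace-separated columns, with the offset in milliseconds). The sample output uses documentation addresses and made-up numbers, not values from my cluster, and the function names are illustrative.

```python
def parse_ntpq_peers(text):
    """Parse the peer table printed by `ntpq -pn` into a list of dicts."""
    peers = []
    for line in text.splitlines():
        # Skip blank lines, the column header and the ===== separator.
        if not line.strip() or line.lstrip().startswith("remote") or line.startswith("="):
            continue
        tally = line[0]            # '*' = selected system peer, '+' = candidate, ' ' = rejected
        fields = line[1:].split()  # remote refid st t when poll reach delay offset jitter
        if len(fields) != 10:
            continue
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "reach": int(fields[6], 8),     # reach is an octal bitmask of the last 8 polls
            "offset_ms": float(fields[8]),  # offset is reported in milliseconds
        })
    return peers


def system_peer(peers):
    """Return the peer marked '*' (the one the clock is synced to), or None."""
    for peer in peers:
        if peer["tally"] == "*":
            return peer
    return None


# Illustrative output only; addresses are from the documentation range.
SAMPLE_OUTPUT = "\n".join([
    "     remote           refid      st t when poll reach   delay   offset  jitter",
    "==============================================================================",
    "*192.0.2.10      .GPS.            1 u   33   64  377    1.234    0.567   0.089",
    "+192.0.2.11      192.0.2.10       2 u   40   64  377    2.345   -1.234   0.123",
])

if __name__ == "__main__":
    peer = system_peer(parse_ntpq_peers(SAMPLE_OUTPUT))
    if peer is None:
        print("no system peer selected; the clock is not synced yet")
    else:
        print("synced to", peer["remote"], "offset (ms):", peer["offset_ms"])
```

If no line carries the “*” tally, the daemon has not selected a time source yet, which may be worth ruling out before looking at timezone settings.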

 

I won’t be using the cluster on the laptop; it is just too slow.

 

Is there a way to get a Prism Central eval, other than via the demo website?

Is there a way to set up a Nutanix lab with ESXi?

 

Any advice? Is my hardware setup too old or too slow?

 

Thanks!


4 replies

I forgot this: I think I need to adjust the CVM to use less memory in the VMware Workstation setup. But I doubt it would help much; the current SFF PC setup is rather slow even with 12 cores in total.

I’m thinking of getting a 10th-gen CPU with 8 cores/16 threads and 128GB RAM, running ESXi 6.7, to host a 3-node cluster, but I wonder whether that is enough. My little home can’t house any real servers.. B( and they are too costly.

Hi @YewHang1,

 

By buying a server/PC with 128GB RAM and at least a 500GB SSD, you can easily create multiple AHV nodes and will be able to create some clusters.

You will be able to learn about all the features of Nutanix.

 

kind regards

Thanks Bauke. I was pondering over it, as most ESXi labs are servers with many cores. I’m thinking of a Ryzen with 12 cores, but I worry about compatibility or wasting too much time tweaking it. At the moment I have not found any Intel 10th-gen desktop CPU with vPro (at my location), which is my poor man’s iLO. Currently I use AMT to remote into those PCs.

Update, everyone. It looks like part of the issue is that one of the PCs’ NICs suddenly dropped to 100 Mbit/s, maybe because of the cable. I changed the cable and rebooted the setup. The latency alarm is what prompted me to check the connectivity. The speed drop caused the VM to suddenly load much more slowly. The health check did not seem to flag the change of speed, or am I missing something here?

The node with the speed change wasn’t marked as degraded; instead another node was, so it ended with a very slow VM installation time. No idea why it did not degrade the node with this issue. Any advice?
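
Since the health check did not point at the slow NIC, a quick manual check of the negotiated link speed on each Linux host may help. The sketch below reads the standard Linux sysfs attribute /sys/class/net/<interface>/speed, which reports Mbit/s. That it can be run on the AHV hosts is an assumption; the 1000 Mbit/s expectation matches my 1 Gbit/s switch, and the function names are illustrative.

```python
from pathlib import Path


def link_speeds(sysfs_root="/sys/class/net"):
    """Map each network interface to its negotiated speed in Mbit/s (None if unknown)."""
    root = Path(sysfs_root)
    speeds = {}
    if not root.is_dir():
        return speeds  # not a Linux host, or sysfs is not mounted
    for iface in sorted(root.iterdir()):
        try:
            speeds[iface.name] = int((iface / "speed").read_text().strip())
        except (OSError, ValueError):
            # Virtual or down interfaces often have no readable speed value.
            speeds[iface.name] = None
    return speeds


def slow_links(speeds, expected_mbps=1000):
    """Return interfaces that negotiated a positive speed below the expected one."""
    return [name for name, mbps in speeds.items()
            if mbps is not None and 0 < mbps < expected_mbps]


if __name__ == "__main__":
    speeds = link_speeds()
    for name in slow_links(speeds):
        print(name, "negotiated only", speeds[name], "Mbit/s")
```

A port that shows 100 here on a gigabit switch usually points to the cable or the port rather than to the cluster software.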

I built a home lab with HP Z440 workstations I picked up off Amazon. They have 64GB of memory in them, so I don’t have to do anything unnatural for memory configuration. They are expandable to 128GB. Prism Central can be used with CE; you don’t need an eval.
