2012 VMs intermittently locking up resolved

  • 30 January 2016
  • 5 replies
  • 1494 views

Userlevel 2
Badge +13
I've had an intermittent problem with some 2012 VMs I had migrated: they lock up after a random time, usually going to 100% CPU with all IO stopping. The VM console is unresponsive and the network is not reachable; a reset is the only way to bring the VM back online. It would then run again for a number of minutes, hours, or even a day or more, only to do the same thing again at a random time of the day or night, so there was no logical pattern to it.

I ended up tracking it down to a timezone config issue. When the VMs in question rebooted, their clocks would be set to the hypervisor host time, which was in a different timezone (the default TZ the nodes were shipped with) from the VM TZ. I had a similar problem in the past, although without the lockups, on our Citrix XenServer cluster, where a number of hosts would lose time sync and be out by the TZ difference between the hypervisor host and the VM. Because our VMs are on an AD domain, they would end up outside the NTP adjustment range because the time difference was too big.

As part of my cluster setup I had changed the cluster timezone to be the same as mine (Australia/Perth), but failed to change the TZ on the AHV hosts.
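
The size of the clock jump described above is just the difference between the UTC offsets of the two zones. Here is a minimal Python sketch of that arithmetic, assuming Python 3.9+ with the standard `zoneinfo` module and a timezone database available on the system. The function name `tz_gap` is made up for illustration, and the thread does not say which default zone the nodes shipped with, so `UTC` and `America/Los_Angeles` below are placeholders only:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def tz_gap(host_tz: str, vm_tz: str, at: datetime) -> timedelta:
    """Return how far vm_tz wall-clock time is ahead of host_tz at instant `at`."""
    host_offset = at.astimezone(ZoneInfo(host_tz)).utcoffset()
    vm_offset = at.astimezone(ZoneInfo(vm_tz)).utcoffset()
    return vm_offset - host_offset


# Date of the original post; offsets can differ at other times of year because of DST.
when = datetime(2016, 1, 30, tzinfo=timezone.utc)

# Placeholder host zones (assumptions, not taken from the thread).
print(tz_gap("UTC", "Australia/Perth", when))
print(tz_gap("America/Los_Angeles", "Australia/Perth", when))
```

If the skew you see on a VM after a reboot matches one of these gaps, a host/VM timezone mismatch is worth checking.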

After changing the AHV hosts to the correct TZ and rebooting the VMs, all appears to be fine now with no more host lockups, and the VMs are also on the correct time after each and every reboot 😉

5 replies

Userlevel 2
Badge +13
hmmm spoke too soon....sigh

Lockups have gone since the TZ changes, but that may just be a coincidence.

The time is still out by -8 hours on VM reboot, which just so happens to be the offset for my timezone (Australia/Perth, GMT+8).
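
An error of exactly one timezone offset is what you would get if a guest read a hardware clock holding UTC and treated the reading as local time. That is only one possible explanation and nothing in this thread confirms it. A small Python sketch of the arithmetic, using a fixed +8 hour offset for Australia/Perth; `guest_clock_error` is a hypothetical helper, not part of any tool:

```python
from datetime import datetime, timedelta, timezone

PERTH = timezone(timedelta(hours=8), "AWST")  # fixed offset, GMT+8


def guest_clock_error(true_instant: datetime, guest_tz: timezone) -> timedelta:
    """Error in the guest clock if it reads a UTC hardware clock as local time."""
    # What the (assumed) UTC hardware clock shows, with no zone attached.
    rtc_wall = true_instant.astimezone(timezone.utc).replace(tzinfo=None)
    # The guest wrongly labels that wall-clock reading as its own local time.
    guest_belief = rtc_wall.replace(tzinfo=guest_tz)
    return guest_belief - true_instant


now = datetime(2016, 1, 30, 12, 0, tzinfo=timezone.utc)
error = guest_clock_error(now, PERTH)
print(error.total_seconds() / 3600)  # -8.0
```

The sign matches the symptom: the guest ends up 8 hours behind the real local time.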

Will investigate further.
Userlevel 6
Badge +29
Bring this up on your case with Support (looks like it's assigned to Chetan). I've got a sneaking suspicion this is VirtIO driver related.
Badge +1
Hi there, we have just had a couple of 2008 R2 hosts with the same issues you appear to have had. Are you still having this issue, or was it resolved? Was it the timezone issue or something else? The VMs went to 100% CPU and we were totally locked out of them and had to reboot them; these were 2 different servers on different NTX blades.
Userlevel 6
Badge +29
Is support engaged? If not, please file a support ticket right away so that we can link up with you to debug.

Jon
Badge +1
Thanks, I have just logged a support case. Case number 00079167