Solved

CVM root partitions usage high after clusters upgraded to AOS 5.15

  • 20 April 2020
  • 8 replies
  • 3136 views

Badge

Two weeks ago I upgraded our two clusters to AOS 5.15. Since then, both clusters have been raising constant warnings that root partition usage is high (exceeded 80%) on almost all CVMs. Below is the output of ‘df -h’ on one of the CVMs:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         18G     0   18G   0% /dev
tmpfs           512M     0  512M   0% /dev/shm
tmpfs            18G  1.2M   18G   1% /run
tmpfs            18G     0   18G   0% /sys/fs/cgroup
/dev/md1        9.8G  7.4G  1.9G  80% /
/dev/loop0      240M  2.3M  221M   2% /tmp
/dev/md2         40G   23G   17G  57% /home
tmpfs           3.6G     0  3.6G   0% /run/user/1000
/dev/sdg1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A01WFB9D
/dev/sdh1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A01TFB9D
/dev/sdf1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A01YFB9D
/dev/sde1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A001FB9D
/dev/sdd1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A024FB9D
/dev/sdc1       5.5T  1.4T  4.0T  26% /home/nutanix/data/stargate-storage/disks/17Q0A006FB9D
/dev/sda4       1.7T  1.1T  649G  62% /home/nutanix/data/stargate-storage/disks/PHYG928601A81P9DGN
/dev/sdb4       1.7T  1.1T  650G  62% /home/nutanix/data/stargate-storage/disks/PHYG9286018H1P9DGN
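
For anyone who wants to spot the affected mount points quickly, here is a minimal sketch that filters df-style output by the 80% threshold mentioned in the warning. The function name over_threshold is made up for illustration and is not a Nutanix or NCC tool; it only reads the Use% and mount point columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of AOS/NCC): print mount points whose
# Use% is at or above a threshold. Reads df-style output on stdin.
over_threshold() {
    local threshold="${1:-80}"
    awk -v t="$threshold" 'NR > 1 {
        pct = $5                 # fifth column is Use% (e.g. "80%")
        sub(/%/, "", pct)        # strip the percent sign
        if (pct + 0 >= t) print $6, pct "%"
    }'
}

# Example on a live system (-P keeps each filesystem on a single line):
#   df -P | over_threshold 80
```

Fed the output above, this should list only the / mount on /dev/md1.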
 

The NCC version on both clusters is 3.9.4.1.

I’ve checked KB 1523, but it only suggests contacting Nutanix Support.

Does anyone have any idea? Is this a bug in AOS 5.15?


Best answer by Askar_sre 20 April 2020, 09:12


8 replies

Userlevel 1

Hi Guys,

Run the following commands from any one of the CVMs to resolve the issue (allssh applies each command on every CVM in the cluster):


1. Remove excess journal logs:

nutanix@cvm$ allssh 'sudo journalctl --vacuum-size=512M'

2. Make this change persistent:

nutanix@cvm$ allssh "sudo sed -i 's/1024M/512M/' /etc/systemd/journald.conf"

nutanix@cvm$ allssh 'sudo systemctl restart systemd-journald'
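
If you would like to see what the sed substitution will touch before running it, a rough sketch follows. The function name preview_journald_change is hypothetical, and the assumption that the matching line is a setting such as SystemMaxUse is mine; the thread does not say which setting carries the 1024M value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show the lines the 1024M -> 512M substitution
# would produce, without modifying the file.
preview_journald_change() {
    local conf="${1:-/etc/systemd/journald.conf}"
    # -n suppresses normal output; the p flag prints only changed lines
    sed -n 's/1024M/512M/p' "$conf"
}

# Example on a CVM (read-only):
#   preview_journald_change
# After the real change, journal size can be checked with:
#   sudo journalctl --disk-usage
```

If the preview prints nothing, the file has no 1024M value and the sed step would be a no-op on that CVM.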

 

A Support article will be published soon on this.

Userlevel 1

Awesome.

We have just now published the following article on this:

https://portal.nutanix.com/kb/7604

Badge

Hi Welsper,

 

I have the same problem after upgrading to 5.15, but since I could not find anyone else with the same problem, or any blogs or support articles, I thought it was just my instance.

 

I haven’t found anything yet that could be causing this and was going to log a ticket with Nutanix Support later this week.

 

You could be on the money for it being a bug!

 

Regards,

Billy

Badge +1

Hi,

 

Hope you are keeping well in these ‘interesting times’!

 

Just echoing the above and flagging an open case I raised last week for the same issue, which appeared after the upgrade to AOS 5.15. Since the upgrade, several (but not all) CVMs in a 16-node metro cluster are reporting:

 

Disk space usage for root on Controller VM x.x.x.x has exceeded 80%.

 

Possible Cause

Increased CVM system root partition usage due to excessive logging or incomplete maintenance operation.

 

I understand how this alert works but it’s a bit frustrating that after a standard upgrade, we have to engage with support for something that wasn’t an issue or flagged before. This isn’t the first time we have been in this situation. 

The suggestion above is certainly one to be tried!

 

Best regards,

 

Chris

Badge +1

Hi Askar_sre and all,

Worked for me!

Many thanks

Chris

Badge

Thanks Askar_sre.

 

This worked for me as well.

Badge

Hi Askar_sre,

It worked for me, thanks for your quick answer!

Userlevel 1

Thank you All.

Glad to have helped you.