oplog_episode_count_check for a disk not connected to a VM

Been having oplog errors for nearly 5 months now. I have read through the article NCC Health Check: oplog_episode_count_check and have been able to get the vdisk_id. The problem is that it is not connected to any VM. I have looked through all the disks connected to all my VMs and none of them are this disk.
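(For reference, this is roughly how I walked the VM disks. A minimal sketch, assuming an AHV cluster; the awk and grep patterns are illustrative and may need adjusting for your acli output:)

    # Sketch: print each VM name followed by its vmdisk UUIDs so they can be
    # compared against the vdisk in question. VM names containing spaces
    # would need extra handling.
    nutanix@cvm$ for vm in $(acli vm.list | awk 'NR>1 {print $1}'); do
                   echo "== $vm =="
                   acli vm.get "$vm" | grep -i vmdisk_uuid
                 done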

Running vdisk ls does give me this information though: 


    Name                      : 00054ce8-d482-d9dc-135d-1866da8ea766::NFS:2:0:457
    Container ID              : 00054ce8-d482-d9dc-135d-1866da8ea766::1062
    Container Uuid            : 75138a04-e025-4510-a9d1-05e319573732
    Max Capacity              : 4 TiB (4,398,046,511,104 bytes)
    Reserved Capacity         : -
    Read-only                 : false
    NFS File Name             : counters-4
    NFS Parent File Name (... :
    Fingerprint On Write      : none
    On-Disk Dedup             : none
 

That doesn’t match any of the disks on my VMs either. The NFS File Name makes me think this is some kind of system disk. Can anyone help me understand where to look to figure out what is going on with this?

Hi @BrentNorrisKY 

What’s the NCC version, and the AOS and hypervisor versions?

When you query vdisk_config_printer with the disk ID you get no output?

nutanix@cvm$  vdisk_config_printer | grep -A 12 " vdisk_id: ABCDEFG "
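If the vdisk does exist, the grep should return a config entry along these lines (the values below are purely illustrative, patterned on the vdisk ls output above; the exact field set varies by AOS version):

      vdisk_id: 12345678
      vdisk_name: "NFS:2:0:457"
      container_id: 1062
      vdisk_size: 4398046511104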

 


NCC 3.9.5

AOS 5.18.1.2

AHV el7.nutanix.20190916.360

I haven’t updated in a couple months because updating hasn’t seemed to correct this at all in the past.

To answer your other question, that is correct. There is no vdisk with that ID listed when you run vdisk_config_printer. That leads me to think that some hidden/special disk is failing, but I don’t know how to find it. The “counters-4” name also makes me feel that way, as it isn’t anything I would name a file.


Just to be clear, does the NCC message say Error or Fail? Could you include the actual NCC message?

You are right, it is an internal, special-purpose counters disk.
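One quick way to spot these internal counters vdisks, assuming the same vdisk ls output format quoted above, is to grep back from the NFS File Name field (the ncli invocation below is an assumption; the exact entity name can differ by AOS version):

    # Illustrative: print the Name .. NFS File Name block for any vdisk whose
    # NFS file name contains "counters" (-B 6 reaches back to the Name line).
    nutanix@cvm$ ncli vdisk ls | grep -B 6 "NFS File Name.*counters"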


Running : health_checks stargate_checks oplog_episode_count_check
[==================================================] 100%
/health_checks/stargate_checks/oplog_episode_count_check              [ FAIL ]
------------------------------------------------------------------------------+

Detailed information for oplog_episode_count_check:
Node 10.76.17.53:
FAIL: Oplog episode count exceeds threshold (1200) for the following vdisks:
Id 38425718, episode count 5146
Refer to KB 1541 (http://portal.nutanix.com/kb/1541) for details on oplog_episode_count_check or Recheck with: ncc health_checks stargate_checks oplog_episode_count_check
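(For the record, this is the concrete form of the earlier vdisk_config_printer query with the reported Id plugged in; per the posts above, it comes back empty for this particular vdisk:)

    nutanix@cvm$ vdisk_config_printer | grep -A 12 " vdisk_id: 38425718 "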
 


So I went ahead and upgraded everything through LCM. One of the items was the maint package.

 

After all that, the check currently passes:

 

Running : health_checks stargate_checks oplog_episode_count_check
[==================================================] 100%
/health_checks/stargate_checks/oplog_episode_count_check              [ PASS ]
------------------------------------------------------------------------------+
+-----------------------+
| State         | Count |
+-----------------------+
| Pass          | 1     |
| Total Plugins | 1     |
+-----------------------+
Plugin output written to /home/nutanix/data/logs/ncc-output-latest.log
 


Excellent news! I’d still run it a few times in a row, as the KB suggests, just to be sure.
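For example, a minimal loop along these lines (the run count and sleep interval are arbitrary):

    nutanix@cvm$ for i in 1 2 3; do
                   ncc health_checks stargate_checks oplog_episode_count_check
                   sleep 300   # five minutes between runs
                 done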


Sadly, the error has returned. So I am updated to the very latest of everything.


Are you able to open a support case with us? This does not look like a generic issue.