How to remove DIMM from node | Nutanix Community

Hi Community,

Version and model - NX-1065-G6

I have a faulty DIMM in one node. 

 Memory | Uncorrectable ECC (@DIMMC1(CPU2)) | Asserted

I would like to remove it from the node and restart the node, i.e. not replace it with a new memory module.

  • Is this possible?
  • Risks?
  • How should I configure the node with the missing DIMM, or is it something that Nutanix takes care of automatically?
  • Any other advice? Recommendations?

I found this document. Is this the correct one to follow?

Cheers!

Hi Sammy777,


This is the right guide (assuming you run ESXi hypervisor).

Follow the guide and you will be OK for memory replacement.

Take note of the DIMM serial numbers (old and new) and compare them with the serial number shown in IPMI after the replacement.

Run NCC checks, and look at the IPMI SEL logs and hardware status after the replacement.

Only once everything looks good take the node out of maintenance mode.
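
If you save the SEL output as text, a small script can pick out the ECC entries so you can confirm the event is gone after the work. This is only a sketch: the pattern is modelled on the single SEL line quoted at the top of this thread, and other firmware versions may format entries differently. The function name `find_ecc_events` is made up for illustration.

```python
import re

# Pattern modelled on the one SEL line quoted in this thread, e.g.
#   Memory | Uncorrectable ECC (@DIMMC1(CPU2)) | Asserted
# Other BMC firmware may use a different layout (assumption).
ECC_EVENT = re.compile(
    r"(?P<kind>\w+) ECC \(@DIMM(?P<slot>\w+)\((?P<cpu>CPU\d+)\)\)"
    r"\s*\|\s*(?P<state>\w+)"
)


def find_ecc_events(lines):
    """Return (slot, cpu, kind, state) for each ECC entry in SEL text lines."""
    events = []
    for line in lines:
        match = ECC_EVENT.search(line)
        if match:
            events.append(
                (match["slot"], match["cpu"], match["kind"], match["state"])
            )
    return events


if __name__ == "__main__":
    sample = [" Memory | Uncorrectable ECC (@DIMMC1(CPU2)) | Asserted"]
    for slot, cpu, kind, state in find_ecc_events(sample):
        print(f"DIMM {slot} on {cpu}: {kind} ECC, {state}")
```

An empty result after the work is a hint, not proof, that the SEL is clean; still review the full log and the NCC output.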

 

To remove the DIMM, you would have to remove DIMMs from both channels so that channel capacity remains symmetrical.

Also, check supported memory configurations in the same guide.

 


Hi @Alona 

Thanks for the reply.

I do not want to replace the DIMM, but just take it out of the node.

I run AHV, version Nutanix 20170830.171.

What do you mean by:

"To remove the DIMM, you would have to remove DIMMs from both channels so that channel capacity remains symmetrical."

Can you please elaborate?

Cheers!


Maybe this article will help you: https://systemx.lenovofiles.com/help/index.jsp?topic=%2Fcom.lenovo.conv.8695.doc%2FReplacingAMemoryDIMM.html


Hey @Sammy777

Yea, you can do this. The “symmetrical” thing is for performance reasons; you want to make sure that all channels are “balanced” and have the same number of DIMMs. It’ll work if it’s not balanced, but will likely be slower. This doc shows the supported balanced configurations of memory on that model:
https://portal.nutanix.com/page/documents/details?targetId=System-Specs-G6-Multinode:har-dimm-config-overview-g6-c.html
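
To make the "same number of DIMMs per channel" idea concrete, here is a rough sketch that counts DIMMs per channel for one CPU. It assumes slot labels such as `C1` mean channel C, slot 1, and the channel list passed in is hypothetical. It only checks the counting rule; it does not say whether a configuration is supported, so the document above remains the authority.

```python
def dimms_per_channel(populated_slots, channels):
    """Count populated DIMMs per channel for a single CPU.

    Assumes the first character of a slot label is its channel,
    e.g. 'C1' is channel C, slot 1 (labelling assumption).
    """
    counts = {channel: 0 for channel in channels}
    for slot in populated_slots:
        channel = slot[0]
        counts[channel] = counts.get(channel, 0) + 1
    return counts


def is_balanced(populated_slots, channels):
    """True when every channel holds the same number of DIMMs."""
    counts = dimms_per_channel(populated_slots, channels)
    return len(set(counts.values())) <= 1


if __name__ == "__main__":
    channels = "ABC"  # hypothetical channel layout for one CPU
    print(is_balanced(["A1", "B1", "C1"], channels))  # one DIMM per channel
    print(is_balanced(["A1", "B1"], channels))        # C1 pulled, C is empty
```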

As far as getting Nutanix to recognize that the DIMM is removed permanently and stop sending alerts, there’s a way to do it, but it’s not public and I don’t remember what it is. Hit up Nutanix Support for that.


@JacksonWrath1607 

Thanks, that is helpful.

 


In case you solved this issue, I would kindly ask you to mark @JacksonWrath1607's reply as the answer so people with similar issues can find the solution faster.


Update

It did not work.

After I removed the faulty DIMM, 5 DIMMs were left in the host.

I could not start the host. It was throwing errors and got into a boot loop.

I removed the parallel DIMM to create a symmetrical configuration, but that did not help either.

It looks like Nutanix does not support an asymmetric DIMM configuration.

Thanks!