Solved

Basic capacity calculation and RF question

  • 11 June 2016
  • 4 replies
  • 5365 views

Hello everyone,
My environment:
6 * 3060-G4 nodes
Per node: 2 * 480 GB SSD, 4 * 1 TB HDD (how can I find out whether they are SAS or SATA drives?)
RF=2
---------------------------------------------------------------------------------------------------------
How do I calculate the capacity?
Prism reports 22.18 TiB Max Capacity (Physical), but I think the physical capacity should be 1 TB * 4 * 6 + 480 GB * 2 * 6 = 24 TB + 5760 GB.

If RF2 needs two complete copies (like RAID 1), is the usable capacity just half of 24 TB + 5760 GB? And to tolerate one node failure (N+1), should I also reserve one extra node's worth of capacity (4 TB + 480 GB * 2)?

I also have a basic question and a guess. Suppose I have three nodes in total (not six) and one node fails. Because of RF, the lost data will be re-replicated to one of the remaining two nodes, so I have two copies again.
But if a second node also fails later (leaving only one node), will the cluster go down? Will data be lost? I think data shouldn't be lost, because one of the two copies still exists (3 nodes, 2 copies -> 2 nodes, 2 copies -> 1 node, 1 copy). What a basic and silly guess!

A similar question: can a Nutanix cluster be installed on, or work with, 2 nodes? Presumably not, but I just can't understand why.

The customer challenges me on that too; it seems to waste too much space.

Any response would be appreciated!

Best answer by Jon 13 June 2016, 03:29



First off, if you haven't done so already, I'd recommend you read this webpage start to finish: http://nutanixbible.com

There are also localized versions listed at the top of the page, covering a range of languages.

Note: Localized versions might be a little behind the main page, as they are updated at different times. So check the main English page first, and then reference the localized versions as needed.

The Nutanix Bible will really help you understand some of the constructs within the Nutanix platform, and I think will answer many questions.


Let's address physical capacity now
Before going further, read this particular sub-section of the Nutanix Bible:
http://nutanixbible.com/#anchor-drive-breakdown-52


Now that you've read that section, you'll note that the SSDs within a Nutanix system serve multiple purposes.

At a high level, they hold two kinds of data: system data (specifically metadata, caching, and Nutanix OS home files) and user data, which is where you'd store virtual machine data.

When we report "Physical Capacity" in Prism, we are SPECIFICALLY talking about User Data, and have already subtracted the system data.

Why would we do that? Because reporting the actual physical capacity of the disks would be misleading to users: the system takes up some amount for the metadata, caching, and Nutanix OS files, and that space is not available for user data.

You can do a quick calculation of physical space by entering your specifications into this tool, which was drawn up by one of our employees: http://designbrews.com

Note, we use the exact same calculations from Design Brews in https://services.nutanix.com/#/, which is the "Nutanix Sizer" tool.

This is the single tool that Nutanix Sales, Professional Services, and our Partners use to size and design Nutanix systems. I only posted Design Brews because it doesn't require a login, and gives you a very quick view of capacity in Nutanix.


TB vs TiB
Note: As you may know, there is a difference between TB and TiB (terabyte vs tebibyte, respectively). The difference here is material, as it is base-10 math vs base-2 math: 1 TB is 10^12 bytes, while 1 TiB is 2^40 bytes.

Keep that in mind when looking at any sort of capacity numbers, as you will need to know the unit.
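
As a quick illustration of how much the unit matters, here is a small sketch that adds up the labelled disk sizes from the original question and prints the total in both units. It assumes the drive labels are base-10 (1 TB = 10^12 bytes, 480 GB = 480 * 10^9 bytes), which is an assumption on my part, and the function name is just made up for this example.

```python
TB = 10**12   # terabyte, base 10
GB = 10**9    # gigabyte, base 10
TIB = 2**40   # tebibyte, base 2

def raw_capacity_bytes(nodes, hdd_count, hdd_bytes, ssd_count, ssd_bytes):
    """Sum of the labelled disk sizes across all nodes, before any overhead."""
    return nodes * (hdd_count * hdd_bytes + ssd_count * ssd_bytes)

# 6 nodes, each with 4 x 1 TB HDD and 2 x 480 GB SSD (from the question).
raw = raw_capacity_bytes(nodes=6, hdd_count=4, hdd_bytes=1 * TB,
                         ssd_count=2, ssd_bytes=480 * GB)

print(f"raw: {raw / TB:.2f} TB = {raw / TIB:.2f} TiB")
# prints: raw: 29.76 TB = 27.07 TiB
```

So the same set of disks is 29.76 in one unit and about 27.07 in the other. The 22.18 TiB that Prism shows is lower still because, as described above, the system data has already been subtracted.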



Capacity for a 6 node 3060-g4 cluster
If you input the numbers into designbrews.com, you will find that the effective capacity (for user data) using RF2 should be as follows:

Effective Capacity: 11.62TB (10.57TiB)

NOTE: This is before any data reduction technologies, like in-line compression (which we recommend in most cases), deduplication, and Erasure Coding.

Using those technologies may allow you to store even more data than ~11.62 TB, but this depends entirely on the type of data being stored, as not all data is compressible, not all data can be deduplicated, and not all data is suitable for erasure coding.
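
To put rough numbers on that, here is a minimal sketch. The `reduction_ratio` values are purely hypothetical placeholders that I picked for illustration, not measured or promised results, and the helper names are made up for this example.

```python
TB = 10**12    # terabyte, base 10
TIB = 2**40    # tebibyte, base 2

RF2_EFFECTIVE_TB = 11.62  # Design Brews figure quoted above, before data reduction

def with_data_reduction(effective_tb, reduction_ratio):
    """Logical capacity if data shrinks by reduction_ratio (1.0 = no savings)."""
    return effective_tb * reduction_ratio

def tb_to_tib(tb):
    """Convert a base-10 terabyte figure to base-2 tebibytes."""
    return tb * TB / TIB

# Hypothetical ratios for illustration only; real savings depend on the data.
for ratio in (1.0, 1.5, 2.0):
    logical_tb = with_data_reduction(RF2_EFFECTIVE_TB, ratio)
    print(f"ratio {ratio:.1f}: {logical_tb:.2f} TB ({tb_to_tib(logical_tb):.2f} TiB)")
```

The first line of output (ratio 1.0) simply restates the 11.62 TB (10.57 TiB) figure; the other lines only show how the arithmetic scales if your data happens to reduce that well.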


RE Failure Behavior in a 3 node cluster
Read this entire section in the Nutanix Bible: http://nutanixbible.com/#anchor-distributed-storage-fabric-53

Pay particular attention to the parts on data protection, metadata, and data path resiliency.



Now that you have read that, you are right: if you have a 3 node cluster and you lose two nodes, the cluster will stop, as it has no ability to protect data with only one node (i.e. RF2 will have nowhere to write the second copy of the data).
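
The placement constraint behind that can be sketched in a few lines. This is only a toy model of the one rule described above (each copy has to live on a different node), not how the platform actually decides anything; a real cluster has other requirements as well, and the function name is made up for this example.

```python
def can_keep_rf(healthy_nodes, rf):
    """Toy rule: rf copies on distinct nodes need at least rf healthy nodes."""
    return healthy_nodes >= rf

# Walk a 3 node RF2 cluster through two successive node failures.
for healthy in (3, 2, 1):
    if can_keep_rf(healthy, rf=2):
        status = "two copies can still be placed on different nodes"
    else:
        status = "nowhere to write the second copy"
    print(f"{healthy} healthy node(s): {status}")
```

With three or two healthy nodes both copies still have somewhere to go; with one node left, they do not.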

Assuming you enable alerts and call home (called Pulse), you will rarely if ever run into this issue, because when the first node starts having problems, both you and Nutanix support will be notified.


RE 2 Node
No, Nutanix does not currently support 2 node deployments.

Besides cost, we have yet to find an actual technical requirement for a 2 node cluster that a 3 node cluster won't already solve.

As I mentioned in another post of yours, we just released several SMB and ROBO bundles, as well as the SX series, which should resolve product cost issues in the vast majority of situations.

RE RF2 "wasting space"
Keep in mind that with hyperconverged infrastructure, and scale-out technologies in general, a lot of the "existing" principles and designs do not apply.

Both Nutanix and other HCI vendors use a similar type of "no RAID" model, where data is protected by keeping multiple copies, as with RF2 or, in some customer cases, RF3.


Expanding on what I said before about data reduction technologies like compression: in a scale-out architecture like Nutanix, these technologies actually work, since you are not bottlenecking compression, deduplication, and erasure coding on one or two controllers, as you do in traditional storage arrays.

Instead, all controllers participate in this effort, so these technologies are quite efficient, which greatly reduces the impact of RF2.
Userlevel 1
Badge +14
Hi Jon,
Thank you for the reply; it really helps my understanding.
As you suggested, I'll read the Bible first and come back.

RE Failure Behavior in a 3 node cluster
After two nodes fail, the CVM cluster is down. I just wonder whether data will be lost, and how to recover it in an RF2-only scenario. In my opinion, there is still one copy of the data on the remaining node, so the data should survive even though RF2 can't keep working. All I should need to do is replace the two failed nodes.

Of course, I'll read the Bible first and try to understand it.
Userlevel 6
Badge +29
This cascading failure in a 3 node cluster is rare, given my previous comments.

If that were to happen, you would have support on the line, and they would guide you on recovery.
Userlevel 1
Badge +14
Hi Jon,
Roger that!
Cascading failure really is rare, but strange things do happen sometimes.
Thanks again.