Designing Backup and DR for NUTANIX "Part 1" - General Aspects



Introduction


Backup and disaster recovery are still two very important aspects to think about when you implement IT infrastructure for businesses, so I want to write a blog about them and share some conceptual and technical knowledge I have acquired during my last years in the IT industry.
This blog will focus mainly on hyper converged environments based on NUTANIX with the corresponding hypervisors VMware, Hyper-V and AHV, and on concepts to consider when planning backup and disaster recovery for those infrastructures.

The blog is split into 4 parts:
Part 1 - General aspects
Part 2 - Technical aspects (NUTANIX specifics)
Part 3 - Choosing NUTANIX features and additional software products
Part 4 - Conclusion

I will go through general concepts first and then move on to the specific requirements for NUTANIX based environments.
Later on I will concentrate on specific software products (mainly in Part 3) and how they can align with those requirements. In the end there will be a short conclusion and wrap-up.




The “next-gen” backup world

Over the last 5 years, things in the IT industry have changed in a way and at a speed never seen before. This also applies to datacenter infrastructure. As virtualized workloads continue to grow and business requirements demand more and more agility and flexibility, many of the classical multi-tier architectures are reaching their limits. Converged, hyper converged and cloud architectures are more and more common.
In that ever-changing virtualized IT world, different approaches to backup and disaster recovery are required.
Many businesses defined their backup and DR processes for the “old world”, for example:
  • Backup-to-disk-to-tape concepts
  • Agent-based backups
  • Physical “silo-based” infrastructures
These processes and concepts often cannot keep up with a highly virtualized and flexible environment based on hyper converged or cloud infrastructures. Moreover, businesses increasingly need to define “near-zero” RPO/RTO times.
Modern backup tools already focus on these new demands, and their feature sets target those highly virtualized infrastructures.
So when a business implements hyper converged (hopefully NUTANIX), a change in how backup and DR are done is always a key aspect to think about.


General aspects for backup and disaster recovery in HCI environments


When you have the task of designing a backup and/or disaster recovery concept for a hyper converged infrastructure, many aspects are different compared to classical multi-tier infrastructures.
Some of them (the list is not exhaustive) are:

  • Compute, software-defined storage and (sometimes) software-defined networking all in one box
  • Object file system based technology instead of a native block-storage or NAS storage backend
  • Very high consolidation ratio of applications
  • Many different workloads to consider
  • Sometimes different hypervisors in one consolidated environment
  • Container technologies alongside classical hypervisors
  • Fewer APIs and programming interfaces
  • Sometimes highly virtualized network environments with strict security policies
I could list numerous more aspects of hyper converged infrastructures, but this will be enough for now.
Why is this important for a valid and well-planned design?
What I want to point out here is that you should always keep “the big picture” in mind. You have to align the backup and DR processes with the business requirements and not focus solely on technical requirements. In addition, you should focus on how hyper converged infrastructure changes or influences those processes.
This can be a daunting task in the first stage. However, the better you work out the business requirements and align them with the technical requirements, the better you will be able to design a high-quality solution.


I will list some of the business requirements and areas to focus on:

Who are the stakeholders who will use the new HCI environment?
  • Try to identify them all 😉. If not all, then as many as you can.
  • What is the stakeholders’ knowledge of backup and DR systems/concepts?
  • What is the stakeholders’ knowledge of HCI systems?
Can all application owners define their backup and DR requirements?
  • Who defines backup and DR requirements?
  • What are the responsibilities of application owners, IT staff and others?
RPO/RTO definition
  • Are RPO/RTO already defined?
  • How are they influenced by HCI?
Are backup and DR documents available?
  • Who is responsible for documentation?
  • Do you have access to the documents?
Strategic business drivers for hyper converged infrastructure
  • The backup and DR concept should support those business drivers.
  • Are there strategic constraints or assumptions you have to consider?

RPO/RTO and SLA

RPO/RTO as well as SLA definitions are always a key aspect to consider when planning backup and DR for your environment.
I want to highlight the meaning of those three common “buzzwords” in IT:

RTO:
The recovery time objective (RTO) is the maximum acceptable time that may pass between a failure (a crash of a computer or an application, or a network interruption) and the point at which the system can continue normal operation and provide access to the data.
The RTO value is specified in a unit of time: seconds, minutes, hours, or days.

RPO:
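The recovery point objective (RPO) is the maximum acceptable amount of data loss after a failure of an IT system or the IT infrastructure, expressed as a period of time. It defines how far back the most recent recoverable point in time may lie. Like the RTO, it is specified in a unit of time, and the two values are closely related and should be defined together for each application.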
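
To make the RPO idea a bit more tangible, here is a minimal Python sketch that checks whether a backup schedule can meet a given RPO. It rests on a simplifying assumption that is mine and not a NUTANIX or vendor formula: a restore point reflects the state at the start of a backup run and only becomes usable once the run has finished, so the worst-case data loss is the backup interval plus the backup duration. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta = timedelta(0)) -> timedelta:
    """Worst case under the stated assumption: the failure happens just
    before the running backup completes, so the last usable restore point
    is the start of the previous run."""
    return backup_interval + backup_duration


def meets_rpo(backup_interval: timedelta,
              rpo: timedelta,
              backup_duration: timedelta = timedelta(0)) -> bool:
    """True if the schedule can never lose more data than the RPO allows."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=4)
    # A classic nightly backup cannot satisfy a 4 hour RPO ...
    print(meets_rpo(timedelta(hours=24), rpo))
    # ... while hourly snapshots that take 30 minutes can.
    print(meets_rpo(timedelta(hours=1), rpo, timedelta(minutes=30)))
```

In practice the numbers depend on the product and on how snapshots and replication are implemented, so treat this only as a way to reason about the relationship between schedule and RPO.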

SLA:
A service level agreement (SLA) is a contract between a service provider (either internal or external) and the end user that defines the level of service expected from the service provider. SLAs are output-based in that their purpose is specifically to define what the customer will receive.
A business application which is critical for a company's daily business should always be planned with minimal RTO/RPO values!
Your backup and DR processes must support all defined RTO/RPO values as well as the SLAs.
Modern backup solutions are typically SLA-oriented. In addition, they give you the possibility to define backup and/or replication policies based on a specific application SLA.
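
As a rough illustration of such SLA-based policies, the following Python sketch maps an application's required RPO/RTO to a protection policy. The tier names (gold, silver, bronze), their values and the select_policy helper are purely hypothetical and do not come from any specific backup product; real tools expose this through their own policy engines.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    name: str
    rpo: timedelta            # maximum data loss the policy guarantees
    rto: timedelta            # maximum recovery time the policy guarantees
    replicate_offsite: bool   # whether data is also replicated to a DR site


# Hypothetical example tiers, ordered from strictest to most relaxed.
POLICIES = [
    ProtectionPolicy("gold", timedelta(minutes=15), timedelta(hours=1), True),
    ProtectionPolicy("silver", timedelta(hours=4), timedelta(hours=8), True),
    ProtectionPolicy("bronze", timedelta(hours=24), timedelta(hours=48), False),
]


def select_policy(required_rpo: timedelta,
                  required_rto: timedelta,
                  policies=POLICIES) -> ProtectionPolicy:
    """Return the least strict policy that still satisfies both objectives."""
    candidates = [p for p in policies
                  if p.rpo <= required_rpo and p.rto <= required_rto]
    if not candidates:
        raise ValueError("no policy satisfies the required RPO/RTO")
    # The most relaxed matching policy is usually also the cheapest to run.
    return max(candidates, key=lambda p: (p.rpo, p.rto))


if __name__ == "__main__":
    # An application that tolerates 6 hours of data loss and 12 hours of downtime.
    print(select_policy(timedelta(hours=6), timedelta(hours=12)).name)
```

Picking the least strict policy that still meets the SLA is a design choice made here for illustration: it avoids over-protecting less critical workloads while the business-critical ones still end up in the strictest tier.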
If all or most of the above aspects can be addressed, you can then focus on technical requirements and technical design.


To be continued in "Part 2" - Technical Aspects NUTANIX

Do you find this useful?
