Because backup and disaster recovery are still two very important aspects to consider when you implement IT infrastructure for businesses, I want to write a blog about them and share some conceptual and technical knowledge I have acquired during my last years in the IT industry.
This blog will focus mainly on hyper-converged environments based on NUTANIX together with the corresponding hypervisors (VMware, Hyper-V and AHV), and on concepts to consider when planning backup and disaster recovery for those infrastructures.
The blog is split into 4 parts:
Part 1 - General aspects
Part 2 - Technical aspects (NUTANIX specifics)
Part 3 - Choosing NUTANIX features and additional software products
I will go through general concepts first and then move on to the specific requirements for NUTANIX-based environments.
Later on I will concentrate on specific software products (mainly in Part 3) and how they align with those requirements. In the end there will be a short conclusion and wrap-up.
The “next-gen” backup world
Over the last 5 years, the IT industry has changed in ways and at a speed never seen before. This also applies to datacenter infrastructure. As virtualized workloads continue to grow and business requirements demand more and more agility and flexibility, many of the classical multi-tier architectures reach their limits. Converged, hyper-converged and cloud architectures are becoming more and more common.
In that ever-changing virtualized IT world, different approaches to backup and disaster recovery are required.
Many businesses defined their backup and DR processes for the “old world”, for example:
- Backup-to-disk-to-tape concepts
- Agent-based backups
- Physical “silo-based” infrastructures
Modern backup tools already focus on these new demands, and their feature sets target those highly virtualized infrastructures.
So when a business implements hyper-converged infrastructure (hopefully NUTANIX), changing how backup and DR are done is always a key aspect to think about.
General aspects for backup and disaster recovery in HCI environments
When you have the task of designing a backup and/or disaster recovery concept for a hyper-converged infrastructure, many aspects are different compared to classical multi-tier infrastructures.
Some of them (the list is not exhaustive) are:
- Compute, software-defined storage and (sometimes) software-defined networking all in one box.
- Object file system based technology instead of native block-storage or NAS Storage backend
- Very high consolidation ratio of applications.
- Many different workloads to consider
- Sometimes different Hypervisors in one consolidated environment
- Container technologies alongside classical hypervisors.
- Fewer APIs and programming interfaces
- Sometimes highly virtualized network environments with strict security policies
Why is this important for a valid and well-planned design?
What I want to point out here is that you should always keep “the big picture” in mind. You have to align the backup and DR processes with the business requirements and not focus solely on technical requirements. In addition, you should consider how hyper-converged infrastructure changes or influences those processes.
This can be a daunting task at first. However, the more clearly you identify the business requirements and align them with the technical requirements, the better you will be able to design a high-quality solution.
I will list some of the business requirements and areas to focus on:
Who are the stakeholders who will use the new HCI environment?
- Try to identify them all, or at least as many as you can.
- Stakeholder’s knowledge of backup and DR systems/concepts?
- Stakeholder’s knowledge of HCI Systems?
- Who defines backup and DR requirements?
- Responsibilities of application owners, IT staff and others
- Are RPO/RTO already defined?
- How are they influenced by HCI?
- Who is responsible for documentation?
- Do you have access to the documents?
- Backup and DR Concept should support those business drivers
- Are there strategic constraints or assumptions you have to consider?
RPO/RTO and SLA
RPO/RTO as well as SLA definitions are always a key aspect to consider when planning for backup and DR in your environment.
I want to highlight the meaning of those three common “buzzwords” in IT:
The recovery time objective (RTO) is the maximum acceptable time that may pass after a crash of a computer or an application, or after a network interruption, until the system resumes normal operation and provides access to the data.
The RTO value is specified in a unit of time: seconds, minutes, hours, or days.
The recovery point objective (RPO) is the maximum acceptable amount of data loss after a failure of an IT system or the IT infrastructure, expressed as a span of time. It defines how far back the most recent recoverable copy of the data may lie: an RPO of one hour means that at most the last hour of changes may be lost. It is usually defined together with the recovery time objective (RTO).
A service level agreement (SLA) is a contract between a service provider (either internal or external) and the end user that defines the level of service expected from the service provider. SLAs are output-based in that their purpose is specifically to define what the customer will receive.
A business application that is critical for a company's daily business should always be planned with minimal RTO/RPO values.
Your backup and DR processes must support all defined RTO/RPO values as well as SLAs.
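To make the relationship between a backup schedule and these objectives more concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the worst-case data loss is taken to be the full backup interval (a failure just before the next backup), and the restore duration is assumed to be known from a restore test. The names `Objectives` and `meets_objectives` and all numbers are made up for this example and do not come from any product.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objectives:
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable time until the service is back


def meets_objectives(backup_interval: timedelta,
                     restore_duration: timedelta,
                     objectives: Objectives) -> dict:
    """Check a backup schedule against RPO and RTO (simplified model)."""
    # Assumption: in the worst case the failure happens right before the
    # next backup, so everything since the last backup is lost.
    worst_case_data_loss = backup_interval
    return {
        "rpo_met": worst_case_data_loss <= objectives.rpo,
        "rto_met": restore_duration <= objectives.rto,
    }


# Hypothetical critical application: 15 minutes RPO, 1 hour RTO.
critical_app = Objectives(rpo=timedelta(minutes=15), rto=timedelta(hours=1))

# A nightly backup with a 4 hour restore misses both objectives.
nightly = meets_objectives(timedelta(hours=24), timedelta(hours=4), critical_app)

# Snapshots every 15 minutes with a 30 minute restore meet both.
frequent = meets_objectives(timedelta(minutes=15), timedelta(minutes=30), critical_app)

print(nightly)
print(frequent)
```

In a real design the worst-case data loss also depends on how long a backup or replication run takes, so the simple comparison above should be read as a lower bound.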
Many modern backup solutions are SLA-oriented: they let you define backup and/or replication policies based on a specific application SLA.
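The idea of SLA-based policies can be sketched as a simple lookup from an SLA tier to a protection policy. This is a conceptual illustration only, not the configuration model of any specific backup product. The tier names (`gold`, `silver`, `bronze`), the application names and all intervals are assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    snapshot_interval: timedelta  # how often a recovery point is created
    retention: timedelta          # how long recovery points are kept
    replicate_offsite: bool       # whether copies go to a second site


# Hypothetical SLA tiers; real values come from the business requirements.
SLA_POLICIES = {
    "gold": ProtectionPolicy(timedelta(minutes=15), timedelta(days=30), True),
    "silver": ProtectionPolicy(timedelta(hours=4), timedelta(days=14), True),
    "bronze": ProtectionPolicy(timedelta(hours=24), timedelta(days=7), False),
}


def policy_for(application_sla: str) -> ProtectionPolicy:
    """Return the protection policy for an SLA tier (case-insensitive)."""
    try:
        return SLA_POLICIES[application_sla.lower()]
    except KeyError:
        raise ValueError(f"unknown SLA tier: {application_sla!r}") from None


# Hypothetical applications and the SLA tier agreed for each of them.
APPLICATIONS = {"erp": "gold", "intranet-wiki": "bronze"}

assigned = {app: policy_for(tier) for app, tier in APPLICATIONS.items()}
for app, policy in assigned.items():
    print(app, policy)
```

The point of this structure is that the application owner only agrees on a tier, and the technical settings follow from it.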
If all or most of the above aspects can be addressed, you can then focus on technical requirements and technical design.
To be continued in "Part 2" - Technical Aspects NUTANIX