The technology landscape is changing rapidly, and IT teams should adopt newer technology as early as possible, because it will itself be refreshed in no time. Meanwhile, some technologies attract so much attention that every other vendor is willing to launch a me-too product, irrespective of its Reliability, Availability, Serviceability, and Scalability features. One such technology is Hyper-Converged Infrastructure, aka HCI. The organization that coined the term and pioneered the technology is Nutanix, and a product that was initially a good fit for Virtual Desktop Infrastructure (VDI) solutions has become the de facto standard for all kinds of virtualized workloads, be it VDI, Remote Office Branch Office (ROBO), Business Critical Applications (BCA), or similar. Nutanix is about to complete 10 years in the industry, and a lot has changed since its inception, from a hardware appliance-based solution to a true hybrid cloud solution.
Nutanix started with the idea of making infrastructure management simpler than the old-fashioned 3-Tier Virtualization Architecture, which involves multiple components and vendors, each with its own management console and development life cycle. Typical Day-2 operations for such an architecture involve repetitive activities and a lot of planning, even for the simplest of tasks.
We first need to understand the ongoing challenges of 3-Tier Infrastructure that led to the inception of Nutanix, which set out to eradicate them once and for all. Multiple internal and external surveys conducted by Nutanix pointed to the following day-to-day challenges that organizations deal with:
- Management
- Data Center Space
- Hardware Upgrades
- Software Updates
- Performance
- Scalability
- Data Availability
- Data Backup
- Lack of options or flexibility to choose
- Support
- Point Product
The following sections discuss each challenge in detail.
1. Management – This is the biggest challenge in IT operations. Operations teams have to learn multiple management consoles just to check them daily, make sure everything works, and douse fires reactively. The typical management consoles an operations team learns are:
a. Server Management
b. SAN (Storage Area Network) Management
c. Storage Management
d. Virtualization Management
e. Ethernet Management
f. Backup Management
In most cases, an organization has multiple specialized resources to manage the above consoles. A change in vendor forces the operations team to repeat the whole process of learning and managing.
2. Data Center Space – Hosting a 3-Tier Infrastructure requires a datacenter footprint ranging from a couple of racks to many racks to run IT and the associated business applications and databases. To run these racks, the organization needs to build a datacenter with appropriate power, cooling, and power backup. This adds to the organization's overall Operational Expenditure (OpEx) and impacts Total Cost of Ownership (TCO). Organizations have moved from all-physical infrastructure to a mix of physical and virtual, but this doesn’t solve the datacenter footprint and OpEx conundrum.
3. Hardware Upgrades – Upgrading infrastructure almost always requires a lot of manual activity: capacity planning in Excel sheets, buying compatible hardware and software, expanding existing infrastructure for new business requirements, weeks of planning to add the new system to the existing infrastructure, arranging downtime, carrying out the work during off-peak hours or weekends, and preparing rollback plans in case something goes wrong during expansion.
4. Software Updates – One of the biggest Day-2 operations for the team managing the infrastructure is software updates. For a typical 3-Tier Infrastructure, the following updates are a must to keep it in a healthy state:
a. Server Firmware Updates (Hard Disk, NIC, HBA, BIOS, BMC etc.). Yes, all of these require updates
b. Virtualization Software Version Updates
c. Ethernet Switch OS / Firmware Updates
d. SAN Switches OS / Firmware Updates
e. Storage OS & Disk Firmware Updates
f. Backup Software Version Updates
g. Backup Target OS / Firmware Updates etc.
This list is just for starters. The biggest challenge with so many software updates is checking each one for compatibility with the other components so that everything keeps functioning without issues.
Almost every organization has multiple vendors supplying different components, and each vendor has its own software development and release cycle. Even when a single vendor supplies multiple components, those components do not share a common development and release cycle. The operations team has to repeat similar update activities every now and then to support the business and maintain the requisite uptime. In other words, the organization needs a lot of planned downtime to avoid unplanned downtime caused by older software versions.
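To make the compatibility problem concrete, here is a minimal sketch of the check an operations team effectively performs by hand before every update. The component names, version numbers, and the `COMPATIBILITY` table are all made up for illustration; in practice this information comes from each vendor's own interoperability matrix.

```python
# Hypothetical compatibility matrix: for each (component, version), the versions
# of the other components it is certified to run with. All values are invented.
COMPATIBILITY = {
    ("hypervisor", "7.0"): {"hba_firmware": {"2.1", "2.2"}, "storage_os": {"9.4"}},
    ("hypervisor", "8.0"): {"hba_firmware": {"2.2"}, "storage_os": {"9.4", "9.5"}},
}


def blocking_components(installed, component, target_version):
    """Return installed components that are not certified with the target version."""
    required = COMPATIBILITY.get((component, target_version))
    if required is None:
        raise KeyError(f"no compatibility data for {component} {target_version}")
    return sorted(
        name
        for name, allowed_versions in required.items()
        if installed.get(name) not in allowed_versions
    )


# Hypothetical current state of the environment.
installed = {"hypervisor": "7.0", "hba_firmware": "2.1", "storage_os": "9.4"}

# Upgrading the hypervisor first requires an HBA firmware update.
print(blocking_components(installed, "hypervisor", "8.0"))
```

Every component that shows up in the result is one more update to plan, test, and schedule downtime for before the original update can even start.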
5. Performance – This parameter is often considered the most critical one, as it is directly linked to business, end users, and productivity. Organizations that moved from physical infrastructure to virtualization on a 3-Tier architecture often see a performance drop compared to the older physical infrastructure. The primary reason is the latency that HBA cards, SAN switch ports, storage ports, and storage drives add to every read or write operation users perform while accessing applications or databases. In a physical environment, the application or database accesses the drives directly over the server mainboard. This is why not all organizations trust virtualization for their business-critical workloads.
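The effect is additive: every hop on the I/O path contributes its own delay. The sketch below only illustrates that arithmetic. The per-hop latency figures are placeholders I chose for the example, not measurements; real values depend entirely on the hardware and protocol in use.

```python
# Placeholder per-hop latencies in microseconds (illustrative, not measured).
SAN_PATH_US = {
    "hba": 20,
    "san_switch_port": 10,
    "storage_port": 20,
    "storage_drive": 100,
}
LOCAL_PATH_US = {"storage_drive": 100}


def path_latency_us(hops):
    """Total latency of one I/O, assuming hop latencies simply add up."""
    return sum(hops.values())


san = path_latency_us(SAN_PATH_US)
local = path_latency_us(LOCAL_PATH_US)
print(f"SAN path: {san} us, local path: {local} us, overhead per I/O: {san - local} us")
```

The overhead applies to every single read and write, which is why it becomes visible to end users on I/O-heavy applications and databases.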
6. Scalability – Whenever an organization chooses a product, it looks for a scalable solution that can keep up with business growth and user demand. Business growth is often exponential, and a less scalable infrastructure can become a hindrance to business agility. Legacy 3-Tier Infrastructure almost always faces scalability issues in one component or another. Given the complexity of capacity planning for 3-Tier Infrastructure, organizations often end up buying infrastructure for 3 to 5 years of growth in one go and are never able to take advantage of Buy-As-You-Grow.
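A rough way to see the cost of buying everything up front is to look at how much purchased capacity sits idle each year. The sketch below assumes compound annual growth, and the starting capacity and growth rate are hypothetical inputs chosen only for illustration.

```python
def capacity_needed(initial_tb, annual_growth, years):
    """Projected demand in TB for year 0..years, assuming compound growth."""
    return [initial_tb * (1 + annual_growth) ** year for year in range(years + 1)]


def idle_capacity_upfront(initial_tb, annual_growth, years):
    """Idle TB per year when the final year's demand is purchased on day 1."""
    demand = capacity_needed(initial_tb, annual_growth, years)
    purchased = demand[-1]
    return [purchased - needed for needed in demand]


# Hypothetical inputs: 100 TB today, 50% growth per year, 3-year horizon.
for year, idle in enumerate(idle_capacity_upfront(100, 0.5, 3)):
    print(f"year {year}: {idle:.1f} TB purchased but unused")
```

With Buy-As-You-Grow, the purchased capacity would track the demand curve instead, so the idle figure stays close to zero every year.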
7. Data Availability – Organizations often struggle with their Data Availability requirements and the options for meeting them. Data Availability is a metric owned by the storage system and is mainly ensured through RAID (Redundant Array of Independent Disks) groups built from multiple storage disks. To make things more complex, storage systems offer various RAID group combinations that should be configured according to the workload and its business criticality, as per best practices. Under business pressure, almost every organization fails to keep these best practices intact and ends up hosting critical workloads on configurations where a failure can lead to business disruption and revenue loss.
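The trade-off behind those RAID choices is usable capacity versus the number of disk failures a group is guaranteed to survive. The sketch below covers the common levels only; the function name is my own, RAID 1 is modeled as a single mirror set across all disks, and RAID 10 is assumed to use two-way mirrors.

```python
def raid_summary(level, disks, disk_tb):
    """Return (usable capacity in TB, disk failures guaranteed to be survivable)."""
    # level -> (minimum disks, data-disk count, guaranteed tolerated failures)
    rules = {
        "RAID 0": (2, lambda n: n, lambda n: 0),        # striping, no redundancy
        "RAID 1": (2, lambda n: 1, lambda n: n - 1),    # one mirror set
        "RAID 5": (3, lambda n: n - 1, lambda n: 1),    # single parity
        "RAID 6": (4, lambda n: n - 2, lambda n: 2),    # double parity
        "RAID 10": (4, lambda n: n // 2, lambda n: 1),  # striped two-way mirrors
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    min_disks, data_disks, tolerated = rules[level]
    if disks < min_disks:
        raise ValueError(f"{level} needs at least {min_disks} disks")
    if level == "RAID 10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return data_disks(disks) * disk_tb, tolerated(disks)


# Example: the same eight 4 TB disks under different RAID levels.
for level in ("RAID 5", "RAID 6", "RAID 10"):
    usable_tb, failures = raid_summary(level, disks=8, disk_tb=4.0)
    print(f"{level}: {usable_tb} TB usable, survives {failures} disk failure(s)")
```

RAID 10 can survive more than one failure if the failed disks are in different mirror pairs, but only one failure is guaranteed, which is the figure reported here.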
8. Data Backup – One of the most complex cogs in the wheel, yet a very critical one, and often called a “Necessary Evil” or the “Last Resort”. Data backup comes with its own complexity: choosing the right backup software, backup server, backup window, online/offline/file-level backup, backup schedules, data retention timelines, full/differential/incremental/synthetic-full backups, and so on. Every backup software vendor uses different terminology for the same end result: having data backed up to revert to in case of an unforeseen disaster. The following should be factored in and covered under Data Backup:
a. Operational Recovery – Fastest way to resume business operations
b. DC – DR or BCP – Multi-site Data Replication for Business Continuity
c. Archival or Long-Term Retention (LTR) to available Public Cloud
d. External Backup to Tape Library or Virtual Tape Libraries (Disk-to-Disk or Disk-to-Disk-to-Tape)
Each of the above has a different RPO (Recovery Point Objective) and RTO (Recovery Time Objective). Organizations choose them based on business demand and criticality.
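As a quick illustration of how RPO and RTO constrain a backup design: the worst-case data loss is the time between two backups, and the recovery time is how long a restore takes. The sketch below checks a schedule against both objectives; the function name and the example figures are hypothetical.

```python
from datetime import timedelta


def meets_objectives(backup_interval, restore_time, rpo, rto):
    """Check a backup schedule against its RPO and RTO."""
    # If a failure hits just before the next backup runs, everything written
    # since the previous backup is lost, so worst-case loss equals the interval.
    worst_case_data_loss = backup_interval
    return {
        "rpo_met": worst_case_data_loss <= rpo,
        "rto_met": restore_time <= rto,
    }


# Hypothetical example: nightly backups, 6-hour restore, 4-hour RPO, 8-hour RTO.
nightly = meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=6),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
)
print(nightly)
```

In this example the restore is fast enough, but a nightly schedule cannot meet a 4-hour RPO, so the workload would need more frequent backups or replication.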
9. Lack of options or flexibility to choose – This point is linked to point 6 (Scalability) above. An organization that buys 3-Tier Infrastructure knows its limitations in terms of flexibility. It has to size the purchase for 3 to 5 years of growth and invest heavily on day 1, which leads to higher Capital Expenditure (CapEx). 3-Tier Infrastructure often lacks the flexibility to adopt the latest technology, so the organization ends up worrying about better utilizing the resources it has already paid for instead of investing in newer, faster technology.
10. Support – The biggest worry for an organization after buying a technology product or solution is Support, and how Support reacts to ongoing or future failures and issues. 3-Tier Infrastructure has a multi-vendor structure, and even for the smallest issue, 3 to 4 different teams get involved along with their associated Support teams to arrive at a Root Cause Analysis (RCA), and it can take months to remediate the identified issues. Almost every organization running on 3-Tier Infrastructure has faced support issues at some point.
11. Point Product – Organizations often fall into the trap of buying a point product without understanding its limitations in terms of compatibility, ecosystem, and whether additional services can be run without changing the core system. Point products mostly have a lower Total Cost of Acquisition (TCA), but organizations need to invest in additional tools or hardware to accomplish tasks that may be required in the future, and this sometimes leads to a rip-and-replace of the entire point product. Organizations end up paying more to protect the investment already made.
The above 11 points are crucial for organizations during or after the evaluation of a technology. In my next blog I will talk about how Nutanix has solved all these issues and helps organizations focus more on their core business by putting IT in Self Driving Mode.