The requirement for N+1 resiliency means the solution must tolerate the failure of one component (in this case, one ESXi host) without disrupting workloads. In VMware Cloud Foundation (VCF), this is typically achieved through vSphere High Availability (HA) settings and sufficient host capacity. The scenario provides key constraints: a 4-node cluster can handle all workloads at 90% utilization, and no growth is expected. Let’s evaluate each option:
Option A: Set the DRS Automation level to Partially Automated
DRS (Distributed Resource Scheduler) balances workloads across hosts, but the automation level (Partially Automated vs. Fully Automated) doesn’t directly affect N+1 resiliency. Partially Automated places VMs automatically at power-on but requires manual approval for migrations, which neither enhances nor detracts from HA-based resiliency. While DRS is useful, this specific setting isn’t critical to the N+1 requirement, per the VMware Cloud Foundation 5.2 Architectural Guide.
Option B: Deploy a workload cluster consisting of five VMware vSphere hosts
A 5-node cluster provides N+1 resiliency when paired with HA configured to tolerate one host failure. The Aria Operations report indicates that a 4-node cluster is sufficient at 90% utilization, which corresponds to a load of 3.6 host-equivalents. Spread across five hosts, that load runs at about 72% utilization in normal operation (90% / 1.25). If one host fails, the remaining four carry the load at 90%, the level the report already shows to be sufficient. This aligns with VCF’s standard architecture recommendations for resiliency (VMware Cloud Foundation 5.2 Architectural Guide).
Option C: Set the Host failures cluster tolerates for the workload cluster to 1
This HA admission control setting ensures the cluster reserves enough capacity (CPU and memory) to fail over the VMs from one failed host. In VCF, setting “Host failures cluster tolerates” to 1 is a direct implementation of N+1 resiliency, making it a required design decision (vSphere Availability Guide and VCF 5.2 Administration Guide).
Option D: Deploy a workload cluster consisting of four VMware vSphere hosts
A 4-node cluster meets capacity needs at 90% utilization but lacks N+1 resiliency without additional capacity. If one host fails, the remaining three would be overcommitted (120% utilization: 90% / 0.75), risking performance or availability. Thus, this doesn’t satisfy the requirement alone.
Option E: Configure vSphere High Availability (HA) for the workload cluster
HA is foundational to N+1 resiliency in vSphere and VCF, enabling VM restarts on surviving hosts after a failure. Without HA, N+1 cannot be achieved, making this a mandatory choice (VMware Cloud Foundation 5.2 Administration Guide).
Option F: Configure vSphere Dynamic Resource Scheduling (DRS) for the workload cluster
DRS enhances performance by balancing workloads but isn’t strictly required for N+1 resiliency, which focuses on availability, not optimization. It’s a best practice in VCF but not one of the three critical decisions for this requirement.
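The capacity arithmetic behind Options B and D can be checked with a short Python sketch. It assumes identical hosts and a load that spreads evenly across surviving hosts, which is a simplification. The function name and parameters are hypothetical and not part of any VMware tool.

```python
def utilization(baseline_hosts: int, baseline_utilization: float,
                deployed_hosts: int, failed_hosts: int = 0) -> float:
    """Average per-host utilization (as a fraction) after host failures.

    Assumes identical hosts and an evenly spread load (illustrative only).
    """
    surviving = deployed_hosts - failed_hosts
    if surviving <= 0:
        raise ValueError("at least one host must survive")
    # Total load expressed in host-equivalents, e.g. 4 hosts * 0.90 = 3.6.
    load = baseline_hosts * baseline_utilization
    return load / surviving


# Baseline from the scenario: 4 hosts at 90% utilization.
scenarios = [
    ("5 hosts, none failed", 5, 0),
    ("5 hosts, one failed", 5, 1),
    ("4 hosts, one failed", 4, 1),
]
for label, deployed, failed in scenarios:
    print(f"{label}: {utilization(4, 0.90, deployed, failed):.0%}")
```

Under these assumptions the three scenarios work out to 72%, 90%, and 120%, matching the figures quoted for Options B and D.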
Conclusion:
B: A 5-node cluster provides the extra host for N+1.
C: HA set to tolerate 1 host failure implements the N+1 policy.
E: HA configuration enables failover, a core N+1 component.
Options B, C, and E together ensure the cluster can lose one host without service disruption, meeting the customer’s requirement.
References:
VMware Cloud Foundation 5.2 Architectural Guide (docs.vmware.com): Section on Workload Domain Design and HA/DRS Configuration.
vSphere Availability Guide (docs.vmware.com): Chapter on Configuring High Availability.
VMware Cloud Foundation 5.2 Administration Guide (docs.vmware.com): HA and Cluster Sizing Guidelines.