- 2 hosts with 3 GB RAM
- Shared storage
- Pingable gateway
- Redundant network
- multiple shared datastores
- TCP/UDP 8182 IN/OUT
HA agent = FDM (Fault Domain Manager)
FDM is responsible for:
- communicating host resource information, virtual machine states, and HA properties to the other hosts in the cluster
- heartbeat mechanisms
- virtual machine placement
- virtual machine restarts
HA agent (FDM) = single-process agent, monitored by a watchdog => the watchdog restarts the HA agent when the agent fails
Works with IP addresses only
fdm.log = log messages for all operations => by default /var/log/fdm.log
FDM talks directly to hostd and vCenter => not dependent on the vpxa agent
When hostd is unavailable or not yet running after a restart, the host will not participate in any FDM-related processes.
FDM depends on hostd: if hostd is not operational, FDM halts all functions and waits for hostd to become operational.
The HA master is also the one that initiates the restart of virtual machines when a host has failed.
The HA master election takes approximately 15 s and uses UDP.
The host participating in the election with the greatest number of connected datastores is elected master.
If two or more hosts have the same number of connected datastores, the one with the highest Managed Object ID is chosen.
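The election rule above can be sketched in a few lines of Python. This is only an illustration of the rule as noted here, not FDM's actual code: the `Host` class and `elect_master` function are hypothetical names, and comparing the Managed Object ID as a plain string is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    moid: str                   # Managed Object ID, e.g. "host-42" (illustrative value)
    connected_datastores: int   # number of datastores this host is connected to


def elect_master(hosts):
    """Pick the master: most connected datastores, ties broken by highest MOID."""
    if not hosts:
        raise ValueError("no hosts participating in the election")
    # Assumption: the MOID is compared as a plain string, not as a number.
    return max(hosts, key=lambda h: (h.connected_datastores, h.moid))


candidates = [Host("host-10", 3), Host("host-20", 3), Host("host-99", 2)]
print(elect_master(candidates).moid)
```

Here host-10 and host-20 tie on datastore count, so the MOID decides; host-99 loses despite its higher MOID because it sees fewer datastores.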
After the master is elected, each slave communicates with the master over an SSL connection. Slaves do not communicate with each other unless a re-election of the master needs to take place.
When a master is elected, it tries to acquire ownership of all the datastores it can access directly or by proxying requests through one of the slaves connected to it over the management network.
It does this by locking a file called “protectedlist”. The master uses this file to store the inventory: it keeps track of which VMs are protected by HA. This file includes CPU reservation and memory overhead.
If the master is isolated or fails, the lock expires and the new master relocks the file if the datastore is accessible to it.
HA does not use multicast => Always point to point communication
host-<number>-poweron: lists the VMs powered on on the host; always begins with 0 (= not isolated) or 1 (= isolated)
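A rough sketch of reading the isolation flag from such a file. The only thing taken from the notes is that the content begins with 0 or 1; that the flag sits alone on the first line with one VM entry per following line is an assumption, and `parse_poweron` and the sample path are hypothetical.

```python
def parse_poweron(text):
    """Return (isolated, entries) from the content of a poweron file.

    Assumed layout: first non-empty line is the flag ("0" or "1"),
    every following non-empty line is one powered-on VM entry.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] not in ("0", "1"):
        raise ValueError("expected the content to begin with 0 or 1")
    isolated = lines[0] == "1"   # 1 = isolated, 0 = not isolated
    return isolated, lines[1:]


sample = "1\n/vmfs/volumes/ds1/vm1/vm1.vmx\n"   # made-up content for illustration
isolated, entries = parse_poweron(sample)
print(isolated, entries)
```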
=============Host Failures Cluster Tolerates=============
A slot = Virtual Machine => A logical representation of the memory and CPU resources that satisfy the reservation requirements for any powered-on virtual machine in the cluster.
- A slot is the worst-case CPU and memory reservation
HA uses the highest CPU and memory reservation of any powered-on VM. If no reservation is set, slot size = 32 MHz CPU + 0 MB RAM + memory overhead
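The slot size rule can be worked through with a small Python sketch. The names `PoweredOnVM` and `slot_size` are hypothetical, the sample figures are invented, and taking the memory component as the largest "reservation + overhead" per VM is my reading of the rule rather than something stated in these notes.

```python
from dataclasses import dataclass

DEFAULT_CPU_SLOT_MHZ = 32   # used for a VM that has no CPU reservation


@dataclass(frozen=True)
class PoweredOnVM:
    name: str
    cpu_reservation_mhz: int   # 0 = no reservation
    mem_reservation_mb: int    # 0 = no reservation
    mem_overhead_mb: int


def slot_size(vms):
    """Return (cpu_mhz, mem_mb): the worst case over all powered-on VMs."""
    if not vms:
        raise ValueError("no powered-on VMs in the cluster")
    # CPU: highest reservation, with 32 MHz standing in for "no reservation".
    cpu = max(vm.cpu_reservation_mhz or DEFAULT_CPU_SLOT_MHZ for vm in vms)
    # Memory: highest reservation plus overhead (assumed to be summed per VM).
    mem = max(vm.mem_reservation_mb + vm.mem_overhead_mb for vm in vms)
    return cpu, mem


vms = [
    PoweredOnVM("vm-a", 0, 0, 90),        # no reservations at all
    PoweredOnVM("vm-b", 500, 1024, 110),  # drives the worst case
]
print(slot_size(vms))   # (500, 1134)
```

A single large reservation sets the slot size for the whole cluster, which is why one oversized VM can sharply reduce the number of available slots.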
- das.isolationaddressX [1-10]
- das.usedefaultisolationaddress [true-false]
- das.heartbeatdsperhost [2(default)-5]