PXS Heartbeat Monitoring is designed to ensure that at most one PXS is in heartbeat monitoring mode and, if possible, at least one PXS is capturing data.
If the application running on the primary PXS detects failure of the synchronous communication link, it can cease operation and instruct the secondary to become active. After the secondary fails over to become active, the primary will fail over to monitor mode, effectively reversing the roles of the two PXSs. If the secondary encounters a problem, it can fail back to the primary.
If the primary PXS fails due to a hardware problem, the secondary will detect the absence of heartbeat messages. After a prescribed number of missed heartbeats, the secondary will fail over and start the application to become the active PXS. After the primary PXS has been replaced, it should be set to monitor the secondary. To restore the original situation with the primary active, the secondary can simply be instructed to do a controlled failover to the primary.
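The missed-heartbeat rule described above can be sketched as a small counter. This is only an illustration, not the pxsekg implementation: the class name HeartbeatMonitor, the max_missed parameter, and the once-per-interval calling convention are all assumptions, and the sketch presumes the secondary has already been receiving heartbeats from the primary.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical counter for consecutive missed heartbeats."""

    max_missed: int            # assumed stand-in for the "prescribed number"
    missed: int = 0            # consecutive intervals without a heartbeat
    failed_over: bool = False  # set once the failover decision has been made

    def on_interval(self, heartbeat_received: bool) -> bool:
        """Call once per heartbeat interval.

        Returns True exactly once, on the interval where the number of
        consecutive missed heartbeats reaches max_missed.
        """
        if self.failed_over:
            return False
        if heartbeat_received:
            self.missed = 0    # any heartbeat resets the count
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            self.failed_over = True
            return True
        return False
```

A single received heartbeat resets the count, so in this sketch only an unbroken run of missed intervals triggers the failover.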
If the secondary PXS fails for any reason, the primary will continue to function normally. If it encounters a problem and attempts a controlled failover to the secondary, it will wait for the secondary's first heartbeat before failing over to monitor mode. If that heartbeat does not arrive within a prescribed amount of time, the primary will restart itself in active mode and try to resume operation.
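The primary's side of a controlled failover can be sketched as a wait with a timeout. The function name controlled_failover, the use of a threading.Event to signal the secondary's first heartbeat, and the timeout_s parameter are assumptions made for illustration; the source does not say how the wait is implemented.

```python
import threading


def controlled_failover(first_heartbeat: threading.Event, timeout_s: float) -> str:
    """Return the mode the primary ends up in after a controlled failover.

    first_heartbeat is a hypothetical event that some receiver thread sets
    when the secondary's first heartbeat arrives. timeout_s stands in for
    the "prescribed amount of time".
    """
    # Event.wait returns True if the event was set before the timeout expired.
    if first_heartbeat.wait(timeout_s):
        return "monitor"   # secondary is alive and active; primary steps back
    return "active"        # no heartbeat in time; primary resumes operation
```

In this sketch the primary never enters monitor mode unless it has evidence that the secondary is running, which matches the behavior described above when the secondary is down.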
If both the primary and secondary fail due to loss of power, both will reboot to the heartbeat monitor program, pxsekg, when power is restored. This ensures that no more than one becomes active. Both will detect the absence of heartbeats, but only the primary will fail over to become active. The secondary cannot fail over until it has received at least one heartbeat from the primary, which the primary, running pxsekg, will never send. The primary will wait twice as long to fail over, but it does not require a first heartbeat from the secondary in order to fail over. Once the primary becomes active, it will begin heartbeating to the secondary, and the secondary will enter standby mode.
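The power-loss case can be checked with a small tick-based simulation. Everything here is a hypothetical model rather than the behavior of pxsekg itself: the Node class, the base_timeout parameter (measured in ticks), and the reading of "twice as long" as twice the secondary's timeout are assumptions. The sketch also treats standby as monitor mode with heartbeats arriving.

```python
from dataclasses import dataclass


@dataclass
class Node:
    role: str                     # "primary" or "secondary"
    mode: str = "monitor"         # both nodes boot into the heartbeat monitor
    seen_heartbeat: bool = False  # has any heartbeat from the peer arrived?
    silent_ticks: int = 0         # consecutive ticks without a heartbeat


def tick(node: Node, peer_active: bool, base_timeout: int) -> None:
    """Advance one node by one tick; only an active peer sends heartbeats."""
    if node.mode != "monitor":
        return
    if peer_active:
        node.seen_heartbeat = True
        node.silent_ticks = 0
        return
    node.silent_ticks += 1
    if node.role == "primary":
        # Assumed rule: the primary waits twice as long, no first heartbeat needed.
        if node.silent_ticks >= 2 * base_timeout:
            node.mode = "active"
    elif node.seen_heartbeat and node.silent_ticks >= base_timeout:
        # The secondary may fail over only after it has seen a first heartbeat.
        node.mode = "active"


def simulate_cold_start(base_timeout: int, ticks: int) -> tuple:
    """Boot both nodes after a power loss and run the given number of ticks."""
    primary, secondary = Node("primary"), Node("secondary")
    for _ in range(ticks):
        p_active = primary.mode == "active"
        s_active = secondary.mode == "active"
        tick(primary, s_active, base_timeout)
        tick(secondary, p_active, base_timeout)
    return primary.mode, secondary.mode
```

With base_timeout set to 3, the primary in this model becomes active on the sixth tick, and the secondary stays in monitor mode throughout because its first heartbeat arrives only after the primary is active.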