In previous articles we discussed High Availability and Disaster Recovery support in HANA.
In this article we explain Host Auto-Failover, a fault recovery solution provided by SAP HANA.
What is Host Auto-Failover?
Host auto-failover is a local fault recovery solution that can be used in addition to, or as an alternative to, system replication. One or more standby hosts are added to an SAP HANA system and configured to work in standby mode. As long as they are in standby mode, the databases on these hosts do not contain any data and do not accept requests or queries.
This means they cannot be used for other purposes such as quality or test systems.
How does Host Auto-Failover work?
When a primary (worker) host fails, a standby host automatically takes its place.
Since the standby host may take over operation from any of the primary hosts, it needs shared access to all the database volumes. This can be accomplished with a shared, networked storage server or by using a distributed file system.
This scenario is illustrated in the graphic below:
Once repaired, the failed host can rejoin the system as the new standby host to reestablish the failure recovery capability:
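The takeover and rejoin cycle described above can be pictured with a small simulation. This is only a sketch of the role changes, not SAP HANA code: the `Host` and `Landscape` classes, the host names and the volume labels are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    role: str  # "worker", "standby" or "failed"
    volumes: list = field(default_factory=list)  # volumes this host currently serves


class Landscape:
    """Toy model of a scale-out system with shared storage (hypothetical)."""

    def __init__(self, hosts):
        self.hosts = {h.name: h for h in hosts}

    def fail(self, name):
        """Mark a worker as failed and let a standby host take over its volumes."""
        failed = self.hosts[name]
        if failed.role != "worker":
            raise ValueError(f"{name} is not an active worker")
        standby = next(
            (h for h in self.hosts.values() if h.role == "standby"), None
        )
        if standby is None:
            raise RuntimeError("no standby host available for failover")
        # Shared storage is what allows the standby to pick up the same volumes.
        standby.volumes, failed.volumes = failed.volumes, []
        standby.role, failed.role = "worker", "failed"
        return standby.name

    def rejoin(self, name):
        """A repaired host comes back as the new standby."""
        host = self.hosts[name]
        if host.role != "failed":
            raise ValueError(f"{name} has not failed")
        host.role = "standby"


landscape = Landscape([
    Host("hana01", "worker", ["volume-1"]),
    Host("hana02", "worker", ["volume-2"]),
    Host("hana03", "standby"),
])
new_worker = landscape.fail("hana02")  # hana03 takes over volume-2
landscape.rejoin("hana02")             # hana02 becomes the new standby
```

After these two calls the system again has two workers and one standby, so it can survive another single host failure.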
To recover connections from SAP HANA clients that were configured to reach the original host and need to be "diverted" to the standby host after a host auto-failover, several approaches are available:
- HTTP load balancer (HLB)
- Network-based approach (IP or DNS)
- SQL/MDX database clients
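For the database client approach, the general idea is that the client knows more than one host and tries them in turn. The sketch below illustrates that idea with plain TCP connections. It is not the SAP HANA client library: the function name `connect_with_failover`, the host names and the port are assumptions made for the example.

```python
import socket


def open_tcp(host, port, timeout=2.0):
    """Default connector: open a plain TCP connection (stand-in for a DB client)."""
    return socket.create_connection((host, port), timeout=timeout)


def connect_with_failover(hosts, connect=open_tcp):
    """Try each (host, port) pair in order and return the first connection that works.

    `connect` is injected so that a real database driver could be used instead.
    """
    last_error = None
    for host, port in hosts:
        try:
            return connect(host, port)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError(f"no host in the list is reachable: {last_error}")


# Hypothetical usage: list every host that could end up serving SQL requests,
# including the standby, so the client still connects after a failover.
# conn = connect_with_failover([("hana01", 30015), ("hana03", 30015)])
```

Listing the standby host up front means no client reconfiguration is needed when a failover happens; the client simply falls through to the next entry.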
Heartbeat and Fencing
To ensure data consistency, SAP introduced two capabilities:
A heartbeat is a regular TCP communication that checks whether the primary host is still active as master before another host attempts to take over the master role or perform a failover. It can run from name server to name server between hosts, or from name server to hdbdaemon, using the SAP HANA internal communication protocol.
The following types of heartbeats are used to check if another host is active as master before starting the current host as master or performing a failover.
TCP communication-based heartbeats:
- Ping from name server to name server with SAP HANA internal communication protocol
- Ping from name server to hdbdaemon with SAP HANA internal communication protocol
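The principle behind a TCP-based heartbeat can be sketched in a few lines. This is a simplified illustration only: SAP HANA uses its own internal communication protocol, whereas the code below merely checks whether a TCP connection can be opened. The function names, the retry count and the timeout are assumptions.

```python
import socket


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable or timed out
        return False


def master_is_active(host, port, attempts=3, timeout=1.0, probe=is_alive):
    """Probe several times; report the master as inactive only if every probe fails.

    A takeover should only be considered when this returns False.
    """
    return any(probe(host, port, timeout) for _ in range(attempts))
```

Probing several times before declaring the master inactive reduces the chance that a single dropped packet triggers an unnecessary failover.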
I/O fencing is the process of isolating a failed node and protecting the shared data pool, so that the failed primary host no longer has access to the data or log volumes.
The SAP HANA storage connector API allows usage of different types of storage and network architecture to ensure proper I/O fencing:
- SAN storage: SAP HANA Fibre Channel storage connector using SCSI-3 persistent reservations (SCSI-3 PGR)
- NFSv3: used without file locking, but with a storage connector provided by certified storage vendors
- NFSv4 or cluster file systems like GPFS: using file locks
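To give a feel for lock-based fencing, the sketch below uses an exclusive advisory lock on a local file. It only demonstrates the mutual exclusion idea on a Unix system: it is not how SAP HANA or an NFSv4 server implements fencing, and the helper names `try_acquire` and `release` are hypothetical.

```python
import fcntl
import os


def try_acquire(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file descriptor on success, or None if the lock is
    already held through another open of the same file.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

In this picture the active host holds the lock for as long as it serves a volume. A standby host that calls `try_acquire` on the same path gets `None` until the lock is released, which stands in for the failed host losing its access.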