Sunday, October 28, 2012

Cloud Computing - High Availability (HA) and Disaster Recovery

The goal of a traditional High Availability (HA) architecture is to prevent or mitigate application downtime caused by application errors or infrastructure failures. Disaster Recovery (DR) primarily deals with falling back to a secondary site when the primary site fails.


HA focuses on overcoming technology failures such as network and storage failures. DR focuses on overcoming disasters that affect the physical data center or its infrastructure. Both HA and DR aim to keep applications available 24x7, with zero or minimal downtime from planned or unplanned outages.


Recovery Time Objective (RTO): how quickly an application must be back in operation following a failure.


Recovery Point Objective (RPO): how much data, measured as a span of time before the failure, the application can afford to lose.
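To make the two objectives concrete, here is a minimal sketch that checks a periodic backup plan against them. It assumes the usual reading of RPO as the largest tolerable window of lost data, and it assumes that with periodic backups the worst-case loss equals one full backup interval. The function names and the sample numbers are hypothetical, chosen only for illustration.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """A failure just before the next backup loses up to one full interval."""
    return backup_interval


def meets_objectives(backup_interval: timedelta, restore_time: timedelta,
                     rpo: timedelta, rto: timedelta) -> dict:
    """Compare a backup/restore plan against the RPO and RTO targets."""
    return {
        "rpo_met": worst_case_data_loss(backup_interval) <= rpo,
        "rto_met": restore_time <= rto,
    }


# Hypothetical plan: nightly backups, six-hour restore,
# measured against a 1-hour RPO and a 4-hour RTO.
result = meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=6),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
)
print(result)  # {'rpo_met': False, 'rto_met': False}
```

In this example the nightly backup plan misses both targets, which is the kind of gap that pushes a design toward replication and clustering.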


A simple way to achieve HA/DR is to use backup and restore operations; other options are a cluster architecture and, finally, fault-tolerant hardware. Not all organizations use fault-tolerant hardware, however, and backup/restore is a very slow process that could take hours or days.


==
Data replication and HA clustering can be combined to build what is called a "shared nothing cluster". The clustering technology ensures that applications and servers are operational and can fail over from one server to another if a problem is detected. The data replication software mirrors the data needed by the application between the servers, so that the data is available no matter which server the application is running on. Replication can occur across either a LAN or a WAN, depending on where the servers in the cluster are located.


==
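The division of labor in a shared nothing cluster can be sketched in a few lines. This is a toy, in-memory model and not any particular clustering product: the `Node` and `SharedNothingCluster` classes, the `healthy` flag standing in for a heartbeat check, and the synchronous mirroring are all assumptions made for illustration.

```python
class Node:
    """One server with its own private storage (nothing is shared)."""

    def __init__(self, name):
        self.name = name
        self.store = {}       # local data, owned by this node only
        self.healthy = True   # stand-in for a real heartbeat check


class SharedNothingCluster:
    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def check_health(self):
        """Clustering role: detect a failed active node and fail over."""
        if not self.active.healthy:
            if not self.standby.healthy:
                raise RuntimeError("no healthy node available")
            self.active, self.standby = self.standby, self.active

    def write(self, key, value):
        """Replication role: apply the write, then mirror it to the peer."""
        self.check_health()
        self.active.store[key] = value
        if self.standby.healthy:
            self.standby.store[key] = value  # synchronous mirror (assumed)

    def read(self, key):
        self.check_health()
        return self.active.store[key]


a, b = Node("a"), Node("b")
cluster = SharedNothingCluster(a, b)
cluster.write("order-1", "paid")

a.healthy = False                      # simulate a failure of the active server
value_after_failover = cluster.read("order-1")
print(cluster.active.name, value_after_failover)  # b paid
```

Because every write was mirrored before the failure, the standby can serve the same data once it takes over. A real deployment would also have to handle replication lag over a WAN and resynchronizing the failed node when it returns, which this sketch leaves out.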