Overview of the LogRhythm HA Solution
A LogRhythm HA deployment is a cluster made up of two independent appliances, or nodes. Both nodes in the cluster share a common configuration. Transient data such as binaries, log files, and operating system files is placed on non-replicated partitions. Persistent data such as databases, archives, transaction logs, state, and configuration information for each of the LogRhythm services is placed on replicated partitions. The replication engine allows both nodes in the cluster to access the information on the shared volume. This matters for log collection, for example: because state and position are transferred through data replication, log collection picks up where it left off regardless of which node it is running on.
Because each node in the cluster has its own operating system and program files, you must apply OS, SQL, and LogRhythm updates to both nodes. In general, if you make a program change on one node in the cluster, you must make the same change on the other node. For example, when you install an OS service pack on one node, you must also apply the same service pack to the other node. For more information, see Perform Platform Updates for HA Deployments.
Components of SIOS Protection Suite
SIOS Core Products:
LifeKeeper
LifeKeeper provides continuous monitoring of critical resources. This toolset provides the foundation of the LogRhythm HA platform.
DataKeeper
DataKeeper provides volume-level, block-level replication for disk drives and delivers a shared-nothing solution when used in conjunction with LifeKeeper.
Additional SIOS Software Kits
LifeKeeper SQL Recovery Kit
The LifeKeeper SQL Recovery Kit provides database-level and process-level monitoring, along with integrated capabilities within the LifeKeeper GUI. The SQL Recovery Kit is only required for EM, LM, and XM appliances.
Reference Architecture
In a typical deployment using XM appliances, each system is configured with four logical drives.
Drive Letter | Contains | Replicated? |
---|---|---|
C: | System | No |
D: | Data | Yes |
L: | SQL Logs | Yes |
T: | TempDB | No |
Each host requires a static IP Address accessible on the Public Network. Additionally, each pair of nodes requires a Shared IP Address accessible on the Public Network. This Shared IP Address will be the IP Address that is used by the protected elements on the system.
In the reference diagram that follows, all LogRhythm services are configured to use the Shared IP Address. SQL and the Windows name also use this same Shared IP Address. Together, the Shared IP Address, Shared Name, and Shared Data Volumes form the shared infrastructure on which the LogRhythm Application Stack operates. These resources combine to form a logical virtual server on the network.
Dual-Site Deployment Requirements
Network Connections
Dual-Site deployments provide greater protection from disasters. However, they also have a number of additional requirements to deliver an appropriate level of performance across a distance.
The private and public network connections between cluster nodes must appear as a single, nonrouted LAN, which can be achieved with technologies such as virtual LANs (VLANs). The network must provide a guaranteed maximum round-trip latency between nodes of no more than 15 milliseconds. The NICs carrying Public data and Private data must each appear as a standard LAN to support a Shared IP Address between cluster nodes.
Network Bandwidth
Network bandwidth is a key consideration because data written to the replicated volumes must be sent across the link between the sites.
To estimate the required network bandwidth, add the write bytes per second for each replicated volume, add the overhead for TCP, and then divide the sum by the optimal compression factor.
((Write Bytes / Sec for D:) + (Write Bytes / Sec for L:) + (Overhead for TCP)) / (Compression Factor) = Minimum required Network Bandwidth
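The estimate above can be sketched as a small calculation. This is a minimal illustration, not a sizing tool: the function names are hypothetical, and the sample write rates, TCP overhead, and compression factor are made-up values that should be replaced with figures measured on your own system.

```python
def min_replication_bandwidth(write_bytes_d, write_bytes_l,
                              tcp_overhead, compression_factor):
    """Estimate the minimum bandwidth (bytes/sec) for the replicated volumes.

    write_bytes_d / write_bytes_l: write bytes per second on D: and L:
    tcp_overhead: additional bytes per second allowed for TCP
    compression_factor: e.g. 2 means the data compresses to half its size
    """
    if compression_factor <= 0:
        raise ValueError("compression_factor must be greater than zero")
    return (write_bytes_d + write_bytes_l + tcp_overhead) / compression_factor


def bytes_per_sec_to_mbps(bytes_per_sec):
    """Convert bytes per second to megabits per second (1 Mbit = 10**6 bits)."""
    return bytes_per_sec * 8 / 1_000_000


# Hypothetical sample figures, chosen only to show the arithmetic.
required = min_replication_bandwidth(
    write_bytes_d=4_000_000,   # assumed write rate on D:
    write_bytes_l=1_000_000,   # assumed write rate on L:
    tcp_overhead=500_000,      # assumed TCP overhead
    compression_factor=2,      # assumed compression factor
)
print(f"{bytes_per_sec_to_mbps(required):.1f} Mbit/s")  # prints "22.0 Mbit/s"
```

Only the replicated volumes (D: and L:) enter the sum, because C: and T: are not replicated.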
To summarize, site-to-site configurations require:
- Low-latency connections: round-trip (ping) times of no more than 15 ms between nodes of the cluster.
- The network connecting both the Public and Private adapters must appear as a single subnet.
- Network bandwidth of at least the total write bytes per second of the replicated volumes plus TCP overhead, divided by the compression factor (half of that total when the compression factor is 2).
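As a rough pre-deployment sanity check, the three requirements above could be expressed as a short script. This is a sketch under stated assumptions: the function names, addresses, subnets, and measurements are all hypothetical, and an address check only confirms that the adapters sit in the same IP subnet. It cannot prove that the link is a nonrouted LAN.

```python
import ipaddress

MAX_RTT_MS = 15.0  # latency ceiling stated in the requirements above


def on_single_subnet(addresses, subnet):
    """True if every address falls inside the given subnet (CIDR notation)."""
    network = ipaddress.ip_network(subnet, strict=False)
    return all(ipaddress.ip_address(a) in network for a in addresses)


def check_dual_site(rtt_ms, public_ips, public_subnet,
                    private_ips, private_subnet,
                    available_bps, required_bps):
    """Return a list of unmet requirements; an empty list means all checks pass."""
    problems = []
    if rtt_ms > MAX_RTT_MS:
        problems.append(f"round-trip latency {rtt_ms} ms exceeds {MAX_RTT_MS} ms")
    if not on_single_subnet(public_ips, public_subnet):
        problems.append("public adapters are not on a single subnet")
    if not on_single_subnet(private_ips, private_subnet):
        problems.append("private adapters are not on a single subnet")
    if available_bps < required_bps:
        problems.append("available bandwidth is below the estimated minimum")
    return problems


# Hypothetical measurements and addresses, for illustration only.
issues = check_dual_site(
    rtt_ms=12.0,
    public_ips=["192.0.2.10", "192.0.2.11"], public_subnet="192.0.2.0/24",
    private_ips=["10.0.0.1", "10.0.0.2"], private_subnet="10.0.0.0/24",
    available_bps=5_000_000, required_bps=2_750_000,
)
print(issues or "all site-to-site checks passed")
```

The latency and bandwidth figures have to come from your own measurements; the script only compares them against the stated limits.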