This guide is for LogRhythm Professional Services to prepare, install, and configure LogRhythm's combined HA + DR solution.
All requirements and prerequisites for both HA and DR must be met before deploying HA + DR.
For each HA pair, three static IP Addresses are needed on the Public Network. One IP Address is needed for each of the nodes of the cluster, and one IP Address is Shared. This Shared IP address can only be active on one node of the cluster at any time.
HA Host Records
In the Entities tab of the LogRhythm Client Console's Deployment Manager, a shared host record must be created for HA which includes identifiers for the SHARED IP and the SHARED HOSTNAME.
The following ports are required for the LogRhythm HA solution.
|Windows File and Print||135/TCP, 137/UDP, 138/UDP, 139/TCP, 445/TCP|
|LifeKeeper||81/TCP, 82/TCP, 1500/TCP, 1510/TCP, 3278/TCP, 3279/TCP|
|DataKeeper||9999/TCP, 10003/TCP, 10011/TCP|
Additional ports required for the LogRhythm installation are not included in the above list.
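As a pre-installation check, the TCP ports in the table above can be probed from the other node. The following is a minimal sketch, not part of the LogRhythm tooling: the function names and the `HA_TCP_PORTS` structure are illustrative assumptions, and only the port numbers come from the table. A failed connection can mean either that a firewall is blocking the port or that nothing is listening on it yet, so results are most meaningful once the HA software is installed on the target node.

```python
import socket

# TCP ports copied from the table above. The UDP ports (137, 138) are left
# out because a plain connect test cannot confirm UDP reachability.
HA_TCP_PORTS = {
    "Windows File and Print": [135, 139, 445],
    "LifeKeeper": [81, 82, 1500, 1510, 3278, 3279],
    "DataKeeper": [9999, 10003, 10011],
}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def blocked_ports(host: str, timeout: float = 2.0) -> dict:
    """Map each component to the TCP ports that could not be reached on host."""
    result = {}
    for component, ports in HA_TCP_PORTS.items():
        closed = [p for p in ports if not tcp_port_open(host, p, timeout)]
        if closed:
            result[component] = closed
    return result
```

For example, `blocked_ports("10.0.0.12")` (a hypothetical address for the peer node) returns an empty dictionary when every listed TCP port accepts a connection.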
Prior to starting the HA installation, back up critical data.
For new installations, outages are usually not an issue. However, the DataKeeper component installs a driver that requires a reboot. For this reason, you will not be able to create mirrored volumes until the system has been rebooted following the SPS installation.
Constant changes on the source volume will delay completion of the replication to the target volume. The recommended approach is to minimize or eliminate any changes to the D: and L: volumes until the source and target volumes are synchronized and in a Mirroring State.
LogRhythm recommends that all LogRhythm systems be connected to an uninterruptible power supply. A power cut may cause an Elasticsearch failure that leads to a loss of indices.
LogRhythm Software Version
LogRhythm version 7.8+ and HA software version 10.1 are required.
HA is intended to provide redundancy for hardware failure, which is not applicable to a cloud (shared infrastructure) environment. In a cloud environment, the virtual IP created by the SIOS SteelEye software cannot be appropriately moved between hosts in the event of failover. If HA functionality is required in a cloud environment, consider using Disaster Recovery (DR) or database backups. For more information, see the Disaster Recovery Installation Guide.
Nodes within an HA pair must be identical in all aspects of their components/specs. This includes storage quantity, storage type, RAID controller, memory, CPUs, Network Interfaces, and type (physical/virtual). With LogRhythm appliances, this means the node pairs must be an identical model/build and cannot cross generations. For example, two PM7500s can be configured as an HA pair, but a PM7500 cannot be paired with a PM7600 because the two are not identical. HA cannot be implemented where one node is Physical and the other is Virtual. For these scenarios, consider using Disaster Recovery (DR).
The LogRhythm XM8600 appliance does not support HA at this time. Consider using Disaster Recovery (DR) or database backups instead. For more information, see the Disaster Recovery Installation Guide.
LogRhythm Infrastructure Installer Considerations for HA
When installing LogRhythm software, the LogRhythm Infrastructure Installer should be run as a single host deployment for XM deployments. In this scenario, no special configuration is needed.
For HA installations of a PM or any distributed LogRhythm deployment, such as an XM + separate Web Console, the LogRhythm Infrastructure Installer will need to be run twice. The initial installation can be run as a single-host deployment in order to continue the LogRhythm Software installation. After completing the HA installation, run the Infrastructure Installer again on the Primary HA node as a multihost deployment, and create a single entry for the HA systems by using the shared IP address created during HA Install. Next, perform a failover to the Secondary HA node and run the deployment package that was generated on the Primary node.
Dual Site Additional Requirements
Dual Site deployments provide greater protection from disasters. However, they also have a number of additional requirements to deliver an appropriate level of performance across a distance.
The private and public network connections between cluster nodes must appear as a single, nonrouted LAN that uses technologies such as stretch virtual LANs (VLANs). In these cases, the network must be able to provide a guaranteed, maximum round-trip latency between nodes of 15 milliseconds. The NICs carrying Public data and Private data must each appear as a standard LAN to support a shared IP Address between cluster nodes.
Network bandwidth is also a key consideration.
In order to characterize the network traffic, a Search Optimized configuration is presented.
- 20% of logs stored in the Online database are Events and are stored in LogMart.
- 100% of logs are archived.
- The total log rate is 250 logs per second.
The following table provides examples of the results when varying compression levels.
|Active Public NIC Bytes In/sec||Active Private NIC Bytes Out/sec||Active Private NIC Compression Level (0-9)||Active Logical Disk Write Bytes/sec||Average CPU Utilization|
|60,000||3,500,000||0 (no compression)||3,300,000||26%|
The results indicate that the most benefit can be gained from using a compression level of 1 for all replicated volumes without significant negative impact on the CPU. Additional compression increases the load on the CPU without offering notable gain in terms of bandwidth saved.
Measuring the sum of the Total Write Bytes on the D: and L: volumes divided by the optimal compression factor provides a guide for network sizing.
((Write Bytes / Sec for D:) + (Write Bytes / Sec for L:)) / (Compression Factor) = Minimum required Network Bandwidth
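The formula above can be worked through as a short calculation. This is a sketch under stated assumptions: the function names are illustrative, the split of write traffic between D: and L: is hypothetical, and the compression factor of 2 is inferred from the summary requirement of half the total write bytes rather than measured.

```python
def min_bandwidth_bytes_per_sec(write_d: float, write_l: float,
                                compression_factor: float) -> float:
    """Apply the sizing formula: (D: writes + L: writes) / compression factor."""
    if compression_factor <= 0:
        raise ValueError("compression_factor must be positive")
    return (write_d + write_l) / compression_factor


def bytes_to_megabits(bytes_per_sec: float) -> float:
    """Convert bytes/sec to megabits/sec (1 Mbit = 1,000,000 bits)."""
    return bytes_per_sec * 8 / 1_000_000


# Hypothetical measurements: 2,000,000 B/s on D: and 1,300,000 B/s on L:.
# Compression factor 2 is an assumption based on the "half the total write
# bytes" guidance; substitute the factor observed in your environment.
required = min_bandwidth_bytes_per_sec(2_000_000, 1_300_000, 2)
print(required)                     # 1650000.0 bytes/sec
print(bytes_to_megabits(required))  # 13.2 Mbit/sec
```

Measure the write rates over a representative period, because sizing for an average rate can leave the link saturated during peaks.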
To summarize, site-to-site configurations require:
- Low latency connections – less than 15 ms ping times between nodes of the cluster
- The network connecting both the Public and Private adapters must appear as a single subnet
- Network bandwidth of at least half the total write bytes per second of the replicated volumes
Microsoft automatically creates a DNS record for each HA node that is added to a domain. The DNS records for the Shared Machine Name and Shared Public IP address are not created automatically and should be added manually. Use the DNS snap-in of the Microsoft Management Console to create these. Include a pointer record (PTR) for each by selecting the Create associated pointer (PTR) record check box. For more information on managing DNS records, see the Microsoft Developer Network library.
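Once the shared records have been added manually, forward and reverse resolution can be checked from either node. The sketch below is illustrative only: `check_dns_records` and its parameters are assumed names, and comparing only the short host name (case-insensitively) is a simplification. It relies on the standard library calls `socket.gethostbyname` and `socket.gethostbyaddr`, which can be replaced with other resolvers through the optional arguments.

```python
import socket


def check_dns_records(hostname, expected_ip,
                      forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Return a list of problems found with the A and PTR records."""
    problems = []

    # Forward lookup: the shared host name should resolve to the shared IP.
    try:
        resolved = forward(hostname)
    except OSError as exc:
        problems.append(f"A record lookup failed for {hostname}: {exc}")
    else:
        if resolved != expected_ip:
            problems.append(
                f"{hostname} resolves to {resolved}, expected {expected_ip}")

    # Reverse lookup: the shared IP should map back to the shared host name.
    try:
        ptr_name = reverse(expected_ip)[0]
    except OSError as exc:
        problems.append(f"PTR lookup failed for {expected_ip}: {exc}")
    else:
        # Simplification: compare short names only, ignoring case.
        if ptr_name.split(".")[0].lower() != hostname.split(".")[0].lower():
            problems.append(
                f"{expected_ip} maps to {ptr_name}, expected {hostname}")

    return problems
```

An empty list means both records resolved as expected. For example, `check_dns_records("lr-ha-shared", "10.0.0.30")` uses a hypothetical shared name and address.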
The LogRhythm SIEM must be deployed on both the Primary and Secondary sites using the same LogRhythm software version.
SQL Server, SQL Server Agent, and LogRhythm Service Registry configuration
Configure the SQL Server, SQL Server Agent, and LogRhythm Service Registry services to run under the same account on both the Primary and Secondary sites. This should be a named, privileged account that is not the sa account. The account can be either:
Configure the network so that:
Ensure that the SQL Server port (1433) and the ports used for replication between the two sites (default is 5022) are open (not blocked by a firewall) at both sites. The DR setup automatically opens ports secured by Windows Firewall, but not by other types of firewalls.
Domain Name Server (DNS) requirements
A common DNS A record needs to be provisioned within the DNS zone the Disaster Recovery systems are deployed to. This operation is not performed automatically by DR Setup and requires manual intervention by a network administrator.
Configure DNS so that:
Disk space requirements on Platform Managers
During the DR setup, you must back up the Primary Platform Manager’s databases and copy them to the Secondary system. The DR installation program will check your database sizes and give you an estimate for the disk space requirements. You can also use a network drive for the backup, provided that the SQL Agent service account has write access to the share.
The database backup may take hours to complete, depending on the data size and the write-speed of the backup media.