Install a LogRhythm HA + DR Combined Solution

This guide is for LogRhythm Professional Services to prepare, install, and configure LogRhythm's combined HA + DR solution.

Prerequisites

All requirements and prerequisites for both HA and DR must be met before deploying HA + DR.

High Availability

IP Addresses

For each HA pair, three static IP Addresses are needed on the Public Network. One IP Address is needed for each of the nodes of the cluster, and one IP Address is Shared. This Shared IP address can only be active on one node of the cluster at any time.

HA Host Records

In the Entities tab of the LogRhythm Client Console's Deployment Manager, a shared host record must be created for HA which includes identifiers for the SHARED IP and the SHARED HOSTNAME.

Ports

The following ports are required for the LogRhythm HA solution.

Component | Ports
Windows File and Print | 135/TCP, 137/UDP, 138/UDP, 139/TCP, 445/TCP
LifeKeeper | 81/TCP, 82/TCP, 1500/TCP, 1510/TCP, 3278/TCP, 3279/TCP
DataKeeper | 9999/TCP, 10003/TCP, 10011/TCP

Additional ports required for the LogRhythm installation are not included in the above list.
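
As a rough pre-installation check, the TCP ports listed for HA can be probed from one node to the other. The following is a minimal sketch using only the Python standard library; it is not a LogRhythm or SIOS tool, the function names are illustrative, and the UDP ports (137, 138) are not probed. A port can also fail this check simply because the service behind it is not installed or running yet, so a failure is a prompt to investigate rather than proof of a firewall block.

```python
import socket

# TCP ports from the HA port list; UDP ports 137 and 138 are not probed here.
HA_TCP_PORTS = {
    "Windows File and Print": [135, 139, 445],
    "LifeKeeper": [81, 82, 1500, 1510, 3278, 3279],
    "DataKeeper": [9999, 10003, 10011],
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def blocked_ports(host, ports_by_component=HA_TCP_PORTS, timeout=2.0):
    """Return {component: [ports that could not be reached]} for one host."""
    blocked = {}
    for component, ports in ports_by_component.items():
        failed = [p for p in ports if not tcp_port_open(host, p, timeout)]
        if failed:
            blocked[component] = failed
    return blocked
```

Calling blocked_ports with the other node's address (for example a hypothetical "ha-node-2.example.local") returns an empty dictionary when every listed TCP port accepted a connection.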

Backup

Prior to starting the HA installation, back up critical data.

Outages

For new installations, outages are usually not an issue. However, the DataKeeper component installs a driver that requires a reboot. For this reason, you will not be able to create mirrored volumes until the system has been rebooted following the SPS installation.

Constant changes on the source volume will delay the completion of the replica to the target volume. The recommended approach is to minimize or eliminate any changes to the D: and L: volumes until the source and target volumes are synchronized and in a Mirroring State.

Power Supply

LogRhythm recommends that all LogRhythm systems be connected to an uninterruptible power supply. A power cut may cause an Elasticsearch failure that leads to a loss of indices.

LogRhythm Software Version

LogRhythm version 7.8+ and the current HA software version (10.x) are required. 

Disable Automatic Page File Setting

The virtual memory setting "Automatically manage paging file size for all drives" must be disabled on the HA nodes. If it is left at the default (enabled) setting, Windows may create page files on mirrored drives, which stops the installation and breaks existing mirrors. SIOS software rejects any mirrored volume containing a page file. Move any necessary page files to a non-mirrored drive.

Installation Environment

Cloud infrastructure is not supported in High Availability (HA) environments.

HA is intended to provide redundancy for hardware failure, which is not applicable to a cloud (shared infrastructure) environment. In a cloud environment, the virtual IP created by the SIOS SteelEye software cannot be appropriately moved between hosts in the event of failover. If HA functionality is required in a cloud environment, consider using Disaster Recovery (DR) or database backups. For more information, see the Disaster Recovery Installation Guide.

Nodes within an HA pair must be identical in all aspects of their components and specifications. This includes storage quantity, storage type, RAID controller, memory, CPUs, network interfaces, and type (physical or virtual). With LogRhythm appliances, this means the node pairs must be an identical model and build and cannot cross generations. For example, two PM7500s can be configured as an HA pair, but a PM7500 cannot be paired with a PM7600 because the models do not match. HA cannot be implemented where one node is physical and the other is virtual. For these scenarios, consider using Disaster Recovery (DR).

The LogRhythm XM8600 appliance does not support HA at this time. Consider using Disaster Recovery (DR) or database backups instead. For more information, see the Disaster Recovery Installation Guide.

LogRhythm Infrastructure Installer Considerations for HA

When installing LogRhythm software, the LogRhythm Infrastructure Installer should be run as a single host deployment for XM deployments. In this scenario, no special configuration is needed.

For HA installations of a PM or any distributed LogRhythm deployment, such as an XM + separate Web Console, the LogRhythm Infrastructure Installer will need to be run twice. The initial installation can be run as a single-host deployment in order to continue the LogRhythm Software installation. After completing the HA installation, run the Infrastructure Installer again on the Primary HA node as a multihost deployment, and create a single entry for the HA systems by using the shared IP address created during HA Install. Next, perform a failover to the Secondary HA node and run the deployment package that was generated on the Primary node.

Dual Site Additional Requirements

Dual Site deployments provide greater protection from disasters. However, they also have a number of additional requirements to deliver an appropriate level of performance across a distance.

The private and public network connections between cluster nodes must appear as a single, nonrouted LAN that uses technologies such as stretch virtual LANs (VLANs). In these cases, the network must be able to provide a guaranteed, maximum round-trip latency between nodes of 15 milliseconds. The NICs carrying Public data and Private data must each appear as a standard LAN to support a shared IP Address between cluster nodes.

Network bandwidth is also a key consideration.

In order to characterize the network traffic, a Search Optimized configuration is presented.

  • 20% of logs stored in the Online database are Events and are stored in LogMart.
  • 100% of logs are archived.
  • The total log rate is 250 logs per second.

The following table provides examples of the results when varying compression levels.

Active Public NIC Bytes In/sec | Active Private NIC Bytes Out/sec | Active Private NIC Compression Level (0-9) | Active Logical Disk Write Bytes/sec | Average CPU Utilization
60,000 | 3,500,000 | 0 (no compression) | 3,300,000 | 26%
60,000 | 1,800,000 | 1 | 3,300,000 | 31%
60,000 | 1,600,000 | 2 | 3,300,000 | 33%
60,000 | 1,550,000 | 3 | 3,300,000 | 36%
60,000 | 1,350,000 | 9 | 3,300,000 | 39%

The results indicate that the most benefit can be gained from using a compression level of 1 for all replicated volumes without significant negative impact on the CPU. Additional compression increases the load on the CPU without offering notable gain in terms of bandwidth saved.
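
To make the trade-off concrete, the measurements from the compression table can be compared against the uncompressed baseline. The sketch below is illustrative only: the figures are copied from the table, while the function name and the derived columns (percentage saved, extra CPU points, compression factor) are introduced here for the comparison.

```python
# Rows from the compression table:
# (compression level, private NIC bytes out/sec, average CPU utilization in %)
MEASUREMENTS = [
    (0, 3_500_000, 26),
    (1, 1_800_000, 31),
    (2, 1_600_000, 33),
    (3, 1_550_000, 36),
    (9, 1_350_000, 39),
]


def compression_summary(rows):
    """Compare each level with the first row (level 0, no compression)."""
    _, base_bytes, base_cpu = rows[0]
    summary = []
    for level, bytes_out, cpu in rows:
        summary.append({
            "level": level,
            # Share of replication traffic removed relative to level 0.
            "saved_pct": round(100 * (base_bytes - bytes_out) / base_bytes, 1),
            # Additional CPU utilization in percentage points.
            "extra_cpu_points": cpu - base_cpu,
            # Ratio of uncompressed to compressed traffic.
            "compression_factor": round(base_bytes / bytes_out, 2),
        })
    return summary


for row in compression_summary(MEASUREMENTS):
    print(row)
```

In these figures, level 1 removes close to half of the replication traffic for 5 additional CPU points, while level 9 removes roughly 61% for 13 additional points, which is the diminishing return the paragraph above describes.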

The sum of the write bytes per second on the D: and L: volumes, divided by the optimal compression factor, provides a guide for network sizing.

((Write Bytes / Sec for D:) + (Write Bytes / Sec for L:)) / (Compression Factor) = Minimum required Network Bandwidth
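
The formula can be applied directly to measured values. The following is a minimal sketch with hypothetical inputs: the split of 2,000,000 and 1,300,000 write bytes per second between D: and L: is invented for illustration (it sums to the 3,300,000 shown in the table), and a compression factor of 2 is assumed, roughly what compression level 1 achieved in the example measurements. The function names are illustrative.

```python
def min_bandwidth_bytes_per_sec(write_d, write_l, compression_factor):
    """Apply the sizing formula: (D: writes + L: writes) / compression factor.

    write_d and write_l are write bytes per second for the D: and L: volumes.
    """
    if compression_factor <= 0:
        raise ValueError("compression_factor must be positive")
    return (write_d + write_l) / compression_factor


def bytes_per_sec_to_mbit(bytes_per_sec):
    """Convert bytes/sec to megabits/sec (1 Mbit = 1,000,000 bits)."""
    return bytes_per_sec * 8 / 1_000_000


# Hypothetical measurements; replace with values observed on the source node.
required = min_bandwidth_bytes_per_sec(2_000_000, 1_300_000, 2)
print(required, "bytes/sec =", bytes_per_sec_to_mbit(required), "Mbit/sec")
```

The result is a lower bound for sustained replication traffic only; it does not account for protocol overhead or for the other traffic that shares the link.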

To summarize, site-to-site configurations require:

  • Low latency connections – less than 15 ms ping times between nodes of the cluster
  • The network connecting both the Public and Private adapters must appear as a single subnet
  • Network bandwidth equal to half the total write bytes of the replicated volumes

DNS Records

Microsoft automatically creates a DNS record for each HA node that is added to a domain. The DNS records for the Shared Machine Name and Shared Public IP address are not created automatically and must be added manually. Use the DNS snap-in of the Microsoft Management Console to create these. Include a pointer record (PTR) for each by selecting the Create associated pointer (PTR) record check box. For more information on managing DNS records, see the Microsoft Developer Network library.
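
Once the shared records have been added, the forward (A) and reverse (PTR) lookups can be verified from a node. The following is a minimal sketch using the Python standard library; the hostname and IP address in the usage note are hypothetical, the function name is illustrative, and the reverse check compares only the first label of the name, which is an assumption that may be too lenient for some environments.

```python
import socket


def forward_and_reverse_match(hostname, expected_ip,
                              resolve=socket.gethostbyname,
                              reverse=socket.gethostbyaddr):
    """Check that hostname resolves to expected_ip and that the IP maps back.

    The resolve and reverse parameters default to the system resolver and can
    be replaced with stand-ins for testing.
    """
    result = {"forward_ok": False, "reverse_ok": False}
    try:
        result["forward_ok"] = resolve(hostname) == expected_ip
    except OSError:
        pass  # Name did not resolve; forward_ok stays False.
    try:
        ptr_name, aliases, _ = reverse(expected_ip)
        # Compare short names (first DNS label), case-insensitively.
        found = {n.split(".")[0].lower() for n in [ptr_name, *aliases]}
        result["reverse_ok"] = hostname.split(".")[0].lower() in found
    except OSError:
        pass  # No PTR record found; reverse_ok stays False.
    return result
```

For example, forward_and_reverse_match("lr-ha-shared", "10.0.0.50") returns both flags as True only when the A record and the associated PTR record are in place.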

Disaster Recovery

LogRhythm SIEM

The LogRhythm SIEM must be deployed on both the Primary and Secondary sites using the same LogRhythm software version.

SQL Server, SQL Server Agent, and LogRhythm Service Registry configuration

Configure the SQL Server, SQL Server Agent, and LogRhythm Service Registry services to run under the same account on both the Primary and Secondary sites. This should be a named, privileged account that is not the sa account. The account can be either:

  • A domain account
  • Identical local user accounts

Network recommendations

Configure the network so that:

  • A dedicated network is used for data replication from the Primary to the Secondary site, so that replication traffic is isolated from other network traffic (recommended).
  • The IP addresses are configured on the dedicated interfaces, but they do not need to be on the same subnet.
  • The network supports:
      o Bandwidth: 10 Mb/second
      o Latency: 150 milliseconds (maximum)

Ports/Firewall

Ensure that the SQL Server port (1433) and the ports used for replication between the two sites (default is 5022) are open (not blocked by a firewall) at both sites. The DR setup automatically opens ports secured by Windows Firewall, but not by other types of firewalls.

Domain Name Server (DNS) requirements

A common DNS A record needs to be provisioned within the DNS zone the Disaster Recovery systems are deployed to. This operation is not performed automatically by DR Setup and requires manual intervention by a network administrator.

Configure DNS so that:

  • The common A record can point to either the IP address of the Primary Platform Manager or the IP address of the Secondary Platform Manager.
  • The Data Indexers and AI Engines point to the Platform Manager using a DNS name rather than an IP address. The Data Indexers and AI Engines can optionally have a shared name, but it is not necessary.
  • DNS Zones should span the Primary and Secondary sites.
  • DNS Address records should be configured with a TTL (Time to Live) of two minutes so that failover occurs relatively quickly.

Disk space requirements on Platform Managers

During the DR setup, you must back up the Primary Platform Manager’s databases and copy them to the Secondary system. The DR installation program will check your database sizes and give you an estimate for the disk space requirements. You can also use a network drive for the backup, provided that the SQL Agent service account has write access to the share.
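
Before starting the backup, the free space on the intended target can be compared with the estimate reported by the DR installation program. The following is a minimal sketch using the Python standard library; the 10% headroom is an assumption added here rather than a LogRhythm requirement, and the function name and the path in the usage note are hypothetical.

```python
import shutil


def has_room_for_backup(path, required_bytes, headroom=0.10):
    """Return True if the volume holding path has enough free space.

    required_bytes is the estimate from the DR installation program.
    headroom adds a safety margin (10% by default, an assumed value).
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes * (1 + headroom)
```

For example, has_room_for_backup("D:\\Backup", 500 * 1024**3) checks for 500 GiB plus the margin on a hypothetical backup folder. The same call works for a network drive, provided the path is reachable from the account running the script.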

The database backup may take hours to complete, depending on the data size and the write-speed of the backup media.


