Perform Platform Updates for HA Deployments
The LogRhythm HA Solution provides the capability to shift active resources quickly from one node in the cluster to the other to minimize downtime for security patches, operating system enhancements, SQL database updates, and driver updates.
Before performing any updates or switchover of Resource Hierarchies, ensure that replicated volumes D: and L: on the Standby node have a status of Mirroring.
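The pre-check above can be treated as a simple gate. The following is a minimal Python sketch, assuming the operator has already read the volume states (for example, from the DataKeeper GUI) into a dictionary. The function names and the input format are illustrative and are not part of any LogRhythm or SIOS tooling.

```python
REQUIRED_VOLUMES = ("D:", "L:")  # replicated volumes named in this guide


def volumes_not_mirroring(states, required=REQUIRED_VOLUMES):
    """Return the required volumes whose recorded status is not 'Mirroring'.

    `states` maps a volume letter to the status string observed on the
    Standby node (hypothetical input, gathered manually by the operator).
    A volume missing from `states` is treated as not mirroring.
    """
    return [
        vol for vol in required
        if states.get(vol, "Unknown").strip().lower() != "mirroring"
    ]


def safe_to_update(states):
    """True only when every required volume reports Mirroring."""
    return not volumes_not_mirroring(states)
```

For example, `volumes_not_mirroring({"D:": "Mirroring", "L:": "Paused"})` returns `["L:"]`, which signals that the update or switchover should wait.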
Operating System Updates
Apply OS updates to the Standby node first, and then reboot it to confirm that it is still operational.
In this example, System1 has all Resource Hierarchies in an Active state and is in a cluster with System2, which has all Resource Hierarchies in a Standby state.
- Apply OS patches to System2.
- Review the Event Logs for warnings and errors to confirm System2 is operating normally.
- Use the LifeKeeper GUI to bring the Resource Hierarchy In Service on System2.
- Review the Event Logs and LogRhythm logs for warnings and errors to confirm System2 is operating normally.
- Apply OS patches to System1.
- Use the LifeKeeper GUI to bring the Resource Hierarchy In Service on System1.
- Review the Event Logs and LogRhythm logs for warnings and errors to confirm that System1 is operating normally.
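The Event Log review in the steps above can be partly scripted. The following is a minimal Python sketch, assuming the built-in Windows `wevtutil` utility is available on the node. The function names, the default log name, and the event count are illustrative choices, and the LogRhythm logs still need to be reviewed separately.

```python
import subprocess


def build_event_query(log_name="System", max_events=20):
    """Build a wevtutil command that lists recent errors and warnings.

    Level 2 is Error and Level 3 is Warning in the Windows event schema.
    """
    xpath = "*[System[(Level=2 or Level=3)]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # human-readable output
    ]


def recent_warnings_and_errors(log_name="System", max_events=20):
    """Run the query on the local node (Windows only) and return its text."""
    result = subprocess.run(
        build_event_query(log_name, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `recent_warnings_and_errors("Application")` on the patched node would return the most recent warnings and errors from the Application log for manual review.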
LogRhythm Binary and Database Updates
LogRhythm binary (executable) updates should be applied to the Standby node. Since the LogRhythm services are not active on the Standby node, typically no reboot is required.
In the following example, System1 has all Resource Hierarchies in an Active state and is in a cluster with System2, which has all Resource Hierarchies in a Standby state.
- Apply LogRhythm binary updates to System2 (standby).
- Review the Event Logs for warnings and errors to confirm System2 (standby) is operating normally.
- From the System1 (active) node of the HA pair, open the LifeKeeper GUI (Admin Only).
- On System1 (active), for each individual LogRhythm service ResTag, right-click the top-level resource (for example, scmedsvr_ResTag) on the active side and select Out of Service. This stops the LogRhythm service but keeps its HA dependencies active.
- Repeat the previous step for each protected LogRhythm service.
- Use a tool other than LifeKeeper, such as the Services console (services.msc), to verify that all relevant LogRhythm services have been stopped.
- After you have ensured that the services are stopped on the primary node of the HA pair only, perform database upgrades as you would on any standalone system.
- After all database upgrades are complete and the LogRhythm services can be started again, open the LifeKeeper GUI (Admin Only). On the standby side, where the binaries have already been updated, right-click the top-level container for all of the protected resources (for example, XM_ResTag) and select “In Service”. This brings all of the container’s dependencies (all protected services) online on the Standby system.
- Confirm System2 is operating normally by reviewing the Event Logs and LogRhythm logs for warnings and errors.
- Apply LogRhythm binary updates to System1.
- Use the LifeKeeper GUI to bring the Resource Hierarchy In Service on System1.
- Review the Event Logs and LogRhythm logs for warnings and errors to confirm that System1 is operating normally.
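The procedure above asks for a check, outside LifeKeeper, that the LogRhythm services have stopped before the database upgrade starts. The following is a minimal Python sketch, assuming the standard Windows `sc query` command and English-language output. The service name `scmedsvr` is an assumption inferred from the ResTag example in this guide; substitute the services protected in your deployment.

```python
import re
import subprocess

# Assumed service names; replace with the services protected in your deployment.
EXAMPLE_SERVICES = ["scmedsvr"]


def parse_sc_state(sc_output):
    """Extract the state name (e.g. 'STOPPED', 'RUNNING') from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_state(name):
    """Query one service on the local node (Windows only)."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def services_still_running(names, get_state=service_state):
    """Return the services whose state is anything other than STOPPED."""
    return [n for n in names if get_state(n) != "STOPPED"]
```

An empty list from `services_still_running(EXAMPLE_SERVICES)` suggests the listed services are stopped. A service that cannot be queried is reported as not stopped, so that an unknown state is never mistaken for a stopped one.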
LogRhythm Infrastructure Installer Considerations for HA
When installing LogRhythm software, the LogRhythm Infrastructure Installer should be run as a single-host deployment for XM deployments. In this scenario, no special configuration is needed.
For HA installations of a PM or any distributed LogRhythm deployment, such as an XM with a separate Web Console, the LogRhythm Infrastructure Installer must be run twice. The initial installation can be run as a single-host deployment in order to continue the LogRhythm software installation. After completing the HA installation, run the Infrastructure Installer again on the Primary HA node as a multi-host deployment, and create a single entry for the HA systems by using the shared IP address created during the HA installation. Next, perform a failover to the Secondary HA node and run the deployment package that was generated on the Primary node.
For additional information about the LogRhythm Infrastructure Installer, see the LogRhythm Software Installation Guide.
SIOS Updates
You may upgrade from previous versions of LifeKeeper for Windows and SIOS DataKeeper for Windows while preserving your resource hierarchies by using the procedure below.
Upgrade Procedure
The following scenario illustrates the upgrade process when upgrading SIOS LifeKeeper, DataKeeper, and LifeKeeper for SQL. Upgrade LifeKeeper and LifeKeeper for SQL before upgrading SIOS DataKeeper. The LifeKeeper Services and SIOS DataKeeper Service are stopped during the upgrade process. A system reboot is required after upgrading all three components.
Given two systems (Sys1 and Sys2), with Sys1 being the primary (active) server, perform the following steps to upgrade LifeKeeper and SIOS DataKeeper:
Upgrade the Backup Server
- Exit the LifeKeeper GUI and SIOS DataKeeper GUI on backup server Sys2.
- Open a command window and enter C:\LK\bin\lkstop to stop all the LifeKeeper services. Wait until you see “LIFEKEEPER NOW STOPPED” before continuing.
If the lkstop command fails to stop the SIOS services properly, you may need to add Debug permissions to the Administrator account. To do this:
- Click Start, Administrative Tools, and Local Security Policy.
- Expand Local Policies.
- Select User Rights Assignment.
- Double-click Debug Programs and add the local administrator account.
- To upgrade LifeKeeper on the backup server Sys2, run the setup program to upgrade LifeKeeper.
- To continue upgrading LifeKeeper, click Yes. The existing LifeKeeper files are overwritten by the LifeKeeper installation.
- If necessary, install the new LifeKeeper license using the License Manager utility. Do not reboot the backup server until all of the SIOS software is upgraded.
- To upgrade LifeKeeper for SQL on the backup server Sys2, run the setup program to upgrade LifeKeeper for SQL.
- To upgrade SIOS DataKeeper on the backup server Sys2, run the SIOS DataKeeper setup program. To continue upgrading SIOS DataKeeper, click Yes.
- Open the Windows Control Panel and click Add/Remove Programs, and then check that the SIOS software reports the upgraded version.
- Reboot the backup server Sys2.
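The lkstop step in the procedure above waits for a specific completion message. The following is a minimal Python sketch of that wait, using the path and message quoted in this guide. It assumes that lkstop writes the message to standard output or standard error, and the timeout value is an arbitrary placeholder rather than a vendor recommendation.

```python
import subprocess

LKSTOP_PATH = r"C:\LK\bin\lkstop"          # path given in this guide
STOPPED_MARKER = "LIFEKEEPER NOW STOPPED"  # message given in this guide


def lifekeeper_stopped(output):
    """True when the lkstop output contains the completion message."""
    return STOPPED_MARKER in output.upper()


def stop_lifekeeper(timeout_seconds=600):
    """Run lkstop on the local node (Windows only) and confirm it finished.

    Raises RuntimeError if the completion message never appears, so that
    the upgrade does not continue against running LifeKeeper services.
    """
    result = subprocess.run(
        [LKSTOP_PATH], capture_output=True, text=True, timeout=timeout_seconds
    )
    combined = result.stdout + result.stderr
    if not lifekeeper_stopped(combined):
        raise RuntimeError("lkstop did not report completion:\n" + combined)
    return combined
```

If `stop_lifekeeper()` raises, the Debug Programs permission described in the procedure is one thing to check before retrying.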
Upgrade the Primary Server
- Exit the LifeKeeper GUI and SIOS DataKeeper GUI on primary server Sys1.
- Open a command window and enter C:\LK\bin\lkstop to stop all the LifeKeeper services. Wait until you see “LIFEKEEPER NOW STOPPED” before continuing.
If the lkstop command fails to stop the SIOS services properly, you may need to add Debug permissions to the Administrator account. To do this:
- Click Start, Administrative Tools, and Local Security Policy.
- Expand Local Policies.
- Select User Rights Assignment.
- Double-click Debug Programs and add the local administrator account.
- To upgrade LifeKeeper on the primary server Sys1, run the setup program to upgrade LifeKeeper.
- The existing LifeKeeper files are overwritten by the LifeKeeper installation. If necessary, install your new LifeKeeper license using the License Manager utility. Do not reboot the primary server until all of the SIOS software is upgraded.
- To upgrade LifeKeeper for SQL on the primary server Sys1, run the setup program to upgrade LifeKeeper for SQL.
- To upgrade SIOS DataKeeper on the primary server Sys1, run the SIOS DataKeeper setup program.
- Open the Windows Control Panel, click Add/Remove Programs, and then check that the SIOS software reports the upgraded version.
- Reboot the primary server Sys1.
- To start the LifeKeeper GUI on Sys1, click Start, Programs, LifeKeeper, and then LifeKeeper GUI, and then log in to Sys1.
SQL Server Updates
SQL Server updates differ from OS and LogRhythm updates because they require access to the database and log volumes. SQL Server binaries are updated on an unlocked drive, but the SQL Server update also applies changes to databases on a mirrored drive. Mirroring must therefore be paused to perform the upgrade and resumed after the process is complete.
Upgrade the Primary Server
- Open LifeKeeper as an administrator.
- Right-click the top-level resource and select Out of Service. Wait for resources to stop.
- Open DataKeeper as an administrator.
- Right-click the Vol.D and Vol.L ResTags and select "Pause and Unlock All Mirrors". Before proceeding, verify that these drives are unlocked on the Secondary server, and then return to the Primary server.
- Install Microsoft SQL software updates on the Primary server.
Microsoft SQL services may be stopped during the installation of some Microsoft SQL software updates.
- If required, reboot the Primary server.
- After the Microsoft SQL Server update, verify SQL is working correctly before applying the updates to the Secondary server.
Upgrade the Secondary Server
- Through the Services MMC snap-in, start the SQL Server service on the Secondary server.
- After Microsoft SQL is running on the Secondary server, install Microsoft SQL software updates.
Microsoft SQL services may be stopped during the installation of some Microsoft SQL software updates.
- If required, reboot the Secondary server.
- After the Microsoft SQL Server resource is active on the Secondary server, verify SQL is working correctly before returning to the Primary server.
- Return to the Primary server.
- Open the DataKeeper GUI as an administrator, right-click the Vol.D and Vol.L ResTags, and select Continue and Lock All Mirrors.
- A partial resync occurs. When the replicated volumes are in the Mirroring state, open LifeKeeper as an administrator. Right-click the top-level resource and bring the hierarchy back into service.
The mirrors must be in the Mirroring state prior to performing a manual in-service.
- When the platform update is complete, verify that all LogRhythm services are in-service on the Primary server.
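Waiting for the partial resync to finish before the manual in-service can be pictured as a polling loop. The following is a minimal Python sketch, assuming a caller-supplied `get_states` function (hypothetical) that reports the current state of each volume. How those states are obtained is deliberately left open, and the timeout and polling interval are arbitrary placeholders.

```python
import time


def wait_for_mirroring(get_states, volumes=("D:", "L:"),
                       timeout_seconds=3600, poll_seconds=30,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until every volume reports 'Mirroring' or the timeout expires.

    `get_states` is a caller-supplied function (hypothetical) that returns
    a dict such as {"D:": "Resync", "L:": "Mirroring"}.
    Returns True if all volumes reached Mirroring, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        states = get_states()
        if all(states.get(v) == "Mirroring" for v in volumes):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

Only when the function returns True would the hierarchy be brought back into service. A False result means the mirrors need attention first.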
System Monitor Agent Updates
The following steps detail the process to update/patch a High Availability/System Monitor Agent pairing.
- Remove the Top Level HA ResTag to prevent failover.
- Access the Secondary Node and update/patch as required.
- Restore the Top Level HA ResTag.
- Fail over from Node 1 to Node 2.
- Ensure the Secondary Node (Node 2) has come back online as expected and is heartbeating and processing within the SIEM.
At this step it is important to ensure that the Agent is operational on the Secondary Node (Node 2) prior to altering Node 1.
If there are any issues with Node 2, contact LogRhythm Support and fail back to Node 1 to restore operations in the meantime. If the upgrade/patching was successful, proceed to step 6.
- Remove the Top Level HA ResTag to prevent failover.
- Access the Primary Node and update/patch as required, ensuring the changes are identical to the changes made to Node 2 in step 2.
- Restore the Top Level HA ResTag.
- Fail over from Node 2 to Node 1.
- Ensure the Primary Node (Node 1) has come back online as expected and is heartbeating and processing within the SIEM.
Similar to step 5, if there are any issues with Node 1, contact LogRhythm Support before continuing.
- Perform a test failover to verify that patching/upgrading is completed.
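The agent procedure above applies the same five actions to each node in turn and then ends with a test failover. The following is a minimal Python sketch that generates that sequence as a checklist. The function name and the step wording are illustrative, and nothing here interacts with LifeKeeper or the agent.

```python
def agent_update_plan(first="Node 2", second="Node 1"):
    """Return the ordered checklist for a rolling agent update.

    The Secondary node (`first`) is patched before the Primary (`second`),
    matching the order used in this guide.
    """
    steps = []
    for target, other in ((first, second), (second, first)):
        steps += [
            "Remove the top-level HA ResTag to prevent failover",
            f"Update/patch {target}",
            "Restore the top-level HA ResTag",
            f"Fail over from {other} to {target}",
            f"Verify {target} is heartbeating and processing",
        ]
    steps.append("Perform a test failover")
    return steps
```

Printing the result with `enumerate(agent_update_plan(), start=1)` yields an 11-item list in which the verification items fall at positions 5 and 10, the points where the procedure says to stop and contact LogRhythm Support if a node is unhealthy.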