
High Availability Patching Procedure

Use these procedures to patch the Windows operating system and SQL Server.

The LogRhythm High Availability (HA) Solution facilitates the seamless transfer of active resources between nodes within the cluster, thereby minimizing service interruptions during the application of security patches, operating system upgrades, SQL database maintenance, and driver updates.

Lightweight Patching (Non-Mirrored Drives Impacted)

Lightweight patching refers to updates applied exclusively to the operating system drive (typically the C:\ drive), such as Windows security patches (not SQL patches). In this scenario:

  • The inactive (passive) node is always the patch target.

  • This approach provides a direct rollback path:

    1. Patch the secondary node (e.g., HA2).

    2. Fail over to HA2.

    3. If issues arise, fail back to the primary node (e.g., HA1).

    4. If HA2 operates as expected, proceed to patch HA1.

  • Failover is required as part of the validation and rollback process.

This method minimizes risk and downtime by ensuring that one node remains operational and unmodified until the patch is verified.

Heavy Patching (Mirrored Drives Impacted)

Heavy patching involves updates to all drives, potentially including the mirrored drives (commonly the D:\, S:\, and L:\ drives), so both sides must be patched in alignment with each other. This activity is typically associated with binary or SQL patches (e.g., service pack updates). In these cases:

  • Both nodes must be patched in sequence, and downtime of the active side is required to complete the process.

  • The secondary node is patched first, primarily for compatibility testing (e.g., verifying SQL services start correctly after patching).

  • The primary node must then be taken fully offline to apply the same updates.

  • Failover is not required during the patching process since both nodes are brought to parity, but it is recommended post-patching to validate full system functionality.

This approach provides the most comprehensive coverage but typically results in a longer period of downtime. It allows all patches to be applied in a single, coordinated activity. While this method can also be used for operating system patching, we generally treat OS updates separately due to their higher frequency and lower overall impact.
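The distinction between the two patching levels can be sketched as a small helper. This is a minimal illustration rather than part of the official procedure: the function name `Get-PatchType` is hypothetical, and the default mirrored drive letters (D, S, L) are assumed from the common layout mentioned above.

```powershell
# Hypothetical helper: classify a patch by the drives it touches.
# Assumption: D, S, and L are the mirrored drives; override -MirroredDrives if yours differ.
function Get-PatchType {
    param(
        [string[]]$ImpactedDrives,
        [string[]]$MirroredDrives = @('D', 'S', 'L')
    )
    # Normalize inputs such as "d:\" or "D" to a single upper-case letter.
    $impacted = $ImpactedDrives | ForEach-Object { $_.Trim().Substring(0, 1).ToUpper() }
    foreach ($drive in $impacted) {
        # Any mirrored drive in the list means both nodes must be patched in alignment.
        if ($MirroredDrives -contains $drive) { return 'Heavy' }
    }
    return 'Lightweight'
}

# Example: a Windows security patch that only touches the OS drive.
Get-PatchType -ImpactedDrives 'C:\'
```

When it is unclear which drives a patch affects, treating it as Heavy is the more conservative choice.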

Health Check

Before making any updates, a health check must be performed to ensure the deployment is functioning properly. Skipping this step can lead to system issues, extended downtime, and potential data loss. Proceeding without verification significantly increases the risk of service disruption.

  • Conduct a Comprehensive Health Check
    Ensure all system components are online and operating as expected before initiating any updates.

  • Verify Database Integrity via Client Console
    Log in to the Client Console to confirm the integrity and operational status of both the EMDB and LogMart databases.

  • Confirm Component Responsiveness in Deployment Monitor
    Access the Deployment Monitor to verify that all core components are actively heartbeating and responding within the expected one-minute interval.

  • Perform Functional Validation Tests
    Execute additional functionality tests as needed, including validation of AIE Alarms, Drilldowns, and Search capabilities.

  • Review SQL Maintenance Job Status
    Open SQL Server Management Studio and confirm that all scheduled SQL maintenance jobs have completed successfully.
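Part of the component check can be scripted as a complement to the Client Console and Deployment Monitor checks, not a replacement for them. The sketch below lists services that are not running. The function name `Get-ServiceNotRunning` is hypothetical; the `'LogRhythm*'` display-name filter is the same one used later in this procedure.

```powershell
# Hypothetical helper: pass service objects through and emit only those not running.
function Get-ServiceNotRunning {
    param(
        [Parameter(ValueFromPipeline)]
        $Service
    )
    process {
        # Status is compared as text so the filter works with any object exposing a Status property.
        if ("$($Service.Status)" -ne 'Running') { $Service }
    }
}

# On the active node (Windows only):
# Get-Service -DisplayName 'LogRhythm*' | Get-ServiceNotRunning |
#     Format-Table Status, Name, DisplayName
```

An empty result on the active node suggests that every matching service is running. Services that are intentionally disabled, and services on the passive node, will also appear in the output, so review the list rather than treating any entry as a failure.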

Prerequisites and Pre-Patching Requirements

Before beginning the patching process, please ensure the following prerequisites are met:

  1. System Database Backups: Verify that current backups exist for all system databases (Master, MSDB, and Model). These backups serve as your rollback plan in case remediation is required.

  2. Antivirus and Endpoint Detection & Response (AV/EDR): Ensure that all AV/EDR services are disabled before patching begins.

  3. Node Isolation: Use LifeKeeper (LK) to remove service-level protection by taking the resource hierarchy "Out of Service." This allows services to start and stop independently.

  4. Mirror Unlock: Unlock all mirrors via DataKeeper (DK). This unlocks the drives on both the source and target sides.
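The backup prerequisite can be checked with a query against the backup history in msdb. The following is a sketch under stated assumptions: it relies on the `Invoke-Sqlcmd` cmdlet from the SqlServer PowerShell module, the `localhost` instance name and the 24-hour freshness threshold are placeholders, and the function name `Get-StaleBackup` is hypothetical.

```powershell
# Most recent full backup (type 'D') for each system database, from msdb backup history.
$query = @'
SELECT d.name, MAX(b.backup_finish_date) AS LastFullBackup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
    ON b.database_name = d.name AND b.type = 'D'
WHERE d.name IN ('master', 'msdb', 'model')
GROUP BY d.name
'@

# Hypothetical helper: return the names of databases with no backup or an old one.
# Assumption: 24 hours counts as "current"; adjust -MaxAgeHours to your backup policy.
function Get-StaleBackup {
    param(
        [object[]]$Rows,
        [int]$MaxAgeHours = 24,
        [datetime]$Now = (Get-Date)
    )
    foreach ($row in $Rows) {
        $last = $row.LastFullBackup
        # A SQL NULL arrives as DBNull; treat it the same as "never backed up".
        if ($null -eq $last -or $last -is [DBNull] -or ($Now - $last).TotalHours -gt $MaxAgeHours) {
            $row.name
        }
    }
}

# On the active node (requires the SqlServer module; instance name is an assumption):
# Get-StaleBackup -Rows (Invoke-Sqlcmd -ServerInstance 'localhost' -Query $query)
```

Any database name returned should be backed up before patching begins. A backup recorded in msdb does not prove that the backup file is still available, so confirm that separately.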

Patching Steps

Phase 1: HA2 Passive Node Patching (both Lightweight and Heavy Patching)

The passive node (HA2) is always patched first to validate system stability with the applied patches, regardless of patch type.

  1. Start the SQL Server instance on HA2, ensuring both the SQL Server Agent and SQL Server services are running.

  2. Apply required patch updates to the HA2 node.

  3. Reboot the HA2 node if required by the update process.

  4. Verify that all volumes remain unlocked following the patching and reboot cycle.

  5. If volumes are locked after patching, unlock them using the GUI first. If the GUI is unresponsive, use the command line to unlock them manually:

    CODE
    cd %extmirrbase%
    EMCMD . UNLOCKVOLUME D
    EMCMD . UNLOCKVOLUME L
    EMCMD . UNLOCKVOLUME S

  6. Ensure all required patches have been successfully installed.

  7. Start the SQL Server instance on HA2, confirming that both the SQL Server Agent and SQL Server services are operational.

  8. Review the SQL Server error log to confirm that the startup process completed without errors.
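The check that all required patches were installed can be partly scripted for Windows updates. The sketch below compares a list of expected KB identifiers with what `Get-HotFix` reports. The function name `Get-MissingKb` and the KB number in the usage line are hypothetical placeholders. Note that `Get-HotFix` reports only Windows updates, so SQL Server patch levels need to be verified separately.

```powershell
# Hypothetical helper: return the expected KB identifiers that are not in the installed list.
function Get-MissingKb {
    param(
        [string[]]$Expected,
        [string[]]$Installed
    )
    # -notcontains is case-insensitive, so 'kb123' and 'KB123' are treated as the same update.
    $Expected | Where-Object { $Installed -notcontains $_ }
}

# On the patched node (Windows only; 'KB0000000' is a placeholder, not a real update):
# Get-MissingKb -Expected 'KB0000000' -Installed (Get-HotFix).HotFixID
```

An empty result indicates that every expected Windows update is reported as installed on that node.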

At this stage, the patching process diverges depending on the required patching level. Refer to the Lightweight vs. Heavy patching procedures for guidance.

Phase 2 for Lightweight Patching: Passive Production Node Patching

This section covers phase 2 for Lightweight patching. For Heavy patching, skip to the “Phase 2 for Heavy Patching” section.

OS updates should already be applied to the standby node, and the node should have been rebooted as required to confirm that it functions correctly with the patches applied.

At this stage, the LifeKeeper GUI top-level dependency (typically XM_ResTag) should be “Out of Service” and all DataKeeper mirrors should be “Paused & Unlocked” on both sides. HA1 should still be fully operational at this point.

Phase 2A for Lightweight Patching - Validate HA2

  1. Ensure all patches have been applied and that the Windows “Check for Updates” feature reports no pending updates on the HA2 node.

  2. After HA2 reboots, review the Windows Event Logs (Update/System/Application/Security) for warnings and errors to confirm HA2 is operating normally.

  3. (Important!) Restore the DataKeeper mirrors by selecting “Continue & Lock” for each one, ensuring that all mirror statuses return to “Mirroring.”
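The event log review can be narrowed with `Get-WinEvent`. The sketch below collects warnings and errors logged since the last boot and summarizes them by log and severity. The function names `Get-PostBootEvent` and `Get-EventSummary` are hypothetical, and the default log list is an assumption; the Security log mostly records audit successes and failures rather than warning or error levels, so review it separately.

```powershell
# Hypothetical helper: collect Critical (1), Error (2), and Warning (3) events since the last boot.
# Assumption: System and Application are the logs of interest; pass -LogName to change that.
function Get-PostBootEvent {
    param(
        [string[]]$LogName = @('System', 'Application'),
        [datetime]$Since = (Get-CimInstance Win32_OperatingSystem).LastBootUpTime
    )
    # Get-WinEvent raises an error when nothing matches; suppress it so "no events" returns nothing.
    Get-WinEvent -FilterHashtable @{ LogName = $LogName; Level = 1, 2, 3; StartTime = $Since } -ErrorAction SilentlyContinue
}

# Hypothetical helper: count events per log and severity so recurring problems stand out.
function Get-EventSummary {
    param([object[]]$Records)
    $Records | Group-Object LogName, LevelDisplayName |
        Sort-Object Count -Descending |
        Select-Object Count, Name
}

# On HA2 after its final reboot (Windows only, elevated session):
# Get-EventSummary -Records (Get-PostBootEvent)
```

The summary only shows where to look. Individual events still need to be read to judge whether they relate to the patches.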

Phase 2B for Lightweight Patching - Failover and Patch HA1

  1. Once the mirrors are synchronized again, fail over to the HA2 site by using the LifeKeeper GUI to bring the HA2 top-level dependency “In Service.”

  2. (Important!) Once failover is complete, refer to the Health Check section above to verify that the system is functioning normally on the HA2 (now active) site. If any issues are detected, do not proceed with patching HA1. Instead, fail back to HA1 and open a support case for further investigation.
    At this stage, you should have a fully patched functional SIEM on the now-active HA2 site.

  3. It is advisable to “Pause & Unlock” Mirrors on the now-passive HA1 site.

  4. Repeat Phase 2A steps 1-3 on the now-passive HA1 node, bringing the site up to date with HA2.

  5. Use the LifeKeeper GUI to bring the Resource Hierarchy “In Service” on HA1 (failing back over) to validate HA1 is also functionally compatible.

Proceed to the Phase 3 section to continue with Lightweight patching.

Phase 2 for Heavy Patching: Combined Production Node Patching

This section covers phase 2 for Heavy patching. For Lightweight patching, refer back to the “Phase 2 for Lightweight Patching” section.

OS updates should already be applied to the standby node, and the node should have been rebooted as required to confirm that it functions correctly with the patches applied.

Phase 2A - Validate HA2

  1. Ensure all patches have been applied and that the Windows “Check for Updates” feature reports no pending updates on the HA2 node.

  2. After HA2 reboots, review the Windows Event Logs (Update/System/Application/Security) for warnings and errors to confirm HA2 is operating normally.

  3. Confirm that LifeKeeper is out of service and that DataKeeper mirrors remain unlocked.

Phase 2B - Patching HA1 (No Failover)

This is the stage of patching where live operations will be directly impacted.

  1. Take the system offline by stopping all services connected to the database. Stop all core LogRhythm-specific services (mainly DP, AIE, ARM, and Job Manager services - observed via “services.msc”) using the following PowerShell command:

    CODE
    Get-Service -DisplayName 'LogRhythm*' | Stop-Service

  2. Verify in the SQL Job Activity Monitor that there are no new connections, so that the databases are free.

  3. Repeat Phase 1 steps 1 through 8 on the HA1 node to apply patches and verify system stability.

  4. After all reboots are complete:

    • Restart LifeKeeper and DataKeeper services.

    • Confirm that all services on the HA1 server remain stopped.

    • Execute the following command:

      CODE
      Get-Service -DisplayName 'LogRhythm*' | Stop-Service

  5. Validate that SQL Server starts successfully. Review the SQL Server error log to confirm that no component error messages are present.

  6. Restart all services, including distributed components:

    CODE
    Get-Service -DisplayName 'LogRhythm*' | Start-Service

  7. Lock mirrors in DataKeeper and verify that the DataKeeper synchronization process has begun.

  8. Confirm that the SIEM is fully operational with both HA1 and HA2 patched.

  9. (Optional.) Conduct failover functional validation.

Phase 3: Post-Patching Validation

Comprehensive validation ensures that all components are functioning correctly and that the system is ready for production use.

  1. Execute a full health check to confirm that all components have come online properly.

  2. Log in to the Client Console to verify the integrity of the EMDB and LogMart databases.

  3. Access Deployment Monitor to confirm that all core components are heartbeating and responding within the expected one-minute interval.

  4. Perform additional functionality testing as required, including verification of AIE Alarms, Drilldowns, and Searches.

  5. (Important!) Once DataKeeper mirror synchronization is confirmed complete, restore LifeKeeper to "In Service" status.

  6. (Optional.) Conduct a High Availability failover test to validate functionality on both nodes.
    Because system stability has been verified throughout the patching process, this step is optional for final validation.

SIOS Updates

Before applying any patches or upgrades, check the Official SIOS Documentation Site for compatibility details.

You may upgrade from previous versions of LifeKeeper for Windows and SIOS DataKeeper for Windows while preserving your resource hierarchies by using the procedure below.

Upgrade Procedure

The following procedure outlines the upgrade process for SIOS LifeKeeper, DataKeeper, and LifeKeeper for SQL. You should first upgrade LifeKeeper and LifeKeeper for SQL before upgrading SIOS DataKeeper. The LifeKeeper Services and SIOS DataKeeper Service are stopped during the upgrade process. A system reboot is required after upgrading all three components.

Given two systems (HA1 and HA2), with HA1 being the primary (active) server, perform the following steps to upgrade LifeKeeper and SIOS DataKeeper:

Upgrade the Backup Server

  1. Exit the LifeKeeper GUI and SIOS DataKeeper GUI on backup server HA2.

  2. Open a command window and enter “C:\LK\bin\lkstop” to stop all the LifeKeeper services. Wait until you see “LIFEKEEPER NOW STOPPED” before continuing.

If the lkstop command fails to stop the SIOS services properly, you may need to add Debug permissions to the Administrator account. To do this:

  1. Click Start, Administrative Tools, and Local Security Policy.

  2. Expand Local Policies.

  3. Select User Rights Assignment.

  4. Double-click Debug Programs and add the local administrator account.

  3. To upgrade LifeKeeper on the backup server HA2, run the setup program to upgrade LifeKeeper.

  4. To continue upgrading LifeKeeper, click Yes.
    The existing LifeKeeper files are overwritten by the LifeKeeper installation.

  5. If necessary, install the new LifeKeeper license using the License Manager utility.
    Do not reboot the backup server until all of the SIOS software is upgraded.

  6. To upgrade LifeKeeper for SQL on the backup server HA2, run the setup program to upgrade LifeKeeper for SQL.

  7. To upgrade SIOS DataKeeper on the backup server HA2, run the SIOS DataKeeper setup program.
    To continue upgrading SIOS DataKeeper, click Yes.

  8. Open the Windows Control Panel and click Add/Remove Programs, and then check that the SIOS software reports the upgraded version.

  9. Reboot the backup server HA2.

Upgrade the Primary Server

  1. Exit the LifeKeeper GUI and SIOS DataKeeper GUI on primary server HA1.

  2. Open a command window and enter “C:\LK\bin\lkstop” to stop all the LifeKeeper services. Wait until you see “LIFEKEEPER NOW STOPPED” before continuing.

If the lkstop command fails to stop the SIOS services properly, you may need to add Debug permissions to the Administrator account. To do this:

  1. Click Start, Administrative Tools, and Local Security Policy.

  2. Expand Local Policies.

  3. Select User Rights Assignment.

  4. Double-click Debug Programs and add the local administrator account.

  3. To upgrade LifeKeeper on the primary server HA1, run the setup program to upgrade LifeKeeper.

  4. To continue upgrading LifeKeeper, click Yes.
    The existing LifeKeeper files are overwritten by the LifeKeeper installation.

  5. If necessary, install your new LifeKeeper license using the License Manager utility.
    Do not reboot the primary server until all of the SIOS software is upgraded.

  6. To upgrade LifeKeeper for SQL on the primary server HA1, run the setup program to upgrade LifeKeeper for SQL.

  7. To upgrade SIOS DataKeeper on the primary server HA1, run the SIOS DataKeeper setup program.

  8. Open the Windows Control Panel, click Add/Remove Programs, and then check that the SIOS software reports the upgraded version.

  9. Reboot the primary server HA1.

  10. To start the LifeKeeper GUI on HA1, click Start, then Programs > LifeKeeper > LifeKeeper GUI, and then log in to HA1.
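The version check performed in Add/Remove Programs can also be done from PowerShell by reading the uninstall entries in the registry, which is convenient when comparing HA1 and HA2. This is a sketch only: the function name `Select-SiosProduct` is hypothetical, and the display-name pattern is an assumption about how the SIOS products are listed on your systems.

```powershell
# Hypothetical helper: keep only uninstall entries that look like SIOS products.
# Assumption: the product display names contain "SIOS", "LifeKeeper", or "DataKeeper".
function Select-SiosProduct {
    param(
        [Parameter(ValueFromPipeline)]
        $Entry
    )
    process {
        # Entries without a DisplayName do not match and are skipped.
        if ($Entry.DisplayName -match 'SIOS|LifeKeeper|DataKeeper') {
            [pscustomobject]@{ Name = $Entry.DisplayName; Version = $Entry.DisplayVersion }
        }
    }
}

# On each node (Windows only); both registry views are read to cover 32-bit and 64-bit installers:
# Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\*',
#     'HKLM:\SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall\*' |
#     Select-SiosProduct
```

Running the same command on both nodes and comparing the output gives a quick confirmation that HA1 and HA2 report matching versions.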

System Monitor Agent Updates

For dedicated HA System Monitors that are not hosted on the PM/XM, apply the same patch process outlined in the steps above. Lightweight patching is the most commonly applied framework because these systems typically contain only one mirrored drive and do not directly host database information.
