Failover and Failback

If you need to designate the Secondary site as the Active site, read this section for an overview of the failover process and for detailed instructions on how to switch between sites.

Overview of the Failover Process

The failover process depends on whether you purposely shut down the Primary site (Planned Failover) or the Primary site went down unexpectedly (Unplanned Failover). In either case, you must manually perform a failover, as outlined below:

  1. Manually initiate the failover, using the appropriate process:
    • Planned Failover. Go to the Primary (active) site and use the DR Control (DR_Monitoring.ps1) script to initiate the failover to the standby site. For more information, see Perform a Planned Failover below.
    • Unplanned Failover. Go to the Secondary (standby) site and use the DR Control (DR_Monitoring.ps1) script to initiate the failover to the standby site. For more information, see Perform an Unplanned Failover below.
  2. Update the shared DNS record so the Primary site components point to the IP address of the Secondary Platform Manager. Once the time to live (TTL) limit is reached, all systems in the Primary site reconnect to the newly activated Platform Manager.
  3. If a Data Processor is unavailable on the Primary site, reconnect Agents to a new Data Processor by changing the DNS records.
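
The choice between the two procedures can be summarized in a few lines of code. The following is only an illustrative sketch: the `FailoverPlan` type and the `plan_failover` function are hypothetical names and are not part of the DR solution. It encodes where DR Control is started in each case and whether the data-loss warning described later in this section should be expected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverPlan:
    kind: str                         # "planned" or "unplanned"
    run_dr_control_on: str            # site where DR Control is started
    data_loss_warning_expected: bool  # this section mentions the warning only for unplanned failovers


def plan_failover(primary_went_down_unexpectedly: bool) -> FailoverPlan:
    """Map the failover type to the site from which the failover is initiated."""
    if primary_went_down_unexpectedly:
        # Unplanned: the Primary is gone, so the failover starts on the standby site.
        return FailoverPlan("unplanned", "Secondary (standby)", True)
    # Planned: the Primary is still up, so the failover starts on the active site.
    return FailoverPlan("planned", "Primary (active)", False)


if __name__ == "__main__":
    print(plan_failover(primary_went_down_unexpectedly=True))
```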

Perform a Planned Failover

  1. Access the Primary (active) Platform Manager.
  2. Click Start, All Programs, LogRhythm, and Disaster Recovery.
  3. Right-click DR Control and click Run as administrator. Enter your local system administrator credentials.
  4. To display the DR Control Options, type D.
  5. To initiate the failover process, type F.
  6. At the Would you like to failover… prompt, type Y.
    The DR solution automatically performs the following tasks:
    1. Stops the Platform Manager services on the Primary (active) site.
    2. Makes sure all databases are in sync between the Primary and Secondary sites.
    3. Designates the Secondary Platform Manager as the Active site. In DR Control on the Primary Platform Manager, the Role column now displays Standby.
  7. Update the DNS record so that all components point to the IP address of the Secondary Platform Manager.
    After the time to live (TTL) limit is reached, all systems reconnect to the newly activated Platform Manager.
  8. Go to the Secondary Platform Manager server and confirm that the Platform Manager services, which include the Alarming and Response Manager (ARM) service and the Job Manager service, have started. If necessary, also start the services for the Data Processors, Data Indexers, and the AI Engines.
  9. If necessary, reconnect remote systems to the Secondary Platform Manager by changing the DNS records. If a Data Processor is unavailable, reconnect Agents to a new Data Processor by changing the DNS records or by using the Deployment Manager in the SIEM Console to redirect them.
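
After the DNS record is updated, it can be useful to confirm from a client machine that the shared name resolves to the Secondary Platform Manager before expecting components to reconnect. The following is a minimal sketch that uses only the Python standard library. The hostname and IP address in the usage note are placeholders, `wait_for_dns` is a hypothetical helper rather than part of the DR solution, and the check only reflects what the local resolver returns, which may be cached for up to the TTL of the record.

```python
import socket
import time


def resolved_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the local resolver reports for hostname."""
    _name, _aliases, addresses = resolver(hostname)
    return set(addresses)


def wait_for_dns(hostname, expected_ip, ttl_seconds, poll_seconds=30,
                 resolver=socket.gethostbyname_ex,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until hostname resolves to expected_ip or ttl_seconds have elapsed.

    Returns True if the expected address was seen, otherwise False.
    """
    deadline = clock() + ttl_seconds
    while True:
        try:
            if expected_ip in resolved_ips(hostname, resolver):
                return True
        except socket.gaierror:
            # Resolution failed on this attempt; keep polling until the deadline.
            pass
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

For example, `wait_for_dns("pm.example.internal", "192.0.2.20", ttl_seconds=300)` would poll every 30 seconds for up to five minutes. Both values are assumptions chosen for illustration; use the shared name, address, and TTL from your own DNS zone.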

Perform an Unplanned Failover (Disaster Recovery Only)

This procedure applies to deployments that have only Disaster Recovery configured. If you have an HA + DR system, see Perform an Unplanned Failover (HA + DR).

  1. Go to the Secondary (standby) Platform Manager.
  2. Click Start, All Programs, LogRhythm, and Disaster Recovery.
  3. Right-click DR Control and click Run as administrator. Enter your local system administrator credentials.
    The DR solution displays a warning, indicating that executing a failover from the standby system may result in data loss. Because the Primary site may have gone offline before the databases were fully synchronized, some data may be lost.
  4. To continue, type Y.
    The DR solution automatically performs the following tasks:
    1. Switches the Secondary Platform Manager to the Active state.
    2. Starts the Platform Manager services on the Secondary site.
    3. Loads the replicated databases. In DR Control, the Role column displays Active, and the State column still displays Disconnected to indicate that the databases are not currently replicating.
  5. When the failover is complete, press Enter to exit.

Perform an Unplanned Failover (HA + DR)

This procedure applies to deployments that have both High Availability and Disaster Recovery configured. If you have a DR-only system, see Perform an Unplanned Failover (Disaster Recovery Only).

  1. Go to the Secondary (standby) Platform Manager.
  2. Click Start, All Programs, LogRhythm, and Disaster Recovery.
  3. Right-click DR Control and click Run as administrator. Enter your local system administrator credentials.
  4. To display the DR Control Options, type D.
  5. To initiate the failover, type F.
    The DR solution displays a warning, indicating that executing a failover from the standby system may result in data loss. Because the Primary site may have gone offline before the databases were fully synchronized, some data may be lost.
  6. To continue, type Y.
    The DR solution automatically performs the following tasks:
    1. Switches the Secondary Platform Manager to the Active state.
    2. Starts the Platform Manager services on the Secondary site.
    3. Loads the replicated databases.
    In DR Control, the Role column displays Active, and the State column still displays Disconnected to indicate that the databases are not currently replicating.
  7. Update the DNS record so that all components point to the IP address of the Secondary Platform Manager.
    After the time to live (TTL) limit is reached, all systems within the Primary site reconnect to the newly activated Platform Manager.
  8. If necessary, reconnect remote systems to the Secondary Platform Manager by changing the DNS records. If a Data Processor is unavailable on the Primary site, reconnect Agents to a new Data Processor by changing the DNS records or by using the Deployment Manager in the SIEM Console.

Resume Operations on the Primary Platform Manager

When the Primary Platform Manager is operational again, you can perform a failback to the Primary site as follows:

  1. Go to the Primary Platform Manager.
  2. Click Start, All Programs, LogRhythm, and Disaster Recovery.
  3. Right-click DR Control and click Run as administrator. Enter your local system administrator credentials.
    If the DR solution detects that the Primary Platform Manager is operational, the State column displays Suspended. This means that the Primary site is ready and waiting for data replication to resume.
  4. To display the DR Control Options, type D.
  5. To resume data replication, type R.
    The State column displays Synchronizing during this process.
  6. Wait for all databases to show the Synchronized state, and then access the Primary Platform Manager.
  7. From the Primary Platform Manager, open DR Control.
  8. Type D to display the DR Control Options, and then type F to fail over to this site. At the Would you like to failover… prompt, type Y.
    The DR solution automatically performs the following tasks:
    1. Switches the Primary Platform Manager to the Active state.
    2. Starts the Platform Manager services on the Primary site.
  9. Update the DNS record so that all components point to the IP address of the Primary Platform Manager. 
    After the time to live (TTL) limit is reached, all systems reconnect to the newly activated Platform Manager.
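
The failback sequence follows the State column through Suspended, Synchronizing, and Synchronized, and the failover to the Primary site should start only when every database is Synchronized. The sketch below models that ordering. It is illustrative only: `next_failback_action` and the database names in the example are hypothetical, and the states would be read manually from DR Control, because this section does not describe a programmatic interface to it.

```python
from typing import Mapping

# State values named in this section for the failback procedure.
FAILBACK_STATES = {"Suspended", "Synchronizing", "Synchronized"}


def next_failback_action(db_states: Mapping[str, str]) -> str:
    """Suggest the next failback step from the per-database State values."""
    if not db_states:
        raise ValueError("no database states supplied")
    states = set(db_states.values())
    if states - FAILBACK_STATES:
        # For example Disconnected: replication is not in a state this procedure covers.
        return "investigate"
    if "Suspended" in states:
        return "resume replication (R)"
    if "Synchronizing" in states:
        return "wait"
    # Every database reports Synchronized.
    return "fail over to Primary (F)"


if __name__ == "__main__":
    # Hypothetical database names, used only for illustration.
    print(next_failback_action({"db_a": "Synchronizing", "db_b": "Synchronized"}))
```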