Disaster Recovery Administration Checklist

Use this checklist to track your progress while administering a LogRhythm Disaster Recovery deployment.

Regular Monitoring Tasks

Replication Status Monitoring

  • [ ] Check replication status using LogRhythm DR Control:

    • [ ] Run DR Control (Start > All Programs > LogRhythm > Disaster Recovery > DR Control) as administrator

    • [ ] Verify databases show "Synchronized" or "Synchronizing" status

    • [ ] Review metrics (SendQueue, SendRate, RedoQueue, RedoRate, EstimatedRecoveryTime, SyncPerformance)

    • [ ] Exit panel with 'Q'

  • [ ] Alternatively, use AlwaysOn Availability Group Dashboard:

    • [ ] Start SQL Server Management Studio and log in as administrator

    • [ ] Expand AlwaysOn High Availability folder and Availability Groups folder

    • [ ] Right-click Availability Group and select "Show Dashboard"
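
The status check above can be illustrated with a short Python sketch. It assumes the database states have been copied by hand from DR Control or the Availability Group dashboard into a dictionary. The function name, the input shape, and the database names are hypothetical; DR Control is not known to expose this as an API.

```python
# States the checklist treats as acceptable during routine monitoring.
HEALTHY_STATES = {"Synchronized", "Synchronizing"}


def find_unhealthy(db_states):
    """Return {database: state} for every database not in a healthy state."""
    return {
        db: state
        for db, state in db_states.items()
        if state not in HEALTHY_STATES
    }


if __name__ == "__main__":
    # Hypothetical values transcribed from the DR Control panel.
    observed = {
        "ExampleDB_A": "Synchronized",
        "ExampleDB_B": "Synchronizing",
        "ExampleDB_C": "Not Synchronizing",
    }
    for db, state in find_unhealthy(observed).items():
        print(f"Needs attention: {db} is {state}")
```

An empty result means every recorded database is in one of the two expected states.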

Replication Mode Management

  • [ ] Review current replication mode (Asynchronous or Synchronous)

  • [ ] Determine if mode changes are needed based on:

    • [ ] Current network performance

    • [ ] Recovery Point Objective (RPO) requirements

    • [ ] Performance requirements

    • [ ] Distance between Primary and Secondary sites
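
One possible way to reason about the factors above is sketched below in Python. It rests on the general trade-off that synchronous replication avoids loss of committed data at the cost of added commit latency, while asynchronous replication tolerates slow or distant links but can lose recent data. The function name and the 10 ms latency budget are assumptions for illustration, not LogRhythm guidance.

```python
def suggest_mode(rpo_seconds, round_trip_ms, latency_budget_ms=10.0):
    """Suggest a replication mode from an RPO and a measured round-trip time.

    rpo_seconds: maximum tolerable data loss, in seconds (0 means none).
    round_trip_ms: measured network round trip between the two sites.
    latency_budget_ms: assumed acceptable commit delay (hypothetical default).
    """
    if rpo_seconds == 0:
        # Zero tolerated data loss requires synchronous commits,
        # whatever the link latency costs in performance.
        return "Synchronous"
    if round_trip_ms > latency_budget_ms:
        # The link is too slow for synchronous commits within the budget.
        return "Asynchronous"
    return "Synchronous"


if __name__ == "__main__":
    print(suggest_mode(rpo_seconds=300, round_trip_ms=45.0))
    print(suggest_mode(rpo_seconds=0, round_trip_ms=45.0))
```

The result is only a starting point for discussion; the actual decision should follow the deployment's documented RPO and performance requirements.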

Planned Failover Procedure

Pre-Failover Steps

  • [ ] Schedule maintenance window for failover

  • [ ] Notify all relevant stakeholders of planned failover

  • [ ] Verify all databases are synchronized between Primary and Secondary sites

  • [ ] Verify Secondary site components are ready to become active

Execute Planned Failover

  • [ ] Access Primary (active) Platform Manager

  • [ ] Run DR Control (Start > All Programs > LogRhythm > Disaster Recovery > DR Control) as administrator

  • [ ] Type 'D' to display DR Control Options

  • [ ] Type 'F' to initiate failover process

  • [ ] Confirm with 'Y' when prompted

  • [ ] Wait for automatic tasks to complete:

    • [ ] Platform Manager services stopping on Primary site

    • [ ] Database synchronization verification

    • [ ] Secondary Platform Manager designation as Active site

Post-Failover Steps

  • [ ] Verify DNS record updates (automatic or manual) to point to Secondary Platform Manager

  • [ ] Wait for the DNS record's TTL (time to live) to expire so clients resolve the new address

  • [ ] Confirm Platform Manager services have started on Secondary site:

    • [ ] Alarming and Response Manager (ARM) service

    • [ ] Job Manager service

  • [ ] Start services for Data Processors, Data Indexers, and AI Engines if necessary

  • [ ] Verify remote systems reconnection to Secondary Platform Manager

  • [ ] Test system functionality on Secondary site

  • [ ] Document failover completion
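
The DNS verification step above could be checked from a client machine with a sketch like the following. It uses Python's standard `socket.gethostbyname_ex`; the host name and IP address in the usage example are placeholders, and the `resolver` parameter is a hypothetical hook added so the function can be exercised without a live DNS lookup.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if expected_ip is among the IPv4 addresses for hostname."""
    try:
        # gethostbyname_ex returns (canonical_name, aliases, ip_addresses).
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # Covers socket.gaierror and socket.herror (lookup failures).
        return False
    return expected_ip in addresses


if __name__ == "__main__":
    # Placeholder values: substitute the shared DNS name and the
    # Secondary Platform Manager's address for your deployment.
    if resolves_to("lr-pm.example.internal", "192.0.2.20"):
        print("DNS now points to the Secondary Platform Manager")
    else:
        print("DNS has not updated yet (or the lookup failed)")
```

A cached record on the client can keep returning the old address until its TTL expires, so a negative result shortly after failover is not necessarily an error.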

Unplanned Failover Procedure (DR Only)

Execute Unplanned Failover

  • [ ] Go to Secondary (standby) Platform Manager

  • [ ] Run DR Control (Start > All Programs > LogRhythm > Disaster Recovery > DR Control) as administrator

  • [ ] Acknowledge potential data loss warning by typing 'Y'

  • [ ] Wait for automatic tasks to complete:

    • [ ] Secondary Platform Manager switching to Active state

    • [ ] Platform Manager services starting on Secondary site

    • [ ] Replicated databases loading

  • [ ] Press Enter to exit when failover is complete

Post-Failover Steps

  • [ ] Update DNS record to point to Secondary Platform Manager

  • [ ] Wait for the DNS record's TTL (time to live) to expire so clients resolve the new address

  • [ ] Reconnect remote systems to Secondary Platform Manager

  • [ ] Redirect Agents to new Data Processor if necessary

  • [ ] Test system functionality on Secondary site

  • [ ] Document failover completion and any data loss

Unplanned Failover Procedure (HA + DR)

Execute Unplanned Failover

  • [ ] Go to Secondary (standby) Platform Manager

  • [ ] Run DR Control as administrator

  • [ ] Type 'D' to display DR Control Options

  • [ ] Type 'F' to initiate failover

  • [ ] Acknowledge potential data loss warning by typing 'Y'

  • [ ] Wait for automatic tasks to complete

  • [ ] Exit when failover is complete

Post-Failover Steps

  • [ ] Update DNS record to point to Secondary Platform Manager

  • [ ] Wait for the DNS record's TTL (time to live) to expire so clients resolve the new address

  • [ ] Reconnect remote systems to Secondary Platform Manager

  • [ ] Redirect Agents if Data Processor is unavailable

  • [ ] Test system functionality on Secondary site

  • [ ] Document failover completion and any data loss

Failback Procedure (Resume Operations on Primary)

Pre-Failback Verification

  • [ ] Verify Primary Platform Manager is operational

  • [ ] Run DR Control on Primary Platform Manager as administrator

  • [ ] Verify State column displays "Suspended" (ready for data replication)

Execute Failback

  • [ ] Type 'D' to display DR Control Options

  • [ ] Type 'R' to resume data replication

  • [ ] Wait for all databases to show "Synchronized" state

  • [ ] Open DR Control on Primary Platform Manager

  • [ ] Type 'D' to display DR Control Options

  • [ ] Type 'F' to fail over to Primary site

  • [ ] Confirm with 'Y' when prompted

  • [ ] Wait for automatic tasks to complete

Post-Failback Steps

  • [ ] Update DNS record to point to Primary Platform Manager

  • [ ] Wait for the DNS record's TTL (time to live) to expire so clients resolve the new address

  • [ ] Verify all systems reconnect to Primary Platform Manager

  • [ ] Test system functionality on Primary site

  • [ ] Document failback completion

IP Address Changes (Re-IP Procedure)

Preparation

  • [ ] Document current IP configuration:

    • [ ] Management IPs

    • [ ] Failover IPs

    • [ ] Replication IPs

    • [ ] Cluster (DNS) name

    • [ ] Replication ports

  • [ ] Plan new IP configuration

  • [ ] Schedule maintenance window for changes
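
The items documented above can be captured in a small record and sanity-checked before the maintenance window. The Python sketch below is one way to do that with the standard `ipaddress` module. The class name, field names, and sample values are hypothetical and do not correspond to any LogRhythm file format.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class DrIpConfig:
    """Hypothetical record of the settings listed in the checklist."""
    management_ips: list
    failover_ips: list
    replication_ips: list
    cluster_dns_name: str
    replication_ports: list


def validate(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    groups = (
        ("management", config.management_ips),
        ("failover", config.failover_ips),
        ("replication", config.replication_ips),
    )
    for label, values in groups:
        for value in values:
            try:
                ipaddress.ip_address(value)  # raises ValueError if malformed
            except ValueError:
                problems.append(f"{label} IP is not valid: {value}")
    for port in config.replication_ports:
        if not 1 <= port <= 65535:
            problems.append(f"replication port out of range: {port}")
    if not config.cluster_dns_name:
        problems.append("cluster DNS name is empty")
    return problems


if __name__ == "__main__":
    # Sample values only; replace with the deployment's real settings.
    planned = DrIpConfig(
        management_ips=["192.0.2.10", "192.0.2.11"],
        failover_ips=["192.0.2.20"],
        replication_ips=["198.51.100.10", "198.51.100.11"],
        cluster_dns_name="lr-dr.example.internal",
        replication_ports=[5022],
    )
    print(validate(planned) or "No problems found")
```

This only catches malformed addresses and ports; it does not confirm that the addresses are reachable or unassigned.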

Execute Re-IP

  • [ ] From DR Install folder, run DR Re-IP Uninstall.exe as Administrator

  • [ ] Click the "Re-IP" tab

  • [ ] Enter new IP addresses for the deployment

  • [ ] Validate IPs and DNS name

  • [ ] Click "Re-IP" to run the script

  • [ ] Review script output to verify success

Post Re-IP Verification

  • [ ] Test replication status using DR Control

  • [ ] Verify DNS resolution with new IP addresses

  • [ ] Test failover functionality with new configuration

  • [ ] Document IP changes

DR Uninstallation Procedure

Preparation

  • [ ] Document current configuration

  • [ ] Back up any critical data

  • [ ] Schedule maintenance window

  • [ ] Notify all relevant stakeholders

Execute Uninstallation

  • [ ] From DR Install folder on primary server, run DR Re-IP Uninstall.exe as Administrator

  • [ ] Click the "Uninstall" tab

  • [ ] Review description of uninstall process

  • [ ] Click "Uninstall" and follow confirmation prompts

  • [ ] Enter sysadmin-level SQL credentials when prompted

  • [ ] Review script output and address any errors

  • [ ] Repeat steps on secondary server

  • [ ] For secondary server, correctly identify deployment type when prompted

Post-Uninstallation Verification

  • [ ] Verify no databases are in Synchronizing, Not Synchronizing, Restoring, or Suspect state

  • [ ] Confirm LogRhythm folder in Windows Task Scheduler has been removed

  • [ ] Verify CONSUL_CLIENT environment variable does not exist on XM/PM

  • [ ] Confirm all LogRhythm PM services are running

  • [ ] Verify SQL job "LogRhythm DR Job Management" is gone and all remaining SQL Server agent jobs are enabled

  • [ ] Update components to use management IP instead of shared DNS or failover IPs

  • [ ] Re-run the LogRhythm Infrastructure Installer (LRII) and remove the host record for the Secondary server

  • [ ] In Deployment Properties, change "Does your deployment include Disaster Recovery (DR)?" to No

  • [ ] Run "Get-Cluster" from elevated PowerShell to verify cluster service is not running
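
The environment variable check in the list above can be scripted. The Python sketch below tests whether `CONSUL_CLIENT` is still visible to the current process; the function name and the `environ` parameter are hypothetical conveniences for illustration.

```python
import os


def consul_client_removed(environ=None):
    """Return True if CONSUL_CLIENT is absent from the given environment."""
    if environ is None:
        # Default to the environment of the running process.
        environ = os.environ
    return "CONSUL_CLIENT" not in environ


if __name__ == "__main__":
    if consul_client_removed():
        print("CONSUL_CLIENT is not set")
    else:
        print("CONSUL_CLIENT still exists:", os.environ["CONSUL_CLIENT"])
```

A process only sees the variables it inherited when it started, so run the check from a newly opened session after the uninstall completes.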
