The Data Indexer (Indexer) provides next-generation persistence and search capabilities, as well as high-performance, distributed, and highly scalable indexing of machine and forensic data. Indexers can be clustered in a replicated configuration to enable high availability, improved search performance, and support for a greater number of simultaneous users. Indexers store both the original and a structured copy of the data to enable search-based analytics. The Indexer is supported on Windows Server 2008 R2, Windows Server 2012 R2, and CentOS Linux 7.x Minimal, as follows:

  • Windows. You can install the Indexer on an XM Appliance, an upgraded Data Processor Appliance, your own server, or a virtual machine. This configuration is called a DPX, and the Indexer is "pinned" to the Data Processor.
  • Linux. You can install a single Indexer or a cluster of 3 to 10 Indexers on a Linux Indexer Appliance, your own server, or a virtual machine. This configuration is called a DX, and the Indexer is installed alone.

For more information about installing or upgrading the Indexer, see the LogRhythm Software Installation Guide on the LogRhythm Community.

Indexer Services

The Indexer is based on Elasticsearch, a highly scalable, open-source, full-text search and analytics engine. The full functionality of the Indexer is provided by the following microservices:

Service                Description
Bulldozer              Registers the Elasticsearch cluster name and nodes in the EMDB. Writes cluster statistics to the EMDB for use in the Deployment Monitor.
Carpenter              Synchronizes LogRhythm KB and deployment data to Data Indexer indexes.
Columbo                Executes query requests from LogRhythm components.
Elasticsearch Service  Log persistence and indexing data store.
GoMaintain             Maintains Data Indexer indices for disk space and time to live (TTL).
Transporter            Facilitates interfacing to the Data Indexer through HTTP/REST.
WatchTower             Receives analytic data from CloudAI. If CloudAI is not in use in your deployment, this service remains idle, even though it is enabled.

Data Indexer File Locations

Data Indexer File Binaries
  Windows: C:\Program Files\LogRhythm\Data Indexer
  Linux:   /usr/local/logrhythm

Data Indexer Log Files
  Windows: C:\Program Files\LogRhythm\Data Indexer\logs
           C:\Program Files\LogRhythm\Data Indexer\Elasticsearch\logs
  Linux:   /var/log/elasticsearch
           /var/log/persistent

Data Indexer Logs Repository (Default Path)
  Windows: ${DXDATAPATH}\elasticsearch\data, where ${DXDATAPATH} = D:\LRIndexer
  Linux:   /usr/local/logrhythm/db/elasticsearch/data

Data Indexer Service Start/Stop Scripts
  Windows: C:\Program Files\LogRhythm\Data Indexer\tools\start-allservices.bat
           C:\Program Files\LogRhythm\Data Indexer\tools\stop-allservices.bat
  Linux:   /usr/local/logrhythm/tools/start-all-serviceslinux.sh
           /usr/local/logrhythm/tools/stop-all-serviceslinux.sh

Information About Automatic Maintenance

Automatic maintenance is governed by several Data Indexer settings in the Configuration Manager.

GoMaintain IndexManage Disk HWM (%diskutil)

The disk utilization limit is the percentage of disk utilization that triggers maintenance. The default is 80, which means that maintenance starts when the Elasticsearch data disk is 80% full. Do not set this value higher than 80, because a higher value can reduce the ability of Elasticsearch to store replica shards for failover.
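To see where a Linux Indexer stands relative to this limit, the utilization of the data disk can be read from df. The following is a minimal sketch, not the way GoMaintain itself measures disk usage: the variable names DX_DATA_PATH and DISK_HWM and both function names are made up for illustration, and the path is the default Linux repository path listed earlier.

```shell
#!/bin/sh
# Sketch only: DX_DATA_PATH and DISK_HWM are illustrative names,
# not LogRhythm settings. Adjust the path if your data lives elsewhere.
DX_DATA_PATH="${DX_DATA_PATH:-/usr/local/logrhythm/db/elasticsearch/data}"
DISK_HWM="${DISK_HWM:-80}"

# Read `df -P` output on stdin and print the Capacity column of the
# first data row, without the trailing percent sign.
disk_util_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print "over" when utilization ($1) has reached the limit ($2),
# otherwise print "ok".
check_disk() {
    if [ "$1" -ge "$2" ]; then echo "over"; else echo "ok"; fi
}

# Usage on a Linux Indexer:
#   pct=$(df -P "$DX_DATA_PATH" | disk_util_pct)
#   check_disk "$pct" "$DISK_HWM"
```

The -P option asks df for the portable output format, which keeps each file system on a single line so that the fifth column is always the capacity percentage.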

GoMaintain IndexManage Elasticsearch Heap (%esheap)

The heap utilization limit is the Elasticsearch heap usage, as a percentage, above which GoMaintain performs index TTL management. The default is 85, which means that management begins when heap usage exceeds 85%.
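Current heap usage per node can be read from the Elasticsearch _cat/nodes API, which offers a heap.percent column. The sketch below filters that output for nodes above a limit. The function name nodes_over_heap is hypothetical, the sketch assumes node names contain no spaces, and it only approximates what GoMaintain tracks internally.

```shell
#!/bin/sh
# Read `_cat/nodes?h=heap.percent,name` output on stdin and print
# "<node> <heap percent>" for every node above the limit in $1.
# The limit defaults to 85, the documented default for %esheap.
nodes_over_heap() {
    awk -v limit="${1:-85}" '$1 + 0 > limit + 0 { print $2, $1 }'
}

# Usage on an Indexer node:
#   curl -s 'http://localhost:9200/_cat/nodes?h=heap.percent,name' | nodes_over_heap 85
```

An empty result means that no node is currently above the limit.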

GoMaintain TTL Logs (#indices)

The DX monitors Elasticsearch memory and DX storage capacity. GoMaintain tracks heap pressure on the nodes. If the pressure consistently crosses the threshold, GoMaintain reduces the number of days of indices that are kept open by closing indices. Closing an index frees the resources needed to manage that data and relieves heap pressure on Elasticsearch. GoMaintain continues to close days until memory usage is under the warning threshold, and it continues to delete days based on the disk utilization setting (80% by default).
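The two rules in this paragraph can be summarized as a small decision function. This is only an illustration of the described behavior under the documented defaults of 85 and 80. It is not GoMaintain code, and the function name and the action labels are invented for the example.

```shell
#!/bin/sh
# Illustration of the rules described above, not GoMaintain's implementation.
#   $1 = heap usage percent      $2 = disk utilization percent
#   $3 = heap limit (default 85) $4 = disk limit (default 80)
# Prints "close" when heap usage exceeds its limit, "delete" when disk
# utilization has reached its limit, both when both apply, else "none".
maintenance_action() {
    heap=$1; disk=$2
    heap_limit=${3:-85}; disk_limit=${4:-80}
    action=""
    if [ "$heap" -gt "$heap_limit" ]; then action="close"; fi
    if [ "$disk" -ge "$disk_limit" ]; then action="${action:+$action }delete"; fi
    echo "${action:-none}"
}

# Example: heap at 90% and disk at 50% calls for closing indices only.
#   maintenance_action 90 50
```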

The default value is -1, which means that the DX monitors system resources and automatically manages the time-to-live (TTL). You can configure a lower TTL by changing this number. If the configured TTL is no longer achievable, the DX sends a diagnostic warning and starts closing indices.

Indices that have been closed by GoMaintain are not actively searchable in 8.0.0 but are maintained for reference purposes. To see which indices are closed, you can run a curl command such as the following:

curl -s -XGET 'http://localhost:9200/_cat/indices?h=status,index' | awk '$1 == "close" {print $2}'

You can also open a browser to http://localhost:9200/_cat/indices?v to show both open and closed indices.

Indices can be reopened with the following command, as long as you have enough heap memory and disk space to support the index. If you do not, the index immediately closes again.

curl -XPOST 'localhost:9200/<index>/_open?pretty'

After you open the index in this way, you can investigate the data in either the Web Console or Client Console.
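Because a reopened index can be closed again when resources are short, it can help to confirm its status before you start investigating. The following sketch reads the same _cat/indices output used earlier and reports the status of one index. The function name index_status is hypothetical, and the wait time in the usage comment is arbitrary.

```shell
#!/bin/sh
# Read `_cat/indices?h=status,index` output on stdin and print the status
# ("open" or "close") of the index named in $1, or "missing" if the index
# is not listed.
index_status() {
    awk -v name="$1" '
        $2 == name { print $1; found = 1 }
        END { if (!found) print "missing" }'
}

# Usage, with <index> replaced by a real index name:
#   curl -s -XPOST 'http://localhost:9200/<index>/_open?pretty'
#   sleep 60   # arbitrary wait before checking
#   curl -s 'http://localhost:9200/_cat/indices?h=status,index' | index_status '<index>'
```

If the function prints close after the wait, the index was most likely closed again because of heap or disk pressure.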

GoMaintain Force Merge

Force Merge settings are not preserved during an upgrade. They must be re-enabled in the Configuration Manager after performing an upgrade.

Do not modify any of the configuration options under Force Merge Config without the assistance of LogRhythm Support or Professional Services.

Parameter: GoMaintain ForceMerge
Description: The Force Merge configuration combines index segments to improve search performance. In larger deployments, search performance could degrade over time due to a large number of segments. Force merge can alleviate this issue by optimizing older indices and reducing heap usage.
Default: Disabled

Parameter: GoMaintain ForceMerge Hour (UTC hour of day)
Description: The hour of the day, in UTC, when the merge operation should begin. If Only Merge Periodically is set to false, GoMaintain merges segments continuously, and this setting is not used.
Default: 1

Parameter: GoMaintain ForceMerge Days to Exclude (#days)
Description: The number of days into the past of the index to merge.
Default: 10
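
Since force merge exists to reduce the number of segments, it can be useful to see how many segments each index currently has. The read-only sketch below counts rows from the Elasticsearch _cat/segments API per index. The function name segments_per_index is hypothetical, and the count includes segments on every shard copy, replicas included, so treat it as a rough indicator rather than a figure GoMaintain uses.

```shell
#!/bin/sh
# Read `_cat/segments?h=index,segment` output on stdin and print
# "<index> <number of segment rows>", largest count first.
segments_per_index() {
    awk '{ n[$1]++ } END { for (i in n) print i, n[i] }' | sort -k2,2nr -k1,1
}

# Usage on an Indexer node:
#   curl -s 'http://localhost:9200/_cat/segments?h=index,segment' | segments_per_index
```

This only reads cluster state and does not change any Force Merge setting.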

Logging of configuration and results for Force Merge can be found in C:\Program Files\LogRhythm\Data Indexer\logs\GoMaintain.log on Windows machines. On Linux, the log file is /var/log/persistent/gomaintain.log.

If the Data Indexer is a multi-node cluster, the log exists only on the node that holds the GoMaintain lock. To find out which node has the lock, use the following command: sudo /usr/local/logrhythm/tools/