
Data Indexer

The Data Indexer (Indexer) provides persistence and search capabilities, as well as high-performance, distributed, and highly scalable indexing of machine and forensic data. Indexers can be clustered in a replicated configuration to enable high availability, improved search performance, and support for a greater number of simultaneous users. Indexers store both the original and a structured copy of data to enable search-based analytics. The Indexer is supported on Windows and Linux, as follows:

  • Windows. You can install the Indexer on an XM Appliance, an upgraded Data Processor Appliance, your own server, or a virtual machine. This configuration is called a DPX, and the Indexer is "pinned" to the Data Processor.

  • Linux. You can install either a single hot node or a cluster of 3 to 10 physical hot nodes, plus an optional 1 to 10 warm nodes, on Linux Indexer Appliances, your own servers, or virtual machines. This configuration is called a DX or DX cluster, and the Indexer is installed alone.

For more information about installing or upgrading the Indexer, see the LogRhythm Software Installation Guide on the LogRhythm Community.

Indexer Services

The Indexer is a highly scalable, open-source, full-text search and analytics engine based on Elasticsearch. The full functionality of the Indexer is provided by the following microservices:

| Service | Description |
|---|---|
| Bulldozer | Registers the Elasticsearch cluster name and nodes in the EMDB. Writes cluster statistics to the EMDB for use in the Deployment Monitor. |
| Carpenter | Synchronizes LogRhythm KB and deployment data to Data Indexer EMDB_* indexes. |
| Columbo | Executes query requests from LogRhythm components. |
| Elasticsearch Service | Log persistence and indexing data store. |
| GoMaintain | Maintains Data Indexer indices for disk space and time to live (TTL). |
| Transporter | Facilitates interfacing to the Data Indexer through HTTP/REST for indexing data. |
| WatchTower | Receives analytic data from CloudAI. If CloudAI is not in use in your deployment, this service remains idle, even though it is enabled. |

Data Indexer File Locations

| | Windows | Linux |
|---|---|---|
| Data Indexer File Binaries | C:\Program Files\LogRhythm\Data Indexer | /usr/local/logrhythm |
| Data Indexer Log Files | C:\Program Files\LogRhythm\Data Indexer\logs; C:\Program Files\LogRhythm\Data Indexer\Elasticsearch\logs | /var/log/elasticsearch; /var/log/persistent |
| Data Indexer Logs Repository (Default Path) | ${DXDATAPATH}\elasticsearch\data (where ${DXDATAPATH} = D:\LRIndexer) | /usr/local/logrhythm/db/elasticsearch/data |
| Data Indexer Service Start/Stop Scripts | C:\Program Files\LogRhythm\Data Indexer\tools\start-allservices.bat | /usr/local/logrhythm/tools/start-all-serviceslinux.sh |
| | C:\Program Files\LogRhythm\Data Indexer\tools\stop-allservices.bat | /usr/local/logrhythm/tools/stop-all-serviceslinux.sh |

Information About Automatic Maintenance

Automatic maintenance is governed by several Data Indexer settings in the Configuration Manager.

GoMaintain IndexManage Disk HWM (%diskutil)

The disk utilization limit is the percentage of disk utilization that triggers maintenance. The recommended value depends on the type of disks used in your Hot DX nodes: 90% for SSD and 80% for HDD. Maintenance starts when disk consumption on the smallest Hot node in the cluster reaches this value. GoMaintain then either deletes the oldest index or, if a Warm tier is present in the cluster, moves that index from the Hot tier to the Warm tier. Do not set the disk utilization limit higher than 90. This value can affect the ability of Elasticsearch to store replica shards for failover.
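The rule described above can be modeled as a small shell function. This is only an illustrative sketch of the decision, not GoMaintain's actual code: the function name `disk_hwm_action`, its arguments, and the returned labels are hypothetical, and it assumes maintenance triggers as soon as utilization reaches the configured limit.

```shell
#!/bin/sh
# Illustrative model of the disk high-water-mark rule; NOT GoMaintain's code.
# Arguments (all names are hypothetical):
#   $1 used_pct - disk utilization (%) of the smallest Hot node, as an integer
#   $2 hwm_pct  - configured Disk HWM value, as an integer (for example 80 or 90)
#   $3 has_warm - "yes" if the cluster has a Warm tier, otherwise "no"
disk_hwm_action() {
    used_pct=$1
    hwm_pct=$2
    has_warm=$3
    # Assumption: maintenance starts once utilization reaches the limit.
    if [ "$used_pct" -lt "$hwm_pct" ]; then
        echo "none"
    elif [ "$has_warm" = "yes" ]; then
        echo "move-oldest-to-warm"
    else
        echo "delete-oldest"
    fi
}
```

For example, `disk_hwm_action 92 90 yes` prints `move-oldest-to-warm`, while `disk_hwm_action 75 80 no` prints `none`.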

GoMaintain IndexManage Elasticsearch Heap (%esheap)

The heap utilization limit is the maximum Elasticsearch heap usage above which GoMaintain performs index TTL management. The default is 85, which means that management begins when heap pressure exceeds 85%.
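To see roughly where each node stands against this limit, you can compare the heap usage reported by the Elasticsearch `_cat/nodes` API with the configured value. The sketch below is a hedged example: the function name `nodes_over_heap` is hypothetical, it assumes node names contain no spaces, and `heap.percent` is a point-in-time reading that may not match the way GoMaintain measures heap pressure.

```shell
#!/bin/sh
# Reads "name heap.percent" lines on stdin and prints the nodes whose heap
# usage is above the given limit (default 85, matching the default setting).
# nodes_over_heap is a hypothetical helper name, not a LogRhythm tool.
nodes_over_heap() {
    limit=${1:-85}
    awk -v limit="$limit" '$2 + 0 > limit + 0 { print $1, $2 }'
}

# Example against a live cluster (requires a reachable Elasticsearch node):
#   curl -s 'http://localhost:9200/_cat/nodes?h=name,heap.percent' | nodes_over_heap 85
```

An empty result means no node is currently above the limit.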

GoMaintain TTL Logs (#indices)

The DX monitors Elasticsearch memory and DX storage capacity. GoMaintain tracks heap pressure on the nodes. If heap pressure repeatedly crosses the threshold, GoMaintain reduces the number of days of open indices by closing indices. Closing an index removes the resources needed to manage that data and relieves heap pressure on Elasticsearch. GoMaintain continues to close daily indices until memory use falls below the warning threshold, and it continues to delete daily indices based on the disk utilization setting (80% by default).

The default value is -1. With this value, the DX monitors system resources and automatically manages the time to live (TTL). You can configure a lower TTL by changing this number. If the configured number is no longer achievable due to heap consumption, the DX sends a diagnostic warning and starts closing indices.

Indices that GoMaintain has closed due to heap consumption are not actively searchable but are retained for reference purposes. To see which indices are closed, you can run a curl command such as the following:

curl -s -XGET 'http://localhost:9200/_cat/indices?h=status,index' | awk '$1 == "close" {print $2}'

You can also open a browser to http://localhost:9200/_cat/indices?v to show both open and closed indices.

You can reopen an index with the following command, provided you have enough heap memory and disk space to support it. If you do not, the index immediately closes again.

curl -XPOST 'localhost:9200/<index>/_open?pretty'

After you open the index in this way, you can investigate the data in either the Web Console or Client Console.
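Because a reopened index can be closed again when resources are short, it can help to confirm its status a little while after reopening it. The following is a minimal sketch, assuming Elasticsearch is reachable at `http://localhost:9200`. The names `ES_URL`, `RECHECK_DELAY`, `index_status`, and `reopen_and_check` are hypothetical, and the 60-second delay is an arbitrary choice rather than a documented interval.

```shell
#!/bin/sh
# Hypothetical helper variables; adjust to your environment.
ES_URL=${ES_URL:-http://localhost:9200}
RECHECK_DELAY=${RECHECK_DELAY:-60}   # seconds to wait before re-checking (assumption)

# Prints the status ("open" or "close") of one index using the _cat/indices API.
index_status() {
    curl -s "$ES_URL/_cat/indices/$1?h=status" | tr -d '[:space:]'
}

# Reopens an index, waits, then reports whether it is still open.
# Returns 0 if the index is open after the delay, non-zero otherwise.
reopen_and_check() {
    index=$1
    curl -s -XPOST "$ES_URL/$index/_open" >/dev/null || return 1
    sleep "$RECHECK_DELAY"
    status=$(index_status "$index")
    echo "$index is $status"
    [ "$status" = "open" ]
}
```

Call it as `reopen_and_check <index>`, using the index name reported by the closed-index listing.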

GoMaintain TTL and Disk Settings for Restored Indices 

You can enable or disable the maintenance settings in the Configuration Manager for indices created by SecondLook. This lets you configure GoMaintain’s TTL and disk settings for restored indices. The Configuration Manager provides the following settings:

| Setting | Field Type | Description | Default |
|---|---|---|---|
| GoMaintain Logsar - Maintenance | Toggle: Enabled/Disabled | Enable or disable automatic maintenance of archive indices created by SecondLook. | Disabled |
| GoMaintain TTL Logsar - (#indices) | Text Box (Range: -1 to 100000000) | Maximum number of logsar indices to store. Default setting (-1) automatically manages number of indices based on available resources. | -1 |
| GoMaintain Max. Archive Index Disk Size | Text Box (Range: -1 to 100000000) | Maximum disk size in GB, above which GoMaintain performs index TTL management. | 100 |

GoMaintain Force Merge

Force Merge settings allow the segments of an index to be merged into fewer segments. In older Elasticsearch versions, this reduces heap consumption at the cost of significantly increased disk activity. In newer versions of LogRhythm (7.18+), this feature is no longer used, and it is recommended that you leave it disabled.

Do not modify any of the configuration options under Force Merge Config without the assistance of LogRhythm Support or Professional Services.

| Parameter | Description | Default |
|---|---|---|
| GoMaintain ForceMerge | The Force Merge configuration combines index segments to improve search performance. In larger deployments, search performance could degrade over time due to a large number of segments. Force merge can alleviate this issue by optimizing older indices and reducing heap usage. | Disabled |
| GoMaintain ForceMerge Hour (UTC hour of day) | The hour of the day, in UTC, when the merge operation should begin. If Only Merge Periodically is set to false, GoMaintain merges segments continuously, and this setting is not used. | 1 |
| GoMaintain ForceMerge Days to Exclude (#days) | The number of days into the past of the index to merge. | 10 |
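If you want to see how many segments each index currently holds without changing any Force Merge setting, the read-only Elasticsearch `_cat/segments` API can be summarized with a short script. This is a hedged sketch: `segments_per_index` is a hypothetical helper name, and the count covers segment rows across all shards and shard copies of an index.

```shell
#!/bin/sh
# Reads "index shard prirep segment" lines on stdin and prints
# "index count", sorted by segment count in descending order.
# The count includes every shard and every replica copy of the index.
segments_per_index() {
    awk '{ count[$1]++ } END { for (i in count) print i, count[i] }' |
        sort -k2,2nr -k1,1
}

# Example against a live cluster (requires a reachable Elasticsearch node):
#   curl -s 'http://localhost:9200/_cat/segments?h=index,shard,prirep,segment' | segments_per_index
```

This only reports segment counts. It does not trigger a merge, so it is safe to run without changing the Force Merge configuration.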
