Customers can now tell when services are down, memory usage is too high, or CPU is spiking, either for each system in their deployment or for the deployment as a whole. Using Kapacitor, an alerting engine that works with InfluxDB, customers can set up alerts that notify them when their LogRhythm system is not performing well.
This feature is enabled by default as part of the Common Components installation. The Metrics Collection service is installed on every component you wish to collect metrics from. The Metrics Database and Metrics Web UI services are installed on only one server in the deployment, the Platform Manager.
This feature was created to give us an out-of-band way to display the health of the LogRhythm deployment. It speeds up troubleshooting because we can look at the deployment as a whole rather than at individual pieces of it.
There are three services associated with Metrics Collection. The Metrics Collection service is pushed out as part of the Common Components installation. The Metrics Database and Metrics Web UI services are installed on the Platform Manager. Metrics Collection gathers all the metrics from the server it is installed on and sends them to the Metrics Database for storage. The Metrics Database stores the data for a default of seven days, and the Metrics Web UI retrieves the data when a dashboard is configured to view the metrics.
The Metrics Collection service uses Telegraf to collect performance metrics from the system it runs on. On Windows servers only, it also listens for StatsD-formatted metrics on the local host over UDP port 8125. On all systems, it collects system information such as Disk, RAM, CPU, and Port metrics. It then forwards these metrics to the Metrics Database over TCP port 8076.
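To illustrate what a StatsD-formatted metric sent to the local listener looks like, here is a minimal Python sketch. It assumes the standard StatsD line format (`<name>:<value>|<type>`) and the local UDP port 8125 mentioned above. The function names and the metric name `example.queue_depth` are hypothetical and are not part of any LogRhythm API.

```python
import socket

STATSD_HOST = "127.0.0.1"  # local host, as described in the text
STATSD_PORT = 8125         # UDP port named in the text


def format_statsd(name: str, value: float, metric_type: str = "g") -> str:
    """Build one StatsD line: <name>:<value>|<type> (c=counter, g=gauge, ms=timer)."""
    if metric_type not in {"c", "g", "ms"}:
        raise ValueError(f"unsupported StatsD type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


def send_statsd(line: str, host: str = STATSD_HOST, port: int = STATSD_PORT) -> int:
    """Send one StatsD line as a single UDP datagram and return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    send_statsd(format_statsd("example.queue_depth", 42, "g"))
```

Because UDP is connectionless, the send succeeds even when nothing is listening, so a successful call does not confirm that the metric was received.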
The Metrics Database service listens externally on port 8076 for traffic coming from the Metrics Collection service. It uses InfluxDB as a persistence layer for time-series metrics and, in 8.0.0, stores them for a default of seven days. This retention period may be increased in a future release.
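When metrics from a host are not arriving, one quick check is whether that host can open a TCP connection to the Metrics Database port. The following is a minimal Python sketch of such a check, assuming only the port number 8076 given above. The function name `can_reach` is hypothetical, and the host you pass in is whichever server runs the Metrics Database in your deployment.

```python
import socket


def can_reach(host: str, port: int = 8076, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A `False` result points to a firewall, routing, or service-availability problem rather than a problem with metric collection itself. A `True` result only confirms that the port accepts connections.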
Metrics Web UI
The Metrics Web UI shows all the gathered data. It utilizes Grafana to display and explore the data.
The LogRhythm Metrics Services provide dashboards for the following components and services:
- Case API
- Case API Endpoints
- Hardware Usage
- Deployment View
- Single Host View
- Metrics Health Service
- Open Collector
- Log Distribution Services
- Platform Manager
- AIE Auto Cache Drill Down
- Notification Service
- Web Indexer
The logs for all Common Components are managed by procman-beta, which cleans up old log files. Each service should have at most 50 MB of logs on a system.
- Logs for Metrics Collection are located in C:\Program Files\LogRhythm\LogRhythm Common\logs. The LogRhythm Metrics Collection.log is available on all LogRhythm appliances.
- Logs for LogRhythm Metrics services are located at C:\Program Files\LogRhythm\LogRhythm Metrics\logs. The LogRhythm Metrics Database.log and LogRhythm Metrics Web UI.log are only available on one server in your deployment, usually the Platform Manager.
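To verify that a service's logs stay within the 50 MB figure, the sizes of its log files can be summed. The sketch below is a minimal Python example under two assumptions that the text does not confirm: that rotated log files share the service name as a filename prefix, and that 50 MB means 50 × 1024 × 1024 bytes. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
LIMIT_BYTES = 50 * 1024 * 1024


def log_bytes(log_dir: str, service: str) -> int:
    """Total size of the files in log_dir whose names start with the service name."""
    return sum(
        p.stat().st_size
        for p in Path(log_dir).iterdir()
        if p.is_file() and p.name.startswith(service)
    )


def over_limit(log_dir: str, service: str, limit: int = LIMIT_BYTES) -> bool:
    """Return True if the service's logs in log_dir exceed the limit."""
    return log_bytes(log_dir, service) > limit
```

For example, `over_limit(r"C:\Program Files\LogRhythm\LogRhythm Common\logs", "LogRhythm Metrics Collection")` would report whether the Metrics Collection logs in that directory exceed the limit.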