Automatic Host Contextualization
Automatic Host Contextualization has been disabled by default for almost all log sources as of Knowledge Base 7.1.507 and Knowledge Base 6.1.507. If your Knowledge Base is synced to one of those versions, you can re-enable Automatic Host Contextualization on a log source by creating a custom policy copy and re-enabling the feature on the rules that require it.
There are several fields that store information about hosts that are referenced within a log message:
- Host (Impacted). The host or device impacted by log activity.
- Host (Origin). The host or device from which log activity originated.
- Interface (Impacted). The impacted interface number of a device or the physical port number of a switch.
- Interface (Origin). The origin interface number of a device or the physical port number of a switch.
- MAC Address (Impacted). The MAC address of the impacted host or device.
- MAC Address (Origin). The MAC address of the origin host or device.
- NAT IP Address (Impacted). The IP address from which the impacted IP was translated via NAT device logs.
- NAT IP Address (Origin). The IP address from which the origin IP was translated via NAT device logs.
The stored value for these fields is often derived from values parsed from IP address and host name tags. For example, the value parsed for the Source IP tag is stored in Origin Host, and the value parsed for the Destination IP tag is stored in Impacted Host.
However, logs do not always record the origin (source) and impacted (destination) values in the same parsing fields. This is often the case for network devices that report network flow data. These devices do not distinguish client traffic from server traffic, so they may report the server as the Source IP (SIP) in one log and as the Destination IP (DIP) in another, even when both logs belong to the same network flow.
Based on the port values, LogRhythm can infer the relationship of the two hosts if a log contains parsed values for the following fields:
- Source Port (SPort)
- Destination Port (DPort)
Port values can reliably determine which system is running a server process (impacted host/DIP/DName) vs. the system connecting to that server process (origin host/SIP/SName). This is performed by referring to IANA port allocation rules combined with internal known port lookup tables.
Automatic Host Contextualization Algorithm
To determine which host is the origin host and which is the impacted host, LogRhythm first checks whether the values parsed for SPort and DPort are mapping ports:
- If both SPort and DPort are mapping ports, infer the relationship via IANA logic.
- If only one value is a mapping port, the host associated with that port is the impacted host.
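The mapping-port check above can be sketched as follows. This is a minimal illustration, not LogRhythm's implementation: the function name, the return values, and the contents of `KNOWN_MAPPING_PORTS` are hypothetical, and the text does not say what happens when neither port is a mapping port, so the sketch assumes that case also defers to the IANA logic.

```python
from typing import Optional

# Hypothetical stand-in for the internal known-port lookup table.
KNOWN_MAPPING_PORTS = frozenset({22, 53, 80, 443, 3389})


def impacted_side_by_mapping(sport: int, dport: int,
                             mapping_ports=KNOWN_MAPPING_PORTS) -> Optional[str]:
    """Return "sport" or "dport" when exactly one port is a mapping port.

    Returns None when the mapping check alone cannot decide, meaning the
    caller should fall back to the IANA inference logic.
    """
    s_mapped = sport in mapping_ports
    d_mapped = dport in mapping_ports
    if s_mapped and d_mapped:
        return None  # both are mapping ports: defer to IANA logic
    if s_mapped:
        return "sport"  # host parsed with SPort is the impacted host
    if d_mapped:
        return "dport"  # host parsed with DPort is the impacted host
    return None  # neither is mapped: assumed (not stated) to defer as well
```

For example, `impacted_side_by_mapping(51000, 443)` selects the DPort side, while `impacted_side_by_mapping(80, 443)` returns `None` because both ports are in the hypothetical table.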
IANA Inference Logic
- If the value parsed for DPort <= 1023 AND the value parsed for SPort > 1023, then Impacted host or port = parsed DPort host or port values
- If the value parsed for SPort <= 1023 AND the value parsed for DPort > 1023, then Impacted host or port = parsed SPort host or port values
- If the value parsed for DPort (>= 1024 AND <= 49151) AND the value parsed for SPort > 49151, then Impacted host or port = parsed DPort host or port values
- If the value parsed for SPort (>= 1024 AND <= 49151) AND the value parsed for DPort > 49151, then Impacted host or port = parsed SPort host or port values
- Else, Impacted host or port = parsed DPort host or port values
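The IANA inference rules translate directly into a chain of range checks. The sketch below mirrors the five rules in order; the function name and the "sport"/"dport" return values are illustrative choices, not part of the product.

```python
def iana_impacted_side(sport: int, dport: int) -> str:
    """Return which parsed side ("sport" or "dport") is the impacted side."""
    # Rule 1: DPort is a system port, SPort is not.
    if dport <= 1023 and sport > 1023:
        return "dport"
    # Rule 2: SPort is a system port, DPort is not.
    if sport <= 1023 and dport > 1023:
        return "sport"
    # Rule 3: DPort is a registered port, SPort is in the dynamic range.
    if 1024 <= dport <= 49151 and sport > 49151:
        return "dport"
    # Rule 4: SPort is a registered port, DPort is in the dynamic range.
    if 1024 <= sport <= 49151 and dport > 49151:
        return "sport"
    # Rule 5: no rule matched, so keep the parsed destination as impacted.
    return "dport"
```

A log parsed with SPort 3306 and DPort 55000 matches the fourth rule, so the host parsed alongside SPort would be treated as the impacted host. When both ports fall in the same range (for example 2000 and 3000), the final rule leaves the parsed orientation unchanged.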
Automatic Host Contextualization Processing
Automatic Host Contextualization is performed only if all the following are true:
- The log has parsed values for Origin Host, Impacted Host, Origin Port, and Impacted Port.
- The host context is set to Tags Normal or Tags Reversed.
- The service context is set to Tags Normal or Tags Reversed.
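The three preconditions can be expressed as a single predicate. The sketch below is illustrative only: the `ParsedLog` type and its field names are hypothetical, and the context settings are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Optional

# The two context settings named in the text.
ELIGIBLE_CONTEXTS = {"Tags Normal", "Tags Reversed"}


@dataclass
class ParsedLog:
    """Hypothetical container for the parsed values relevant to the check."""
    origin_host: Optional[str]
    impacted_host: Optional[str]
    origin_port: Optional[int]
    impacted_port: Optional[int]
    host_context: str
    service_context: str


def is_eligible(log: ParsedLog) -> bool:
    """Return True only if all three preconditions hold."""
    has_all_values = all(
        value is not None
        for value in (log.origin_host, log.impacted_host,
                      log.origin_port, log.impacted_port)
    )
    return (has_all_values
            and log.host_context in ELIGIBLE_CONTEXTS
            and log.service_context in ELIGIBLE_CONTEXTS)
```

A log that is missing any one of the four parsed values, or whose host or service context uses a different setting, is left unchanged.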
One of the key reasons to contextualize a host or service automatically is to improve the aggregation of log data for unique IP and port combinations. Currently, network data can be aggregated on the following fields:
- Origin Host
- Impacted Host
- Origin Port (disabled by the default LogMartMode setting)
- Impacted Port
If the value stored as Impacted Port is actually the origin port and is therefore effectively random, aggregation is much less effective and LogMart utilization is reduced.
As a solution, whenever Automatic Host Contextualization is performed, LogMart is updated according to the following rules:
- Origin Port is set to NULL regardless of the LogMartMode setting.
- The parsed value for Impacted Port is saved if any of the following conditions are true:
  - Impacted port is determined via Port Mapping.
  - Impacted port is determined via IANA algorithm.
- If none of the above are true:
  - Impacted Port is set to NULL.
  - ServiceId is set to one of the following three values:
    - Unknown UDP
    - Unknown TCP
  - The ServiceId value is also set for the associated log and event.
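The LogMart update rules can be sketched as a small function. Several details here are assumptions made for illustration: the `LogMartFields` type, the `determined_by` labels, and the idea that the Unknown TCP or Unknown UDP service is chosen from the log's protocol. Only the two ServiceId values listed above are modeled; for any other protocol the sketch returns no service override.

```python
from typing import NamedTuple, Optional


class LogMartFields(NamedTuple):
    """Hypothetical result: values written to LogMart for one log."""
    origin_port: Optional[int]
    impacted_port: Optional[int]
    service: Optional[str]  # None means the existing ServiceId is kept


def apply_logmart_rules(impacted_port: int,
                        determined_by: Optional[str],
                        protocol: str) -> LogMartFields:
    """Apply the update rules after contextualization.

    determined_by is "port_mapping", "iana", or None (hypothetical labels).
    """
    # Origin Port is always cleared, regardless of the LogMartMode setting.
    if determined_by in ("port_mapping", "iana"):
        return LogMartFields(None, impacted_port, None)
    # Impacted port could not be determined: clear it and mark the service.
    # Choosing by protocol is an assumption; the text only lists the values.
    unknown = {"tcp": "Unknown TCP", "udp": "Unknown UDP"}.get(protocol.lower())
    return LogMartFields(None, None, unknown)
```

Clearing Origin Port in every case removes the effectively random client port from the aggregation key, which is the stated purpose of these rules.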
Considerations for Bytes In/Out and Items In/Out
Because bytes in/out and items in/out always pertain to the value stored for impacted host, the values parsed must be set accordingly. Therefore, if host context is determined to be reversed (what was parsed for SIP/SName is stored as DIP/DName), the values parsed for bytes in/out and items in/out are also reversed.
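As a rough illustration of the reversal, the sketch below swaps the in and out counters when the host context has been reversed. The `FlowCounters` type and the function name are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FlowCounters:
    """Hypothetical container for counters relative to the impacted host."""
    bytes_in: int
    bytes_out: int
    items_in: int
    items_out: int


def orient_counters(counters: FlowCounters,
                    host_context_reversed: bool) -> FlowCounters:
    """Swap in/out values when the parsed host context was reversed."""
    if not host_context_reversed:
        return counters
    return replace(
        counters,
        bytes_in=counters.bytes_out,
        bytes_out=counters.bytes_in,
        items_in=counters.items_out,
        items_out=counters.items_in,
    )
```

With this sketch, counters parsed as 100 bytes in and 900 bytes out become 900 bytes in and 100 bytes out once the hosts are swapped, so the values still describe traffic from the impacted host's point of view.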