Query Exported Axon Data with Athena
Using AWS Athena, you can query and search through your exported Axon data to review and hunt for information.
Configure AWS Glue
AWS Glue allows you to integrate data from across multiple sources, and configuring this service is necessary to view exported Axon data in AWS Athena.
To configure AWS Glue for the Axon Schema, from the AWS console home screen:
In the left-hand panel, click Analytics, and then click AWS Glue.
In the left-hand panel, expand Data Catalog and then click Databases.
Click Add database.
Create a new database called “Axon”.
Enter an optional description and location.
Click Create database.
In the left-hand panel, expand Data Catalog again and select Tables.
Click Add table.
In the “Set table properties” section, enter the following information:
Name the table cdo_schema.
From the Database dropdown, select the newly-created Axon database.
Under Data store, click Browse S3 and select the bucket created in the Axon Data Export topic.
Under Data format, click JSON.
Leave all of the unmentioned fields at their default settings.
Click Next.
In the section “Choose or define schema,” click Edit schema as JSON.
Copy and paste the JSON from the following file into the given space: Axon Schema for AWS Glue
Click Save.
Click Next.
Click Create.
The Axon schema fields successfully appear in the created table.
Query Data Using Athena
Once the Axon schema has been configured using AWS Glue, you can begin using AWS Athena to query your exported log data.
For a more in-depth explanation of AWS Athena, refer to the Athena documentation:
The following fields are currently available to be queried using AWS Athena. Additional fields will be added for more in-depth searches in the future. For more information on these fields, refer to Axon Data Schema Fields.
Field Name | Canonical ID |
---|---|
datetime | general_information.standard_message_time |
Message ID | general_information.message_id |
Raw | general_information.raw_message |
Common Event | general_information.common_event_name |
Log Source | general_information.log_source.type_name |
Log Source Type | general_information.log_source.type_id |
Origin Host IP | origin.host.ip_address.value |
Target Host IP | target.host.ip_address.value |
Origin Host Name | origin.host.name |
Target Host Name | target.host.name |
Origin Account Name | origin.account.name |
Target Account Name | target.account.name |
To query your data within Athena, from the AWS Console homepage:
In the left-hand panel, click Analytics, and then click Athena.
Open the Data source drop-list and select the bucket created in the Axon Data Export topic.
Open the Database drop-list and select the database created above.
In the Query panel on the right, enter your SQL-like query.
Click Run.
The search results appear at the bottom of the page in the Query results section.
Athena charges by the amount of data scanned. Refer to your Athena account information to learn more on the pricing.
Basic Athena Query Examples
The following examples can help you get started with finding common information needed using Athena queries. Because many queries involve searching through Axon’s common events, you can refer to the Axon Common Events List for more information on those fields.
Because Athena charges by the amount of data scanned, we recommend including a LIMIT in all of your queries to ensure that a search does not become too large.
How can I limit the data returned?
SELECT * from day_08
LIMIT 10
What log source types are available?
select "general_information.log_source.type_name"
FROM day_08
GROUP BY DISTINCT "general_information.log_source.type_name"
LIMIT 10000
What common events are available?
select "general_information.common_event_name"
FROM day_08
GROUP BY DISTINCT "general_information.common_event_name"
LIMIT 10000
How can I show which accounts were created?
select "origin.account.name", "target.account.name", "general_information.standard_message_time"
FROM day_05 where json_array_contains("general_information.common_event_name", 'Account Created')
LIMIT 1
# general_information.standard_message_time target.account.name origin.account.name
1 2015-08-12 Workstation1$ admin
How can I search between two dates?
select * FROM day_05
where "general_information.log_source.type_name" = 'Palo Alto NGFW' and ("origin.host.ip_address.value" = '10.1.1.2' OR "target.host.ip_address.value"= '10.1.1.2') and
"general_information.standard_message_time" between TIMESTAMP '2024-05-05' and TIMESTAMP '2024-05-06'
LIMIT 100
How can I search a raw log?
select * FROM day_08
where "general_information.log_source.type_name" like '%Windows%'
and "general_information.raw_message" like '%<EventID>4779</EventID>%'
LIMIT 100
How can I extract the EventIDs out of logs?
select regexp_extract("general_information.raw_message", '<EventID>\d+</EventID>')
FROM year_2024
where "general_information.raw_message" like '%<EventID>%' LIMIT 10
Search for Indicators of Compromise
The following query examples can help you find logs that may indicate an attack or compromise.
How do I find a hash value from a process?
select * FROM year_2024
where "general_information.raw_message" like '%DDC5D161C1F486514CAB7EEE882BAFD329EDED19E69BD16C9D271733950BD90B%'
LIMIT 100
How can I see network traffic involving an IP address?
select * FROM day_05
where "general_information.log_source.type_name" = 'Palo Alto NGFW' and ("origin.host.ip_address.value" = '10.1.1.2' OR "target.host.ip_address.value"= '10.1.1.2')
LIMIT 100
Compliance
The following query example can help you find logs that may be necessary for compliance purposes.
How can I show an account was terminated?
select "origin.account.name", "target.account.name", "general_information.standard_message_time"
FROM year_2024 where json_array_contains("general_information.common_event_name", 'Disabled Account')
LIMIT 1