Firewall to EDR Correlation Strategy by Hunters' Research Team

Firewalls and Next-Generation Firewalls can serve as a powerful data source for uncovering threats that evade detection on the host.

Hunters' researchers Dvir Sayag, Matthias Becache and Yaniv Assor share their insights on how to make the most out of your Firewall data when correlating it with EDR telemetry.


According to a Business Wire report, “The global network security Firewall market is expected to grow at a CAGR of 13.1% from 2019 to 2025 to reach $15.8 billion by 2025”, which suggests that Firewalls are not going away any time soon.

However, network security products have their own issues. Firewall-based detection is prone to false positives because the Firewall lacks the context of the host executing the suspicious activity. For instance, network vulnerability scanners tend to be noisy and trigger many alerts in Firewall logs.

Hunters' approach to this problem is to correlate Firewall data with data from the EDR. This way, we can whitelist the process and the IP address of the vulnerability scanner, which makes these kinds of alerts much easier to deal with.

Beyond that, automatically attaching host context to those network alerts can help uncover sophisticated threats that were only visible at the network level, without having to go through thousands of low-fidelity network alerts. We call it the “Firewall to EDR Correlation Strategy”.

This blog post will explain the technical details behind the Correlation Strategy and provide you with the tools to implement it and have it work automatically regardless of the size of the security operation and the security stack.

Implementation Process

Preparation

Before implementing the Correlation Strategy it's important to map several things and make sure the ground is ready.

First, make sure to extract the below attributes for each Firewall alert:

  1. Event Time (GENERATED_TIME)
  2. EDR Source IP
  3. NAT Source IP
  4. Destination IP
  5. NAT Destination IP

Next, we recommend creating a table that maps EDR agent_ids to each agent's local_ip addresses. Querying this table later on shortens the correlation process.
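The mapping table can be sketched in a few lines of Python. This is only an illustration under assumptions: the EDR event fields (agent_id, local_ip, event_time), the seen_at column and the build_endpoint_ip_mapping helper are hypothetical names, not the schema of any specific EDR product.

```python
from datetime import datetime, timezone


def build_endpoint_ip_mapping(edr_events):
    """Reduce raw EDR events to distinct (agent_id, local_ip, seen_at) rows."""
    seen = set()
    rows = []
    # Walk events in time order so the resulting table is sorted by seen_at.
    for event in sorted(edr_events, key=lambda e: e["event_time"]):
        key = (event["agent_id"], event["local_ip"], event["event_time"])
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        rows.append({
            "agent_id": event["agent_id"],
            "local_ip": event["local_ip"],
            "seen_at": event["event_time"],
        })
    return rows


# Illustrative events only; real data would come from the EDR telemetry.
edr_events = [
    {"agent_id": "agent-b", "local_ip": "10.0.0.9",
     "event_time": datetime(2021, 3, 1, 10, 5, tzinfo=timezone.utc)},
    {"agent_id": "agent-a", "local_ip": "10.0.0.5",
     "event_time": datetime(2021, 3, 1, 10, 0, tzinfo=timezone.utc)},
    {"agent_id": "agent-a", "local_ip": "10.0.0.5",
     "event_time": datetime(2021, 3, 1, 10, 0, tzinfo=timezone.utc)},
]
endpoint_ip_mapping = build_endpoint_ip_mapping(edr_events)
```

Keeping the observation time next to each pair matters because an IP address can move between agents over time, so later lookups should be restricted to a window around the Firewall event.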

Note that Firewall logs record Source and Destination IPs, whereas the EDR records Local and Remote IPs.

Local is the side of the EDR device, which may be either the source or the destination IP in a Firewall alert. That means we always need to check both the source and the destination IP against the Local IP in the EDR.

 

First Correlation Step: Firewall to EDR agent_id

To start correlating the data, we should be able to identify the EDR agent_id behind the suspicious action that the Firewall alert caught. That way we can further investigate if the actions on the machine are suspicious or not, which is impossible without the EDR agent.

Following that, for every Firewall alert, we want to find the relevant EDR agent_ids that were involved - the source EDR and the destination EDR. 

We do that by extracting the source and destination IP addresses from the Firewall lead and then returning a list of source agent_ids and a list of destination agent_ids. See the implementation below.

Using the table we created before (endpoint_ip_mapping), find the source agent_ids whose EDR local_ip equals fw.source_ip in the 30 minutes before or after the Firewall event.

 

The same lookup applies to the destination agent_ids, this time matching the EDR local_ip against the Firewall destination IP.
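As a rough sketch, the lookup for both sides could look like the following in Python. It assumes endpoint_ip_mapping is a list of rows with hypothetical agent_id, local_ip and seen_at fields, and that the Firewall alert exposes generated_time, source_ip and destination_ip; the find_agent_ids helper is illustrative, not the query used in the Hunters platform.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=30)  # 30 minutes before or after the Firewall event


def find_agent_ids(endpoint_ip_mapping, fw_ip, fw_time, window=WINDOW):
    """Return agent_ids whose local_ip equals fw_ip close to fw_time."""
    return sorted({
        row["agent_id"]
        for row in endpoint_ip_mapping
        if row["local_ip"] == fw_ip and abs(row["seen_at"] - fw_time) <= window
    })


# Illustrative data only.
endpoint_ip_mapping = [
    {"agent_id": "agent-a", "local_ip": "10.0.0.5",
     "seen_at": datetime(2021, 3, 1, 10, 50, tzinfo=timezone.utc)},
    {"agent_id": "agent-b", "local_ip": "10.0.0.9",
     "seen_at": datetime(2021, 3, 1, 11, 20, tzinfo=timezone.utc)},
    # Same IP as agent-a, but seen five hours earlier: outside the window.
    {"agent_id": "agent-c", "local_ip": "10.0.0.5",
     "seen_at": datetime(2021, 3, 1, 6, 0, tzinfo=timezone.utc)},
]
fw_alert = {
    "generated_time": datetime(2021, 3, 1, 11, 0, tzinfo=timezone.utc),
    "source_ip": "10.0.0.5",
    "destination_ip": "10.0.0.9",
}

source_agent_ids = find_agent_ids(
    endpoint_ip_mapping, fw_alert["source_ip"], fw_alert["generated_time"])
destination_agent_ids = find_agent_ids(
    endpoint_ip_mapping, fw_alert["destination_ip"], fw_alert["generated_time"])
```

The same function serves both sides because, from the EDR's point of view, either Firewall address may be the Local IP of an agent.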

 

 

Second Correlation Step: Finding EDR Process IDs

In the next stage, after finding the agent_id behind the malicious activity, we search for the network events and processes responsible for the malicious connections from this agent to the known remote IP.

If we did not find an agent for the specific Firewall alert, this correlation will not run at all.  

To do that, we extract the process_ids that occurred in the time chunk closest to the Firewall alert. If the same process appears many times, we use the occurrence closest to the event.

  Source side:

  1. Find network events on one or more of the source agent_ids we found before
  2. Where the local_ip is one of the Firewall source IPs (internal or NAT)
  3. And the remote_ip is one of the Firewall destination IPs (internal or NAT)
  4. Within a time range from 24 hours before the Firewall alert to one hour after it

  Destination side:

  1. Find network events on one or more of the destination agent_ids we found before
  2. Where the local_ip is one of the Firewall destination IPs (internal or NAT)
  3. And the remote_ip is one of the Firewall source IPs (internal or NAT)
  4. Within a time range from 24 hours before the Firewall alert to one hour after it
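The four conditions on each side can be sketched as a single filter in Python. The event fields (agent_id, local_ip, remote_ip, event_time, process_id) and the matching_network_events helper are assumptions made for illustration; swapping the two IP lists turns the source-side query into the destination-side one.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(hours=24)  # search up to 24 hours before the alert
LOOKAHEAD = timedelta(hours=1)  # and up to one hour after it


def matching_network_events(events, agent_ids, local_ips, remote_ips, fw_time):
    """Keep EDR network events that satisfy all four conditions of one side."""
    agent_ids, local_ips, remote_ips = set(agent_ids), set(local_ips), set(remote_ips)
    start, end = fw_time - LOOKBACK, fw_time + LOOKAHEAD
    return [
        e for e in events
        if e["agent_id"] in agent_ids
        and e["local_ip"] in local_ips
        and e["remote_ip"] in remote_ips
        and start <= e["event_time"] <= end
    ]


def utc(day, hour, minute=0):
    return datetime(2021, 3, day, hour, minute, tzinfo=timezone.utc)


fw_time = utc(2, 11)
fw_source_ips = ["10.0.0.5", "192.0.2.10"]  # internal and NAT (illustrative)
fw_destination_ips = ["10.0.0.9"]

events = [
    {"agent_id": "agent-a", "local_ip": "10.0.0.5", "remote_ip": "10.0.0.9",
     "event_time": utc(2, 7), "process_id": "p1"},       # 4 hours before: kept
    {"agent_id": "agent-a", "local_ip": "10.0.0.5", "remote_ip": "8.8.8.8",
     "event_time": utc(2, 10, 59), "process_id": "p2"},  # wrong remote_ip
    {"agent_id": "agent-a", "local_ip": "10.0.0.5", "remote_ip": "10.0.0.9",
     "event_time": utc(1, 9), "process_id": "p3"},       # more than 24 hours before
    {"agent_id": "agent-b", "local_ip": "10.0.0.9", "remote_ip": "10.0.0.5",
     "event_time": utc(2, 11, 30), "process_id": "p4"},  # destination side
]

source_events = matching_network_events(
    events, ["agent-a"], fw_source_ips, fw_destination_ips, fw_time)
destination_events = matching_network_events(
    events, ["agent-b"], fw_destination_ips, fw_source_ips, fw_time)
```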

The result for each side is a nested JSON that looks like the following: {<agent_id>: {<EDR specific_source_type>: [(<process_id>, <edr_process_time>)]}}, where the list can contain many (<process_id>, <edr_process_time>) tuples.
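A minimal sketch of how the matched events might be folded into that nested structure is shown below. It also applies the rule of keeping, for a process seen many times, only the occurrence closest to the Firewall alert. The source_type and process_id field names and the build_result helper are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_result(events, fw_time):
    """Group events as {agent_id: {source_type: [(process_id, process_time)]}}."""
    closest = {}
    for e in events:
        key = (e["agent_id"], e["source_type"], e["process_id"])
        best = closest.get(key)
        # For a repeated process, keep the occurrence nearest to the alert.
        if best is None or abs(e["event_time"] - fw_time) < abs(best - fw_time):
            closest[key] = e["event_time"]

    result = {}
    for (agent_id, source_type, process_id), process_time in closest.items():
        result.setdefault(agent_id, {}).setdefault(source_type, []).append(
            (process_id, process_time.isoformat())
        )
    return result


def utc(hour, minute=0):
    return datetime(2021, 3, 1, hour, minute, tzinfo=timezone.utc)


fw_time = utc(11)
events = [
    {"agent_id": "agent-a", "source_type": "network_connection",
     "process_id": "p1", "event_time": utc(7)},
    {"agent_id": "agent-a", "source_type": "network_connection",
     "process_id": "p1", "event_time": utc(10, 58)},  # closer occurrence of p1
    {"agent_id": "agent-a", "source_type": "network_connection",
     "process_id": "p2", "event_time": utc(10, 30)},
]
result = build_result(events, fw_time)
as_json = json.dumps(result)  # tuples are serialized as JSON arrays
```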

 

Results

For every Firewall alert, this correlation strategy provides enrichments like process tree and process names, command lines, related users, hash prevalence in the organization, and more. It also provides context to the alert, which is typically lacking when simply looking at the Firewall data.

Getting these additional insights allows the SOC operator to further inspect the Firewall alert and make security decisions with higher confidence. 


Implementation Difficulties you Might Experience


Time Stamps:

One of the challenges that may arise during implementation is timestamps.

Timestamps can be a problem because, to correlate accurately, we need to query the data within the shortest possible time window around the moment the alert was raised.

EDR network events and Firewall alerts might be recorded with different timestamps. This can happen for several reasons, so we first recommend making sure that your products are synchronized to the same time zone.

Imagine this scenario: your PAN is using the USA time zone while your EDR is using the Asia time zone. Querying 30 minutes before or after the PAN alert to look for agent_ID won’t give any results because the EDR is using a different time. 

Another scenario, specific to PAN users, is that PAN outputs logs without a time zone. This can create a real problem: if the PAN is not configured to UTC, its timestamps will differ from those of the EDR we correlate with.
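One way to guard against this is to normalize every timestamp to UTC before correlating. The sketch below assumes the Firewall's UTC offset is known from its configuration and that its log timestamps look like "2021/03/01 11:00:00"; both the format string and the to_utc helper are assumptions. A fixed offset is a simplification, since a named time zone would be needed to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

FW_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed log format, without a time zone


def to_utc(raw_timestamp, utc_offset_hours):
    """Parse a zone-less log timestamp and convert it to UTC."""
    naive = datetime.strptime(raw_timestamp, FW_TIME_FORMAT)
    # Attach the offset the device is configured with, then convert.
    local = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


# Firewall logs local time at UTC-5; the EDR already reports UTC (illustrative).
fw_time_utc = to_utc("2021/03/01 11:00:00", -5)
edr_time_utc = datetime(2021, 3, 1, 16, 10, tzinfo=timezone.utc)

# After normalization the two events are 10 minutes apart, inside a 30-minute window.
within_window = abs(edr_time_utc - fw_time_utc) <= timedelta(minutes=30)
```

Without the conversion, the same two events would appear to be more than five hours apart and the agent lookup would return nothing.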

Specifically for PAN→CrowdStrike:

We found out that many of the PAN alerts were not close to the CrowdStrike network event, so we searched for the EDR processes many hours back. We've already seen an interesting case where an alert was raised in PAN at 11:00 AM, but the network connection event in the EDR was at 7:00 AM.

This was probably caused by the fact that the connection was initiated at 7:00 AM but the malicious traffic occurred only at 11:00 AM. In this case, the process was w3wp.exe.


Network Structure:

This correlation strategy depends mostly on the network structure in the organization.

Different structures might create different situations that will make it difficult to find the specific agent_ID and later on the malicious processes.

The difficulty may arise in situations where the source or destination IP address changed before the traffic reached the Firewall, which makes the correlation much harder.


Summary 

We have explained how to implement a Firewall to EDR correlation strategy within your security environment. Our recommendation is to automate it so that ultimately handling Firewall alerts can become an easier and more cost-effective task.

Correlation is one of the hardest topics to solve as an XDR. Hunters’ open XDR delivers it automatically as an out-of-the-box capability. Another correlation, Proxy to EDR, is also provided within the Hunters platform.

Follow our Twitter account for more technical content.