Exploring – Azure Firewall Analytics

Azure Firewall is ever growing in popularity as a choice when it comes to perimeter protection for Azure networking. The introduction of additional SKUs (Premium and Basic) since its launch have made it both more functional while also increasing its appeal to a broader environment footprint.

For anyone who has used Azure Firewall since the beginning, troubleshooting and analysis of your logs has always had a steep-ish learning curve. On one hand, the logs are stored in Log Analytics and you can query them using Kusto, so there is familiarity. However, without context, their formatting can be challenging. The good news is, this is being improved with the introduction of a new format.

Previously logs we stored using the Azure Diagnostics mode, with this update, we will now see the use of Resource-Specific mode. This is something that will become more common across many Azure resources, and you should see it appear for several in the Portal already.

Destination table in the Portal

What difference will this make for Azure Firewall? This will mean individual tables in the selected workspace are created for each category selected in the diagnostic setting. This offers the following improvements:

  • Makes it much easier to work with the data in log queries
  • Makes it easier to discover schemas and their structure
  • Improves performance across both ingestion latency and query times
  • Allows you to grant Azure RBAC rights on a specific table

For Azure Firewall, the new resource specific tables are below:

  • Network rule log – Contains all Network Rule log data. Each match between data plane and network rule creates a log entry with the data plane packet and the matched rule’s attributes.
  • NAT rule log – Contains all DNAT (Destination Network Address Translation) events log data. Each match between data plane and DNAT rule creates a log entry with the data plane packet and the matched rule’s attributes.
  • Application rule log – Contains all Application rule log data. Each match between data plane and Application rule creates a log entry with the data plane packet and the matched rule’s attributes.
  • Threat Intelligence log – Contains all Threat Intelligence events.
  • IDPS log – Contains all data plane packets that were matched with one or more IDPS signatures.
  • DNS proxy log – Contains all DNS Proxy events log data.
  • Internal FQDN resolve failure log – Contains all internal Firewall FQDN resolution requests that resulted in failure.
  • Application rule aggregation log – Contains aggregated Application rule log data for Policy Analytics.
  • Network rule aggregation log – Contains aggregated Network rule log data for Policy Analytics.
  • NAT rule aggregation log – Contains aggregated NAT rule log data for Policy Analytics.

So, let’s start with getting logs enabled on your Azure Firewall. You can’t query your logs if there are none! And Azure Firewall does not enable this by default. I’d generally recommend enabling logs as part of your build process and I have an example of that using Bicep over on Github, (note this is Diagnostics mode, I will update it for Resource mode soon!) However, if already built, let’s look at simply doing this via the Portal.

So on our Azure Firewall blade, head to the Monitoring section and choose “Diagnostic settings”

Azure Firewall blade

We’re then going to choose all our new resource specific log options

New resource specific log categories in Portal

Next, we choose to send to a workspace, and make sure to switch to Resource specific.

Workspace option with Resource option chosen in Portal

Finally, give your settings a name, I generally use my resource convention here, and click Save.

It takes a couple of minutes for logs to stream through, so while that happens, let’s look at what is available for analysis on Azure Firewall out-of-the-box – Metrics.

While there are not many entries available, what is there can be quite useful to see what sort of strain your Firewall is under.

Metrics dropdown for Azure Firewall

Hit counts are straight forward, they can give you an insight into how busy the service is. Data Processed and Throughput are also somewhat interesting from an analytics perspective. However, it is Health State and SNAT that are most useful in my opinion. These are metrics you should enable alerts against.

For example, an alert rule for SNAT utilisation reaching an average of NN% can be very useful to ensure scale is working and within limits for your service and configuration of IPs.

Ok, back to our newly enabled Resource logs. When you open the logs tab on your Firewall, if you haven’t disabled it, you should see a queries screen pop-up as below:

Sample log queries

You can see there are now two sections, one specifically for Resource Specific tables. If I simply run the following query:

AZFWNetworkRule

I get a structured and clear output:

NerworkRule query output

To get a comparative output using Diagnostics Table, I need to run a query similar to the below:

// Network rule log data 
// Parses the network rule log data. 
AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| where OperationName == "AzureFirewallNatRuleLog" or OperationName == "AzureFirewallNetworkRuleLog"
//case 1: for records that look like this:
//PROTO request from IP:PORT to IP:PORT.
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
//case 1a: for regular network rules
| parse kind=regex flags=U msg_s with * ". Action\\: " Action1a "\\."
//case 1b: for NAT rules
//TCP request from IP:PORT to IP:PORT was DNAT'ed to IP:PORT
| parse msg_s with * " was " Action1b:string " to " TranslatedDestination:string ":" TranslatedPort:int *
//Parse rule data if present
| parse msg_s with * ". Policy: " Policy ". Rule Collection Group: " RuleCollectionGroup "." *
| parse msg_s with * " Rule Collection: "  RuleCollection ". Rule: " Rule 
//case 2: for ICMP records
//ICMP request from 10.0.2.4 to 10.0.3.4. Action: Allow
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend
SourcePort = tostring(SourcePortInt),
TargetPort = tostring(TargetPortInt)
| extend 
    Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), split(Action1a,".")[0]),
    Protocol = case(Protocol == "", Protocol2, Protocol),
    SourceIP = case(SourceIP == "", SourceIP2, SourceIP),
    TargetIP = case(TargetIP == "", TargetIP2, TargetIP),
    //ICMP records don't have port information
    SourcePort = case(SourcePort == "", "N/A", SourcePort),
    TargetPort = case(TargetPort == "", "N/A", TargetPort),
    //Regular network rules don't have a DNAT destination
    TranslatedDestination = case(TranslatedDestination == "", "N/A", TranslatedDestination), 
    TranslatedPort = case(isnull(TranslatedPort), "N/A", tostring(TranslatedPort)),
    //Rule information
    Policy = case(Policy == "", "N/A", Policy),
    RuleCollectionGroup = case(RuleCollectionGroup == "", "N/A", RuleCollectionGroup ),
    RuleCollection = case(RuleCollection == "", "N/A", RuleCollection ),
    Rule = case(Rule == "", "N/A", Rule)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, TranslatedDestination, TranslatedPort, Policy, RuleCollectionGroup, RuleCollection, Rule

Obviously, there is a large visual difference in complexity! But there are also all of the benefits as described earlier for Resource Specific. I really like the simplicity of the queries. I also like the more structured approach. For example, take a look at the set columns that are supplied on the Application Rule table. You can now predict, understand, and manipulate queries with more detail than ever before. You can check out all the new tables by searching “AZFW” on this page.

Finally, a nice sample query to get you started. One that I use quite often when checking on new services added, or if there are reports of access issues. The below gives you a quick glance into web traffic being blocked and can allow you to spot immediate issues.

AZFWApplicationRule
| where Action == "Deny"
| distinct Fqdn
| sort by Fqdn asc

As usual, if there are any questions, get in touch!

Global Azure Virtual 2020

As a result of the current global health crisis, the Global Azure events have moved entirely online. Spread over three days (23rd-25th April) it promises to be an amazing mix of content from contributors from all corners of the world.

In the UK & Ireland, we have a local event. Organised by a great team, all of the details are here – https://azureglobalbootcamp2020.azurewebsites.net/

There is a mix of live, and pre-recorded sessions with a wide range of topics.

I’ve contributed a session myself, pre-recorded, on Azure Foundations. If anyone you know is looking to make a start on Azure, have them check it out! #GlobalAzureVirtual

How to – Use Azure Firewall IP Groups

If you’re familiar with Azure Firewall you would know that the introduction of an IP Group resource is most welcome. IP Groups are still in preview at the moment, so as usual be cautious on production environments as there is no SLA. However, it’s always nice to try out a service to see if it can work for you, or make your life easier.

IP Groups themselves are a relatively simple resource. They can contain a single IP address, multiple IP addresses, or one or more IP address ranges. They can then be used for DNAT, Network, or Application rules in Azure Firewall.

They currently have some interesting limitations that are a little bit confusing at first. From Docs:

For 50 IP Groups or less, you can have a maximum of 5000 individual IP addresses each per firewall instance. For 51 to 100 IP Groups, you can have 500 individual IP address each per firewall instance.

What this means is that while your rules should already be scoped accurately, you may need to use a couple of extra IP groups if you’re working with large address ranges. A simple example is a /16 will simply not work in an IP Group, /20 is basically your limit per IP Group.

I actually tried this on my own sub and it appears to actually work for now. Expect that to change as preview progresses.

If you’ve worked with Azure Firewall, I’m sure you’ve already thought of several places these rules can really help. For me, it was within Network Rule Collections.

However, as the service is in preview, there are a few aspects to be ironed out. Unfortunately, one of those is the ability to add an IP Group as a destination within a network rule when using the Portal. See below

UPDATE: As expected, this is now resolved! However, read on to see how to do this at scale.

At this point, I am going to flag extreme caution if your Azure Firewall is in production and you are trying this. It is very easy to overwrite all of your collections, take your time and export them before making any changes!

I’m a Windows guy, so I’m going to explain how to do this with Powershell, but it also works for CLI. Similarly, I’m showing a Network Rule, same process works for Application Rules.

First up, you need to all of the details for your Azure Firewall as we will work with it’s config as a variable and finally update it.

#Get the AFW I want to edit
$afw = Get-AzFirewall -Name wda-afw-test -ResourceGroupName rg-wda-afw

#Save current Network Rule Collection to a variable for reference
$oldcol = $afw.NetworkRuleCollections

#Get the IP Group I want to use
$ipg = Get-AzIpGroup -Name wda-group1 -ResourceGroupName rg-wda-afw

#Create my new network rule
$newrule = New-AzFirewallNetworkRule -Name "rule2" -Protocol TCP -SourceAddress * -DestinationIpGroup $ipg.Id -DestinationPort 445

Now this is where it can get a bit tricky. Collections are stored as nested arrays. My AFW has two collections, I want to add my new rule to the second one which means I need to reference index 1. See the collections below, the one we’ll be editing is “collection2” which currently only has “rule2”

#view all collections
$afw.NetworkRuleCollections

#view the specific collection rules using place in array
$afw.NetworkRuleCollections[1].Rules | ft

#add my new rule to my collection
$afw.NetworkRuleCollections[1].AddRule($newrule)

#if you like, check it has updated as desired
$afw.NetworkRuleCollections[1].Rules | ft

#If as expected, update AFW
Set-AzFirewall -AzureFirewall $afw

The last command can take a minute or two to complete. Once it has, you can see the rule is now added to my collection2. The Portal will display it correctly, but you cannot edit correctly with the glitch.

And that’s it! You’ve successfully added an IP Group as a destination to your Azure Firewall. Again, please be careful, the above is only a guide and I cannot be responsible for your Azure Firewall ūüôā

How to – Troubleshoot Azure Firewall

Networking in Azure is one of my favourite topics. As my work has me focus primarily on Azure Virtual Datacenter builds, networking is key. When Microsoft introduced Azure Firewall (AFW), I was excited to see a platform based option as a hopeful alternative to the traditional NVAs. Feature wise in preview, AFW lacked some key functionality. Once it went GA a lot of the asks from the community were rectified however there are still some outstanding issues like cost, but all of that is for a blog post for another day!

AFW is used in a lot of environments. It’s simple to deploy, resilient and relatively straight forward to configure. However, once active in the environment, I noticed that finding out what is going wrong can be tricky. Hopefully this post helps with that and can save you some valuable time!

I don’t know about you, but the first thing I always check when trying to solve a problem is the most simple solution. For AFW that check is to make sure it’s not stopped. Yes that’s right you can “stop” AFW. It’s quick and easy to do via shell:

# Stop an existing firewall

$azfw = Get-AzFirewall -Name "FW Name" -ResourceGroupName "RG Name"
$azfw.Deallocate()
Set-AzFirewall -AzureFirewall $azfw

But how do you check if it has been stopped? Very simply, via the Azure Portal. On the overview blade for AFW it shows provisioning state. If this is anything but “Succeeded” you most likely have an issue.

So, how do you enable it again should you find your AFW deallocated? Again, quite simply via shell, however, it must be allocated to the original resource group and subscription. Also, while it deallocates almost instantly, it takes roughly the same amount of time to allocate AFW as it does to create one from scratch.

# Start a firewall

$azfw = Get-AzFirewall -Name "FW Name" -ResourceGroupName "RG Name"
$vnet = Get-AzVirtualNetwork -ResourceGroupName "RG Name" -Name "VNet Name"
$publicip = Get-AzPublicIpAddress -Name "Public IP Name" -ResourceGroupName " RG Name"
$azfw.Allocate($vnet,$publicip)
Set-AzFirewall -AzureFirewall $azfw

So, your AFW is active and receiving traffic via whatever method (NAT, Custom Route Tables etc.) and you have created rules to allow traffic as required. Don’t forget all traffic is blocked by default until you create rules.

By default the only detail you can get from AFW are metrics. These can show a small range of traffic with no granular detail, such as rules hit count.


To get detailed logs, like other Azure services, you need to enable them. I recommend doing this as part of your creation process. In terms of what to do, you have to add a diagnostic setting. There are two logs available, and I recommend choosing both.

  • AzureFirewallApplicationRule
  • AzureFirewallNetworkRule

In terms of where to send the logs, I like the integration offered by Azure Monitor Logs and there is a filtered shortcut right within the AFW blade too.

Once enabled, you should start seeing logs flowing into Azure Monitor Logs within five to ten minutes. One aspect that can be viewed as a slight negative is that logs are sent in JSON. As a result most of the interesting data you want is part of an object array:

{
  "category": "AzureFirewallNetworkRule",
  "time": "2018-06-14T23:44:11.0590400Z",
  "resourceId": "/SUBSCRIPTIONS/{subscriptionId}/RESOURCEGROUPS/{resourceGroupName}/PROVIDERS/MICROSOFT.NETWORK/AZUREFIREWALLS/{resourceName}",
  "operationName": "AzureFirewallNetworkRuleLog",
  "properties": {
      "msg": "TCP request from 111.35.136.173:12518 to 13.78.143.217:2323. Action: Deny"
  }
}

So, when running your queries, you need to parse that data. For those who have strong experience in Kusto, this will be no problem. For those who don’t, Microsoft thankfully provide guidance on how to parse both logs including explanatory comments

For ApplicationRule log

AzureDiagnostics
| where Category == "AzureFirewallApplicationRule"
//using :int makes it easier to pars but later we'll convert to string as we're not interested to do mathematical functions on these fields
//this first parse statement is valid for all entries as they all start with this format
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " " TempDetails
//case 1: for records that end with: "was denied. Reason: SNI TLS extension was missing."
| parse TempDetails with "was " Action1 ". Reason: " Rule1
//case 2: for records that end with
//"to ocsp.digicert.com:80. Action: Allow. Rule Collection: RC1. Rule: Rule1"
//"to v10.vortex-win.data.microsoft.com:443. Action: Deny. No rule matched. Proceeding with default action"
| parse TempDetails with "to " FQDN ":" TargetPortInt:int ". Action: " Action2 "." *
//case 2a: for records that end with:
//"to ocsp.digicert.com:80. Action: Allow. Rule Collection: RC1. Rule: Rule1"
| parse TempDetails with * ". Rule Collection: " RuleCollection2a ". Rule:" Rule2a
//case 2b: for records that end with:
//for records that end with: "to v10.vortex-win.data.microsoft.com:443. Action: Deny. No rule matched. Proceeding with default action"
| parse TempDetails with * "Deny." RuleCollection2b ". Proceeding with" Rule2b
| extend 
SourcePort = tostring(SourcePortInt)
|extend
TargetPort = tostring(TargetPortInt)
| extend
//make sure we only have Allowed / Deny in the Action Field
Action1 = case(Action1 == "Deny","Deny","Unknown Action")
| extend
    Action = case(Action2 == "",Action1,Action2),
    Rule = case(Rule2a == "",case(Rule1 == "",case(Rule2b == "","N/A", Rule2b),Rule1),Rule2a), 
    RuleCollection = case(RuleCollection2b == "",case(RuleCollection2a == "","No rule matched",RuleCollection2a),RuleCollection2b),
    FQDN = case(FQDN == "", "N/A", FQDN),
    TargetPort = case(TargetPort == "", "N/A", TargetPort)
| project TimeGenerated, msg_s, Protocol, SourceIP, SourcePort, FQDN, TargetPort, Action ,RuleCollection, Rule

For NetworkRule log

AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
//using :int makes it easier to pars but later we'll convert to string as we're not interested to do mathematical functions on these fields
//case 1: for records that look like this:
//TCP request from 10.0.2.4:51990 to 13.69.65.17:443. Action: Deny//Allow
//UDP request from 10.0.3.4:123 to 51.141.32.51:123. Action: Deny/Allow
//TCP request from 193.238.46.72:50522 to 40.119.154.83:3389 was DNAT'ed to 10.0.2.4:3389
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
//case 1a: for regular network rules
//TCP request from 10.0.2.4:51990 to 13.69.65.17:443. Action: Deny//Allow
//UDP request from 10.0.3.4:123 to 51.141.32.51:123. Action: Deny/Allow
| parse msg_s with * ". Action: " Action1a
//case 1b: for NAT rules
//TCP request from 193.238.46.72:50522 to 40.119.154.83:3389 was DNAT'ed to 10.0.2.4:3389
| parse msg_s with * " was " Action1b " to " NatDestination
//case 2: for ICMP records
//ICMP request from 10.0.2.4 to 10.0.3.4. Action: Allow
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend
SourcePort = tostring(SourcePortInt),
TargetPort = tostring(TargetPortInt)
| extend 
    Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), Action1a),
    Protocol = case(Protocol == "", Protocol2, Protocol),
    SourceIP = case(SourceIP == "", SourceIP2, SourceIP),
    TargetIP = case(TargetIP == "", TargetIP2, TargetIP),
    //ICMP records don't have port information
    SourcePort = case(SourcePort == "", "N/A", SourcePort),
    TargetPort = case(TargetPort == "", "N/A", TargetPort),
    //Regular network rules don't have a DNAT destination
    NatDestination = case(NatDestination == "", "N/A", NatDestination)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, NatDestination

Using either query gives you clear readable that you can filter. One tip however, is to add a sort command to the end of the queries, normally I use by TimeGenerated to show me the latest data. So to condense and add that for the NetworkRule query above, it would look like:

AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
| parse msg_s with * ". Action: " Action1a
| parse msg_s with * " was " Action1b " to " NatDestination
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend SourcePort = tostring(SourcePortInt),TargetPort = tostring(TargetPortInt)
| extend Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), Action1a),Protocol = case(Protocol == "", Protocol2, Protocol),SourceIP = case(SourceIP == "", SourceIP2, SourceIP),TargetIP = case(TargetIP == "", TargetIP2, TargetIP),SourcePort = case(SourcePort == "", "N/A", SourcePort),TargetPort = case(TargetPort == "", "N/A", TargetPort),NatDestination = case(NatDestination == "", "N/A", NatDestination)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, NatDestination
| sort by TimeGenerated desc

Using parsed data, you can immediately see all the traffic hitting AFW and, for example, filter on options such as Action to see only denied traffic.

Microsoft also provide pre-cooked visualisation should you prefer it, you can download from here – https://raw.githubusercontent.com/Azure/azure-docs-json-samples/master/azure-firewall/AzureFirewall.omsview – then import into Azure Monitor. The detail is great for quick glance work, I really like the ApplicationRule breakout

Application rule log data

That about sums it up. Hopefully you are now informed and equipped to troubleshoot traffic issues in your Azure Firewall instance. As always, if there are any questions, please get in touch!

If you need more info on how to enable logs – https://docs.microsoft.com/en-us/azure/firewall/tutorial-diagnostics

Log and metrics concepts – https://docs.microsoft.com/en-us/azure/firewall/logs-and-metrics

Azure Networking Security – Where to Start?

If you’ve read any of my blog posts regarding networking in Azure, you might have guessed it’s one of my favourite topics. For ITops, it’s one of the shifts in thinking required to make a change to cloud. As software-defined-networking is one of the core concepts required for a successful cloud implementation, it’s no surprise that the security of that networking is a close second.

Looking at it as simply as possible, good network security means allowing only required traffic and preventing everything else while logging what is useful for auditing. Azure offers several integrated services that can help achieve this.

With that in mind, there are three major scenarios to deal with when it comes to Azure networking:

  1. Azure Resource to Azure Resource
  2. Azure Resource to on-premises Resource
  3. Azure Resource to/from the Internet

I will reference each as we cover the different best practises available.

Access Control

Good network access control requires layering. In Azure, the most common networking concept is a vnet. A vnet does not, by default, get access to another vnet. However, within a vnet, every subnet, by default, has access to each other. So, the subnet layer is most likely where you will need to address access control. In Azure this can be done in two, free, simple ways. Custom Route Tables and/or Network Security Groups.

Custom Route Tables are exactly as they sound. They modify the system route table using routes you specify. If your route matches a system route, it will take preference, user defined routes always do. Similarly the lowest prefix match will always win. More on route tables here. CRTs are applied at subnet level and can quickly manipulate network traffic for your entire vnet. For example, preventing internet access by dropping traffic to 0.0.0.0/0.

Network Security Groups are a little bit more complex in application, but their concept is straight forward. They are an ACL for your network. They can be applied at subnet or network interface level. While NSGs allow you to create complex and granular rules quite simply, managing them at scale can be a challenge. More on them here.

Firewall

While the above allows for control of the network from a routing and access perspective, you may also need to control traffic by inspection and filtering. Within Azure, there are two main options for this; Azure Firewall or a 3rd party NVA.

Azure Firewall was released last year and is a stateful, firewall-as-a-service resource. It offers HA and scalability, however, it’s still a young product and therefore light on traditional network security options. More on it here.

Thankfully, Azure and network appliance vendors have been working better together recently. Most solutions you would expect are available in the Marketplace. The common gripe is that documentation can be light if not bad. However, if you need continuity with your local site, or a specific feature well then they are your best choice. My advice is to reach out to the Azure community if you are having issues, generally someone will have had the same issue and can help!

Perimeter

It’s best to start with some basic architecture decisions relative to your Azure perimeter.

  • Will Azure have a public perimeter?
  • Will it be inbound and outbound?
  • What requirements are there for a private perimeter?

Once the above are answered, you have a couple of well documented implementation options. They all operate on the same premise of layering. This allows for segregation of traffic most commonly with a firewall aspect. This combined with UDR can lead to a well designed and secure environment allowing only the network access required. Therefore layering everything that has been discussed already.

Monitoring

In Azure, there are two major tools to help you with this:

  • Azure Network Watcher
  • Azure Security Center

Network Watcher is one of my favourite tools in Azure. Within a couple of minutes, you can gain granular insights into your complex network issues with minimal effort. You can also integrate the output to other Azure services like Monitor and Functions to react to alerts and capture traffic automatically (*notes to self* must blog that).

Security Center, as it does for other infrastructure, offers insights into your network topology and can provide actionable recommendations at scale. Meaning you have a single pane to sanity check your network, regardless of how complex it may be.

If you take the time to understand and implement the above, you’re well on your way to having a secure networking environment. However, every single environment and workload should be treated as unique. The best network security is constantly auditing and reassessing itself. Be proactive to avoid having to be reactive!

As always, get in touch with any questions or to chat about your go-to network security steps.