Azure Private Link – Where to Start?

Last week, Microsoft announced that the majority of Private Link is now GA. This means it’s now supported for production workloads, but is it useful for your production workloads? Hopefully this post will help you understand the service a bit more, and decide whether it’s worth investing your time in exploring.

First up, what exactly is Azure Private Link (APL)? Well, it allows you to access Azure resources using a private endpoint in a virtual network. There are two possible components within APL:

  • Private Endpoint
  • Private Link Service

A Private Endpoint is a private IP that exists within your vnet and presents a service via APL. Traffic targeting a service via APL travels via the Azure backbone, removing the need for public access if it’s not required. The service could be Azure PaaS, or a service presented by yourself, or a Partner via Private Link Service.

A Private Link Service leverages an Azure Standard Load Balancer (ALB) frontend IP configuration to present a service for use by a Private Endpoint. The workload behind the ALB could be your own, or a vendor’s.

Below is a somewhat complex diagram from Microsoft explaining this visually:

Private endpoint overview

So, let’s try to break that down into simpler concepts.

First, it allows you to access PaaS services without the need to implement Microsoft Peering for ExpressRoute, which gives you greater flexibility as well as a simpler footprint should you have VPN connectivity.

APL maps to single instances of resources, rather than whole services, so you have a more direct and therefore more secure connectivity footprint. For example, you can allow connections to one Azure SQL instance, rather than all Azure SQL instances. This is also relevant to vendors providing a service: you can connect to just their presented endpoint rather than a broader or more complex connection.

All of the above can be done across regions and across Azure AD tenants, so if you are providing a service, you can offer it via APL at a global scale.

A lot of Azure PaaS services are GA, but quite a few are still in Preview, so be wary of your production requirements. A full list is here – https://docs.microsoft.com/en-us/azure/private-link/private-link-overview#availability

So what might an APL deployment look like in your environment? Below, I’ve put together a quick example from my own tenant. I have a very simple website running from a Storage Account, and a test VM that I will RDP to in order to load the site.

I created a Private Endpoint and connected it to the Storage Account (SA), which automatically makes a couple of changes and auto-approves the connection within the SA. So now, when I look up the corresponding endpoint, it resolves to a private IP within my vnet (10.55.1.5), and I can browse to it successfully (a cert error is expected due to the lack of DNS here).
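
A quick way to sanity check that lookup is to confirm the returned address really is private and sits inside the expected subnet. Below is a minimal Python sketch using only the standard library. The 10.55.1.5 address is from my test above, but the 10.55.1.0/24 subnet range and the function name are assumptions for illustration.

```python
import ipaddress

def is_private_endpoint_ip(resolved_ip: str, subnet_cidr: str) -> bool:
    """Return True if resolved_ip is a private address inside subnet_cidr."""
    ip = ipaddress.ip_address(resolved_ip)
    subnet = ipaddress.ip_network(subnet_cidr)
    return ip.is_private and ip in subnet

# 10.55.1.5 is the address from the lookup; 10.55.1.0/24 is an assumed subnet range.
print(is_private_endpoint_ip("10.55.1.5", "10.55.1.0/24"))
```

In practice you would feed in whatever your VM resolves for the endpoint hostname (for example via socket.gethostbyname) rather than a hard-coded address.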

Within APL, I can check the status of my Private Endpoint, which confirms settings as expected.

And on the SA itself, I can see that a Private Endpoint has been activated.

So all of the above seems quite simple and very usable. But there are some current limitations to be aware of.

For Private Endpoint, NSGs are not supported. While a subnet containing the private endpoint can have an NSG associated with it, the rules will not be effective on traffic processed by the private endpoint. When you create a Private Endpoint via the Portal, it automatically changes the subnet to disable network policies. Other deployment methods require a manual change, documented here – https://docs.microsoft.com/en-us/azure/private-link/disable-private-endpoint-network-policy

For Private Link Service, it must sit behind a Standard ALB, and it supports only IPv4 and only TCP traffic. The above note about subnets and network policies is also valid, documented here – https://docs.microsoft.com/en-us/azure/private-link/disable-private-link-service-network-policy

Overall, I think APL is a great addition to the network offerings and aligns more closely with what Operations teams like to be able to control when working with PaaS. The options it introduces for vendors could also see some clever solutions brought to market, especially with the supported global capability.

As always, if there are any questions, get in touch!

How to – Troubleshoot Azure Firewall

Networking in Azure is one of my favourite topics. As my work has me focus primarily on Azure Virtual Datacenter builds, networking is key. When Microsoft introduced Azure Firewall (AFW), I was excited to see a platform-based option as a hopeful alternative to the traditional NVAs. Feature-wise, AFW lacked some key functionality in preview. Once it went GA, a lot of the asks from the community were addressed. There are still some outstanding issues, like cost, but all of that is a blog post for another day!

AFW is used in a lot of environments. It’s simple to deploy, resilient and relatively straightforward to configure. However, once active in the environment, I noticed that finding out what is going wrong can be tricky. Hopefully this post helps with that and can save you some valuable time!

I don’t know about you, but the first thing I always check when trying to solve a problem is the simplest solution. For AFW, that check is to make sure it’s not stopped. Yes, that’s right, you can “stop” AFW. It’s quick and easy to do via shell:

# Stop an existing firewall

$azfw = Get-AzFirewall -Name "FW Name" -ResourceGroupName "RG Name"
$azfw.Deallocate()
Set-AzFirewall -AzureFirewall $azfw

But how do you check if it has been stopped? Very simply, via the Azure Portal. On the overview blade for AFW it shows provisioning state. If this is anything but “Succeeded” you most likely have an issue.

So, how do you enable it again should you find your AFW deallocated? Again, quite simply via shell, however, it must be allocated to the original resource group and subscription. Also, while it deallocates almost instantly, it takes roughly the same amount of time to allocate AFW as it does to create one from scratch.

# Start a firewall

$azfw = Get-AzFirewall -Name "FW Name" -ResourceGroupName "RG Name"
$vnet = Get-AzVirtualNetwork -ResourceGroupName "RG Name" -Name "VNet Name"
$publicip = Get-AzPublicIpAddress -Name "Public IP Name" -ResourceGroupName "RG Name"
$azfw.Allocate($vnet,$publicip)
Set-AzFirewall -AzureFirewall $azfw

So, your AFW is active and receiving traffic via whatever method (NAT, Custom Route Tables etc.) and you have created rules to allow traffic as required. Don’t forget all traffic is blocked by default until you create rules.

By default, the only detail you can get from AFW is metrics. These cover a small range of data, such as rule hit counts, with no granular detail.


To get detailed logs, like other Azure services, you need to enable them. I recommend doing this as part of your creation process. In terms of what to do, you have to add a diagnostic setting. There are two logs available, and I recommend choosing both.

  • AzureFirewallApplicationRule
  • AzureFirewallNetworkRule

In terms of where to send the logs, I like the integration offered by Azure Monitor Logs and there is a filtered shortcut right within the AFW blade too.

Once enabled, you should start seeing logs flowing into Azure Monitor Logs within five to ten minutes. One aspect that can be viewed as a slight negative is that logs are sent in JSON. As a result, most of the interesting data you want is packed into a single message string within a nested object:

{
  "category": "AzureFirewallNetworkRule",
  "time": "2018-06-14T23:44:11.0590400Z",
  "resourceId": "/SUBSCRIPTIONS/{subscriptionId}/RESOURCEGROUPS/{resourceGroupName}/PROVIDERS/MICROSOFT.NETWORK/AZUREFIREWALLS/{resourceName}",
  "operationName": "AzureFirewallNetworkRuleLog",
  "properties": {
      "msg": "TCP request from 111.35.136.173:12518 to 13.78.143.217:2323. Action: Deny"
  }
}
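
To illustrate what pulling the fields out of that msg string involves, here is a minimal Python sketch using only the standard library. It is an illustration rather than anything Microsoft provides: the regular expression, the function name and the field names are my own assumptions, and it only handles the simple "request from ... to ... Action" format shown above (not ICMP or DNAT entries).

```python
import json
import re

# Matches "<proto> request from <ip>:<port> to <ip>:<port>. Action: <action>"
MSG_PATTERN = re.compile(
    r"^(?P<protocol>\w+) request from (?P<source_ip>[\d.]+):(?P<source_port>\d+)"
    r" to (?P<target_ip>[\d.]+):(?P<target_port>\d+)\. Action: (?P<action>\w+)$"
)

def parse_network_rule_log(raw: str) -> dict:
    """Split the msg string of one NetworkRule log entry into named fields."""
    record = json.loads(raw)
    match = MSG_PATTERN.match(record["properties"]["msg"])
    if match is None:
        raise ValueError("unrecognised msg format")
    fields = match.groupdict()  # all values are strings
    fields["time"] = record["time"]
    return fields

# Trimmed copy of the sample log entry above
sample = """
{
  "category": "AzureFirewallNetworkRule",
  "time": "2018-06-14T23:44:11.0590400Z",
  "operationName": "AzureFirewallNetworkRuleLog",
  "properties": {
      "msg": "TCP request from 111.35.136.173:12518 to 13.78.143.217:2323. Action: Deny"
  }
}
"""

parsed = parse_network_rule_log(sample)
print(parsed["protocol"], parsed["target_ip"], parsed["target_port"], parsed["action"])
```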

So, when running your queries, you need to parse that data. For those who have strong experience in Kusto, this will be no problem. For those who don’t, Microsoft thankfully provide guidance on how to parse both logs, including explanatory comments.

For ApplicationRule log

AzureDiagnostics
| where Category == "AzureFirewallApplicationRule"
//using :int makes it easier to parse, but later we'll convert to string as we're not interested in doing mathematical functions on these fields
//this first parse statement is valid for all entries as they all start with this format
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " " TempDetails
//case 1: for records that end with: "was denied. Reason: SNI TLS extension was missing."
| parse TempDetails with "was " Action1 ". Reason: " Rule1
//case 2: for records that end with
//"to ocsp.digicert.com:80. Action: Allow. Rule Collection: RC1. Rule: Rule1"
//"to v10.vortex-win.data.microsoft.com:443. Action: Deny. No rule matched. Proceeding with default action"
| parse TempDetails with "to " FQDN ":" TargetPortInt:int ". Action: " Action2 "." *
//case 2a: for records that end with:
//"to ocsp.digicert.com:80. Action: Allow. Rule Collection: RC1. Rule: Rule1"
| parse TempDetails with * ". Rule Collection: " RuleCollection2a ". Rule:" Rule2a
//case 2b: for records that end with:
//"to v10.vortex-win.data.microsoft.com:443. Action: Deny. No rule matched. Proceeding with default action"
| parse TempDetails with * "Deny." RuleCollection2b ". Proceeding with" Rule2b
| extend
SourcePort = tostring(SourcePortInt)
| extend
TargetPort = tostring(TargetPortInt)
| extend
//make sure we only have Allowed / Deny in the Action Field
//case 1 records read "was denied. Reason: ...", so the parsed value is "denied"
Action1 = case(Action1 == "denied","Deny","Unknown Action")
| extend
    Action = case(Action2 == "",Action1,Action2),
    Rule = case(Rule2a == "",case(Rule1 == "",case(Rule2b == "","N/A", Rule2b),Rule1),Rule2a), 
    RuleCollection = case(RuleCollection2b == "",case(RuleCollection2a == "","No rule matched",RuleCollection2a),RuleCollection2b),
    FQDN = case(FQDN == "", "N/A", FQDN),
    TargetPort = case(TargetPort == "", "N/A", TargetPort)
| project TimeGenerated, msg_s, Protocol, SourceIP, SourcePort, FQDN, TargetPort, Action ,RuleCollection, Rule

For NetworkRule log

AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
//using :int makes it easier to parse, but later we'll convert to string as we're not interested in doing mathematical functions on these fields
//case 1: for records that look like this:
//TCP request from 10.0.2.4:51990 to 13.69.65.17:443. Action: Deny//Allow
//UDP request from 10.0.3.4:123 to 51.141.32.51:123. Action: Deny/Allow
//TCP request from 193.238.46.72:50522 to 40.119.154.83:3389 was DNAT'ed to 10.0.2.4:3389
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
//case 1a: for regular network rules
//TCP request from 10.0.2.4:51990 to 13.69.65.17:443. Action: Deny//Allow
//UDP request from 10.0.3.4:123 to 51.141.32.51:123. Action: Deny/Allow
| parse msg_s with * ". Action: " Action1a
//case 1b: for NAT rules
//TCP request from 193.238.46.72:50522 to 40.119.154.83:3389 was DNAT'ed to 10.0.2.4:3389
| parse msg_s with * " was " Action1b " to " NatDestination
//case 2: for ICMP records
//ICMP request from 10.0.2.4 to 10.0.3.4. Action: Allow
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend
SourcePort = tostring(SourcePortInt),
TargetPort = tostring(TargetPortInt)
| extend 
    Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), Action1a),
    Protocol = case(Protocol == "", Protocol2, Protocol),
    SourceIP = case(SourceIP == "", SourceIP2, SourceIP),
    TargetIP = case(TargetIP == "", TargetIP2, TargetIP),
    //ICMP records don't have port information
    SourcePort = case(SourcePort == "", "N/A", SourcePort),
    TargetPort = case(TargetPort == "", "N/A", TargetPort),
    //Regular network rules don't have a DNAT destination
    NatDestination = case(NatDestination == "", "N/A", NatDestination)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, NatDestination

Using either query gives you clear, readable output that you can filter. One tip, however, is to add a sort command to the end of the queries; normally I sort by TimeGenerated to show me the latest data. So, to condense the NetworkRule query above and add that, it would look like:

AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
| parse msg_s with * ". Action: " Action1a
| parse msg_s with * " was " Action1b " to " NatDestination
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend SourcePort = tostring(SourcePortInt),TargetPort = tostring(TargetPortInt)
| extend Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), Action1a),Protocol = case(Protocol == "", Protocol2, Protocol),SourceIP = case(SourceIP == "", SourceIP2, SourceIP),TargetIP = case(TargetIP == "", TargetIP2, TargetIP),SourcePort = case(SourcePort == "", "N/A", SourcePort),TargetPort = case(TargetPort == "", "N/A", TargetPort),NatDestination = case(NatDestination == "", "N/A", NatDestination)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, NatDestination
| sort by TimeGenerated desc

Using parsed data, you can immediately see all the traffic hitting AFW and, for example, filter on options such as Action to see only denied traffic.
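
If you take the results outside of Azure Monitor, the same idea is easy to apply offline. Here is a minimal Python sketch, assuming you have exported the query results into a list of dictionaries keyed by the projected column names. The rows and the function name are made up for illustration.

```python
from collections import Counter

# Hypothetical rows, shaped like the output of the NetworkRule query above
rows = [
    {"SourceIP": "10.0.2.4", "TargetIP": "13.69.65.17", "TargetPort": "443", "Action": "Allow"},
    {"SourceIP": "10.0.2.4", "TargetIP": "51.141.32.51", "TargetPort": "123", "Action": "Deny"},
    {"SourceIP": "10.0.3.4", "TargetIP": "51.141.32.51", "TargetPort": "123", "Action": "Deny"},
    {"SourceIP": "10.0.3.4", "TargetIP": "13.78.143.217", "TargetPort": "2323", "Action": "Deny"},
]

def denied_by_target_port(records):
    """Count denied requests per target port."""
    denied = (r for r in records if r["Action"] == "Deny")
    return Counter(r["TargetPort"] for r in denied)

# Show the most frequently denied target ports first
print(denied_by_target_port(rows).most_common())
```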

Microsoft also provide a pre-cooked visualisation should you prefer it. You can download it from here – https://raw.githubusercontent.com/Azure/azure-docs-json-samples/master/azure-firewall/AzureFirewall.omsview – then import it into Azure Monitor. The detail is great for quick glance work, and I really like the ApplicationRule breakout:

Application rule log data

That about sums it up. Hopefully you are now informed and equipped to troubleshoot traffic issues in your Azure Firewall instance. As always, if there are any questions, please get in touch!

If you need more info on how to enable logs – https://docs.microsoft.com/en-us/azure/firewall/tutorial-diagnostics

Log and metrics concepts – https://docs.microsoft.com/en-us/azure/firewall/logs-and-metrics

Azure Networking Security – Where to Start?

If you’ve read any of my blog posts regarding networking in Azure, you might have guessed it’s one of my favourite topics. For IT Ops, it’s one of the shifts in thinking required to make the move to cloud. As software-defined networking is one of the core concepts required for a successful cloud implementation, it’s no surprise that the security of that networking is a close second.

Looking at it as simply as possible, good network security means allowing only required traffic and preventing everything else while logging what is useful for auditing. Azure offers several integrated services that can help achieve this.

With that in mind, there are three major scenarios to deal with when it comes to Azure networking:

  1. Azure Resource to Azure Resource
  2. Azure Resource to on-premises Resource
  3. Azure Resource to/from the Internet

I will reference each as we cover the different best practices available.

Access Control

Good network access control requires layering. In Azure, the most common networking concept is a vnet. A vnet does not, by default, get access to another vnet. However, within a vnet, all subnets can, by default, reach each other. So, the subnet layer is most likely where you will need to address access control. In Azure this can be done in two free, simple ways: Custom Route Tables and/or Network Security Groups.

Custom Route Tables are exactly as they sound. They modify the system route table using routes you specify. If your route matches a system route, it will take preference; user-defined routes always do. Similarly, the longest prefix match (the most specific route) will always win. More on route tables here. CRTs are applied at subnet level and can quickly manipulate network traffic for your entire vnet, for example, preventing internet access by dropping traffic to 0.0.0.0/0.
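
To make the selection logic concrete, here is a minimal Python sketch of how a route gets picked: the most specific (longest) matching prefix wins, and when two routes share the same prefix, the user-defined one is preferred over a system one. The route entries, dictionary keys and function name are all hypothetical, and this is a simplified model rather than Azure's actual implementation.

```python
import ipaddress

# Lower number = preferred when two routes share the same prefix
SOURCE_PREFERENCE = {"user": 0, "bgp": 1, "system": 2}

def select_route(routes, destination_ip):
    """Pick the route for destination_ip: longest prefix first, then route source."""
    dest = ipaddress.ip_address(destination_ip)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ipaddress.ip_network(r["prefix"]).prefixlen,  # longer prefix sorts first
            SOURCE_PREFERENCE[r["source"]],                # then user > bgp > system
        ),
    )

# Hypothetical effective routes for one subnet
routes = [
    {"prefix": "0.0.0.0/0", "next_hop": "Internet", "source": "system"},
    {"prefix": "10.0.0.0/16", "next_hop": "VnetLocal", "source": "system"},
    {"prefix": "0.0.0.0/0", "next_hop": "None", "source": "user"},  # drops internet-bound traffic
]

print(select_route(routes, "8.8.8.8")["next_hop"])
print(select_route(routes, "10.0.1.4")["next_hop"])
```

Note that traffic staying inside the vnet still follows the more specific 10.0.0.0/16 route, so the custom 0.0.0.0/0 route only affects traffic leaving it.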

Network Security Groups are a little bit more complex in application, but their concept is straightforward. They are an ACL for your network. They can be applied at subnet or network interface level. While NSGs allow you to create complex and granular rules quite simply, managing them at scale can be a challenge. More on them here.
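
As a rough illustration of the ACL concept, here is a minimal Python sketch in which rules are checked in ascending priority order and the first match decides the outcome. The rule shape and function name are hypothetical and heavily simplified (real NSG rules also cover direction, protocol, destination and built-in default rules), so treat it as a sketch of the idea only.

```python
import ipaddress

def evaluate_nsg(rules, source_ip, dest_port):
    """Return the access decision of the first matching rule, by ascending priority."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r["priority"]):
        in_source = src in ipaddress.ip_network(rule["source"])
        port_match = rule["port"] == "*" or rule["port"] == dest_port
        if in_source and port_match:
            return rule["access"]
    return "Deny"  # assumption for this sketch: nothing matched means deny

# Hypothetical inbound rules
rules = [
    {"priority": 200, "source": "10.0.0.0/16", "port": "*", "access": "Deny"},
    {"priority": 100, "source": "10.0.1.0/24", "port": 443, "access": "Allow"},
]

print(evaluate_nsg(rules, "10.0.1.4", 443))
print(evaluate_nsg(rules, "10.0.2.4", 443))
```

The order of the list does not matter here, only the priority numbers, which is part of why large rule sets need careful numbering.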

Firewall

While the above allows for control of the network from a routing and access perspective, you may also need to control traffic by inspection and filtering. Within Azure, there are two main options for this: Azure Firewall or a 3rd party NVA.

Azure Firewall was released last year and is a stateful, firewall-as-a-service resource. It offers HA and scalability, however, it’s still a young product and therefore light on traditional network security options. More on it here.

Thankfully, Azure and network appliance vendors have been working better together recently. Most solutions you would expect are available in the Marketplace. The common gripe is that documentation can be light, if not bad. However, if you need continuity with your local site, or a specific feature, then they are your best choice. My advice is to reach out to the Azure community if you are having issues; generally someone will have had the same issue and can help!

Perimeter

It’s best to start with some basic architecture decisions relative to your Azure perimeter.

  • Will Azure have a public perimeter?
  • Will it be inbound and outbound?
  • What requirements are there for a private perimeter?

Once the above are answered, you have a couple of well documented implementation options. They all operate on the same premise of layering. This allows for segregation of traffic, most commonly with a firewall aspect. Combined with UDR, this can lead to a well designed and secure environment allowing only the network access required, layering everything that has been discussed already.

Monitoring

In Azure, there are two major tools to help you with this:

  • Azure Network Watcher
  • Azure Security Center

Network Watcher is one of my favourite tools in Azure. Within a couple of minutes, you can gain granular insights into your complex network issues with minimal effort. You can also integrate the output to other Azure services like Monitor and Functions to react to alerts and capture traffic automatically (*notes to self* must blog that).

Security Center, as it does for other infrastructure, offers insights into your network topology and can provide actionable recommendations at scale. This means you have a single pane to sanity check your network, regardless of how complex it may be.

If you take the time to understand and implement the above, you’re well on your way to having a secure networking environment. However, every single environment and workload should be treated as unique. The best network security is constantly auditing and reassessing itself. Be proactive to avoid having to be reactive!

As always, get in touch with any questions or to chat about your go-to network security steps.

Azure Firewall – Where to Start?

About a year ago, Microsoft introduced the first release of Azure Firewall. Since then, and since its general release, the service has grown and the features have matured.

To begin, let’s understand what Azure Firewall is. At its core, it’s a managed network security service that protects your Azure Virtual Network resources. It functions as a stateful firewall-as-a-service and offers built-in high availability and scalability. This means you can centrally control, enforce and log all of your network traffic. It fully integrates with Azure Monitor too, which means all of the usual logging and analytical goodness.

If the above sounds like something you’d like to use, or at least try, in your Azure environment, read on! To start, let’s break out what can be configured within Azure Firewall and which features could be useful for you.

When deploying an Azure Firewall, you need a couple of things in advance. It needs a dedicated subnet, specifically named “AzureFirewallSubnet” and the minimum size it can be is a /26. It also needs at least one Static Public IP. The Public IP must be on the Standard tier. My recommendation here is to look at creating a Public IP Prefix in advance of creating your Azure Firewall. That way, if you need to delete it and redeploy, you can continue to use the same Public IP again and again. If you want to use multiple Public IPs, it supports up to 100.
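
If you script your deployments, the two subnet requirements are easy to check up front. Here is a minimal Python sketch using the standard library; the function name and the example address ranges are assumptions for illustration.

```python
import ipaddress

REQUIRED_SUBNET_NAME = "AzureFirewallSubnet"
MAX_PREFIX_LENGTH = 26  # a /26 is the smallest allowed; a /27 or longer is too small

def firewall_subnet_problems(name, cidr):
    """Return a list of problems with the proposed firewall subnet (empty if fine)."""
    problems = []
    if name != REQUIRED_SUBNET_NAME:
        problems.append(f"subnet must be named {REQUIRED_SUBNET_NAME!r}, got {name!r}")
    network = ipaddress.ip_network(cidr)
    if network.prefixlen > MAX_PREFIX_LENGTH:
        problems.append(f"/{network.prefixlen} is smaller than the minimum /26")
    return problems

# Example address ranges are made up
print(firewall_subnet_problems("AzureFirewallSubnet", "10.0.0.0/26"))
print(firewall_subnet_problems("FirewallSubnet", "10.0.0.0/27"))
```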

So, let’s look at what Azure Firewall (AFW) can do for you on your Virtual Network and then consider some deployment options.

Access

Using your single, or multiple, Public IP addresses, AFW allows both source and destination NATing. This means it can support multiple inbound ports, such as HTTPS over 443, to different resources. Outbound SNAT helps greatly with services that require white-listing. If you are using multiple Public IPs, AFW randomly picks one for SNAT, so ensure you include all of them in your white-listing requirements.

Protection

AFW uses a Microsoft service called Threat Intelligence filtering. This allows Azure Firewall to alert and deny traffic to and from known malicious IPs and domains. You can turn this setting off, set it to just alert or to both alert and deny. All of the actions are logged.

Filtering

Finally, for filtering, AFW can use both Network Traffic and Application FQDN rules. This means that you can limit traffic to only those explicitly listed within the rule collections. For example, an application rule that only allows traffic to the FQDN – www.wedoazure.ie

A visual representation of the above features is below:

Firewall overview

Now that you understand AFW, let’s look at how to configure it to your needs. Normally I would go into the deployment aspect, but it is excellently documented already and relatively easy to follow. However, there are some aspects of the configuration that warrant further detail.

Once deployed, you must create a Custom Route Table to force traffic to your AFW. In the tutorial, it shows you how to create a route for Internet traffic (0.0.0.0/0), however you may want the AFW to be your central control point for your vnet traffic too. Don’t forget, traffic between subnets is not filtered by default. Routing all traffic for each subnet to AFW could allow you to centrally manage which subnet can route where. For example, say we have three subnets: Web, App and DB. A single route table applied to each subnet can send all traffic to AFW. On the AFW you can then allow Web to reach the Internet and the App subnet. The App subnet can access Web and DB but not the Internet, and finally the DB subnet can only access the App subnet. This would all be achieved with a single Network Rule collection.
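
The Web, App and DB example boils down to a small allow-list with everything else denied. Here is a minimal Python sketch of that matrix. The names are just labels for the subnets described above, not real AFW rule syntax, and the function name is hypothetical.

```python
# Allowed (source, destination) pairs, mirroring the Web/App/DB example
ALLOWED_FLOWS = {
    ("Web", "Internet"),
    ("Web", "App"),
    ("App", "Web"),
    ("App", "DB"),
    ("DB", "App"),
}

def is_allowed(source, destination):
    """Anything not explicitly listed is denied, matching AFW's default behaviour."""
    return (source, destination) in ALLOWED_FLOWS

for flow in [("Web", "Internet"), ("App", "Internet"), ("DB", "Web")]:
    print(flow, "Allow" if is_allowed(*flow) else "Deny")
```

Writing the matrix out like this before building the rule collection is a handy way to spot flows you forgot, or allowed by accident.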

Similarly you can allow/block specific FQDNs with an Application Rule collection. In the tutorial, a single FQDN is allowed. This means that all others are blocked as that is the default behaviour. This might not be practical for your environment and the good news is, you can implement the reverse. With the right priority order, you can allow all traffic except for blocked FQDNs.
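
To show why the priority order matters, here is a minimal Python sketch of the "allow everything except a block list" approach. It assumes that collections are evaluated from the lowest priority number upwards and that the first match wins. The collection names, FQDNs and function name are made up, and the wildcard matching is a simplification of how AFW handles FQDNs.

```python
from fnmatch import fnmatchcase

# Hypothetical rule collections; the lower priority number is evaluated first
rule_collections = [
    {"name": "allow-everything", "priority": 200, "action": "Allow", "fqdns": ["*"]},
    {"name": "block-list", "priority": 100, "action": "Deny",
     "fqdns": ["*.example.com", "badsite.example.org"]},
]

def evaluate_fqdn(collections, fqdn):
    """Return the action of the first collection (by priority) with a matching FQDN."""
    for collection in sorted(collections, key=lambda c: c["priority"]):
        if any(fnmatchcase(fqdn, pattern) for pattern in collection["fqdns"]):
            return collection["action"]
    return "Deny"  # nothing matched, so the default deny applies

print(evaluate_fqdn(rule_collections, "www.example.com"))
print(evaluate_fqdn(rule_collections, "www.wedoazure.ie"))
```

If the two priorities were swapped, the allow-everything collection would match first and the block list would never be reached.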

A feature you may also want to consider trying is destination NATing. This thankfully has another well documented tutorial on Docs.

Finally, and in some cases most importantly, let’s look at price. You are charged in two ways for AFW. There is a price per-hour-per-instance. That means if you deploy and don’t use it for anything, you will pay approx. €770 per-month (PAYG Calculator). On top of that, you will pay for both data inbound and outbound that is filtered by AFW. You’re charged the same price in either direction, and that’s approx. €14 per TB processed. Depending on your environment and/or requirements, this price could be OK or too steep. My main advice is to ensure you understand it before deploying!
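
As a back-of-the-envelope check, the two charges combine as a fixed monthly amount plus a per-TB amount. Here is a minimal Python sketch using the approximate figures above; check the pricing calculator for current numbers, as these are only the rough values quoted in this post.

```python
FIXED_MONTHLY_EUR = 770  # approximate deployment charge per month, from the figures above
PER_TB_EUR = 14          # approximate charge per TB processed, inbound or outbound

def estimate_monthly_cost(tb_processed: float) -> float:
    """Rough monthly AFW cost in euro for a given volume of processed data."""
    if tb_processed < 0:
        raise ValueError("tb_processed cannot be negative")
    return FIXED_MONTHLY_EUR + PER_TB_EUR * tb_processed

for tb in (0, 10, 50):
    print(f"{tb} TB -> approx. EUR {estimate_monthly_cost(tb):.0f}")
```

The takeaway is that for most modest environments the fixed charge dominates, so the data volume only becomes the deciding factor at scale.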

As always, if there are any questions please get in touch!