How To – Troubleshoot Azure Virtual Network Peerings

Recently, Microsoft announced a new preview feature relative to Virtual Network Peerings. This preview allows you to resize the address space of a peered virtual network. Up until the point of this preview, any resize operation on a Virtual Network with a peering activated would fail. The only previous workaround was to delete the peering, complete the required address space modifications, then recreate the peering. While this can be completed quite quickly, it does cause downtime. This preview feature is definitely a welcome addition.

Address space change error example due to peering

So while there is a new feature available, but it’s in preview, let’s have a look at the current state of Virtual Network Peering. Starting off with what exactly they are. A peering allows you to seamlessly connect two Virtual Networks. There are a couple of the usual caveats like non overlapping adress spaces but configuration is as simple as can be. One item of note, peerings are not transient. That means that in the graphic below, even though the Hub Virtual Network is peered to both A and B, traffic cannot traverse from A directly to B or vice versa.

virtual network peering transit
Hub/spoke network example with non-transient peering

Using peerings allows for the creation of well architected, and secure, network footprints in Azure. If you’re new to network on Azure, and/or new to peerings, I really recommend reading this routing page on Docs, it gives clear examples and explanations of common scenarios. And for more specifically on peering, check out this page on Docs. Both will help set the context for some of the areas we are going to cover in this post, and if you’re not familiar with them, may be a challenge.

Configuration

For all explanations we will use the following architecture:

Hub/spoke virtual networks with peerings

The first step to create a peering is also the first place you should confirm when troubleshooting. Often, it can simply be a missed setting in this configuration that causes your issue. A peering contains three elements of configuration, and they are repeated on both sides, leaving you with six settings in total that can impact your peering.When creating a peering, we have the following choices. These choices impact how your peering will function and should be your first check, every time, when you have an issue with a peering.

Let’s break those down a little. First, “Traffic to remote virtual network”, this enables communication between the virtual networks and allows resources connected to either virtual network to communicate with each other with the same bandwidth and latency as if they were connected to the same virtual network. So, you may ask why is there a block option if I am enabling a peering!? Good question! It’s mostly there to facilitate temporary blocks, saving you from deleting and recreating the peering. How this works is slightly complicated, as it’s based on manipulating the service tag “Virtual Network” rather than explicitly blocking traffic.

If you want to block all traffic – delete the peering.

The VirtualNetwork service tag for network security groups encompasses the virtual network and peered virtual network when this setting is Enabled. Read more about network security group service tags here. When this setting is disabled, traffic doesn’t flow between the peered virtual networks by default; however, traffic may still flow if explicitly allowed through a network security group rule that includes the appropriate IP addresses or application security groups.

Next, “Traffic forwared from remote virtual network”, when enabled, this allows traffic that didn’t originate from the virtual network. This is best explained using a routing example; take our three virtual networks, VNET A is peered to VNET B and to VNET C, however, B and C are not peered. Don’t forget, peerings are not transient, so without additional configuration, B and C cannot communicate. WE can create a service chain by using VNET A as the next hop for B to reach C and vice versa, we do this by using route tables and a network virtual appliance, such as a router or firewall etc. However, for that traffic to be allowed use the peering, we need the forwarded traffic flag enabled on BOTH sides of the peering.

Finally, “Virtual network gateway or Route Server”, this can be enabled depending on whether the services exist or not in your peered virtual networks. Using our architecture, creating a peering between A and B. On the A side, we could choose “Use this virtual network’s gateway…” and on B “Use the remote virtual network’s gateway…”. This would allow B to use the connections terminated on the gateway in A. Without this enabled, for example on our peering between A and C, traffic cannot use the connections on the gateway.

Important note here, you cannot peer and use gateways if both virtual networks have gateways. You can peer and use the other two options, but not the gateway.

One often over looked feature is the ability to use the VNG in A to route traffic from B to C. It’s a supported configuration, requiring route tables etc. and isn’t very well documented but it works! Ping me if you need help with it.

Constraints

Now, if you have confirmed that all of these settings are correct, but you are still facing issues, there are some common constraints that may be the cause of your trouble.

  • Problem Resources – There are a set of resources that do not work across global peering – Listed here.
  • Classic Virtual Network – while you can peer an ARM and a classic, you cannot peer two classics. Upgrade those vnets to ARM!
  • Peering status – Every peering has a status when viewed within the virtual network. If it shows Initiated, traffic wont flow. Check both sides of the peering and update configuration until it shows Connected
  • DNS – If you’re using the default DNS with your virtual network, you cannot resolve names in a peered virtual network. You will need custom DNS, or Azure DNS Private.
  • P2S – If you have P2S configured and then add a peering, you must donwload the client config again to pick up the peered virtual network routes.

Scenarios

While 99% of the issues I have seen are resolved with some combination of the above, there are a few specific scenarios that require a slightly different focus, Microsoft have documented these all on one page.

Finally

Don’t forget, peerings are all about the system route table of the virtual networks, that’s how they function. Understanding and validating your route tables is key to successful troubleshooting. Having said that, if you are ever stuck, please get in touch, I will do my best to help!

How To – Confirm and Enable Azure Resource Providers

Depending on your level of permission on an Azure subscription, you may or may not have encountered Resource Providers directly. However, when you do, they can be a bit tricky. This post will hopefully clear up some of the most common issues and help you get working that bit quicker.

First up, what is an Azure Resource Provider? Simply put, it is a service within Azure Resource Manager that provides the resources you build. An example is Microsoft.Network which provides Virtual Networks among many others.

By default, if you have the correct role at a subscription level, Resource Providers are automatically registered. However, to register you need either Contributor, Owner, or a Custom Role with permission to do the /register/action operation. Resource Providers are always at subscription level and once registered, you can’t unregister when you still have resource types from that Resource Provider in your subscription.

So, in a scenario where you have an Owner role but only on a Resource Group within a subscription, you do not have permission to register Resource Providers.

Next, how do I check which Resource Providers are registered? There are a couple of ways to achieve this. You can simply check within the Portal, which gives some nice immediate visuals. Head to the Azure Portal, and navigate to your subscription. Scroll down to the Settings section and choose Resource Providers.

From here you can see a list of Registered, NotRegistered, and Registering providers. To register, simply click the relevant provider and choose Register at the top of the list. Similar for unregister once the previously mentioned caveat is met.

In some cases, you may want to avoid issues with NotRegistered providers and want to Register them all for a subscription. This can be achieved via the shell.

Log into Azure Powershell and choose your required subscription. Next run the following:

Get-AzResourceProvider -ListAvailable | Select-Object ProviderNamespace, RegistrationState

This will list all resource providers, and the registration status for your subscription. You can get additional details on each provider including resources it supports and locations supported by running the commands detailed in this doc.

To register all providers at once, run the following:

Get-AzResourceProvider -ListAvailable | Register-AzResourceProvider

The shell will then cycle through all providers and list their status as it works its way through them all. Similar to below:

And that’s it! You now know how to check the status of your Resource Providers and how to enable them as needed. As usual, I can’t take any responsibility for commands provided in examples, please use at your own risk. But, if there are any questions, please get in touch!

How To – Convert Azure Managed Disk Performance SKU

Due to some of the recent restrictions within Azure on deployments, some of you may have had to deploy VM sizes that were not idea, or didn’t exactly fit your needs. As part of that, you may have had to limit your choice in Managed Disk SKU too.

For example, I had to deploy a D8v3 for a customer and use Standard SSD until restrictions were lifted. I have no change the VM SKU to D8Sv3 and the disks to Premium SSD. For some people this is just about performance, but don’t forget, the financial SLA for VMs requires they run Premium SSD Managed Disks. Another solid reason to use them!

So, if you find yourself in a situation where you need to change the SKU, how can you do it? There are two ways I recommend, both require the VM to be deallocated so you may have to plan an outage, however the process is quick, so just a short one should be required.

Method 1: Via the Portal

This works best for individual, or a small number of VMs.

So, first up, if your VM needs to be resized to support Premium SSD, you should do that. Via the Portal, you simply stop the VM, choose your new size, and apply. One tip however, on the overview page of the VM blade, ensure you hit refresh so that the updated size is shown. For some reason, the Disk blade needs that to show that correctly or you cannot update your Disk SKU.

Next, choose the Disk blade, and select the disk you want to change. Click the Configuration option and simply choose the correct SKU from the drop down menu. This works for moving between any SKU by the way, as long as the VM size supports. Make sure to click Save.

Now, back to your VM blade, click Start and you’re done! VM and Disk updated.

Method 1: Via the Shell

This works best if you need to make multiple changes at once, save some time. As always with my shell examples, I am using Powershell, but the same actions can be carried out with CLI.

The below shows how to switch all Disks for a VM between tiers. In this example, we’re changing them to Premium, you can change that with the $storageType variable. Note, the below will Stop the VM for you and start it again. The VM must be running a Size that supports Premium in advance.

# Name of the resource group that contains the VM
$rgName = 'yourResourceGroup'

# Name of the your virtual machine
$vmName = 'yourVM'

# Choose between Standard_LRS and Premium_LRS based on your scenario
$storageType = 'Premium_LRS'

# Stop and deallocate the VM before changing the size
Stop-AzVM -ResourceGroupName $rgName -Name $vmName -Force

# Get all disks in the resource group of the VM
$vmDisks = Get-AzDisk -ResourceGroupName $rgName 

# For disks that belong to the selected VM, convert to Premium storage
foreach ($disk in $vmDisks)
{
	if ($disk.ManagedBy -eq $vm.Id)
	{
		$disk.Sku = [Microsoft.Azure.Management.Compute.Models.DiskSku]::new($storageType)
		$disk | Update-AzDisk
	}
}

Start-AzVM -ResourceGroupName $rgName -Name $vmName

And that’s it, you’re done!

The above example code was taken and edited slightly from the Docs article on this. The article includes multiple options for changing between tiers, single disks vs multiple etc. Check it out here – https://docs.microsoft.com/en-us/azure/virtual-machines/windows/convert-disk-storage

How to – Use Azure Firewall IP Groups

If you’re familiar with Azure Firewall you would know that the introduction of an IP Group resource is most welcome. IP Groups are still in preview at the moment, so as usual be cautious on production environments as there is no SLA. However, it’s always nice to try out a service to see if it can work for you, or make your life easier.

IP Groups themselves are a relatively simple resource. They can contain a single IP address, multiple IP addresses, or one or more IP address ranges. They can then be used for DNAT, Network, or Application rules in Azure Firewall.

They currently have some interesting limitations that are a little bit confusing at first. From Docs:

For 50 IP Groups or less, you can have a maximum of 5000 individual IP addresses each per firewall instance. For 51 to 100 IP Groups, you can have 500 individual IP address each per firewall instance.

What this means is that while your rules should already be scoped accurately, you may need to use a couple of extra IP groups if you’re working with large address ranges. A simple example is a /16 will simply not work in an IP Group, /20 is basically your limit per IP Group.

I actually tried this on my own sub and it appears to actually work for now. Expect that to change as preview progresses.

If you’ve worked with Azure Firewall, I’m sure you’ve already thought of several places these rules can really help. For me, it was within Network Rule Collections.

However, as the service is in preview, there are a few aspects to be ironed out. Unfortunately, one of those is the ability to add an IP Group as a destination within a network rule when using the Portal. See below

UPDATE: As expected, this is now resolved! However, read on to see how to do this at scale.

At this point, I am going to flag extreme caution if your Azure Firewall is in production and you are trying this. It is very easy to overwrite all of your collections, take your time and export them before making any changes!

I’m a Windows guy, so I’m going to explain how to do this with Powershell, but it also works for CLI. Similarly, I’m showing a Network Rule, same process works for Application Rules.

First up, you need to all of the details for your Azure Firewall as we will work with it’s config as a variable and finally update it.

#Get the AFW I want to edit
$afw = Get-AzFirewall -Name wda-afw-test -ResourceGroupName rg-wda-afw

#Save current Network Rule Collection to a variable for reference
$oldcol = $afw.NetworkRuleCollections

#Get the IP Group I want to use
$ipg = Get-AzIpGroup -Name wda-group1 -ResourceGroupName rg-wda-afw

#Create my new network rule
$newrule = New-AzFirewallNetworkRule -Name "rule2" -Protocol TCP -SourceAddress * -DestinationIpGroup $ipg.Id -DestinationPort 445

Now this is where it can get a bit tricky. Collections are stored as nested arrays. My AFW has two collections, I want to add my new rule to the second one which means I need to reference index 1. See the collections below, the one we’ll be editing is “collection2” which currently only has “rule2”

#view all collections
$afw.NetworkRuleCollections

#view the specific collection rules using place in array
$afw.NetworkRuleCollections[1].Rules | ft

#add my new rule to my collection
$afw.NetworkRuleCollections[1].AddRule($newrule)

#if you like, check it has updated as desired
$afw.NetworkRuleCollections[1].Rules | ft

#If as expected, update AFW
Set-AzFirewall -AzureFirewall $afw

The last command can take a minute or two to complete. Once it has, you can see the rule is now added to my collection2. The Portal will display it correctly, but you cannot edit correctly with the glitch.

And that’s it! You’ve successfully added an IP Group as a destination to your Azure Firewall. Again, please be careful, the above is only a guide and I cannot be responsible for your Azure Firewall 🙂

How to – Enable Alerts for Azure ExpressRoute

Azure ExpressRoute is Microsoft’s recommended method of accessing resources in Azure over a private connection. It is unique to the platform and can offer unparalleled resilience and performance. It also has the capability to allow you connect to Azure Public Resources, such as Storage Accounts, Azure SQL, over the same circuit and therefore, private connection.

With the above being possible, it’s clear that ExpressRoute, once implemented, becomes the key resource in your environment. As such, you would be wise to implement some level of monitoring and alerting for it.

Here, I’m going to show you how to very quickly put in place alerts should either of the required BGP sessions drop for your ExpressRoute circuit.

To add some context to that, each ExpressRoute circuit requires a pair of BGP sessions to meet SLA requirements. They will function with just one, but that’s not fully supported and definitely not recommended! So, presuming you have both active, we’ll base our alert off of that metric. We’re going to look specifically at Private Peering as it’s the most common implementation, but the logic is identical for Microsoft Peering.

If both are active and configured correctly, you should see the same routes being advertised within the circuit if you check your route table from the Peering blade

Next, let’s look at the Metric we will be using for the alert. In your ExpressRoute Circuit blade, choose Metrics. Then we’ll choose the ‘BGP Availability’ Metric and ensure Aggregation is ‘Avg’

Next, to future proof the alert, or if you have a Microsoft Peering, we’re going to apply some filtering. So choose ‘Add filter’ (highlighted above). Then we’re going to select the ‘Peering Type’ Property, and choose ‘private’ from the Values drop down and click the tiny blue tick to the right (You won’t see Microsoft if you don’t have one, but do this anyway to ensure your alert is tightly scoped).

Now, we’re looking at the average BGP availability, for our Private Peering only.

To make life easier, you can now just click ‘New Alert Rule’ on the top right to use this scoped metric for your alert.

This applies the right resource scope and brings the metric across too, you now just have to configure some of the required choices on specifics.

First up, let’s confirm the condition and signal logic. Click on the blue text under Condition. You will see Private is already selected for you. I’m going to change Operator to ‘less than or equal to’ and add a value of 95.

The above settings mean that the BGP Availability metric for my private peering, over the last 5 minutes, will be assessed every minute. Should it drop below or equal to 95%, an alert will fire. You can adjust this as needed to suit your own environment.

Now, we need to define what happens when they alert fires. Click Done on the signal logic pop out, this will bring you back to the alert configuration blade.

I would recommend you create and use an Action Group, even something simple like a distribution group to email. More on Action Groups – https://docs.microsoft.com/en-us/azure/azure-monitor/platform/action-groups

Next, define the final details for the alert. Use a descriptive name and the appropriate severity for your environment/system.

Now, the alert itself can take some time to become fully active, so I would recommend waiting about thirty minutes before attempting a test. But, please ensure you test the alert!

And that’s it. Once tested, you have now setup a fully scoped alert rule for your ExpressRoute Private Peering. You can repeat the above, and change the scope to Microsoft Peering to cover that too if you have it.

As always, if there are any questions, get in touch!