We Need to Talk About VNETs – Azure Back to School 2022

Virtual Networks are arguably one of the most common resources in Azure. You will find them in the vast majority of environments facilitating some form of private or static network functionality. However, they haven’t always been around.

For those that remember the older Azure days, the ASM model didn’t have the same network concept. We have only had Virtual Networks since the introduction of ARM and they have changed drastically over the years.

Don’t get me wrong, at it’s core a Virtual Network is still just an address range. A private network of at least one subnet that sets your connectivity boundary. However, it has been a long time since I have seen a Virtual Network only operate in its basic capacity.

As a result of this ever growing list of services a Virtual Network offers, I think it is about time we talk about VNETs!

But aren’t VNETs straight forward? I deploy them all the time etc. Yes they can be, but they also offer an enormous range of network services. As an abstract piece of evidence, did you know that if you download the Virtual Network documentation page on Docs to a PDF, it is 848 pages!?

First off, what this article is not – an explanation of the basic elements of a VNET. For example subnets and address spacing. So presuming you have some familiarity, let’s discuss what I like to call secondary services.

So, what is a secondary service? For me, it’s a service that cannot exist, or serves no purpose without requiring a VNET. Think Bastion, Route Server, NSGs etc. they all serve specific purposes but commonly enhance the functionality of a VNET. Some of these, like Bastion, I feel would be better if included within a VNET resource, like Service Endpoints. However, that is for another post!

They are also drastically different in their complexity. For example, a Service Endpoint can be deployed without much effort and barely any planning (just double check those routes). However, Route Server requires significant elements of both.

While this list of secondary services is ever growing, I do not necessarily think this is a bad thing. I am always all for extra functionality. However, understanding that you cannot simply deploy a VNET and have the majority of network features that most people use is something that should be clearer. There are new services released like Network Manager that will help with management, but none offer a single view of everything.

To both convey the complexity, but help simplify things (weird I know) I thought it best to pick two services, create a test environment so you can try them, and discuss some of the components. So I’ve chosen two of the newer services:

  • Azure Route Server
  • NAT Gateway

Azure Route Server

Route Server is an excellent addition to routing services within Azure. Previously, there could be some routes originating from Virtual Network Gateways via BGP, the system routes and everything else would be via Route Table. While this works, it can be cumbersome and management at scale of Route Tables is almost non-existent. Route Server solves some of those problems by allowing BGP interaction between NVAs, Virtual Network Gateways and your VNETs system routing table. It’s also nice that it is a managed service, and HA out-of-the-box.

The objective of Route Server is to simplify and centralise routing management. This is helped by using a default peering process, meaning if your NVA supports BGP – it should work with Route Server. It also natively supports peering of VNETs with the same switch as “use remote gateway” meaning it slots very neatly into Hub-Spoke designs. Including the use of Virtual Network Gateways as peers (note VPN VNGs have to be configured in active-active mode).

As with many secondary services, Route Server requires a dedicated subnet in your VNET and each VNET can only have a single Route Server. The subnet does not allow the addition of an NSG or a UDR. This may flag as a concern as Route Server now requires a Public IP, however, this is only to guarantee access to management services and does not open the VNET (according to Microsoft). Also, no data traffic is sent between Route Server and your NVAs.

However, it’s important to note some of the configurations where Route Server alone is not the answer and in some cases begs the question that if I still have to use UDRs for that, why should I bother with Route Server? For example, ExpressRoute will advertise routes that will be preferred over Route Server routes, meaning you would need to overwrite this with a UDR. You cannot simply turn off the ER advertisement as this runs over the same peering functionality. A nice fix here would be to split that choice into two switches. One for VNGs, one for Route Server.

Another element that may be important is price. VNETs are free, UDRs are free, Route Server is far from that. On many large environments, this may be a negligible cost. However, you should weigh up the benefits vs the cost with introducing Route Server.

So to help, as promised, here is a repo that will build a test footprint for you. I’ve taken the Route Server tutorial using Quagga and integrated it with theother services from this article. You can follow the steps to complete the configuration and confirm you have a functioning peer. You should see output similar to the below from Cloud Shell:

Learned routes in Route Server

NAT Gateway

Implicit internet outbound is potentially one of the Azure network features that surprise most people. Deploy a VM into a VNET and you will be able to reach the internet with a random IP from the region deployed. Not exactly a dream scenario for many admins!

However VMs are not where I see this used most often. That doesn’t mean it’s not a good solution for VMs, it works exactly the same and works well. I just more commonly see this to facilitate static outbound IPs for PaaS resources. Like an App Service that requires a static IP due to a vendor allow list.

One interesting piece here, NAT Gateway when configured on a subnet, will take precedent over locally attached Public IPs and Standard Load Balancer Outbound NAT rules. However, UDRs will still overwrite this when advertising 0/0. Another item of note, no ICMP support, only TCP/UDP.

To try out NAT Gateway, I have again included it within a repo. This will also deploy a Ubuntu VM, which you can use Bastion to connect to and login. This VM has a Public IP locally attached but is deployed to a subnet with a NAT Gateway. So, use Bastion to connect, then simply copy the ipcheck script and paste it into the command line, it will give you an output similar to the below which you can then verify against your NAT Gateway resource. Proving NAT Gateway is taking precedence over the locally attached IP.

Locally attached IP
NAT Gateway seen as the outbound IP publicly

Roundup

In closing, I think that more and more secondary services does two things.

  1. Makes networking in Azure ever more complex
  2. Solidifies VNETs as the most important core resource

Now, everyone should agree with number one. However, two may cause some concern, but hear me out. Regardless of your resource deployment, your application architecture etc. 99/100 you will deploy a VNET and 9/10 you will need at least one secondary service. This means that getting it right, having it well designed for deployment, management etc is crucial. Not everyone loves networking, but within Azure at the moment – you’ve gotta learn it!

Speaking of learning, if networking is your thing, check out the most appropriate Azure exam – Microsoft Certified: Azure Network Engineer Associate.

What is Azure Network Security Group?

If you’re considering Azure for IaaS workloads, the first aspect of cloud you will have to understand, design and deploy is networking. As with any other cloud, software defined networking is the foundation of IaaS for Azure.

You cannot deploy a workload without first deploying a Virtual Network. However, once you have a network, you then need to consider its security and specifically how you control its perimeter and access. The perimeter is something that requires its own post but for platform-native, have a look at Azure Firewall and for other topics, start by checking out the Security Center docs for an overview.

When it comes to access control on your Virtual Network, Azure offers built-in solutions for both network layer control and route control. Network Security Groups (NSG) function as the network layer control service. So, what are they and how do you use them?

NSGs filter traffic to and from resources in an Azure Virtual Network. Combining rules that allow or deny traffic for both inbound and outbound traffic, allows granular control at the network layer.

They can be viewed as a basic, stateful, packet filtering firewall, but what does that mean? First, lets note what they don’t do; there is no traffic inspection or authentication access control.

So how do they help secure your network? By combining 5 variables into a scenario which you then allow or deny, you can quickly and easily manipulate the access that is possible to your resource. For example, consider the following two rules:

PriorityPortProtocolSourceDestinationAccess
1003389TCP10.10.10.10*Allow
200****Block

Our first rule, allows RDP traffic to the resources protected by the NSG, but only from the scoped source IP. The second rule, blocks all traffic. The rules are processed in order or priority.

Source and Destination can use IPs, IP ranges, ANY or Service Tags. Service Tags really help you define simple but powerful access rules quickly. For example, consider the following change to our above rules:

PriorityPortProtocolSourceDestinationAccess
1003389TCP10.10.10.10*Allow
101443TCPVirtualNetwork*Allow
200****Block

We still allow RDP and we still block all traffic, however, we now also allow HTTPS traffic from any source tagged as VirtualNetwork. This includes everything in your Virtual Network, any peered Virtual Networks and any traffic originating across a VPN or ExpressRoute. A single Service Tag replaces multiple source ranges and simplifies management.

If you’re still struggling with the filtering aspect, check out this handy tutorial from Docs.

A couple of other items to note about NSGs; they can be applied to a Network Interface or a Subnet and they have some default rules. Which layer you apply an NSG to is important. Remember traffic is processed inbound and outbound in reversed layers. So traffic from a VM out hits Network Interface then Subnet. So how you scope and combine NSGs is critical to ensuring your access control is as you want it. There is a great example of this on Docs.

The default rules that exist within an NSG allow Virtual Network traffic IN and Internet traffic OUT. You can check out the full list for exact details.

As always, if you have any questions or require a steer on a specific scenario, please get in touch!

Securing Azure PaaS

When considering Azure as a platform, part of the conversation should revolve around transformation. That is, how do we transform our approach from what is viewed as traditional to something more modern. Often this could lead to redesigning how your application/service is deployed, but with some workflows, a simple change from IaaS to PaaS is viewed as a quick win.

This change isn’t suitable in all scenarios, but depending on your specific requirement it could allow for greater resiliency, a reduction in costs, and a simpler administration requirement. One service that is often considered is SQL. Azure has its own PaaS SQL offering which removes the need for you to manage the underlying infrastructure. That alone makes the transformation a worthy consideration.

However, what isn’t often immediately apparent to some administrators is that PaaS offerings are, by their nature, public facing. For Azure SQL to be as resilient as possible and scale responsively, it sits behind a public FQDN. Therefore, how this FQDN is secured must be taken into consideration as a priority to ensure your data is protected appropriately.

Thankfully, Azure SQL comes with a built in firewall service. Initially, all Transact-SQL access to your Azure SQL server is blocked by the firewall. To allow traffic, you must specify one or more server-level firewall rules that enable access. The firewall rules specify which IP address ranges from the Internet are allowed. There is also the ability to choose whether Azure applications can connect to your Azure SQL server.

The ability to grant access to just one of the databases within your Azure SQL server is also possible. You simply create a database-level rule for the required database. However, while this limits the traffic to specific IP ranges, the traffic still flows via the internet.

To communicate with Azure SQL privately, you will first need an Azure V-Net. Once in place, you must enable the service endpoint for Azure SQL, see here. This will allow communication directly between listed subnets within your v-net and Azure SQL via the Azure backbone. This traffic is more secure and possibly faster than via the internet.

Once your endpoint is enabled, you can then create a v-net firewall rule on Azure SQL for the subnet which had a service endpoint enabled. All endpoints within the subnet will have access to all databases. You can repeat these steps to add additional subnets. If adding your v-net replaces the previous IP rules, remember to remove them from your Azure SQL firewall rules.

Also worth noting is the option for “Allow all Azure Services”, the presumption here is that this somehow would only access from Azure Services within your subscription, but this is not the case. It means every single Azure service in all subscriptions, even mine! My recommendation is to avoid this whenever possible, however, there are some cases where this required and this access should be noted as a risk.

More on Azure SQL Firewall – https://docs.microsoft.com/en-us/azure/sql-database/sql-database-firewall-configure

More on Azure SQL with V-Nets – https://docs.microsoft.com/en-us/azure/sql-database/sql-database-vnet-service-endpoint-rule-overview

 

Virtual Network Service Endpoints

When designing an application deployment to be hosted in Azure, a design consideration that is commonly enticing is to transform a layer of the application from traditional infrastructure to something more modern. Microsoft offer several Platform-as-a-service (PAAS) options that allow this to be achieved, for example, transforming SQL server installed on a VM to Azure SQL.

While this transformation might be straight forward from an SQL Database point of view and most likely when considering the cost of running your deployment, a concern that often arises is security. As Azure SQL is PAAS, it offers a public endpoint for SQL authentication and connectivity. This is by design and there are limited options to prefix this with a security layer. If your application runs somewhere outside Azure, this makes sense and might be an acceptable and noted weak spot. However, if the rest of your application layers are hosted within Azure, having to route out to a public endpoint is less secure than it could be and simply bad design.

Thankfully, Microsoft have been making updates to virtual network functionality that allow you to route directly from your virtual network resources to several PAAS offerings. To do this, you must make use of virtual network Service Endpoints.

Endpoints extend your virtual network private address space and the identity of your VNet to specific Azure services, over a direct connection. Endpoints allow you to secure your critical Azure service resources, such as Azure SQL, to only your virtual networks. Traffic from your VNet to the Azure service always remains on the Microsoft Azure backbone network and never takes a public route. They are currently available for three PAAS offerings:

  1. Azure SQL
  2. Azure Storage
  3. Azure Data Warehouse (Preview)

Introducing an Endpoint for Azure SQL (to stick with the initial example) allows improved security as it fully removes public Internet access and allows traffic only from the virtual network.

It also optimises routing as Endpoints always take service traffic directly from your virtual network to the service itself on the Microsoft Azure backbone network. Doing this means that if your environment uses forced-tunnelling this traffic will no longer be viewed as outbound, but intra-Azure and will flow direct.

There are some considerations to be aware of:

  • Location – the virtual network and the PAAS offering must be located in the same region.
  • Outbound Network Flow – if you control outbound network flow via NSG, you can make use of the “Azure Service” tags to allow this traffic via Endpoint.
  • Connections – If you enable a service endpoint, all current TCP connections from your virtual network will drop. This is to allow a change from Public IP access to Private.

Personally, I think Endpoints should be used as widely as possible. From a security and design perspective they allow greater ease of adoption when PAAS offerings are being considered and perhaps best of all, they are free!

User Defined Routing

How traffic is routed within and between Azure networks isn’t something that is immediately apparent to the average admin. Therefore trying to increase the complexity of this for security reasons or by adding a virtual appliance can offer several challenges. As more companies begin to adopt a cloud strategy these scenarios will become more common. I’m here to tell you that with a little patience, taking full control of traffic routing in your Azure environment only takes a couple of well planned steps.

This is something you should see repeated often when reading about Azure; everything starts with the Virtual Network. When you create a Virtual Network (vnet) you must define some simple parameters, we’ll focus on two of these for this blog post, Address Space and Subnets.

If you are planning on a cloud-only environment, you can choose whatever address space you like (well, almost… any range defined in RFC1918) but, I would always recommend you plan vnets with the possibility they will be linked to your LAN. It is much easier to plan for this now than trying to resolve it later, trust me.

Choosing how to divide your address space up into several subnets also requires careful consideration. To start with, not all of the addresses in a subnet are usable, Azure reserves the first, last and three other addresses for Azure services. So you need to ensure there are enough usable addresses in each subnet for its intended use.

Once the above steps are completed, Azure creates a system routing table. This defines how traffic will route both within and out of the vnet. It is a combination of your vnet address space and Azure defined address spaces. Below are the effective routes for a demo network interface sitting in a vnet with the address space 10.0.0.0/16

system_routing

As you can see, all of the source definitions are ‘Default’ indicating they are system generated. The entry noted with ‘next hop type’ of ‘Virtual Network’ is our vnet address space, this defines the traffic as local. The rest are added by default to cover system and internet routing, more on those here.

The above table applies to all subnets within the vnet, so if you had a requirement for an incredibly secure vnet that could not route to the internet, only local traffic, you could introduce a user defined route table to overwrite the system route.

Routing in Azure is done by longest prefix match (LPM) this means that if the system table defines a 0/16 and you define a 0/24 and traffic is bound for that address space, it will follow the 0/24 route. If the LPM is an exact match, the following order is adhered to:

  1. User defined route
  2. BGP route (when ExpressRoute is used)
  3. System route

So using our example above, we would create a UDR and define a route for traffic bound for 0.0.0.0/0 as having a ‘next hop’ of none. This will prevent any outbound traffic, with the exception of your vnet (remember, LPM) from being able to route entirely.

The below grab of my demo VM effective routes shows how the system route is now marked as invalid and my user route has precedence.

udr

There are many other scenarios where this can be useful, for example if you have a site-to-site VPN and would like all internet traffic to tunnel through on-premise appliances you can force tunnel traffic to the gateway.

Similarly if you have an appliance as part of your vnet within Azure, you can route all traffic to and from it as you see fit. Remember to always view the effective routes being applied to a network interface before pushing to a live production environment!

My final recommendation if using UDRs in your environment is to add a Network Watcher object. It greatly simplifies analysing and troubleshooting routing. There are multiple powershell and CLI commands you can also make use of for troubleshooting.

If you’ve any questions etc, get in touch!