Optimising Azure Disk Performance

When deploying a VM, there are several aspects of configuration to consider to ensure you are achieving the best possible performance for your application.  The most common are vCPU and RAM, however, I recommend equal consideration also be given to your disks.

In Azure, the are multiple options available when provisioning a disk for a VM. The recommendation from Microsoft is to use Managed Disks and depending on performance required choose either Standard or Premium tier. As this post is about performance, I am going to discuss Premium tier Managed Disks (PMD).

For most applications/services, if you choose the closest sizing for the VM and provision the disk as PMD, this will more than meet your requirements. However, should you have heavy read/write requirements, you may need to maximise your disk performance to ensure you are also maximising your other resources.

When talking about disk performance there will be references to disk IOPS and throughput, it is important to have an understanding of both concepts.

IOPS is the number of requests that your application is sending to the storage disks in one second. An input/output operation could be read or write, sequential or random.

Throughput or Bandwidth is the amount of data that your application is sending to the storage disks in a specified interval. If your application is performing input/output operations with large IO unit sizes, it requires high Throughput.

How actual disk performance is calculated for your VM is slightly complex. There are several variables that effect what performance is available, achievable and when you should expect throttling to occur. The key aspects however are:

  1. Virtual Machine Size
  2. Managed Disk Size

As you move up VM sizes, you don’t just increase the amount of vCPU and RAM available, you also increase the allocated IOPS and Throughput. In the following examples I’ll be comparing a DS12v2 and a DS13v2, below are their listed specifications:

 

Size vCPU Memory: GiB Temp storage (SSD) GiB Max data disks Max cached and temp storage throughput: IOPS / MBps Max uncached disk throughput: IOPS / MBps
Standard_DS12_v2 4 28 56 16 16,000 / 128 (144) 12,800 / 192
Standard_DS13_v2 8 56 112 32 32,000 / 256 (288) 25,600 / 384

As you can see there are significant differences across the board in terms of specifications, however, we’ll focus on the final two columns which are disk related. There are two channels of performance for disks attached to VMs. This relates to whether you choose to enabling caching on your disks. For the OS disk, Microsoft recommends read/write cache and enables it by default however for data disks, the choice is up to you. IOPS are higher when caching but throughput is lower, so it will be application/service dependent.

Similarly, as you move up PMD sizes, you increase the IOPS and Throughput available. See below for table highlighting this:

Premium Disks Type P4 P6 P10 P20 P30 P40 P50
Disk size 32 GB 64 GB 128 GB 512 GB 1024 GB (1 TB) 2048 GB (2 TB) 4095 GB (4 TB)
IOPS per disk 120 240 500 2300 5000 7500 7500
Throughput per disk 25 MB per second 50 MB per second 100 MB per second 150 MB per second 200 MB per second 250 MB per second 250 MB per second

You can see that Throughput will max out at 250mb however, on a DS13v2 if uncached you can get a maximum of 384mb. A similar restriction applies with IOPS. To achieve performance above the listed for PMD and in line with your VM size, you need to combine the disks.

For example, two P20s will give you 1TB of storage and roughly 300mb Throughput in comparison to a P30 which gives same storage but only 200mb Throughput. Obviously there is a cost consideration, but performance may justify this. To achieve this disk combination, it is best to use Storage Spaces in Windows to perform disk striping. More on that here – (Server 2016 – https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/deploy-storage-spaces-direct).

As a demo I completed the above and below are captures of testing throughput for the same scenario on a DS13v2. Firstly is our striped virtual disk. Same storage, maximum throughput.

DiskStripe

On the same VM, our single disk, same storage available, lower throughput.

DiskSimple

A simple change to how your storage is attached produces a 50% increase in performance.

While not always applicable, changes such as the above could prove vital when considering how to size a VM for a database and at the same time ensure you are maximisng that cost/performance ratio.

More links on sizing etc.

Premium Storage Performance
VM Sizes

 

Resource Locks and Policies

When considering production workloads for your Azure environment there are some simple features that ensure the safety of your workloads that are being overlooked. The features I’m referring to are Resource Locks and Resource Manager Policies (RMPs).

Both features allow you greater control over your environment with minimal administrative effort. In my opinion, regardless of whether you are running production workloads or not, you should at the very least be using Locks and RMPs as a preventative method of control over your deployments.

Locks are a very simple and quick tool that can prevent changes to your environment in an instant. They can be applied at different tiers of your environment. Depending on your governance model, you might want to apply at the subscription, resource group or resource level, all are possible. Locks have only two basic operations:

  • CanNotDelete means authorized users can still read and modify a resource, but they can’t delete the resource.
  • ReadOnly means authorised users can read a resource, but they can’t delete or update the resource. Applying this lock is similar to restricting all authorised users to the permissions granted by the Reader role.

Locks obey inheritance, so if you apply at resource group level, all resources contained within will receive the applied lock, the same is true for subscription level assignments.

Of the built-in roles, only Owner and User Access Administrator are granted the ability to apply and remove locks. In general, my recommendation is that all production resources are assigned a CanNotDelete lock. Environments such as UAT where performance etc is being monitored are more suited to a ReadOnly lock to ensure consistent environment results.

RMPs can be used individually or in conjunction with Locks to ensure even more granular control of your environment. RMPs define what you can and cannot do with your environment. For example, all resources created must be located in the European datacentres, or, all resources created must have a defined set of tags applied.

In terms of scope, RMPs can be applied exactly the same as Locks and also obey inheritance. A common scenario here is to apply a policy at subscription level to specify your allowed datacentres, then if you have a traditional IT Resource Group design, specify policies at RG level allowing only specific VM sizes for dev/test to manage cost.

There are many combinations that can be put to use to allow you greater control of your environment. At the end of the day, Azure allows for huge flexibility by design, but it is important for many companies for both security and cost management reasons to be able to exercise a degree of control over that flexibility.

A little tip if you are using both features, make sure you apply a CanNotDelete Lock to your important RMPs!

Windows Update Management

Update management is a necessary evil in the IT world. Some admins enjoy “Patch Tuesday” and for some it’s the most dreaded day of the month. Microsoft have made strides in relieving the stress that can be associated with patching certain core VMs but good management still requires a lot of administration.

Within Azure, every time you deploy a Windows server VM from a Marketplace image, you are getting the latest available patches, but what options do you have should those VMs need to run for an extended period of time? How do you keep them patched so they adhere to company security policies?

Traditionally linking Azure to your existing on-premise solution, or building a WSUS or SCCM implementation were options. Both of these obviously work well but for smaller sites could be considered cumbersome. Now, within Azure itself, making use of some platform objects that you may already be using, you can get a central console view of all of your machine updates.

The two requirements, outside of a VM to manage, are:

  1. Automation Account
  2. Log Analytics Workspace

Both of these implementations are basically free,  (see latest pricing details for limits etc.) and relatively easy to set up separately. However, part of the process for enabling update management can also set these up for you should that be required.

To enable update management for a single VM, open the VM blade and choose the Update Management button from the left action menu, it is part of the Operations section. This will run a validation operation to see if the feature is enabled and assess whether there are automation accounts and log analytics workspaces available. The validation process also checks to see if the VM is provisioned with the Microsoft Monitoring Agent (MMA) and Automation hybrid runbook worker. This agent is used to communicate with the VM and obtain information about the update status. This information is stored in the log analytics workspace.

Once you choose your current available options, or request to have new ones created, the solution takes roughly 15 minutes to enable. Once enabled, you will now see a management page, it will take some time for the live data to be collected from the server, but once that completes, this page will display information regarding the status of updates available/missing. You can click on individual updates for more information. You can also analyse the log search queries that run for checking updates, these can be modified to suit your environment if/as required.

Now that your management pane is displaying what updates are missing, you need to install them. You can schedule the installation of the updates you require from the same management pane. To install updates, schedule a deployment that follows your release schedule and service window. You can choose which update types to include in the deployment. For example, you can include critical or security updates and exclude update rollups. One thing to note that is important, if an update requires a restart the VM will reboot automatically.

The scheduling process is very simple. You choose a name for the deployment, the classification of updates you would like to install, your scheduled time to begin the process of installation and finally a maintenance window to ensure compliance with your defined service windows.

Once the scheduled deployment runs, you can then view its status. Again, this is via the Update Managment blade. This reports on all stages of the deployment from “In Progress” to “Partially Failed” etc. You can then troubleshoot any issues should they arise.

Overall, I really like this solution. It also scales, you can add several machines using the same automation etc. From the Automation Account, you can then access the Update Management blade and manage multiple enabled VMs at once, including scheduling mutli-VM deployments of updates.

While I haven’t covered it here, this solution also works with Linux distributions and can be integrated with SCCM.

More here:

Update Management Overview

Patching Linux

Manage multiple VMs

SCCM Integration

Azure Resource Manager (ARM) Templates

One of the most useful aspects of a platform like Azure is the multitude of deployment options that are available. Which one you use may be down to familiarity, efficiency or sometimes nature of deployment. In this post I will discuss ARM templates which can greatly speed up your deployment cycle.

Infrastructure as Code (IaC) is the management of infrastructure (networks, virtual machines, load balancers, etc.) in a descriptive model. It functions best when using the same versioning that your DevOps team uses for source code. Similar to the principle that the same source code generates the same binary, an IaC model generates the same environment every time it is applied. Therefore, it is very beneficial to reducing deployment times as well as simplifying how resources are deployed.

An ARM template is a JSON file, in its simplest form it must contain the following definitions:

  • schema
  • content version
  • resources

For more deployment options it can also include the following:

  • parameters
  • variables
  • outputs

In general, your templates will include all of the above, this ensures the greatest level of customisation to the deployment as and when needed. Without getting too much into the technicalities of each aspect, the file will contain everything needed to build all the objects you have defined. For example, if the file builds a VM you will define the name, size, NIC used, OS profile and disk options. You have multiple choices within each definition to greater customise your deployment and these definitions can be passed as direct referrals, variables or parameters.

One thing to note, is that while these templates deploy resources via code, they cannot configure the resources. To automate that, you must consider a technology like DSC or Powershell once the template completes deployment.

JSON files are not simple to read, I deliberately haven’t included a sample as they are easier to understand as you build one. The fact that they aren’t simple makes error checking somewhat problematic. Most code editing applications that support ARM plugins will catch basic formatting errors. You can also verify the file via Azure Powershell. If you really want to confirm your template works it is best to test the deployment properly. Ideally, you could make use of a test/dev subscription to minimise costs but once the template completes, you can delete the entire resource group quite quickly.

To best understand how these templates can be of use, start with one of the simple quick start templates from Github, for example, a simple Windows server deployment – https://github.com/Azure/azure-quickstart-templates/tree/master/101-vm-simple-windows

You can then build layer upon layer of code on top of this to increase the complexity of the deployment or use one of the other samples that closer matches your intention.

For more reading, I would recommend starting with understanding the structure and syntax before moving onto the actual templates themselves here.

Virtual Network Service Endpoints

When designing an application deployment to be hosted in Azure, a design consideration that is commonly enticing is to transform a layer of the application from traditional infrastructure to something more modern. Microsoft offer several Platform-as-a-service (PAAS) options that allow this to be achieved, for example, transforming SQL server installed on a VM to Azure SQL.

While this transformation might be straight forward from an SQL Database point of view and most likely when considering the cost of running your deployment, a concern that often arises is security. As Azure SQL is PAAS, it offers a public endpoint for SQL authentication and connectivity. This is by design and there are limited options to prefix this with a security layer. If your application runs somewhere outside Azure, this makes sense and might be an acceptable and noted weak spot. However, if the rest of your application layers are hosted within Azure, having to route out to a public endpoint is less secure than it could be and simply bad design.

Thankfully, Microsoft have been making updates to virtual network functionality that allow you to route directly from your virtual network resources to several PAAS offerings. To do this, you must make use of virtual network Service Endpoints.

Endpoints extend your virtual network private address space and the identity of your VNet to specific Azure services, over a direct connection. Endpoints allow you to secure your critical Azure service resources, such as Azure SQL, to only your virtual networks. Traffic from your VNet to the Azure service always remains on the Microsoft Azure backbone network and never takes a public route. They are currently available for three PAAS offerings:

  1. Azure SQL
  2. Azure Storage
  3. Azure Data Warehouse (Preview)

Introducing an Endpoint for Azure SQL (to stick with the initial example) allows improved security as it fully removes public Internet access and allows traffic only from the virtual network.

It also optimises routing as Endpoints always take service traffic directly from your virtual network to the service itself on the Microsoft Azure backbone network. Doing this means that if your environment uses forced-tunnelling this traffic will no longer be viewed as outbound, but intra-Azure and will flow direct.

There are some considerations to be aware of:

  • Location – the virtual network and the PAAS offering must be located in the same region.
  • Outbound Network Flow – if you control outbound network flow via NSG, you can make use of the “Azure Service” tags to allow this traffic via Endpoint.
  • Connections – If you enable a service endpoint, all current TCP connections from your virtual network will drop. This is to allow a change from Public IP access to Private.

Personally, I think Endpoints should be used as widely as possible. From a security and design perspective they allow greater ease of adoption when PAAS offerings are being considered and perhaps best of all, they are free!

Securing Remote RDP Access

Accessing a VM in your local environment is generally a straight forward operation. You initiate an RDP connection to a private IP and enter your credentials. Simple and relatively secure as all traffic is local relative to your LAN. With Azure, this can become  more complex as the option to attach a public IP is available.

Even with added complexity, attaching a public IP to your VM should not be viewed as a negative. The functionality it offers is incredibly useful in many scenarios.  However, as this allows the VM to be accessed from the internet, the correct precautions must be taken to ensure your VM is kept safe and secure.

Let’s start with the most basic but often overlooked piece of security; username and password. Choose a username that is unique to your organisation and choose a password that is complex. RDP ports are one of the most tried perimeter ports on Microsoft’s public network, having your credentials as secure as possible adds a minimal but useful level of security to your VM.

Next up, a Network Security Group. This can prevent or allow IP addresses accessing your VMs network interface and/or subnet. An example configuration would be to allow access to RDP for your outbound IP range and deny all other access. This would mean that only your outbound IP range can access the RDP port. This works well but what if the VM requires access from multiple locations? What if you don’t have access to their IP range?

This is where we add our final layer of security, a public facing Azure Load Balancer. This object will carry it’s own public IP address and allows for the creation of NAT rules. To secure the RDP port (3389 by default) simply create an inbound NAT rule to translate a random public facing port, for example 50001 or 100003, to our standard internal port 3389. Once this rule is in place, you should remove the public IP associated to your VM. The LB association will route traffic inbound and outbound via the LB public IP and therefore use the NAT translation for secure access. For those of you who used ASM VMs, this is similar to how endpoints were setup within Cloud Services, but this offers a more granular control.

Below are some links on some of the features and points mentioned in this post as well as a technet blog on provisioning and attaching an LB for RDP. As always, any questions, get in touch, enjoy!

Azure AD Application Proxy

Web applications that are only accessible on your corporate LAN are common place in most companies. The lack of public access can be the result of many factors, most commonly the reason is the complexity of allowing a route through your secure perimeter. As a result, providing access to these applications has traditionally involved creating and utilising virtual private networks (VPNs) or demilitarized zones (DMZs), both requiring significant IT effort to put in place and keep secure. Adding to this, a lot of these applications can be quite difficult to lift and shift into a DMZ which would of course be best practise. Overall, both solutions have several complexities and offer different degrees of difficulty to manage.

Enter Azure AD Application Proxy (AP). This service can provide single sign-on (SSO) and secure remote access for these common web applications. It leverages your current infrastructure and ties into Azure AD for identity management so if you are already using Office 365, the authentication process for users is identical and the configuration required is minimal. Additionally, the page will be available on any web-accessible device at all times. This greatly simplifies the process as you don’t need to change your network infrastructure or allow VPN access for external users.

Again, those already using Office 365 will have their identities, or a version of them, active within Azure AD. When using AP, you can require users pre-authenticate before the internal page loads, offering an additional layer of security and auditing. If your identities are federated for example, this process ties-in seamlessly without any further configuration required. The ability to then add a method of passthrough authentication is when your end users life is made a lot easier.

To make use of AP, you need to install at least one connector in your environment. The requirements for this installation are here. The beauty of the connector is that it only requires outbound ports on your firewall and the main ports are 80 and 443, ports you would most likely already have open. Again, Microsoft are trying to make this as simple as possible! Once you have a connector installed and active, it is a good idea to install at least one more. This allows for high availability should one of the servers be inaccessible accidentally or due to maintenance.

Next, you obviously want to publish your application. This can be done in several ways, from the very simple to the completely integrated. Digging through those complexities is something that requires a lot more time than this blog post is suitable for, but believe me, with some guidance from an experienced architect, this process is very much achievable in the majority of environments. Here is some additional reading on application publishing:

Publishing Applications

SSO utilising KCD

Remember, AP isn’t just for simple internal web applications like your intranet page. It can be leveraged to provide SSO and secure remote access to your on-premise Sharepoint farm and even applications published locally via RDS.

I’ve implemented this service for several clients and they all compliment the functionality it allows them to leverage. The guidance during the configuration phase allows IT admins to then layer additional points of access to applications that would previously have been simply to cumbersome to offer externally. In my opinion, AP is one of the best features of Azure AD and it is only improving as Microsoft adds additional options for security and functionality.

This post is much more of an introduction to AP than a guide to configuration, if you want to talk about config, or have any other questions you can contact me on Twitter – @wedoAzure or via email.

 

User Defined Routing

How traffic is routed within and between Azure networks isn’t something that is immediately apparent to the average admin. Therefore trying to increase the complexity of this for security reasons or by adding a virtual appliance can offer several challenges. As more companies begin to adopt a cloud strategy these scenarios will become more common. I’m here to tell you that with a little patience, taking full control of traffic routing in your Azure environment only takes a couple of well planned steps.

This is something you should see repeated often when reading about Azure; everything starts with the Virtual Network. When you create a Virtual Network (vnet) you must define some simple parameters, we’ll focus on two of these for this blog post, Address Space and Subnets.

If you are planning on a cloud-only environment, you can choose whatever address space you like (well, almost… any range defined in RFC1918) but, I would always recommend you plan vnets with the possibility they will be linked to your LAN. It is much easier to plan for this now than trying to resolve it later, trust me.

Choosing how to divide your address space up into several subnets also requires careful consideration. To start with, not all of the addresses in a subnet are usable, Azure reserves the first, last and three other addresses for Azure services. So you need to ensure there are enough usable addresses in each subnet for its intended use.

Once the above steps are completed, Azure creates a system routing table. This defines how traffic will route both within and out of the vnet. It is a combination of your vnet address space and Azure defined address spaces. Below are the effective routes for a demo network interface sitting in a vnet with the address space 10.0.0.0/16

system_routing

As you can see, all of the source definitions are ‘Default’ indicating they are system generated. The entry noted with ‘next hop type’ of ‘Virtual Network’ is our vnet address space, this defines the traffic as local. The rest are added by default to cover system and internet routing, more on those here.

The above table applies to all subnets within the vnet, so if you had a requirement for an incredibly secure vnet that could not route to the internet, only local traffic, you could introduce a user defined route table to overwrite the system route.

Routing in Azure is done by longest prefix match (LPM) this means that if the system table defines a 0/16 and you define a 0/24 and traffic is bound for that address space, it will follow the 0/24 route. If the LPM is an exact match, the following order is adhered to:

  1. User defined route
  2. BGP route (when ExpressRoute is used)
  3. System route

So using our example above, we would create a UDR and define a route for traffic bound for 0.0.0.0/0 as having a ‘next hop’ of none. This will prevent any outbound traffic, with the exception of your vnet (remember, LPM) from being able to route entirely.

The below grab of my demo VM effective routes shows how the system route is now marked as invalid and my user route has precedence.

udr

There are many other scenarios where this can be useful, for example if you have a site-to-site VPN and would like all internet traffic to tunnel through on-premise appliances you can force tunnel traffic to the gateway.

Similarly if you have an appliance as part of your vnet within Azure, you can route all traffic to and from it as you see fit. Remember to always view the effective routes being applied to a network interface before pushing to a live production environment!

My final recommendation if using UDRs in your environment is to add a Network Watcher object. It greatly simplifies analysing and troubleshooting routing. There are multiple powershell and CLI commands you can also make use of for troubleshooting.

If you’ve any questions etc, get in touch!

Welcome!

I gave quite a lot of thought to what my first post should be; a detailed how-to, a cautionary tale of experience or an excited new feature set. However, I decided that all of that can wait. What is more important initially is to introduce myself so you know a little bit about whose posts you are reading.

My name is Joe Carlyle and I am an Azure Consultant based in Dublin, Ireland. This blog is something I have been meaning to create for quite some time and I’m very happy to be eventually starting it.

As you have probably guessed by the name, this will be a blog about Microsoft Azure which is Microsoft’s cloud computing platform. Part of my reason for setting this blog up is that I believe it is one of the most exciting areas of technology that anyone could currently be working in. The platform itself is vast and skillsets required are broad, therefore I don’t believe that any one person can be an expert of the whole platform. There are simply too many services available.

The options within Azure that this blog will primarily focus on are as follows:

  • IAAS and everything that goes with it
  • Azure Active Directory
  • Intune
  • Security & Identity
  • Monitoring
  • Automation

My initial plan is to try to post something that is useful, interesting or both, at least once a week. Obviously if there is anything new and exciting, this will also be added into this mix. Once the blog is up and running, I also plan on introducing a Mailbag post to answer any questions that may have arisen from previous posts, or to simply take requests for future ones.

It is my opinion that for a blog to be successful and benefit all those who use it, it must offer a degree of interaction. All of my professional contact details are listed, please feel free to get in touch!

Joe