Back for another year, Azure Spring Clean returns this week to help with all of your Azure Management options. The event will run from today Monday 23rd through until Friday 26th.
Each day, there will be articles from the following blend of topics:
Azure Cost Management
Azure Security Principles
All articles are community driven and are a mix of experience and technical detail. You may recognise some faces from last year, which is great to see as organisers. Conversely there are new contributors too which is equally great and inspiring to continue the event annually.
Azure ExpressRoute is Microsoft’s recommended method of accessing resources in Azure over a private connection. It is unique to the platform and can offer unparalleled resilience and performance. It also has the capability to allow you connect to Azure Public Resources, such as Storage Accounts, Azure SQL, over the same circuit and therefore, private connection.
With the above being possible, it’s clear that ExpressRoute, once implemented, becomes the key resource in your environment. As such, you would be wise to implement some level of monitoring and alerting for it.
Here, I’m going to show you how to very quickly put in place alerts should either of the required BGP sessions drop for your ExpressRoute circuit.
To add some context to that, each ExpressRoute circuit requires a pair of BGP sessions to meet SLA requirements. They will function with just one, but that’s not fully supported and definitely not recommended! So, presuming you have both active, we’ll base our alert off of that metric. We’re going to look specifically at Private Peering as it’s the most common implementation, but the logic is identical for Microsoft Peering.
If both are active and configured correctly, you should see the same routes being advertised within the circuit if you check your route table from the Peering blade
Next, let’s look at the Metric we will be using for the alert. In your ExpressRoute Circuit blade, choose Metrics. Then we’ll choose the ‘BGP Availability’ Metric and ensure Aggregation is ‘Avg’
Next, to future proof the alert, or if you have a Microsoft Peering, we’re going to apply some filtering. So choose ‘Add filter’ (highlighted above). Then we’re going to select the ‘Peering Type’ Property, and choose ‘private’ from the Values drop down and click the tiny blue tick to the right (You won’t see Microsoft if you don’t have one, but do this anyway to ensure your alert is tightly scoped).
Now, we’re looking at the average BGP availability, for our Private Peering only.
To make life easier, you can now just click ‘New Alert Rule’ on the top right to use this scoped metric for your alert.
This applies the right resource scope and brings the metric across too, you now just have to configure some of the required choices on specifics.
First up, let’s confirm the condition and signal logic. Click on the blue text under Condition. You will see Private is already selected for you. I’m going to change Operator to ‘less than or equal to’ and add a value of 95.
The above settings mean that the BGP Availability metric for my private peering, over the last 5 minutes, will be assessed every minute. Should it drop below or equal to 95%, an alert will fire. You can adjust this as needed to suit your own environment.
Now, we need to define what happens when they alert fires. Click Done on the signal logic pop out, this will bring you back to the alert configuration blade.
Next, define the final details for the alert. Use a descriptive name and the appropriate severity for your environment/system.
Now, the alert itself can take some time to become fully active, so I would recommend waiting about thirty minutes before attempting a test. But, please ensure you test the alert!
And that’s it. Once tested, you have now setup a fully scoped alert rule for your ExpressRoute Private Peering. You can repeat the above, and change the scope to Microsoft Peering to cover that too if you have it.
As always, if there are any questions, get in touch!
First announced back in late February, Azure Sentinel is the first cloud native SIEM service from a major provider. SIEM (security information and event management) is a primary component in any security service. Sentinel aims to leverage cloud specific benefits like elastic scale and AI to allow customers detect and respond to security incidents as quickly and efficiently as possible.
The workflow of Azure Sentinel can be broken into four steps:
Sentinel allows you to collect data at scale from multiple users, devices, applications and infrastructure, hosted in Azure, on-premises, and even in multiple clouds. This means you can aggregate all security data using industry standard log formatting. With built-in integration, you can enable collection for features such as Office 365 or Azure AD within seconds.
Having all of your data collected to Sentinel allows for more simple analysis and detection at scale than was previously possible. This more efficient triage, and the capability to leverage Microsoft Machine Learning allows you to be more productive, minimise false positives and react to those high accuracy alerts as early as possible.
Sentinel allows you to visualise and resolve alerts using the same dashboards. Proactively hunting for incidents can be automatic or scripted into a set of queries. Microsoft have provided some to get you started too based on their analysis and response teams.
Continuing the efficiency seen in previous steps, Sentinel allows you to orchestrate and automate responses to incidents. Allowing you to automatically handle repeat and/or known incidents.
So now that you know what it is, the next step is to put it in action and see if it can be of use to you and your client/business. Currently still in preview, Sentinel is free to use, which is always good and allows you to assess the service without any significant financial impact. Bear in mind, you will pay for the Log Analytics workspace which stores the data!
First, you’ll need to enable Sentinel and a workspace, this can be done via the portal and a walkthrough is here. Then, you need to connect some services to start streaming data to Sentinel. As you can see below, there are multiple options and you can choose which logs/data is sent to Sentinel too.
Once your data connector is active, you can make use of the built in dashboards to visualise it. Below is a subsection of the Azure AD sign-in log dashboard, which is available immediately via Sentinel. You can also create your own custom dashboards, there is a guide with samples here.
Now that is being collected and you have visuals, your next step is analysis. The first thing you will need to do is to create Detection Rules. These are essentially Log Analytics queries with alerting parameters wrapped around them. Microsoft offer sample queries on Github which are updated regularly. Alternatively, you can simply write your own to meet your needs.
The results of your Detection Rules are then fed into the Cases section of Sentinel. Here you can triage, investigate and remediate incidents. The cases are created dynamically from the parameters you set for Detection Rules such as severity and entity mapping. As such, be prepared to have to tweak those thresholds and alert patterns a bit. I have Sentinel running within several customer tenants, and am still not 100% happy with my detection rules yet. Always remember to update the status of your case too, in progress, resolved etc.
Finally, you should set up some Playbooks to respond to your alerts. A Playbook is simply a set of procedures that you can run from Azure Sentinel. They help automate and orchestrate your responses to alerts, and you can run them manually or ideally set them up to run automatically in response to certain alerts. They are based on Logic Apps, which means all of the same actions are available via Sentinel. One quick note, there is a charge for Logic Apps and therefore Playbooks, so ensure you understand your costs first.
When creating a Playbook, regardless of it is going to be run automatically by Sentinel or manually by you, you should first define your scenario. My preferred approach here is to come up with “If-This-Then-That” loops and apply them as needed. This is another section that will take some tweaking over time. In my experience, I only run Playbooks manually initially, then start to add automated triggering once I’m happy with the alert and response. Docs have a nice sample alert-response playbook with messaging and actions which is a great place to start.
Another function which I haven’t covered here is Hunting. I haven’t spent enough time with this feature yet to give a detailed opinion but you can read more on it over on Docs.
So, if you haven’t given Sentinel a try yet, I’d recommend you review the quickstarts and deploy in your tenant for one or two of the data sources like Azure AD. While it’s in Preview, it is a great chance to assess it relevant to your tenant and hopefully gain some greater insight and response capability too.
As always, if there any questions or if you have any problems with your Sentinel, get in touch!