Azure ExpressRoute Troubleshooting and Alerts
Setting up an ExpressRoute connection is just the beginning. To ensure high availability, performance, and fast incident response, configuring comprehensive monitoring and alerting is critical.
Types of Alerts: Circuit-Level vs. Gateway-Level
Azure Monitor supports alerts at both the ExpressRoute circuit level and the gateway level.
Circuit-Level Alerts
These focus on peering and protocol availability:
- ARP Availability Down: Fires when Address Resolution Protocol (ARP) availability drops below 100% for a peering type.
- BGP Availability Down: Fires when a BGP peering session goes inactive.
Split these metrics by dimensions such as Peering Type and Peer when defining the alert, so that each notification points to a specific peering and link rather than to the circuit as a whole.
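To make the dimension idea concrete, here is a minimal Python sketch that builds the criteria portion of a metric alert rule, split by Peering Type and Peer. It is an illustration under stated assumptions, not a definitive template: the helper name `availability_alert_criteria` is hypothetical, the payload shape follows the general Azure Monitor metric alert criteria format, and the metric name (`ArpAvailability`) and the dimension values shown are placeholders that should be checked against what the portal lists for your circuit.

```python
import json


def availability_alert_criteria(metric_name, peering_types, peers, threshold=100.0):
    """Build a metric-alert criteria fragment scoped by peering type and peer.

    Hypothetical helper: the metric name and dimension values are supplied by
    the caller and should be verified against the circuit's metric list.
    """
    return {
        "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
        "allOf": [
            {
                "name": f"{metric_name}-below-{threshold:g}",
                "metricName": metric_name,
                # Fire when average availability falls below the threshold.
                "operator": "LessThan",
                "threshold": threshold,
                "timeAggregation": "Average",
                # Each dimension narrows the alert to specific peerings/links.
                "dimensions": [
                    {"name": "PeeringType", "operator": "Include",
                     "values": list(peering_types)},
                    {"name": "Peer", "operator": "Include",
                     "values": list(peers)},
                ],
            }
        ],
    }


# Example: ARP availability on private peering, both links (placeholder values).
criteria = availability_alert_criteria(
    "ArpAvailability", ["Private"], ["Primary", "Secondary"]
)
print(json.dumps(criteria, indent=2))
```

Calling the same helper with a BGP availability metric name would give the second circuit-level alert; only the metric name changes.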
Gateway-Level Alerts
Set up alerts for ExpressRoute gateway connections to monitor overall connection health. To create one:
- Navigate to Azure Monitor > Alerts > + Create Alert Rule.
- Select the ExpressRoute Gateway as the resource.
- Choose the signal type (metrics, activity logs, or resource health).
- Set conditions, thresholds, and actions.
- Assign an action group (email, webhook, ITSM, etc.).
:::image type="content" source="./media/expressroute-monitoring-metrics-alerts/signal.png" alt-text="Azure Monitor signal selection for ExpressRoute":::
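The "conditions and thresholds" step is easier to reason about with a small model. The Python sketch below is a toy evaluator, not Azure Monitor's actual evaluation logic: it averages the most recent samples of a metric and compares the result with a threshold. The `Condition` class, the `should_fire` function, and the sample CPU values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Condition:
    operator: str     # "GreaterThan" or "LessThan"
    threshold: float  # value the aggregated metric is compared against
    window: int       # number of most recent samples to average


def should_fire(samples, condition):
    """Return True if the average of the last `window` samples breaches the threshold."""
    if condition.window <= 0:
        raise ValueError("window must be a positive number of samples")
    recent = list(samples)[-condition.window:]
    if not recent:
        return False  # no data, nothing to evaluate
    value = mean(recent)
    if condition.operator == "GreaterThan":
        return value > condition.threshold
    if condition.operator == "LessThan":
        return value < condition.threshold
    raise ValueError(f"unsupported operator: {condition.operator}")


# Example: made-up gateway CPU utilization samples, in percent.
cpu_samples = [40, 55, 85, 90, 95]
high_cpu = Condition(operator="GreaterThan", threshold=80, window=3)
print(should_fire(cpu_samples, high_cpu))
```

A longer window smooths out short spikes at the cost of slower detection, which is the same trade-off you make when choosing the aggregation period for a real alert rule.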
Alerts by Peering Dimension
Azure lets you create alert rules scoped by peering or individual peers, so you can zero in on specific routes or VNETs for diagnostics.
:::image type="content" source="./media/expressroute-monitoring-metrics-alerts/alerts-peering-dimensions.png" alt-text="Alert scoped by peering dimension":::
Monitoring with Logs
- Activity Logs: Capture control plane events like route changes and BGP resets.
- Resource Logs: Set diagnostic settings to collect route metrics and session status.
- NSG Flow Logs: Useful for diagnosing network-level anomalies.
- Route Diagnostic Logs: Inspect BGP route advertisements and withdrawals.
Troubleshooting Tips
If ICMP (ping) works but application-level connections (SSH, RDP, SQL) fail, check the following:
- GatewaySubnet settings: No NSG or NAT gateway should be attached.
- Route Table (UDR): Set to None for GatewaySubnet.
- Connection state: Look for aged-out TCP sessions vs. proper FIN/CLOSE events.
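The first two checks above can be automated against an exported subnet definition. The following Python sketch assumes the subnet is available as a dictionary shaped like the Azure subnet resource JSON, with `networkSecurityGroup`, `natGateway`, and `routeTable` under `properties`. The function name `check_gateway_subnet` and the sample data are hypothetical, and the connection-state check is left out because it needs packet captures or flow data rather than configuration.

```python
def check_gateway_subnet(subnet):
    """Return a list of findings for a GatewaySubnet definition.

    `subnet` is assumed to look like the subnet resource JSON, for example
    the output of an export or a REST call saved to a file.
    """
    props = subnet.get("properties", {})
    findings = []
    if props.get("networkSecurityGroup"):
        findings.append("NSG attached to GatewaySubnet: remove it")
    if props.get("natGateway"):
        findings.append("NAT gateway attached to GatewaySubnet: remove it")
    if props.get("routeTable"):
        findings.append("Route table (UDR) attached to GatewaySubnet: set to None")
    return findings


# Example with made-up data: an NSG is attached, nothing else.
sample_subnet = {
    "name": "GatewaySubnet",
    "properties": {
        "addressPrefix": "10.0.255.0/27",
        "networkSecurityGroup": {"id": "/subscriptions/.../nsg-example"},
    },
}
for finding in check_gateway_subnet(sample_subnet):
    print(finding)
```

An empty result means the subnet passes these configuration checks, so attention can shift to the connection-state item in the list.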