Dashboard Alerts - Insight
NOTE: This feature is currently in Beta
Troubleshooting Guide
This document lists the alert categories available under Meraki Insight’s Web Application Performance, along with their triggers and the recommended next steps for resolving the application issue on the network.
Root cause detection currently covers WAN problems detected by passive analysis of application flows in Web App Health.
Performance Degradation
If no root cause for an issue has been found, the alert will show as Performance Degradation.
Triggers
Web App Health goodput or response time data for a particular application and network breaches the configured threshold (either a smart threshold or a manually set one).
ISP issues
Triggers
Goodput on the WAN for passively collected flows, per application and network, drops below the configured threshold (either a smart threshold or a manually set one).
Uplink usage is less than 80% of the bandwidth defined on the SD-WAN & Traffic shaping page.
Goodput of actively collected ICMP ping data for the alerting uplink drops below 2.5 KB.
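The three triggers above can be read as a combined check. The following is a minimal sketch, assuming all three conditions must hold at the same time (the source lists them without stating how they combine). The `UplinkSample` type and its field names are hypothetical and are not part of any Meraki API.

```python
from dataclasses import dataclass

UPLINK_USAGE_RATIO = 0.80      # from the triggers: usage below 80% of configured bandwidth
PING_GOODPUT_FLOOR_KB = 2.5    # from the triggers: active ICMP ping goodput below 2.5 KB


@dataclass
class UplinkSample:
    """Hypothetical snapshot of the measurements named in the triggers."""
    passive_goodput: float       # passively measured WAN goodput for the app
    goodput_threshold: float     # smart or manually set threshold, same unit
    uplink_usage: float          # current uplink usage
    configured_bandwidth: float  # bandwidth set on the SD-WAN & Traffic shaping page
    ping_goodput_kb: float       # goodput of active ICMP ping data, in KB


def is_isp_issue(s: UplinkSample) -> bool:
    """Assumption: the alert needs all three trigger conditions together."""
    goodput_low = s.passive_goodput < s.goodput_threshold
    uplink_not_saturated = s.uplink_usage < UPLINK_USAGE_RATIO * s.configured_bandwidth
    ping_low = s.ping_goodput_kb < PING_GOODPUT_FLOOR_KB
    return goodput_low and uplink_not_saturated and ping_low
```

The second condition is what separates this category from Uplink saturation: if usage were above 80% of the configured bandwidth, saturation would be the more likely explanation.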
Troubleshooting steps
- Check whether you have configured the bandwidth on the SD-WAN & Traffic shaping page.
- Check bandwidth restrictions by referring to the Global Bandwidth limit considerations KB article.
- Check the event log for continuous VPN registry changes and, if present, refer to the Site-to-Site VPN troubleshooting doc or Meraki Auto VPN - Configuration and Troubleshooting.
- Look at the network status on the WAN Health page, or the Uplink tab on the Appliance status page, to check the loss and latency reported from ping data.
- Look at the Meraki Insight Web App Health drill-down page to check passive goodput.
High latency over VPN
Triggers
Goodput on the WAN for passively collected flows, per application and network, drops below the configured threshold (either a smart threshold or a manually set one).
If the flow goes over VPN and latency is detected on the WAN for this application only, this alert is triggered. A check for other major problems or outages is made before this root cause is considered High confidence.
Troubleshooting steps
- Check the WAN health portfolio in Meraki Insight to see if the overall network is having high latency.
- Check the VPN status page to see usage information for any bandwidth throttling.
Uplink saturation
Triggers
Goodput on the WAN for passively collected flows, per application and network, drops below the configured threshold (either a smart threshold or a manually set one).
If uplink usage is greater than 80% of the bandwidth configured on the SD-WAN & Traffic shaping page, there is High confidence that application performance is suffering because the uplink is saturated.
Troubleshooting steps
- Check whether you have configured the bandwidth on the SD-WAN & Traffic shaping page.
- Check bandwidth restrictions by referring to the Global Bandwidth limit considerations KB article.
- Check the WAN Health page to see if the given network shows a high usage alert on the availability graph.
- If a certain user is utilizing more than the required bandwidth, refer to Creating and Applying Group Policies to add policies that restrict or limit the bandwidth utilized.
Dual Active MXes
Triggers
This alert is triggered when the event log shows an unexpected number of VRRP transition events for the primary and spare MX, which can interrupt the application's flows or cause loss. There is also a check to see whether the WAN overall is seeing loss:
If yes, this alert is triggered with High confidence that the issue is due to VRRP transitions and high WAN loss.
If no, this alert is triggered with Medium confidence that the issue might be due only to the VRRP transitions.
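The confidence logic above can be sketched as a small function. This is an illustration only: the source does not say how many VRRP transitions count as "unexpected", so `expected_max_transitions` is a hypothetical parameter, as is the function name.

```python
from typing import Optional


def dual_active_mx_confidence(
    vrrp_transitions: int,
    expected_max_transitions: int,  # hypothetical cutoff; the real value is not documented here
    wan_loss_detected: bool,
) -> Optional[str]:
    """Return the alert confidence, or None when no alert would be raised."""
    if vrrp_transitions <= expected_max_transitions:
        return None  # transition count is within the expected range
    # Unexpected transitions plus overall WAN loss: High confidence.
    # Unexpected transitions alone: Medium confidence.
    return "high" if wan_loss_detected else "medium"
```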
Troubleshooting steps
- Review the Connection Monitoring checks for the MX by referring to this doc: MX VRRP Transitions
Traffic shaping rule saturation
Triggers
Goodput on the WAN for passively collected flows, per application and network, drops below the configured threshold (either a smart threshold or a manually set one).
The app configured on Web App Health has a traffic shaping rule with a bandwidth limit applied to it under the SD-WAN & Traffic shaping page.
Usage of the app is greater than 80% of the bandwidth limit configured in the traffic shaping rule.
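As a rough illustration of the last two triggers, the sketch below flags apps whose usage exceeds 80% of their traffic shaping limit. It covers only the usage condition, not the goodput threshold check, and the function name and dictionary inputs are hypothetical rather than taken from a Meraki API.

```python
SATURATION_RATIO = 0.80  # from the triggers: usage above 80% of the rule's bandwidth limit


def saturated_apps(usage_by_app: dict, limit_by_app: dict) -> list:
    """Return the apps whose usage exceeds 80% of their shaping limit.

    Apps with no traffic shaping limit are skipped, since the trigger
    only applies to apps that have a bandwidth limit configured.
    Usage and limits are assumed to be in the same unit.
    """
    return sorted(
        app
        for app, usage in usage_by_app.items()
        if app in limit_by_app and usage > SATURATION_RATIO * limit_by_app[app]
    )
```

For example, with a 10 Mbps limit on one app, usage of 9 Mbps would be flagged while usage of 2 Mbps would not.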
Troubleshooting steps
- Check the Traffic Shaping rule section under the SD-WAN & Traffic shaping page to increase the bandwidth limit provided to the application.
- Refer to Traffic and Bandwidth Shaping for more information on limitations that can be applied to application traffic.
Sticky client
Triggers
This alert is triggered when poor performance is detected on the connection between a client and an SSID due to suboptimal AP selection. Client devices choose which AP to connect to; Meraki APs cannot force a client to choose a particular AP.
Troubleshooting steps
- Try to force the client to re-select a more optimal AP by having the client disassociate and reassociate.
- To learn more about settings that help avoid sticky client issues, refer to this documentation.
MAC Flap Anomaly Alert
Overview
A MAC flap event is triggered when a MAC address is learned 3 or more times on 2 or more different ports within 10 seconds. These events can become noisy because of wireless roaming, which is expected behavior. Networks with APs in bridge mode are especially susceptible to this issue.
Because of this noise, it is difficult to separate expected events from unexpected ones. To solve this problem, Cisco Meraki built a machine-learning algorithm to detect unexpected events more accurately.
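The MAC flap definition above (3 or more learns on 2 or more ports within 10 seconds) can be sketched with a sliding window. This is an illustration of the rule as stated, not the switch's actual implementation; the function name and the `(timestamp, port)` input format are assumptions, and the 10-second window is treated as inclusive.

```python
from collections import deque

WINDOW_SECONDS = 10  # from the definition: within 10 seconds
MIN_LEARNS = 3       # learned 3 or more times
MIN_PORTS = 2        # on 2 or more different ports


def is_mac_flap(learn_events) -> bool:
    """Check the learn events of a single MAC address.

    learn_events: iterable of (timestamp_seconds, port) tuples sorted by time.
    """
    window = deque()
    for ts, port in learn_events:
        window.append((ts, port))
        # Drop learns that fall outside the 10-second window ending at ts.
        while ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        distinct_ports = {p for _, p in window}
        if len(window) >= MIN_LEARNS and len(distinct_ports) >= MIN_PORTS:
            return True
    return False
```

A client roaming between two bridge-mode APs a few times in quick succession would satisfy this rule, which is why roaming makes these events noisy.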
Trigger
The machine-learning algorithm looks at the past 35 days of hourly event counts and creates hourly thresholds for the next 7 days. It monitors the real-time hourly event counts, and if the real-time hourly count breaches the expected threshold, it creates an alert: MAC flap anomaly detected on “SWITCHNAME” from “TIMESTAMP”.
The alert resolves itself only when the algorithm detects that the real-time hourly count has gone below the threshold. This check happens at the top of the hour following anomaly detection. Otherwise, the alert persists.
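To make the shape of this trigger concrete, here is a minimal sketch that turns 35 days of hourly counts into 168 hourly thresholds for the coming week. The actual Meraki algorithm is not described in this document; the mean plus `k` standard deviations rule, the weekly grouping of hours, and all names below are stand-in assumptions chosen only for illustration.

```python
from statistics import mean, pstdev

HISTORY_DAYS = 35        # from the text: past 35 days of hourly event counts
HOURS_PER_WEEK = 7 * 24  # from the text: hourly thresholds for the next 7 days


def weekly_thresholds(hourly_counts, k=3.0):
    """Build one threshold per hour of the coming week.

    hourly_counts: 35 * 24 event counts, oldest first.
    Assumption: each hour-of-week slot uses its 5 historical samples
    and a threshold of mean + k * standard deviation (not Meraki's model).
    """
    if len(hourly_counts) != HISTORY_DAYS * 24:
        raise ValueError("expected exactly 35 days of hourly counts")
    thresholds = []
    for slot in range(HOURS_PER_WEEK):
        samples = hourly_counts[slot::HOURS_PER_WEEK]  # same hour-of-week, 5 weeks
        thresholds.append(mean(samples) + k * pstdev(samples))
    return thresholds


def is_anomalous(hourly_count, threshold):
    """An alert is raised when the real-time hourly count breaches the threshold."""
    return hourly_count > threshold
```

Under this sketch, the alert would clear at the next hourly evaluation in which `is_anomalous` returns `False`, which matches the top-of-the-hour resolution behavior described above.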
Troubleshooting steps
- Make sure there is no loop in the network.
- Follow the high-density Wi-Fi deployment guidelines to make sure the network meets the optimal network criteria.
- Check for a faulty Wi-Fi adapter on a client device, which can cause frequent connection failures and frequent roaming.
- This algorithm can create false positive alerts from time to time. Cisco Meraki will continue to make improvements to the algorithm to reduce false positives.
- This alert will appear on the alert hub and the new organization alert page.
- This alert will show up only if an organization is opted in to the new organization alert page. To opt in, navigate to the Organization > Early Access page on Dashboard and opt in to “New Organization Alert Page & Alert hub Enhancements”.