Often, networks as a whole are configured correctly with no major network issues reported by users or seen on network device monitoring logs. However, even in these circumstances, there may be some clients with a less-than-satisfactory experience. Issues with authentication, association, roaming, IP assignment, and poor throughput are examples of what might be experienced by individual wireless clients, and it may not be obvious where the problem lies.
There may also be instances where a wireless admin might want to verify the effects of network changes from a client standpoint, e.g., a driver update or device refresh, a configuration change, or physical topology changes in AP location or count. The WLAN admin can now verify the new RF metrics and compare them against previously recorded baselines.
There also exist dynamic larger-scale RF environments, such as warehouses and stadia, where the client density and signal propagation constantly change. When filled to capacity, these RF environments are very different from when they are empty, which impacts wireless device connectivity. There will be different issues to contend with when they are full, such as attenuation and absorption of signal, dense client distribution with multiple clients on a single AP, interference from other 802.11 sources, etc. As the space empties, there may be issues involving cell overlap and multipath interference as signals get reflected. Aside from variances in usage and traffic on the network, there would be no other indicators to show the unique state from the client perspective.
Meraki Health is a suite of tools and analysis that gives wireless administrators each client's and access point's unique perspective of connectivity to the WLAN, so they can drill down into client and access point issues and see the connection state. This document covers the information available on the Wireless > Access points page under the tabs:
- Health
- Map
- Connection log
The Health tab provides a network-level overview of SSIDs, client connections, and AP statistics. This enables an administrator to easily look at a specific client, access point, or SSID that has had issues reported and clearly see what the most common points of failure are. The health tab makes troubleshooting existing issues much easier, while also helping to identify potential issues that may arise in the future or areas where improvements could be made in the current deployment.
The recommended minimum firmware version for the health tab is MR 24.12 with standard dashboard licensing; no special licensing or additional fees are required.
Wireless Health is accessed by going to Wireless > Access points > Health in the dashboard. The Overview page shows the overall health of the wireless network at a glance, including the total number of client devices facing issues from a connection and latency standpoint. It also shows the percentage change in each value over the past hour; the time reference can be changed to the last day, week, or month.
If the percentage change from the previous time period is less than 1%, it is not displayed.
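The display rule above can be sketched as follows. This is a minimal illustration, not the dashboard's implementation: the function name `format_change`, the relative-change formula, and the handling of a zero baseline are all assumptions.

```python
from typing import Optional


def format_change(previous: float, current: float, min_pct: float = 1.0) -> Optional[str]:
    """Return a signed percent-change label, or None when the change should be hidden.

    Hypothetical helper: assumes the change is relative to the previous period's value.
    """
    if previous == 0:
        # Assumption: with no baseline there is nothing to compare against.
        return None
    change = (current - previous) / previous * 100.0
    if abs(change) < min_pct:
        # Changes smaller than 1% are not displayed.
        return None
    return f"{change:+.1f}%"
```

For example, a move from 200 to 201 affected clients (0.5%) would be hidden, while a move from 200 to 230 would be labeled as a 15% increase.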
Connection Steps Graph
Below the Overview header, there is a graphic view showing client connection statistics across the entire network in a single snapshot, as seen below. This is the connection steps graph. This is very useful to understand how the overall network is doing in terms of client connectivity.
The connection steps graph quickly and easily shows each step in the process that clients go through every time they connect to an access point. This makes it simple to see at which step in the process clients might be experiencing issues. From left to right we see each step in the process, beginning with the association and followed by authentication, DHCP, and DNS resolution, culminating with the overall success rate of clients that have attempted to connect to the wireless network. As clients move through the connection process, that data is recorded and displayed as a percentage of clients who were able to successfully complete each step.
It is expected to see an overall success rate of less than 100%, as no client will always connect to a wireless network properly every time. Seeing a large drop in success rates at a certain step in the process could indicate an issue that is potentially affecting clients' abilities to connect to the network properly. For example, in the image below we can clearly see that a disproportionate number of issues happen during initial DNS resolution for connected clients, indicating that there may be an issue relating to DNS on the network.
|Association||Shows the total percentage of clients that successfully associated to an access point out of those that attempted in the specified time period.|
|Authentication||Shows the percentage of clients that successfully associated that were also able to successfully authenticate in the specified time period.|
|DHCP||Shows the percentage of clients that successfully associated and authenticated that were also able to receive a valid DHCP address.|
|DNS||Shows the percentage of clients that successfully associated, authenticated, got a DHCP address, and were able to resolve their first DNS request.|
|Success rate||Shows the percentage of clients that were able to successfully associate, authenticate, get a DHCP address, resolve DNS, and pass traffic on the wireless network.|
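The way each step's percentage depends on the step before it can be sketched in a few lines. This is an illustration under stated assumptions, not the dashboard's calculation: the input format (one entry per client naming the first step that failed, or `None` for success) and the names `STEPS` and `connection_funnel` are hypothetical, and the "pass traffic" check in the overall success rate is not modeled separately.

```python
STEPS = ["association", "authentication", "dhcp", "dns"]


def connection_funnel(failed_steps):
    """Compute per-step and overall success percentages.

    failed_steps: list with one entry per client, holding the name of the
    first step that client failed, or None if it completed every step.
    """
    total = len(failed_steps)
    remaining = total  # clients that reached the current step
    rates = {}
    for step in STEPS:
        failed_here = sum(1 for f in failed_steps if f == step)
        passed = remaining - failed_here
        # Each step is measured only against clients that passed all earlier steps.
        rates[step] = 100.0 * passed / remaining if remaining else 0.0
        remaining = passed
    # Overall success is measured against every client that attempted to connect.
    rates["success"] = 100.0 * remaining / total if total else 0.0
    return rates
```

With 100 clients, of which 10 fail association and 10 fail DNS, the association step shows 90%, authentication and DHCP show 100%, DNS shows roughly 88.9% (80 of the 90 clients that reached it), and the overall success rate is 80%.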
Connection Issues by SSID
This section lists the SSID that is having the most issues along with the number of clients impacted on the SSID. The dashboard will also capture the step at which clients are facing issues on a specific SSID. This will provide a clear indication of whether the failures are on a specific SSID or across all SSIDs in the network.
Connection Issues by Client
This section lists the clients with the highest number of connection issues. The connection of a client is monitored across different steps: association, authentication, DHCP, and DNS. When a client device fails any of these steps, the failure is captured in this section. The section also shows the number of failed attempts made by the client device, along with the step at which the failure was observed.
Connection Issues by AP
This section lists APs in the network and the number of clients that have experienced connection issues when attempting to connect to each AP during the selected time frame. The respective Client Devices with Problems columns list the number and percentage of clients out of total devices that had more than 50% of connection attempts fail for the selected time frame. Clicking on Number of Client Devices for a specific AP will open the Failed Connections page, filtered to show only failed connections for the chosen AP.
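The "more than 50% of connection attempts fail" rule can be illustrated with a short sketch. The input format (tuples of AP name, client identifier, and a success flag) and the function name `clients_with_problems` are assumptions made for the example; the dashboard's internal data model is not documented here.

```python
from collections import defaultdict


def clients_with_problems(attempts, threshold=0.5):
    """Summarize, per AP, how many clients had more than `threshold` of attempts fail.

    attempts: iterable of (ap, client, succeeded) tuples (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # (ap, client) -> [failed, total]
    for ap, client, succeeded in attempts:
        entry = counts[(ap, client)]
        entry[1] += 1
        if not succeeded:
            entry[0] += 1

    per_ap = defaultdict(lambda: {"clients": 0, "problem_clients": 0})
    for (ap, _client), (failed, total) in counts.items():
        per_ap[ap]["clients"] += 1
        # Strictly more than half of the attempts must fail to count as a problem.
        if failed / total > threshold:
            per_ap[ap]["problem_clients"] += 1

    return {
        ap: {**stats, "problem_pct": 100.0 * stats["problem_clients"] / stats["clients"]}
        for ap, stats in per_ap.items()
    }
```

Note that a client with exactly half of its attempts failing is not counted, since the text specifies "more than 50%".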
Highest AP to Client Latency
This section of the page lists each AP in the network and the average 802.11 latency for clients connected to that AP, sorted by highest average latency. Latency is measured by looking at 802.11 frames and comparing the time between when the frame leaves the AP radio and when the corresponding ACK arrives back at the AP from the client. Clicking on an AP's name will take you to the access point status page of the AP.
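As a rough illustration of the latency measurement described above, the sketch below averages frame-to-ACK times per AP and sorts the result from highest to lowest. The sample format (AP name, transmit time, ACK arrival time, both in milliseconds) and the function name `average_latency_by_ap` are assumptions for the example only.

```python
from collections import defaultdict
from statistics import mean


def average_latency_by_ap(samples):
    """Return (ap, average_latency_ms) pairs sorted by highest average latency.

    samples: iterable of (ap, tx_time_ms, ack_time_ms) tuples (hypothetical format).
    """
    latencies = defaultdict(list)
    for ap, tx_time_ms, ack_time_ms in samples:
        # Latency is the time from the frame leaving the radio to its ACK arriving.
        latencies[ap].append(ack_time_ms - tx_time_ms)
    averages = {ap: mean(values) for ap, values in latencies.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```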
Connection Issues by Client Device Type
This section displays a list of detected client types and the number of clients in each of those groups that have experienced connection issues when attempting to connect to the wireless network during the selected time frame. The respective Client Devices with Problems columns list the number and percentage of clients out of total devices that had more than 50% of connection attempts fail for the selected time frame.
Highest Client Latency by Device Type
This section lists each detected client type on the network and the average 802.11 latency for connected clients of that type, sorted by highest average latency. Latency is measured by looking at 802.11 frames and comparing the time between when the frame leaves the AP radio and when the corresponding ACK arrives back at the AP from the client.
In order to provide additional insights, the dashboard will surface an additional graph for any anomaly noted in your network for onboarding wireless client devices. This information is generated on a per-network basis and is calculated by computing up to six weeks of historic data.
Based on the historic data, the dashboard will create a smart threshold value for each sample in the historic data, shown on the graph below. This can be found in the Wireless > Health tab in the dashboard.
When any of the onboarding steps exceeds this smart threshold value, an anomaly will be generated to indicate this, and an email can be triggered to the network administrator.
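The idea of a threshold derived from historic data can be sketched as follows. The dashboard's actual algorithm is not described in this document; the rule used here (mean plus a multiple of the standard deviation of the same time slot in previous weeks) is purely an assumption chosen for illustration, as are the names `smart_threshold` and `is_anomaly`.

```python
from statistics import mean, pstdev


def smart_threshold(history, k=2.0):
    """Assumed rule: mean of historic failure counts plus k standard deviations.

    history: failure counts for the same time slot in previous weeks
    (up to six values if six weeks of data are kept).
    """
    return mean(history) + k * pstdev(history)


def is_anomaly(current, history, k=2.0):
    """Flag the current failure count when it exceeds the computed threshold."""
    return current > smart_threshold(history, k)
```

Because the threshold is recomputed from recent history, it moves with the network: a slot that is normally busy tolerates more failures than one that is normally quiet.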
As shown in the graph above, the dashboard will highlight all onboarding failures and will map them, along with other client devices that have been successful in onboarding. This allows the thresholds to be dynamic and respond to changes in the network. The graph is overlaid with anomalies highlighted in red. To make it simpler for a dashboard administrator to consume this information, there are toggles provided in the legend that allow you to enable/disable total clients and failure threshold computed historically.
This graph will only apply filters for SSID and time duration as selected in the Wireless > Health page.
As a network administrator hovers over the graphs, a new pop-up shows the actual values at any given point in time, and will show how many client devices are above the smart threshold graph and categorize them as anomalies.
All this information can be broken down into the four steps for client onboarding: association, authentication, DHCP, and DNS.
A network administrator can look into the individual steps to understand how clients fail to connect and cause the dashboard to trigger anomalies.
This graph is developed in correlation with our smart threshold-based alert. The new graph UI will only show up if the smart threshold-based alert is enabled for a given network; the alert is not enabled by default.
The Map tab initially displays a color-coded map of APs. Hovering over an AP will display a pop-up showing the AP name, number of clients that have had >50% of connection attempts to that AP fail, and the percentage of total clients affected by connection issues. This information can be filtered by SSID, AP tag, and frequency band. As each AP experiences more client connection issues, it will change color:
- Green (<50% of total connection attempts failing)
- Yellow (50-75% of total connection attempts failing)
- Red (>75% of total connection attempts failing)
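The color bands above map directly to a small classification function. This is an illustrative sketch; the function name `ap_color` and the choice to show an AP with no connection attempts as green are assumptions.

```python
def ap_color(failed_attempts: int, total_attempts: int) -> str:
    """Map an AP's share of failing connection attempts to a map color."""
    if total_attempts == 0:
        # Assumption: an AP with no attempts has no failures to report.
        return "green"
    failing_pct = 100.0 * failed_attempts / total_attempts
    if failing_pct > 75:
        return "red"
    if failing_pct >= 50:
        return "yellow"
    return "green"
```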
The Connection Log shows a list of all failed connections with information about the client, AP, SSID, failure stage, and failure reason in a tabular format. This information can be filtered for a time span, SSID, AP tag, AP, frequency band, client, or failure step to get specific information as shown below.
The text in blue represents clickable links to the client devices details page or the access point status page.
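The filtering described for the Connection Log can be modeled as matching rows against a set of field values. The sketch below is not a Meraki API: the `FailedConnection` record, its field names, and `filter_log` are hypothetical and only mirror the columns named in the text.

```python
from dataclasses import dataclass


@dataclass
class FailedConnection:
    """One row of a failed-connection table (hypothetical field names)."""
    client: str
    ap: str
    ssid: str
    band: str
    step: str
    reason: str


def filter_log(entries, **criteria):
    """Return the entries whose fields equal every given criterion."""
    return [
        entry
        for entry in entries
        if all(getattr(entry, field) == value for field, value in criteria.items())
    ]
```

For instance, `filter_log(entries, ap="AP1", step="dns")` would keep only DNS failures seen on the AP named AP1.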
Network level Timeline
Network level timeline is used to highlight any important events across all wireless clients in the given network. This allows the Dashboard administrator to take a quick glance at the network and check for any important events in the network across the selected timeline.
Note: The network timeline is only available for networks running firmware version 28.5 and above. Networks running older firmware versions will continue to use the Connection log tab.
Server root cause analysis
As part of the Meraki Health solution, APs will monitor and report all the failures that are seen from any wired servers configured in the Dashboard network. These include RADIUS, DHCP, and DNS servers. Different root causes are monitored for each of these server types configured in Dashboard. Below is a screenshot of this information being surfaced in the Network-wide -> AP's timeline or Client's timeline tab. Similar information will also be available in the AP timeline and Network timeline for a wireless network.
As shown in the screenshot above, the root cause of the issue is identified by the Dashboard and shown in the tile for the specific client device. Evidence is also gathered by looking at the impact and at other similar failures across SSIDs and VLANs. Based on the intelligence built into the Dashboard, recommendations are updated to help identify the root cause of the observed issue. This is done even before the Dashboard administrator starts looking at their RADIUS server. The added level of intelligence therefore gives an administrator a high degree of confidence when isolating the root cause of the issue. This tool provides an end-to-end root cause analysis for one of the most crucial components of any enterprise wireless network. Here is a list of root causes that can be highlighted across different server types:
Note: Server RCAs are only available for networks running firmware version 28.5 and above.
For general wireless troubleshooting tips, feel free to reference the following articles:
- Channel Planning Best Practices - Provides a detailed overview of best practices for wireless channel planning.
- Understanding Wireless Performance and Coverage - Provides a detailed overview of the technical aspects of wireless signal coverage and performance impacts.
- Wireless Throughput Calculations and Limitations - Provides an overview of how to determine the potential real-world throughput of a wireless network.
- Roaming Technologies - Provides an overview of the different types of supported client roaming and their impact on the roaming process.
- Tools for Troubleshooting Poor Wireless Performance - Provides an overview of other locations available in the dashboard that can provide useful information for more specific wireless troubleshooting.
- Using the MR Live Tools - Another article on how to use the Live Tools on the dashboard.
- VLAN and RADIUS status on access points - Provides an overview of how to use the Live Tools on the dashboard to troubleshoot wireless issues.
- Common Wireless Event Log Messages - Explains the most common Event Log entries that are seen on wireless networks.
- Capturing Wireless Traffic from a Client Machine - Provides a detailed guide to taking Monitor Mode packet captures for troubleshooting.
- Understanding and Configuring Management VLANs on Meraki Devices - Explains how the management VLAN is used on Meraki APs.