When VPN functionality is used to securely tunnel traffic between Cisco Meraki devices, such as MX Site-to-site VPN or MR Teleworker VPN, the devices must first register with the Dashboard VPN registry. Registration allows the connections between devices to be established dynamically, without manual configuration. This article discusses issues that can occur with this process.
For information on how the VPN registry works, please read the article on Automatic NAT Traversal.
If the appliance/concentrator is successfully connected to the VPN registry, but is disconnected from another VPN peer, refer to the article on troubleshooting VPN connections between peers.
When the "VPN Registry: Disconnected" message appears on the Monitor > VPN status page for MX networks, it indicates that the appliance has been unable to establish connectivity with the VPN registry. This means that a firewall or other upstream device is either preventing traffic from reaching the VPN registry or preventing the registry's responses from returning to the appliance.
In the example packet capture below, an appliance is attempting to reach the VPN registry on UDP port 9350, but is receiving no response because an upstream firewall is preventing the outbound traffic:
In this example, the appropriate firewall rules have been added to allow the traffic to the VPN registry, and responses can be seen:
If this occurs, make sure that any upstream firewalls are configured to allow traffic to the IP addresses and ports listed on the Help > Firewall info page, particularly those for the VPN registry. The firewalls should also allow return traffic from established connections (stateful firewalls allow this by default).
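The comparison between the two captures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UdpPacket` record shape, the function name, and the sample IP addresses are hypothetical and not part of any Meraki tool. Only UDP port 9350 comes from this article. Given packets summarized from a capture, the function reports which registry addresses the appliance contacted and whether any reply came back from each.

```python
from dataclasses import dataclass

REGISTRY_PORT = 9350  # UDP port used to reach the VPN registry, per this article


@dataclass(frozen=True)
class UdpPacket:
    """One UDP packet summarized from a capture (hypothetical record shape)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def registry_reply_status(packets, appliance_ip):
    """Map each registry IP the appliance contacted to True if a reply was seen."""
    status = {}
    # First pass: record every registry address the appliance sent traffic to.
    for p in packets:
        if p.src_ip == appliance_ip and p.dst_port == REGISTRY_PORT:
            status.setdefault(p.dst_ip, False)
    # Second pass: mark the addresses that sent something back to the appliance.
    for p in packets:
        if (p.dst_ip == appliance_ip
                and p.src_port == REGISTRY_PORT
                and p.src_ip in status):
            status[p.src_ip] = True
    return status


# Sample data only; these addresses are placeholders from documentation ranges.
blocked = [UdpPacket("192.168.1.2", 40000, "203.0.113.10", 9350)]
allowed = blocked + [UdpPacket("203.0.113.10", 9350, "192.168.1.2", 40000)]

print(registry_reply_status(blocked, "192.168.1.2"))  # {'203.0.113.10': False}
print(registry_reply_status(allowed, "192.168.1.2"))  # {'203.0.113.10': True}
```

An address that maps to `False` matches the first capture, where outbound traffic is sent but nothing returns. In that case the upstream firewall rules are the first thing to check.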
UDP hole-punching, the mechanism used to establish the VPN connections between Cisco Meraki devices, relies on a consistent IP address and port for both devices involved. Two VPN registry servers are used for redundancy, and both expect to see the device as available on the same public IP address and port.
However, some NAT devices (such as a firewall) will rewrite the source port differently for each VPN registry server. Other NAT devices or load balancers will attempt to spread the connections to each VPN registry server across two different public IP addresses. In both cases, the VPN connection fails and the NAT is marked as unfriendly.
In this example, the upstream firewall rewrites the source port differently for each outbound connection. Notice that the source port of the first connection is rewritten to 56125, while that of the second is rewritten to 56126. When the registry servers see different source ports, the NAT unfriendly error will appear:
In this example, the upstream firewall is load balancing connections over two WAN connections, and then performing NAT using two different public IP addresses. Notice that the first connection is sent from the 198.51.100.23 address, while the second is sent from 198.51.100.17 instead. When the registry servers see different source IP addresses, the NAT unfriendly error will appear:
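The consistency requirement behind both examples can be written as a short check. The sketch below is an assumption-laden illustration, not the registry's actual logic: the function name, the registry labels, and the result strings are hypothetical. It takes the public IP address and port that each registry server observed and reports whether they agree.

```python
def classify_nat(observations):
    """Classify NAT behavior from what each registry server observed.

    observations maps a registry label (hypothetical names) to the
    (public_ip, public_port) pair that the server saw for the device.
    """
    ips = {ip for ip, _port in observations.values()}
    ports = {port for _ip, port in observations.values()}
    if len(ips) > 1:
        return "unfriendly: source IP differs between registry servers"
    if len(ports) > 1:
        return "unfriendly: source port differs between registry servers"
    return "friendly"


# Port rewritten differently per connection (values from the first example).
print(classify_nat({
    "registry-1": ("198.51.100.23", 56125),
    "registry-2": ("198.51.100.23", 56126),
}))

# Connections balanced across two public IPs (values from the second example).
print(classify_nat({
    "registry-1": ("198.51.100.23", 56125),
    "registry-2": ("198.51.100.17", 56125),
}))
```

Only the case where both servers see the same address and the same port is reported as friendly, which is the condition UDP hole-punching relies on.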
If using a load balancer, or NAT across multiple public IP addresses, map traffic from the internal address of the appliance to a single public IP address. This will keep the public IP address seen by the VPN registry consistent.
Select an arbitrary port that will be used for all VPN traffic to this MX (e.g. UDP port 51625). Manually create a port mapping on the upstream firewall that forwards all traffic received on a specific public IP address and port to the internal address of the appliance on the selected port. In Dashboard, on the Configure > Site-to-site VPN page, select the Manual: Port forwarding option for NAT traversal and enter the public IP address and port that were configured. All peers will then connect using this IP address and port combination.
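Because the values entered in Dashboard must match the mapping on the upstream firewall, it can help to compare the two before saving. The following is a minimal sketch with hypothetical names (`PortForward`, `check_manual_port_forwarding`) and placeholder addresses. It does not read from any firewall or from Dashboard, and the values are supplied by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortForward:
    """A UDP port mapping on the upstream firewall (hypothetical record shape)."""
    public_ip: str
    public_port: int
    internal_ip: str
    internal_port: int


def check_manual_port_forwarding(rule, dashboard_ip, dashboard_port, selected_port):
    """Return a list of mismatches between the firewall rule and the Dashboard entry."""
    problems = []
    for name, port in (("public", rule.public_port), ("internal", rule.internal_port)):
        if not 1 <= port <= 65535:
            problems.append(f"{name} port {port} is outside 1-65535")
    if rule.internal_port != selected_port:
        problems.append("rule does not forward to the port selected for VPN traffic")
    if rule.public_ip != dashboard_ip:
        problems.append("Dashboard public IP does not match the firewall rule")
    if rule.public_port != dashboard_port:
        problems.append("Dashboard port does not match the firewall rule")
    return problems


# Placeholder values; 51625 is the example port mentioned above.
rule = PortForward("198.51.100.23", 51625, "192.168.1.2", 51625)
print(check_manual_port_forwarding(rule, "198.51.100.23", 51625, 51625))  # []
```

An empty list means the Dashboard entry and the firewall mapping describe the same public IP address and port.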
When using teleworker VPN to tunnel AP traffic back to either an MX Appliance or VM concentrator, devices must still connect to the VPN registry as described above. However, the tools and information used for testing are different.
In the AP network, on the Configure > Access control page, under Addressing and traffic, use the Test connectivity button next to the selected Concentrator to test whether all APs can connect to the concentrator. If using a VM concentrator, it is also possible to use the SSID status live tool on the Concentrator > VM status page.
If all of the APs fail to pass the connectivity test, the issue is most likely on the concentrator/appliance end. Packet captures can be performed on the APs, appliance, or other points in the network if possible, to determine where the traffic is being blocked. If traffic appears to be successful to/from the VPN registry, refer to the article on troubleshooting VPN connections between peers to investigate potential issues between the APs and the concentrator/appliance.