Site-to-Site VPN Troubleshooting

Meraki site-to-site VPN makes it easy to connect remote networks and share network resources. If the VPN fails or network resources are inaccessible, there are several places to look in Dashboard to quickly resolve most problems. This article provides an overview of common site-to-site VPN issues and the recommended troubleshooting steps.

Troubleshooting

If there appears to be an issue with VPN, start by referencing the Security & SD-WAN > Monitor > VPN status page to check the health of the appliance's connection to the VPN registry and the other peers. If one specific tunnel is having issues, it may be helpful to check the status page for the networks of each peer in case one of them is offline or disconnected from the registry:

Screenshot of the VPN status page. The following message is displayed on the right side of the screen: "Connectivity: Disconnected. This WAN appliance is currently unreachable from the Meraki cloud."

The following sections outline common issues with site-to-site VPN and recommended troubleshooting steps:

Can't ping or access network resources on the other network

If you are unable to connect to devices on the other network from your site:

Are both devices online and connected to the registry?
- As outlined above, be sure to check the Security & SD-WAN > Monitor > VPN status page for each side's Dashboard network.
Is the subnet you're trying to reach advertised over VPN?
- On the remote side's Dashboard network, navigate to Security & SD-WAN > Configure > Site-to-site VPN. Under Local networks, make sure the Use VPN toggle is set to Yes for the subnet you're trying to reach. You should also check these settings on your local site's Dashboard network to ensure that the subnet you're connecting from is also advertised.
- If using a full tunnel configuration, bear in mind that when specifying a prefix to be part of a VPN, everything covered by that prefix will be allowed in the VPN. Therefore, subnets that overlap will cause traffic in a more specific subnet to be sent through the VPN, even if it is not configured to be included in the VPN. For example, if 10.0.0.0/16 is configured to be included in the VPN but 10.0.1.0/24 is not, traffic sourced from 10.0.1.50 will still be sent over the VPN.
Are any firewalls blocking this traffic on the network?
- In addition to any non-Meraki firewalls on the network that may be blocking this traffic (including firewalls that may be enabled on the device you're trying to access), check the Security & SD-WAN > Configure > Site-to-site VPN > Organization-wide settings section to see if there are any Site-to-site outbound firewall rules.
Are there any problems reaching out to non-VPN peers?
- Try sending pings or traceroutes to public IPs (such as 8.8.8.8) or accessing public websites to determine whether the problem is limited to VPN or affects general Internet connectivity as well.
- Try pinging the public IP of the other WAN Appliance from your local network. If this fails but general Internet connectivity appears to be fine, there is likely an upstream ISP routing issue that is preventing the two sites from communicating directly even though they both have Internet access and are connected to the VPN registry.
Are there routes configured on both sides that point to the remote subnets?
- If the WAN Appliance is not the only gateway in the network (e.g. the WAN Appliance is connected to a layer 3 switch or router with its own directly connected networks), any devices that are not using the WAN Appliance as their gateway will need their traffic routed to the WAN Appliance in order to send traffic across the VPN. Make sure any other routing devices on the network have a route that allows them to access the remote VPN subnets via the WAN Appliance's local IP address.
- For extensive details on deploying the WAN Appliance as a VPN concentrator, please refer to our VPN Concentrator Deployment Guide.
Are these devices on non-overlapping subnets?
- If the device on each end is on a subnet that overlaps with the other side, the WAN Appliance will be unable to route traffic to the other side as it will believe the traffic is destined for the local network. It is recommended to have unique subnets with no overlap on each network connected to the VPN.
- If identical networks are required on each side of a tunnel, you may need to enable VPN Subnet Translation. Please note that this feature does not allow for partial overlap between subnets, and is not supported with non-Meraki VPN peers.
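
The two subnet checks above (whether a host falls inside an advertised prefix, and whether subnets on the two sides overlap) can be sketched with Python's standard ipaddress module. This is a minimal illustration only; the function names and the sample subnets other than 10.0.0.0/16 and 10.0.1.50 are hypothetical and are not taken from Dashboard.

```python
import ipaddress

def covered_by_vpn(host, advertised_prefixes):
    """Return the advertised prefixes that contain the given host address."""
    addr = ipaddress.ip_address(host)
    return [p for p in advertised_prefixes if addr in ipaddress.ip_network(p)]

def overlapping_pairs(local_subnets, remote_subnets):
    """Return (local, remote) subnet pairs whose address ranges overlap."""
    pairs = []
    for local in local_subnets:
        for remote in remote_subnets:
            if ipaddress.ip_network(local).overlaps(ipaddress.ip_network(remote)):
                pairs.append((local, remote))
    return pairs

# 10.0.1.50 sits inside 10.0.0.0/16, so it is covered even though
# 10.0.1.0/24 itself was never selected for the VPN.
print(covered_by_vpn("10.0.1.50", ["10.0.0.0/16"]))

# Hypothetical subnets: the first remote subnet overlaps the local one.
print(overlapping_pairs(["192.168.1.0/24"], ["192.168.0.0/16", "10.5.0.0/24"]))
```

An empty list from overlapping_pairs means the two sides use unique subnets. Any pair it returns is a candidate for renumbering or, for identical networks, VPN Subnet Translation.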

VPN status page reports an unfriendly NAT or disconnected from VPN Registry

If the Security & SD-WAN > Monitor > VPN status page for a given network reports either "NAT type: Unfriendly" or "VPN Registry: Disconnected", there is likely a device upstream of the WAN Appliance for that site that is preventing AutoVPN from working correctly.

NAT type: Unfriendly indicates that the upstream NAT won't allow the WAN Appliance to use UDP hole punching to form the tunnel. It is recommended to set NAT traversal to Manual: Port forwarding to bypass this issue.
VPN Registry: Disconnected indicates that the upstream device is not allowing the WAN Appliance to communicate with the VPN registry. It is recommended to configure any upstream firewalls to allow the traffic listed in Dashboard under Help > Firewall info.

For more information on these two error messages and VPN registry troubleshooting in general, reference our documentation regarding Troubleshooting VPN Registration for Meraki AutoVPN.

Problems with VPN between Meraki MX/Z-series and a non-Meraki peer

If you are having issues with a non-Meraki VPN connection and the above troubleshooting tips did not resolve the issue, reference our documentation regarding Troubleshooting Non-Meraki Site-to-Site VPN Peers.

Troubleshooting with Packet Captures

Packet Captures are a powerful tool to troubleshoot VPN connectivity. They can be especially useful in scenarios when traffic is not flowing as expected through the tunnel, or the AutoVPN tunnel is failing to establish. The following sections will outline the general approach for troubleshooting AutoVPN connectivity using packet captures.

Identify Source and Destination Sites

To properly troubleshoot VPN connectivity, we first need to identify the two sites that are having connectivity issues and the traffic flow that would be expected between them if the VPN were functioning. Begin by identifying the two sites involved in the communication.

Once the sites have been identified, determine where the connection originates and where it is intended to terminate. The site where the connection originates becomes the Source site/MX, and the site of the intended destination becomes the Destination site/MX.

Identify AutoVPN Traffic on the WAN

To properly identify the WAN traffic being passed between the two sites, we will first need to know the public IP and port, as well as the IP and port assigned to the currently active physical WAN interface of each MX.

From the Security & SD-WAN > Monitor > VPN status page of each MX, check the NAT type box to see both the physical WAN IP and the public IP being used for the VPN connection. For each IP, a port will also be listed.

If the MX is using a routable public address directly on its WAN interface (i.e., the MX WAN is not behind a NAT), the public IP and physical WAN IP may be identical. If they are the same, only one IP will be displayed in the NAT type box on the VPN status page. If the MX is behind a NAT with different physical WAN and public IPs, the UDP port used by the public IP and the physical WAN IP may be the same, depending on the NAT type.
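
As a rough sketch of how the two address/port pairs from the NAT type box relate, the snippet below compares them. The function name, the returned keys, and the sample addresses (taken from documentation-reserved ranges) are all hypothetical; the values themselves must be read manually from the VPN status page.

```python
import ipaddress

def describe_wan_addressing(wan, public):
    """Compare the physical WAN (ip, port) pair with the public (ip, port) pair."""
    wan_ip, wan_port = ipaddress.ip_address(wan[0]), wan[1]
    pub_ip, pub_port = ipaddress.ip_address(public[0]), public[1]
    return {
        # Different addresses mean an upstream device is translating the traffic.
        "behind_nat": wan_ip != pub_ip,
        # Some NATs keep the same UDP port on the public side, others change it.
        "port_preserved": wan_port == pub_port,
    }

# Hypothetical values: the MX WAN uses a private address behind an upstream NAT.
print(describe_wan_addressing(("192.168.1.10", 45000), ("203.0.113.7", 45000)))
```

When behind_nat is true, captures taken on the MX Internet interface will show the physical WAN IP and port, while the remote site will see the public IP and port.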

If the NAT Type is Unfriendly, MX IP and port details will not be displayed in the box, and the VPN-unfriendly NAT issue should be addressed first.

When using MX in an HA configuration, a virtual IP may be used on the WAN interface in addition to the physical WAN IP. If a virtual IP is configured, be aware that inbound and outbound traffic for that MX will use the virtual IP instead of the physical WAN IP.

Identify LAN traffic

After gathering the information needed to identify the relevant WAN traffic in a packet capture, we next need to gather IP and port information about the relevant traffic as seen directly on the LAN of each site. Here, the relevant traffic is the traffic affected by the connectivity issue being investigated.

Note down the initial source IP and port(s) as well as the initial destination IP and port(s) being used by hosts on the LAN of the MX on each side. If we do not know the specific source or destination port, knowing the source and destination LAN IPs of both clients will at least allow us to confirm that layer 3 traffic can pass over the VPN between the two sites as intended.

Be aware that return or response traffic generally uses the same port numbers as the initiating traffic, but in reverse.
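
To make the "same ports, but in reverse" point concrete, here is a small sketch that derives the expected reply flow from the initiating flow. The Flow type, the function name, and the sample addresses and ports are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple

class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

def expected_reply(flow):
    """Swap source and destination to get the flow a response would normally use."""
    return Flow(flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port)

# Hypothetical client at the Source site connecting to a server at the Destination site.
request = Flow("10.0.1.50", 51515, "10.1.20.10", 443)
print(expected_reply(request))
```

If the request appears in the LAN capture at the Destination site but no packet matching the expected reply follows, the problem is more likely on the destination host or its return path than in the tunnel itself.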

Packet Captures

Once the source and destination sites and expected traffic flow have been identified, we can now collect the Dashboard Packet Captures to compare real and intended traffic flow and identify where the connection is failing. To have a full understanding of the real traffic flow in a given moment, we can collect packet captures from LAN and Internet interfaces of both MXs at the same time. To collect several packet captures simultaneously, we can open the Network-wide > Packet Capture page multiple times in separate tabs.

In order to view only the relevant traffic in the packet captures, we should apply filters based on the identified IP addresses and ports. For Internet captures, use the WAN IPs and ports, and for LAN captures, use the LAN IPs and ports. When a high amount of traffic is being passed through the interface, it is important to use filters since a single packet capture is limited to 100,000 packets.
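
The filters described above can be assembled from the values gathered in the previous steps. The sketch below assumes that the filter expression field on the packet capture page accepts tcpdump-style syntax; the helper name and every IP address and port shown are hypothetical placeholders.

```python
def capture_filter(ip_a, ip_b, port=None, proto=None):
    """Build a tcpdump-style expression matching traffic between two hosts."""
    parts = []
    if proto:
        parts.append(proto)
    parts.append(f"host {ip_a}")
    parts.append(f"host {ip_b}")
    if port is not None:
        parts.append(f"port {port}")
    return " and ".join(parts)

# Internet capture: the WAN-side IPs of the two MXs and the UDP port from the VPN status page.
print(capture_filter("203.0.113.7", "198.51.100.20", port=45000, proto="udp"))

# LAN capture: the two LAN hosts, with the port left out because it is not known.
print(capture_filter("10.0.1.50", "10.1.20.10"))
```

Leaving the port out keeps the filter broad enough to confirm basic layer 3 reachability, while adding it narrows the capture so that the packet limit is less likely to be reached on busy interfaces.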

If the MX has multiple WAN interfaces, make sure to select the active one when collecting packet captures.