Cisco Meraki Documentation

Tag-Based IPsec VPN Failover

Authors: Mitchell Gulledge, Raul Ricano and Chris Weber

This document describes the benefits and uses of Tag-Based VPN Failover. It serves as a reference for the optimal architecture so that customers get the most benefit from this technology.

Overview

Tag-Based VPN Failover is used for third-party data center failover and OTT SD-WAN integration. It is accomplished by using the API at each branch or data center. Each MX appliance uses IPsec VPN to cloud VPN nodes, and the API, together with IPsec, facilitates the dynamic tag allocation.

Use case topology -- branch config tracks the public IPs of Primary and Secondary tunnels.

A typical VPN topology for enterprise routing can be seen above. In this use case, the design provides DC-DC failover for branch (spoke) sites. If there is a failure on any of the monitored IPs (IPsec peers), the branch fails over securely and reliably to the secondary peer once the loss is detected. For DC-DC failover to be achieved, the following behavior must occur:

 

  • Spoke sites will form a VPN tunnel to the primary DC

    • Dual active VPN tunnels to both DCs are not possible with IPsec: interesting traffic is often needed to bring up an IPsec tunnel, and that traffic is always routed to the first tunnel/peer configured, never the second

    • Each spoke is configured with a tracked IP of its primary DC on the traffic shaping page

  • If the tracked IP experiences loss in the last 5 minutes, the API script (below) re-tags the network in order to swap to the secondary IPsec VPN tunnel

  • Once the tracked IP has had no loss for the last 5 minutes, the tags are swapped back to return to the primary DC (waiting for a clean 5-minute window avoids flapping)
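The failover and failback rules in the list above can be sketched as a small decision function. This is only an illustration, not the sample script itself: the name decide_action and its parameters are hypothetical, and the 30% default threshold is taken from the sample script later in this document.

```python
def decide_action(loss_samples, failed_over, threshold=30.0):
    """Return 'failover', 'failback', or 'none' for one tracked IP.

    loss_samples: loss percentages reported for the last 5 minutes
                  (None means no data for that sample).
    failed_over:  True if the network currently uses the secondary tunnel.
    threshold:    loss percentage treated as a failure (assumed value).
    """
    has_loss = any(s is not None and s >= threshold for s in loss_samples)
    if has_loss and not failed_over:
        return "failover"   # swap tags to the secondary peer
    if not has_loss and failed_over:
        return "failback"   # clean window: swap tags back to the primary
    return "none"           # nothing to change


print(decide_action([0.0, 41.7, 0.0], failed_over=False))  # failover
print(decide_action([0.0, 41.7, 0.0], failed_over=True))   # none
print(decide_action([0.0, 0.0, 0.0], failed_over=True))    # failback
```

Keeping the decision separate from the API calls makes the thresholds easy to test without touching a live organization.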

Sample API Solution

The code in this section is one sample Python implementation of this solution. The sections below describe how it works.

Prerequisites

Add your API key and org ID to the api_key and url variables in the code, replacing the <API Key> and <org_ID> placeholders.

Topology

Branch has an active and standby tunnel

Dashboard Configuration
Tracked IPs

Navigate to Security & SD-WAN > Configure > SD-WAN & Traffic Shaping and add the IP of the primary peer under Uplink statistics. The MX will start sending ICMP requests to this IP to track reachability. This data can be viewed on the Security & SD-WAN > Monitor > Appliance Status > Uplink page and can also be obtained via the API.
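As a rough illustration of working with the tracked-IP data once it has been fetched from the API, the sketch below picks out the worst recent loss for a tracked IP. The helper names are hypothetical, and the sample response is hand-written to resemble the fields the sample script reads (networkId, uplink, ip, timeSeries, lossPercent); the serial, timestamps, and latency values are made up.

```python
def worst_loss(entry):
    """Highest lossPercent in an entry's timeSeries, ignoring samples with no data."""
    losses = [s["lossPercent"] for s in entry["timeSeries"]
              if s.get("lossPercent") is not None]
    return max(losses) if losses else None


def loss_by_tracked_ip(entries, tracked_ips):
    """Map (networkId, ip) to the worst recent loss, for the tracked IPs only."""
    return {(e["networkId"], e["ip"]): worst_loss(e)
            for e in entries if e["ip"] in tracked_ips}


# Hand-written sample data (assumed shape, not real API output)
sample_response = [
    {"networkId": "N_573083052582988629", "serial": "Q2XX-XXXX-XXXX",
     "uplink": "wan2", "ip": "192.168.128.201",
     "timeSeries": [
         {"ts": "2024-01-01T00:00:00Z", "lossPercent": 0.0, "latencyMs": 12.3},
         {"ts": "2024-01-01T00:01:00Z", "lossPercent": 41.7, "latencyMs": 15.0},
         {"ts": "2024-01-01T00:02:00Z", "lossPercent": None, "latencyMs": None},
     ]},
    {"networkId": "N_573083052582988629", "serial": "Q2XX-XXXX-XXXX",
     "uplink": "wan1", "ip": "8.8.8.8",
     "timeSeries": [
         {"ts": "2024-01-01T00:00:00Z", "lossPercent": 0.0, "latencyMs": 9.8},
     ]},
]

print(loss_by_tracked_ip(sample_response, {"192.168.128.201"}))
```

Filtering on an explicit set of tracked IPs is one option; the sample script instead excludes 8.8.8.8 and the wan1 uplink.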

Uplink statistics - test connectivity to (IP), Description, Default and actions

Network Tags

Navigate to Organization > Monitor > Overview. Select the network you wish to tag and add one tag for each IPsec peer. Tags should be in the format:

<identifier>_<primary/backup>_<state(up/down)>

 

As an example, if the primary VPN endpoint is London and the backup is Paris, the tags would be:

 

london_primary_up (default state for primary is up)

paris_backup_down (default state for the backup is down)

 

The script below will change the up/down state of these tags when loss is detected on the primary peer (tracked per the section above).
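To make the tag format concrete, here is a minimal sketch that parses a tag into its three parts and flips its state. The function names parse_tag and flip_state are hypothetical helpers for illustration; they assume the identifier may itself contain underscores, so the tag is split from the right.

```python
def parse_tag(tag):
    """Split '<identifier>_<primary/backup>_<up/down>' into its three parts."""
    # rsplit keeps identifiers such as 'new_york' intact;
    # unpacking raises ValueError if the tag has fewer than three parts
    identifier, role, state = tag.rsplit("_", 2)
    if role not in ("primary", "backup") or state not in ("up", "down"):
        raise ValueError(f"not a failover tag: {tag!r}")
    return identifier, role, state


def flip_state(tag):
    """Return the same tag with its up/down state inverted."""
    identifier, role, state = parse_tag(tag)
    new_state = "down" if state == "up" else "up"
    return f"{identifier}_{role}_{new_state}"


print(parse_tag("london_primary_up"))   # ('london', 'primary', 'up')
print(flip_state("london_primary_up"))  # london_primary_down
print(flip_state("paris_backup_down"))  # paris_backup_up
```

Validating the role and state up front means unrelated network tags are rejected instead of being rewritten by mistake.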


Org > Networks > shows tags

Site to Site VPN

Navigate to Security & SD-WAN > Configure > Site-to-Site VPN and add one peer for the primary and one for the secondary. Both peers have the same private subnets, but this does not cause an overlap conflict because each peer is scoped to a different network tag with the availability selector. Tag each peer with its corresponding tag configured in the section above.

NMVPN Config that has those network tags in there for primary/secondary under Availability

Code

The below code is for reference only.  Meraki support does not assist with scripting.

import requests, json, time

api_key = '<API Key>'
url = 'https://api.meraki.com/api/v1/organizations/<org_ID>/devices/uplinksLossAndLatency'
header = {"X-Cisco-Meraki-API-Key": api_key, "Content-Type": "application/json"}

# Networks currently failed over to the backup peer
networkDownList = []

def swap_tags(tags, failover):
    # API v1 returns network tags as a list of strings, not a space-separated string.
    # failover=True:  *_primary_up -> *_primary_down and *_backup_down -> *_backup_up
    # failover=False: the reverse. Tags that match neither pattern are kept as-is.
    swaps = {"_primary_up": "_primary_down", "_backup_down": "_backup_up"}
    if not failover:
        swaps = {new: old for old, new in swaps.items()}
    swapped = []
    for tag in tags:
        for old, new in swaps.items():
            if tag.endswith(old):
                tag = tag[:-len(old)] + new
                break
        swapped.append(tag)
    return swapped

while True:
    # Loss and latency for the tracked IPs in the organization
    response = requests.get(url, headers=header)
    for network in response.json():
        # Only process entries that are not the default 8.8.8.8 target and not on wan1;
        # adjust this filter to match the tracked IP and uplink in your environment
        if network['ip'] != '8.8.8.8' and network['uplink'] != "wan1":
            print(network['networkId'])
            print(network['ip'])
            network_url = "https://api.meraki.com/api/v1/networks/" + network['networkId']
            loss = False
            for iteration in network['timeSeries']:
                # Skip samples with no loss value, then compare against the 30% threshold
                if iteration['lossPercent'] is not None and iteration['lossPercent'] >= 30:
                    loss = True
                    network_info = requests.get(network_url, headers=header)
                    print(network_info.json()['name'])
                    tags = network_info.json()['tags']
                    if any(tag.endswith("_primary_down") for tag in tags):
                        print("VPN already swapped")
                        break
                    else:
                        print("Need to change VPN, recent loss - " + str(iteration['lossPercent']))
                        payload = {'tags': swap_tags(tags, failover=True)}
                        requests.put(network_url, data=json.dumps(payload), headers=header)
                        networkDownList.append(network['networkId'])
                        break
            # No loss in the reported window and we failed over earlier: swap back
            if not loss and network['networkId'] in networkDownList:
                print("Primary VPN healthy again..swapping back")
                network_info = requests.get(network_url, headers=header)
                payload = {'tags': swap_tags(network_info.json()['tags'], failover=False)}
                requests.put(network_url, data=json.dumps(payload), headers=header)
                networkDownList.remove(network['networkId'])
                print(networkDownList)
    print("Sleeping for 30s...")
    time.sleep(30)

Note: This is a sample script that can be used as a reference to create a custom script. Cisco Meraki support cannot troubleshoot any third-party scripts that are based on or similar to this one.

Sample Output
N_573083052582988629 <--Network we are tracking
192.168.128.201 <--Primary VPN hub we are tracking
SD-WAN Hub <-- Network Name
Need to change VPN, recent loss - 41.7 <--Packet loss of 41.7% detected.  Script above set to failover on 30% loss
Sleeping for 30s... <--continues to repeat process every 30s (adjustable in script)
N_573083052582988629
192.168.128.201
SD-WAN Hub
VPN already swapped
Sleeping for 30s...
N_573083052582988629
192.168.128.201
SD-WAN Hub
VPN already swapped
Sleeping for 30s...
.
...Repeats until 5 minutes of 0% loss
.
Sleeping for 30s...
N_573083052582988629
192.168.128.201
Primary VPN healthy again..swapping back <--Hasn't been any packet loss on the tracked IP for 5 minutes.  Swap back
Sleeping for 30s...
Tags Before Failover

Tags before failover >> Primary up, backup down

 

Tags During Failover

Tags during failover >> Primary down, backup up