NAT Mode Warm Spare (NAT HA)

This guide details the various deployment architectures that are possible for MX Warm Spare functionality in NAT mode and explains the advantages and disadvantages of each.

Use Case

In most customer deployments, network downtime has a direct impact on the business and should be avoided at all costs. Warm Spare functionality prevents the network from having a single point of failure at the edge and allows for fast, automatic recovery in the event of device failure, reducing the negative impact on end-user services.

Benefits:

  • In the event of hardware failure, network downtime will be greatly reduced or eliminated entirely depending on the architecture being used.
  • No manual intervention by the network administration team will be required to facilitate recovery from a hardware failure.

Terminology

This document uses the following terms:

Primary: The MX that is configured as the "main" MX for the network. If both MXes are online, this is the MX that traffic should be flowing through. This is a static designation, meaning that regardless of the current state of the network the Primary will always be the Primary.

Spare (Secondary): The MX that is configured as the "secondary" MX for the network. If both MXes are online, this is the MX that is the inactive warm spare. This is a static designation, meaning that regardless of the current state of the network the Spare will always be the Spare. This document uses "Spare" and "Secondary" interchangeably.

Active (Master): The MX that is currently acting as the edge firewall/security appliance for the network. This is a dynamic designation.

Passive: The MX that is currently acting as an inactive warm spare with no traffic passing through it. This is a dynamic designation.

Dual Master: A scenario in which both the Primary and the Spare are in the Active state. This occurs when both MXes are online and communicating with the cloud, but the Secondary is not receiving heartbeat packets (see VRRP Heartbeats in the next section) from the Primary. This can cause several issues with Dynamic DNS, VPN, and traffic processing in general, and should be avoided. The Recommended Topologies section of this document describes how to deploy an MX Warm Spare pair in order to minimize the chances of a Dual Master scenario occurring.

Underlying Concepts and Technologies

VRRP Heartbeats

Failure detection for an MX Warm Spare pair uses VRRP heartbeat packets. These heartbeat packets are sent from the Primary MX to the Secondary MX on all configured VLANs to indicate that the Primary is online and functioning properly. As long as the Secondary is receiving these heartbeat packets, it remains in the Passive state. If the Secondary stops receiving these heartbeat packets, it assumes that the Primary is offline and transitions into the Active (Master) state. When the MX is in NAT mode, VRRP heartbeats are not sent over the WAN, because there is no guarantee that the WAN interfaces can communicate with each other. Instead, a mechanism called "connection monitor" determines the WAN state of the device.
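The Secondary's decision described above can be sketched as a small state model. This is an illustrative toy, not Meraki's implementation: the class name, the function names, and the 3-second timeout are all assumptions, since this guide does not state the real heartbeat interval or timeout.

```python
from dataclasses import dataclass

# Hypothetical value: this guide does not state the real heartbeat timeout.
HEARTBEAT_TIMEOUT_S = 3.0


@dataclass
class SecondaryMX:
    """Toy model of the Secondary's role decision (not Meraki's implementation)."""

    last_heartbeat_at: float  # time in seconds when the last heartbeat was seen

    def on_heartbeat(self, now: float) -> None:
        """Record a VRRP heartbeat received from the Primary."""
        self.last_heartbeat_at = now

    def role(self, now: float) -> str:
        """Passive while heartbeats are fresh, Active once they stop arriving."""
        if now - self.last_heartbeat_at <= HEARTBEAT_TIMEOUT_S:
            return "passive"
        return "active"


def is_dual_master(primary_is_active: bool, secondary: SecondaryMX, now: float) -> bool:
    """Both MXes are Active: the Primary is up but its heartbeats are not arriving."""
    return primary_is_active and secondary.role(now) == "active"
```

The `is_dual_master` helper shows why the LAN path between the two MXes matters: if the Primary is healthy but its heartbeats are lost in transit, the model reports both devices as Active.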

For more in-depth information about VRRP mechanics on the MX, see the NAT HA Failover Behavior documentation.

Connection Monitor

Connection monitor is an uplink monitoring engine built into every MX Security Appliance. The mechanics of the engine are described in a separate article. When connection monitor marks all uplinks of the Primary MX as failed, the Primary stops sending VRRP heartbeat packets, which initiates a Warm Spare failover. Once at least one uplink on the Primary returns to a working state, the Primary resumes sending heartbeat packets and the Secondary relinquishes the Active role back to the Primary.
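As a rough sketch of the rule above, the Primary keeps sending heartbeats as long as at least one uplink is healthy. The function names and the uplink labels (`wan1`, `wan2`) are hypothetical, and the sketch assumes the Secondary is online and the LAN path between the two MXes is intact.

```python
from typing import Mapping


def primary_sends_heartbeats(uplink_ok: Mapping[str, bool]) -> bool:
    """The Primary stops sending heartbeats only when every uplink has failed."""
    return any(uplink_ok.values())


def expected_active(uplink_ok: Mapping[str, bool]) -> str:
    """Which MX should hold the Active role, given the Primary's uplink health.

    Assumes the Secondary is online and receives any heartbeats that are sent.
    """
    return "primary" if primary_sends_heartbeats(uplink_ok) else "secondary"
```

For example, losing one of two uplinks on the Primary does not trigger a failover in this model; only losing both does, and the role returns to the Primary as soon as either uplink recovers.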

DHCP Synchronization

To prevent a scenario in which an IP address is assigned by the Primary via DHCP and then that same address is assigned to another client by the Secondary after a failover, the DHCP lease table is synchronized regularly between the Primary and Secondary.
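The sketch below illustrates why the synchronization matters. It is not how the MX implements it: this guide does not describe the sync mechanism or interval, so the full-table copy, the MAC-to-IP lease dictionary, and the function names are all assumptions made for illustration.

```python
import ipaddress
from typing import Dict, Optional


def sync_leases(primary_leases: Dict[str, str], secondary_leases: Dict[str, str]) -> None:
    """Overwrite the Secondary's lease table (MAC -> IP) with the Primary's."""
    secondary_leases.clear()
    secondary_leases.update(primary_leases)


def next_free_address(pool_cidr: str, leases: Dict[str, str]) -> Optional[str]:
    """Return the first host address in the pool that is not already leased."""
    in_use = set(leases.values())
    for host in ipaddress.ip_network(pool_cidr).hosts():
        if str(host) not in in_use:
            return str(host)
    return None


primary = {"aa:bb:cc:00:00:01": "192.168.1.1", "aa:bb:cc:00:00:02": "192.168.1.2"}
secondary: Dict[str, str] = {}

# Without a sync, the Secondary would hand out an address the Primary already leased.
conflicting = next_free_address("192.168.1.0/29", secondary)

sync_leases(primary, secondary)
safe = next_free_address("192.168.1.0/29", secondary)
```

Here `conflicting` is an address already held by a client of the Primary, while `safe` is the first address that neither client holds.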

Dashboard Configuration

To configure warm spare failover for an existing Dashboard network, navigate to Security appliance > Appliance status and select Configure warm spare near the upper-left side of the page, below the device name. In the window that appears, select Enabled. Enter the serial number of the Secondary MX and select the desired Uplink IP configuration, then select Update to enable Warm Spare.

Use MX uplink IPs: When using this option, the current Active MX will use its own distinct uplink IP or IPs when sending traffic out to the Internet. This option does not require additional public IPs for Internet-facing MXes, but it results in a more disruptive failover because the source IP of outbound flows will change.

Use virtual uplink IPs: When using this option, both MXes will use a shared virtual IP (vIP) when sending traffic out to the Internet. This option requires an additional public IP per uplink but allows for seamless failover because the IP address the network is using to communicate with the Internet will be consistent. The vIP for each uplink must be in the same subnet as the IPs of the MXes themselves for that uplink, and the vIP must be different from both MX uplink IPs.
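The two vIP constraints above (same subnet as both MX uplink IPs, and different from both) can be checked before entering values in Dashboard. The following is a minimal sketch using Python's standard `ipaddress` module. The function name and the example addresses (taken from documentation ranges) are illustrative only.

```python
import ipaddress
from typing import List


def check_virtual_ip(vip: str, primary_ip: str, spare_ip: str, prefix_len: int) -> List[str]:
    """Return a list of problems with a proposed vIP for one uplink (empty if none)."""
    problems = []
    vip_addr = ipaddress.ip_address(vip)
    for label, mx_ip in (("Primary", primary_ip), ("Spare", spare_ip)):
        # Derive the uplink subnet from the MX's own IP and prefix length.
        subnet = ipaddress.ip_interface(f"{mx_ip}/{prefix_len}").network
        if vip_addr not in subnet:
            problems.append(f"vIP {vip} is not in the {label} uplink subnet {subnet}")
        if vip_addr == ipaddress.ip_address(mx_ip):
            problems.append(f"vIP {vip} duplicates the {label} uplink IP")
    return problems
```

For example, with uplink IPs 203.0.113.2 and 203.0.113.3 on a /29, a vIP of 203.0.113.4 passes both checks, while 203.0.113.2 is rejected as a duplicate of the Primary's uplink IP.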

To configure a new network with warm spare failover, create the network as you would normally and add the Primary MX. Then add the Secondary MX using the process described above.

Regardless of which option is selected, both MX devices will need their own uplink IP addresses for Dashboard connectivity.

Dashboard configuration should always be performed before the Secondary MX is physically connected to the network. 
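For teams that script their configuration, the same settings can be expressed as a request body. The sketch below only builds the body and sends nothing. The field names (`enabled`, `spareSerial`, `uplinkMode`, `virtualIp1`, `virtualIp2`) and the `virtual`/`public` mode values are an assumption based on the Dashboard API v1 warm spare settings for a network's appliance; verify them against the current API reference before use. The function name and the serial number are placeholders.

```python
from typing import Optional


def warm_spare_payload(
    spare_serial: str,
    use_virtual_ips: bool,
    virtual_ip1: Optional[str] = None,
    virtual_ip2: Optional[str] = None,
) -> dict:
    """Build a warm spare settings body (field names are assumed, see lead-in)."""
    payload = {
        "enabled": True,
        "spareSerial": spare_serial,
        # "virtual" = Use virtual uplink IPs, "public" = Use MX uplink IPs (assumed values).
        "uplinkMode": "virtual" if use_virtual_ips else "public",
    }
    if use_virtual_ips:
        if virtual_ip1 is None:
            raise ValueError("a vIP is required for the first uplink in virtual mode")
        payload["virtualIp1"] = virtual_ip1
        if virtual_ip2 is not None:
            payload["virtualIp2"] = virtual_ip2
    return payload


# Placeholder serial number, for illustration only.
example = warm_spare_payload("QXXX-XXXX-XXXX", use_virtual_ips=True, virtual_ip1="203.0.113.4")
```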

Recommended Topologies

There are two physical architectures available for NAT Warm Spare deployments.

Fully Redundant (Multiple Switches)

In this architecture, the Primary and Secondary MXes are not directly connected, and VRRP heartbeats are carried between the downstream switches. This is the recommended architecture for most deployments, as there is no single point of failure in this topology.

[Figure: multiple_switch_HA]

Fully Redundant (Switch Stack)

In this architecture, the Primary and Secondary MXes are connected via a downstream switch stack. Each switch has at least one uplink to each MX. This ensures that there is no single point of failure in the topology.

[Figure: switch_stack_HA]

Additional Resources   

For more information about NAT HA, please refer to our documentation on Troubleshooting MX Warm Spare in NAT Mode (NAT HA).


Article ID

ID: 4178
