High-availability VPN connectivity is a must for businesses, hosting providers, and operators who rely on encrypted tunnels for site-to-site connectivity or remote access. When an IKEv2 tunnel goes down, properly designed failover mechanisms can keep traffic flowing with minimal disruption. This article walks through practical strategies to implement robust IKEv2 VPN failover on pfSense, covering configuration details, monitoring, routing behavior, and testing methodologies suitable for sysadmins, developers, and enterprise IT teams.
Why IKEv2 and pfSense for resilient VPNs?
IKEv2 is the preferred VPN protocol for many deployments due to its support for modern cryptographic suites, NAT traversal, MOBIKE (mobility and multi-homing), and fast rekeying. pfSense, a widely used open-source firewall and router platform, includes a mature implementation of IPsec/IKEv2 and a flexible routing and gateway monitoring framework. Combining IKEv2’s protocol capabilities with pfSense’s routing and failover constructs allows building reliable redundancy without excessive complexity.
Core concepts for effective IPsec failover
Before diving into configuration, it helps to understand the key concepts that make failover work in pfSense:
- Gateway Groups — pfSense supports grouping multiple WAN gateways and defining tiers and failover policies (failover, load balancing). These groups drive routing decisions for outbound traffic.
- IPsec Phase 1 / Phase 2 — IKEv2 Phase 1 (IKE SA) establishes the authenticated control channel, while Phase 2 (IPsec SA) negotiates the data-encryption parameters. Both SAs must be available for the tunnel to carry traffic.
- Dead Peer Detection (DPD) and Rekey — DPD detects unresponsive peers and can bring down the SA so pfSense can attempt reestablishment or failover. Proper DPD settings are vital for rapid detection.
- MOBIKE — For peers with dynamic interfaces (e.g., mobile clients, multi-WAN peers), MOBIKE allows changing the underlying IP without tearing down the IKE SA.
- Static Routes vs Policy-based Routing — Outbound VPN traffic can be influenced by firewall rules (policy-based routing) or by system routes. Correctly steering traffic is key for deterministic failover.
Architecture patterns for IKEv2 failover
There are several patterns depending on whether you control both VPN endpoints, whether the peer supports multiple addresses, and the level of stateful failover desired:
- Active/Passive Dual WAN on a single pfSense: One primary WAN and one secondary WAN, with the IPsec tunnel originating from the pfSense box. Gateway groups and DPD switch the outgoing interface and re-establish the SA when the primary WAN fails.
- Active/Active Dual IPsec Tunnels: Configure two separate IPsec tunnels to the same remote peer (or to two remote peers) using distinct source IPs. With routed tunnels, each tunnel can act as a gateway, so both can be combined in a gateway group for load sharing or failover. Useful with remote peers that support multiple endpoints.
- HA Cluster with CARP and State Synchronization: Two pfSense nodes in a CARP cluster share virtual IPs, with configuration synchronized over XMLRPC and firewall states synchronized over pfsync. Binding the tunnel to a CARP virtual IP keeps the endpoint address stable across a node failover. Whether IPsec SAs survive the switch depends on the platform; on pfSense the tunnel typically renegotiates from the new primary, while synchronized firewall states help existing connections resume.
- Remote Multi-homing (MOBIKE): If the remote endpoint supports MOBIKE or has multiple-site endpoints, it can switch its public address and keep the existing IKE SA instead of negotiating a new one, improving continuity.
Step-by-step: Implementing Active/Passive IKEv2 failover on pfSense
1. Prepare multiple WAN gateways
Ensure pfSense has two WAN interfaces with public IPs or NATed connectivity. Configure each interface under Interfaces and confirm the corresponding gateway entries in System → Routing → Gateways. For each gateway, enable monitoring by choosing a monitor IP that reflects real upstream reachability (e.g., 8.8.8.8 or your remote peer), and use a different monitor IP for each gateway. The monitoring results determine whether pfSense treats a gateway as healthy.
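To make the idea of gateway health concrete, here is a minimal sketch of how latency and packet-loss measurements could be turned into a status. The threshold values and the names `Thresholds` and `classify_gateway` are illustrative assumptions, not pfSense internals; check the values configured on your own gateway entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Illustrative values (assumptions); compare with your gateway settings.
    latency_warn_ms: float = 200.0
    latency_down_ms: float = 500.0
    loss_warn_pct: float = 10.0
    loss_down_pct: float = 20.0


def classify_gateway(avg_latency_ms: float, loss_pct: float,
                     t: Thresholds = Thresholds()) -> str:
    """Return 'down', 'warning', or 'online' for one monitoring window."""
    if loss_pct >= t.loss_down_pct or avg_latency_ms >= t.latency_down_ms:
        return "down"
    if loss_pct >= t.loss_warn_pct or avg_latency_ms >= t.latency_warn_ms:
        return "warning"
    return "online"


if __name__ == "__main__":
    for latency, loss in [(25, 0), (250, 0), (30, 25)]:
        print(f"latency={latency}ms loss={loss}% -> {classify_gateway(latency, loss)}")
```

Tighter thresholds fail over sooner but are more likely to react to a brief spike on an otherwise usable link.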
2. Create a Gateway Group
Under System → Routing → Gateway Groups, create a group with the primary WAN in Tier 1 and the secondary in Tier 2. Select the trigger level (e.g., “Packet Loss or High Latency” or “Member Down”) depending on how sensitive failover should be. The group can then be referenced in firewall rules, and in other settings that accept a gateway group, to steer traffic through the active WAN.
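The tier logic can be sketched in a few lines. This is a simplified model, assuming each gateway has a tier number and a boolean health flag; the function name `active_members` and the gateway names are hypothetical.

```python
from typing import Dict, List


def active_members(tiers: Dict[str, int], healthy: Dict[str, bool]) -> List[str]:
    """Return the gateways that should carry traffic.

    tiers maps gateway name -> tier number (1 is most preferred).
    healthy maps gateway name -> True if monitoring considers it usable.
    The lowest-numbered tier with at least one healthy member wins;
    several healthy members in that tier would share the load.
    """
    usable = [gw for gw in tiers if healthy.get(gw, False)]
    if not usable:
        return []
    best_tier = min(tiers[gw] for gw in usable)
    return sorted(gw for gw in usable if tiers[gw] == best_tier)


if __name__ == "__main__":
    tiers = {"WAN1_GW": 1, "WAN2_GW": 2}
    print(active_members(tiers, {"WAN1_GW": True, "WAN2_GW": True}))
    print(active_members(tiers, {"WAN1_GW": False, "WAN2_GW": True}))
```

Placing both gateways in the same tier would turn the same group into a load-balancing group rather than a failover group.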
3. Configure IKEv2 IPsec Phase 1 and Phase 2
Go to VPN → IPsec and add a new Phase 1 entry:
- Set the Key Exchange version to IKEv2.
- Authentication: Use Certificates for best security and smoother rekey. Pre-shared keys work too but are less flexible.
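- Interface: leave the default WAN unless you need to bind to a specific address (for multi-WAN setups you can bind to a particular WAN IP, a Virtual IP, or, where your version offers it, a gateway group).
- Enable Dead Peer Detection and set intervals that are aggressive but reasonable (e.g., a 20 s delay and a 60 s timeout, which the pfSense GUI may express as a delay and a maximum number of failures, depending on version) so failures are detected promptly without false positives.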
- Enable Mobility/Roaming (MOBIKE) if the peer supports it.
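As a rough guide to what DPD values mean for failover time, the sketch below computes a worst-case detection delay. It assumes a deliberately simplified model (one probe per delay interval, teardown after the timeout); real IKEv2 daemons may rely on their own retransmission timers, so treat the result as an estimate. The function name is hypothetical.

```python
def worst_case_detection_seconds(dpd_delay_s: float, dpd_timeout_s: float) -> float:
    """Rough upper bound on how long a dead peer can go unnoticed.

    Simplified model (an assumption, not a description of any specific
    IKE daemon): the peer dies right after a successful exchange, the
    next probe is sent dpd_delay_s later, and the SA is torn down
    dpd_timeout_s after that probe goes unanswered.
    """
    if dpd_delay_s <= 0 or dpd_timeout_s <= 0:
        raise ValueError("delay and timeout must be positive")
    return dpd_delay_s + dpd_timeout_s


if __name__ == "__main__":
    for delay, timeout in [(20, 60), (10, 30), (5, 15)]:
        bound = worst_case_detection_seconds(delay, timeout)
        print(f"delay={delay}s timeout={timeout}s -> up to ~{bound:.0f}s before failover starts")
```

If the GUI exposes a failure count instead of a timeout, delay multiplied by the failure count is a reasonable stand-in for the timeout in this model.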
Configure Phase 2 entries as required (encryption/auth algorithms, PFS group). Use modern algorithms (AES-GCM, SHA-2) and keep lifetimes short enough for security but not so short that they cause frequent rekeying.
4. Bind IPsec traffic to the Gateway Group
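pfSense sends IKE and ESP packets from the interface selected in Phase 1, following the system routing table. To make the tunnel follow the gateway group, you can:
- Select the gateway group as the Phase 1 interface, where your pfSense version offers it, so the tunnel originates from whichever WAN is currently active. The remote peer must then accept either WAN address, for example by pointing at a dynamic DNS hostname.
- Use policy-based routing on the LAN or a specific network: edit the firewall rule and set its gateway to the Gateway Group so that ordinary outbound traffic leaves via the active WAN. Keep traffic destined for the remote site out of such rules (for example, with a pass rule above them that has no gateway set), because policy routing can send that traffic out the WAN, bypassing the tunnel.
- Alternatively, with routed tunnels, create static routes for the known remote subnets that point at the tunnel's gateway (less common).
Note: IPsec initiation itself may follow the system default route. If the local endpoint IP matters to the remote peer, configure Virtual IPs or use CARP to keep a stable source IP.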
5. Tweak advanced IPsec options
- Tune rekeying parameters prudently. For example, use a rekey margin so that rekeying starts slightly before the SA lifetime expires, which reduces downtime.
- Use DPD with short intervals for quicker failover detection (balance with false positive risk on flaky networks).
- Consider enabling IPsec Mobile Client options if you use remote access VPNs; MOBIKE helps clients switch networks.
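The idea of rekeying slightly before the SA lifetime expires can be sketched as follows. This assumes a model with a hard lifetime, a rekey time, and a random amount subtracted from the rekey time, similar in spirit to what strongSwan-style daemons do; the parameter and function names are hypothetical.

```python
import random
from typing import Tuple


def rekey_window(life_s: int, rekey_s: int, rand_s: int) -> Tuple[int, int]:
    """Earliest and latest moment (seconds after SA creation) a rekey may start."""
    if not (0 <= rand_s <= rekey_s < life_s):
        raise ValueError("expected 0 <= rand_s <= rekey_s < life_s")
    return rekey_s - rand_s, rekey_s


def pick_rekey_time(life_s: int, rekey_s: int, rand_s: int,
                    rng: random.Random) -> float:
    """Draw one rekey moment; the jitter keeps both peers from rekeying at once."""
    earliest, latest = rekey_window(life_s, rekey_s, rand_s)
    return rng.uniform(earliest, latest)


if __name__ == "__main__":
    life, rekey, rand = 3600, 3240, 360  # illustrative values only
    earliest, latest = rekey_window(life, rekey, rand)
    print(f"rekey starts between {earliest}s and {latest}s")
    print(f"headroom before hard expiry: {life - latest}s")
```

The headroom between the latest rekey moment and the hard lifetime is what absorbs a slow or retried negotiation, which matters most right after a WAN failover.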
High-availability using CARP and pfSense HA
For stateful failover that preserves active connections, use a CARP pair with pfsync and XMLRPC synchronization:
- Set up two pfSense appliances with identical IPsec configs and enable XMLRPC sync for configuration and pfsync for stateful synchronization of firewall states.
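- Enable IPsec in the XMLRPC sync options so the IPsec configuration is replicated to the secondary node, and bind Phase 1 to a CARP virtual IP so the peer sees the same endpoint after failover. IPsec SAs are typically not carried over by pfsync, so expect the new primary to renegotiate the tunnel.
- Use CARP virtual IPs for LAN, and for WAN where the upstream segment lets both nodes share an address (layer-2 adjacency and enough addresses for each node plus the virtual IP). Otherwise, NAT scenarios may require special handling.
Stateful HA is more complex to set up but keeps session loss low for many TCP flows. Flows through the tunnel may pause while it is re-established, and UDP-based protocols may still be sensitive to short interruptions.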
Testing and validation
Testing is crucial. Use the following methodology:
- Perform controlled failover tests by physically disconnecting the primary WAN or simulating upstream failure using upstream router controls.
- Monitor Status → IPsec to confirm Phase 1/2 re-establishment on the secondary path. Use logs (Status → System Logs → IPsec) for troubleshooting.
- Validate application behavior: check long-lived TCP sessions, VoIP calls, and databases. Measure downtime and recovery times.
- Test CARP failover: force a failover and verify that pfsync keeps firewall states active and that the IPsec tunnel becomes available on the new primary as expected.
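To measure downtime during these tests, one option is to log a timestamped probe result every second (for example, a ping across the tunnel) and compute the outage windows afterwards. The sketch below covers only the analysis step; the sample format and function names are assumptions.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, bool]  # (timestamp in seconds, probe succeeded?)


def outage_windows(samples: Iterable[Sample]) -> List[Tuple[float, float]]:
    """Return (start, end) pairs for each run of failed probes.

    start is the timestamp of the first failed probe, end is the
    timestamp of the first successful probe after it. An outage still
    in progress at the end of the data is reported up to the last sample.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
        last_ts = ts
    if start is not None and last_ts is not None:
        windows.append((start, last_ts))
    return windows


def longest_outage(samples: Iterable[Sample]) -> float:
    """Duration in seconds of the longest outage, or 0.0 if there was none."""
    return max((end - begin for begin, end in outage_windows(samples)), default=0.0)


if __name__ == "__main__":
    log = [(0, True), (1, True), (2, False), (3, False), (4, False), (5, True), (6, True)]
    print(outage_windows(log), longest_outage(log))
```

Running the same probe log through this for each test (WAN pull, upstream failure, CARP failover) gives comparable recovery times to tune DPD against.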
Troubleshooting tips
If failover does not work as expected, check:
- Gateway monitoring status (System → Routing → Gateways) to ensure pfSense correctly detects WAN health.
- IPsec logs for DPD timeouts, authentication failures, or mismatch in proposals.
- Source IP binding: if the remote peer requires a fixed source IP, ensure the active WAN provides the expected IP or use a virtual IP with NAT.
- Firewall rules: ensure policy-based routing rules point to the Gateway Group without capturing traffic meant for the tunnel, and that IKE (UDP 500), NAT-T (UDP 4500), and ESP are allowed on both WAN interfaces.
- MTU and MSS clamping: tunnels across different WANs may expose MTU issues; enable MSS clamping for VPN traffic (in pfSense, typically under the IPsec advanced settings) or lower the MTU on the affected interfaces.
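For the MTU item, a quick calculation shows why a secondary WAN with a smaller MTU can break sessions that worked on the primary. This is a minimal sketch; the 60-byte tunnel overhead is a placeholder assumption, since the real ESP overhead depends on the cipher, NAT-T, and padding.

```python
def tcp_mss(path_mtu: int, tunnel_overhead: int, ipv6: bool = False) -> int:
    """Largest TCP payload per packet that fits through the tunnel unfragmented."""
    ip_header = 40 if ipv6 else 20
    tcp_header = 20  # without TCP options
    mss = path_mtu - tunnel_overhead - ip_header - tcp_header
    if mss <= 0:
        raise ValueError("overhead exceeds the path MTU")
    return mss


if __name__ == "__main__":
    overhead = 60  # placeholder; measure or look up the value for your proposal
    for name, mtu in [("primary WAN (Ethernet)", 1500), ("secondary WAN (PPPoE)", 1492)]:
        print(f"{name}: MTU {mtu} -> MSS {tcp_mss(mtu, overhead)}")
```

Clamping to the smaller of the two results keeps TCP sessions working regardless of which WAN currently carries the tunnel.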
Advanced topics
Dual-tunnel active/active with ECMP
You can configure two parallel IPsec tunnels to the same remote site and use load balancing for resilience and throughput aggregation. This requires careful tweaking of traffic selectors and may need the remote peer to support multiple SAs and ECMP-friendly routing.
Automated failover scripts
For very specific behaviors, pfSense allows custom scripts via cron or hooks. Common use-cases include:
- Automatically reassigning Virtual IPs when a particular WAN becomes active.
- Running custom ping checks against application endpoints and adjusting gateway weighting from a script or an API, where your pfSense version or an add-on package provides one.
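A custom check of this kind usually benefits from hysteresis, so that a single lost probe does not trigger a failover. The sketch below is one way to structure it; the class, the thresholds, and the `tcp_probe` helper are hypothetical, and the action taken on a state change is left to the caller.

```python
import socket


def tcp_probe(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


class HysteresisMonitor:
    """Declare a target down after N straight failures and up after M straight successes."""

    def __init__(self, fail_threshold: int = 3, recover_threshold: int = 5) -> None:
        if fail_threshold < 1 or recover_threshold < 1:
            raise ValueError("thresholds must be at least 1")
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.is_up = True
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result; return True if the state changed."""
        if probe_ok == self.is_up:
            self._streak = 0
            return False
        self._streak += 1
        needed = self.recover_threshold if probe_ok else self.fail_threshold
        if self._streak >= needed:
            self.is_up = probe_ok
            self._streak = 0
            return True
        return False
```

A cron-driven script could call `tcp_probe` against an application endpoint behind the tunnel, feed the result to `record`, and act only when it returns True. Requiring more successes to recover than failures to go down is a common choice to avoid switching back to a link that is still unstable.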
Security considerations
- Use certificate-based authentication so that no shared secret has to be distributed across multiple peers or WAN paths.
- Ensure the secondary WAN is as secure as the primary (e.g., same firewall rules applied) to avoid introducing weaker security postures during failover.
- Monitor logs and alerts to detect repeated flaps, which usually indicate upstream instability rather than a clean failover event.
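Flap detection can be as simple as counting state changes within a sliding time window. The following is a minimal sketch with hypothetical names and limits; how the transition timestamps are collected (log parsing, a monitoring system) is outside its scope.

```python
from collections import deque
from typing import Deque


class FlapDetector:
    """Flag a gateway or tunnel that changes state too often within a time window."""

    def __init__(self, max_transitions: int = 4, window_s: float = 600.0) -> None:
        self.max_transitions = max_transitions
        self.window_s = window_s
        self._events: Deque[float] = deque()

    def record_transition(self, timestamp: float) -> bool:
        """Record one up/down change; return True if the target is flapping."""
        self._events.append(timestamp)
        # Drop transitions that have aged out of the window.
        while self._events and timestamp - self._events[0] > self.window_s:
            self._events.popleft()
        return len(self._events) > self.max_transitions


if __name__ == "__main__":
    detector = FlapDetector(max_transitions=2, window_s=60)
    for ts in (0, 10, 20, 200):
        print(ts, detector.record_transition(ts))
```

An alert raised by a detector like this is a cue to investigate the upstream link or relax the DPD and gateway thresholds, rather than to keep failing back and forth.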
Conclusion
Implementing reliable IKEv2 VPN failover on pfSense combines protocol features (IKEv2, MOBIKE, DPD) with pfSense’s routing and HA mechanisms (gateway groups, CARP, pfsync). For many deployments, active/passive dual-WAN with a well-configured gateway group and aggressive DPD is sufficient. For enterprise environments requiring minimal session loss, pfSense HA with CARP and state synchronization is the recommended path. Test thoroughly in a controlled manner and tune DPD, rekey, and routing rules to achieve the desired balance between rapid failover and stability.