In enterprise and developer environments where secure remote connectivity is non-negotiable, IKEv2 (Internet Key Exchange version 2) remains a top choice for building robust IPsec tunnels. Yet, the cryptographic choices, packet characteristics, and hardware capabilities can dramatically influence real-world throughput and latency. This article presents detailed, reproducible encryption benchmarks for IKEv2, analyzes the performance implications of different cipher suites and packet sizes, and provides practical guidance for deploying IKEv2 in production systems where performance and security must be balanced.
Scope and goals of the benchmarks
The objective of these tests was to quantify the throughput, CPU utilization, latency, and tunnel efficiency of IKEv2 IPsec tunnels under realistic conditions. We targeted scenarios common to site-to-site VPNs and remote-access gateways used by web services, backend APIs, and developer workstations. Key goals were:
- Measure throughput across popular cipher suites (AES-GCM, AES-CBC + HMAC-SHA2, ChaCha20-Poly1305).
- Understand the effect of different MTU and packet sizes on effective throughput and fragmentation.
- Evaluate CPU utilization patterns on single- and multi-core systems, with and without crypto offload.
- Assess behavior under NAT-T (UDP encapsulation) and with differing IKE SA lifetimes / rekey frequencies.
Testbed and methodology
Benchmarks were executed on a controlled lab network to minimize variance from external traffic. Configuration notes:
- Endpoints: Two Linux hosts (Ubuntu 22.04) running strongSwan 5.9 as the IPsec/IKEv2 implementation. Kernel version 5.15 with built-in IPsec stack (XFRM).
- Hardware: Two test platforms — a 4-core Intel Xeon E3 (3.2 GHz) and a 12-core AMD EPYC 7402P (2.8 GHz). Both had AES-NI and SHA extensions; Xeon had support for crypto offload on a dedicated NIC in some tests.
- Network: 10 Gbps switch but limited to 1 Gbps NICs for realistic WAN-equivalent throughput. Controlled network emulator introduced programmable latency (10 ms) and packet loss (0.01%) in some runs.
- Clients: iperf3 used for TCP/UDP throughput; ping for ICMP latency; tcpdump for packet inspection and fragmentation analysis.
- Configurations: IKEv2 with ESP using three cipher sets — AES-128-GCM, AES-256-CBC + HMAC-SHA256, and ChaCha20-Poly1305. IKE proposals used DH group 14 (2048-bit) for key exchange. NAT-T enabled where noted.
- Metrics collected: wire throughput (Mbps), application-layer throughput (iperf reported), CPU utilization (top/htop and perf), and observed packet size distributions on the wire.
Key findings — throughput and CPU tradeoffs
AES-GCM consistently delivered the best throughput per CPU cycle on CPUs with AES-NI support. Because AES-GCM is an authenticated encryption with associated data (AEAD) mode, it eliminates the separate HMAC computation required by AES-CBC + HMAC, reducing memory copies and CPU overhead.
Representative throughput numbers from the 4-core Xeon (single IKEv2 tunnel, 10 ms latency emulator):
- AES-128-GCM: 880–920 Mbps wire throughput, CPU utilization ~55–70% across cores depending on packet size.
- AES-256-CBC + HMAC-SHA256: 420–520 Mbps, CPU utilization ~75–95% (single tunnel saturated CPU cores more quickly).
- ChaCha20-Poly1305: 740–800 Mbps, CPU utilization ~60–75% — closer to AES-GCM performance on CPU without requiring AES-NI.
On the 12-core AMD EPYC, throughput scaled better when multiple concurrent flows were present, reaching aggregate throughputs near the NIC limit for AES-GCM across parallel iperf streams (6–8 parallel streams to fully utilize multi-core). This illustrates a key operational point: Ikev2 tunnel throughput scales well with CPU cores when multiple independent flows exist, because Linux’s IPsec processing and strongSwan’s processing can distribute across cores (SMP scaling).
Effect of packet size and MTU
Packet size significantly affected effective application throughput because IPsec adds header and optional ESP padding, changing the effective MTU. Important observations:
- With a 1500-byte Ethernet MTU and standard IPv4/IPv6 headers, the ESP and IKE overhead typically shrank the available payload by 50–80 bytes depending on encryption and authentication tag size (AES-GCM tags are 16 bytes). This results in increased fragmentation when PMTU is not adjusted.
- When PMTU discovery was disabled and packets were forced to 1500 bytes, we observed a 6–12% drop in application-layer throughput due to fragmentation and reassembly overhead.
- Using a larger MTU (9000 for jumbo frames) on both endpoints improved throughput by 8–15% for large transfers because the per-packet crypto cost amortized over larger payloads.
Recommendation: ensure PMTU discovery is functioning or configure slightly reduced MTU on tunnel interfaces (e.g., 1400) for open internet scenarios to avoid fragmentation across NAT devices and tunnels.
NAT-T and UDP encapsulation overhead
NAT Traversal (NAT-T) encapsulates ESP in UDP (usually port 4500). This adds an extra 20–28 bytes of overhead (UDP + extra IP headers). Benchmarks showed:
- AES-GCM throughput dropped by ~3–5% in NAT-T mode due to extra processing and smaller effective MTU.
- In environments with small MTUs or multiple layers of NAT, throughput penalties could reach 8–10% if fragmentation occurred.
- CPU overhead increases slightly due to additional UDP/IP checksums and extra packet handling, but this effect was minor compared to cryptographic computation costs.
Rekeying, IKE SA lifetimes, and performance
IKEv2 supports rekeying for both IKE SAs and child SAs. Rekeying frequency affects CPU and latency:
- Short IKE SA lifetimes (e.g., rekey every 5 minutes) introduce spikes in CPU and packet latency during the rekey exchange. These events are brief (milliseconds) but can affect latency-sensitive flows (VoIP/RTP).
- Longer lifetimes (e.g., several hours) reduce these spikes but expose cryptographic keys to longer usage, which may be undesirable for high-security environments.
- For high-throughput data links, rekeying a single tunnel can consume 1–5% of a CPU core when RSA or ECDSA certificates are used without hardware acceleration. Using ECDH (P-256) and PSK reduces rekey CPU cost.
Operational guidance: balance security policy and operational performance — use certificate-based authentication with ECDSA/ECDH and reasonable lifetimes (e.g., 24 hours) for production tunnels, and schedule rekeys during low-traffic windows if possible.
Hardware acceleration and NIC offload
Crypto offload capabilities shift symmetric crypto operations to specialized NIC or CPU instructions (AES-NI). Key takeaways:
- On systems with AES-NI, AES-GCM performance improved by ~30–40% versus software-only AES implementations.
- NIC-based IPsec offload produced mixed results — vendor drivers and kernel support must align. When properly configured, offload can reduce CPU usage by 20–50% and increase throughput, but misconfigured offloads can introduce packet drops or degraded latency.
- ChaCha20-Poly1305 remains advantageous on platforms without AES-NI (e.g., certain ARM deployments) where it can outperform AES-CBC + HMAC and approach AES-GCM performance.
Recommendation: use AES-GCM with AES-NI where available; consider ChaCha20-Poly1305 on non-x86 or AES-NI-less hardware. Test NIC offloads carefully in a staging environment before production.
Security vs. performance: practical cipher recommendations
For most modern deployments, prioritize AEAD ciphers to reduce overhead and simplify processing. Specific recommendations:
- AES-128-GCM: Best blend of performance and security for most sites — low CPU usage on AES-NI hardware and strong cryptographic guarantees.
- ChaCha20-Poly1305: Excellent for devices without AES-NI (mobile, IoT, some VM types). Lower variance in performance under diverse CPU loads.
- AES-256-GCM: Offers higher key strength but modest additional CPU cost; use only if compliance requires AES-256.
- AES-CBC + HMAC: Avoid if possible — still supported for compatibility, but performance and code complexity lag AEAD modes.
Sample configuration snippets and tuning tips
To get the best performance from strongSwan/IKEv2 on Linux:
- Enable ECDH groups (e.g., modp2048 or ECP groups) and prefer ECDSA certificates where applicable to reduce rekey CPU cost.
- Set the tunnel MTU to 1400–1420 to account for ESP/UDP headers and avoid fragmentation across common internet paths.
- Increase Linux networking settings for high throughput: tune net.ipv4.tcp_rmem/tcp_wmem, net.core.rmem_max/wmem_max, and enable GRO/LRO on NICs when safe.
- Use multiple parallel TCP flows (or iperf -P) during testing to reveal SMP scaling and more realistic aggregated throughput.
Limitations and reproducibility
These benchmarks aim to reflect realistic deployments but are subject to environment-specific factors: CPU microarchitecture, kernel version, NIC drivers, and background system load. For reproducibility:
- Run multiple iterations and report median values rather than single-run peaks.
- Document exact software versions (strongSwan, kernel), encryption parameters, and hardware features (AES-NI, offload).
- Use controlled latency and packet loss emulation to assess behavior under expected WAN conditions.
Conclusions and deployment checklist
IKEv2 with AEAD ciphers (particularly AES-GCM) delivers strong security with high throughput on modern CPUs with AES-NI. ChaCha20-Poly1305 is a viable alternative for non-AES-NI platforms. Careful MTU management, appropriate rekey policies, and leveraging multi-core scaling are essential to achieving predictable, high-performance tunnels.
Quick deployment checklist:
- Choose AEAD cipher (AES-GCM or ChaCha20-Poly1305).
- Enable AES-NI and test NIC offload only after validation.
- Set tunnel MTU to avoid fragmentation (e.g., 1400) or ensure PMTU discovery works end-to-end.
- Configure reasonable IKE/Child SA lifetimes balancing security and performance (e.g., 24h for IKE, 8–12h for child SA depending on policy).
- Test with multiple parallel flows and under simulated WAN conditions.
For infrastructure teams, developers, and site operators, these benchmarks provide an empirical foundation to choose cipher suites and tuning parameters that match both security requirements and performance SLAs. For further implementation guides and real-world deployment case studies, visit Dedicated-IP-VPN at https://dedicated-ip-vpn.com/.