When choosing a VPN protocol for a production environment—whether you’re securing traffic for a corporate network, hosting a VPN service for customers, or building a development testbed—raw throughput and latency under real-world conditions are often decisive. Protocol design, cryptographic choices, kernel vs userspace implementation, and practical factors like MTU and CPU offload all influence observed speeds. This article examines L2TP (usually paired with IPsec), WireGuard, and OpenVPN, focusing on how they behave in realistic deployments and what tuning can improve performance.
High-level protocol characteristics
Before diving into benchmarks and tuning, it helps to understand the architectural differences that drive performance.
L2TP over IPsec (L2TP/IPsec)
L2TP itself is a layer 2 tunneling protocol that provides encapsulation but no encryption. In practical deployments, it is combined with IPsec (ESP) for confidentiality, integrity, and authentication. This pairing is natively supported by the built-in VPN clients of many operating systems and routers, which historically made it a default choice for compatibility.
Key performance points:
- Encapsulation overhead: L2TP adds its own header plus UDP and PPP framing; IPsec (ESP) adds its own header, IV, integrity tag, and block padding. Together these increase packet size and can cause fragmentation when the path MTU is small.
- Crypto stack: IPsec implementations often rely on kernel modules or OS-level stacks (Linux xfrm/ipsec or Windows native IPsec), which can be hardware accelerated if AES-NI or dedicated crypto offload is available.
- Complexity: Multiple layers (L2TP + IPsec) introduce extra processing and state machines, which can increase CPU cycles per packet compared to simpler modern protocols.
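To make the encapsulation overhead concrete, here is a minimal Python sketch that estimates the outer packet size for one inner IP packet. It assumes ESP in transport mode with AES-CBC and HMAC-SHA1-96, an 8-byte L2TP header, and a 2-byte PPP header; all of these vary by deployment, and the function name is purely illustrative.

```python
def l2tp_ipsec_packet_size(inner_len, nat_t=False, block=16, iv=16, icv=12):
    """Return the outer IPv4 packet size for one inner IP packet.

    Assumed layout (ESP transport mode, AES-CBC with HMAC-SHA1-96):
    outer IP | [UDP for NAT-T] | ESP header | IV | UDP | L2TP | PPP |
    inner packet | padding | ESP trailer | ICV
    """
    outer_ip = 20      # IPv4 header without options
    esp_header = 8     # SPI + sequence number
    l2tp_udp = 8       # UDP header carrying L2TP
    l2tp_header = 8    # assumption: the L2TP header is variable-length
    ppp_header = 2     # assumption: depends on PPP header compression
    esp_trailer = 2    # pad length + next header

    # Everything ESP encrypts, before padding.
    plaintext = l2tp_udp + l2tp_header + ppp_header + inner_len + esp_trailer
    # Pad the encrypted payload up to the cipher block size.
    padding = (-plaintext) % block
    total = outer_ip + esp_header + iv + plaintext + padding + icv
    if nat_t:
        total += 8     # extra UDP header for NAT traversal
    return total


if __name__ == "__main__":
    for inner in (1280, 1400, 1500):
        size = l2tp_ipsec_packet_size(inner)
        note = "fits" if size <= 1500 else "fragments"
        print(f"inner {inner} -> outer {size} bytes ({note} in a 1500-byte MTU)")
```

Under these assumptions a 1400-byte inner packet becomes 1480 bytes on the wire, while a full 1500-byte inner packet no longer fits a 1500-byte link MTU and is fragmented.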
OpenVPN
OpenVPN is a widely used SSL/TLS-based VPN implemented mostly in userspace. It supports TCP or UDP transport, tun/tap interfaces, and a flexible plugin/configuration model.
Key performance points:
- Userspace vs kernel: OpenVPN runs in userspace, so packets cross the kernel/userspace boundary, adding context switch overhead and more memory copies compared to kernel-space implementations.
- Transport choice: OpenVPN over UDP avoids TCP-in-TCP problems and generally performs better; over TCP it can suffer from compounded retransmission and head-of-line blocking.
- Crypto: OpenVPN supports AES (including AES-GCM) and, since version 2.5, ChaCha20-Poly1305 for the data channel; performance depends on the selected cipher and whether AES-NI acceleration is available.
- Flexibility: Lots of options and plugins, but that flexibility can mean more code paths and potential inefficiencies.
WireGuard
WireGuard is a modern VPN protocol designed around simplicity, a small codebase, and high performance. It runs as a kernel module on Linux (in the mainline kernel since 5.6) and several other platforms, with userspace implementations such as wireguard-go used elsewhere. It uses a fixed set of modern cryptographic primitives (Curve25519, ChaCha20-Poly1305, BLAKE2s) and focuses on fast keying and minimal packet overhead.
Key performance points:
- Minimal overhead: WireGuard adds a single small header and uses compact, precomputed state for peers, reducing per-packet processing.
- Kernel-space implementation: On Linux and some BSDs WireGuard runs in kernel space, avoiding userspace context switch overhead. On platforms without kernel support it may fall back to userspace but still remains efficient.
- Crypto choices: ChaCha20-Poly1305 is typically faster than AES on CPUs without AES-NI. On CPUs with AES-NI, AES-GCM is usually faster per byte, but ChaCha20-Poly1305 stays competitive because it maps well onto SIMD instructions.
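- Parallelism: The Linux kernel implementation spreads encryption and decryption across CPU cores, but packets for a given peer are still queued and delivered in order, and userspace implementations differ in how well they scale. For very high aggregate throughput, multiple peers or interfaces may still be needed.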
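The "minimal overhead" point can be quantified with a short Python sketch. The header sizes below follow the WireGuard transport data message layout (16-byte header plus a 16-byte Poly1305 tag) over UDP; the function names are illustrative, and the link MTU is an input you would set for your own path.

```python
WG_DATA_HEADER = 16   # type + reserved (4), receiver index (4), counter (8)
POLY1305_TAG = 16     # authentication tag appended to each packet
UDP_HEADER = 8
IP_HEADER = {4: 20, 6: 40}  # outer IP header sizes without options/extensions


def wireguard_overhead(outer_ip_version=4):
    """Bytes added to each tunneled packet for the given outer IP version."""
    return IP_HEADER[outer_ip_version] + UDP_HEADER + WG_DATA_HEADER + POLY1305_TAG


def wireguard_tunnel_mtu(link_mtu=1500, outer_ip_version=6):
    """Largest inner packet that fits in one outer packet without fragmentation."""
    return link_mtu - wireguard_overhead(outer_ip_version)


if __name__ == "__main__":
    print(wireguard_overhead(4))            # 60
    print(wireguard_overhead(6))            # 80
    print(wireguard_tunnel_mtu(1500, 6))    # 1420
    print(wireguard_tunnel_mtu(1492, 6))    # 1412, e.g. behind PPPoE
```

Budgeting for an IPv6 outer header on a 1500-byte link gives 1420, which matches the interface MTU commonly used for WireGuard.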
Real-world speed factors and test methodology
Any speed comparison must be reproducible and reflect realistic conditions. Raw single-threaded throughput numbers can be misleading if you don’t consider multi-stream traffic, latency sensitivity, CPU, and network path conditions.
Recommended methodology:
- Use both UDP and TCP transports (where applicable). For OpenVPN prefer UDP for fairer comparison.
- Test with different MTUs and enable/disable fragmentation to observe behavior under path MTU restrictions.
- Run tests with varying payload sizes: small packets (e.g., 64 bytes), medium (512–1500 bytes), and large application writes (4–16 KB, which are split or fragmented into multiple packets) to mimic different application workloads.
- Measure latency and jitter alongside throughput—some protocols trade latency for throughput.
- Track CPU utilization on client and server to identify bottlenecks (e.g., CPUs without AES-NI may favor ChaCha20).
- Repeat tests with hardware crypto offload (if available) vs software crypto to see differences.
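One way to keep such a test matrix reproducible is to generate the benchmark commands from a script. The sketch below assumes iperf3 as the traffic generator (the article does not prescribe a tool) and uses a placeholder server address; it only builds the command lines and does not run them.

```python
import shlex


def iperf3_matrix(server, udp_sizes=(64, 512, 1400), streams=(1, 4), seconds=30):
    """Build iperf3 client commands covering UDP sizes and TCP stream counts."""
    commands = []
    # UDP runs: -b 0 removes the rate cap, -l sets the datagram payload size.
    for size in udp_sizes:
        commands.append(["iperf3", "-c", server, "-u", "-b", "0",
                         "-l", str(size), "-t", str(seconds), "-J"])
    # TCP runs: -P sets the number of parallel streams, -J requests JSON output.
    for n in streams:
        commands.append(["iperf3", "-c", server, "-P", str(n),
                         "-t", str(seconds), "-J"])
    return commands


if __name__ == "__main__":
    # 10.8.0.1 is a placeholder for the tunnel address of the iperf3 server.
    for cmd in iperf3_matrix("10.8.0.1"):
        print(shlex.join(cmd))
```

Running the same generated commands against each protocol, with the tunnel MTU changed between runs, keeps the comparison like-for-like. Latency and jitter would need a separate probe, for example ping over the tunnel.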
Observed performance patterns
Below are generalized findings observed across many real deployments and independent tests. Exact numbers will vary by hardware, software versions, and network conditions.
WireGuard: often the fastest and most consistent
In many real-world tests, WireGuard delivers the highest throughput and lowest latency for the same hardware compared to L2TP/IPsec and OpenVPN. Reasons:
- Kernel-space processing reduces packet copy and context switching.
- Efficient, modern crypto (ChaCha20-Poly1305 only, with no cipher negotiation), a short 1-RTT handshake, and compact packet headers.
- Lower per-packet CPU cost: WireGuard can often reach multi-Gbps throughput on modest CPUs where OpenVPN fails to saturate the link.
Edge case: on some multi-core servers with extremely high aggregate throughput requirements, WireGuard’s per-peer packet ordering can make a single tunnel less scalable than a well-tuned IPsec stack that uses multiple queues or multiple tunnels. However, ongoing kernel-level optimizations continue to narrow this gap.
OpenVPN: flexible but often CPU-bound
OpenVPN performs well in many scenarios but typically lags behind WireGuard in raw throughput on the same hardware. Common observations:
- Userspace processing and extra copies reduce maximum possible throughput.
- When using AES, systems with AES-NI close part of the gap; on CPUs without AES acceleration, switching OpenVPN to ChaCha20-Poly1305 usually improves throughput.
- OpenVPN over TCP incurs extra overhead and latency compared to UDP; for fair speed testing use UDP mode.
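- OpenVPN 2.x processes packets in a single thread per instance, so scalability is usually improved by running several instances across cores, by using kernel Data Channel Offload (DCO, available in OpenVPN 2.6 and later), and by using tun (layer 3) instead of tap (layer 2).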
L2TP/IPsec: compatibility at the cost of speed
L2TP/IPsec is often the slowest in raw throughput comparisons for several reasons:
- Double encapsulation (L2TP + IPsec) increases overhead and likelihood of fragmentation.
- IPsec negotiation and payload handling can be heavier than WireGuard’s simplified protocol.
- While IPsec can leverage kernel crypto and hardware offload (potentially outperforming poorly tuned userspace solutions), typical deployments still find IPsec stacks consuming more CPU per Mbps than WireGuard.
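"CPU per Mbps" is easier to compare across machines when expressed as cycles per byte. The following Python sketch shows one way to compute it from measurements you collect yourself; the function name and the input values in the usage example are hypothetical, not benchmark results.

```python
def cycles_per_byte(throughput_bps, cpu_utilization, cores, clock_hz):
    """Estimate CPU cycles spent per tunneled byte.

    cpu_utilization is the average busy fraction (0.0 to 1.0) across `cores`
    during the test; clock_hz is the nominal core clock.
    """
    if throughput_bps <= 0:
        raise ValueError("throughput must be positive")
    cycles_per_second = cpu_utilization * cores * clock_hz
    bytes_per_second = throughput_bps / 8
    return cycles_per_second / bytes_per_second


if __name__ == "__main__":
    # Hypothetical inputs: 1 Gbit/s while one 3 GHz core is 50% busy.
    print(cycles_per_byte(1e9, 0.5, 1, 3e9))  # 12.0
```

A lower figure for the same traffic pattern means the protocol leaves more CPU headroom before it becomes the bottleneck.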
Tuning tips to maximize VPN throughput
Regardless of protocol, you can significantly improve performance with careful tuning:
- Enable AES-NI or other crypto acceleration: On Linux, check the flags line in /proc/cpuinfo for aes and confirm that the matching kernel crypto drivers are registered (see /proc/crypto). Hardware acceleration can improve AES throughput severalfold.
- Use UDP transport for OpenVPN: Avoid TCP-over-TCP. If the network path only permits TCP (for example, behind a restrictive firewall), limit TCP mode to the clients that need it and keep UDP as the default.
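- Adjust MTU and MSS clamping: Use the mssfix directive in OpenVPN, or a firewall rule that clamps TCP MSS to the path MTU (for example, the iptables TCPMSS target with --clamp-mss-to-pmtu), or set the tunnel MTU explicitly to avoid IP fragmentation. Path MTU Discovery can fail inside tunnels; explicit MTU settings often help.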
- Choose modern ciphers: ChaCha20-Poly1305 for CPUs without AES-NI; AES-GCM for CPUs with AES-NI. WireGuard always uses ChaCha20-Poly1305, so this choice applies to OpenVPN and IPsec.
- Offload when available: Use NIC features like GRO/TSO/LRO and hardware crypto if supported by your IPsec stack.
- Parallelize flows: For high aggregate throughput, split traffic across multiple tunnels/interfaces or leverage solutions that spread work across CPUs (multi-instance WireGuard or IPsec with multiple SA pairs).
- Keep software updated: Kernel improvements, WireGuard enhancements, or OpenVPN data channel offload in newer releases can materially change performance.
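The cipher advice above can be automated with a small check of the CPU feature flags. This is a minimal Python sketch that assumes a Linux-style /proc/cpuinfo ("flags" on x86, "Features" on ARM); the function names are illustrative, and the returned strings are OpenVPN-style cipher names used only as an example.

```python
import os


def has_aes_acceleration(cpuinfo_text):
    """Return True if any CPU feature line lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if key.strip().lower() in ("flags", "features"):
            if "aes" in value.split():
                return True
    return False


def suggest_cipher(cpuinfo_text):
    """Pick AES-GCM when AES instructions are present, else ChaCha20-Poly1305."""
    if has_aes_acceleration(cpuinfo_text):
        return "AES-256-GCM"
    return "CHACHA20-POLY1305"


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/cpuinfo exists.
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            print(suggest_cipher(f.read()))
```

The flag only shows that the instructions exist; a quick throughput test with both ciphers on the target machine is still the more reliable guide.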
Practical recommendations
For most modern deployments where speed and simplicity are primary concerns, WireGuard is likely the best choice: it combines high throughput, low latency, and a small attack surface. It’s particularly compelling for cloud servers, VPS deployments, and end-user VPN applications where resources and simplicity matter.
If compatibility with legacy clients (e.g., built-in OS VPN clients on older platforms) is required, L2TP/IPsec remains a pragmatic fallback—but expect lower throughput and spend time tuning MTU and encryption offload.
OpenVPN remains an excellent option when you need flexibility, extensive configuration options, or broad platform support where WireGuard is not available. Choose UDP transport and newer cipher suites for best performance, and consider running OpenVPN on powerful hardware if you must support many concurrent users.
What to measure in your own environment
When deciding, test with realistic traffic patterns from your environment:
- Simulate the number of concurrent users/flows your network will see.
- Measure per-flow throughput and aggregate throughput separately.
- Record CPU, memory, and context switching metrics during tests.
- Test over the actual WAN links (including NAT hops) to capture NAT traversal and packet loss impacts.
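To report per-flow and aggregate throughput separately, the raw benchmark output can be summarized with a short script. The sketch below assumes iperf3 TCP results saved with JSON output and assumes the field layout shown in the code (end.streams[].receiver and end.sum_received); verify it against the output of your iperf3 version before relying on it.

```python
import json


def summarize_tcp_result(result):
    """Return (per_flow_mbps, aggregate_mbps) from a parsed iperf3 TCP result."""
    end = result["end"]
    # Assumed layout: one entry per parallel stream, with receiver-side totals.
    per_flow = [s["receiver"]["bits_per_second"] / 1e6 for s in end["streams"]]
    aggregate = end["sum_received"]["bits_per_second"] / 1e6
    return per_flow, aggregate


def load_result(path):
    """Read one iperf3 JSON result file from disk."""
    with open(path) as f:
        return json.load(f)
```

For example, summarize_tcp_result(load_result("run.json")) would return the list of per-stream rates and the total in Mbit/s, where run.json is a hypothetical file name. A large spread between streams while the aggregate stays flat is a hint that a single core or queue is the limit.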
Only by measuring in-situ can you balance trade-offs between throughput, latency, CPU cost, and operational complexity.
For further practical deployment guidance and benchmark walkthroughs, visit Dedicated-IP-VPN at https://dedicated-ip-vpn.com/ where you can find configuration examples, tuning scripts, and platform-specific notes.