VPN connections that unexpectedly drop are disruptive for websites, backend services, and remote teams. When using the Trojan protocol or implementations like trojan-go and related clients, connection stability depends on a combination of TLS, transport settings, network stack tuning and server-side infrastructure. This article walks through fast, effective troubleshooting steps and concrete technical adjustments to diagnose and mitigate frequent connection drops.
Understand what “drop” means
Before changing configuration, clarify the failure mode. Common patterns include:
- Brief disconnects (a few seconds) followed by automatic reconnection.
- Long-lived sessions that stop sending/receiving traffic but appear established.
- Immediate failures when connecting (TLS handshake failures, auth errors).
- Intermittent drops under load (many concurrent clients).
Accurately identifying the failure pattern reduces guesswork and helps target whether the issue is TCP/TLS, application-layer, NAT/firewall, or resource exhaustion.
Collect diagnostic data
Gather logs and network traces from both client and server:
- Enable verbose logs in trojan/trojan-go client and server. Look for TLS handshake errors, password mismatches, ALPN/SNI issues, or transport fallback entries.
- Use tcpdump or Wireshark to capture traffic on the server interface: filter by port (usually 443 or your custom port). Capture both sides when possible.
- Check system logs (journalctl or /var/log/syslog) for kernel drops, OOM killer events, or iptables conntrack messages.
- Monitor resource usage (CPU, memory, file descriptors) with top, htop, vmstat, iostat and netstat/ss.
TLS and certificate checks
TLS misconfiguration is a frequent cause of abrupt disconnects:
- Confirm the certificate chain is valid and unexpired. Use openssl s_client -connect host:port -servername your.sni.example to inspect the handshake. Look for an incomplete chain or the wrong certificate for the SNI.
- Ensure the SNI sent by the client matches the certificate. Some clients let you set SNI explicitly; mismatches can lead to silent drops by middleboxes.
- Check TLS versions and cipher suites. Trojan typically relies on modern TLS 1.2/1.3. If you force weak ciphers on the server or client, handshakes may fail under load or be rejected by certain middleboxes.
- If using a reverse proxy (NGINX, Caddy) or CDN in front of Trojan, validate proxy TLS settings and timeouts. Proxies may have aggressive idle timeouts that close backend connections.
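The certificate and SNI checks above can also be scripted. The following is a minimal sketch using only Python's standard ssl module; the function names probe_tls and days_until_expiry are illustrative, not part of any Trojan tooling, and the sketch assumes the server presents a publicly trusted certificate.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def probe_tls(host: str, port: int, sni: str, timeout: float = 5.0) -> dict:
    """Handshake with an explicit SNI and report what was negotiated.

    The default context verifies the chain and the hostname, so an incomplete
    chain, an expired certificate, or a certificate that does not cover `sni`
    raises ssl.SSLCertVerificationError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            cert = tls.getpeercert()
            return {
                "tls_version": tls.version(),
                "cipher": tls.cipher()[0],
                "days_left": days_until_expiry(cert["notAfter"], time.time()),
            }
```

Calling probe_tls("your.server.example", 443, "your.sni.example") with your own host and SNI (both placeholders here) either returns the negotiated TLS version, cipher and remaining certificate lifetime, or raises an exception whose message usually points at the failing check.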
Transport and protocol-specific settings
Trojan and its variants support multiple transports and multiplexing; misconfiguration here often explains drops:
- Mux / Multiplexing: If using multiplexing (multiple streams over a single connection), ensure both client and server have matching settings. Mismatch can cause intermittent resets.
- WebSocket / HTTP/2 transports: When running Trojan over WebSocket behind NGINX, verify WebSocket keepalive and proxy settings. For NGINX, set proxy_read_timeout and proxy_send_timeout to a large value and set proxy_buffering off for streaming-style traffic.
- Keepalive: Enable TCP keepalive on both client and server sockets. Many OS and application configs allow tweaking keepalive interval, probes and idle time. This prevents NAT/firewall idle timeouts from silently removing mappings.
- Fallbacks: Trojan’s fallback mechanism can redirect plain HTTP requests to an alternative port. Ensure fallback rules don’t unintentionally redirect valid traffic.
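For custom clients or test tools, keepalive can be enabled per socket. The sketch below uses Python's socket module; the helper name enable_keepalive is an assumption for illustration, and the per-socket timer options are only set where the platform exposes them (the names used are the Linux ones).

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 120,
                     interval: int = 30, probes: int = 5) -> None:
    """Turn on TCP keepalive and, where available, override the timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are the Linux option names;
    # skip any that this platform's socket module does not define.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

Per-socket values take precedence over the system-wide defaults, which is useful when you cannot change sysctls on the host.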
Practical parameter checks
Recommended quick values to try (apply carefully and test):
- Linux TCP keepalive: net.ipv4.tcp_keepalive_time = 120 (seconds), net.ipv4.tcp_keepalive_intvl = 30, net.ipv4.tcp_keepalive_probes = 5. These defaults apply only to sockets that have SO_KEEPALIVE enabled.
- Netfilter conntrack timeout for TCP: on any Linux host that NATs or filters the traffic, inspect /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established and increase it if idle sessions are dropped too quickly.
- If using NGINX as TLS terminator: proxy_read_timeout 3600s; proxy_send_timeout 3600s; proxy_buffering off; send_timeout 3600s.
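To reason about the keepalive values, it helps to work out how long a dead peer goes unnoticed: the idle time plus one interval per unanswered probe. The sketch below does that arithmetic and reads the current kernel defaults on Linux; both function names are illustrative.

```python
from pathlib import Path


def dead_peer_detection_seconds(idle: int, interval: int, probes: int) -> int:
    """Worst-case time for keepalive to declare an idle, unreachable peer dead."""
    return idle + interval * probes


def read_keepalive_sysctls(root: str = "/proc/sys/net/ipv4") -> dict:
    """Read the kernel's current keepalive defaults (Linux only)."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return {name: int((Path(root) / name).read_text()) for name in names}
```

With the values suggested above, dead_peer_detection_seconds(120, 30, 5) gives 270 seconds. The first probe goes out after 120 seconds of idleness, so that figure should sit below the shortest NAT or firewall idle timeout on the path.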
MTU, fragmentation and MSS clamping
Packets dropped due to fragmentation or PMTU issues can manifest as incomplete connections or stalls:
- Check MTU along the path (use ping -M do -s SIZE to determine the maximum unfragmented payload; for IPv4 the path MTU is that payload plus 28 bytes of IP and ICMP headers). VPNs layered over other transports may need reduced MTU.
- Enable MSS clamping on the server or edge router to ensure TCP MSS is reduced for tunneled traffic: iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu.
- On clients with mobile/ISP problems, reduce interface MTU (e.g., 1400) and test stability.
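The relationship between MTU, ping payload size and TCP MSS is simple header arithmetic for IPv4 without options. The sketch below captures it and adds a binary search over a probe callback; all names are illustrative, and the probe is an assumption you supply yourself.

```python
from typing import Callable

IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8
TCP_HEADER = 20   # bytes, no TCP options


def ping_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss_for_mtu(mtu: int) -> int:
    """TCP MSS that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def find_path_mtu(probe: Callable[[int], bool], low: int = 576, high: int = 1500) -> int:
    """Binary-search the largest MTU for which probe(mtu) succeeds.

    Assumes probe(low) is True and that success is monotonic in MTU.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

In practice, probe could wrap a single ping -M do -s with the payload from ping_payload_for_mtu and return whether a reply arrived. A 1500-byte MTU corresponds to a 1472-byte ping payload and a 1460-byte MSS.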
Firewall, NAT and intermediate devices
Firewalls and NAT devices can silently kill idle or perceived-abusive flows:
- Inspect NAT timeouts on your edge router or load balancer. Short TCP or UDP NAT timeouts will drop mappings, requiring a reconnection.
- For stateful firewalls, ensure established flows are allowed and that the connection tracking table isn’t exhausted. Increase nf_conntrack_max if needed.
- Check for ISP-level DPI or traffic shaping. Some ISPs throttle or terminate connections that look like VPN traffic. Try varying SNI, ALPN or using different ports (e.g., 443 vs 80) to test behavior.
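To see how close a Linux host is to exhausting its connection tracking table, compare the current entry count against the maximum. This is a minimal sketch that assumes the nf_conntrack module is loaded so that both files exist; the function name is illustrative.

```python
from pathlib import Path


def conntrack_usage(root: str = "/proc/sys/net/netfilter") -> float:
    """Fraction of the connection-tracking table currently in use."""
    base = Path(root)
    count = int((base / "nf_conntrack_count").read_text())
    limit = int((base / "nf_conntrack_max").read_text())
    return count / limit
```

A value that approaches 1.0 during the periods when drops occur suggests raising nf_conntrack_max or shortening timeouts for stale entries.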
Server resource limits and concurrency
Under load, servers can run out of file descriptors, CPU or memory, leading to connection resets:
- Increase ulimit -n (file descriptors) for the trojan process; large site deployments should set this to several tens of thousands.
- Monitor and tune epoll/select limits. If using a single-threaded event loop, consider running multiple worker instances behind a load balancer.
- Check systemd service limits: set LimitNOFILE and LimitNPROC in the unit file to avoid hitting defaults.
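Whether the limits actually took effect is best checked on the running process rather than in the unit file. The sketch below reads a process's open descriptor count and its soft limit from /proc on Linux; the function names are illustrative, and reading another user's process normally requires root.

```python
import os
from pathlib import Path


def open_fd_count(pid: int) -> int:
    """Number of file descriptors the process currently has open."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_nofile_limit(pid: int) -> int:
    """Soft 'Max open files' limit in effect for the process."""
    for line in Path(f"/proc/{pid}/limits").read_text().splitlines():
        if line.startswith("Max open files"):
            # Columns: name (three words), soft limit, hard limit, units.
            return int(line.split()[3])
    raise RuntimeError("'Max open files' not found in /proc/<pid>/limits")
```

Sampling open_fd_count for the trojan process under load and comparing it with soft_nofile_limit shows how much headroom remains before new connections start failing.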
DNS and resolution issues
DNS problems can break perceived persistent connections if the client re-resolves server addresses:
- Use stable DNS and avoid frequent DNS-based load balancing with short TTLs for critical endpoints.
- Check for transient DNS failures on the client side. If clients fall back to IPv6 or different addresses, ensure your certificate and SNI mapping support this.
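To check whether a hostname's answers change between lookups, resolve it periodically and compare the address sets. This is a minimal sketch using the system resolver through socket.getaddrinfo; the function names are illustrative.

```python
import socket


def resolve_addresses(host: str, port: int = 443) -> set:
    """All IPv4 and IPv6 addresses the system resolver returns for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def address_changes(previous: set, current: set) -> dict:
    """Addresses that appeared or disappeared between two lookups."""
    return {"added": current - previous, "removed": previous - current}
```

Running resolve_addresses from the client at the time of a drop and diffing against an earlier result shows whether the client may have been steered to a different server address.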
Debugging workflow: a practical sequence
1) Reproduce quickly with verbose logs on both sides. Identify the timestamp of the drop and examine logs around it.
2) Capture a network trace at the server during the event. Look for TCP RST, FIN, or sudden silence after keepalive intervals.
3) Inspect TLS handshake failures using openssl s_client to confirm certificate and SNI behavior.
4) If idle drops occur, enable keepalive and increase NAT conntrack timeouts. Test again.
5) Under load, observe FD usage and increase limits or add worker processes; profile CPU saturation points.
6) If behind a reverse proxy, temporarily expose the trojan service directly to isolate proxy-related timeouts.
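The three outcomes to look for in a trace (RST, FIN, or silence) can also be observed from a plain TCP test socket. The sketch below is a hypothetical helper for a raw test connection, not something to run on Trojan's own TLS stream; it peeks rather than reads so no data is consumed.

```python
import socket


def classify_end(sock: socket.socket, wait: float) -> str:
    """Wait up to `wait` seconds and report how the connection behaves."""
    sock.settimeout(wait)
    try:
        data = sock.recv(1, socket.MSG_PEEK)  # peek: leave the byte queued
    except socket.timeout:
        return "silent"  # nothing arrived: stalled flow, or the peer is idle
    except ConnectionResetError:
        return "reset"   # the peer or a middlebox sent a TCP RST
    return "data" if data else "closed"  # empty read means an orderly FIN
```

A "silent" result on a connection that should be carrying traffic points toward NAT or firewall state loss, while "reset" is worth matching against RST packets in the capture.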
When to involve the network provider or CDN
If you suspect ISP throttling or middlebox interference (e.g., TCP RSTs injected), escalate to the provider with packet captures showing injected RSTs or unexpected modifications. CDNs and cloud load balancers often have idle timeout defaults—ask their support to raise them for your service or configure a proper TCP/TLS passthrough.
Small operational tips for improved reliability
- Use health checks and external monitoring, and automatically restart the trojan daemon on crashes (systemd Restart=on-failure).
- Gracefully drain connections during maintenance by using upstream load balancing and slow shutdowns.
- Keep TLS certificates renewed and use OCSP stapling where possible for faster handshakes and clearer failure signals.
- Document client-server config pairings (TLS version, SNI, mux, websocket path) so new deployments match tested settings.
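Documented pairings can be checked mechanically before a rollout. The sketch below compares two flat dictionaries of settings; the key names in the usage note are hypothetical labels for the documented items, not actual trojan or trojan-go configuration fields.

```python
def pairing_mismatches(client: dict, server: dict, keys: tuple) -> dict:
    """Return {key: (client_value, server_value)} for every key that differs."""
    return {
        key: (client.get(key), server.get(key))
        for key in keys
        if client.get(key) != server.get(key)
    }
```

For example, comparing the keys ("tls_version", "sni", "mux", "websocket_path") between a client and a server record returns only the settings that disagree, and an empty result means the pair matches the tested configuration.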
Summary: Connection drops with Trojan VPNs usually boil down to TLS/SNI mismatches, transport/keepalive settings, MTU/fragmentation, firewall/NAT timeouts, or server resource exhaustion. Systematic logging, packet captures and stepwise isolation (bypass proxies, adjust keepalive/MSS, raise file descriptor limits) are the fastest route to resolution. In complex environments, coordination with your ISP, CDN or edge device vendor may be required to adjust middlebox timeouts or traffic shaping.
For more implementation guidance, configuration snippets and deployment best practices tailored to high-availability setups, visit Dedicated-IP-VPN at https://dedicated-ip-vpn.com/.