Real-time traffic visibility for proxy services is essential for capacity planning, billing, abuse detection, and SLA compliance. For administrators running Shadowsocks servers, integrating traffic metrics into a Prometheus + Grafana stack provides a flexible, scalable way to monitor per-port and per-user bandwidth usage, active connections, latency, and error rates. This article walks through practical, production-oriented approaches to exporting Shadowsocks traffic metrics, collecting them with Prometheus, and visualizing them in Grafana. The guidance below targets sysadmins, SaaS operators, and developers who want accurate, low-overhead observability for Shadowsocks deployments.
Overview: approaches to capturing Shadowsocks metrics
There are three common patterns to obtain traffic metrics from Shadowsocks servers. Each has different trade-offs in accuracy, implementation effort, and kernel-space vs user-space overhead:
- Application-level exporters: Modify or wrap the Shadowsocks server to expose Prometheus metrics directly (best accuracy, simplest mapping to users/ports if the server supports it).
- Log parsing exporters: Parse access logs emitted by the server and convert counters into Prometheus metrics (low intrusiveness, near-real-time depending on log flush frequency).
- Kernel-level accounting: Use iptables/nftables byte counters or tc to track per-port/IP traffic and export those counters to Prometheus (works regardless of server implementation, minimal app changes).
In many production environments, a hybrid approach works best: prefer application-level metrics where available; fall back to kernel-level accounting for legacy servers or when you need robust enforcement outside the process.
Option A — Application-level Prometheus exporter
Some Shadowsocks implementations or forks expose metrics directly or through plugins. If you control the server binary (e.g., shadowsocks-rust, Outline forks, or custom builds), adding a /metrics HTTP endpoint with Prometheus instrumentation is straightforward:
- Instrument counters: total_bytes_sent, total_bytes_received, active_connections, connection_errors, per_user_bytes (label by username or port).
- Use client libraries (Go, Python, Rust) to expose the metrics over HTTP on localhost:9191 or a similar port.
- Protect access to the metrics endpoint with firewall rules or bind to 127.0.0.1 only.
Advantages: precise attribution to users/ports and low measurement overhead. Drawbacks: it requires modifying or wrapping the server process, and you must manage label cardinality carefully (per-client-IP labels can explode the series count).
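A rough illustration of the in-process approach using the Python prometheus_client library (a sketch, not any specific Shadowsocks fork's API; the record_transfer() hook is hypothetical and stands in for wherever your server observes per-connection byte counts):
from prometheus_client import Counter, Gauge, start_http_server

# Counters/gauges keyed by user and port; keep the label set small.
BYTES_TOTAL = Counter(
    "shadowsocks_bytes_total",
    "Bytes transferred through the proxy",
    ["user", "port", "direction"],
)
ACTIVE_CONNECTIONS = Gauge(
    "shadowsocks_active_connections",
    "Currently open client connections",
    ["user", "port"],
)

def start_metrics_endpoint() -> None:
    # Bind to localhost only; expose it further via firewalling or a reverse proxy.
    start_http_server(9191, addr="127.0.0.1")

def record_transfer(user: str, port: int, sent: int, received: int) -> None:
    # Hypothetical hook: call from the server's data path after each read/write.
    BYTES_TOTAL.labels(user=user, port=str(port), direction="sent").inc(sent)
    BYTES_TOTAL.labels(user=user, port=str(port), direction="received").inc(received)
Call start_metrics_endpoint() once at server startup; Prometheus (or a local scraper) then reads http://127.0.0.1:9191/metrics.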
Option B — Log parsing exporter
If the Shadowsocks server writes per-connection logs (bytes transferred, source IP, destination IP), you can run a lightweight parser that tail-follows the log file and updates Prometheus counters. Typical components:
- Log emitter (Shadowsocks with extended logging or a proxy wrapper).
- Parser (Go/Python) that maps log lines to metrics and exposes a /metrics endpoint or writes to the Prometheus textfile collector.
- Prometheus scrapes that endpoint.
This approach is flexible and easier to deploy than in-process instrumentation. However, log rotation and delayed flushing can introduce latency and possible gaps if the server crashes before flushing logs.
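A minimal sketch of such a parser in Python with prometheus_client, assuming a hypothetical log line format like "user=alice port=8388 proto=tcp bytes=4096" (adapt the regex to whatever your server actually emits):
import re
import time
from prometheus_client import Counter, start_http_server

# Assumed log format; change this pattern to match your server's output.
LINE_RE = re.compile(r"user=(\S+) port=(\d+) proto=(tcp|udp) bytes=(\d+)")
BYTES_TOTAL = Counter(
    "shadowsocks_bytes_total",
    "Bytes reported in Shadowsocks access logs",
    ["user", "port", "proto"],
)

def follow(path):
    # Naive tail -f; it does not handle log rotation (reopen on inode change
    # or use a watcher library for production use).
    with open(path) as f:
        f.seek(0, 2)  # start at end of file
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)
                continue
            yield line

if __name__ == "__main__":
    start_http_server(9192, addr="127.0.0.1")
    for line in follow("/var/log/shadowsocks/access.log"):
        m = LINE_RE.search(line)
        if m:
            user, port, proto, nbytes = m.groups()
            BYTES_TOTAL.labels(user=user, port=port, proto=proto).inc(int(nbytes))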
Option C — Kernel-level accounting with iptables/nftables + node_exporter
When you cannot modify the server, using kernel counters is reliable. The general flow:
- Create iptables or nftables rules that match Shadowsocks ports (or mark traffic by process owner if the server runs under a dedicated UID) and count bytes/packets per rule.
- Periodically read these counters and expose them to Prometheus. Two common methods:
  - Use node_exporter's textfile collector: a script polls iptables counters and writes metrics to /var/lib/node_exporter/textfile_collector/shadowsocks.prom.
  - Run a dedicated exporter that reads the counters and serves them via HTTP (more flexible; it can compute per-second rates and detect counter resets).
- Configure Prometheus to scrape the exporter or node_exporter textfile endpoint.
Example: iptables counting and textfile exporter
Below is a production-ready pattern using iptables and the node_exporter textfile collector. The strategy assumes each Shadowsocks port maps to a service or user — you can create one rule per port.
1) Create a dedicated chain and jump Shadowsocks traffic into it (cover every Shadowsocks port here):
iptables -N SS_COUNT
iptables -A INPUT -p tcp --dport 8388:8389 -j SS_COUNT
iptables -A INPUT -p udp --dport 8388:8389 -j SS_COUNT
Then create one counting rule per port/user inside the chain so you can attribute bytes:
iptables -A SS_COUNT -p tcp --dport 8388 -m comment --comment "user=alice" -j ACCEPT
iptables -A SS_COUNT -p tcp --dport 8389 -m comment --comment "user=bob" -j ACCEPT
2) Script to export counters (bash):
#!/bin/bash
# /usr/local/bin/export_ss_iptables.sh
# Reads per-rule byte counters from the SS_COUNT chain and writes them in
# Prometheus text format for the node_exporter textfile collector.
OUT=/var/lib/node_exporter/textfile_collector/shadowsocks.prom
TMP="${OUT}.tmp"
{
  echo "# Generated at $(date -u +"%Y-%m-%dT%H:%M:%SZ")"
  echo "# TYPE shadowsocks_bytes_total counter"
  # Skip the chain header and column-header lines, then read one rule per line.
  # Example metric: shadowsocks_bytes_total{user="alice",port="8388",proto="tcp"}
  iptables -L SS_COUNT -v -x -n | tail -n +3 | while read -r pkts bytes target proto rest; do
    # The comment shows up as /* user=alice */ and the port as dpt:8388.
    user=$(echo "$rest" | sed -n 's#.*/\* user=\([^ ]*\) \*/.*#\1#p')
    port=$(echo "$rest" | sed -n 's#.*dpt:\([0-9]*\).*#\1#p')
    [ -n "$user" ] && [ -n "$port" ] || continue
    echo "shadowsocks_bytes_total{user=\"${user}\",port=\"${port}\",proto=\"${proto}\"} ${bytes}"
  done
} > "$TMP"
mv "$TMP" "$OUT"
Schedule the script with a systemd timer (a sketch follows below) or a cron job; note that cron cannot run more often than once per minute, so use a timer for 15–60 s intervals. node_exporter will pick up the .prom file on its next scrape and expose the counters to Prometheus.
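A minimal systemd unit pair for a 30-second schedule might look like the following (file names, paths, and interval are assumptions to adjust):
# /etc/systemd/system/shadowsocks-export.service
[Unit]
Description=Export Shadowsocks iptables counters for node_exporter

[Service]
Type=oneshot
ExecStart=/usr/local/bin/export_ss_iptables.sh

# /etc/systemd/system/shadowsocks-export.timer
[Unit]
Description=Run the Shadowsocks counter export every 30 seconds

[Timer]
OnBootSec=30
OnUnitActiveSec=30
AccuracySec=1s

[Install]
WantedBy=timers.target
Enable it with systemctl enable --now shadowsocks-export.timer.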
Notes on iptables / nftables
- Use iptables-save output parsing for reliable extraction when comments are involved. The iptables -L output is human-friendly but brittle for automation.
- If you use nftables, the same approach works via nft list ruleset and per-rule counters. nftables also provides structured JSON output (nft --json list ruleset), which is easier for parsers; see the example after this list.
- Be careful with NAT chains and FORWARD/PREROUTING chains if your Shadowsocks instance is running in a container or behind DNAT.
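For reference, a rough nftables equivalent of the counting rules above (table, chain, and hook placement are illustrative) could be:
nft add table inet ss_count
nft add chain inet ss_count input '{ type filter hook input priority 0 ; policy accept ; }'
nft add rule inet ss_count input tcp dport 8388 counter comment '"user=alice"'
nft add rule inet ss_count input tcp dport 8389 counter comment '"user=bob"'
# Structured output for a parser: each rule's expr list contains a
# "counter" object with "packets" and "bytes" fields.
nft --json list chain inet ss_count input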
Prometheus configuration and best practices
Make sure your Prometheus server scrapes the exporter or node_exporter where you wrote metrics. Example job config:
scrape_configs:
  - job_name: 'shadowsocks'
    static_configs:
      - targets: ['10.0.0.5:9100']
Key recommendations:
- Use counters (monotonic increasing values) for bytes so you can use rate() in Prometheus queries: rate(shadowsocks_bytes_total[5m]).
- Apply relabeling to drop unnecessary labels or to normalize labels (e.g., user names or port numbers) to keep cardinality manageable.
- Scrape intervals: 15–60s is typical. For near-real-time dashboards, choose 15s but beware of increased load and series cardinality.
- Use recording rules to compute common aggregates (per-user rate, total traffic per server) to reduce query load on the Prometheus server.
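To illustrate the recording-rule recommendation, a rule file along these lines (group and metric names are illustrative) precomputes the common per-user and per-instance rates:
groups:
  - name: shadowsocks
    interval: 30s
    rules:
      - record: shadowsocks:user_bytes:rate5m
        expr: sum by (user) (rate(shadowsocks_bytes_total[5m]))
      - record: shadowsocks:instance_bytes:rate5m
        expr: sum by (instance) (rate(shadowsocks_bytes_total[5m]))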
Grafana dashboards: useful panels and queries
With metrics exported as shadowsocks_bytes_total{user="...",port="...",proto="..."} and shadowsocks_connections{user="...",port="..."}, create the following panels:
- Current throughput (per server): sum by (instance) (rate(shadowsocks_bytes_total[1m]))
- Top users by 5m traffic: topk(10, sum by (user) (rate(shadowsocks_bytes_total[5m])))
- Active connections: sum by (user) (shadowsocks_connections)
- Per-port latency or errors: if you expose connection_error counters, use increase(connection_errors[1m]) to show error rates.
- Traffic trends: compare against sum(rate(shadowsocks_bytes_total[1m] offset 1d)) for day-over-day comparisons, or use Prometheus's increase() for cumulative totals over a window.
Use template variables for instance and user to make interactive dashboards. Restrict the time range for heavy queries and leverage downsampling/remote storage for long term data retention.
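For the template variables, Grafana query variables backed by the Prometheus data source can use the label_values() helper, for example (variable names are your choice):
instance: label_values(shadowsocks_bytes_total, instance)
user: label_values(shadowsocks_bytes_total{instance=~"$instance"}, user)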
Scaling, cardinality and operational considerations
Prometheus excels at time-series, but it is sensitive to high cardinality. Follow these operational rules:
- Limit labels: avoid including volatile labels like client IPs unless absolutely necessary. Prefer grouping by user or port.
- Use recording rules to precompute heavy aggregations and reduce query-time work.
- If you need per-IP analysis for occasional investigations, export sampled data to a separate analytics pipeline (e.g., flow logs via Kafka/ClickHouse or ElasticSearch), not Prometheus.
- For large fleets, consider using Prometheus federation or remote_write to a long-term store (Thanos, Cortex, or Mimir).
- Ensure metrics endpoints are protected and monitored; an unprotected /metrics could leak sensitive usage data.
Handling resets, reboots and counter wraps
Kernel counters and application counters may reset (process restart, iptables flush, server reboot). Prometheus's rate() and increase() treat a drop in a counter's value as a reset and compensate automatically, but you must still:
- Detect and handle counter resets in custom exporters: if the raw source value can go backwards, accumulate an offset so the exported value only ever increases, matching Prometheus counter semantics (see the sketch after this list).
- Persist per-user state for long-running counters if you need absolute totals across restarts (e.g., in a database or by using remote_write).
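A minimal sketch of that offset-tracking idea (in-memory only; persist the state if absolute totals must survive exporter restarts):
class MonotonicCounter:
    # Wraps a raw, resettable source value (e.g. an iptables byte count)
    # and exposes a value that only ever increases.
    def __init__(self) -> None:
        self._last_raw = 0
        self._offset = 0

    def update(self, raw: int) -> int:
        if raw < self._last_raw:
            # Source reset (rule flush, reboot): fold the old total into the offset.
            self._offset += self._last_raw
        self._last_raw = raw
        return self._offset + raw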
Security and access considerations
Metrics often contain sensitive operational details. Protect them using:
- Bind exporters to 127.0.0.1 and publish metrics through the local node_exporter textfile collector, or protect the endpoint with mTLS/HTTP basic auth where supported.
- Network ACLs: use firewall rules so that only the Prometheus server can reach exporter ports (example below).
- Audit dashboards and restrict Grafana access to authorized users.
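For the network ACL above, a simple iptables pair restricting node_exporter's port to the Prometheus server (10.0.0.2 is a placeholder address) could be:
iptables -A INPUT -p tcp --dport 9100 -s 10.0.0.2 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP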
Example production deployment checklist
- Choose export method: in-process exporter (preferred) or iptables-based exporter.
- Implement exporter and protect /metrics endpoint.
- Deploy node_exporter with textfile collector if using iptables script method.
- Configure Prometheus scrape job, recording rules, and retention policy.
- Create Grafana dashboards with templates and alerts for high bandwidth, anomalous spikes, and excessive connection counts (an example alerting rule follows this checklist).
- Test behavior under server restarts and high cardinality scenarios.
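As one way to express the high-bandwidth alert from the checklist, a Prometheus alerting rule might look like this (the 50 MB/s threshold and durations are placeholders):
groups:
  - name: shadowsocks-alerts
    rules:
      - alert: ShadowsocksHighBandwidth
        expr: sum by (user) (rate(shadowsocks_bytes_total[5m])) > 50e6
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "User {{ $labels.user }} sustained more than 50 MB/s for 10 minutes"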
By combining Shadowsocks-aware metrics (or kernel-level accounting) with Prometheus scraping and Grafana visualization, operators gain a reliable, extensible observability pipeline. This enables real-time traffic monitoring, trend analysis, user-level billing, and automated alerting for anomalous traffic. Whether you choose to instrument the Shadowsocks binary, parse logs, or rely on iptables/nftables counters, the patterns described above provide a practical path to production-grade monitoring with minimal runtime overhead.
For implementation examples, exporters, and community tools, check relevant GitHub projects such as shadowsocks-exporter or nftables-to-prometheus scripts, and adapt them to your environment and security requirements.
Article published by Dedicated-IP-VPN — https://dedicated-ip-vpn.com/