Metrics collector (wiregui/collector.py):
- Standalone process spawned by web app when WG_METRICS_ENABLED=true
- Polls wg show dump every WG_METRICS_POLL_INTERVAL seconds (default 5)
- Updates device stats in PostgreSQL
- Pushes Prometheus-format metrics to VictoriaMetrics (if configured)
- Graceful shutdown on SIGTERM

Integration test stack (compose.yml):
- Unified compose file for dev, test, and integration modes
- VictoriaMetrics single-node TSDB for metrics storage
- 3 mock WireGuard client containers generating ping traffic
- Automated setup script seeds server keypair, admin user, client devices
- make test-stack-up: one command to start everything
- make test-stack-verify: validates metrics flowing end-to-end

Infrastructure:
- Makefile with targets for dev, test, integration, and production
- Integration tests verify VictoriaMetrics has data for all 3 clients
- Fix Dockerfile to include img/ directory
- Separate TESTS.md for test tracking, clean TODO.md for features only
WireGUI — TODO
WireGuard Metrics Collector
Overview
Separate Python process dedicated to high-frequency WireGuard stats collection, with optional VictoriaMetrics time-series storage. Replaces the current 60s in-process polling with a 5s external collector.
Current state
- tasks/stats.py: polls wg show dump every 60s inside the web process asyncio loop
- UI timers: 30s refresh on device pages
- Worst-case latency: ~90s before a stat change is visible
Target state
- Collector process: polls every 5s, writes to DB + VictoriaMetrics
- UI timers: 10s refresh
- Worst-case latency: ~15s
Phase 1: Configuration ✅
- Add settings to config.py:
  - WG_METRICS_ENABLED: bool = False
  - WG_METRICS_POLL_INTERVAL: int = 5 (seconds)
  - WG_VICTORIAMETRICS_URL: str | None = None (e.g. http://localhost:8428)
- When WG_METRICS_ENABLED=false, keep existing stats_loop as fallback
- When WG_METRICS_ENABLED=true, skip registering stats_loop in main.py
Phase 2: Collector process ✅
- Create wiregui/collector.py as a standalone entry point (python -m wiregui.collector)
- No NiceGUI dependency: only asyncio + asyncpg + httpx
- Poll wg show <iface> dump every WG_METRICS_POLL_INTERVAL seconds
- Update Device rows in PostgreSQL (same fields as current stats_loop)
- Push metrics to VictoriaMetrics via /api/v1/import/prometheus (if URL configured)
- Graceful shutdown on SIGTERM/SIGINT
- Web app spawns collector as subprocess when WG_METRICS_ENABLED=true
- Web app terminates collector on shutdown
Phase 3: VictoriaMetrics metrics
Metrics to push (Prometheus exposition format):
- wiregui_peer_rx_bytes{public_key, user_email, device_name}: counter
- wiregui_peer_tx_bytes{public_key, user_email, device_name}: counter
- wiregui_peer_latest_handshake_seconds{public_key, user_email, device_name}: gauge
- wiregui_peer_connected{public_key, user_email, device_name}: 1 if handshake < 180s, else 0
- wiregui_peers_total: gauge, count of active peers
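As a rough illustration, the metrics listed above could be rendered into Prometheus text exposition lines like this. The sketch makes two assumptions that the list does not settle: `wiregui_peer_latest_handshake_seconds` carries the handshake Unix timestamp (not its age), and "active" in `wiregui_peers_total` means connected under the 180s rule. The function names and the dictionary keys are hypothetical.

```python
def _escape_label(value: str) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_peer_metrics(peers: list[dict], now: int, connected_window: int = 180) -> str:
    """Render per-peer metrics plus the total as exposition-format text."""
    lines: list[str] = []
    connected_total = 0
    for p in peers:
        labels = ",".join(
            f'{key}="{_escape_label(str(p[key]))}"'
            for key in ("public_key", "user_email", "device_name")
        )
        handshake = int(p["latest_handshake"])
        # A peer that never completed a handshake reports 0 and is not connected.
        connected = 1 if handshake > 0 and now - handshake < connected_window else 0
        connected_total += connected
        lines.append(f"wiregui_peer_rx_bytes{{{labels}}} {int(p['rx_bytes'])}")
        lines.append(f"wiregui_peer_tx_bytes{{{labels}}} {int(p['tx_bytes'])}")
        # Assumption: the gauge value is the handshake Unix timestamp.
        lines.append(f"wiregui_peer_latest_handshake_seconds{{{labels}}} {handshake}")
        lines.append(f"wiregui_peer_connected{{{labels}}} {connected}")
    # Assumption: "active" peers are those counted as connected above.
    lines.append(f"wiregui_peers_total {connected_total}")
    return "\n".join(lines) + "\n"
```

The resulting string is the kind of body that would be sent to the import endpoint in a single POST.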
Phase 4: UI improvements
- Reduce UI timer from 30s to 10s on device pages (devices.py, admin/devices.py)
- Add connection status indicator (green/yellow/red dot) based on handshake age
  - Green: handshake < 2 min
  - Yellow: handshake < 5 min
  - Red: no recent handshake or never connected
- Add traffic rate display (bytes/sec computed from delta between polls)
- Device detail page: mini traffic chart (query VictoriaMetrics if available, else show last-known values)
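The status thresholds and the rate calculation above might look like the following sketch. The function names are hypothetical. Treating a never-connected peer as a handshake timestamp of 0, and reporting a rate of 0 after a counter reset, are assumptions and not documented behavior.

```python
def connection_status(latest_handshake: int, now: int) -> str:
    """Map handshake age to the green/yellow/red indicator."""
    if latest_handshake <= 0:
        return "red"  # assumption: 0 means the peer never connected
    age = now - latest_handshake
    if age < 120:
        return "green"
    if age < 300:
        return "yellow"
    return "red"


def traffic_rate(prev_bytes: int, curr_bytes: int, interval_seconds: float) -> float:
    """Bytes per second computed from the delta between two polls."""
    if interval_seconds <= 0:
        return 0.0
    delta = curr_bytes - prev_bytes
    if delta < 0:
        # The counter went backwards (e.g. the interface was recreated);
        # report 0 for this interval instead of a negative rate.
        return 0.0
    return delta / interval_seconds
```

For example, a counter that grows from 1000 to 6000 bytes across a 5s poll interval gives a rate of 1000 bytes/sec.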
Phase 5: Infrastructure ✅
- Create compose.test.yml: full integration stack with real WG
- Add VictoriaMetrics (single-node, port 8428, 7d retention)
- Add 3 mock WG client containers (alpine + wireguard-tools)
- Clients generate traffic by pinging each other through the tunnel every 3s
- Setup script (docker/mock-clients/setup.py) generates keypairs and configs
- Collector runs as subprocess inside the WireGUI container (shares network namespace)
- Add VictoriaMetrics to dev compose.yml (optional, for local testing)
Design notes
- Why a separate process? The wg show subprocess call and DB writes at 5s intervals shouldn't share the asyncio loop with the web app. A separate process ensures UI responsiveness isn't affected by stats collection.
- Why not run.cpu_bound? That uses ProcessPoolExecutor for one-shot CPU tasks inside request handling, which is not suitable for a long-running daemon. A separate entry point is cleaner.
- VictoriaMetrics push model: Push metrics in Prometheus text exposition format to the /api/v1/import/prometheus endpoint. No scrape config is needed because the collector pushes directly. VictoriaMetrics is optional: the collector works fine with just PostgreSQL.
- Backward compatible: When WG_METRICS_ENABLED=false (default), everything works exactly as it does today.
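The separate-process design implies a long-running loop that polls at a fixed interval and exits cleanly on SIGTERM/SIGINT. A minimal asyncio sketch of that loop follows. `run_collector`, `main`, and the `poll_once` callback are hypothetical names, and the real collector may be structured differently.

```python
import asyncio
import signal
from collections.abc import Awaitable, Callable


async def run_collector(
    poll_once: Callable[[], Awaitable[None]],
    interval: float,
    stop: asyncio.Event,
) -> None:
    """Call poll_once every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        try:
            await poll_once()
        except Exception as exc:
            # Keep the daemon alive on transient failures (DB down, wg error).
            print(f"poll failed: {exc!r}")
        try:
            # Sleep for one interval, but wake immediately on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass


async def main(poll_once: Callable[[], Awaitable[None]], interval: float = 5.0) -> None:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        # add_signal_handler is available on Unix event loops only.
        loop.add_signal_handler(sig, stop.set)
    await run_collector(poll_once, interval, stop)
```

Waiting on the stop event in place of a plain sleep means a shutdown signal ends the loop without waiting out the remainder of the poll interval.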
UI
- SAML provider management in Authentication tab (admin settings)
- SSO Providers on account page: add Status column, "Disconnect" action
- Admin pages (users, devices, rules): apply same card-based styling as account/settings/diagnostics
Features
- First-run CLI setup command