wiregui/TODO.md

# WireGUI — TODO
feat: initial WireGUI implementation — full VPN management platform Complete Python/NiceGUI rewrite of the Wirezone (Elixir/Phoenix) VPN management platform. All 10 implementation phases delivered. Core stack: - NiceGUI reactive UI with SQLModel ORM on PostgreSQL (asyncpg) - Alembic migrations, Valkey/Redis cache, pydantic-settings config - WireGuard management via subprocess (wg/ip/nft CLIs) - 164 tests passing, 35% code coverage Features: - User/device/rule CRUD with admin and unprivileged roles - Full device config form with per-device WG overrides - WireGuard client config generation with QR codes - REST API (v0) with Bearer token auth for all resources - TOTP MFA with QR registration and challenge flow - OIDC SSO with authlib (provider registry, auto-create users) - Magic link passwordless sign-in via email - SAML SP-initiated SSO with IdP metadata parsing - WebAuthn/FIDO2 security key registration - nftables firewall with per-user chains and masquerade - Background tasks: WG stats polling, VPN session expiry, OIDC token refresh, WAN connectivity checks - Startup reconciliation (DB ↔ WireGuard state sync) - In-memory notification system with header badge - Admin UI: users, devices, rules, settings (3 tabs), diagnostics - Loguru logging with optional timestamped file output Deployment: - Multi-stage Dockerfile (python:3.13-slim) - Docker Compose prod stack (bridge networking, NET_ADMIN, nftables) - Forgejo CI: tests → semantic versioning → Docker registry push - Health endpoint at /api/health
2026-03-30 16:53:46 -05:00
---
## WireGuard Metrics Collector
### Overview
Separate Python process dedicated to high-frequency WireGuard stats collection, with optional VictoriaMetrics time-series storage. Replaces the current 60s in-process polling with a 5s external collector.
### Current state
- `tasks/stats.py`: polls `wg show <iface> dump` every 60s inside the web process asyncio loop
- UI timers: 30s refresh on device pages
- Worst-case latency: ~90s before a stat change is visible
### Target state
- Collector process: polls every 5s, writes to DB + VictoriaMetrics
- UI timers: 10s refresh
- Worst-case latency: ~15s
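
The latency figures in the current and target states follow from adding the poll interval to the UI refresh interval. A small sketch of that arithmetic (the function name is illustrative, not part of the codebase):

```python
def worst_case_latency(poll_interval_s: int, ui_refresh_s: int) -> int:
    """Upper bound on the delay before a stat change is visible in the UI.

    A change can occur just after a poll (waits one full poll interval),
    and the resulting DB write can land just after a UI refresh
    (waits one full refresh interval).
    """
    return poll_interval_s + ui_refresh_s


current = worst_case_latency(poll_interval_s=60, ui_refresh_s=30)
target = worst_case_latency(poll_interval_s=5, ui_refresh_s=10)
print(f"current: ~{current}s, target: ~{target}s")
```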
### Phase 1: Configuration ✅
- [x] Add settings to `config.py`:
  - `WG_METRICS_ENABLED: bool = False`
  - `WG_METRICS_POLL_INTERVAL: int = 5` (seconds)
  - `WG_VICTORIAMETRICS_URL: str | None = None` (e.g. `http://localhost:8428`)
- [x] When `WG_METRICS_ENABLED=false`, keep existing `stats_loop` as fallback
- [x] When `WG_METRICS_ENABLED=true`, skip registering `stats_loop` in `main.py`
### Phase 2: Collector process ✅
- [x] Create `wiregui/collector.py` — standalone entry point (`python -m wiregui.collector`)
- [x] No NiceGUI dependency — only asyncio + asyncpg + httpx
- [x] Poll `wg show <iface> dump` every `WG_METRICS_POLL_INTERVAL` seconds
- [x] Update Device rows in PostgreSQL (same fields as current `stats_loop`)
- [x] Push metrics to VictoriaMetrics via `/api/v1/import/prometheus` (if URL configured)
- [x] Graceful shutdown on SIGTERM/SIGINT
- [x] Web app spawns collector as subprocess when `WG_METRICS_ENABLED=true`
- [x] Web app terminates collector on shutdown
### Phase 3: VictoriaMetrics metrics ✅
All metrics implemented in `collector.py` and verified by integration tests:
- [x] `wiregui_peer_rx_bytes{public_key, user_email, device_name}` — counter
- [x] `wiregui_peer_tx_bytes{public_key, user_email, device_name}` — counter
- [x] `wiregui_peer_latest_handshake_seconds{public_key, user_email, device_name}` — gauge
- [x] `wiregui_peer_connected{public_key, user_email, device_name}` — 1 if handshake < 180s, else 0
- [x] `wiregui_peers_total` — gauge, count of active peers
### Phase 4: UI improvements
- [x] Reduce UI timer from 30s to 5s on all device pages (devices.py, admin/devices.py, detail page)
- [x] Add connection status indicator (green/yellow/red dot) based on handshake age
  - Green: handshake < 2 min
  - Yellow: handshake < 5 min
  - Red: no recent handshake or never connected
- [x] Status column in both user and admin device tables
- [x] Status badge on device detail page (live-updating)
- [x] Add traffic rate display (RX/s, TX/s computed from delta between 5s polls)
- [x] Device detail page: live ECharts traffic rate chart (RX/s + TX/s area lines, 60-point rolling window, auto-scaled axis with human-readable byte formatting)
### Phase 5: Infrastructure ✅
- [x] Create `compose.test.yml` — full integration stack with real WG
- [x] Add VictoriaMetrics (single-node, port 8428, 7d retention)
- [x] Add 3 mock WG client containers (alpine + wireguard-tools)
- [x] Clients generate traffic by pinging each other through the tunnel every 3s
- [x] Setup script (`docker/mock-clients/setup.py`) generates keypairs and configs
- [x] Collector runs as subprocess inside the WireGUI container (shares network namespace)
- [ ] Add VictoriaMetrics to dev `compose.yml` (optional, for local testing)
### Design notes
- **Why a separate process?** The `wg show` subprocess call and DB writes at 5s intervals shouldn't share the asyncio loop with the web app. A separate process ensures UI responsiveness isn't affected by stats collection.
- **Why not `run.cpu_bound`?** That uses `ProcessPoolExecutor` for one-shot CPU tasks inside request handling — not suitable for a long-running daemon. A separate entry point is cleaner.
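- **VictoriaMetrics push model:** The collector pushes samples in Prometheus text exposition format to VictoriaMetrics' `/api/v1/import/prometheus` endpoint (this is not the Prometheus remote write protocol), so no scrape config is needed. VictoriaMetrics is optional; the collector works fine with just PostgreSQL.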
- **Backward compatible:** When `WG_METRICS_ENABLED=false` (default), everything works exactly as it does today.
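
To make the separate-process design concrete, here is a minimal sketch of a standalone poll loop that stops cleanly on SIGTERM/SIGINT. `run_collector` and `poll_once` are hypothetical names, and the body of `poll_once` is a placeholder for the real work (run `wg`, update Device rows, push metrics).

```python
import asyncio
import signal


async def run_collector(poll_once, interval_s: float, stop: asyncio.Event) -> int:
    """Call poll_once() every interval_s seconds until stop is set.

    Returns the number of polls performed.
    """
    polls = 0
    while not stop.is_set():
        await poll_once()
        polls += 1
        try:
            # Sleep for one interval, but wake up immediately on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass  # interval elapsed without a shutdown request
    return polls


async def main() -> None:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        # Unix only: the handler just flags the loop to finish.
        loop.add_signal_handler(sig, stop.set)

    async def poll_once() -> None:
        print("poll")  # placeholder for the real collection step

    await run_collector(poll_once, interval_s=5, stop=stop)


# Entry point, intentionally not invoked here: asyncio.run(main())
```

Waiting on the stop event instead of calling `asyncio.sleep` means a shutdown signal ends the loop at once rather than after the remainder of the poll interval.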
---
## UI
- [ ] SAML provider management in Authentication tab (admin settings)
- [ ] SSO Providers on account page: add Status column, "Disconnect" action
- [ ] Admin pages (users, devices, rules): apply same card-based styling as account/settings/diagnostics
## Features
- [ ] First-run CLI setup command