ops(sol): add offline canary recovery path
All checks were successful
deploy-trade-r001-canary / apply (push) Successful in 6m45s

mpabi
2026-04-12 19:25:55 +02:00
parent c76eb7d5f3
commit 1acb8d403e
7 changed files with 212 additions and 4 deletions


@@ -45,6 +45,7 @@ Minimal canary namespace for migration baseline `R001` on `sol`.
- `trade-api` and `trade-frontend` use the current live images from Gitea registry and the same bootstrap wrapper/config pattern as the source environment.
- `dlob-publisher-hot` now targets the host validator on `sol` through `trade-infra` services and writes `dlob-hot:*` into the shared Redis host service.
- `dlob-publisher-all` now targets the same host validator path on `sol` and writes `dlob-all:*` into the shared Redis host service.
- `dlob-publisher-hot` and `dlob-publisher-all` use `/startup` for Kubernetes liveness on `sol`. The internal `/health` endpoint stayed noisy on self-hosted Agave even while downstream `drift_ticks` data remained fresh, so the operator smoke check is the stronger health gate for canary rollouts.
- `dlob-hot-redis-to-postgres-raw-writer` and `dlob-hot-postgres-to-postgres-derived-writer` rebuild the first live DLOB derived path on `sol`.
- `dlob-all-redis-to-postgres-derived-writer` rebuilds the live full-market derived DLOB path on `sol`.
- The canary workflow re-runs:
@@ -67,6 +68,26 @@ kubectl apply -k environments/sol/trade-infra
./environments/sol/trade-r001-canary/scripts/create-gitea-registry-secret.sh
./environments/sol/trade-r001-canary/scripts/create-trade-dlob-rpc-secret.sh
./environments/sol/trade-r001-canary/scripts/sync-live-secrets.sh
./environments/sol/trade-r001-canary/scripts/snapshot-sol-secrets.sh
./environments/sol/trade-r001-canary/scripts/check-sol-canary-smoke.sh
```
After the prerequisites are seeded, push to `main` and let `deploy-trade-r001-canary` apply the environment.
The smoke check script validates:
- `agave-validator` and `k3s` service state on `sol`
- Agave RPC lag and health
- deployment readiness in `trade-r001-canary`
- derived DLOB and `drift_ticks` freshness in Postgres
- `trade-api` read-path and `trade-frontend` HTTP response
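A minimal sketch of how checks like the ones listed above could be chained so that every failure is reported in one pass. This is not the actual contents of `check-sol-canary-smoke.sh`; the helper names `run_check` and `is_fresh`, the variable `last_tick_epoch`, and the 60-second thresholds are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; not the real check-sol-canary-smoke.sh.

failures=0

# Run one named check. A failing check is counted instead of aborting,
# so the remaining checks still run and get reported.
run_check() {
  local name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok   ${name}"
  else
    echo "FAIL ${name}"
    failures=$((failures + 1))
  fi
}

# Succeed when a timestamp (epoch seconds) is at most max_age seconds old.
# The optional third argument overrides "now", which makes the helper testable.
is_fresh() {
  local last_ts="$1" max_age="$2" now="${3:-$(date +%s)}"
  [ $(( now - last_ts )) -le "$max_age" ]
}

# Example wiring (commented out; it needs access to sol and the cluster,
# and last_tick_epoch would have to be read from Postgres first):
# run_check "agave-validator active" systemctl is-active --quiet agave-validator
# run_check "k3s active"             systemctl is-active --quiet k3s
# run_check "deployments ready"      kubectl -n trade-r001-canary wait \
#   --for=condition=Available deployment --all --timeout=60s
# run_check "drift_ticks fresh"      is_fresh "$last_tick_epoch" 60
# exit "$failures"
```

Counting failures rather than exiting on the first one means a single run shows the full state of the canary, which is usually what an operator wants from a smoke check.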
The bootstrap scripts automatically prefer a local secret snapshot from `$HOME/.local/share/trade-bootstrap/sol/trade-r001-canary-secrets` when that directory exists, so they keep working if `mevnode_bot` is no longer available:
```bash
./environments/sol/trade-r001-canary/scripts/snapshot-sol-secrets.sh
SOURCE_DIR="$HOME/.local/share/trade-bootstrap/sol/trade-r001-canary-secrets" \
./environments/sol/trade-r001-canary/scripts/prepare-sol-postgres.sh
SOURCE_DIR="$HOME/.local/share/trade-bootstrap/sol/trade-r001-canary-secrets" \
./environments/sol/trade-r001-canary/scripts/sync-live-secrets.sh
```
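The snapshot preference described above might look roughly like the following inside a bootstrap script. This is a sketch under assumptions, not the scripts' actual logic: the function name `resolve_secret_source`, the `remote` marker, and the rule that an explicit `SOURCE_DIR` takes precedence are all hypothetical.

```bash
# Hypothetical sketch of the source-selection logic; the real scripts may differ.
SNAPSHOT_DIR_DEFAULT="$HOME/.local/share/trade-bootstrap/sol/trade-r001-canary-secrets"

# Print the secret source to use:
#   1. an explicit SOURCE_DIR, if set and non-empty (assumed precedence);
#   2. the local snapshot directory, if it exists;
#   3. otherwise the literal marker "remote".
# The optional first argument overrides the snapshot directory to probe.
resolve_secret_source() {
  local snapshot_dir="${1:-$SNAPSHOT_DIR_DEFAULT}"
  if [ -n "${SOURCE_DIR:-}" ]; then
    echo "$SOURCE_DIR"
  elif [ -d "$snapshot_dir" ]; then
    echo "$snapshot_dir"
  else
    echo "remote"
  fi
}
```

A caller could then branch on the result, for example reading secrets from the printed directory and falling back to the original source only when the function prints `remote`.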