docs: clarify health dashboard remediation (#35)
@@ -5,18 +5,44 @@ Use this runbook when the repository supervisor health dashboard becomes stale.
 ## Signals
 
 - The pinned supervisor issue has an old `last_refresh:` marker.
-- The stats scheduler exists and is loaded, but wrapper runs overlap or remain active beyond the expected timeout.
+- The stats scheduler exists and is loaded, but wrapper runs overlap or remain
+  active beyond the expected timeout.
 - Recent stats logs stop before the affected repository is updated.
 
 ## Remediation
 
-1. Confirm the launchd job and stats log are present.
+1. Confirm the scheduler and stats log are present:
+
+   ```bash
+   launchctl list | grep -i aidevops-stats-wrapper
+   ls -la ~/Library/LaunchAgents/com.aidevops.aidevops-stats-wrapper.plist
+   tail -40 ~/.aidevops/logs/stats.log
+   ```
+
 2. Check for an active `stats-wrapper.sh` process that has exceeded `STATS_TIMEOUT`.
-3. Terminate the stale wrapper process, remove the stale stats pidfile, and run one targeted health issue refresh for this repository.
-4. Verify the pinned dashboard issue now has a fresh `last_refresh:` marker and recent `updated_at` timestamp.
+   The default timeout is `600` seconds and is defined near the top of
+   `~/.aidevops/agents/scripts/stats-wrapper.sh`; an operator can confirm the
+   runtime value with a dry run:
 
-## Verification evidence for issue #32
+   ```bash
+   STATS_DRY_RUN=1 bash ~/.aidevops/agents/scripts/stats-wrapper.sh --dry-run
+   ```
 
-- Scheduler plist existed and launchd reported `com.aidevops.aidevops-stats-wrapper` loaded.
-- The stats log showed repeated overlapping runs and a stale wrapper process, with the repository dashboard skipped before the targeted refresh.
-- A targeted refresh updated dashboard issue #10 with `last_refresh: 2026-05-10T00:38:30Z`.
+3. Terminate the stale wrapper process and remove the stale stats pidfile at
+   `~/.aidevops/logs/stats.pid`.
+
+4. Run one targeted health issue refresh for this repository:
+
+   ```bash
+   REPO_SLUG="wpallstars/wp-fix-plugin-does-not-exist-notices" \
+   REPO_PATH="$HOME/Git/wordpress/wp-fix-plugin-does-not-exist-notices" \
+   bash -lc '
+     source "$HOME/.aidevops/agents/scripts/shared-constants.sh"
+     source "$HOME/.aidevops/agents/scripts/worker-lifecycle-common.sh"
+     source "$HOME/.aidevops/agents/scripts/stats-functions.sh"
+     _update_health_issue_for_repo "$REPO_SLUG" "$REPO_PATH" "" "" ""
+   '
+   ```
+
+5. Verify the pinned dashboard issue now has a fresh `last_refresh:` marker and
+   recent `updated_at` timestamp.
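Step 2 of the runbook asks the operator to find a wrapper run that has exceeded `STATS_TIMEOUT`, but it does not show how to compare a process's age against that value. The following is a minimal sketch of one way to do it, not the wrapper's own logic. It assumes the pidfile at `~/.aidevops/logs/stats.pid` contains a single bare PID and that `STATS_TIMEOUT` can be supplied through the environment (falling back to the documented default of `600`). The function names `etime_to_seconds` and `check_stale_wrapper` are hypothetical and exist only for this illustration.

```bash
#!/usr/bin/env bash
# Sketch only: report whether the PID recorded in the stats pidfile has been
# running longer than STATS_TIMEOUT. Helper names are hypothetical.

STATS_TIMEOUT="${STATS_TIMEOUT:-600}"                 # documented default
PIDFILE="${PIDFILE:-$HOME/.aidevops/logs/stats.pid}"  # assumed to hold one PID

# Convert the ps "etime" format ([[dd-]hh:]mm:ss) to whole seconds.
etime_to_seconds() {
  local etime="$1" days=0 h=0 m=0 s=0 a b c
  if [[ "$etime" == *-* ]]; then
    days="${etime%%-*}"
    etime="${etime#*-}"
  fi
  IFS=: read -r a b c <<<"$etime"
  if [[ -n "$c" ]]; then
    h="$a"; m="$b"; s="$c"
  else
    m="$a"; s="$b"
  fi
  # 10# forces base 10 so that fields such as "08" are not parsed as octal.
  echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print a one-line verdict about the process named in the pidfile.
check_stale_wrapper() {
  local pid etime age
  if [[ ! -f "$PIDFILE" ]]; then
    echo "no pidfile"
    return 0
  fi
  pid="$(tr -d '[:space:]' <"$PIDFILE")"
  if [[ ! "$pid" =~ ^[0-9]+$ ]]; then
    echo "pidfile does not contain a numeric PID"
    return 0
  fi
  etime="$(ps -o etime= -p "$pid" 2>/dev/null | tr -d '[:space:]')"
  if [[ -z "$etime" ]]; then
    echo "pid $pid is not running; pidfile is stale"
    return 0
  fi
  age="$(etime_to_seconds "$etime")"
  if (( age > STATS_TIMEOUT )); then
    echo "pid $pid has run ${age}s, beyond the ${STATS_TIMEOUT}s timeout"
  else
    echo "pid $pid has run ${age}s, within the ${STATS_TIMEOUT}s timeout"
  fi
}
```

After sourcing the snippet, running `check_stale_wrapper` prints a single verdict line. The sketch only reports; terminating the process and removing the pidfile stay manual, as in step 3.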