docs: clarify health dashboard remediation (#35)
@@ -5,18 +5,44 @@ Use this runbook when the repository supervisor health dashboard becomes stale.

## Signals

- The pinned supervisor issue has an old `last_refresh:` marker.
- The stats scheduler exists and is loaded, but wrapper runs overlap or remain
  active beyond the expected timeout.
- Recent stats logs stop before the affected repository is updated.

## Remediation

1. Confirm the launchd scheduler job and stats log are present:

```bash
launchctl list | grep -i aidevops-stats-wrapper
ls -la ~/Library/LaunchAgents/com.aidevops.aidevops-stats-wrapper.plist
tail -40 ~/.aidevops/logs/stats.log
```

2. Check for an active `stats-wrapper.sh` process that has exceeded `STATS_TIMEOUT`.
The default timeout is `600` seconds and is defined near the top of
`~/.aidevops/agents/scripts/stats-wrapper.sh`; an operator can confirm the
runtime value with a dry run:

```bash
STATS_DRY_RUN=1 bash ~/.aidevops/agents/scripts/stats-wrapper.sh --dry-run
```

## Verification evidence for issue #32

- Scheduler plist existed and launchd reported `com.aidevops.aidevops-stats-wrapper` loaded.
- The stats log showed repeated overlapping runs and a stale wrapper process, with the repository dashboard skipped before the targeted refresh.
- A targeted refresh updated dashboard issue #10 with `last_refresh: 2026-05-10T00:38:30Z`.

3. Terminate the stale wrapper process and remove the stale stats pidfile at
   `~/.aidevops/logs/stats.pid`.
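
Steps 2 and 3 can be sketched as a few shell helpers. This is a sketch under stated assumptions, not part of the aidevops scripts: the function names are hypothetical, the pidfile is assumed to hold a single PID, and any process whose command line contains `stats-wrapper.sh` is treated as the wrapper (an editor with the script open would also match). It reads `etime` from `ps` because the macOS `ps` has no `etimes` column.

```bash
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not aidevops functions.

STATS_TIMEOUT="${STATS_TIMEOUT:-600}"                  # default quoted in this runbook
PIDFILE="${PIDFILE:-$HOME/.aidevops/logs/stats.pid}"   # path quoted in this runbook

# Convert ps "etime" output ([[dd-]hh:]mm:ss) to whole seconds.
etime_to_seconds() (
  etime=$(printf '%s' "$1" | tr -d ' ')
  days=0
  case "$etime" in
    *-*) days=${etime%%-*}; etime=${etime#*-} ;;
  esac
  # awk does base-10 arithmetic, so zero-padded fields such as "08" are safe.
  printf '%s\n' "$etime" | awk -F: -v d="$days" '{
    s = 0
    for (i = 1; i <= NF; i++) s = s * 60 + $i
    print d * 86400 + s
  }'
)

# Print "pid seconds" for every stats-wrapper.sh process older than STATS_TIMEOUT.
find_stale_wrappers() {
  ps -A -o pid= -o etime= -o args= | while read -r pid etime cmd; do
    case "$cmd" in *stats-wrapper.sh*) ;; *) continue ;; esac
    secs=$(etime_to_seconds "$etime")
    if [ "$secs" -gt "$STATS_TIMEOUT" ]; then echo "$pid $secs"; fi
  done
}

# Send TERM, poll for up to $2 seconds (default 10), then send KILL.
stop_process() {
  kill -TERM "$1" 2>/dev/null || return 0     # process is already gone
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    kill -0 "$1" 2>/dev/null || return 0
    sleep 1
    tries=$((tries - 1))
  done
  kill -KILL "$1" 2>/dev/null || true
}

# Remove the pidfile unless the PID it names is still running.
remove_stale_pidfile() {
  [ -f "$PIDFILE" ] || return 0
  pid=$(tr -d '[:space:]' < "$PIDFILE")
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "pid $pid from $PIDFILE is still running; leaving pidfile" >&2
    return 1
  fi
  rm -f "$PIDFILE"
}

# Stop every stale wrapper, then clear the pidfile.
cleanup_stale_stats_run() {
  find_stale_wrappers | while read -r pid secs; do
    echo "stopping stats-wrapper.sh pid=$pid after ${secs}s" >&2
    stop_process "$pid"
  done
  remove_stale_pidfile
}
```

Nothing runs until `cleanup_stale_stats_run` is called. Running `find_stale_wrappers` on its own first lists the candidate PIDs without touching them, which is the safer order on a shared machine.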

4. Run one targeted health issue refresh for this repository:

```bash
REPO_SLUG="wpallstars/wp-fix-plugin-does-not-exist-notices" \
REPO_PATH="$HOME/Git/wordpress/wp-fix-plugin-does-not-exist-notices" \
bash -lc '
  source "$HOME/.aidevops/agents/scripts/shared-constants.sh"
  source "$HOME/.aidevops/agents/scripts/worker-lifecycle-common.sh"
  source "$HOME/.aidevops/agents/scripts/stats-functions.sh"
  _update_health_issue_for_repo "$REPO_SLUG" "$REPO_PATH" "" "" ""
'
```

5. Verify the pinned dashboard issue now has a fresh `last_refresh:` marker and
   recent `updated_at` timestamp.
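
The freshness check in step 5 can be scripted roughly as follows. This is a minimal sketch: the helper names and the 30-minute threshold are assumptions, and it expects the marker to appear in the issue body as `last_refresh:` followed by an ISO-8601 UTC timestamp, as in the evidence recorded for issue #32. The date conversion tries GNU `date` first and falls back to the BSD/macOS form.

```bash
#!/bin/sh
# Sketch only: helper names and the 1800-second threshold are assumptions.

# Print the first "last_refresh: <timestamp>" value found on stdin.
extract_last_refresh() {
  sed -n 's/.*last_refresh:[[:space:]]*\([0-9][0-9T:Z-]*\).*/\1/p' | head -n 1
}

# Convert an ISO-8601 UTC timestamp to epoch seconds (GNU date, then BSD date).
iso_to_epoch() {
  date -u -d "$1" +%s 2>/dev/null ||
    date -j -u -f '%Y-%m-%dT%H:%M:%SZ' "$1" +%s
}

# Succeed if timestamp $1 is at most $2 seconds (default 1800) older than
# $3 (default: the current time, in epoch seconds).
is_fresh() {
  refreshed=$(iso_to_epoch "$1") || return 2
  now=${3:-$(date -u +%s)}
  [ $(( now - refreshed )) -le "${2:-1800}" ]
}

# Example use with the GitHub CLI; issue 10 is the dashboard issue named in
# the verification evidence, and REPO_SLUG is the value used in step 4:
#
#   ts=$(gh issue view 10 --repo "$REPO_SLUG" --json body --jq .body | extract_last_refresh)
#   is_fresh "$ts" && echo "dashboard refreshed at $ts"
```

The same `gh issue view` call accepts `--json updatedAt` if the issue's `updated_at` value should be checked as well.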