Files
everything-claude-code/skills/continuous-learning-v2/agents/observer-loop.sh
ispaydeu c52a28ace9 fix(observe): 5-layer automated session guard to prevent self-loop observations (#399)
* fix(observe): add 5-layer automated session guard to prevent self-loop observations

observe.sh currently fires for ALL hook events including automated/programmatic
sessions: the ECC observer's own Haiku analysis runs, claude-mem observer
sessions, CI pipelines, and any other tool that spawns `claude --print`.

This causes an infinite feedback loop where automated sessions generate
observations that trigger more automated analysis, burning Haiku tokens with
no human activity.

Add a 5-layer guard block after the `disabled` check:

Layer 1: agent_id payload field — only present in subagent hooks; skip any
         subagent-scoped session (always automated by definition).

Layer 2: CLAUDE_CODE_ENTRYPOINT env var — Claude Code sets this to sdk-ts,
         sdk-py, sdk-cli, mcp, or remote for programmatic/SDK invocations.
         Skip if any non-cli entrypoint is detected. This is universal: catches
         any tool using the Anthropic SDK without requiring tool cooperation.

Layer 3: ECC_HOOK_PROFILE=minimal — existing ECC mechanism; respect it here
         to suppress non-essential hooks in observer contexts.

Layer 4: ECC_SKIP_OBSERVE=1 — cooperative env var any external tool can set
         before spawning automated sessions (explicit opt-out contract).

Layer 5: CWD path exclusions — skip sessions whose working directory matches
         known observer-session path patterns. Configurable via
         ECC_OBSERVE_SKIP_PATHS (comma-separated substrings, default:
         "observer-sessions,.claude-mem").

Also fix observer-loop.sh to set ECC_SKIP_OBSERVE=1 and ECC_HOOK_PROFILE=minimal
before spawning the Haiku analysis subprocess, making the observer loop
self-aware and closing the ECC→ECC self-observation loop without needing
external coordination.

Fixes: observe.sh fires unconditionally on automated sessions (#398)

* fix(observe): address review feedback — reorder guards cheapest-first, fix empty pattern bug

Two issues flagged by Copilot and CodeRabbit in PR #399:

1. Layer ordering: the agent_id check spawns a Python subprocess but ran
   before the cheap env-var checks (CLAUDE_CODE_ENTRYPOINT, ECC_HOOK_PROFILE,
   ECC_SKIP_OBSERVE). Reorder to put all env-var checks first (Layers 1-3),
   then the subprocess-requiring agent_id check (Layer 4). Automated sessions
   that set env vars — the common case — now exit without spawning Python.

2. Empty pattern bug in Layer 5: if ECC_OBSERVE_SKIP_PATHS contains a trailing
   comma or spaces after commas (e.g. "path1, path2" or "path1,"), _pattern
   becomes empty or whitespace-only, and the glob *""* matches every CWD,
   silently disabling all observations. Fix: trim leading/trailing whitespace
   from each pattern and skip empty patterns with `continue`.

* fix: fail closed for non-cli entrypoints

---------

Co-authored-by: Affaan Mustafa <affaan@dcube.ai>
2026-03-12 23:40:03 -07:00

146 lines
4.2 KiB
Bash
Executable File

#!/usr/bin/env bash
# Continuous Learning v2 - Observer background loop
set +e
unset CLAUDECODE
SLEEP_PID=""
USR1_FIRED=0
cleanup() {
[ -n "$SLEEP_PID" ] && kill "$SLEEP_PID" 2>/dev/null
if [ -f "$PID_FILE" ] && [ "$(cat "$PID_FILE" 2>/dev/null)" = "$$" ]; then
rm -f "$PID_FILE"
fi
exit 0
}
trap cleanup TERM INT
analyze_observations() {
if [ ! -f "$OBSERVATIONS_FILE" ]; then
return
fi
obs_count=$(wc -l < "$OBSERVATIONS_FILE" 2>/dev/null || echo 0)
if [ "$obs_count" -lt "$MIN_OBSERVATIONS" ]; then
return
fi
echo "[$(date)] Analyzing $obs_count observations for project ${PROJECT_NAME}..." >> "$LOG_FILE"
if [ "${CLV2_IS_WINDOWS:-false}" = "true" ] && [ "${ECC_OBSERVER_ALLOW_WINDOWS:-false}" != "true" ]; then
echo "[$(date)] Skipping claude analysis on Windows due to known non-interactive hang issue (#295). Set ECC_OBSERVER_ALLOW_WINDOWS=true to override." >> "$LOG_FILE"
return
fi
if ! command -v claude >/dev/null 2>&1; then
echo "[$(date)] claude CLI not found, skipping analysis" >> "$LOG_FILE"
return
fi
prompt_file="$(mktemp "${TMPDIR:-/tmp}/ecc-observer-prompt.XXXXXX")"
cat > "$prompt_file" <<PROMPT
Read ${OBSERVATIONS_FILE} and identify patterns for the project ${PROJECT_NAME} (user corrections, error resolutions, repeated workflows, tool preferences).
If you find 3+ occurrences of the same pattern, create an instinct file in ${INSTINCTS_DIR}/<id>.md.
CRITICAL: Every instinct file MUST use this exact format:
---
id: kebab-case-name
trigger: when <specific condition>
confidence: <0.3-0.85 based on frequency: 3-5 times=0.5, 6-10=0.7, 11+=0.85>
domain: <one of: code-style, testing, git, debugging, workflow, file-patterns>
source: session-observation
scope: project
project_id: ${PROJECT_ID}
project_name: ${PROJECT_NAME}
---
# Title
## Action
<what to do, one clear sentence>
## Evidence
- Observed N times in session <id>
- Pattern: <description>
- Last observed: <date>
Rules:
- Be conservative, only clear patterns with 3+ observations
- Use narrow, specific triggers
- Never include actual code snippets, only describe patterns
- If a similar instinct already exists in ${INSTINCTS_DIR}/, update it instead of creating a duplicate
- The YAML frontmatter (between --- markers) with id field is MANDATORY
- If a pattern seems universal (not project-specific), set scope to global instead of project
- Examples of global patterns: always validate user input, prefer explicit error handling
- Examples of project patterns: use React functional components, follow Django REST framework conventions
PROMPT
timeout_seconds="${ECC_OBSERVER_TIMEOUT_SECONDS:-120}"
max_turns="${ECC_OBSERVER_MAX_TURNS:-10}"
exit_code=0
case "$max_turns" in
''|*[!0-9]*)
max_turns=10
;;
esac
if [ "$max_turns" -lt 4 ]; then
max_turns=10
fi
# Prevent observe.sh from recording this automated Haiku session as observations
ECC_SKIP_OBSERVE=1 ECC_HOOK_PROFILE=minimal claude --model haiku --max-turns "$max_turns" --print < "$prompt_file" >> "$LOG_FILE" 2>&1 &
claude_pid=$!
(
sleep "$timeout_seconds"
if kill -0 "$claude_pid" 2>/dev/null; then
echo "[$(date)] Claude analysis timed out after ${timeout_seconds}s; terminating process" >> "$LOG_FILE"
kill "$claude_pid" 2>/dev/null || true
fi
) &
watchdog_pid=$!
wait "$claude_pid"
exit_code=$?
kill "$watchdog_pid" 2>/dev/null || true
rm -f "$prompt_file"
if [ "$exit_code" -ne 0 ]; then
echo "[$(date)] Claude analysis failed (exit $exit_code)" >> "$LOG_FILE"
fi
if [ -f "$OBSERVATIONS_FILE" ]; then
archive_dir="${PROJECT_DIR}/observations.archive"
mkdir -p "$archive_dir"
mv "$OBSERVATIONS_FILE" "$archive_dir/processed-$(date +%Y%m%d-%H%M%S)-$$.jsonl" 2>/dev/null || true
fi
}
on_usr1() {
[ -n "$SLEEP_PID" ] && kill "$SLEEP_PID" 2>/dev/null
SLEEP_PID=""
USR1_FIRED=1
analyze_observations
}
trap on_usr1 USR1
echo "$$" > "$PID_FILE"
echo "[$(date)] Observer started for ${PROJECT_NAME} (PID: $$)" >> "$LOG_FILE"
while true; do
sleep "$OBSERVER_INTERVAL_SECONDS" &
SLEEP_PID=$!
wait "$SLEEP_PID" 2>/dev/null
SLEEP_PID=""
if [ "$USR1_FIRED" -eq 1 ]; then
USR1_FIRED=0
else
analyze_observations
fi
done