How-to: back up and recover the framework state

The _framework/ directory contains all the durable state for an Organismo deployment:

| Subdirectory | Contents | Recoverable? |
|---|---|---|
| audit/chain.jsonl | HMAC-signed audit log (Rule 24) | ✅ source of truth |
| events.jsonl | EventBus stream | derivable from chain |
| capsules.jsonl | Capsule state log | derivable from chain |
| memory/episodic/ | MirrorLearner feed | losable (will re-fill from new runs) |
| memory/procedural/ | Skill execution patterns | losable |
| observability/agent_metrics.jsonl | Per-invocation telemetry | losable |
| checkpoints/ | Phase 4 + 6 snapshots | losable |
| continuations/ | Paused capsule state | unrecoverable if lost mid-pause |
| federation/ | FS transport buffer (if used) | transient |

Daily backup

Run the backup script via cron:

# /etc/cron.d/organismo-backup
# cron does not support backslash line continuation, so the entry must stay on one line.
0 2 * * * organismo /opt/organismo/.venv/bin/python /opt/organismo/scripts/backup_framework.py --source /var/lib/organismo --output /var/backups/organismo --gzip --retention-days 30 >> /var/log/organismo-backup.log 2>&1

Verify that a backup completed (the newest archives are listed last):

ls -lh /var/backups/organismo/ | tail -5
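
The listing above still has to be read by a person. As a minimal sketch of an automated freshness check, the following assumes that archives follow the `framework-backup-*.tar.gz` naming seen in the restore example and that a backup older than 26 hours means the nightly job failed. The function names and the 26-hour threshold are illustrative, not part of Organismo.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup(backup_dir: str,
                  pattern: str = "framework-backup-*.tar.gz") -> Optional[Path]:
    """Return the most recently modified archive matching pattern, or None."""
    archives = list(Path(backup_dir).glob(pattern))
    return max(archives, key=lambda p: p.stat().st_mtime, default=None)


def backup_is_fresh(backup_dir: str, max_age_hours: float = 26.0) -> bool:
    """True if the newest archive exists, is non-empty, and is recent enough.

    The 26-hour default is an assumption: a daily job plus some slack.
    """
    latest = newest_backup(backup_dir)
    if latest is None or latest.stat().st_size == 0:
        return False
    age_hours = (time.time() - latest.stat().st_mtime) / 3600
    return age_hours <= max_age_hours
```

Calling `backup_is_fresh("/var/backups/organismo")` from a monitoring job and alerting on `False` would catch a silently failing cron entry.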

Restore from backup

# Stop the service
sudo systemctl stop organismo-dashboard

# Move the existing dir aside (don't delete — investigate first)
sudo mv /var/lib/organismo /var/lib/organismo.broken.$(date +%s)

# Extract the backup
sudo tar -xzf /var/backups/organismo/framework-backup-20260526T020000Z.tar.gz \
  -C /var/lib/
sudo mv /var/lib/_framework /var/lib/organismo
sudo chown -R organismo:organismo /var/lib/organismo

# Verify chain integrity BEFORE starting
sudo -u organismo /opt/organismo/.venv/bin/python \
  /opt/organismo/scripts/recover_chain.py \
  --framework-dir /var/lib/organismo --verify

# Start the service
sudo systemctl start organismo-dashboard
sudo systemctl status organismo-dashboard
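
Before extracting, it can help to confirm that the archive has the layout the steps above rely on. The sketch below assumes, based on the `mv /var/lib/_framework` step, that every member sits under a single top-level `_framework/` directory and that `audit/chain.jsonl` is present inside it. `archive_problems` is a hypothetical helper, not a shipped script.

```python
import tarfile
from typing import List


def archive_problems(archive_path: str,
                     expected_root: str = "_framework") -> List[str]:
    """Return a list of layout problems; an empty list means none were found."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
    if not names:
        return ["archive is empty"]

    problems = []
    normalized = set()
    for name in names:
        # Drop empty and "." components so "./_framework/x" is treated like "_framework/x".
        parts = [p for p in name.split("/") if p not in ("", ".")]
        if name.startswith("/") or ".." in parts:
            problems.append(f"unsafe member path: {name}")
        elif parts and parts[0] != expected_root:
            problems.append(f"unexpected top-level entry: {name}")
        normalized.add("/".join(parts))

    # Assumed location of the audit chain inside the archive.
    chain = f"{expected_root}/audit/chain.jsonl"
    if chain not in normalized:
        problems.append(f"missing {chain}")
    return problems
```

An empty result only says the layout looks right. Chain integrity still has to be checked with `recover_chain.py --verify` after extraction.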

Recover from corruption

If events.jsonl or capsules.jsonl was truncated or lost but audit/chain.jsonl is intact, rebuild operational state from the audit chain:

# Dry-run first — show what would be reconstructed
python scripts/recover_chain.py \
  --framework-dir /var/lib/organismo \
  --rebuild capsules \
  --dry-run

# Apply
python scripts/recover_chain.py \
  --framework-dir /var/lib/organismo \
  --rebuild capsules

The audit chain itself is HMAC-chained, so a failed verify tells you exactly which line was tampered with:

python scripts/recover_chain.py \
  --framework-dir /var/lib/organismo --verify
# → [FAIL] chain has 2 integrity error(s):
#   - line 47: hmac mismatch (tamper)
#   - line 48: prev_hmac mismatch

For tamper recovery, restore the chain from the most recent verified backup or from S3 Object Lock if enabled.
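
To illustrate why an HMAC chain can point at a specific line, here is a minimal, self-contained sketch of the general technique. It is not the Organismo implementation: the entry layout (`payload`, `prev_hmac`, `hmac`), the JSON canonicalization, SHA-256, and the empty genesis value are all assumptions chosen for the example.

```python
import hashlib
import hmac
import json
from typing import Dict, List


def sign_entry(key: bytes, payload: Dict, prev_hmac: str) -> Dict:
    """Build an entry whose HMAC covers the payload and the previous entry's HMAC."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hmac.new(key, (prev_hmac + body).encode("utf-8"),
                      hashlib.sha256).hexdigest()
    return {"payload": payload, "prev_hmac": prev_hmac, "hmac": digest}


def verify_chain(key: bytes, lines: List[str]) -> List[str]:
    """Return integrity errors as 'line N: reason' strings (1-indexed)."""
    errors = []
    prev = ""  # hypothetical genesis value for the first entry
    for lineno, raw in enumerate(lines, start=1):
        entry = json.loads(raw)
        if entry["prev_hmac"] != prev:
            errors.append(f"line {lineno}: prev_hmac mismatch")
        expected = sign_entry(key, entry["payload"], entry["prev_hmac"])["hmac"]
        if not hmac.compare_digest(expected, entry["hmac"]):
            errors.append(f"line {lineno}: hmac mismatch")
        # Carry the recomputed HMAC forward, so an edited line also breaks
        # the link that the following line holds to it.
        prev = expected
    return errors
```

In this sketch, editing the payload of one line produces an `hmac mismatch` on that line and a `prev_hmac mismatch` on the next one, which is the same pattern as the two-error report shown above.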

Backup to S3 (production)

If ObjectLockImmutability is enabled, EACH audit-chain append is uploaded to S3 with retention. The most recent S3 object IS your backup:

aws s3 ls s3://my-audit-bucket/audit/ | tail -5
# Restore the latest:
aws s3 cp s3://my-audit-bucket/audit/1716678400000.jsonl \
  /var/lib/organismo/audit/chain.jsonl

After restoring the chain, rebuild derived state with recover_chain.py --rebuild.
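
Picking "the latest" object with `tail` relies on lexicographic key order, which matches chronological order only while all timestamps have the same number of digits. The sketch below selects the newest key numerically instead. It assumes, based on the example key above, that objects are named `audit/<epoch-milliseconds>.jsonl`; `latest_audit_key` is a hypothetical helper that works on a list of keys you have already fetched.

```python
from typing import Iterable, Optional


def latest_audit_key(keys: Iterable[str],
                     prefix: str = "audit/",
                     suffix: str = ".jsonl") -> Optional[str]:
    """Return the key with the largest numeric timestamp, or None if none match."""
    best_key, best_ts = None, -1
    for key in keys:
        if not (key.startswith(prefix) and key.endswith(suffix)):
            continue
        stem = key[len(prefix):-len(suffix)]
        if not stem.isdecimal():
            continue  # skip objects that are not named by a timestamp
        ts = int(stem)
        if ts > best_ts:
            best_key, best_ts = key, ts
    return best_key
```

For example, given `["audit/999.jsonl", "audit/1716678400000.jsonl"]` this returns `audit/1716678400000.jsonl`, whereas a plain string sort would rank `audit/999.jsonl` last.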

What CANNOT be recovered

  • Hormone levels — transient in-memory state. Reset to baseline 0.2 on restart.
  • In-progress capsules that hadn't reached complete_capsule() — their bridge handle is lost; they appear "stale active" until manually cancelled.
  • Live WebSocket connections — clients reconnect automatically (frontend banner).

See also