Every release, every change. Pull from GitHub and update anytime via python3 migrate.py.
Two silent correctness bugs fixed. The synthesis sync job was running a full knowledge-table sync up to 30 times per agent cycle — once per topic, without scoping to that topic. It should run once. Separately, checkpoint_hours_ago could produce negative values on systems where SQLite stores UTC and the local clock is offset, making health reports untrustworthy. Both are fixed with tests. The release also adds four new operational fields to stats.py --json so you can see skill version, open proposal count, last background job, and pending synthesis topics in a single call.
synthesis.sync job looped over up to 30 topics and called synthesize.py sync --quiet once per topic — without passing --topic, so each call synced the entire knowledge table. Fix: call synthesize.py sync exactly once per agent run. The command is idempotent and already handles all topics internally.
checkpoint_hours_ago timezone fragility. Previous code stripped the timezone offset from a stored timestamp then subtracted it from datetime.now() (local time). On any system where SQLite stores UTC and the local clock is offset, the result was wrong — potentially negative. Fix: use datetime.utcnow() consistently on both sides, treating stored timestamps as naive UTC.
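A minimal sketch of the UTC-consistent calculation (function name and timestamp format assumed; the real stats.py may differ, and this uses datetime.now(timezone.utc) stripped to naive as the non-deprecated equivalent of datetime.utcnow()):

```python
from datetime import datetime, timezone

def checkpoint_hours_ago(stored_ts: str) -> float:
    # Stored timestamps are treated as naive UTC, so both sides of the
    # subtraction use the same clock; the result can never go negative
    # merely because the local timezone is offset from UTC.
    then = datetime.strptime(stored_ts, "%Y-%m-%d %H:%M:%S")
    now = datetime.now(timezone.utc).replace(tzinfo=None)
    return (now - then).total_seconds() / 3600.0
```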
--json output. skill_version — read live from setup.py, catches version drift immediately. open_proposals — count of pending review items. last_background_job — most recent agent run result. pending_synthesis_topics — knowledge topics with no synthesis yet; null if the synthesis engine hasn't been initialized. Human-readable header also now shows Skill: vX.Y.Z.
test_synthesis_sync.py verifies the sync job calls subprocess exactly once and never passes --topic. test_stats.py covers UTC correctness for checkpoint freshness and all four new JSON fields including schema-guarded pending_synthesis_topics.
SKILL_VERSION corrected to "2.14.0". The version string was left at "2.10.0" across the v2.11.0–v2.13.0 releases. setup.py is now the declared canonical version source — stats.py --json reads from it directly so version drift is immediately visible.
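Reading the version live rather than caching a copy is what makes drift visible. A sketch, assuming setup.py declares a `SKILL_VERSION = "X.Y.Z"` string (the actual variable name and layout may differ):

```python
import re
from pathlib import Path
from typing import Optional

def read_skill_version(setup_path: str = "setup.py") -> Optional[str]:
    # Parse the version out of setup.py on every call, so stats.py can
    # never report a stale number if the declared version changes.
    text = Path(setup_path).read_text(encoding="utf-8")
    match = re.search(r'SKILL_VERSION\s*=\s*["\']([^"\']+)["\']', text)
    return match.group(1) if match else None
```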
dashboard.py already ships at http://localhost:5000, separate from the Command Center on port 8080.
v2.12.0 built the full retrieval learning stack — pathway signatures, outcome tracking, score updates, the works. But the pipeline was silently disabled on every installation. The flag that turned it on existed; it just never got set. This release closes that gap: the learning pipeline activates automatically on every fresh install and upgrade, no manual configuration required. Memory that was designed to improve itself now actually does.
lastrowid in record_job(). With INSERT OR IGNORE, SQLite can leave cursor.lastrowid set to a value from a previous successful insert even when the current statement was skipped. This caused every context.build.projects run to fail with FOREIGN KEY constraint failed — the job ID being written to context_packs.created_by_job_id didn't exist. Fix: use SELECT changes() to detect whether the insert actually wrote a row; if not, query the existing row by idempotency_key to get a valid ID.
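The SELECT changes() pattern can be sketched like this (table and column names simplified from the real schema):

```python
import sqlite3

def record_job(conn: sqlite3.Connection, idempotency_key: str) -> int:
    cur = conn.execute(
        "INSERT OR IGNORE INTO background_jobs (idempotency_key) VALUES (?)",
        (idempotency_key,),
    )
    # lastrowid is unreliable here: when INSERT OR IGNORE skips the row,
    # it can still hold the rowid of an earlier successful insert.
    if conn.execute("SELECT changes()").fetchone()[0] == 1:
        return cur.lastrowid
    # The insert was a no-op; look up the existing row by its key instead.
    return conn.execute(
        "SELECT id FROM background_jobs WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()[0]
```

Calling it twice with the same key returns the same valid ID, so foreign keys written from the result always resolve.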
get_config_bool() — reads a boolean key from memory_config with a safe fallback. Allows runtime flags to be stored in the database and toggled without editing cron jobs or source code.
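A sketch of the helper, assuming memory_config is a simple key/value table (the actual schema and accepted truthy values may differ):

```python
import sqlite3

def get_config_bool(conn: sqlite3.Connection, key: str, default: bool = False) -> bool:
    # A missing key, missing table, or unreadable value falls back to the
    # supplied default, so a half-migrated database never crashes the caller.
    try:
        row = conn.execute(
            "SELECT value FROM memory_config WHERE key = ?", (key,)
        ).fetchone()
    except sqlite3.Error:
        return default
    if row is None:
        return default
    return str(row[0]).strip().lower() in ("1", "true", "yes", "on")
```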
--disable-inference-jobs flag — mirror of --enable-inference-jobs. Lets a single cron invocation force-disable the learning pipeline without touching memory_config, useful for debugging or maintenance windows.
Writes memory.learning.enabled = true into memory_config on install or upgrade. This is the default-on switch. Running python3 migrate.py on any existing install activates the learning pipeline immediately.
extract, canonicalize, and conflicts jobs previously required the --enable-inference-jobs flag. Now they run by default, controlled by memory_config.memory.learning.enabled. CLI flags can still override per-run.
--enable-inference-jobs as a belt-and-suspenders guarantee that learning runs even on installs where the DB migration hasn't been applied yet.
Replaced datetime.utcnow() (deprecated, and breaking in Python 3.14) with datetime.now(timezone.utc) throughout all four files. Timestamps remain in the same ISO format; no data migration needed.
Autoglia now moves from passive persistence to measurable retrieval learning. Memory selection is logged, ranked, evaluated against outcomes, and continuously improved with deterministic rules. This release also ships the benchmark scaffold used to prove retrieval quality gains rather than relying on subjective claims.
MEMORY_INTELLIGENCE_SPEC.md freezes canonical names, scoring semantics, relation types, and benchmark contract before rollout.
OpenClaw compatibility update. Session paths updated from ~/.clawdbot/sessions/ to ~/.openclaw/agents/main/sessions/. Cron commands updated from clawdbot cron to openclaw cron. The skill now auto-detects whether OpenClaw or Clawdbot is installed and uses the correct CLI and state directory. Backward-compatible with existing Clawdbot installs.
~/.openclaw, falls back to ~/.clawdbot. Supports OPENCLAW_DIR env var.
openclaw cron when OpenClaw is installed, clawdbot cron otherwise. Manifest path follows the same auto-detect logic.
Every previous version was foreground-only: the agent recorded things when it remembered to, and the cron jobs cleaned up what it missed. But the database itself was passive — it never improved on its own. v2.10.0 adds a background maintenance layer that runs while you're away: normalizing entity names, building precomputed context snapshots, detecting stale loops, scoring health, and staging extraction proposals into a review queue. Non-deterministic extractions don't go directly to canonical tables — they land in memory_proposals for you to accept or reject. The system is now self-improving, not just self-recording.
--dry-run, --jobs, and --scope flags.
DecisionProposal, EntityProposal, OpenLoopProposal dataclasses and an Extractor protocol. HeuristicExtractor: regex + keyword baseline, zero API calls. ModelExtractor: feature-flagged, works with any OpenAI-compatible endpoint — enable via AUTOGLIA_MODEL_EXTRACT=1 plus AUTOGLIA_MODEL, AUTOGLIA_API_BASE, AUTOGLIA_API_KEY. Same typed output either way.
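The shape of that abstraction, sketched with simplified fields and keywords (the real dataclasses and heuristics carry more):

```python
from dataclasses import dataclass
from typing import List, Protocol

@dataclass
class DecisionProposal:
    text: str
    confidence: float

class Extractor(Protocol):
    # Any extractor, heuristic or model-backed, returns the same typed output.
    def extract(self, segment: str) -> List[DecisionProposal]: ...

class HeuristicExtractor:
    """Keyword baseline: zero API calls."""
    KEYWORDS = ("decided", "we will", "agreed")

    def extract(self, segment: str) -> List[DecisionProposal]:
        return [
            DecisionProposal(line.strip(), 0.5)
            for line in segment.splitlines()
            if any(k in line.lower() for k in self.KEYWORDS)
        ]
```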
background_jobs — job queue with cross-run idempotency. Each entry records job_type, idempotency_key, run_id, status, result_json, and timestamps. Stable work-item keys (e.g. extract.all:session:{sk}:window:{ts}) prevent re-processing across cron runs.
memory_proposals — non-deterministic extraction staging table. Decisions, entities, and open loops land here first. Status: pending → accepted/rejected → promoted. Pending proposals surfaced in recover.py session briefing. Accepted proposals promoted to canonical tables by the next background run.
context_packs — precomputed context snapshots. Two types: startup (identity, active projects, open tasks, recent contacts) and project (per-project summary). Built by the background agent, freshness-validated on read, invalidated when underlying data changes.
entity_aliases — alias registry for entity normalization. Maps variant forms to a canonical name. Populated by the background agent via case-folding, whitespace normalization, and punctuation cleanup. High-value entity aliases can be pinned explicitly.
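The deterministic normalization pass might look like this (illustrative only; the real cleanup rules may be broader):

```python
import string

def canonicalize_entity(name: str) -> str:
    # Collapse internal whitespace, case-fold, and trim stray punctuation.
    # Deliberately deterministic, so the same variant always maps the
    # same way across background runs.
    collapsed = " ".join(name.split())
    return collapsed.casefold().strip(string.punctuation + " ")
```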
proposals subcommand: proposals list, proposals accept <id>, proposals reject <id>. Scope-filtered if MEMORY_SCOPE env var is set.
health_metrics() function — 5 bounded metrics: duplicate entity risk, stale open tasks, unsummarized session backlog, orphan task rows, and checkpoint freshness. Included in both JSON and human-readable full_report() output.
test_extractors.py: 16 tests with labeled segment fixtures and recall thresholds — quality gate for heuristic extraction. test_promotion.py: 13 tests covering accepted proposal → canonical write, duplicate prevention, bad payload handling, and promotion idempotency.
content_hash (SHA-256, 16-char prefix). Unique index changed from (session_key, timestamp, role) to (session_key, role, content_hash, timestamp) — eliminates timestamp collisions while allowing legitimate repeated messages at different times.
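The hash itself is cheap to compute; a sketch matching the description above:

```python
import hashlib

def content_hash(content: str) -> str:
    # 16-char prefix of the SHA-256 hex digest: short enough to index
    # cheaply, long enough that accidental collisions are negligible.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
```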
One checkpoint.py call per segment, immediately after segmentation, rather than one merged summary. Topic and summary are scoped to the correct segment. Recovery is now precise.
MEMORY_SCOPE env var.
Adds content_hash to conversation_transcripts, backfills existing rows, rebuilds unique index. Creates background_jobs, context_packs, entity_aliases, memory_proposals. v10: adds scope TEXT DEFAULT 'default' to knowledge, conversation_log, contacts, tasks, projects, memory_proposals, context_packs. Schema now at version 10.
PRAGMA busy_timeout=5000 applied in every get_conn() call across all scripts. WAL journal mode enforced. Background agent checks for foreground activity before running write-heavy stages.
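A sketch of a get_conn() with both settings applied (the real helper may set more pragmas):

```python
import sqlite3

def get_conn(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # Wait up to 5 s for a competing writer instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000")
    # WAL lets foreground readers proceed while a background job writes.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```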
The reliability gap this closes: even with the session startup protocol in place, a bot waking up in a new conversation had no proactive context. It knew its personality but nothing about who it was talking to, what projects were active, or what had been discussed recently — unless the user explicitly asked. The always-on briefing means the bot arrives oriented, not blank. No new commands, no new scripts, no new instructions to follow or forget. The existing startup path just does more.
get_user_identity() pulls name, timezone, and preferences from user_preferences (or preferences as fallback), active ventures from businesses, and recent contacts from contacts. The bot now knows who it's talking to at session start without being told.
$(eval echo ~"$(whoami)"). Works correctly for any user on any system.
Before v2.9.0, an entire session's history lived under one session key forever. If you talked Monday then again Thursday, auto-checkpoint merged both into one muddled summary. Segmentation fixes this — each gap becomes a boundary, each conversation gets its own checkpoint, topic, and analytics row. Recovery is now scoped to the right conversation. The topic index and watchwords make the whole system queryable in ways that weren't possible before. And for the first time, you can ask your bot to back up and email your entire database in one sentence.
--gap-minutes.
conversation_topics — normalized topic registry. Every unique topic is stored once, deduplicated case-insensitively, with a use_count so frequently-discussed subjects surface naturally.
conversation_topic_links — many-to-many join between conversation_log entries and conversation_topics. One checkpoint can now be tagged with multiple topics. Query "all sessions that discussed X" instead of guessing with substring search.
watchwords — keyword trigger registry. Store phrases with associated actions (checkpoint, tag_topic, flag). Check a message for hits via memdb.py check-watchwords — returns matches and increments match_count. Implemented at the Python layer where natural language processing actually belongs.
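The matching logic is simple case-insensitive substring search; a sketch (the real command also increments match_count in the database):

```python
def check_watchwords(message, watchwords):
    # watchwords: iterable of (phrase, action) pairs from the registry.
    # Returns every pair whose phrase appears in the message.
    text = message.casefold()
    return [(phrase, action) for phrase, action in watchwords
            if phrase.casefold() in text]
```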
session_stats — per-session behavioral analytics written automatically after each checkpoint. Tracks message_count, user_messages, bot_messages, duration_minutes, avg_gap_minutes, and topics_json. Zero LLM involvement.
topics — list all conversation topics ordered by use count. watchwords [--all] — list active or all watchwords. check-watchwords "<text>" — scan a message for hits. backup [dest.db] — binary database backup. export [dest.json] — JSON export.
--backup dest.db uses SQLite's online backup API: safe while the database is open, lossless, complete. --restore src.db [--force] restores from a binary backup. --email-to address emails the backup file via mutt, sendmail, or smtplib (configurable via env vars). Users can ask their bot: "back up my database and email it to me."
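Python's stdlib exposes the online backup API directly; a minimal sketch of the safe copy:

```python
import sqlite3

def backup_database(src_path: str, dest_path: str) -> None:
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    # Connection.backup wraps SQLite's online backup API: it copies a
    # consistent snapshot even while other connections hold the source open.
    with dest:
        src.backup(dest)
    dest.close()
    src.close()
```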
--overwrite flag for JSON restore — clears all tables in reverse FK order, then reinserts using INSERT OR REPLACE in dependency order. Tables now inserted in a defined FK-safe sequence (parents before children). Default behavior unchanged: merge via INSERT OR IGNORE.
memdb.py, recover.py, checkpoint.py, auto-checkpoint.py, and sync-sessions.py. Each test uses a fresh temporary DB via shared fixture. No personal data, no side effects on the real database.
run_checkpoint() now returns (success, log_id) so callers can link topics to the new conversation_log row immediately after creation.
derive_summary_segmented() replaces derive_summary(). Multiple segments produce a summary that lists each segment's time range, message count, and topic on a separate line.
CORE_TABLES audit check. Fixed pre-existing NameError in cmd_check(). Added stats.py to required-file list.
python3 migrate.py to upgrade — existing data is untouched.
v2.7.0 ensured raw transcripts are always recorded. But conversation_log checkpoints — the structured summaries recover.py uses — still depended on the bot voluntarily running checkpoint.py at the right moment. Under load or after context loss, it simply didn't. auto-checkpoint.py + cron closes this loop: every conversation gets checkpointed on schedule, whether the bot cooperated or not.
checkpoint.py atomically. Flags: --threshold N, --all, --dry-run, --session, --quiet, --no-sync.
A cron job (installed via openclaw cron add) that runs auto-checkpoint.py --quiet automatically. Job ID stored in the manifest for clean uninstall.
The previous approach relied on the LLM voluntarily saving transcripts on every message. Under load, mid-task, or in fast-moving conversations, the LLM skips it — leaving conversation_transcripts empty and recover.py with nothing to reconstruct from. sync-sessions.py solves this at the infrastructure level. Context loss from compaction is now fully recoverable.
conversation_transcripts. Automatic dedup via INSERT OR IGNORE. Flags: --status, --dry-run, --since 24h/7d, --agent NAME. Graceful if OpenClaw is not installed.
Creates conversation_transcripts, then adds a UNIQUE INDEX on (session_key, timestamp, role) to enable safe INSERT OR IGNORE upserts.
http://localhost:5000. Browse contacts, knowledge, projects, and conversations without writing SQL. Live updates via Server-Sent Events every 3 seconds. Tabs: Overview, Contacts, Knowledge, Projects, Conversations, Schema. --open flag auto-opens browser. Requires pip install flask.
_schema_log — permanent audit log for every user-created extension table. Stores action, object name, reason, session key, and timestamp.
create-table — the only correct way to create extension tables. Atomically creates the table, logs it to _schema_log, and bumps user_schema_rev in _meta. Replaces raw exec CREATE TABLE.
schema-log [limit] — view the full extension table audit trail.
conversation_log entry AND marks transcripts as is_summarized = 1 in one transaction. Keeps recover.py's compaction detection accurate.
upsert command — insert-or-update by key column. Prevents duplicate contacts. Usage: memdb.py upsert contacts name '{"name": "...", "email": "..."}'.
migrate.py fixes: renamed project_progress → projects, ensured preferences and bot_skills exist on older installs.
stats.py: fixed incorrect timestamp column mappings on conversation_log, reminders, knowledge, meetings, and businesses.
stats.py: now warns when unsummarized transcripts exist and suggests running checkpoint.py.
--dry-run, --yes flags. Python version gate, idempotent injection, conflict detection, atomic file writes, automatic backup, and rollback manifest at ~/.openclaw/memory-db-install.json.
memory_files for idempotent re-runs, smart-classifies files (dated → daily_log, known files → knowledge), stores source_file path on every entry. --dry-run and --force flags. Files are never deleted or modified.
--session, --hours, --json, --transcript-limit.
businesses, personality, conversation_log, importance, memory_config, outreach, meetings, tasks.
export.py — export entire DB to JSON or CSV.
import.py — seed DB from existing markdown files.
stats.py — memory health report with row counts and recent activity.
migrate.py — versioned schema migration system.
EXAMPLES.md — real command examples with context.
contacts and projects (deleted_at column).
knowledge, ideas, daily_log, conversation_log.
memdb.py becomes the required interface — parameterized queries only. No SQL injection possible. Schema version tracking via _meta table.
contacts, knowledge, daily_log, projects, ideas, content, preferences, reminders.
memdb.py — safe Python query wrapper.
migrate.py — migration runner.
init.sql — schema creation.
Pull the latest from GitHub, then run python3 migrate.py. Do not run setup.py install on an existing installation unless you intend a full reinstall.