What the Engine owns

src/pmb/core/engine/base.py builds one Engine per workspace, composed from focused mixins. The constructor opens SQLite immediately but keeps BM25, LanceDB, embedding models, and graph caches lazy, so pmb stats and pmb config don’t pay the vector-store cold start.
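
The eager-SQLite, lazy-everything-else split can be illustrated with a minimal sketch. The class and attribute names here (LazyEngine, vector_store, loaded) are hypothetical and not taken from base.py; the dictionary stands in for an expensive LanceDB or embedding-model load.

```python
import sqlite3
from functools import cached_property


class LazyEngine:
    """Hypothetical stand-in for the real Engine; all names are illustrative."""

    def __init__(self, db_path: str = ":memory:") -> None:
        # SQLite is opened eagerly: it is cheap and every command needs it.
        self.db = sqlite3.connect(db_path)
        # Records which heavy components have actually been built.
        self.loaded = []

    @cached_property
    def vector_store(self) -> dict:
        # Placeholder for an expensive vector-store or model load.
        # cached_property runs this body once, on first access.
        self.loaded.append("vector_store")
        return {"ready": True}

    def stats(self) -> int:
        # Touches only SQLite, so the vector store is never constructed.
        row = self.db.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return row[0]


engine = LazyEngine()
engine.stats()
loaded_after_stats = list(engine.loaded)

engine.vector_store
engine.vector_store  # second access reuses the cached value
loaded_after_recall = list(engine.loaded)
```

In this sketch a stats-style call leaves loaded empty, and only the first vector_store access pays the load cost.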

Config

Workspace detection, layered settings, model choice, and feature gates.

Stores

SQLite for truth, LanceDB for vectors, BM25 for exact terms, graph tables for links.

Runtime

Session tracker, recall cache, write outbox, embed queue, touch buffer.

Surfaces

Write, batch, recall, lessons, overview, goals, dedup, health, ambient APIs.

Storage schema

SQLite is the source of truth (events.sqlite); LanceDB is the vector side index beside it.
| Store | Table / file | Purpose |
| --- | --- | --- |
| SQLite | events | Main log: ULID, type, content, metadata, importance, access counts, tier. |
| SQLite | event_edges | Event-to-event reasoning edges (cause/support/conflict/derived). |
| SQLite | lesson_surfaces | Which lessons were shown and whether they were followed. |
| SQLite | write_outbox | Durable queue for async record_batch writes (replayable). |
| SQLite | embed_queue_pending | Durable embedding queue with retry + dead-letter. |
| SQLite | error_log | Swallowed background errors, surfaced by pmb doctor. |
| SQLite | graph_entities / graph_edges | Canonical entities + weighted co-mention edges. |
| LanceDB | events | Vector rows: ulid, vector, text. |
| Cache | bm25_index.pkl | Cached BM25 token index, rebuilt when stale. |

Write path

Writes are durable first: the event lands in SQLite before any optional background work. Embeddings, graph extraction, and dedup may lag (they have retry paths and safe fallbacks), and recall can still find a fresh event via SQLite + BM25 while the embedder warms.
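
A rough sketch of the durable-first idea, assuming simplified column layouts for the events and embed_queue_pending tables (the real schema has more fields). The functions record and recall_without_vectors are hypothetical, and a plain LIKE query stands in for BM25.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (
        id TEXT PRIMARY KEY, type TEXT, content TEXT, metadata TEXT
    );
    CREATE TABLE embed_queue_pending (
        event_id TEXT PRIMARY KEY, attempts INTEGER DEFAULT 0
    );
    """
)


def record(event_id, type_, content, metadata=None):
    # One transaction: the event row and its embedding job commit together,
    # so a crash cannot leave an event without a pending embedding job.
    with conn:
        conn.execute(
            "INSERT INTO events (id, type, content, metadata) VALUES (?, ?, ?, ?)",
            (event_id, type_, content, json.dumps(metadata or {})),
        )
        conn.execute(
            "INSERT INTO embed_queue_pending (event_id) VALUES (?)",
            (event_id,),
        )


def recall_without_vectors(term):
    # Stand-in for the SQLite + BM25 fallback: no embedding is required.
    rows = conn.execute(
        "SELECT id FROM events WHERE content LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


record("01A", "note", "switch the workspace to WAL mode")
found = recall_without_vectors("WAL")
pending = conn.execute("SELECT count(*) FROM embed_queue_pending").fetchone()[0]
```

Here the event is already findable by text while its embedding job is still queued.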

Recall path

The hot path never calls an LLM. Optional LLM work happens at write time, maintenance time, or for explicit commands (consolidate, reflect, distill).

Concurrency & durability

SQLite WAL + busy-timeout

Dashboard, MCP, hooks, and CLI can touch one workspace at once.
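
As an illustration, WAL mode and a busy timeout can be enabled from Python's sqlite3 module as below. The helper name open_workspace_db and the 5000 ms value are assumptions for this sketch, not settings taken from the project.

```python
import os
import sqlite3
import tempfile


def open_workspace_db(path, busy_ms=5000):
    # timeout= is the sqlite3 module's own wait for locks, in seconds.
    conn = sqlite3.connect(path, timeout=busy_ms / 1000)
    # WAL lets readers proceed while a single writer appends to the log.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # busy_timeout makes a blocked connection wait instead of failing at once.
    conn.execute(f"PRAGMA busy_timeout = {int(busy_ms)}")
    return conn, mode


# WAL requires a file-backed database, so use a temporary file here.
db_path = os.path.join(tempfile.mkdtemp(), "events.sqlite")
conn, mode = open_workspace_db(db_path)
timeout_ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
```

WAL still allows only one writer at a time; the busy timeout is what lets a second writer wait briefly rather than error out.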

Single warm runtime

The daemon keeps one warm Engine/model/vector store instead of loading per agent.

Crash-safe queues

Async writes and pending embeddings have durable tables that replay after restart.
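
One way such a replayable queue could work is sketched below. The write_outbox column layout and the helpers open_outbox, enqueue, and replay are hypothetical. The sketch deletes a row only after it has been applied, which gives at-least-once delivery, so the apply step should be idempotent.

```python
import os
import sqlite3
import tempfile


def open_outbox(path):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS write_outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn, payload):
    # Committed before returning, so the job survives a process crash.
    with conn:
        conn.execute("INSERT INTO write_outbox (payload) VALUES (?)", (payload,))


def replay(conn, apply):
    # Drain in insertion order; remove a row only after it was applied.
    rows = conn.execute(
        "SELECT id, payload FROM write_outbox ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        apply(payload)
        with conn:
            conn.execute("DELETE FROM write_outbox WHERE id = ?", (row_id,))
    return len(rows)


path = os.path.join(tempfile.mkdtemp(), "events.sqlite")

first = open_outbox(path)
enqueue(first, "event-a")
enqueue(first, "event-b")
first.close()  # simulate the process exiting before the worker drained

second = open_outbox(path)  # "restart": reopen the same file
applied = []
replayed = replay(second, applied.append)
remaining = second.execute("SELECT count(*) FROM write_outbox").fetchone()[0]
```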

Buffered touches

Recall access-count updates are coalesced so parallel recalls don’t fight SQLite.
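
Coalescing can be sketched as an in-memory counter that is flushed in one transaction. TouchBuffer, its methods, and the access_count column name are assumptions made for illustration; the actual touch buffer may be structured differently.

```python
import sqlite3
import threading
from collections import Counter


class TouchBuffer:
    """Hypothetical buffer: count touches in memory, write them in one batch."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()
        self._pending = Counter()

    def touch(self, event_id):
        # No database work on the recall path, only a counter increment.
        with self._lock:
            self._pending[event_id] += 1

    def flush(self):
        # Swap the buffer out under the lock, then write outside it.
        with self._lock:
            batch, self._pending = self._pending, Counter()
        with self._conn:  # one transaction for the whole batch
            self._conn.executemany(
                "UPDATE events SET access_count = access_count + ? WHERE id = ?",
                [(count, event_id) for event_id, count in batch.items()],
            )
        return len(batch)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, access_count INTEGER DEFAULT 0)"
)
conn.executemany("INSERT INTO events (id) VALUES (?)", [("a",), ("b",)])

buf = TouchBuffer(conn)
for event_id in ["a", "a", "b", "a"]:
    buf.touch(event_id)

rows_written = buf.flush()
counts = dict(conn.execute("SELECT id, access_count FROM events"))
```

Four touches become two UPDATE statements in a single transaction, which is the contention reduction described above.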

Code map

| Concern | Source |
| --- | --- |
| Engine composition | src/pmb/core/engine/base.py |
| Write / recall / lessons / goals | src/pmb/core/engine/ |
| Event schema + lifecycle | src/pmb/core/events.py |
| BM25 / LanceDB backends | src/pmb/core/search.py |
| Entity graph | src/pmb/graph/store.py |
| MCP tools + server | src/pmb/mcp/ |