memledger

Audit-grade memory for multi-agent systems

A Python memory layer for AI agents. Framework-agnostic. Any backend. Deterministic or LLM-driven intelligence — your choice.

The problem

Agents without memory repeat every mistake.

Monday 3am

Payment service goes down. OOM kills. The agent investigates from scratch. 45 minutes to diagnose a connection pool leak.

Wednesday 2am

Same service. Same symptoms. The agent doesn't remember Monday. Same 45 minutes. Same fix.

Friday 4am

Different service, same root cause. The agent has no idea this is a pattern. Third investigation from scratch.


With memledger

The same week — with persistent, scored, evolving memory.

Monday — agent stores the fix

The agent resolves the incident and stores the fix as an episodic memory — a specific event with context, outcome, and timestamp.

Wednesday — recalls in 2 minutes

Same symptoms. The agent searches memory first. Finds Monday's fix with a score breakdown showing it's relevant and proven.

Friday — auto-promoted to a runbook

Third success. The episodic memory auto-promotes to a procedural record — a proven runbook. Nobody wrote it. Evidence created it.


What makes memledger different

Framework-Agnostic

Pure Python library. Integrate with LangGraph, LangChain, CrewAI, AutoGen, or raw async Python. Ships LangGraph tools and an MCP server out of the box.

Composable Backends

pgvector, OpenSearch, DynamoDB, SQLite — or compose them. The Composition Router splits storage by access pattern. Switch with one YAML line. Extend with register_backend().

Kubernetes-Native

Helm chart for production K8s deployment. Persistent long-term memory layer for any agentic workflow running on Kubernetes.

Pluggable Strategies

Importance, write gating, conflict resolution, reranking — each is a pluggable strategy. Deterministic for speed. LLM-driven for judgment. Mix per decision. One YAML line.

Lineage and Audit

Supersession chains, derived records, accessor history. Append-only audit log. CLI renders lineage as interactive Mermaid diagrams.

Cognitive Memory Types

Semantic (facts), episodic (events), procedural (how-to). Lifecycle states. Scoring blends similarity with success rate, recency, and importance.


Architecture

Intelligence lives in the core. Storage is pluggable underneath.

[Figure: memledger architecture diagram. It shows the agent layer; the memledger core with API, records, intelligence, and strategies; the backend layer with pgvector, OpenSearch, DynamoDB, and SQLite; and the composition router.]

Lifecycle states

Memories evolve. Old knowledge is deprecated, not deleted.

active, deprecated, expired, archived

$ engram demo --step 6
━━━ Supersession: knowledge evolves ━━━

  → New: [cb971859] maxPoolSize=250 (active)
  → Old: [93123577] maxPoolSize=100 → status: DEPRECATED

  ★ Knowledge evolved. Old value deprecated. History preserved.

Soft delete by default — nothing is permanently destroyed.
Batch APIs enable scheduled lifecycle management (CronJobs on K8s).
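
A minimal sketch of supersession with soft delete, assuming a hypothetical in-memory `Record` dataclass and `supersede` helper. Neither name is taken from memledger's API; only the four status values and the record ids come from this section.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    EXPIRED = "expired"
    ARCHIVED = "archived"


@dataclass
class Record:
    # Hypothetical record shape, for illustration only.
    id: str
    content: str
    status: Status = Status.ACTIVE
    supersedes: Optional[str] = None  # id of the record this one replaced


def supersede(store: Dict[str, Record], old_id: str, new: Record) -> Record:
    """Deprecate the old record and link the new one to it; nothing is removed."""
    store[old_id].status = Status.DEPRECATED
    new.supersedes = old_id
    store[new.id] = new
    return new


store = {"93123577": Record("93123577", "maxPoolSize=100")}
supersede(store, "93123577", Record("cb971859", "maxPoolSize=250"))
```

The old record stays in the store with a changed status, which is what keeps the history available for later lineage queries.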

Scoring and reranking

Retrieval ranked by relevance AND reliability. Two memories can match a query equally, but one has failed three times and the other is a proven fix.

$ engram demo --step 2
━━━ Search: "How to fix OOM kills?" ━━━

  #1 [proven fix]   score=0.85
      cosine=0.72 + success_rate=1.0 + recency=0.9 + importance=0.8
      ↑ 3 successes, 0 failures — this fix works every time

  #2 [untested fix] score=0.61
      cosine=0.75 + success_rate=0.5 + recency=0.3 + importance=0.5
      ↑ semantically close, but mixed outcomes

  ★ final_score = similarity × cosine + policy × (success + recency + frequency + importance)
    All weights configurable. Defaults calibrated for production workloads.
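
The blend in the formula can be sketched as a small function. This is an illustration under stated assumptions, not memledger's implementation: the `Signals` dataclass, the weight values (0.5 and 0.125), and the `frequency` values are invented for the example, so the results will not match the scores printed in the demo.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    cosine: float        # semantic similarity to the query
    success_rate: float  # share of recorded outcomes that succeeded
    recency: float
    frequency: float     # not shown in the demo breakdown; assumed here
    importance: float


def final_score(s: Signals, w_similarity: float = 0.5, w_policy: float = 0.125) -> float:
    """similarity x cosine + policy x (success + recency + frequency + importance).

    The weights are hypothetical; 0.125 keeps the four policy signals
    (each in [0, 1]) from contributing more than 0.5 in total.
    """
    policy_sum = s.success_rate + s.recency + s.frequency + s.importance
    return w_similarity * s.cosine + w_policy * policy_sum


proven = Signals(cosine=0.72, success_rate=1.0, recency=0.9, frequency=0.5, importance=0.8)
untested = Signals(cosine=0.75, success_rate=0.5, recency=0.3, frequency=0.5, importance=0.5)

# The proven fix outranks the untested one despite its lower cosine similarity.
ranked = sorted([("untested", untested), ("proven", proven)],
                key=lambda pair: final_score(pair[1]), reverse=True)
```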

Auto-promotion

Incidents become runbooks — automatically. A fix that works once is an anecdote. A fix that works three times is a procedure.

$ engram demo --step 1
  → Stored episodic: "OOM fix — set maxPoolSize=50"

$ engram demo --step 3
  → Outcome #1: success_count=1
  → Outcome #2: success_count=2
  → Outcome #3: success_count=3

  ★ AUTO-PROMOTION: Episodic → Procedural!
    New record: "Proven procedure (from 3 successful uses)"
    created_by: engram:auto_promote
    derived_from: [original incident]

    Threshold configurable (default: 3). No human wrote the runbook.
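
The promotion rule can be sketched in a few lines. The `Memory` dataclass and `record_success` function are hypothetical names for illustration; only the threshold of 3, the `created_by` value, and the `derived_from` link come from the demo output. Keeping the original episodic record unchanged is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PROMOTION_THRESHOLD = 3  # the default stated in the demo


@dataclass
class Memory:
    content: str
    kind: str = "episodic"
    success_count: int = 0
    created_by: str = "agent"
    derived_from: List[str] = field(default_factory=list)


def record_success(memory: Memory, threshold: int = PROMOTION_THRESHOLD) -> Optional[Memory]:
    """Count one success; return a new procedural record when the threshold is first reached."""
    memory.success_count += 1
    if memory.kind == "episodic" and memory.success_count == threshold:
        return Memory(
            content=f"Proven procedure (from {threshold} successful uses)",
            kind="procedural",
            created_by="engram:auto_promote",
            derived_from=[memory.content],
        )
    return None


incident = Memory("OOM fix: set maxPoolSize=50")
results = [record_success(incident) for _ in range(3)]
```

Comparing with `==` rather than `>=` means a fourth success does not create a second runbook.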

Conflict detection

Contradictions are caught and surfaced. When a new memory conflicts with an existing one, memledger warns — but does not block.

$ engram demo --step 5
  → Stored: "maxPoolSize should be 100"

  ★ CONFLICT DETECTED!
    Conflicting with: "maxPoolSize=250 is the standard"
    Similarity: 0.73

    Non-blocking: memory stored, conflict surfaced as a warning.
    A pluggable ConflictResolverStrategy determines the action.
    Threshold configurable per embedding model.
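
A rough sketch of non-blocking conflict detection, assuming it amounts to a cosine-similarity threshold over embeddings. Whether memledger applies further checks is not stated here. The function names, the 0.7 threshold, and the two-dimensional toy vectors are all invented for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def add_with_conflict_check(
    store: Dict[str, List[float]],
    text: str,
    embedding: Sequence[float],
    threshold: float = 0.7,  # hypothetical; the source says it is tuned per embedding model
) -> List[Tuple[str, float]]:
    """Always store the memory; return (existing_text, similarity) warnings."""
    warnings = []
    for existing, vec in store.items():
        similarity = cosine(embedding, vec)
        if similarity >= threshold:
            warnings.append((existing, round(similarity, 2)))
    store[text] = list(embedding)  # non-blocking: the write happens regardless
    return warnings


store = {
    "maxPoolSize=250 is the standard": [1.0, 0.0],
    "Rotate TLS certificates quarterly": [0.0, 1.0],
}
warnings = add_with_conflict_check(store, "maxPoolSize should be 100", [0.8, 0.6])
```

Returning the warnings instead of raising leaves the decision to the caller, which is where a resolver strategy would plug in.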

Lineage and provenance

Every piece of knowledge is traceable. Supersession chains, derived records, accessor history, confidence scores.

$ engram demo --step 7
━━━ Memory lineage ━━━

  Record: [cb971859] semantic | active
  Confidence: 0.85

  Supersession chain:
    ← [93123577] maxPoolSize=100 (deprecated)
    ← [e0ea36f4] maxPoolSize=50  (deprecated)

  Derived records:
    → [4f579442] Pool sizing runbook (auto-promoted)

  ★ Who created it. What it replaced. What was derived.
    Full provenance for audit and compliance.
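
Walking a supersession chain is a linked-list traversal. The sketch below assumes each record stores the id of the one record it replaced, and that the chain in the demo is linear; the `supersedes` mapping and the function name are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical lineage data mirroring the demo output: record id -> id it superseded.
supersedes: Dict[str, Optional[str]] = {
    "cb971859": "93123577",
    "93123577": "e0ea36f4",
    "e0ea36f4": None,
}


def supersession_chain(record_id: str, links: Dict[str, Optional[str]]) -> List[str]:
    """Return the ids a record replaced, newest first."""
    chain: List[str] = []
    seen = {record_id}
    current = links.get(record_id)
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = links.get(current)
    return chain


chain = supersession_chain("cb971859", supersedes)
```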

Audit trail

Every operation recorded. Append-only log of adds, searches, outcomes, promotions, conflicts, and lifecycle changes.

$ engram demo --step 8
━━━ Recent audit log ━━━

  1. [14:51] ADD    episodic "Payment service OOM..."
  2. [14:53] SEARCH "OOM kills" → 1 result
  3. [14:53] OUTCOME success_count → 1
  4. [14:54] OUTCOME success_count → 2
  5. [14:54] OUTCOME success_count → 3
  6. [14:54] PROMOTE episodic → procedural
  7. [14:55] ADD    semantic "maxPoolSize=100"
  8. [14:55] CONFLICT_DETECTED similarity=0.73
  9. [14:56] ADD    semantic "maxPoolSize=250" supersedes

  In-memory by default (configurable). In production, audit entries
  are emitted as structured logs (CloudWatch, Elasticsearch, Loki).
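
An append-only log with structured entries can be sketched as follows. The `AuditLog` class and its field names (`ts`, `op`) are assumptions for illustration, not memledger's schema; a production sink would write each JSON line to a log stream instead of keeping it in a list.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List, Tuple


class AuditLog:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[str] = []

    def append(self, op: str, **details: Any) -> None:
        entry = {"ts": datetime.now(timezone.utc).isoformat(), "op": op, **details}
        # Serialize immediately so later mutation of `details` cannot alter history.
        self._entries.append(json.dumps(entry, sort_keys=True))

    def entries(self) -> Tuple[Dict[str, Any], ...]:
        # Return parsed copies so callers cannot mutate stored entries.
        return tuple(json.loads(line) for line in self._entries)


log = AuditLog()
log.append("ADD", kind="episodic", content="Payment service OOM...")
log.append("SEARCH", query="OOM kills", results=1)
log.append("OUTCOME", success_count=1)
```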

Pluggable strategies

Deterministic by default. LLM-driven when you need it. One YAML change per decision.

| Decision | Default | LLM opt-in |
| --- | --- | --- |
| Importance | Fixed 0.5 | LLM rates 1-10 |
| Write policy | Always store | LLM gates |
| Conflict resolution | Fire hook | LLM resolves |
| Reranking | Policy blend | LLM reranks |

# engram.yaml — switch importance to LLM
strategies:
  importance:
    provider: llm
    config:
      model: anthropic/claude-haiku-4-5

# Everything else stays deterministic.
# One line changed. Same backend. Same data.
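
How a config-selected strategy might look in code, as a sketch under stated assumptions: the `ImportanceStrategy` protocol, both classes, `build_importance`, and the `"fixed"` provider name are hypothetical, and `rate` stands in for the model call so the example runs without an LLM. Dividing the 1-10 rating by 10 is also an assumption.

```python
from typing import Callable, Protocol


class ImportanceStrategy(Protocol):
    def score(self, text: str) -> float: ...


class FixedImportance:
    """Deterministic default from the table: always 0.5."""

    def score(self, text: str) -> float:
        return 0.5


class LLMImportance:
    """LLM opt-in: the model rates 1-10; `rate` stands in for the model call."""

    def __init__(self, rate: Callable[[str], int]) -> None:
        self._rate = rate

    def score(self, text: str) -> float:
        rating = min(10, max(1, self._rate(text)))  # clamp out-of-range ratings
        return rating / 10


def build_importance(config: dict, rate: Callable[[str], int]) -> ImportanceStrategy:
    """Pick a strategy from a dict shaped like the YAML above."""
    provider = config.get("strategies", {}).get("importance", {}).get("provider", "fixed")
    return LLMImportance(rate) if provider == "llm" else FixedImportance()


fake_llm = lambda text: 8  # placeholder for a real model call
default = build_importance({}, fake_llm)
llm = build_importance({"strategies": {"importance": {"provider": "llm"}}}, fake_llm)
```

Because both classes satisfy the same protocol, the caller that asks for a score does not need to know which one the config selected.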