# Does PMB actually help?

> How PMB measures its own value - conservatively, on your data, and honest about when it can't be trusted.

Most memory tools assert they improve your agent and back it with one flattering number. PMB takes the opposite stance: **measure it conservatively, on *your* data, and say loudly when the signal isn't trustworthy yet.** A memory system you can't measure is one you can't trust.

There are two different questions inside "does it help?", and they need two different methods.

- Does recall find the **right** memory? Measured with reproducible benchmarks - LoCoMo recall@10 ≈ 94.5%, multilingual top-10 ≈ 99.2%.
- Does **using** memory change outcomes? Measured by Earned Memory, joining each surfaced lesson to the outcome of the turn it was active in.

## Earned Memory - three honest layers

PMB joins each surfaced lesson to the turn's outcome (tests pass/fail, red→green, build, deploy - no LLM) and reports effectiveness at three levels of rigor, refusing to overclaim at each one.

1. `success_rate(lesson active)` minus `success_rate(no lesson)`. Useful first look, but **confounded**: lessons surface on harder turns, so a helpful lesson can show negative lift. A flag for review, never ground truth.
2. Each lesson carries a **95% Wilson confidence interval** and a conservative verdict - `useful`/`harmful` only when the CI clears the baseline **and** n ≥ `min_n`; otherwise `unverified` or `insufficient`. An n=1 fluke can never read as a real effect.
3. The cleanest control without randomization: compare the **same lesson** when **followed** vs **ignored**. Both arms share the same trigger population, so it holds the surfacing trigger fixed.

## What PMB will not do

An untrustworthy metric never drives behaviour.
Earned Memory is measurement-only: it does not feed ranking or decay until the outcome signal is dense enough to trust. PMB would rather show you `insufficient` than let a flattering-but-wrong number quietly re-weight your memory.

## Run it on your own data

```bash
pmb health lessons-impact -w 90
```

Seeing `signal: insufficient` early is the honest answer, **not a bug** - outcome turns are rare, so a young workspace simply hasn't earned a verdict yet. A lesson only earns "useful"/"helps" once the statistics back it.
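
To make the Earned Memory layers concrete, here is a minimal Python sketch of how a raw rate difference and a Wilson-based verdict could be computed. It is an illustration under stated assumptions, not PMB's actual implementation. The function names (`rate_difference`, `wilson_interval`, `verdict`) and the `min_n=20` default are hypothetical. So is the mapping of `insufficient` to "too few samples" and `unverified` to "interval straddles the baseline", since the page does not spell out which label applies when.

```python
import math

Z_95 = 1.96  # two-sided 95% normal quantile


def rate_difference(s_a: int, n_a: int, s_b: int, n_b: int) -> float:
    """Success rate of arm A minus success rate of arm B.

    Works for the naive lift (lesson active vs no lesson) and, equally,
    for the followed-vs-ignored comparison of a single lesson.
    """
    return s_a / n_a - s_b / n_b


def wilson_interval(successes: int, n: int, z: float = Z_95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def verdict(successes: int, n: int, baseline: float, min_n: int = 20) -> str:
    """Conservative verdict; min_n=20 is an assumed default, not PMB's."""
    if n < min_n:
        return "insufficient"  # assumed meaning: too few outcome turns
    lo, hi = wilson_interval(successes, n)
    if lo > baseline:
        return "useful"        # whole interval sits above the baseline
    if hi < baseline:
        return "harmful"       # whole interval sits below the baseline
    return "unverified"        # assumed meaning: interval straddles baseline


if __name__ == "__main__":
    # A single success never earns a verdict, however good it looks.
    print(verdict(1, 1, baseline=0.5))  # insufficient
    print(verdict(45, 50, baseline=0.6))
    print(verdict(30, 50, baseline=0.6))
```

Checking the lower bound (rather than the point estimate) against the baseline is what keeps small samples from producing confident labels: with one success out of one, the Wilson interval still spans most of the unit range.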