commit 58f9f221b100aca67ddf13e0d384c99c56a72f63
Author: Carlos Ouyang <carlos@localhost.localdomain>
Date:   Wed Apr 15 11:06:05 2026 +0800

    Initial standalone memabra release

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..fffe149
--- /dev/null
+++ b/README.md
@@ -0,0 +1,84 @@
+# memabra
+
+An intuition-driven control plane for agent memory and action selection.
+
+## What is memabra?
+
+memabra is a local-first, observable, trainable, and replayable agent memory and action orchestration system.
+
+Instead of being a simple memory database, memabra acts as a meta-cognitive controller for agents: given a task, it quickly decides whether to answer directly, recall memory, load a skill, or invoke a tool — and continuously improves this judgment based on task outcomes.
+
+## Install
+
+```bash
+git clone https://github.com/TacitLab/memabra.git
+cd memabra
+python -m venv venv
+source venv/bin/activate
+pip install -e ".[dev]"
+```
+
+## Quick start
+
+### 1. See the available commands
+
+```bash
+memabra --help
+```
+
+### 2. Run a dry-run evaluation
+
+A safe way to see the full workflow without actually promoting a new router version:
+
+```bash
+memabra run --dry-run --format text
+```
+
+### 3. Check system status
+
+```bash
+memabra status --format text
+```
+
+### 4. List saved router versions
+
+```bash
+memabra version list --format text
+```
+
+### 5. Roll back to a previous version
+
+```bash
+memabra version rollback <version-id> --format text
+```
+
+## CLI subcommands
+
+| Command | Description |
+|---------|-------------|
+| `memabra run` | Run the online learning workflow |
+| `memabra status` | Show current system state |
+| `memabra version list` | List all saved router versions |
+| `memabra version rollback <id>` | Roll back to a specific version |
+
+## Text output format
+
+By default, memabra prints JSON. For operator-friendly summaries, add `--format text`:
+
+- **Status** — current version, trajectory/report counts, latest report timing and promotion outcome.
+- **Version list** — total count, current active version highlighted.
+- **Workflow** — grouped into Summary, Baseline, Challenger, Deltas, and Decision sections with normalized `yes/no` flags and fixed-precision metrics.
+
+## Running tests
+
+```bash
+pytest tests/ -q
+```
+
+## Project status
+
+See [docs/PROGRESS.md](docs/PROGRESS.md) for a detailed capability roadmap and [docs/DEMO.md](docs/DEMO.md) for walkthrough examples.
+
+## License
+
+MIT
diff --git a/docs/ALPHA_ITERATION_1_PLAN.md b/docs/ALPHA_ITERATION_1_PLAN.md
new file mode 100644
index 0000000..382423f
--- /dev/null
+++ b/docs/ALPHA_ITERATION_1_PLAN.md
@@ -0,0 +1,252 @@
+# memabra Alpha Iteration 1 Plan
+
+> For Hermes: continue this plan autonomously in small TDD-driven increments. Each run should complete one or more concrete tasks, update this file's progress section, run targeted tests first, then run the full memabra test suite.
+
+Goal: turn memabra from a showable prototype into a safe self-improving alpha by adding an online learning loop with automatic training, evaluation, gated promotion, and rollback-safe router deployment.
+
+Architecture:
+- Keep the current layered design.
+- Do not replace existing routers; add an orchestration layer around them.
+- Promotion must be benchmark-gated: no automatic router switch without passing evaluation thresholds.
+- Persist every training/promotion attempt as an auditable artifact.
+
+Tech stack:
+- Existing memabra Python package under `src/memabra/`
+- Existing pytest suite under `tests/memabra/`
+- Existing persistence via JSON artifacts; keep it simple for alpha
+
+---
+
+## Acceptance criteria
+
+Alpha Iteration 1 is complete when memabra can:
+1. detect newly accumulated trajectories
+2. build a training dataset from eligible trajectories
+3. train a challenger router automatically
+4. run challenger vs baseline on a fixed benchmark set
+5. promote challenger only if thresholds are met
+6. save a versioned promoted router
+7. keep an auditable training/promotion report
+8. leave the currently active router unchanged when challenger loses
+
+---
+
+## Implementation phases
+
+### Phase A — Benchmark-gated online learning loop
+
+#### Task A1: Add a promotion policy object
+Objective: define explicit acceptance rules for promoting a challenger router.
+
+Files:
+- Create: `src/memabra/promotion.py`
+- Create: `tests/memabra/test_promotion.py`
+
+Required behavior:
+- Define a `PromotionPolicy` dataclass
+- Inputs should include at least:
+  - `min_reward_delta`
+  - `max_error_rate_increase`
+  - `max_latency_increase_ms`
+  - `required_task_count`
+- Provide `evaluate(baseline, challenger) -> PromotionDecision`
+- `PromotionDecision` should include:
+  - `accepted: bool`
+  - `reasons: list[str]`
+  - `metrics: dict`
+
+TDD steps:
+1. Write failing tests for accepted and rejected cases.
+2. Run targeted tests and verify failure.
+3. Implement minimal policy logic.
+4. Re-run targeted tests.
+5. Re-run full memabra suite.
+
+#### Task A2: Add benchmark suite persistence
+Objective: store and load a fixed benchmark task set for repeatable evaluations.
+
+Files:
+- Create: `src/memabra/benchmarks.py`
+- Create: `tests/memabra/test_benchmarks.py`
+
+Required behavior:
+- Define a serializable benchmark suite format
+- Load/save benchmark tasks from JSON
+- Provide a default benchmark seed for memory/tool/skill/composite coverage
+
+TDD steps:
+1. Write failing benchmark round-trip tests.
+2. Verify RED.
+3. Implement load/save helpers.
+4. Verify GREEN.
+5. Run full suite.
+
+#### Task A3: Add online training coordinator
+Objective: orchestrate dataset selection, training, evaluation, and promotion.
+
+Files:
+- Create: `src/memabra/online_learning.py`
+- Create: `tests/memabra/test_online_learning.py`
+
+Required behavior:
+- Define `OnlineLearningCoordinator`
+- It should:
+  - query trajectories from `ArtifactIndex`
+  - enforce minimum new trajectory count
+  - train a challenger with `DatasetBuilder`
+  - evaluate challenger with `Evaluator`
+  - apply `PromotionPolicy`
+  - save promoted routers via `RouterVersionStore`
+  - emit a structured report whether accepted or rejected
+
+TDD steps:
+1. Write failing tests for:
+  - skip when too few new trajectories
+  - reject when policy fails
+  - accept and save version when policy passes
+2. Verify failure.
+3. Implement minimal coordinator.
+4. Verify targeted tests.
+5. Run full suite.
+
+### Phase B — Auditability and safe deployment
+
+#### Task B1: Add training run reports
+Objective: persist every online-learning attempt, not just successful promotions.
+
+Files:
+- Extend: `src/memabra/persistence.py` or create `src/memabra/training_reports.py`
+- Create: `tests/memabra/test_training_reports.py`
+
+Required behavior:
+- Save a JSON report per training run
+- Include:
+  - timestamp
+  - source trajectory ids
+  - sample count
+  - baseline metrics
+  - challenger metrics
+  - promotion decision
+  - promoted version id if any
+
+#### Task B2: Add active router metadata tracking
+Objective: make it obvious which router is active and why.
+
+Files:
+- Extend: `src/memabra/router_versioning.py`
+- Extend: `tests/memabra/test_router_versioning.py`
+
+Required behavior:
+- Track metadata for current active router
+- Record promotion source, benchmark result summary, and prior version
+- Make rollback preserve audit trail
+
+### Phase C — Product surface and automation
+
+#### Task C1: Add app-level online learning entrypoint
+Objective: expose one-call retrain/evaluate/promote behavior from `MemabraApp`.
+
+Files:
+- Extend: `src/memabra/app.py`
+- Extend: `tests/memabra/test_app.py`
+
+Required behavior:
+- Add a method like `run_online_learning_cycle(...)`
+- Return a structured result dict/report
+
+#### Task C2: Add CLI entrypoint for the alpha loop
+Objective: make the safe online-learning loop runnable from the command line.
+
+Files:
+- Extend: `src/memabra/cli.py`
+- Extend: `tests/memabra/test_cli_workflow.py`
+- Update: `docs/projects/memabra/DEMO.md`
+
+Required behavior:
+- Add a callable workflow that:
+  - seeds or uses existing artifacts
+  - runs one online-learning cycle
+  - prints the report JSON
+
+#### Task C3: Update docs and wrap-up materials
+Objective: document the alpha loop clearly.
+
+Files:
+- Update: `docs/projects/memabra/PROGRESS.md`
+- Update: `docs/projects/memabra/ROADMAP.md`
+- Update: `docs/projects/memabra/DEMO.md`
+- Optional: create `docs/projects/memabra/ONLINE_LEARNING.md`
+
+Required behavior:
+- Explain promotion gates
+- Explain how to run one cycle manually
+- Explain where reports and versions are stored
+
+---
+
+## Suggested run order for autonomous 20-minute cycles
+
+Cycle group 1:
+- A1 promotion policy
+- A2 benchmark suite persistence
+
+Cycle group 2:
+- A3 online training coordinator
+
+Cycle group 3:
+- B1 training run reports
+- B2 active router metadata tracking
+
+Cycle group 4:
+- C1 app-level entrypoint
+- C2 CLI workflow
+- C3 docs cleanup
+
+---
+
+## Estimated autonomous runs
+
+Recommended initial budget: 18 runs, one every 20 minutes.
+
+Reasoning:
+- 3 to 4 runs for Phase A
+- 3 to 4 runs for Phase B
+- 2 to 3 runs for Phase C
+- remaining runs as slack for regression fixes, docs cleanup, and one or two extra quality passes
+
+At 20 minutes per run, 18 runs gives about 6 hours of autonomous iteration, which is a reasonable overnight alpha push.
+
+---
+
+## Progress tracker
+
+- [x] Task A1 — promotion policy
+- [x] Task A2 — benchmark suite persistence
+- [x] Task A3 — online training coordinator
+- [x] Task B1 — training run reports
+- [x] Task B2 — active router metadata tracking
+- [x] Task C1 — app-level online learning entrypoint
+- [x] Task C2 — CLI online learning workflow
+- [x] Task C3 — docs cleanup and operator guidance
+- [x] Task D1 — baseline version selection for online learning
+- [x] Task E1 — task case index for episodic retrieval
+
+## Run log
+
+- 2026-04-14: Plan created. Ready for autonomous overnight execution.
+- 2026-04-14 22:52 UTC: Completed Tasks A1–A3. Promotion policy, benchmark persistence, and online training coordinator implemented with tests. Full suite: 71 passed.
+- 2026-04-14 23:22 UTC: Completed Tasks B1–C3. Training reports, active router metadata tracking, app/CLI entrypoints, and docs implemented with tests. Full suite: 78 passed.
+- 2026-04-14 23:24 UTC: Quality pass — CLI main() now defaults to online-learning workflow, fixed schema test resource warning, added missing alpha module exports to package __init__.py. Full suite: 82 passed.
+- 2026-04-14 23:50 UTC: Docs and repo hygiene pass — updated DEMO.md and ONLINE_LEARNING.md to reflect that `python -m src.memabra.cli` runs the online-learning workflow; added `docs/projects/memabra/demo-artifacts/` to `.gitignore`; verified CLI end-to-end (promoted=true, version saved, report emitted). Full suite: 82 passed.
+- 2026-04-15 00:49 UTC: Safety and usability pass — added exception handling in `OnlineLearningCoordinator` so training/evaluation failures emit error reports instead of crashing; added CLI argument parsing (`--base-dir`, `--min-new-trajectories`); fixed `python -m src.memabra.cli` RuntimeWarning via lazy `cli` import; added `TrainingReportStore.get_report()` for by-id lookup; exported `BenchmarkTask` from package `__init__.py`; updated DEMO.md and ONLINE_LEARNING.md. Full suite: 88 passed.
+- 2026-04-15 01:15 UTC: Repo hygiene and commit pass — verified end-to-end CLI workflow produced a promoted router, version, and report; updated `.gitignore` to exclude runtime artifact directories (`router-versions/`, `training-reports/`); committed entire memabra alpha codebase (67 files, 6,818 insertions). Full suite: 88 passed.
+- 2026-04-15 02:00 UTC: Persistence pass — `OnlineLearningCoordinator` now supports `seen_trajectory_store` to persist seen trajectory IDs across restarts, preventing duplicate retraining in cron jobs. Added `test_coordinator_persists_seen_trajectory_ids_across_restarts`. Fixed evaluation leakage by refreshing the artifact index after benchmarking and marking post-evaluation trajectories as seen. Wired `seen_trajectory_store` through `app.py` and `cli.py`; CLI now defaults to `<base-dir>/seen-trajectories.json`. Added corresponding tests. Full suite: 91 passed.
+- 2026-04-15 02:27 UTC: Dry-run pass — committed pending persistence-pass changes, then added `--dry-run` CLI flag and `dry_run` parameter through the full stack (`OnlineLearningCoordinator`, `app.py`, `cli.py`). In dry-run mode training and evaluation execute but promotion and version saving are skipped; an audit report is still emitted with `dry_run: true`. Added `test_coordinator_dry_run_does_not_promote_or_save_version` and `test_main_entrypoint_passes_dry_run_flag`. Updated `ONLINE_LEARNING.md`. Full suite: 93 passed.
+- 2026-04-15 02:51 UTC: Baseline-version pass — added `baseline_version_id` parameter to `OnlineLearningCoordinator.run_cycle()`, `MemabraApp.run_online_learning_cycle()`, and CLI `--baseline-version` flag. This lets operators evaluate a challenger against a specific saved router version rather than the currently active one. Added tests for coordinator, app, and CLI. Updated `ONLINE_LEARNING.md`. Full suite: 96 passed.
+- 2026-04-15 03:18 UTC: Verification pass — confirmed all tasks A1–D1 are complete and stable. Ran full memabra suite (96 passed) and end-to-end CLI workflow (promoted=true, version saved, report emitted). No code changes required; repo is clean and ready for operator review.
+- 2026-04-15 04:02 UTC: Started Phase E — added `CaseIndex` (`src/memabra/case_index.py`) for task-level episodic retrieval. Maps normalized task inputs to the highest-reward trajectory ID, with JSON save/load. Added `tests/memabra/test_case_index.py` (4 tests). Full suite: 100 passed.
+- 2026-04-15 04:27 UTC: Integrated `CaseIndex` into `MemabraApp` and `MemabraRunner` for episodic retrieval. Added app-level methods (`build_case_index`, `save_case_index`, `load_case_index`, `best_trajectory_for`). Runner now injects an episodic memory candidate when a case index hit occurs. Added CLI flags `--case-index` and `--rebuild-case-index`. Updated docs. Full suite: 107 passed.
+- 2026-04-15 04:54 UTC: Added `case_index_path` support to `OnlineLearningCoordinator` so the case index is automatically rebuilt after each online-learning cycle (including benchmark-generated trajectories). Wired parameter through `app.py` and `cli.py`. Added tests for coordinator, app, and CLI. Full suite: 110 passed.
+- 2026-04-15 05:18 UTC: Added `TrajectorySummarizer` (`src/memabra/trajectory_summary.py`) for generating human-readable trajectory summaries. Integrated summarizer into `MemabraRunner` so episodic memory candidates contain rich summaries when a `persistence_store` is available. Added `tests/memabra/test_trajectory_summary.py` (4 tests) and updated runner test. Full suite: 114 passed.
+- 2026-04-15 05:42 UTC: Added CLI `--status` flag (`src/memabra/cli.py`) to print current system state (active router version, version count, trajectory count, report count, latest report summary) without running a learning cycle. Added `tests/memabra/test_cli_workflow.py::test_main_status_flag_prints_status_and_skips_workflow`. Full suite: 115 passed.
+- 2026-04-15 06:05 UTC: Added CLI `--rollback` and `--list-versions` flags for operator-safe router version management. Added error handling for missing rollback targets (exits 1 with clean message). Added corresponding tests. Full suite: 118 passed. Updated `ONLINE_LEARNING.md` and `DEMO.md` documentation.
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
new file mode 100644
index 0000000..550fb35
--- /dev/null
+++ b/docs/ARCHITECTURE.md
@@ -0,0 +1,219 @@
+# Architecture
+
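+## 1. Problem definition
+
+The problem we are solving is not "how to make the model remember more", but rather:
+when an agent faces a task, how can it quickly decide, under limited context, limited tool budget, and limited time, whether to invoke memory, a skill, or a tool, in a way that lets this decision process be trained and corrected?
+
+## 2. System overview
+
+The system uses a four-layer architecture.
+
+### 2.1 Retrieval Layer (candidate recall)
+Inputs:
+- current user task
+- short conversation summary
+- current environment state
+- failure history / recent corrections
+
+Outputs: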
+- top-k memory candidates
+- top-k skill candidates
+- top-k tool candidates
+
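+Responsibilities:
+- recall candidates from different sources
+- normalize them into a standard candidate format
+- make no final decision; only narrow the search space
+
+### 2.2 Policy Layer (intuition / routing)
+Inputs:
+- current task representation
+- candidate set
+- historical selection features
+- cost and risk signals
+
+Outputs:
+- answer directly
+- read a specific memory
+- load a specific skill
+- call a specific tool
+- composite action (e.g., skill first, then tool)
+- request clarification
+
+Responsibilities:
+- simulate "intuition"
+- make a fast action selection first
+- can later be upgraded step by step from rules to a classifier, reranker, bandit, or RL policy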
+
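+### 2.3 Execution Layer
+Responsibilities:
+- inject memory into the context
+- load skill instructions
+- call real tools
+- record execution steps, latency, errors, and outputs
+
+### 2.4 Evaluation Layer (feedback / attribution)
+Responsibilities:
+- determine whether the task succeeded
+- analyze step count, retry count, error rate, and number of user corrections
+- decompose the reward
+- produce trainable trajectories
+
+Without this layer there is no real "learning", only guesswork-driven parameter tuning.
+
+## 3. Unified object model
+
+Although memory, skill, and tool differ in nature, they can be unified as candidate objects during the recall and routing stages: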
+
+```json
+{
+  "id": "string",
+  "type": "memory|skill|tool",
+  "title": "string",
+  "summary": "string",
+  "triggers": ["string"],
+  "cost": 0.0,
+  "confidence": 0.0,
+  "success_rate": 0.0,
+  "freshness": 0.0,
+  "risk": 0.0,
+  "embedding": "vector-ref",
+  "tags": ["string"],
+  "source": "user|system|generated|external"
+}
+```
+
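+Note: what is unified is the candidate interface, not the semantic ontology.
+The three object types must keep their boundaries:
+- memory stores facts
+- skill stores procedures
+- tool stores action capabilities
+
+## 4. Memory system layers
+
+### 4.1 Semantic Memory (facts)
+Examples:
+- user preferences
+- machine environment
+- project conventions
+- API limits
+
+### 4.2 Procedural Memory
+That is, skills:
+- the handling workflow for a class of tasks
+- lessons learned from pitfalls
+- verification steps
+
+### 4.3 Episodic Memory
+- the concrete trajectory of a specific task
+- which resources were used at the time
+- why it succeeded or failed
+
+### 4.4 Working Memory
+- temporary state of the current task
+- intermediate products of the current round of reasoning
+- should not be persisted directly as long-term memory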
+
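+## 5. Training strategy: external policy first, end-to-end later
+
+### 5.1 Phase A: do not modify base model weights
+First train a small policy model that decides:
+- whether to look up memory
+- which kind of memory to look up
+- whether a skill is needed
+- which tool to use first
+
+Possible implementations:
+- rules + score fusion
+- lightweight classifier
+- reranker
+- contextual bandit
+
+### 5.2 Phase B: learn reranking / routing from trajectories
+Training inputs:
+- task context
+- candidate set
+- action actually taken
+- resulting reward
+
+Training objectives:
+- maximize task completion rate
+- minimize useless calls
+- reduce how often users must repeat information
+- reduce unnecessary context bloat
+
+### 5.3 Phase C: end-to-end experiments
+Worth considering only when all of the following hold:
+- high-quality trajectory data already exists
+- credit assignment is feasible
+- a stable offline evaluation environment exists
+- catastrophic forgetting can be controlled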
+
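+## 6. Feedback & Reward design
+
+The reward cannot look only at whether the task succeeded. It should be decomposed into several terms:
+- task_success: whether the task was ultimately completed
+- efficiency: how many steps were used
+- retrieval_hit: whether the key memory/skill/tool was hit
+- user_correction_penalty: whether the user had to correct the agent
+- tool_error_penalty: whether an invalid tool call was triggered
+- context_cost_penalty: whether the context grew excessively
+- latency_penalty: whether the run was too slow
+
+These can be combined as: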
+
+```text
+R = a*task_success + b*retrieval_hit - c*tool_error - d*user_correction - e*latency - f*context_cost
+```
+
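+## 7. Key difficulties
+
+### 7.1 Credit Assignment
+When a task succeeds, which component actually deserves the credit?
+Counterfactual analysis requires recording the candidate set, the final selection, and the alternatives that were not selected.
+
+### 7.2 False Reinforcement
+A wrong memory that is hit repeatedly reinforces itself.
+This requires:
+- confidence scores
+- revocability
+- last-verified time
+- source tracking
+
+### 7.3 Exploitation vs Exploration
+Always picking the safest object makes the system conservative, so it never learns new patterns.
+A safe exploration mechanism is needed.
+
+### 7.4 Type Boundary Collapse
+If memory, skill, and tool are mixed into one big vector pool, the system becomes increasingly blurry.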
+
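+## 8. Recommended MVP
+
+### MVP-1: Observable system
+- define the object schema
+- define the event schema
+- log trajectories in a uniform way
+- implement basic retrieval
+- route with rules
+
+### MVP-2: Lightweight learned routing
+- add a candidate scorer
+- train an action selector from high-quality trajectories
+- run offline replay evaluation
+
+### MVP-3: Online adaptation
+- use bandit / preference updates
+- fine-tune the routing policy based on task outcomes
+
+### MVP-4: End-to-end testbed
+- small-scale experimental training
+- compare against the layered approach
+- verify whether there is a real benefit
+
+## 9. Core principles
+
+1. Observability first, learnability second
+2. Learn routing first, then learn the brain
+3. Do layered attribution first, then end-to-end optimization
+4. Optimize "when to rely on what", rather than blindly optimizing for "the model looks smarter"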
\ No newline at end of file
diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md
new file mode 100644
index 0000000..7e01ce6
--- /dev/null
+++ b/docs/DECISIONS.md
@@ -0,0 +1,94 @@
+# Design Decisions
+
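+## D-001: End-to-end training is not the goal of the first phase
+
+Decision:
+The first phase uses a layered architecture and does not directly train a black-box large model that maps tasks to actions.
+
+Reasons:
+- feedback is sparse
+- credit assignment is hard
+- with too little data the model easily learns the wrong thing
+- interpretability is poor, which makes debugging hard
+
+Impact:
+The project first builds the observability, logging, router, and reward layers.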
+
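+## D-002: Unify memory, skill, and tool under one candidate interface without conflating their types
+
+Decision:
+During recall and ranking, the three share a unified candidate schema; during storage, execution, and evaluation, strong type boundaries are kept.
+
+Reasons:
+- unified recall simplifies routing decisions
+- keeping type boundaries avoids semantic collapse
+
+Impact:
+Later schema design must support both unified features and type-specific fields.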
+
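+## D-003: Memory is split into four layers: facts / procedures / episodes / working
+
+Decision:
+In the long run, the system distinguishes at least:
+- facts
+- procedures
+- episodes
+- working memory
+
+Reasons:
+"Memory" is not one lump of text; effective human intuition comes from several memory systems working together.
+
+Impact:
+Every write must first decide which layer it belongs to, instead of being stuffed directly into a single vector store.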
+
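+## D-004: Optimize the routing policy first, then consider learning the foundation model's internal weights
+
+Decision:
+The learning target is placed first on the external policy, not on the foundation model's parameters.
+
+Reasons:
+- small models are cheaper
+- training is more stable
+- experiment results are easier to compare
+- better suited to local deployment
+
+Impact:
+Router features, training samples, and an offline evaluation framework need dedicated design.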
+
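+## D-005: Reward must be decomposed; a single task success/failure signal is not used
+
+Decision:
+The reward is decomposed into factors such as success, efficiency, retrieval_hit, user_correction, tool_error, latency, and context_cost.
+
+Reasons:
+Looking only at task success hides many quality problems in intermediate behavior.
+
+Impact:
+Event-level logging is required; storing only the final answer is not enough.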
+
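+## D-006: All learning is built on replayable trajectories
+
+Decision:
+Every policy update must be traceable to a complete trajectory.
+
+Reasons:
+Without replay, policy degradation cannot be diagnosed; without replay, human audit is also impossible.
+
+Impact:
+The trajectory schema and replay tooling become infrastructure, not optional extras.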
+
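+## D-007: The project is officially named memabra
+
+Decision:
+The official project name is `memabra`.
+
+Subtitle:
+An intuition-driven control plane for agent memory and action selection.
+
+Reasons:
+- a short name that can be branded and spread is needed
+- the subtitle supplies the technical essence
+- it avoids the old name misleading people into seeing the project as "just a memory management tool"
+
+Impact:
+All subsequent prototype code, documentation, schema identifiers, and demo materials use memabra consistently.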
\ No newline at end of file
diff --git a/docs/DEMO.md b/docs/DEMO.md
new file mode 100644
index 0000000..913805e
--- /dev/null
+++ b/docs/DEMO.md
@@ -0,0 +1,148 @@
+# Demo
+
+memabra now has a polished wrap-up workflow in addition to the lower-level demo app.
+
+## Quick run
+
+If you installed the repo in editable mode, prefer the dedicated CLI command:
+
+```bash
+source venv/bin/activate
+memabra
+```
+
+The legacy developer entrypoint still works too:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli
+```
+
+This runs the online-learning loop: it seeds demo tasks, trains a challenger router, evaluates it against a benchmark suite, promotes it if thresholds are met, and prints a JSON report.
+
+You can override the default artifact directory and minimum trajectory threshold:
+
+```bash
+source venv/bin/activate
+memabra run --base-dir /custom/artifacts --min-new-trajectories 5
+```
+
+You can also enable episodic retrieval by rebuilding the case index from saved trajectories:
+
+```bash
+source venv/bin/activate
+memabra run --rebuild-case-index
+```
+
+You can check system status, list versions, or roll back without running a learning cycle:
+
+```bash
+source venv/bin/activate
+memabra status
+memabra version list
+memabra version rollback 20260414-123456
+```
+
+If you want operator-friendly output instead of raw JSON, use `--format text`:
+
+```bash
+source venv/bin/activate
+memabra status --format text
+memabra version list --format text
+memabra version rollback 20260414-123456 --format text
+memabra run --dry-run --format text
+```
+
+The text formatter is aimed at operators: status output includes the latest report timing/outcome, version listings highlight the currently active router version, and workflow output is grouped into summary/baseline/challenger/deltas/decision sections with normalized yes/no and fixed-precision metrics.
+
+You can also call it programmatically:
+
+```bash
+source venv/bin/activate
+python - <<'PY'
+from src.memabra.cli import run_online_learning_workflow
+result = run_online_learning_workflow()
+print(result)
+PY
+```
+
+The online-learning workflow will:
+1. build a demo app
+2. seed example tasks (if no trajectories exist yet)
+3. run one online-learning cycle
+4. train a challenger router
+5. evaluate it against the baseline on a fixed benchmark suite
+6. promote it only if the promotion policy accepts
+7. persist a training report under `training-reports/`
+8. print a JSON report
+
+## Python API
+
+```python
+from src.memabra.cli import run_wrapup_workflow, run_online_learning_workflow
+
+# Legacy wrap-up demo
+result = run_wrapup_workflow()
+print(result)
+
+# Safe online-learning loop with benchmark-gated promotion
+result = run_online_learning_workflow()
+print(result)
+```
+
+## Lower-level demo app
+
+You can still drive the app manually:
+
+```bash
+source venv/bin/activate
+python - <<'PY'
+from src.memabra.app import build_demo_app
+app = build_demo_app()
+
+for prompt in [
+    'Use my telegram preference for this answer.',
+    'Check the current system status.',
+    'Deploy this service with the usual workflow.',
+]:
+    trajectory = app.run_task(prompt, channel='telegram', user_id='oza')
+    print(prompt)
+    print(trajectory['decisions'][0]['decision_type'], trajectory['outcome']['status'], trajectory['reward']['total'])
+    print([event['event_type'] for event in trajectory['events']])
+    print('---')
+
+print(app.replay_summary())
+PY
+```
+
+## Output locations
+
+By default the workflows write to:
+- `docs/projects/memabra/demo-artifacts/trajectories/`
+- `docs/projects/memabra/demo-artifacts/memories/`
+- `docs/projects/memabra/demo-artifacts/router-versions/`
+- `docs/projects/memabra/demo-artifacts/training-reports/`
+
+## What this proves
+
+The alpha is able to demonstrate the whole loop:
+- retrieval
+- routing
+- execution
+- persistence
+- replay
+- training
+- evaluation
+- router versioning
+- benchmark-gated promotion
+- auditable training reports
+
+## Limits
+
+This is still an alpha:
+- learning is lightweight, not a deep model
+- storage is JSON-file based
+- promotion policy thresholds are manually configured
+- tool/skill integration is still narrower than a production agent platform
+
+But it is now a safe, self-improving alpha, not just a pile of modules.
diff --git a/docs/EXECUTION_AND_PERSISTENCE.md b/docs/EXECUTION_AND_PERSISTENCE.md
new file mode 100644
index 0000000..17bee56
--- /dev/null
+++ b/docs/EXECUTION_AND_PERSISTENCE.md
@@ -0,0 +1,77 @@
+# Execution and Persistence
+
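+## Goal
+
+Give memabra the two pieces of skeleton that make the system actually usable:
+- execution: bring routing decisions into an executable action layer
+- persistence: let trajectories and memory records be written to disk
+
+## Current implementation
+
+### execution.py
+Provides:
+- `ActionResult`
+- `MemoryExecutor`
+- `SkillExecutor`
+- `ToolExecutor` (formerly MockToolExecutor, now upgraded to connect to real backends)
+- `ExecutionEngine`
+- `ToolBackend` protocol (supports passing `params`)
+- `LocalFunctionToolAdapter`: maps a tool to a local Python function
+- `SubprocessToolAdapter`: maps a tool to a shell command
+- `ToolRegistry`: registers, looks up, and executes tools by `tool_id`
+
+Current behavior:
+- `inject_memory` emits a `memory_injected` event and, when a memory store is attached, updates `last_used_at`
+- `load_skill` emits a `skill_loaded` event
+- `call_tool` calls a real backend through the `ToolBackend` protocol and emits `tool_called` and `tool_result` events
+- `RouteDecision` now carries `selected_payloads`, which lets candidate parameters be passed through `ToolExecutor` to the backend
+- other `decision_type` values are a noop for now
+
+Why this step matters:
+for the first time memabra has an execution stage, not just a policy stage.
+The tool layer can now plug into real local-function or subprocess backends and is no longer a pure mock.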
+
+### persistence.py
+Provides:
+- `PersistenceStore`
+
+Current capabilities:
+- save trajectories to `artifacts/trajectories/`
+- read trajectories
+- list trajectory files
+- save memory records to `artifacts/memories/`
+- read memory records
+- list memory files
+
+This means prototype artifacts no longer exist only in memory.
+
+### runner writeback integration
+The runner now supports:
+- attaching an execution engine
+- attaching a persistence store
+- attaching a memory store
+- appending execution events after execution
+- optionally writing the trajectory to disk
+- basic writeback / mark_used for memory-inject decisions
+
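+## Current closed loop
+
+The minimal system flow is now:
+task -> retrieval -> router -> execution -> trajectory -> validation -> persistence -> replay
+
+This is starting to feel like a real agent runtime.
+
+## Current limitations
+
+- ~~tool execution is still mocked~~ upgraded to pluggable real backends
+- skill execution is event-level only and does not actually load skills
+- writeback logic is still very rough
+- persistence is currently JSON files, with no index layer
+
+## Suggested next steps
+
+1. ~~Build a real `ToolExecutor` / `SkillExecutor` adapter protocol~~ tool adapter done
+2. Build a real `SkillExecutor` adapter (load skill payloads from the file system)
+3. Make persistence the default data source for replay
+4. Add real outcome / reward update logic to the runner
+5. Add richer telemetry and failure-event attribution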
diff --git a/docs/NAMING.md b/docs/NAMING.md
new file mode 100644
index 0000000..c8745df
--- /dev/null
+++ b/docs/NAMING.md
@@ -0,0 +1,48 @@
+# Naming
+
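+The final name is:
+
+# memabra
+
+Subtitle:
+An intuition-driven control plane for agent memory and action selection.
+
+## Rationale
+
+The name works because it satisfies two things at once:
+
+1. As a brand name, it is short, memorable, and distinctive.
+2. As a system name, together with the subtitle, it conveys accurately that the project is not a "memory store" but an action selection and control system for memory, skill, and tool.
+
+## Naming strategy
+
+- Brand name: `memabra`
+- Technical description: `An intuition-driven control plane for agent memory and action selection.`
+
+With this split:
+- `memabra` makes the project memorable
+- the subtitle makes it understandable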
+
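+## Why not a purely functional name
+
+A name that describes the function directly, such as `Agent Memory Manager`, is too narrow:
+- it sounds too much like a storage tool
+- it does not convey routing / policy / evaluation / learning
+- it does not convey that the system is a metacognitive controller for agents
+
+## Suggested internal wording
+
+In technical documentation, memabra can be described as:
+- local-first metacognitive router
+- agent memory and action orchestration system
+- intuition-driven control plane
+
+These three phrasings suit, respectively:
+- research contexts
+- engineering contexts
+- external introductions
+
+## Conclusion
+
+The name no longer emphasizes "memory manager"; it emphasizes "intuition-driven control".
+This is closer to the project's real skeleton.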
\ No newline at end of file
diff --git a/docs/ONLINE_LEARNING.md b/docs/ONLINE_LEARNING.md
new file mode 100644
index 0000000..bd0ef43
--- /dev/null
+++ b/docs/ONLINE_LEARNING.md
@@ -0,0 +1,171 @@
+# Online Learning Operator Guide
+
+## What it does
+
+memabra's online learning loop lets the system safely retrain its router from accumulated trajectories, evaluate the new challenger against the current baseline, and promote it only if explicit thresholds are met.
+
+## How to run one cycle
+
+### From Python
+
+```python
+from src.memabra.cli import run_online_learning_workflow
+
+result = run_online_learning_workflow()
+print(result)
+```
+
+### From the shell
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli
+```
+
+Or with custom options:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --base-dir /custom/artifacts --min-new-trajectories 5
+```
+
+By default the CLI persists seen trajectory IDs to `<base-dir>/seen-trajectories.json` so repeated runs skip already-processed data. You can override the path:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --seen-trajectory-store /custom/artifacts/seen.json
+```
+
+### Dry-run mode
+
+To train and evaluate a challenger without actually promoting it or saving a new router version:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --dry-run
+```
+
+This still produces a training report (with `dry_run: true`) so you can inspect what would have happened before allowing a real promotion.
+
+### Evaluate against a specific baseline version
+
+By default the online-learning cycle uses the currently active router as the baseline. You can pin the baseline to a specific saved version instead:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --baseline-version 20260414-123456
+```
+
+This is useful when you want to compare a challenger against a known-good version rather than whatever happens to be active right now. The report will record `baseline_version_id` for audit.
+
+### Episodic retrieval with case index
+
+You can load or rebuild a case index for episodic retrieval during task execution:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --rebuild-case-index
+```
+
+This builds a `CaseIndex` from all saved trajectories and saves it to the default path (`<base-dir>/case-index.json`). On subsequent runs, load it without rebuilding:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --case-index /custom/artifacts/case-index.json
+```
+
+When a case index path is provided, the online-learning cycle automatically rebuilds the index after training and evaluation, so benchmark-generated trajectories are included for future episodic retrieval.
+
+When a case index is loaded, the runner injects an episodic memory candidate into retrieval for inputs that match a previously seen task, surfacing the best past trajectory as a hint to the router.
+
+You can also run one online-learning cycle inline from the shell:
+
+```bash
+source venv/bin/activate
+python - <<'PY'
+from src.memabra.cli import run_online_learning_workflow
+print(run_online_learning_workflow())
+PY
+```
+
+## Promotion gates
+
+A challenger is promoted only when **all** of the following are true:
+
+- `reward_delta >= min_reward_delta` — the challenger must improve average reward by at least this amount
+- `error_rate_delta <= max_error_rate_increase` — the challenger must not increase errors beyond this limit
+- `latency_delta_ms <= max_latency_increase_ms` — the challenger must not become slower beyond this limit
+- `task_count >= required_task_count` — the benchmark must include at least this many tasks
+
+The default policy in the CLI workflow is lenient for alpha exploration. In production you should tighten these thresholds.
+
+## Where reports and versions are stored
+
+By default everything lands under:
+
+- `docs/projects/memabra/demo-artifacts/trajectories/` — raw task trajectories
+- `docs/projects/memabra/demo-artifacts/router-versions/versions/` — versioned router weights
+- `docs/projects/memabra/demo-artifacts/router-versions/current.json` — active router metadata (includes promotion source, benchmark summary, prior version, rollback history)
+- `docs/projects/memabra/demo-artifacts/training-reports/` — one JSON report per training run
+
+## What happens when the challenger loses
+
+- The active router in the app **remains unchanged**
+- A training report is still saved with the rejection reasons
+- No new version is registered as current
+
+## Rolling back
+
+You can roll back to any previous version from Python:
+
+```python
+from src.memabra.router_versioning import RouterVersionStore
+
+store = RouterVersionStore()
+store.rollback("20260414-123456")
+current = store.get_current()
+print(current)
+```
+
+Or from the CLI:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --rollback 20260414-123456
+```
+
+To see all available versions before rolling back:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --list-versions
+```
+
+Rollback preserves an audit trail in `current.json` (`rollback_from`, `rolled_back_at`).
+
+## Status check
+
+To quickly inspect the current system state without running a learning cycle:
+
+```bash
+source venv/bin/activate
+python -m src.memabra.cli --status
+```
+
+## Architecture summary
+
+```
+Trajectories -> ArtifactIndex -> DatasetBuilder -> SimpleLearningRouter (challenger)
+                                        |
+                                        v
+BenchmarkSuite -> Evaluator -> baseline vs challenger
+                                        |
+                                        v
+                              PromotionPolicy.evaluate()
+                                        |
+                    +-------------------+-------------------+
+                    | accepted                                  | rejected
+                    v                                           v
+      RouterVersionStore.save()                    training report saved
+      app.set_router(challenger)                   active router unchanged
+```
diff --git a/docs/PROGRESS.md b/docs/PROGRESS.md
new file mode 100644
index 0000000..ad3e26b
--- /dev/null
+++ b/docs/PROGRESS.md
@@ -0,0 +1,162 @@
+# memabra Progress
+
+## Current status
+
+Project status: safe self-improving alpha, benchmark-gated online learning loop complete
+Date: 2026-04-15
+Project: memabra
+Subtitle: An intuition-driven control plane for agent memory and action selection.
+
+## What exists now
+
+memabra now has a complete safe self-improving alpha control-plane loop:
+- candidate retrieval
+- routing decisions
+- memory / skill / tool execution
+- telemetry events
+- trajectory construction
+- runtime validation
+- artifact persistence
+- replay and analytics
+- artifact indexing and dataset slicing
+- lightweight learning router training
+- A/B evaluation
+- router weight versioning and rollback
+- benchmark-gated promotion with explicit policy thresholds
+- auditable training reports
+- exception-safe online learning coordinator
+- configurable CLI entrypoint
+- persisted seen-trajectory tracking across restarts (safe for cron jobs)
+- dry-run mode for training/evaluation without promotion risk
+- baseline version selection for challenger evaluation
+- task case index (`CaseIndex`) for episodic retrieval: maps normalized inputs to the best past trajectory ID
+- `CaseIndex` integration into `MemabraApp` (build, save, load, lookup) and `MemabraRunner` (injects episodic candidate on matching inputs)
+- CLI flags `--case-index` and `--rebuild-case-index` for operator-managed episodic retrieval
+- `OnlineLearningCoordinator` auto-rebuilds case index after each cycle when `case_index_path` is provided, ensuring benchmark-generated trajectories are indexed
+- `TrajectorySummarizer` generates human-readable trajectory summaries from task input, decisions, outcome, and reward
+- `MemabraRunner` enriches episodic memory candidate summaries using `TrajectorySummarizer` when `persistence_store` is available
+- CLI `--status` flag prints current system state (active router version, counts, latest report) without triggering a learning cycle
+- CLI is now subcommand-driven (`run`, `status`, `version list`, `version rollback`) with a dedicated packaged `memabra` entrypoint
+- CLI `--format text` mode provides operator-friendly summaries for status checks, version listings, rollbacks, and workflow runs, including latest report details, current-version highlighting, sectioned workflow summaries, normalized yes/no flags, and fixed-precision benchmark/promotion metrics
+
+## Major completed capabilities
+
+### Foundations
+- project naming, architecture, roadmap, decisions, reward spec
+- candidate / event / trajectory / memory schemas
+- prototype package structure under `src/memabra/`
+
+### Runtime path
+- `retrieval.py`: typed candidate retrieval
+- `router.py`: heuristic router, feature-scoring router, learning router
+- `execution.py`: memory, skill, tool executors and adapters
+- `runner.py`: end-to-end task -> trajectory orchestration
+- `persistence.py`: trajectory and memory artifact storage
+- `replay.py`: replay summaries over examples and persisted runs
+- `memory_store.py`: typed memory records with verify/revoke support
+
+### Adapters and evaluation
+- real tool adapters:
+  - `LocalFunctionToolAdapter`
+  - `SubprocessToolAdapter`
+  - `ToolRegistry`
+- real skill loading:
+  - `FileSystemSkillBackend`
+- richer evaluation path:
+  - `OutcomeEngine`
+  - `RewardEngine`
+  - `ArtifactIndex`
+  - `DatasetBuilder`
+  - `Evaluator`
+  - `RouterVersionStore`
+- Alpha Iteration 1 — online learning loop:
+  - `PromotionPolicy` with benchmark-gated promotion rules
+  - `BenchmarkSuite` persistence (JSON load/save + default seed)
+  - `OnlineLearningCoordinator` for retrain/evaluate/promote cycles
+  - exception-safe coordinator: training/evaluation failures emit auditable error reports instead of crashing
+  - `TrainingReportStore.get_report()` for by-id report lookup
+
+### Product/demo surface
+- `app.py`: `MemabraApp`, demo builders, artifact index access, training hooks, `run_online_learning_cycle`
+- `cli.py`: wrap-up workflow and `run_online_learning_workflow` with benchmark-gated promotion
+- `cli.py`: argument parsing (`--base-dir`, `--min-new-trajectories`) and clean `python -m src.memabra.cli` execution
+- `DEMO.md`: runnable walkthrough with CLI options
+
+## Current test status
+
+Command:
+`source venv/bin/activate && python -m pytest tests/memabra -q`
+
+Latest result:
+`118 passed`
+
+All alpha iteration 1 source, tests, and documentation have been committed to the repository (commit `34cf507c`).
+
+## Most important current files
+
+### Core package
+- `src/memabra/app.py`
+- `src/memabra/cli.py`
+- `src/memabra/router.py`
+- `src/memabra/runner.py`
+- `src/memabra/execution.py`
+- `src/memabra/evaluator.py`
+- `src/memabra/router_versioning.py`
+- `src/memabra/promotion.py`
+- `src/memabra/online_learning.py`
+- `src/memabra/training_reports.py`
+- `src/memabra/benchmarks.py`
+- `src/memabra/case_index.py`
+
+### Tests
+- `tests/memabra/test_app.py`
+- `tests/memabra/test_cli_workflow.py`
+- `tests/memabra/test_package_exports.py`
+- `tests/memabra/test_promotion.py`
+- `tests/memabra/test_online_learning.py`
+- `tests/memabra/test_training_reports.py`
+- `tests/memabra/test_benchmarks.py`
+- `tests/memabra/test_router_versioning.py`
+- `tests/memabra/test_evaluator.py`
+- `tests/memabra/test_router_protocol.py`
+- `tests/memabra/test_execution_persistence.py`
+
+## Wrap-up status
+
+The project is now in a safe self-improving alpha state.
+It can:
+- run realistic demo tasks
+- persist trajectories
+- replay and inspect results
+- train a lightweight router from saved artifacts
+- compare baseline vs challenger routers
+- apply a promotion policy with explicit thresholds
+- save and reload router versions with metadata
+- emit auditable training reports
+- run an online-learning cycle from the CLI
+- leave the active router unchanged when challenger fails
+- survive training/evaluation failures gracefully and emit error reports
+- accept CLI overrides for artifact directory and trajectory thresholds
+- persist seen-trajectory state across restarts so cron jobs don't retrain on the same data
+- persist seen trajectories to `<base-dir>/seen-trajectories.json` by default from the CLI `main()`
+- run in dry-run mode to evaluate a challenger without promoting it
+- run in baseline-version mode to compare a challenger against a specific saved version instead of the currently active router
+- index successful task cases by normalized input for episodic retrieval (`CaseIndex`)
+- build/save/load a case index from `MemabraApp`
+- inject episodic memory candidates during runner retrieval when a similar past task exists
+- use `--case-index` and `--rebuild-case-index` CLI flags to manage episodic retrieval
+- automatically refresh the case index after training/evaluation in each online-learning cycle when a case-index path is configured
+- attach rich human-readable summaries to episodic memory candidates when the past trajectory is available via `persistence_store`
+- print a quick read-only snapshot of the active router, versions, trajectories, and reports with the CLI `--status` flag
+- manage router versions operator-safely from the CLI with `--rollback` and `--list-versions`, without touching code
+
+## Next sensible frontier
+
+1. tighter integration with real Hermes trajectories
+2. multi-turn conversation state and working-memory updates
+3. richer real-world tool ecosystem integration (MCP, web, git, files)
+4. stronger storage/index backend beyond plain JSON files
+
+## One-line summary
+
+memabra is now a runnable, test-covered safe self-improving alpha for agent memory/action routing, with online learning, benchmark-gated promotion, and auditable reports.
diff --git a/docs/PROTOTYPE_LAYOUT.md b/docs/PROTOTYPE_LAYOUT.md
new file mode 100644
index 0000000..5b8b855
--- /dev/null
+++ b/docs/PROTOTYPE_LAYOUT.md
@@ -0,0 +1,90 @@
+# Prototype Layout
+
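+## Goal
+
+Establish a minimal runnable prototype layout for memabra, so that the rule-based router, replay harness, sample trajectories, and training-sample generation that follow each have a clear home.
+
+## Directory structure
+
+```text
+src/memabra/
+├── __init__.py
+├── candidate_types.py      # unified candidate objects and decision types
+├── router.py               # Rule-based router baseline
+├── telemetry.py            # runtime structures for events, rewards, and trajectories
+├── reward.py               # reward aggregation logic
+├── retrieval.py            # planned: candidate retrieval interface
+├── memory_store.py         # planned: long-term memory storage and access
+├── replay.py               # planned: trajectory replay and evaluation
+└── schemas.py              # planned: schema loading/validation
+
+tests/memabra/
+└── test_router_smoke.py    # baseline smoke test
+```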
+
+## Landed so far
+
+Created:
+- `src/memabra/__init__.py`
+- `src/memabra/candidate_types.py`
+- `src/memabra/router.py`
+- `src/memabra/telemetry.py`
+- `src/memabra/reward.py`
+- `tests/memabra/test_router_smoke.py`
+
+## Module boundaries
+
+### candidate_types.py
+Responsible for:
+- `CandidateObject`
+- `DecisionType`
+- later, extensible memory/skill/tool type-specific adapters
+
+### router.py
+Responsible for:
+- `TaskContext`
+- `RouteDecision`
+- `RuleBasedRouter`
+
+Only the baseline heuristics are implemented for now; planned upgrades:
+- feature scorer
+- reranker
+- learned policy
+
+### telemetry.py
+Responsible for:
+- atomic event structures
+- reward breakdown
+- later, trajectory runtime objects
+
+### reward.py
+Responsible for:
+- reward composition and computation
+- later, weight versioning
+
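+## Design principles
+
+1. Get a runnable baseline first, then abstract complex interfaces
+2. Keep data structures simple at first, but keep field names consistent with the Phase 0 schema
+3. Guarantee replayability first, then consider performance
+4. Do not couple to a database or vector store prematurely
+
+## Next landing spots
+
+- `retrieval.py`: define the candidate retrieval interface
+- `replay.py`: implement trajectory loading, replay, and metric computation
+- `schemas.py`: turn the JSON schemas into a runtime validation entry point
+- `sample_data/`: hold example candidates and trajectories
+
+## Suggested verification
+
+From the project root, run:
+
+```bash
+source venv/bin/activate
+python -m pytest tests/memabra/test_router_smoke.py -q
+```
+
+Expected:
+- the baseline router smoke test passes
+- this shows the minimal prototype skeleton can be imported and called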
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..64c3a7a
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,87 @@
+# memabra
+
+An intuition-driven control plane for agent memory and action selection.
+
+## Quick start
+
+If you are working from this repository, activate the virtualenv and install the project in editable mode so the dedicated `memabra` command is available:
+
+```bash
+source venv/bin/activate
+uv pip install -e ".[dev]"
+memabra --help
+memabra run --base-dir /tmp/memabra-demo --format text --dry-run
+```
+
+The dedicated CLI is the fastest way to experience the alpha. It supports subcommands for different operations:
+
+- `memabra run` — run the online-learning loop
+- `memabra status` — show system status
+- `memabra version list` — list saved router versions
+- `memabra version rollback <id>` — roll back to a version
+
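+The goal of memabra is not to build a "memory store that saves things", but a meta-cognitive controller for local agents:
+when facing a task, it can judge quickly, like human intuition, whether to answer directly, consult memory, load a skill, or call a tool, and it keeps improving that judgment based on task outcomes.
+
+One-sentence definition:
+this is a local-first, observable, trainable, and replayable agent memory and action orchestration system.
+
+## Why build it
+
+Common problems with traditional agents:
+- the context keeps bloating as everything gets stuffed into the prompt
+- memory, skills, and tools are three disconnected systems
+- after a success or failure, it is hard to tell which step actually mattered
+- when you want the agent to "learn", there is no attributable trajectory data
+
+The essential question memabra sets out to answer is:
+what to rely on, and when.
+
+## Core idea
+
+Do not start with end-to-end, all-in-one neural network training.
+First build a four-layer structure:
+1. Retrieval layer: recall candidate memories / skills / tools
+2. Routing layer: decide what to invoke, and in what order
+3. Execution layer: actually inject memory, load skills, and call tools
+4. Evaluation layer: record results, assign credit, and produce training samples
+
+If these four layers are not individually observable, direct end-to-end training will most likely learn the bad habit of "call fewer tools and let the model guess".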
+
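+## Project outputs
+
+For now this directory consists mainly of plans and design documents:
+- `ARCHITECTURE.md`: system architecture
+- `ROADMAP.md`: phased roadmap
+- `DECISIONS.md`: key design decisions
+- `PROGRESS.md`: current progress and next steps
+- `schemas/`: unified schemas for Phase 0
+- `reward_spec.md`: draft reward design
+
+To be added later:
+- `experiments/`: training and evaluation experiments
+- `src/`: prototype code
+- `tests/`: verification and regression tests
+
+## Target capabilities
+
+The eventual goals are to:
+- manage three kinds of long-term information (facts / procedures / episodes) in a unified way
+- build a unified candidate retrieval mechanism for memory, skills, and tools
+- let an "intuition policy" make fast action selections
+- infer how good a policy is from task outcomes
+- transition gradually from a rule system to a learnable policy
+- keep evolving in a local environment
+
+## Current status
+
+The project has been initialized and has entered the Phase 0 foundation-definition stage:
+- direction clarified
+- layered roadmap established
+- naming completed
+- project directory created
+- first versions of the architecture, roadmap, decisions, and progress documents written
+- schema and reward specifications being prepared
+
+Suggested next step: go straight into Phase 0:
+define the unified object model, the trajectory log structure, and the reward decomposition scheme.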
\ No newline at end of file
diff --git a/docs/REPLAY_AND_RETRIEVAL.md b/docs/REPLAY_AND_RETRIEVAL.md
new file mode 100644
index 0000000..2c8265c
--- /dev/null
+++ b/docs/REPLAY_AND_RETRIEVAL.md
@@ -0,0 +1,60 @@
+# Replay and Retrieval
+
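+## Goal
+
+Connect memabra's minimal closed loop:
+- retrieval recalls the memory / skill / tool candidates
+- replay loads trajectories and summarizes behavioral results
+
+Once these two are connected, the system is no longer just static documents plus a standalone router; it has:
+- candidate inputs
+- decision outputs
+- trajectory replay
+- basic statistics
+
+## Current implementation
+
+### retrieval.py
+Provides:
+- `CandidateProvider` protocol
+- `InMemoryCandidateProvider`
+- `CandidateRetriever`
+- `RetrievalResult`
+
+Current strategy:
+- simple lexical matching of triggers/tags against the task text
+- baseline ranking that combines confidence / success_rate / freshness / cost / risk
+- per-type aggregation and deduplication of the outputs from different providers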
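+
+A minimal sketch of that strategy, under stated assumptions: candidates are plain dicts with hypothetical `id`, `type`, and `triggers` keys, and the ranking uses an unweighted sum of the quality fields. The function names are illustrative and do not mirror the actual `CandidateRetriever` API.
+
+```python
+import re
+
+
+def lexical_match_score(task_text: str, triggers: list[str]) -> float:
+    """Fraction of a candidate's triggers that occur as words in the task text."""
+    if not triggers:
+        return 0.0
+    words = set(re.findall(r"[a-z0-9_]+", task_text.lower()))
+    return sum(1 for t in triggers if t.lower() in words) / len(triggers)
+
+
+def retrieve(task_text: str, candidates: list[dict]) -> dict[str, list[dict]]:
+    """Deduplicate by id, drop non-matches, rank, and group by candidate type."""
+    seen: set[str] = set()
+    scored = []
+    for c in candidates:
+        if c["id"] in seen:
+            continue  # the same candidate may come from several providers
+        seen.add(c["id"])
+        match = lexical_match_score(task_text, c["triggers"])
+        if match == 0.0:
+            continue
+        quality = c["confidence"] + c["success_rate"] + c["freshness"] - c["cost"] - c["risk"]
+        scored.append((match, quality, c))
+    # Sort by lexical match first, then by the quality fields.
+    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
+    grouped: dict[str, list[dict]] = {}
+    for _, _, c in scored:
+        grouped.setdefault(c["type"], []).append(c)
+    return grouped
+
+
+tool = {"id": "tool-ls", "type": "tool", "triggers": ["list", "files"],
+        "confidence": 0.9, "success_rate": 0.9, "freshness": 0.5, "cost": 0.1, "risk": 0.1}
+memory = {"id": "mem-pref", "type": "memory", "triggers": ["prefer"],
+          "confidence": 0.9, "success_rate": 0.8, "freshness": 0.9, "cost": 0.0, "risk": 0.0}
+result = retrieve("list the files in this repo", [tool, memory, dict(tool)])
+print({kind: [c["id"] for c in items] for kind, items in result.items()})
+```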
+
+### replay.py
+Provides:
+- `TrajectoryReplay`
+- `ReplaySummary`
+
+Current capabilities:
+- load a single trajectory JSON
+- load multiple trajectories from a directory
+- summarize outcome counts
+- summarize reward, latency, steps, and user corrections
+- count occurrences of each decision_type
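+
+The aggregation part can be sketched roughly as below. The trajectory field names (`outcome`, `reward`, `decisions`) are assumptions chosen for illustration; the real trajectory schema and `ReplaySummary` fields may differ.
+
+```python
+from collections import Counter
+
+
+def summarize(trajectories: list[dict]) -> dict:
+    """Aggregate outcome counts, average reward, and decision_type counts."""
+    outcomes = Counter(t["outcome"] for t in trajectories)
+    decisions = Counter(
+        d["decision_type"] for t in trajectories for d in t["decisions"]
+    )
+    total_reward = sum(t["reward"] for t in trajectories)
+    return {
+        "count": len(trajectories),
+        "outcomes": dict(outcomes),
+        "avg_reward": total_reward / len(trajectories) if trajectories else 0.0,
+        "decision_types": dict(decisions),
+    }
+
+
+summary = summarize([
+    {"outcome": "success", "reward": 1.0, "decisions": [{"decision_type": "call_tool"}]},
+    {"outcome": "failure", "reward": 0.5, "decisions": [{"decision_type": "clarify"}]},
+])
+print(summary)
+```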
+
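+## Why this step matters
+
+Without retrieval, the router can only go through the motions on an empty candidate set.
+Without replay, rewards and trajectories are just JSON specimens lying on disk.
+
+After this step, memabra has its first minimal closed loop:
+task -> candidates -> decision -> trajectory -> replay statistics
+
+## Current limitations
+
+- retrieval is still lexical matching, not embeddings or learned ranking
+- replay only summarizes; it does no schema validation or counterfactual comparison
+- the router and retriever are not yet wired into a real end-to-end runner
+
+## Next steps
+
+1. Add `schemas.py` for runtime validation
+2. Build `memory_store.py` and the provider interface
+3. Build `runner.py` to wire retrieval + router + telemetry together
+4. Add baseline comparison and reward breakdown analysis to replay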
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
new file mode 100644
index 0000000..5d846c2
--- /dev/null
+++ b/docs/ROADMAP.md
@@ -0,0 +1,136 @@
+# Roadmap
+
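+## Overall goal
+
+Build a memory-management and meta-cognitive control system for local agents, so that an agent can make learnable action selections among memory, skills, and tools, and gradually improve its policy through task feedback.
+
+## Phase 0: Foundations
+
+Goal: first define the "objects" and "trajectories" clearly.
+
+Deliverables:
+- unified candidate object schema
+- type boundary definitions for memory / skill / tool
+- event log schema
+- trajectory schema
+- draft reward decomposition
+- draft evaluation metrics
+- draft prototype directory layout
+- baseline router design document
+- example trajectories
+
+Success criteria:
+- for any task, the full record is available: what was seen, what was chosen, and how it turned out
+- the documents are clear enough that later implementation does not rely on guesswork
+- a first batch of success / failure trajectory samples exists for replay
+
+Status: done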
+
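+## Phase 1: Observable MVP
+
+Goal: build a version that does not learn but can run and record end to end.
+
+Deliverables:
+- candidate retrieval module
+- unified candidate interface for memory/skill/tool
+- rule-based or heuristic router
+- execution adapter layer
+- trajectory logs persisted to disk
+- basic visualization / replay capability
+
+Success criteria:
+- given a task, the system can select an action
+- every action can be reviewed after the fact
+- simple metrics can be computed: hit rate, tool call rate, task completion rate
+
+Status: done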
+
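+## Phase 2: Learned Router
+
+Goal: make the "intuition" trainable.
+
+Deliverables:
+- candidate feature engineering
+- training sample construction pipeline
+- lightweight classifier / reranker / bandit
+- offline evaluation baseline
+- A/B comparison of routing policies
+
+Success criteria:
+- the learned router beats the rule-based router in offline replay
+- clearly useless calls are reduced
+- high-value memory / skill / tool scenarios can be identified
+
+Status: done (SimpleLearningRouter, DatasetBuilder, Evaluator, A/B comparison, RouterVersionStore)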
+
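+## Phase 3: Rewarded Adaptation
+
+Goal: use task outcomes to keep updating the policy.
+
+Deliverables:
+- reward aggregator
+- user correction signal integration
+- online / batch update mechanism
+- safe exploration strategy
+- memory confidence update mechanism
+- benchmark-gated promotion policy
+- training run reports
+- active router metadata tracking
+
+Success criteria:
+- the policy can improve over consecutive tasks
+- a small amount of bad feedback does not make it collapse quickly
+- incorrect memories can be identified and down-weighted
+- promotion must pass benchmark validation
+
+Status: done (online learning coordinator, promotion policy, training reports, version metadata, benchmark-gated promotion, active router tracking, and app/CLI entrypoints are implemented)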
+
+## Phase 4: Episodic Learning
+
+Goal: turn past task trajectories into genuinely useful episodic memory.
+
+Deliverables:
+- task case index (done)
+- episode retrieval (done, via CaseIndex and runner injection)
+- similar-task reuse (done, runner injects an episodic candidate)
+- trajectory summarization (done, `TrajectorySummarizer` generates human-readable summaries)
+
+Success criteria:
+- for repetitive tasks, the system can reuse historically successful paths
+- episodes do not pollute factual memory or the skill library
+
+Status: in progress (core features complete)
+
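+## Phase 5: End-to-End Experiments
+
+Goal: verify whether it is worth internalizing routing further into neural model weights.
+
+Deliverables:
+- training dataset definition
+- SFT / preference / RL experiment plan
+- controlled evaluation against the layered system
+- risk analysis: forgetting, overfitting, behavior drift
+
+Success criteria:
+- beats the layered baseline on at least one well-defined set of tasks
+- does not significantly reduce interpretability or stability
+
+Status: not started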
+
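+## Ground rules for every phase
+
+- must be replayable
+- must be attributable
+- must keep memory, skill, and tool distinct
+- must include failure samples, not only successes
+- must be able to revoke wrong memories and wrong policies
+
+## Current priorities
+
+1. real adapters
+2. richer reward/outcome updates
+3. persistence-backed replay
+4. router scoring v2
+5. only then revisit the learned router
+
+Without these five steps firmly in place, any later training is a castle in the air.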
\ No newline at end of file
diff --git a/docs/ROUTER_BASELINE.md b/docs/ROUTER_BASELINE.md
new file mode 100644
index 0000000..dacad3e
--- /dev/null
+++ b/docs/ROUTER_BASELINE.md
@@ -0,0 +1,213 @@
+# Rule-Based Router Baseline
+
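+## Goal
+
+Define the first routing policy memabra uses in Phase 1. This version does not learn; it selects actions using only explicit rules and candidate object attributes.
+
+Its value is not in being clever, but in being:
+- observable
+- explainable
+- replayable
+- usable as the baseline for a learned router
+
+## Action space
+
+Actions the router currently allows:
+
+1. `direct_answer`
+2. `inject_memory`
+3. `load_skill`
+4. `call_tool`
+5. `clarify`
+6. `composite_action`
+
+### direct_answer
+Use when:
+- the task is pure analysis, naming, structural design, or explanation
+- it does not depend on real-time state
+- there is no clear need to call external resources
+
+### inject_memory
+Use for:
+- user preferences
+- project conventions
+- environment facts
+- stable facts already known from history
+
+### load_skill
+Use when:
+- the task looks like a reusable procedure
+- a known workflow exists
+- reuse has paid off in similar past tasks
+
+### call_tool
+Use when:
+- current state must be fetched
+- real-time information such as files, the system, web pages, processes, or time must be accessed
+- an action must be executed rather than reasoned about
+
+### clarify
+Use when:
+- risk is high and candidate signals are weak
+- missing information would significantly change the action choice
+- all candidates have low confidence
+
+### composite_action
+Use for:
+- memory first, then tool
+- skill first, then tool
+- memory first, then skill
+
+The current baseline focuses on single actions; composite actions are kept as a reserved action type for now.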
+
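+## Candidate scoring approach
+
+Every candidate object has these common fields:
+- `confidence`
+- `success_rate`
+- `cost`
+- `freshness`
+- `risk`
+
+The baseline does no complex learning; it uses simple linear, intuition-based scoring.
+
+### memory score
+
+```text
+memory_score = confidence + freshness + success_rate - cost - risk
+```
+
+### skill score
+
+```text
+skill_score = confidence + success_rate - cost - risk
+```
+
+### tool score
+
+```text
+tool_score = confidence + success_rate - cost - risk
+```
+
+Notes:
+- memory weighs freshness more
+- tool weighs risk more
+- skill weighs success_rate more
+
+The unweighted formulas above express only the first of these (the extra `freshness` term for memory); the tool and skill emphases show up in the rule thresholds (`risk <= 0.7` for tools, `success_rate >= 0.4` for skills).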
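+
+The three formulas can be written as one small function. This is an illustrative sketch: `Candidate` is a hypothetical stand-in for memabra's `CandidateObject` that carries only the shared fields, and `baseline_score` is not a function from the package.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Candidate:
+    # Hypothetical stand-in for CandidateObject; only the shared scoring fields.
+    kind: str  # "memory", "skill", or "tool"
+    confidence: float
+    success_rate: float
+    cost: float
+    freshness: float
+    risk: float
+
+
+def baseline_score(c: Candidate) -> float:
+    """Unweighted linear score; only memory candidates add the freshness term."""
+    score = c.confidence + c.success_rate - c.cost - c.risk
+    if c.kind == "memory":
+        score += c.freshness
+    return score
+
+
+mem = Candidate("memory", confidence=0.5, success_rate=0.5, cost=0.25, freshness=0.5, risk=0.0)
+tool = Candidate("tool", confidence=0.5, success_rate=0.5, cost=0.25, freshness=0.5, risk=0.0)
+print(baseline_score(mem), baseline_score(tool))  # 1.25 0.75
+```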
+
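+## First-version rules
+
+### Rule 1: reasoning-first tasks prefer direct_answer
+If the user input clearly contains any of these signals:
+- why
+- think
+- design
+- name
+
+and there is no strong tool trigger word, prefer `direct_answer`.
+
+### Rule 2: prefer a tool when real-time state is needed
+If the input contains:
+- check
+- run
+- open
+- current
+- list
+- time
+
+then look first for a high-confidence `tool` candidate.
+
+Additional thresholds:
+- `confidence >= 0.6`
+- `risk <= 0.7`
+
+### Rule 3: stable user/project facts prefer memory
+If the input contains:
+- prefer
+- remember
+- usually
+- my
+- our
+
+then look first for a high-confidence, reasonably fresh `memory` candidate.
+
+Additional thresholds:
+- `confidence >= 0.65`
+- `freshness >= 0.3`
+
+### Rule 4: reusable workflows prefer skill
+If the input contains:
+- fix
+- deploy
+- review
+- setup
+- workflow
+
+then look first for a `skill` candidate with a high success_rate.
+
+Additional thresholds:
+- `confidence >= 0.55`
+- `success_rate >= 0.4`
+
+### Rule 5: clarify when unsure
+If no candidate of any type meets its thresholds, return `clarify`.
+
+This rule is ugly, but necessary.
+Better to ask one question than to blindly fire off a pile of tools and blow the roof off.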
+
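+## Conflict resolution order
+
+When several actions trigger at once, the baseline uses this priority:
+
+```text
+tool > memory > skill > direct_answer > clarify
+```
+
+Rationale:
+- the need for real-time information is usually the hardest constraint
+- factual constraints come next
+- a skill is more of an enhancer
+- a plain answer is reserved for cases with clearly no external need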
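+
+Putting the trigger words, thresholds, and priority order together, one possible reading of the baseline looks like the sketch below. It is not the actual `RuleBasedRouter`: the `route` and `passes_threshold` functions and the dict-shaped candidates are assumptions made for illustration, while the trigger words and thresholds are copied from Rules 1 to 5.
+
+```python
+import re
+
+# Trigger words from Rules 1-4.
+TRIGGERS = {
+    "direct_answer": {"why", "think", "design", "name"},
+    "call_tool": {"check", "run", "open", "current", "list", "time"},
+    "inject_memory": {"prefer", "remember", "usually", "my", "our"},
+    "load_skill": {"fix", "deploy", "review", "setup", "workflow"},
+}
+# Conflict order: tool > memory > skill > direct_answer (clarify is the fallback).
+PRIORITY = ["call_tool", "inject_memory", "load_skill", "direct_answer"]
+
+
+def passes_threshold(action: str, c: dict) -> bool:
+    """Apply the additional per-rule thresholds from Rules 2-4."""
+    if action == "call_tool":
+        return c["type"] == "tool" and c["confidence"] >= 0.6 and c["risk"] <= 0.7
+    if action == "inject_memory":
+        return c["type"] == "memory" and c["confidence"] >= 0.65 and c["freshness"] >= 0.3
+    if action == "load_skill":
+        return c["type"] == "skill" and c["confidence"] >= 0.55 and c["success_rate"] >= 0.4
+    return False
+
+
+def route(text: str, candidates: list[dict]) -> str:
+    """Return the first action in priority order whose trigger and threshold both hold."""
+    words = set(re.findall(r"[a-z]+", text.lower()))
+    for action in PRIORITY:
+        if not words & TRIGGERS[action]:
+            continue
+        if action == "direct_answer":
+            # Rule 1 applies only when no tool trigger word is present.
+            if not words & TRIGGERS["call_tool"]:
+                return action
+            continue
+        if any(passes_threshold(action, c) for c in candidates):
+            return action
+    return "clarify"  # Rule 5: nothing met its thresholds
+
+
+tool = {"type": "tool", "confidence": 0.9, "risk": 0.1, "freshness": 0.5, "success_rate": 0.9}
+memory = {"type": "memory", "confidence": 0.9, "risk": 0.0, "freshness": 0.8, "success_rate": 0.9}
+print(route("check what I usually prefer", [tool, memory]))  # call_tool
+print(route("deploy the service", []))  # clarify
+```
+
+Because the loop walks `PRIORITY` in order, an input that triggers both the tool and memory rules resolves to `call_tool`, matching the priority line above.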
+
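+Later versions could switch to:
+- task intent classification first
+- then per-type ranking
+- finally global arbitration
+
+## Known limitations
+
+1. Keyword triggers are too brittle
+2. No awareness of long-range context
+3. No real composite action planning
+4. No counterfactual comparison of choices
+5. Easily misled by surface vocabulary
+
+## What the baseline is really for
+
+Not high intelligence, but providing:
+- the first runnable system
+- the first batch of recordable trajectories
+- the first batch of failure samples
+- a comparison target for the learned router
+
+## Next steps
+
+Growing from this baseline, there are three routes:
+1. introduce explicit feature engineering
+2. introduce a candidate reranker
+3. introduce bandit / lightweight policy learning
+
+Until then, do not rush to dress the heuristics up as "pseudo-intelligence". Build replay and metrics first.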
+
+---
+
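+## Implementation progress: FeatureScoringRouter (v2)
+
+`FeatureScoringRouter` is implemented in `src/memabra/router.py` as an upgrade over `RuleBasedRouter`:
+
+- Explicit feature scoring: memory / skill / tool each combine `confidence`, `success_rate`, `freshness`, `cost`, and `risk` with different weights
+- Failure penalty: when a candidate's `id` appears in `TaskContext.recent_failures`, 0.5 is automatically deducted from its score
+- Composite action preconditions: `CandidateObject` gains a `preconditions` field for declaring prerequisite types such as `["memory"]`
+- Composite action execution: `ExecutionEngine` now supports the `composite_action` decision type and recursively executes sub-steps in `composite_steps` order
+- Score transparency: `RouteDecision.score_breakdown` records each candidate's final score for traceability and evaluation
+
+`FeatureScoringRouter` stays explainable while providing structured feature outputs for later learned policies.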
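+
+A rough sketch of weighted scoring with the failure penalty follows. The weight values and the `score_candidates` function are hypothetical; only the 0.5 deduction and the idea of a per-candidate score breakdown come from the list above, and the real weights live in `src/memabra/router.py`.
+
+```python
+FAILURE_PENALTY = 0.5  # deduction stated in the list above
+
+# Hypothetical per-type weights, for illustration only.
+WEIGHTS = {
+    "memory": {"confidence": 1.0, "success_rate": 0.5, "freshness": 1.0, "cost": -0.5, "risk": -0.5},
+    "skill": {"confidence": 1.0, "success_rate": 1.5, "freshness": 0.0, "cost": -0.5, "risk": -0.5},
+    "tool": {"confidence": 1.0, "success_rate": 1.0, "freshness": 0.0, "cost": -0.5, "risk": -1.0},
+}
+
+
+def score_candidates(candidates: list[dict], recent_failures: set[str]) -> dict[str, float]:
+    """Return {candidate id: final score}, in the spirit of score_breakdown."""
+    breakdown = {}
+    for c in candidates:
+        weights = WEIGHTS[c["type"]]
+        score = sum(weight * c[field] for field, weight in weights.items())
+        if c["id"] in recent_failures:
+            score -= FAILURE_PENALTY  # recently failed candidates are demoted
+        breakdown[c["id"]] = round(score, 3)
+    return breakdown
+
+
+tool = {"id": "t1", "type": "tool", "confidence": 1.0, "success_rate": 1.0,
+        "freshness": 0.0, "cost": 0.0, "risk": 0.0}
+print(score_candidates([tool], set()))
+print(score_candidates([tool], {"t1"}))
+```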
\ No newline at end of file
diff --git a/docs/RUNNER_AND_STORE.md b/docs/RUNNER_AND_STORE.md
new file mode 100644
index 0000000..80aea99
--- /dev/null
+++ b/docs/RUNNER_AND_STORE.md
@@ -0,0 +1,83 @@
+# Runner, Schemas, and Memory Store
+
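+## Goal
+
+Move memabra from "can retrieve, route, and replay separately" to "can produce valid draft trajectories, validate data, and manage typed memory records".
+
+## Current implementation
+
+### runner.py
+Provides:
+- `MemabraRunner`
+
+Capabilities:
+- accepts a `TaskContext`
+- calls the retriever to get candidates
+- calls the router to produce an action decision
+- automatically generates a draft trajectory
+- emits a minimal event stream:
+  - `task_received`
+  - `candidates_recalled`
+  - `action_selected`
+
+Significance:
+This gives memabra its first real task-to-trajectory entry point.
+
+### schemas.py
+Provides:
+- `SchemaRegistry`
+- `SchemaValidationError`
+
+Current strategy:
+- start with lightweight runtime validation
+- no dependency on external libraries
+- validate the key required keys first
+
+This is not yet a full JSON Schema engine, but it is enough to hold the floor for now and keep sample structures from drifting.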
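+
+A dependency-free required-keys check of this kind might look like the sketch below. The `REQUIRED_KEYS` table and the `validate` function are assumptions for illustration; the actual key sets are defined by the Phase 0 schemas, and `SchemaRegistry` may expose a different interface.
+
+```python
+class SchemaValidationError(ValueError):
+    """Raised when a payload is missing required keys."""
+
+
+# Hypothetical required-key sets, not the real Phase 0 schema definitions.
+REQUIRED_KEYS = {
+    "trajectory": {"trajectory_id", "task", "events", "outcome"},
+    "event": {"event_type", "timestamp"},
+}
+
+
+def validate(kind: str, payload: dict) -> None:
+    """Check only that every required top-level key is present."""
+    if kind not in REQUIRED_KEYS:
+        raise SchemaValidationError(f"unknown schema kind: {kind}")
+    missing = sorted(REQUIRED_KEYS[kind] - set(payload))
+    if missing:
+        raise SchemaValidationError(f"{kind} is missing required keys: {missing}")
+
+
+validate("event", {"event_type": "task_received", "timestamp": "2026-04-14T15:52:24+00:00"})
+try:
+    validate("trajectory", {"trajectory_id": "traj-1", "task": "demo"})
+except SchemaValidationError as exc:
+    print(exc)
+```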
+
+### memory_store.py
+Provides:
+- `MemoryRecord`
+- `MemorySource`
+- `VerificationState`
+- `InMemoryMemoryStore`
+
+Current capabilities:
+- upsert
+- get
+- list_by_type
+- mark_used
+- verify
+- revoke
+
+Significance:
+memabra no longer just "talks about memory"; it now has a typed memory record runtime.
+
+## Current closed loop
+
+Now in place:
+- retrieval
+- router
+- runner
+- replay
+- memory store
+- schema validation
+
+That is:
+task -> candidate retrieval -> routing decision -> draft trajectory -> replay statistics
+and memory records themselves can also be validated and change state.
+
+## What is still missing
+
+- execution adapters (real tool/skill/memory injection)
+- full JSON Schema validation
+- trajectory persistence layer
+- richer reward aggregation
+- counterfactual replay
+
+## Suggested next steps
+
+1. Build `execution.py`
+2. Build `persistence.py`
+3. Connect the runner to the memory store and telemetry writeback
+4. Build richer router scoring v2
diff --git a/docs/demo-artifacts/router-versions/current.json b/docs/demo-artifacts/router-versions/current.json
new file mode 100644
index 0000000..865ef95
--- /dev/null
+++ b/docs/demo-artifacts/router-versions/current.json
@@ -0,0 +1,13 @@
+{
+  "current_version_id": "20260414-165018",
+  "promotion_source": null,
+  "benchmark_summary": {
+    "reward_delta": -0.446,
+    "error_rate_delta": 0.0,
+    "latency_delta_ms": -21.0,
+    "baseline_avg_reward": 0.886,
+    "challenger_avg_reward": 0.44
+  },
+  "prior_version_id": "20260414-155224",
+  "saved_at": "2026-04-14T16:50:18.865976+00:00"
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/router-versions/versions/20260414-143742.json b/docs/demo-artifacts/router-versions/versions/20260414-143742.json
new file mode 100644
index 0000000..b743272
--- /dev/null
+++ b/docs/demo-artifacts/router-versions/versions/20260414-143742.json
@@ -0,0 +1,50 @@
+{
+  "version_id": "20260414-143742",
+  "weights": {
+    "inject_memory": {
+      "input_length": 43.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    },
+    "load_skill": {
+      "input_length": 44.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    },
+    "call_tool": {
+      "input_length": 32.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.9000000000000001,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "avg_reward": 1.04,
+    "task_count": 3,
+    "source": "wrapup_workflow"
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/router-versions/versions/20260414-152738.json b/docs/demo-artifacts/router-versions/versions/20260414-152738.json
new file mode 100644
index 0000000..cee247f
--- /dev/null
+++ b/docs/demo-artifacts/router-versions/versions/20260414-152738.json
@@ -0,0 +1,50 @@
+{
+  "version_id": "20260414-152738",
+  "weights": {
+    "load_skill": {
+      "input_length": 42.15803814713897,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9499999999999997,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.9499999999999997,
+      "top_tool_risk": 0.0
+    },
+    "call_tool": {
+      "input_length": 32.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.9000000000000001,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    },
+    "inject_memory": {
+      "input_length": 42.99999999999999,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.8999999999999999,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "avg_reward": 1.04,
+    "task_count": 3,
+    "source": "wrapup_workflow"
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/router-versions/versions/20260414-155224.json b/docs/demo-artifacts/router-versions/versions/20260414-155224.json
new file mode 100644
index 0000000..63b585f
--- /dev/null
+++ b/docs/demo-artifacts/router-versions/versions/20260414-155224.json
@@ -0,0 +1,55 @@
+{
+  "version_id": "20260414-155224",
+  "weights": {
+    "load_skill": {
+      "input_length": 42.38663484486874,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9499999999999997,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.9499999999999997,
+      "top_tool_risk": 0.0
+    },
+    "call_tool": {
+      "input_length": 32.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.9000000000000001,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    },
+    "inject_memory": {
+      "input_length": 41.75894988066825,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9499999999999997,
+      "top_skill_success_rate": 0.8999999999999999,
+      "top_tool_confidence": 0.9499999999999997,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.154,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -21.0,
+      "baseline_avg_reward": 0.886,
+      "challenger_avg_reward": 1.04
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/router-versions/versions/20260414-165018.json b/docs/demo-artifacts/router-versions/versions/20260414-165018.json
new file mode 100644
index 0000000..41456d6
--- /dev/null
+++ b/docs/demo-artifacts/router-versions/versions/20260414-165018.json
@@ -0,0 +1,65 @@
+{
+  "version_id": "20260414-165018",
+  "weights": {
+    "load_skill": {
+      "input_length": 41.594896331738454,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9499999999999998,
+      "top_skill_success_rate": 0.9000000000000001,
+      "top_tool_confidence": 0.9499999999999998,
+      "top_tool_risk": 0.0
+    },
+    "call_tool": {
+      "input_length": 32.85406896551724,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    },
+    "clarify": {
+      "input_length": 51.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.95,
+      "top_skill_success_rate": 0.9,
+      "top_tool_confidence": 0.95,
+      "top_tool_risk": 0.0
+    },
+    "inject_memory": {
+      "input_length": 41.45435244161358,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9499999999999996,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9499999999999996,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": -0.446,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -21.0,
+      "baseline_avg_reward": 0.886,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/training-reports/report-886de309-18d0-4be6-b626-0f7d2edc8b72.json b/docs/demo-artifacts/training-reports/report-886de309-18d0-4be6-b626-0f7d2edc8b72.json
new file mode 100644
index 0000000..04702f2
--- /dev/null
+++ b/docs/demo-artifacts/training-reports/report-886de309-18d0-4be6-b626-0f7d2edc8b72.json
@@ -0,0 +1,52 @@
+{
+  "report_id": "report-886de309-18d0-4be6-b626-0f7d2edc8b72",
+  "timestamp": "2026-04-14T15:52:24.610516+00:00",
+  "source_trajectory_ids": [
+    "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+    "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+    "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+    "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+    "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+    "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+    "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+    "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+    "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+    "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+    "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+    "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+    "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+    "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+    "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+    "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+    "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+    "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+    "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+    "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+    "traj-dd361c81-40a1-4892-9914-2140870fff95"
+  ],
+  "sample_count": 21,
+  "baseline_metrics": {
+    "task_count": 4,
+    "avg_reward": 0.886,
+    "error_rate": 0.0,
+    "avg_latency_ms": 21.0
+  },
+  "challenger_metrics": {
+    "task_count": 4,
+    "avg_reward": 1.04,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.154,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -21.0,
+      "baseline_avg_reward": 0.886,
+      "challenger_avg_reward": 1.04
+    }
+  },
+  "promoted_version_id": "20260414-155224"
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/training-reports/report-e7050e1f-fa3c-42e4-9178-e57f69b2dc1d.json b/docs/demo-artifacts/training-reports/report-e7050e1f-fa3c-42e4-9178-e57f69b2dc1d.json
new file mode 100644
index 0000000..59a8d56
--- /dev/null
+++ b/docs/demo-artifacts/training-reports/report-e7050e1f-fa3c-42e4-9178-e57f69b2dc1d.json
@@ -0,0 +1,60 @@
+{
+  "report_id": "report-e7050e1f-fa3c-42e4-9178-e57f69b2dc1d",
+  "timestamp": "2026-04-14T16:50:18.866221+00:00",
+  "source_trajectory_ids": [
+    "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+    "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+    "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+    "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+    "traj-217ccafa-716c-4534-813b-a489ed7d6079",
+    "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+    "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+    "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+    "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+    "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+    "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+    "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+    "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+    "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+    "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+    "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+    "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+    "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+    "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+    "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+    "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+    "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+    "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+    "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+    "traj-dd361c81-40a1-4892-9914-2140870fff95",
+    "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+    "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+    "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+    "traj-ffb40d01-7956-4d7b-a41c-9618487fe619"
+  ],
+  "sample_count": 29,
+  "baseline_metrics": {
+    "task_count": 4,
+    "avg_reward": 0.886,
+    "error_rate": 0.0,
+    "avg_latency_ms": 21.0
+  },
+  "challenger_metrics": {
+    "task_count": 4,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.446,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -21.0,
+      "baseline_avg_reward": 0.886,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165018"
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd.json b/docs/demo-artifacts/trajectories/traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd.json
new file mode 100644
index 0000000..5dfae67
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+  "task": {
+    "task_id": "task-5977495f-189b-4a87-8924-4834bded854c",
+    "input": "Check the current system status.",
+    "channel": "local",
+    "created_at": "2026-04-14T14:37:42.381631+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1413.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-501ed3a1-622f-4e8a-b90b-2fb0384d89bd",
+      "trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+      "timestamp": "2026-04-14T14:37:42.381702+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-4b6839de-ac61-414f-8939-3ba335a93cfa",
+      "trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+      "timestamp": "2026-04-14T14:37:42.381707+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-501ed3a1-622f-4e8a-b90b-2fb0384d89bd"
+    },
+    {
+      "event_id": "evt-1b229a15-af51-4924-932d-4d0318f0ba26",
+      "trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+      "timestamp": "2026-04-14T14:37:42.381711+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1413.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-4b6839de-ac61-414f-8939-3ba335a93cfa"
+    },
+    {
+      "event_id": "evt-skill-traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd-skill-deploy",
+      "trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
+      "timestamp": "2026-04-14T14:37:42.381718+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Check the current system status.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-120aec7e-a74d-42d6-8846-c472680cc2f3.json b/docs/demo-artifacts/trajectories/traj-120aec7e-a74d-42d6-8846-c472680cc2f3.json
new file mode 100644
index 0000000..d36d651
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-120aec7e-a74d-42d6-8846-c472680cc2f3.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+  "task": {
+    "task_id": "task-78a318e6-c8b4-4d05-bfd8-2ebe4b19710f",
+    "input": "Check the current system status.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:27:38.518486+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-be0db4ba-93b9-4cf7-bd76-51c1af70c6d4",
+      "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+      "timestamp": "2026-04-14T15:27:38.518550+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-fb7734b7-bdab-4e24-8dec-a9debf02529d",
+      "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+      "timestamp": "2026-04-14T15:27:38.518556+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-be0db4ba-93b9-4cf7-bd76-51c1af70c6d4"
+    },
+    {
+      "event_id": "evt-8ed4e73b-2b45-44a6-9ab6-cc6184202dc0",
+      "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+      "timestamp": "2026-04-14T15:27:38.518561+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-fb7734b7-bdab-4e24-8dec-a9debf02529d"
+    },
+    {
+      "event_id": "evt-tool-traj-120aec7e-a74d-42d6-8846-c472680cc2f3-tool-terminal",
+      "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+      "timestamp": "2026-04-14T15:27:38.518572+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-120aec7e-a74d-42d6-8846-c472680cc2f3-tool-terminal",
+      "trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
+      "timestamp": "2026-04-14T15:27:38.518575+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-179d0c19-3f0f-4429-a85b-3e01802290d3.json b/docs/demo-artifacts/trajectories/traj-179d0c19-3f0f-4429-a85b-3e01802290d3.json
new file mode 100644
index 0000000..c0988ec
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-179d0c19-3f0f-4429-a85b-3e01802290d3.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+  "task": {
+    "task_id": "task-c0d9120f-4b28-4815-bcbc-1ea1cb523129",
+    "input": "Check the current system status.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T15:27:38.512676+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-2e159144-a5dc-4bab-bb15-026b156788a7",
+      "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+      "timestamp": "2026-04-14T15:27:38.512756+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-84681604-ee59-4618-8b1b-bdc521e58e7d",
+      "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+      "timestamp": "2026-04-14T15:27:38.512762+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-2e159144-a5dc-4bab-bb15-026b156788a7"
+    },
+    {
+      "event_id": "evt-6404a35f-8775-4fc1-9648-62a27f4a1b23",
+      "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+      "timestamp": "2026-04-14T15:27:38.512767+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-84681604-ee59-4618-8b1b-bdc521e58e7d"
+    },
+    {
+      "event_id": "evt-tool-traj-179d0c19-3f0f-4429-a85b-3e01802290d3-tool-terminal",
+      "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+      "timestamp": "2026-04-14T15:27:38.512781+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-179d0c19-3f0f-4429-a85b-3e01802290d3-tool-terminal",
+      "trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
+      "timestamp": "2026-04-14T15:27:38.512785+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303.json b/docs/demo-artifacts/trajectories/traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303.json
new file mode 100644
index 0000000..215d24b
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+  "task": {
+    "task_id": "task-f3701d8c-4931-4e43-8488-5fc670e5b2b1",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "local",
+    "created_at": "2026-04-14T14:37:42.380802+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task resembles a reusable procedure; load a skill before action.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-480e859f-7e5f-42f0-bfcc-f3cb954f75d5",
+      "trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+      "timestamp": "2026-04-14T14:37:42.380861+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-398d16c2-3d12-44a7-8af2-aa306e20195c",
+      "trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+      "timestamp": "2026-04-14T14:37:42.380867+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-480e859f-7e5f-42f0-bfcc-f3cb954f75d5"
+    },
+    {
+      "event_id": "evt-b63063ea-1ac7-4b85-a6c7-76a03791bc85",
+      "trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+      "timestamp": "2026-04-14T14:37:42.380871+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task resembles a reusable procedure; load a skill before action.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-398d16c2-3d12-44a7-8af2-aa306e20195c"
+    },
+    {
+      "event_id": "evt-skill-traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303-skill-deploy",
+      "trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
+      "timestamp": "2026-04-14T14:37:42.380877+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0.json b/docs/demo-artifacts/trajectories/traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0.json
new file mode 100644
index 0000000..5849c0e
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
+  "task": {
+    "task_id": "task-bb730dc5-88ed-4455-9dbb-6cbba55ad0ce",
+    "input": "Check current system status with a tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.864549+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=2045.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27",
+      "trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
+      "timestamp": "2026-04-14T16:50:18.864653+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4",
+      "trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
+      "timestamp": "2026-04-14T16:50:18.864663+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27"
+    },
+    {
+      "event_id": "evt-2fc97f2c-8219-44d3-98c7-5a86ad88326d",
+      "trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
+      "timestamp": "2026-04-14T16:50:18.864669+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=2045.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-1ea60d6e-0b83-4cdf-a601-159373c780ee.json b/docs/demo-artifacts/trajectories/traj-1ea60d6e-0b83-4cdf-a601-159373c780ee.json
new file mode 100644
index 0000000..77de653
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-1ea60d6e-0b83-4cdf-a601-159373c780ee.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+  "task": {
+    "task_id": "task-c5221ec3-e5b9-4a2f-9774-fbb75018fe08",
+    "input": "Check current system status with a tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.862393+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-93525bc5-5e71-481c-a7d4-0282ef59e0a3",
+      "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+      "timestamp": "2026-04-14T16:50:18.862483+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-a01d1dff-a6dc-4c25-a5a5-14efd6f182b2",
+      "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+      "timestamp": "2026-04-14T16:50:18.862492+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-93525bc5-5e71-481c-a7d4-0282ef59e0a3"
+    },
+    {
+      "event_id": "evt-28946864-c699-42fd-9802-dbfe6cb09043",
+      "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+      "timestamp": "2026-04-14T16:50:18.862498+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-a01d1dff-a6dc-4c25-a5a5-14efd6f182b2"
+    },
+    {
+      "event_id": "evt-tool-traj-1ea60d6e-0b83-4cdf-a601-159373c780ee-tool-terminal",
+      "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+      "timestamp": "2026-04-14T16:50:18.862511+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-1ea60d6e-0b83-4cdf-a601-159373c780ee-tool-terminal",
+      "trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
+      "timestamp": "2026-04-14T16:50:18.862515+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-217ccafa-716c-4534-813b-a489ed7d6079.json b/docs/demo-artifacts/trajectories/traj-217ccafa-716c-4534-813b-a489ed7d6079.json
new file mode 100644
index 0000000..2feda3b
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-217ccafa-716c-4534-813b-a489ed7d6079.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
+  "task": {
+    "task_id": "task-5f14e5ed-0635-44a0-82e8-419187b040f3",
+    "input": "Use multiple capabilities: memory, skill, and tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.605025+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "No high-confidence route found from the current heuristic baseline.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-13ccd07e-9bfd-4ff8-8080-47c400f0be6f",
+      "trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
+      "timestamp": "2026-04-14T15:52:24.605116+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use multiple capabilities: memory, skill, and tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-7ecaa289-b7bb-4ac6-ad62-9afb4a49d4a8",
+      "trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
+      "timestamp": "2026-04-14T15:52:24.605126+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-13ccd07e-9bfd-4ff8-8080-47c400f0be6f"
+    },
+    {
+      "event_id": "evt-ad398931-c79d-411a-93f8-8c5834f5446d",
+      "trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
+      "timestamp": "2026-04-14T15:52:24.605138+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "No high-confidence route found from the current heuristic baseline.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-7ecaa289-b7bb-4ac6-ad62-9afb4a49d4a8"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15.json b/docs/demo-artifacts/trajectories/traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15.json
new file mode 100644
index 0000000..7c188b8
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15.json
@@ -0,0 +1,185 @@
+{
+  "trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+  "task": {
+    "task_id": "task-aeed227c-2e87-45d8-8d98-e270656556b6",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T06:53:08.731336+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-d71b1fdf-5343-4ac1-89a0-75488c1ce30b",
+      "trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+      "timestamp": "2026-04-14T06:53:08.731418+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-1f750475-1127-41e5-9f94-c87e4b019ee2",
+      "trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+      "timestamp": "2026-04-14T06:53:08.731427+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-d71b1fdf-5343-4ac1-89a0-75488c1ce30b"
+    },
+    {
+      "event_id": "evt-741967a5-41b9-4917-9b95-4047f89e6e19",
+      "trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+      "timestamp": "2026-04-14T06:53:08.731432+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-1f750475-1127-41e5-9f94-c87e4b019ee2"
+    },
+    {
+      "event_id": "evt-memory-traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15-mem-telegram-pref",
+      "trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
+      "timestamp": "2026-04-14T06:53:08.731437+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.1,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.0,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-439e4552-f248-43cb-b4eb-25db14da1ebc.json b/docs/demo-artifacts/trajectories/traj-439e4552-f248-43cb-b4eb-25db14da1ebc.json
new file mode 100644
index 0000000..78cb92b
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-439e4552-f248-43cb-b4eb-25db14da1ebc.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+  "task": {
+    "task_id": "task-cde62e1c-0106-4803-9c7d-a0c2f58206d6",
+    "input": "Check the current system status.",
+    "channel": "local",
+    "created_at": "2026-04-14T14:37:42.380386+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-9252427a-3ceb-476a-b72d-a7e4f812194c",
+      "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+      "timestamp": "2026-04-14T14:37:42.380442+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-333fbd7f-75b1-495f-acfa-6a66348ef16e",
+      "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+      "timestamp": "2026-04-14T14:37:42.380447+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-9252427a-3ceb-476a-b72d-a7e4f812194c"
+    },
+    {
+      "event_id": "evt-7f4eddba-f609-4d72-bf7c-cd6a938233a7",
+      "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+      "timestamp": "2026-04-14T14:37:42.380452+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-333fbd7f-75b1-495f-acfa-6a66348ef16e"
+    },
+    {
+      "event_id": "evt-tool-traj-439e4552-f248-43cb-b4eb-25db14da1ebc-tool-terminal",
+      "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+      "timestamp": "2026-04-14T14:37:42.380461+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-439e4552-f248-43cb-b4eb-25db14da1ebc-tool-terminal",
+      "trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
+      "timestamp": "2026-04-14T14:37:42.380464+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5.json b/docs/demo-artifacts/trajectories/traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5.json
new file mode 100644
index 0000000..fe9613c
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+  "task": {
+    "task_id": "task-0c82e670-45ab-45f9-af74-c5920f5eb9b3",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T14:37:42.378256+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task resembles a reusable procedure; load a skill before action.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-757f035e-551f-4b55-a506-2aac41134885",
+      "trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+      "timestamp": "2026-04-14T14:37:42.378322+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-dfcdd452-1902-4a6c-97fc-fd6a993c2045",
+      "trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+      "timestamp": "2026-04-14T14:37:42.378327+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-757f035e-551f-4b55-a506-2aac41134885"
+    },
+    {
+      "event_id": "evt-c680ed8f-a6b0-48d1-bcd4-7423089aa916",
+      "trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+      "timestamp": "2026-04-14T14:37:42.378332+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task resembles a reusable procedure; load a skill before action.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-dfcdd452-1902-4a6c-97fc-fd6a993c2045"
+    },
+    {
+      "event_id": "evt-skill-traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5-skill-deploy",
+      "trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
+      "timestamp": "2026-04-14T14:37:42.378339+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-6a5aaff5-9336-4a1d-b102-80f1196427ae.json b/docs/demo-artifacts/trajectories/traj-6a5aaff5-9336-4a1d-b102-80f1196427ae.json
new file mode 100644
index 0000000..da8a717
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-6a5aaff5-9336-4a1d-b102-80f1196427ae.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+  "task": {
+    "task_id": "task-549e2de3-bb55-4797-a862-e59f8d69a7e5",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T15:27:38.519692+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1854.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-369333af-5ca9-4c11-b163-6144d925ba91",
+      "trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+      "timestamp": "2026-04-14T15:27:38.519774+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-51d31531-c49b-4af7-86f8-9fc3b5aff7a0",
+      "trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+      "timestamp": "2026-04-14T15:27:38.519780+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-369333af-5ca9-4c11-b163-6144d925ba91"
+    },
+    {
+      "event_id": "evt-3a842acf-5111-4b77-98a2-2a18c5a4a61d",
+      "trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+      "timestamp": "2026-04-14T15:27:38.519784+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1854.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-51d31531-c49b-4af7-86f8-9fc3b5aff7a0"
+    },
+    {
+      "event_id": "evt-memory-traj-6a5aaff5-9336-4a1d-b102-80f1196427ae-mem-telegram-pref",
+      "trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
+      "timestamp": "2026-04-14T15:27:38.519790+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-707b1dec-1d9a-4a71-a07a-54841155103c.json b/docs/demo-artifacts/trajectories/traj-707b1dec-1d9a-4a71-a07a-54841155103c.json
new file mode 100644
index 0000000..d71b6df
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-707b1dec-1d9a-4a71-a07a-54841155103c.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+  "task": {
+    "task_id": "task-23d5816f-12f3-4247-8c4f-9c01d13b1fd8",
+    "input": "Check the current system status.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T14:37:42.377746+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-15616207-b055-41b3-98e7-fca3fdd89ce9",
+      "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+      "timestamp": "2026-04-14T14:37:42.377821+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-431bb458-0488-4712-93d5-d7a689048022",
+      "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+      "timestamp": "2026-04-14T14:37:42.377827+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-15616207-b055-41b3-98e7-fca3fdd89ce9"
+    },
+    {
+      "event_id": "evt-8bb2db02-56ae-4fad-a0bc-e30cd7fed98e",
+      "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+      "timestamp": "2026-04-14T14:37:42.377831+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-431bb458-0488-4712-93d5-d7a689048022"
+    },
+    {
+      "event_id": "evt-tool-traj-707b1dec-1d9a-4a71-a07a-54841155103c-tool-terminal",
+      "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+      "timestamp": "2026-04-14T14:37:42.377843+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-707b1dec-1d9a-4a71-a07a-54841155103c-tool-terminal",
+      "trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
+      "timestamp": "2026-04-14T14:37:42.377846+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1.json b/docs/demo-artifacts/trajectories/traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1.json
new file mode 100644
index 0000000..b6121b9
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+  "task": {
+    "task_id": "task-e0c612c6-d846-4dc0-9c30-4a66d0a78d2a",
+    "input": "Check current system status with a tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.604470+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-7befe34c-6cf6-422b-9615-11fd64b50899",
+      "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+      "timestamp": "2026-04-14T15:52:24.604556+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-8533f7c9-696d-413d-8484-d434ffccdd02",
+      "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+      "timestamp": "2026-04-14T15:52:24.604565+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-7befe34c-6cf6-422b-9615-11fd64b50899"
+    },
+    {
+      "event_id": "evt-2f878de3-e77d-42f6-8252-b692a11a69ac",
+      "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+      "timestamp": "2026-04-14T15:52:24.604571+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-8533f7c9-696d-413d-8484-d434ffccdd02"
+    },
+    {
+      "event_id": "evt-tool-traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1-tool-terminal",
+      "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+      "timestamp": "2026-04-14T15:52:24.604584+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1-tool-terminal",
+      "trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
+      "timestamp": "2026-04-14T15:52:24.604588+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-77ab4624-013b-4f56-b600-b3e0cbef7a06.json b/docs/demo-artifacts/trajectories/traj-77ab4624-013b-4f56-b600-b3e0cbef7a06.json
new file mode 100644
index 0000000..8092207
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-77ab4624-013b-4f56-b600-b3e0cbef7a06.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
+  "task": {
+    "task_id": "task-ad6649f7-dcca-4dd3-9521-3409c5f4e746",
+    "input": "Recall my saved preference from memory.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.861213+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-61f191c1-68c7-4f0b-ab9b-f22b131e2637",
+      "trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
+      "timestamp": "2026-04-14T16:50:18.861293+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-f0fbf671-18c5-4db2-86e5-68950b030992",
+      "trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
+      "timestamp": "2026-04-14T16:50:18.861299+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-61f191c1-68c7-4f0b-ab9b-f22b131e2637"
+    },
+    {
+      "event_id": "evt-168a76e7-3c64-4f65-8a74-0969942d6d94",
+      "trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
+      "timestamp": "2026-04-14T16:50:18.861304+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-f0fbf671-18c5-4db2-86e5-68950b030992"
+    },
+    {
+      "event_id": "evt-memory-traj-77ab4624-013b-4f56-b600-b3e0cbef7a06-mem-telegram-pref",
+      "trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
+      "timestamp": "2026-04-14T16:50:18.861310+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-80784ce5-fc14-4fee-9f5f-90dcec26179b.json b/docs/demo-artifacts/trajectories/traj-80784ce5-fc14-4fee-9f5f-90dcec26179b.json
new file mode 100644
index 0000000..09f34e3
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-80784ce5-fc14-4fee-9f5f-90dcec26179b.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+  "task": {
+    "task_id": "task-37fe7921-66da-4390-a9bf-31209ae8a890",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T14:37:42.381229+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1897.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-01cb59f2-27b0-4be7-b0f9-c878634363ba",
+      "trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+      "timestamp": "2026-04-14T14:37:42.381299+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-4281fe16-c753-4024-a0ff-e82f518e16dc",
+      "trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+      "timestamp": "2026-04-14T14:37:42.381305+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-01cb59f2-27b0-4be7-b0f9-c878634363ba"
+    },
+    {
+      "event_id": "evt-ccc6afd4-82c2-4774-ba2a-732ffa9296a4",
+      "trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+      "timestamp": "2026-04-14T14:37:42.381309+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1897.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-4281fe16-c753-4024-a0ff-e82f518e16dc"
+    },
+    {
+      "event_id": "evt-skill-traj-80784ce5-fc14-4fee-9f5f-90dcec26179b-skill-deploy",
+      "trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
+      "timestamp": "2026-04-14T14:37:42.381314+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Use my telegram preference for this answer.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-819443a2-79ea-48b7-a543-8bb7356dba36.json b/docs/demo-artifacts/trajectories/traj-819443a2-79ea-48b7-a543-8bb7356dba36.json
new file mode 100644
index 0000000..467973e
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-819443a2-79ea-48b7-a543-8bb7356dba36.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+  "task": {
+    "task_id": "task-8e991184-4d09-47bd-9a70-2f3d591d875c",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T14:37:42.377206+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-79db8272-c394-40e1-b0d3-c905c305ea26",
+      "trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+      "timestamp": "2026-04-14T14:37:42.377281+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-22367f19-007b-49cd-9ac4-30bbcc77e8a2",
+      "trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+      "timestamp": "2026-04-14T14:37:42.377287+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-79db8272-c394-40e1-b0d3-c905c305ea26"
+    },
+    {
+      "event_id": "evt-84fe05fe-8ccc-4782-8cd2-28d56a659658",
+      "trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+      "timestamp": "2026-04-14T14:37:42.377292+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-22367f19-007b-49cd-9ac4-30bbcc77e8a2"
+    },
+    {
+      "event_id": "evt-memory-traj-819443a2-79ea-48b7-a543-8bb7356dba36-mem-telegram-pref",
+      "trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
+      "timestamp": "2026-04-14T14:37:42.377297+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-9144cbc3-1ccf-4660-aad9-8db5797461eb.json b/docs/demo-artifacts/trajectories/traj-9144cbc3-1ccf-4660-aad9-8db5797461eb.json
new file mode 100644
index 0000000..d398499
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-9144cbc3-1ccf-4660-aad9-8db5797461eb.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+  "task": {
+    "task_id": "task-57677ff6-710a-478e-9a5d-e1367db05212",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T15:27:38.514525+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task resembles a reusable procedure; load a skill before action.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-fce4e540-2400-45f8-8050-50f7631422e4",
+      "trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+      "timestamp": "2026-04-14T15:27:38.514602+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-eef1d203-79d4-4037-ae0b-6dff74e035f5",
+      "trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+      "timestamp": "2026-04-14T15:27:38.514609+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-fce4e540-2400-45f8-8050-50f7631422e4"
+    },
+    {
+      "event_id": "evt-da150fe5-beff-45b0-a67d-9860205a9690",
+      "trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+      "timestamp": "2026-04-14T15:27:38.514615+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task resembles a reusable procedure; load a skill before action.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-eef1d203-79d4-4037-ae0b-6dff74e035f5"
+    },
+    {
+      "event_id": "evt-skill-traj-9144cbc3-1ccf-4660-aad9-8db5797461eb-skill-deploy",
+      "trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
+      "timestamp": "2026-04-14T15:27:38.514623+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-9190707c-5486-4266-a6c8-32f34c6c63ec.json b/docs/demo-artifacts/trajectories/traj-9190707c-5486-4266-a6c8-32f34c6c63ec.json
new file mode 100644
index 0000000..112e195
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-9190707c-5486-4266-a6c8-32f34c6c63ec.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+  "task": {
+    "task_id": "task-9f58c7ff-0bfb-4a46-bfbc-94b72b454f44",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T14:37:42.379938+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-1e3b099e-dc40-45e4-9710-3d7f96dc459c",
+      "trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+      "timestamp": "2026-04-14T14:37:42.379999+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-95d17ab2-a0af-44e6-97db-55600c5d0517",
+      "trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+      "timestamp": "2026-04-14T14:37:42.380024+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-1e3b099e-dc40-45e4-9710-3d7f96dc459c"
+    },
+    {
+      "event_id": "evt-cef79d76-9bcf-41c7-a430-13e18d46e95f",
+      "trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+      "timestamp": "2026-04-14T14:37:42.380029+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-95d17ab2-a0af-44e6-97db-55600c5d0517"
+    },
+    {
+      "event_id": "evt-memory-traj-9190707c-5486-4266-a6c8-32f34c6c63ec-mem-telegram-pref",
+      "trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
+      "timestamp": "2026-04-14T14:37:42.380034+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-9edc5088-09cc-42d6-a160-cede5357f535.json b/docs/demo-artifacts/trajectories/traj-9edc5088-09cc-42d6-a160-cede5357f535.json
new file mode 100644
index 0000000..04aa383
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-9edc5088-09cc-42d6-a160-cede5357f535.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+  "task": {
+    "task_id": "task-18b8251b-4a68-45e1-93ba-645fe21a279f",
+    "input": "Run the deploy workflow skill.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.603850+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-33bd0017-cf1b-44ac-892b-c2004bc44c1a",
+      "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+      "timestamp": "2026-04-14T15:52:24.603951+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-de288e29-228d-46d4-a657-34edae35fea4",
+      "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+      "timestamp": "2026-04-14T15:52:24.603961+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-33bd0017-cf1b-44ac-892b-c2004bc44c1a"
+    },
+    {
+      "event_id": "evt-256cd272-bcee-48e2-b36b-a4048b6aef3e",
+      "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+      "timestamp": "2026-04-14T15:52:24.603968+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-de288e29-228d-46d4-a657-34edae35fea4"
+    },
+    {
+      "event_id": "evt-tool-traj-9edc5088-09cc-42d6-a160-cede5357f535-tool-terminal",
+      "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+      "timestamp": "2026-04-14T15:52:24.603984+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-9edc5088-09cc-42d6-a160-cede5357f535-tool-terminal",
+      "trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
+      "timestamp": "2026-04-14T15:52:24.603990+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-adb05c91-4c0c-493a-af84-517efea3f406.json b/docs/demo-artifacts/trajectories/traj-adb05c91-4c0c-493a-af84-517efea3f406.json
new file mode 100644
index 0000000..ef0dc41
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-adb05c91-4c0c-493a-af84-517efea3f406.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+  "task": {
+    "task_id": "task-66d9a459-4bad-40a5-beda-a9cb30f2e790",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T15:27:38.517870+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-57882c3b-f081-4cb6-b622-98594bfd7b82",
+      "trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+      "timestamp": "2026-04-14T15:27:38.517938+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-6eafb4ae-7960-4f17-a928-77834f432cbb",
+      "trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+      "timestamp": "2026-04-14T15:27:38.517945+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-57882c3b-f081-4cb6-b622-98594bfd7b82"
+    },
+    {
+      "event_id": "evt-b90de7a7-83a3-4bed-b63d-bf07ba3fc06a",
+      "trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+      "timestamp": "2026-04-14T15:27:38.517950+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-6eafb4ae-7960-4f17-a928-77834f432cbb"
+    },
+    {
+      "event_id": "evt-memory-traj-adb05c91-4c0c-493a-af84-517efea3f406-mem-telegram-pref",
+      "trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
+      "timestamp": "2026-04-14T15:27:38.517955+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc.json b/docs/demo-artifacts/trajectories/traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc.json
new file mode 100644
index 0000000..4f4b335
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc.json
@@ -0,0 +1,186 @@
+{
+  "trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+  "task": {
+    "task_id": "task-c88d23cc-88f6-4352-a506-e37187a0e28a",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T06:53:08.732451+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "rejected_ids": [],
+      "rationale": "Task resembles a reusable procedure; load a skill before action.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-56b47bb2-7cd9-4d1a-9364-b2b6c2b82759",
+      "trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+      "timestamp": "2026-04-14T06:53:08.732515+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-62bc72e7-4b3f-4a72-a98e-1ad5bf86aaa4",
+      "trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+      "timestamp": "2026-04-14T06:53:08.732521+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-56b47bb2-7cd9-4d1a-9364-b2b6c2b82759"
+    },
+    {
+      "event_id": "evt-23968c32-845c-4fb2-86bb-723d70dfec80",
+      "trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+      "timestamp": "2026-04-14T06:53:08.732525+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "rejected_ids": [],
+        "rationale": "Task resembles a reusable procedure; load a skill before action.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-62bc72e7-4b3f-4a72-a98e-1ad5bf86aaa4"
+    },
+    {
+      "event_id": "evt-skill-traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc-skill-deploy",
+      "trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
+      "timestamp": "2026-04-14T06:53:08.732531+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.1,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.0,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-b786c15f-388d-4228-9da4-c9e82b61570a.json b/docs/demo-artifacts/trajectories/traj-b786c15f-388d-4228-9da4-c9e82b61570a.json
new file mode 100644
index 0000000..24c14f5
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-b786c15f-388d-4228-9da4-c9e82b61570a.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+  "task": {
+    "task_id": "task-920b26df-8e03-47b3-af48-99454d142e90",
+    "input": "Recall my saved preference from memory.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.603298+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-795ad519-4e78-4fdd-b1a9-3e1e2b2cdea0",
+      "trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+      "timestamp": "2026-04-14T15:52:24.603384+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-1fbe3cfc-ed78-40f6-b0d9-25ccd14a0110",
+      "trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+      "timestamp": "2026-04-14T15:52:24.603390+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-795ad519-4e78-4fdd-b1a9-3e1e2b2cdea0"
+    },
+    {
+      "event_id": "evt-a57f0922-dbfe-424a-a704-2a382ffa219b",
+      "trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+      "timestamp": "2026-04-14T15:52:24.603396+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-1fbe3cfc-ed78-40f6-b0d9-25ccd14a0110"
+    },
+    {
+      "event_id": "evt-memory-traj-b786c15f-388d-4228-9da4-c9e82b61570a-mem-telegram-pref",
+      "trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
+      "timestamp": "2026-04-14T15:52:24.603401+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e.json b/docs/demo-artifacts/trajectories/traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e.json
new file mode 100644
index 0000000..5f2bca2
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+  "task": {
+    "task_id": "task-35b31642-86af-4e2c-a255-cdbe19659101",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "local",
+    "created_at": "2026-04-14T14:37:42.382074+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1941.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-8f072b70-4161-46fc-bede-cceb930d4cc2",
+      "trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+      "timestamp": "2026-04-14T14:37:42.382140+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-fc15daf4-f738-455e-8b12-39143b3c3d6c",
+      "trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+      "timestamp": "2026-04-14T14:37:42.382146+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-8f072b70-4161-46fc-bede-cceb930d4cc2"
+    },
+    {
+      "event_id": "evt-d899c751-6157-4548-893e-b766eeafeb3d",
+      "trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+      "timestamp": "2026-04-14T14:37:42.382150+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1941.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-fc15daf4-f738-455e-8b12-39143b3c3d6c"
+    },
+    {
+      "event_id": "evt-skill-traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e-skill-deploy",
+      "trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
+      "timestamp": "2026-04-14T14:37:42.382155+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43.json b/docs/demo-artifacts/trajectories/traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43.json
new file mode 100644
index 0000000..5609905
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43.json
@@ -0,0 +1,207 @@
+{
+  "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+  "task": {
+    "task_id": "task-1a24d0bb-b2e6-44f0-8095-2ed74368dc9d",
+    "input": "Run the deploy workflow skill.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.861760+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-b4437076-cc94-4903-a2c7-3dd7c644dcc5",
+      "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+      "timestamp": "2026-04-14T16:50:18.861861+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-dd3dad15-7ace-47f7-9dd0-cf4955aa16ec",
+      "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+      "timestamp": "2026-04-14T16:50:18.861871+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-b4437076-cc94-4903-a2c7-3dd7c644dcc5"
+    },
+    {
+      "event_id": "evt-c3b04a4a-2506-47db-8d08-c8939c0eba08",
+      "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+      "timestamp": "2026-04-14T16:50:18.861878+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-dd3dad15-7ace-47f7-9dd0-cf4955aa16ec"
+    },
+    {
+      "event_id": "evt-tool-traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43-tool-terminal",
+      "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+      "timestamp": "2026-04-14T16:50:18.861901+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43-tool-terminal",
+      "trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
+      "timestamp": "2026-04-14T16:50:18.861906+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.032,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.008,
+      "context_cost": 0.06,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-c5907bfb-61d2-47f9-a6c5-2300701bb551.json b/docs/demo-artifacts/trajectories/traj-c5907bfb-61d2-47f9-a6c5-2300701bb551.json
new file mode 100644
index 0000000..f87dcef
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-c5907bfb-61d2-47f9-a6c5-2300701bb551.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+  "task": {
+    "task_id": "task-c1f58e80-f0eb-47e9-92ab-9b1a84351dff",
+    "input": "Use my telegram preference for this answer.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T15:27:38.512116+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task likely depends on stable user/project facts.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-212f6d74-bafd-483b-b8ec-cf4a33bf67da",
+      "trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+      "timestamp": "2026-04-14T15:27:38.512204+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-34b409a4-9ba9-4921-b3a6-e4c41bf7660c",
+      "trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+      "timestamp": "2026-04-14T15:27:38.512211+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-212f6d74-bafd-483b-b8ec-cf4a33bf67da"
+    },
+    {
+      "event_id": "evt-d117772a-0e77-4068-8ca5-0adacfcee184",
+      "trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+      "timestamp": "2026-04-14T15:27:38.512216+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task likely depends on stable user/project facts.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-34b409a4-9ba9-4921-b3a6-e4c41bf7660c"
+    },
+    {
+      "event_id": "evt-memory-traj-c5907bfb-61d2-47f9-a6c5-2300701bb551-mem-telegram-pref",
+      "trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
+      "timestamp": "2026-04-14T15:27:38.512223+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Use my telegram preference for this answer."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-c9c11bdc-852b-4aef-851c-f2968806e535.json b/docs/demo-artifacts/trajectories/traj-c9c11bdc-852b-4aef-851c-f2968806e535.json
new file mode 100644
index 0000000..a15aa41
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-c9c11bdc-852b-4aef-851c-f2968806e535.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+  "task": {
+    "task_id": "task-c08fbd42-a324-4430-8277-94c666661238",
+    "input": "Check the current system status.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:27:38.520185+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1381.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-2e0920c4-6830-4c86-a4a3-139028e46176",
+      "trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+      "timestamp": "2026-04-14T15:27:38.520262+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-f81b2a77-a012-4c62-9700-93f1b31daeb2",
+      "trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+      "timestamp": "2026-04-14T15:27:38.520268+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-2e0920c4-6830-4c86-a4a3-139028e46176"
+    },
+    {
+      "event_id": "evt-2b1fe09d-30b3-46c9-a706-373d5c8da08e",
+      "trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+      "timestamp": "2026-04-14T15:27:38.520273+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1381.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-f81b2a77-a012-4c62-9700-93f1b31daeb2"
+    },
+    {
+      "event_id": "evt-memory-traj-c9c11bdc-852b-4aef-851c-f2968806e535-mem-telegram-pref",
+      "trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
+      "timestamp": "2026-04-14T15:27:38.520280+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-d2d3a115-36d8-466f-9d14-bf741316f698.json b/docs/demo-artifacts/trajectories/traj-d2d3a115-36d8-466f-9d14-bf741316f698.json
new file mode 100644
index 0000000..94906f7
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-d2d3a115-36d8-466f-9d14-bf741316f698.json
@@ -0,0 +1,201 @@
+{
+  "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+  "task": {
+    "task_id": "task-00ccd7d0-72d9-458f-87fa-be0ee5571e44",
+    "input": "Check the current system status.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T06:53:08.731950+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": [
+        "tool-terminal"
+      ],
+      "rejected_ids": [],
+      "rationale": "Task asks for current state or external action; tool use is justified.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-63d64eb8-16b1-4dc7-ae03-7c094bc6e64f",
+      "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+      "timestamp": "2026-04-14T06:53:08.732042+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-04ef718b-6973-465d-920e-bc501a6e02ad",
+      "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+      "timestamp": "2026-04-14T06:53:08.732049+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-63d64eb8-16b1-4dc7-ae03-7c094bc6e64f"
+    },
+    {
+      "event_id": "evt-50f19e1e-8771-42c1-8846-95b5e4a6f491",
+      "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+      "timestamp": "2026-04-14T06:53:08.732053+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "call_tool",
+        "selected_ids": [
+          "tool-terminal"
+        ],
+        "rejected_ids": [],
+        "rationale": "Task asks for current state or external action; tool use is justified.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-04ef718b-6973-465d-920e-bc501a6e02ad"
+    },
+    {
+      "event_id": "evt-tool-traj-d2d3a115-36d8-466f-9d14-bf741316f698-tool-terminal",
+      "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+      "timestamp": "2026-04-14T06:53:08.732064+00:00",
+      "stage": "execution",
+      "event_type": "tool_called",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "input": "Check the current system status."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-tool-result-traj-d2d3a115-36d8-466f-9d14-bf741316f698-tool-terminal",
+      "trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
+      "timestamp": "2026-04-14T06:53:08.732068+00:00",
+      "stage": "execution",
+      "event_type": "tool_result",
+      "payload": {
+        "tool_id": "tool-terminal",
+        "status": "success",
+        "output": "demo-result-for:tool-terminal",
+        "error": null,
+        "latency_ms": 42
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 42,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.058,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.25,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.042,
+      "context_cost": 0.0,
+      "useful_reuse": 0.05
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-d3575889-7458-44b9-b3f1-f04cd766ca76.json b/docs/demo-artifacts/trajectories/traj-d3575889-7458-44b9-b3f1-f04cd766ca76.json
new file mode 100644
index 0000000..be5c37a
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-d3575889-7458-44b9-b3f1-f04cd766ca76.json
@@ -0,0 +1,191 @@
+{
+  "trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+  "task": {
+    "task_id": "task-9db54b7d-a508-49ac-bd3c-bd5af3eabc61",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:27:38.520867+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": [
+        "mem-telegram-pref"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1897.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-3e658630-fea8-44c3-afd2-fc936a2eed37",
+      "trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+      "timestamp": "2026-04-14T15:27:38.520945+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-03990424-3433-4147-a963-353863758b31",
+      "trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+      "timestamp": "2026-04-14T15:27:38.520951+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-3e658630-fea8-44c3-afd2-fc936a2eed37"
+    },
+    {
+      "event_id": "evt-10dfab37-ded7-473e-9de9-2f922c5bf7c8",
+      "trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+      "timestamp": "2026-04-14T15:27:38.520956+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "inject_memory",
+        "selected_ids": [
+          "mem-telegram-pref"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1897.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-03990424-3433-4147-a963-353863758b31"
+    },
+    {
+      "event_id": "evt-memory-traj-d3575889-7458-44b9-b3f1-f04cd766ca76-mem-telegram-pref",
+      "trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
+      "timestamp": "2026-04-14T15:27:38.520961+00:00",
+      "stage": "execution",
+      "event_type": "memory_injected",
+      "payload": {
+        "record_id": "mem-telegram-pref",
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-d99b5307-1749-4e80-867a-877e087f226f.json b/docs/demo-artifacts/trajectories/traj-d99b5307-1749-4e80-867a-877e087f226f.json
new file mode 100644
index 0000000..80ca2a8
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-d99b5307-1749-4e80-867a-877e087f226f.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
+  "task": {
+    "task_id": "task-9cda8e38-dcdf-4877-bc19-48444df0531e",
+    "input": "Use multiple capabilities: memory, skill, and tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.865109+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=2606.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-88a21058-c409-4836-a1b8-ef6cc63ac51e",
+      "trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
+      "timestamp": "2026-04-14T16:50:18.865214+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use multiple capabilities: memory, skill, and tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-44d46564-2d71-4bed-8a3f-d3fc96fce9ef",
+      "trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
+      "timestamp": "2026-04-14T16:50:18.865225+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-88a21058-c409-4836-a1b8-ef6cc63ac51e"
+    },
+    {
+      "event_id": "evt-e21e8afe-d676-4839-b9d0-fd60441b983a",
+      "trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
+      "timestamp": "2026-04-14T16:50:18.865231+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=2606.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-44d46564-2d71-4bed-8a3f-d3fc96fce9ef"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-dd361c81-40a1-4892-9914-2140870fff95.json b/docs/demo-artifacts/trajectories/traj-dd361c81-40a1-4892-9914-2140870fff95.json
new file mode 100644
index 0000000..379c22f
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-dd361c81-40a1-4892-9914-2140870fff95.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
+  "task": {
+    "task_id": "task-789e89f1-828b-405e-ab11-43dd00107f5f",
+    "input": "Deploy this service with the usual workflow.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:27:38.519101+00:00",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Task resembles a reusable procedure; load a skill before action.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-ec9bd980-c648-43fc-8428-83a6ce0cf375",
+      "trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
+      "timestamp": "2026-04-14T15:27:38.519171+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Deploy this service with the usual workflow."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-e0f1f4e9-2a70-424d-bff6-34a156134b0f",
+      "trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
+      "timestamp": "2026-04-14T15:27:38.519177+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-ec9bd980-c648-43fc-8428-83a6ce0cf375"
+    },
+    {
+      "event_id": "evt-9b1ea6f8-ac54-4aa4-ae0f-44aa3a0128dd",
+      "trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
+      "timestamp": "2026-04-14T15:27:38.519181+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Task resembles a reusable procedure; load a skill before action.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-e0f1f4e9-2a70-424d-bff6-34a156134b0f"
+    },
+    {
+      "event_id": "evt-skill-traj-dd361c81-40a1-4892-9914-2140870fff95-skill-deploy",
+      "trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
+      "timestamp": "2026-04-14T15:27:38.519188+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Deploy this service with the usual workflow.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb.json b/docs/demo-artifacts/trajectories/traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb.json
new file mode 100644
index 0000000..f64ddc8
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+  "task": {
+    "task_id": "task-144d7465-796c-4dd0-a4e2-c2be42872c4a",
+    "input": "Run the deploy workflow skill.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.606059+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1277.214).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-184ab2f3-c1c6-4af1-8241-d55b4731e606",
+      "trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+      "timestamp": "2026-04-14T15:52:24.606169+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-9dd959ce-a5ce-42fd-b975-a03dd713adf6",
+      "trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+      "timestamp": "2026-04-14T15:52:24.606180+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-184ab2f3-c1c6-4af1-8241-d55b4731e606"
+    },
+    {
+      "event_id": "evt-537a8488-f6eb-4f15-94ac-3e1f195c584a",
+      "trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+      "timestamp": "2026-04-14T15:52:24.606193+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1277.214).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-9dd959ce-a5ce-42fd-b975-a03dd713adf6"
+    },
+    {
+      "event_id": "evt-skill-traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb-skill-deploy",
+      "trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
+      "timestamp": "2026-04-14T15:52:24.606202+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Run the deploy workflow skill.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-e9c37170-8764-4d70-ba0d-90213b275229.json b/docs/demo-artifacts/trajectories/traj-e9c37170-8764-4d70-ba0d-90213b275229.json
new file mode 100644
index 0000000..a7dd3df
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-e9c37170-8764-4d70-ba0d-90213b275229.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
+  "task": {
+    "task_id": "task-f61f5344-3be7-4a7a-9dfa-b8d2a9c30a42",
+    "input": "Recall my saved preference from memory.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.863539+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1994.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-21762ef9-6490-4e3f-8f3c-2ba17e20c050",
+      "trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
+      "timestamp": "2026-04-14T16:50:18.863643+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-40f5b045-1e94-4c07-8cf5-5a245a946b9d",
+      "trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
+      "timestamp": "2026-04-14T16:50:18.863652+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-21762ef9-6490-4e3f-8f3c-2ba17e20c050"
+    },
+    {
+      "event_id": "evt-5ed49c2e-d2b3-46ec-859e-ec00f8c001c2",
+      "trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
+      "timestamp": "2026-04-14T16:50:18.863659+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1994.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-40f5b045-1e94-4c07-8cf5-5a245a946b9d"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5.json b/docs/demo-artifacts/trajectories/traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5.json
new file mode 100644
index 0000000..3a4f27f
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
+  "task": {
+    "task_id": "task-d7578bf3-95da-43f2-9b31-2c80ccb4fe33",
+    "input": "Run the deploy workflow skill.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.864056+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1535.615).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-4e1aa172-112d-4000-8708-f2184e114ee5",
+      "trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
+      "timestamp": "2026-04-14T16:50:18.864163+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Run the deploy workflow skill."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-b9e7f5b9-2f27-4f4d-8f76-8ba6b39620eb",
+      "trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
+      "timestamp": "2026-04-14T16:50:18.864173+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-4e1aa172-112d-4000-8708-f2184e114ee5"
+    },
+    {
+      "event_id": "evt-07dcc07d-d9c4-4698-881d-925294dadadf",
+      "trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
+      "timestamp": "2026-04-14T16:50:18.864179+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1535.615).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-b9e7f5b9-2f27-4f4d-8f76-8ba6b39620eb"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17.json b/docs/demo-artifacts/trajectories/traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17.json
new file mode 100644
index 0000000..8257a07
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+  "task": {
+    "task_id": "task-d9131553-8868-4dac-8f06-69be44c43f4e",
+    "input": "Use multiple capabilities: memory, skill, and tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.607062+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=2167.3334).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-a8bc2d4a-1557-4029-899f-7fa93b764b11",
+      "trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+      "timestamp": "2026-04-14T15:52:24.607165+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use multiple capabilities: memory, skill, and tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-832ac7e6-d619-4e24-ad74-bcca1042806e",
+      "trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+      "timestamp": "2026-04-14T15:52:24.607175+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-a8bc2d4a-1557-4029-899f-7fa93b764b11"
+    },
+    {
+      "event_id": "evt-0aaff11d-de9f-4e28-bc92-6def76857a20",
+      "trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+      "timestamp": "2026-04-14T15:52:24.607182+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=2167.3334).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-832ac7e6-d619-4e24-ad74-bcca1042806e"
+    },
+    {
+      "event_id": "evt-skill-traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17-skill-deploy",
+      "trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
+      "timestamp": "2026-04-14T15:52:24.607192+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Use multiple capabilities: memory, skill, and tool.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-f1d895a0-5442-448f-8936-4ee8b07822e6.json b/docs/demo-artifacts/trajectories/traj-f1d895a0-5442-448f-8936-4ee8b07822e6.json
new file mode 100644
index 0000000..9d1d797
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-f1d895a0-5442-448f-8936-4ee8b07822e6.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+  "task": {
+    "task_id": "task-053282d0-1f43-409f-a230-343d3faa02df",
+    "input": "Check current system status with a tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.606551+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1701.0804).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-5900439a-2a97-41fe-a82e-96181c99fee1",
+      "trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+      "timestamp": "2026-04-14T15:52:24.606656+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Check current system status with a tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-e0965597-ddee-4ccd-ae72-b51105101428",
+      "trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+      "timestamp": "2026-04-14T15:52:24.606666+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-5900439a-2a97-41fe-a82e-96181c99fee1"
+    },
+    {
+      "event_id": "evt-047dc545-d6c2-4a67-b0db-26b79e994e63",
+      "trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+      "timestamp": "2026-04-14T15:52:24.606672+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1701.0804).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-e0965597-ddee-4ccd-ae72-b51105101428"
+    },
+    {
+      "event_id": "evt-skill-traj-f1d895a0-5442-448f-8936-4ee8b07822e6-skill-deploy",
+      "trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
+      "timestamp": "2026-04-14T15:52:24.606681+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Check current system status with a tool.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3.json b/docs/demo-artifacts/trajectories/traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3.json
new file mode 100644
index 0000000..5108176
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3.json
@@ -0,0 +1,170 @@
+{
+  "trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
+  "task": {
+    "task_id": "task-c3c52f6d-4793-4687-9838-d98fd99a6074",
+    "input": "Use multiple capabilities: memory, skill, and tool.",
+    "channel": "local",
+    "created_at": "2026-04-14T16:50:18.863031+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "clarify",
+      "selected_ids": [],
+      "selected_payloads": [],
+      "rejected_ids": [],
+      "rationale": "No high-confidence route found from the current heuristic baseline.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-1cfd8f39-f961-43da-9fb4-9e37dd7072f0",
+      "trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
+      "timestamp": "2026-04-14T16:50:18.863119+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Use multiple capabilities: memory, skill, and tool."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-a7f6a38f-76c5-4342-a592-4acbd15efe9f",
+      "trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
+      "timestamp": "2026-04-14T16:50:18.863129+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-1cfd8f39-f961-43da-9fb4-9e37dd7072f0"
+    },
+    {
+      "event_id": "evt-79e3d820-34bf-4c20-9286-2e20dd3e068c",
+      "trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
+      "timestamp": "2026-04-14T16:50:18.863136+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "clarify",
+        "selected_ids": [],
+        "selected_payloads": [],
+        "rejected_ids": [],
+        "rationale": "No high-confidence route found from the current heuristic baseline.",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-a7f6a38f-76c5-4342-a592-4acbd15efe9f"
+    }
+  ],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 0.44,
+    "components": {
+      "task_success": 0.4,
+      "retrieval_hit": 0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/demo-artifacts/trajectories/traj-ffb40d01-7956-4d7b-a41c-9618487fe619.json b/docs/demo-artifacts/trajectories/traj-ffb40d01-7956-4d7b-a41c-9618487fe619.json
new file mode 100644
index 0000000..adfa60d
--- /dev/null
+++ b/docs/demo-artifacts/trajectories/traj-ffb40d01-7956-4d7b-a41c-9618487fe619.json
@@ -0,0 +1,192 @@
+{
+  "trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
+  "task": {
+    "task_id": "task-f0aed2e6-8d9b-42f8-a20c-5eb8af052d3b",
+    "input": "Recall my saved preference from memory.",
+    "channel": "local",
+    "created_at": "2026-04-14T15:52:24.605509+00:00",
+    "user_id": null
+  },
+  "context_snapshot": {
+    "conversation_summary": "",
+    "environment_summary": "",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-telegram-pref",
+        "type": "memory",
+        "title": "Telegram preference",
+        "summary": "Prefer plain text on Telegram.",
+        "triggers": [
+          "telegram",
+          "preference",
+          "answer"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 0.9,
+        "risk": 0.0,
+        "tags": [
+          "output"
+        ],
+        "source": "user",
+        "type_payload": {}
+      }
+    ],
+    "skill": [
+      {
+        "id": "skill-deploy",
+        "type": "skill",
+        "title": "Deploy workflow",
+        "summary": "Reusable deployment workflow.",
+        "triggers": [
+          "deploy",
+          "workflow",
+          "service"
+        ],
+        "cost": 0.0,
+        "confidence": 0.8,
+        "success_rate": 0.9,
+        "freshness": 0.8,
+        "risk": 0.0,
+        "tags": [
+          "ops"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ],
+    "tool": [
+      {
+        "id": "tool-terminal",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run terminal-style inspection commands.",
+        "triggers": [
+          "check",
+          "current",
+          "status",
+          "system"
+        ],
+        "cost": 0.0,
+        "confidence": 0.95,
+        "success_rate": 0.9,
+        "freshness": 1.0,
+        "risk": 0.0,
+        "tags": [
+          "inspection"
+        ],
+        "source": "system",
+        "type_payload": {}
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "load_skill",
+      "selected_ids": [
+        "skill-deploy"
+      ],
+      "selected_payloads": [
+        {}
+      ],
+      "rejected_ids": [],
+      "rationale": "Predicted by learning router (score=1658.6938).",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [
+    {
+      "event_id": "evt-44233637-eb1a-47de-972c-942ee409dd78",
+      "trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
+      "timestamp": "2026-04-14T15:52:24.605614+00:00",
+      "stage": "retrieval",
+      "event_type": "task_received",
+      "payload": {
+        "input": "Recall my saved preference from memory."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    },
+    {
+      "event_id": "evt-01123ad4-7d52-4c82-bca1-1a3b5014196f",
+      "trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
+      "timestamp": "2026-04-14T15:52:24.605625+00:00",
+      "stage": "retrieval",
+      "event_type": "candidates_recalled",
+      "payload": {
+        "memory_ids": [
+          "mem-telegram-pref"
+        ],
+        "skill_ids": [
+          "skill-deploy"
+        ],
+        "tool_ids": [
+          "tool-terminal"
+        ]
+      },
+      "metrics": {},
+      "parent_event_id": "evt-44233637-eb1a-47de-972c-942ee409dd78"
+    },
+    {
+      "event_id": "evt-a9a657a1-1e3e-49f3-8ea0-9528c12c633f",
+      "trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
+      "timestamp": "2026-04-14T15:52:24.605632+00:00",
+      "stage": "policy",
+      "event_type": "action_selected",
+      "payload": {
+        "step": 1,
+        "decision_type": "load_skill",
+        "selected_ids": [
+          "skill-deploy"
+        ],
+        "selected_payloads": [
+          {}
+        ],
+        "rejected_ids": [],
+        "rationale": "Predicted by learning router (score=1658.6938).",
+        "estimated_cost": 0.0
+      },
+      "metrics": {},
+      "parent_event_id": "evt-01123ad4-7d52-4c82-bca1-1a3b5014196f"
+    },
+    {
+      "event_id": "evt-skill-traj-ffb40d01-7956-4d7b-a41c-9618487fe619-skill-deploy",
+      "trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
+      "timestamp": "2026-04-14T15:52:24.605642+00:00",
+      "stage": "execution",
+      "event_type": "skill_loaded",
+      "payload": {
+        "skill_id": "skill-deploy",
+        "input": "Recall my saved preference from memory.",
+        "instructions": "Demo skill payload loaded successfully."
+      },
+      "metrics": {},
+      "parent_event_id": null
+    }
+  ],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 0,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Draft trajectory generated by MemabraRunner with execution hooks."
+  },
+  "reward": {
+    "total": 1.04,
+    "components": {
+      "task_success": 0.8,
+      "retrieval_hit": 0.2,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.0,
+      "context_cost": 0.06,
+      "useful_reuse": 0.1
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/examples/trajectory_failure_missed_memory.json b/docs/examples/trajectory_failure_missed_memory.json
new file mode 100644
index 0000000..c72c1e1
--- /dev/null
+++ b/docs/examples/trajectory_failure_missed_memory.json
@@ -0,0 +1,66 @@
+{
+  "trajectory_id": "traj-failure-missed-memory-001",
+  "task": {
+    "task_id": "task-004",
+    "input": "Use my usual formatting preferences for this write-up.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T13:05:00Z",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "User has repeated stable formatting preferences in earlier sessions.",
+    "environment_summary": "No tool call required.",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-format-1",
+        "type": "memory",
+        "title": "Telegram formatting preference",
+        "summary": "Prefer plain text over markdown for Telegram delivery.",
+        "triggers": ["format", "telegram", "write-up"],
+        "cost": 0.05,
+        "confidence": 0.9,
+        "success_rate": 0.95,
+        "freshness": 0.95,
+        "risk": 0.05,
+        "tags": ["preference", "output"],
+        "source": "system"
+      }
+    ],
+    "skill": [],
+    "tool": []
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "direct_answer",
+      "selected_ids": [],
+      "rejected_ids": ["mem-format-1"],
+      "rationale": "Router failed to recognize a preference-triggered task and skipped memory injection.",
+      "estimated_cost": 0.0
+    }
+  ],
+  "events": [],
+  "outcome": {
+    "status": "partial_success",
+    "steps": 1,
+    "latency_ms": 300,
+    "user_corrections": 1,
+    "tool_errors": 0,
+    "notes": "Answer was serviceable but ignored known formatting preference."
+  },
+  "reward": {
+    "total": 0.18,
+    "components": {
+      "task_success": 0.5,
+      "retrieval_hit": -0.1,
+      "tool_error": 0.0,
+      "user_correction": 0.2,
+      "latency": 0.02,
+      "context_cost": 0.0,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/examples/trajectory_failure_overtool.json b/docs/examples/trajectory_failure_overtool.json
new file mode 100644
index 0000000..3879356
--- /dev/null
+++ b/docs/examples/trajectory_failure_overtool.json
@@ -0,0 +1,66 @@
+{
+  "trajectory_id": "traj-failure-overtool-001",
+  "task": {
+    "task_id": "task-003",
+    "input": "Name this project.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T13:04:00Z",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "User asks for naming help for an agent memory project.",
+    "environment_summary": "No real-time state lookup required.",
+    "recent_failures": ["The agent previously overused tools for pure reasoning tasks."]
+  },
+  "candidate_sets": {
+    "memory": [],
+    "skill": [],
+    "tool": [
+      {
+        "id": "tool-web-1",
+        "type": "tool",
+        "title": "web_search",
+        "summary": "Search the web for information.",
+        "triggers": ["name", "idea"],
+        "cost": 0.4,
+        "confidence": 0.62,
+        "success_rate": 0.55,
+        "freshness": 1.0,
+        "risk": 0.3,
+        "tags": ["research"],
+        "source": "system"
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": ["tool-web-1"],
+      "rejected_ids": [],
+      "rationale": "Incorrectly treated naming as a research task rather than a reasoning task.",
+      "estimated_cost": 0.4
+    }
+  ],
+  "events": [],
+  "outcome": {
+    "status": "failure",
+    "steps": 2,
+    "latency_ms": 2400,
+    "user_corrections": 1,
+    "tool_errors": 1,
+    "notes": "Over-tooled a pure reasoning task and forced unnecessary latency."
+  },
+  "reward": {
+    "total": -0.82,
+    "components": {
+      "task_success": -0.3,
+      "retrieval_hit": 0.0,
+      "tool_error": 0.35,
+      "user_correction": 0.25,
+      "latency": 0.12,
+      "context_cost": 0.1,
+      "useful_reuse": 0.0
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/examples/trajectory_success_memory.json b/docs/examples/trajectory_success_memory.json
new file mode 100644
index 0000000..292df6d
--- /dev/null
+++ b/docs/examples/trajectory_success_memory.json
@@ -0,0 +1,66 @@
+{
+  "trajectory_id": "traj-success-memory-001",
+  "task": {
+    "task_id": "task-001",
+    "input": "Remember my preferred deployment region and use it next time.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T13:02:00Z",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "User is defining a local agent memory project and references recurring preferences.",
+    "environment_summary": "No live tool call required.",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [
+      {
+        "id": "mem-region-1",
+        "type": "memory",
+        "title": "Preferred deployment region",
+        "summary": "User prefers us-west-2 for deployments.",
+        "triggers": ["deployment", "region", "preference"],
+        "cost": 0.1,
+        "confidence": 0.93,
+        "success_rate": 0.88,
+        "freshness": 0.9,
+        "risk": 0.1,
+        "tags": ["preference", "deployment"],
+        "source": "user"
+      }
+    ],
+    "skill": [],
+    "tool": []
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "inject_memory",
+      "selected_ids": ["mem-region-1"],
+      "rejected_ids": [],
+      "rationale": "User request depends on a stable preference, so memory injection is the lowest-cost correct route.",
+      "estimated_cost": 0.1
+    }
+  ],
+  "events": [],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 350,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Correctly identified preference storage request without unnecessary tools."
+  },
+  "reward": {
+    "total": 1.72,
+    "components": {
+      "task_success": 1.0,
+      "retrieval_hit": 0.45,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.03,
+      "context_cost": 0.05,
+      "useful_reuse": 0.35
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/examples/trajectory_success_tool.json b/docs/examples/trajectory_success_tool.json
new file mode 100644
index 0000000..7a8692e
--- /dev/null
+++ b/docs/examples/trajectory_success_tool.json
@@ -0,0 +1,66 @@
+{
+  "trajectory_id": "traj-success-tool-001",
+  "task": {
+    "task_id": "task-002",
+    "input": "Check the current test status for the prototype.",
+    "channel": "telegram",
+    "created_at": "2026-04-14T13:03:00Z",
+    "user_id": "oza"
+  },
+  "context_snapshot": {
+    "conversation_summary": "User wants concrete progress on the memabra prototype.",
+    "environment_summary": "Pytest is available in the local repo environment.",
+    "recent_failures": []
+  },
+  "candidate_sets": {
+    "memory": [],
+    "skill": [],
+    "tool": [
+      {
+        "id": "tool-terminal-1",
+        "type": "tool",
+        "title": "terminal",
+        "summary": "Run shell commands in the local environment.",
+        "triggers": ["check", "current", "test"],
+        "cost": 0.2,
+        "confidence": 0.95,
+        "success_rate": 0.92,
+        "freshness": 1.0,
+        "risk": 0.2,
+        "tags": ["system", "tests"],
+        "source": "system"
+      }
+    ]
+  },
+  "decisions": [
+    {
+      "step": 1,
+      "decision_type": "call_tool",
+      "selected_ids": ["tool-terminal-1"],
+      "rejected_ids": [],
+      "rationale": "Current test status is a live system fact and must be observed with a tool.",
+      "estimated_cost": 0.2
+    }
+  ],
+  "events": [],
+  "outcome": {
+    "status": "success",
+    "steps": 1,
+    "latency_ms": 700,
+    "user_corrections": 0,
+    "tool_errors": 0,
+    "notes": "Terminal used appropriately to inspect live test state."
+  },
+  "reward": {
+    "total": 1.6,
+    "components": {
+      "task_success": 1.0,
+      "retrieval_hit": 0.4,
+      "tool_error": 0.0,
+      "user_correction": 0.0,
+      "latency": 0.08,
+      "context_cost": 0.02,
+      "useful_reuse": 0.3
+    }
+  }
+}
\ No newline at end of file
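Reviewer note: in most of the trajectory files above, `reward.total` equals the bonus components minus the penalty components. The sketch below checks that relationship. It assumes (this is inferred from the numbers, not stated in the source) that `tool_error`, `user_correction`, `latency`, and `context_cost` are stored as positive magnitudes and subtracted, while the remaining components are added. The helper names `signed_sum` and `check_reward` are hypothetical.

```python
import json
import math

# Assumption: these components are penalties stored as positive magnitudes.
PENALTIES = {"tool_error", "user_correction", "latency", "context_cost"}


def signed_sum(components: dict) -> float:
    """Add bonus components and subtract penalty components."""
    return sum(-v if k in PENALTIES else v for k, v in components.items())


def check_reward(reward: dict, tol: float = 1e-6) -> bool:
    """Return True when reward['total'] matches the signed component sum."""
    return math.isclose(reward["total"], signed_sum(reward["components"]), abs_tol=tol)


# Reward block copied from docs/examples/trajectory_success_tool.json.
example = json.loads("""
{"total": 1.6,
 "components": {"task_success": 1.0, "retrieval_hit": 0.4, "tool_error": 0.0,
                "user_correction": 0.0, "latency": 0.08, "context_cost": 0.02,
                "useful_reuse": 0.3}}
""")

if __name__ == "__main__":
    print("total matches components:", check_reward(example))
```

Under this reading, `docs/examples/trajectory_failure_overtool.json` is the one file in this section that does not match: its components sum to -1.12 while the recorded total is -0.82. Either that file follows a different convention or one of its values is off; the source does not say which.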
diff --git a/docs/reward_spec.md b/docs/reward_spec.md
new file mode 100644
index 0000000..9a540ce
--- /dev/null
+++ b/docs/reward_spec.md
@@ -0,0 +1,191 @@
+# Reward Specification
+
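+## Goals
+
+memabra's reward does not simply judge whether the task got done; it evaluates:
+- whether the right memory / skill / tool was selected
+- whether the run was efficient
+- whether it was stable
+- whether it reduced repeated user input and corrections
+- whether tool cost and context cost were kept under control
+
+The reward is not meant to make scores look good; it gives the routing policy an attributable, optimizable training signal.
+
+## Reward components
+
+The total reward is written as: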
+
+```text
+R = ws*S + wr*H - we*E - wc*C - wl*L - wx*X + wu*U
+```
+
+where:
+- `S` = task success
+- `H` = retrieval hit quality
+- `E` = execution/tool error penalty
+- `C` = user correction penalty
+- `L` = latency penalty
+- `X` = context cost penalty
+- `U` = useful reuse bonus
+
+## 1. Task Success (`S`)
+
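+Definition: whether the task was ultimately completed, and how well.
+
+Suggested values:
+- `1.0`: goal fully achieved
+- `0.5`: partially achieved
+- `0.0`: not completed
+- `-0.5`: clearly misleading or headed in the wrong direction
+
+Data sources:
+- automated task acceptance checker
+- explicit user feedback
+- replay comparison rules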
+
+## 2. Retrieval Hit Quality (`H`)
+
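+Definition: whether the memory / skill / tool candidates that were hit actually helped the task.
+
+Suggested breakdown:
+- `Hm`: memory hit
+- `Hs`: skill hit
+- `Ht`: tool hit
+
+How to score:
+- hit a high-value candidate and it helped reduce steps: positive reward
+- recalled many candidates but none were useful: low reward or 0
+- missed a key candidate: negative reward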
+
+## 3. Execution / Tool Error Penalty (`E`)
+
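+Definition: whether invalid, incorrect, or clearly redundant calls occurred.
+
+Examples:
+- calling a tool that should not have been called
+- clearly wrong tool arguments
+- repeating the same ineffective action
+- taking a long chain when a direct answer would have sufficed
+
+Suggested values:
+- each minor error: `0.1` to `0.3`
+- severe error: `0.5` to `1.0`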
+
+## 4. User Correction Penalty (`C`)
+
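+Definition: whether the user had to supply information that should already have been known, or correct a wrong action.
+
+Examples:
+- the user restates a preference
+- the user points out that the wrong tool was called
+- the user asks to retract an incorrect memory
+
+Rationale:
+This term matters a great deal for a long-lived system, because it directly reflects whether the system has actually learned.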
+
+## 5. Latency Penalty (`L`)
+
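+Definition: whether the time and number of steps spent completing the task were excessive.
+
+Suggested inputs:
+- wall-clock latency
+- action count
+- retry count
+
+Guideline:
+- a small amount of extra reasoning is acceptable
+- long, pointless detours must be penalized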
+
+## 6. Context Cost Penalty (`X`)
+
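+Definition: whether the context was inflated more than necessary.
+
+This includes:
+- injecting too many irrelevant memories
+- loading unnecessary skills
+- emitting overly large intermediate output
+
+Why:
+Agents tend to stuff in a little extra "just to be safe", and end up dragging the context down.
+This cost must enter the reward explicitly.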
+
+## 7. Useful Reuse Bonus (`U`)
+
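+Definition: whether the correct long-term information was reused and actually improved efficiency or quality.
+
+Examples:
+- successfully reusing a user preference, avoiding another confirmation
+- reusing a validated skill, reducing trial and error
+- reusing a similar episode to finish the task faster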
+
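+## Suggested initial weights
+
+A naive first version:
+
+```text
+ws = 1.0
+wr = 0.35
+we = 0.30
+wc = 0.40
+wl = 0.15
+wx = 0.20
+wu = 0.25
+```
+
+Rationale:
+- success carries the highest weight
+- user correction is penalized heavily, because it directly exposes that the system has not learned
+- retrieval hit has clear value, but must not outweigh the outcome
+- latency/context matter, but should not be weighted too heavily early on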
+
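+## Signal sources
+
+Reward signals can come from three kinds of sources:
+
+### A. Explicit signals
+- the user says "right" / "wrong"
+- the user corrects the agent
+- the user asks for a redo
+
+### B. Implicit signals
+- whether steps were reduced
+- whether an error was triggered
+- whether the same question was asked repeatedly
+- whether a timeout occurred
+
+### C. Programmatic acceptance checks
+- whether tests pass
+- whether the target file was generated
+- whether specified fields match
+- whether tool execution succeeded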
+
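+## Counterfactual logging requirements
+
+To support later training, the following must be recorded:
+- which candidates were in the candidate set
+- which one was finally selected
+- which high-scoring candidates were not selected
+- the local outcome of each action
+
+Otherwise the reward can only be assigned to the whole process, and no specific routing policy can be learned from it.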
+
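+## Early-phase strategy
+
+In Phase 0 / Phase 1, using the reward directly to update large-model weights is not recommended.
+Use it first for:
+- evaluating routing rules
+- labeling samples
+- optimizing candidate ranking
+- training a bandit / reranker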
+
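+## Risks
+
+- looking only at success rewards lucky guesses
+- looking only at efficiency makes the system afraid to explore
+- looking only at user feedback exposes the reward to noise in how users express themselves
+- without counterfactual records, training is largely blind
+
+## Current conclusion
+
+In memabra, the reward is not an accessory; it is core infrastructure for the learning loop.
+If the reward design is unclear, every later attempt to "update weights based on outcomes" becomes pseudo-learning.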
\ No newline at end of file
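Reviewer note: the formula `R = ws*S + wr*H - we*E - wc*C - wl*L - wx*X + wu*U` in `docs/reward_spec.md` can be sketched as a small function. This is a minimal illustration using the initial weights listed in the spec; the `RewardSignals` dataclass and `total_reward` function are hypothetical names, not taken from the memabra code base, and the example signal values are made up.

```python
from dataclasses import dataclass

# Initial weights as listed in docs/reward_spec.md.
WEIGHTS = {"ws": 1.0, "wr": 0.35, "we": 0.30, "wc": 0.40,
           "wl": 0.15, "wx": 0.20, "wu": 0.25}


@dataclass
class RewardSignals:
    """Raw, unweighted signals for one trajectory (hypothetical container)."""
    S: float  # task success
    H: float  # retrieval hit quality
    E: float  # execution/tool error penalty
    C: float  # user correction penalty
    L: float  # latency penalty
    X: float  # context cost penalty
    U: float  # useful reuse bonus


def total_reward(sig: RewardSignals, w: dict = WEIGHTS) -> float:
    """Weighted sum: bonuses are added, penalties are subtracted."""
    return (w["ws"] * sig.S + w["wr"] * sig.H
            - w["we"] * sig.E - w["wc"] * sig.C
            - w["wl"] * sig.L - w["wx"] * sig.X
            + w["wu"] * sig.U)


if __name__ == "__main__":
    # Illustrative values only: a successful run with mild latency and context cost.
    demo = RewardSignals(S=1.0, H=0.5, E=0.0, C=0.0, L=0.2, X=0.1, U=0.5)
    print(round(total_reward(demo), 4))
```

Keeping the weights in a plain dictionary makes it easy to try alternative weightings during replay without touching the scoring function.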
diff --git a/docs/router-versions/current.json b/docs/router-versions/current.json
new file mode 100644
index 0000000..99f5dff
--- /dev/null
+++ b/docs/router-versions/current.json
@@ -0,0 +1,13 @@
+{
+  "current_version_id": "20260415-023347",
+  "promotion_source": null,
+  "benchmark_summary": {
+    "reward_delta": 0.0,
+    "error_rate_delta": 0.0,
+    "latency_delta_ms": 0.0,
+    "baseline_avg_reward": 0.44,
+    "challenger_avg_reward": 0.44
+  },
+  "prior_version_id": "20260415-023347",
+  "saved_at": "2026-04-15T02:33:47.916903+00:00"
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-150123.json b/docs/router-versions/versions/20260414-150123.json
new file mode 100644
index 0000000..981eef7
--- /dev/null
+++ b/docs/router-versions/versions/20260414-150123.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-150123",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
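Reviewer note: each router version file stores per-decision `weights` over a shared list of `feature_keys`. One plausible way to read such a file is as a linear scorer, where each decision type's score is the dot product of its weights with a feature vector. The diff does not show how memabra actually computes router scores, so the sketch below is an assumption; `score`, `pick_decision`, and the example feature values are hypothetical.

```python
def score(weights: dict, features: dict, feature_keys: list) -> float:
    """Dot product of weights and features over the declared feature keys."""
    return sum(weights.get(k, 0.0) * features.get(k, 0.0) for k in feature_keys)


def pick_decision(version: dict, features: dict) -> str:
    """Return the decision type with the highest linear score (assumed rule)."""
    scores = {
        decision: score(w, features, version["feature_keys"])
        for decision, w in version["weights"].items()
    }
    return max(scores, key=scores.get)


# Weights copied from docs/router-versions/versions/20260414-150123.json.
VERSION = {
    "version_id": "20260414-150123",
    "weights": {
        "clarify": {
            "input_length": 6.0, "memory_count": 1.0, "skill_count": 1.0,
            "tool_count": 1.0, "top_memory_confidence": 0.9500000000000001,
            "top_skill_success_rate": 0.8999999999999998,
            "top_tool_confidence": 0.9500000000000001, "top_tool_risk": 0.0,
        }
    },
    "feature_keys": [
        "input_length", "memory_count", "skill_count", "tool_count",
        "top_memory_confidence", "top_skill_success_rate",
        "top_tool_confidence", "top_tool_risk",
    ],
}

# Hypothetical feature vector for a six-word input with one candidate of each type.
FEATURES = {
    "input_length": 6.0, "memory_count": 1.0, "skill_count": 1.0,
    "tool_count": 1.0, "top_memory_confidence": 0.95,
    "top_skill_success_rate": 0.9, "top_tool_confidence": 0.95,
    "top_tool_risk": 0.0,
}

if __name__ == "__main__":
    print(pick_decision(VERSION, FEATURES))
```

Because this version file only carries weights for `clarify`, any feature vector routes to `clarify` under this reading.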
diff --git a/docs/router-versions/versions/20260414-150127.json b/docs/router-versions/versions/20260414-150127.json
new file mode 100644
index 0000000..a2ff8e9
--- /dev/null
+++ b/docs/router-versions/versions/20260414-150127.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-150127",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-150228.json b/docs/router-versions/versions/20260414-150228.json
new file mode 100644
index 0000000..87fef3e
--- /dev/null
+++ b/docs/router-versions/versions/20260414-150228.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-150228",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-150426.json b/docs/router-versions/versions/20260414-150426.json
new file mode 100644
index 0000000..2c35bc1
--- /dev/null
+++ b/docs/router-versions/versions/20260414-150426.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-150426",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-152505.json b/docs/router-versions/versions/20260414-152505.json
new file mode 100644
index 0000000..89177f8
--- /dev/null
+++ b/docs/router-versions/versions/20260414-152505.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-152505",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-152530.json b/docs/router-versions/versions/20260414-152530.json
new file mode 100644
index 0000000..c0e6e3f
--- /dev/null
+++ b/docs/router-versions/versions/20260414-152530.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-152530",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-152625.json b/docs/router-versions/versions/20260414-152625.json
new file mode 100644
index 0000000..e999388
--- /dev/null
+++ b/docs/router-versions/versions/20260414-152625.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-152625",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-152935.json b/docs/router-versions/versions/20260414-152935.json
new file mode 100644
index 0000000..264b6e5
--- /dev/null
+++ b/docs/router-versions/versions/20260414-152935.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-152935",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-152941.json b/docs/router-versions/versions/20260414-152941.json
new file mode 100644
index 0000000..839774c
--- /dev/null
+++ b/docs/router-versions/versions/20260414-152941.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-152941",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-155036.json b/docs/router-versions/versions/20260414-155036.json
new file mode 100644
index 0000000..7dc14d0
--- /dev/null
+++ b/docs/router-versions/versions/20260414-155036.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-155036",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-155251.json b/docs/router-versions/versions/20260414-155251.json
new file mode 100644
index 0000000..f7adf54
--- /dev/null
+++ b/docs/router-versions/versions/20260414-155251.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-155251",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-155350.json b/docs/router-versions/versions/20260414-155350.json
new file mode 100644
index 0000000..21756af
--- /dev/null
+++ b/docs/router-versions/versions/20260414-155350.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-155350",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-164944.json b/docs/router-versions/versions/20260414-164944.json
new file mode 100644
index 0000000..bbb009f
--- /dev/null
+++ b/docs/router-versions/versions/20260414-164944.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-164944",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165138.json b/docs/router-versions/versions/20260414-165138.json
new file mode 100644
index 0000000..eced56e
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165138.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165138",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165207.json b/docs/router-versions/versions/20260414-165207.json
new file mode 100644
index 0000000..f5e1e9d
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165207.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165207",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165241.json b/docs/router-versions/versions/20260414-165241.json
new file mode 100644
index 0000000..534781e
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165241.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165241",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165316.json b/docs/router-versions/versions/20260414-165316.json
new file mode 100644
index 0000000..5710405
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165316.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165316",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165359.json b/docs/router-versions/versions/20260414-165359.json
new file mode 100644
index 0000000..f5c2a67
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165359.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165359",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-165450.json b/docs/router-versions/versions/20260414-165450.json
new file mode 100644
index 0000000..31153a6
--- /dev/null
+++ b/docs/router-versions/versions/20260414-165450.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-165450",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-171516.json b/docs/router-versions/versions/20260414-171516.json
new file mode 100644
index 0000000..3e158f7
--- /dev/null
+++ b/docs/router-versions/versions/20260414-171516.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-171516",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-171623.json b/docs/router-versions/versions/20260414-171623.json
new file mode 100644
index 0000000..a132719
--- /dev/null
+++ b/docs/router-versions/versions/20260414-171623.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-171623",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-171651.json b/docs/router-versions/versions/20260414-171651.json
new file mode 100644
index 0000000..c34f0c5
--- /dev/null
+++ b/docs/router-versions/versions/20260414-171651.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-171651",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-171757.json b/docs/router-versions/versions/20260414-171757.json
new file mode 100644
index 0000000..65dd12d
--- /dev/null
+++ b/docs/router-versions/versions/20260414-171757.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-171757",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-173832.json b/docs/router-versions/versions/20260414-173832.json
new file mode 100644
index 0000000..725bfc5
--- /dev/null
+++ b/docs/router-versions/versions/20260414-173832.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-173832",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180027.json b/docs/router-versions/versions/20260414-180027.json
new file mode 100644
index 0000000..b58c8ab
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180027.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180027",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180106.json b/docs/router-versions/versions/20260414-180106.json
new file mode 100644
index 0000000..fd7ddda
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180106.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180106",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180343.json b/docs/router-versions/versions/20260414-180343.json
new file mode 100644
index 0000000..3348994
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180343.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180343",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180515.json b/docs/router-versions/versions/20260414-180515.json
new file mode 100644
index 0000000..67ade92
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180515.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180515",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180553.json b/docs/router-versions/versions/20260414-180553.json
new file mode 100644
index 0000000..dbe91d3
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180553.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180553",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180625.json b/docs/router-versions/versions/20260414-180625.json
new file mode 100644
index 0000000..98ff42d
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180625.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180625",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-180658.json b/docs/router-versions/versions/20260414-180658.json
new file mode 100644
index 0000000..accb2b3
--- /dev/null
+++ b/docs/router-versions/versions/20260414-180658.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-180658",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-182721.json b/docs/router-versions/versions/20260414-182721.json
new file mode 100644
index 0000000..e418761
--- /dev/null
+++ b/docs/router-versions/versions/20260414-182721.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-182721",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-182806.json b/docs/router-versions/versions/20260414-182806.json
new file mode 100644
index 0000000..ca80cf9
--- /dev/null
+++ b/docs/router-versions/versions/20260414-182806.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-182806",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-183024.json b/docs/router-versions/versions/20260414-183024.json
new file mode 100644
index 0000000..b502e71
--- /dev/null
+++ b/docs/router-versions/versions/20260414-183024.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-183024",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-183107.json b/docs/router-versions/versions/20260414-183107.json
new file mode 100644
index 0000000..37ba03e
--- /dev/null
+++ b/docs/router-versions/versions/20260414-183107.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-183107",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-185133.json b/docs/router-versions/versions/20260414-185133.json
new file mode 100644
index 0000000..df79a5d
--- /dev/null
+++ b/docs/router-versions/versions/20260414-185133.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-185133",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-185710.json b/docs/router-versions/versions/20260414-185710.json
new file mode 100644
index 0000000..11eeb76
--- /dev/null
+++ b/docs/router-versions/versions/20260414-185710.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-185710",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-185816.json b/docs/router-versions/versions/20260414-185816.json
new file mode 100644
index 0000000..8b98a8a
--- /dev/null
+++ b/docs/router-versions/versions/20260414-185816.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-185816",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-185837.json b/docs/router-versions/versions/20260414-185837.json
new file mode 100644
index 0000000..dd8c930
--- /dev/null
+++ b/docs/router-versions/versions/20260414-185837.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-185837",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-191901.json b/docs/router-versions/versions/20260414-191901.json
new file mode 100644
index 0000000..4739452
--- /dev/null
+++ b/docs/router-versions/versions/20260414-191901.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-191901",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-192109.json b/docs/router-versions/versions/20260414-192109.json
new file mode 100644
index 0000000..bff3115
--- /dev/null
+++ b/docs/router-versions/versions/20260414-192109.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-192109",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-194133.json b/docs/router-versions/versions/20260414-194133.json
new file mode 100644
index 0000000..b333bab
--- /dev/null
+++ b/docs/router-versions/versions/20260414-194133.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-194133",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-194158.json b/docs/router-versions/versions/20260414-194158.json
new file mode 100644
index 0000000..8d624b2
--- /dev/null
+++ b/docs/router-versions/versions/20260414-194158.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-194158",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-200220.json b/docs/router-versions/versions/20260414-200220.json
new file mode 100644
index 0000000..d82c9b1
--- /dev/null
+++ b/docs/router-versions/versions/20260414-200220.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-200220",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-200302.json b/docs/router-versions/versions/20260414-200302.json
new file mode 100644
index 0000000..b4e7d22
--- /dev/null
+++ b/docs/router-versions/versions/20260414-200302.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-200302",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-200458.json b/docs/router-versions/versions/20260414-200458.json
new file mode 100644
index 0000000..28f3df2
--- /dev/null
+++ b/docs/router-versions/versions/20260414-200458.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-200458",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-200616.json b/docs/router-versions/versions/20260414-200616.json
new file mode 100644
index 0000000..6364746
--- /dev/null
+++ b/docs/router-versions/versions/20260414-200616.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-200616",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-200738.json b/docs/router-versions/versions/20260414-200738.json
new file mode 100644
index 0000000..c71f1af
--- /dev/null
+++ b/docs/router-versions/versions/20260414-200738.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-200738",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-202805.json b/docs/router-versions/versions/20260414-202805.json
new file mode 100644
index 0000000..1f11f79
--- /dev/null
+++ b/docs/router-versions/versions/20260414-202805.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-202805",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-203008.json b/docs/router-versions/versions/20260414-203008.json
new file mode 100644
index 0000000..4d19849
--- /dev/null
+++ b/docs/router-versions/versions/20260414-203008.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-203008",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-203111.json b/docs/router-versions/versions/20260414-203111.json
new file mode 100644
index 0000000..b1107cb
--- /dev/null
+++ b/docs/router-versions/versions/20260414-203111.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-203111",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-203237.json b/docs/router-versions/versions/20260414-203237.json
new file mode 100644
index 0000000..77d5df4
--- /dev/null
+++ b/docs/router-versions/versions/20260414-203237.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-203237",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-203328.json b/docs/router-versions/versions/20260414-203328.json
new file mode 100644
index 0000000..7aae422
--- /dev/null
+++ b/docs/router-versions/versions/20260414-203328.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-203328",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-203401.json b/docs/router-versions/versions/20260414-203401.json
new file mode 100644
index 0000000..cf9911d
--- /dev/null
+++ b/docs/router-versions/versions/20260414-203401.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-203401",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205435.json b/docs/router-versions/versions/20260414-205435.json
new file mode 100644
index 0000000..6c6b636
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205435.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205435",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205607.json b/docs/router-versions/versions/20260414-205607.json
new file mode 100644
index 0000000..15dbbda
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205607.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205607",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205611.json b/docs/router-versions/versions/20260414-205611.json
new file mode 100644
index 0000000..066f12a
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205611.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205611",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205703.json b/docs/router-versions/versions/20260414-205703.json
new file mode 100644
index 0000000..06983c9
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205703.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205703",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205728.json b/docs/router-versions/versions/20260414-205728.json
new file mode 100644
index 0000000..8477afd
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205728.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205728",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-205805.json b/docs/router-versions/versions/20260414-205805.json
new file mode 100644
index 0000000..71603c3
--- /dev/null
+++ b/docs/router-versions/versions/20260414-205805.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-205805",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-211836.json b/docs/router-versions/versions/20260414-211836.json
new file mode 100644
index 0000000..dc012fc
--- /dev/null
+++ b/docs/router-versions/versions/20260414-211836.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-211836",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-212115.json b/docs/router-versions/versions/20260414-212115.json
new file mode 100644
index 0000000..85d6659
--- /dev/null
+++ b/docs/router-versions/versions/20260414-212115.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-212115",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-212202.json b/docs/router-versions/versions/20260414-212202.json
new file mode 100644
index 0000000..96564a3
--- /dev/null
+++ b/docs/router-versions/versions/20260414-212202.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-212202",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-212214.json b/docs/router-versions/versions/20260414-212214.json
new file mode 100644
index 0000000..6cacf77
--- /dev/null
+++ b/docs/router-versions/versions/20260414-212214.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-212214",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-214245.json b/docs/router-versions/versions/20260414-214245.json
new file mode 100644
index 0000000..013cf37
--- /dev/null
+++ b/docs/router-versions/versions/20260414-214245.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-214245",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-214448.json b/docs/router-versions/versions/20260414-214448.json
new file mode 100644
index 0000000..5e2da46
--- /dev/null
+++ b/docs/router-versions/versions/20260414-214448.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-214448",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220543.json b/docs/router-versions/versions/20260414-220543.json
new file mode 100644
index 0000000..bf4b3a8
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220543.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220543",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220544.json b/docs/router-versions/versions/20260414-220544.json
new file mode 100644
index 0000000..3192191
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220544.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220544",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220559.json b/docs/router-versions/versions/20260414-220559.json
new file mode 100644
index 0000000..0d6d8ec
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220559.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220559",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220819.json b/docs/router-versions/versions/20260414-220819.json
new file mode 100644
index 0000000..dcf6ec2
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220819.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220819",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220857.json b/docs/router-versions/versions/20260414-220857.json
new file mode 100644
index 0000000..8837b83
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220857.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220857",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220938.json b/docs/router-versions/versions/20260414-220938.json
new file mode 100644
index 0000000..a830b0f
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220938.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220938",
+  "weights": {
+    "clarify": {
+      "input_length": 6.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-220939.json b/docs/router-versions/versions/20260414-220939.json
new file mode 100644
index 0000000..ba0f02f
--- /dev/null
+++ b/docs/router-versions/versions/20260414-220939.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-220939",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-221015.json b/docs/router-versions/versions/20260414-221015.json
new file mode 100644
index 0000000..5696f93
--- /dev/null
+++ b/docs/router-versions/versions/20260414-221015.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-221015",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260414-221023.json b/docs/router-versions/versions/20260414-221023.json
new file mode 100644
index 0000000..d586cf7
--- /dev/null
+++ b/docs/router-versions/versions/20260414-221023.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260414-221023",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-012153.json b/docs/router-versions/versions/20260415-012153.json
new file mode 100644
index 0000000..e6d47c1
--- /dev/null
+++ b/docs/router-versions/versions/20260415-012153.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-012153",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-012533.json b/docs/router-versions/versions/20260415-012533.json
new file mode 100644
index 0000000..83e02d6
--- /dev/null
+++ b/docs/router-versions/versions/20260415-012533.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-012533",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-012918.json b/docs/router-versions/versions/20260415-012918.json
new file mode 100644
index 0000000..abe5867
--- /dev/null
+++ b/docs/router-versions/versions/20260415-012918.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-012918",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-013334.json b/docs/router-versions/versions/20260415-013334.json
new file mode 100644
index 0000000..14a4658
--- /dev/null
+++ b/docs/router-versions/versions/20260415-013334.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-013334",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-013636.json b/docs/router-versions/versions/20260415-013636.json
new file mode 100644
index 0000000..2e75c9c
--- /dev/null
+++ b/docs/router-versions/versions/20260415-013636.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-013636",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-014152.json b/docs/router-versions/versions/20260415-014152.json
new file mode 100644
index 0000000..bf7c6f4
--- /dev/null
+++ b/docs/router-versions/versions/20260415-014152.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-014152",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-015732.json b/docs/router-versions/versions/20260415-015732.json
new file mode 100644
index 0000000..5005a12
--- /dev/null
+++ b/docs/router-versions/versions/20260415-015732.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-015732",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-023117.json b/docs/router-versions/versions/20260415-023117.json
new file mode 100644
index 0000000..fceb022
--- /dev/null
+++ b/docs/router-versions/versions/20260415-023117.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-023117",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/router-versions/versions/20260415-023347.json b/docs/router-versions/versions/20260415-023347.json
new file mode 100644
index 0000000..cf75f78
--- /dev/null
+++ b/docs/router-versions/versions/20260415-023347.json
@@ -0,0 +1,35 @@
+{
+  "version_id": "20260415-023347",
+  "weights": {
+    "clarify": {
+      "input_length": 11.0,
+      "memory_count": 1.0,
+      "skill_count": 1.0,
+      "tool_count": 1.0,
+      "top_memory_confidence": 0.9500000000000001,
+      "top_skill_success_rate": 0.8999999999999998,
+      "top_tool_confidence": 0.9500000000000001,
+      "top_tool_risk": 0.0
+    }
+  },
+  "feature_keys": [
+    "input_length",
+    "memory_count",
+    "skill_count",
+    "tool_count",
+    "top_memory_confidence",
+    "top_skill_success_rate",
+    "top_tool_confidence",
+    "top_tool_risk"
+  ],
+  "metadata": {
+    "source": "online_learning",
+    "benchmark_summary": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/schemas/candidate_object.schema.json b/docs/schemas/candidate_object.schema.json
new file mode 100644
index 0000000..908d35f
--- /dev/null
+++ b/docs/schemas/candidate_object.schema.json
@@ -0,0 +1,86 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://memabra.local/schemas/candidate_object.schema.json",
+  "title": "CandidateObject",
+  "description": "Unified retrieval/routing candidate for memory, skill, or tool objects in memabra.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": [
+    "id",
+    "type",
+    "title",
+    "summary",
+    "triggers",
+    "cost",
+    "confidence",
+    "success_rate",
+    "freshness",
+    "risk",
+    "tags",
+    "source"
+  ],
+  "properties": {
+    "id": {
+      "type": "string",
+      "minLength": 1
+    },
+    "type": {
+      "type": "string",
+      "enum": ["memory", "skill", "tool"]
+    },
+    "title": {
+      "type": "string",
+      "minLength": 1
+    },
+    "summary": {
+      "type": "string",
+      "minLength": 1
+    },
+    "triggers": {
+      "type": "array",
+      "items": {"type": "string"},
+      "default": []
+    },
+    "cost": {
+      "type": "number",
+      "minimum": 0
+    },
+    "confidence": {
+      "type": "number",
+      "minimum": 0,
+      "maximum": 1
+    },
+    "success_rate": {
+      "type": "number",
+      "minimum": 0,
+      "maximum": 1
+    },
+    "freshness": {
+      "type": "number",
+      "minimum": 0,
+      "maximum": 1
+    },
+    "risk": {
+      "type": "number",
+      "minimum": 0,
+      "maximum": 1
+    },
+    "embedding_ref": {
+      "type": ["string", "null"]
+    },
+    "tags": {
+      "type": "array",
+      "items": {"type": "string"},
+      "default": []
+    },
+    "source": {
+      "type": "string",
+      "enum": ["user", "system", "generated", "external"]
+    },
+    "type_payload": {
+      "type": "object",
+      "description": "Type-specific metadata, kept in its own object so that memory, skill, and tool fields are not flattened into the shared schema.",
+      "default": {}
+    }
+  }
+}
\ No newline at end of file
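
The CandidateObject schema above can be exercised without a validator library. The following is a minimal sketch, assuming plain Python dictionaries. `check_candidate` and the sample values are hypothetical and not part of memabra, and the check covers only required fields, enums, and numeric ranges rather than full JSON Schema semantics.

```python
# Hypothetical helper: a hand-rolled subset of the CandidateObject rules.
REQUIRED = {
    "id", "type", "title", "summary", "triggers", "cost", "confidence",
    "success_rate", "freshness", "risk", "tags", "source",
}
OPTIONAL = {"embedding_ref", "type_payload"}
TYPES = {"memory", "skill", "tool"}
SOURCES = {"user", "system", "generated", "external"}
UNIT_INTERVAL = ("confidence", "success_rate", "freshness", "risk")


def is_number(value) -> bool:
    """JSON numbers only; bool is excluded although it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_candidate(obj: dict) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    errors = []
    missing = REQUIRED - obj.keys()
    if missing:
        errors.append(f"missing required fields: {sorted(missing)}")
    # The schema sets additionalProperties to false, so unknown keys are errors.
    extra = obj.keys() - REQUIRED - OPTIONAL
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")
    if "type" in obj and obj["type"] not in TYPES:
        errors.append("type must be one of memory, skill, tool")
    if "source" in obj and obj["source"] not in SOURCES:
        errors.append("source must be one of user, system, generated, external")
    for key in UNIT_INTERVAL:
        if key in obj and not (is_number(obj[key]) and 0 <= obj[key] <= 1):
            errors.append(f"{key} must be a number in [0, 1]")
    if "cost" in obj and not (is_number(obj["cost"]) and obj["cost"] >= 0):
        errors.append("cost must be a non-negative number")
    return errors


# Sample values are invented for illustration only.
example = {
    "id": "tool-001",
    "type": "tool",
    "title": "Web search",
    "summary": "Looks up current information.",
    "triggers": ["latest", "news"],
    "cost": 0.2,
    "confidence": 0.95,
    "success_rate": 0.9,
    "freshness": 1.0,
    "risk": 0.0,
    "embedding_ref": None,
    "tags": [],
    "source": "system",
    "type_payload": {},
}

print(check_candidate(example))  # prints []
```

For full conformance, a complete JSON Schema validator that supports draft 2020-12 would be the safer choice; this sketch only illustrates the shape of a valid object.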
diff --git a/docs/schemas/event.schema.json b/docs/schemas/event.schema.json
new file mode 100644
index 0000000..8450951
--- /dev/null
+++ b/docs/schemas/event.schema.json
@@ -0,0 +1,57 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://memabra.local/schemas/event.schema.json",
+  "title": "Event",
+  "description": "Atomic event emitted during the retrieval, policy (routing), execution, evaluation, or memory-writeback stage of a memabra task.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": [
+    "event_id",
+    "trajectory_id",
+    "timestamp",
+    "stage",
+    "event_type",
+    "payload"
+  ],
+  "properties": {
+    "event_id": {"type": "string", "minLength": 1},
+    "trajectory_id": {"type": "string", "minLength": 1},
+    "timestamp": {"type": "string", "format": "date-time"},
+    "stage": {
+      "type": "string",
+      "enum": ["retrieval", "policy", "execution", "evaluation", "memory_writeback"]
+    },
+    "event_type": {
+      "type": "string",
+      "enum": [
+        "task_received",
+        "context_summarized",
+        "candidates_recalled",
+        "candidate_scored",
+        "action_selected",
+        "tool_called",
+        "tool_result",
+        "skill_loaded",
+        "memory_injected",
+        "user_clarified",
+        "user_corrected",
+        "reward_computed",
+        "memory_written",
+        "memory_revoked",
+        "task_completed",
+        "task_failed"
+      ]
+    },
+    "payload": {
+      "type": "object",
+      "description": "Event-specific structured body"
+    },
+    "metrics": {
+      "type": "object",
+      "default": {}
+    },
+    "parent_event_id": {
+      "type": ["string", "null"]
+    }
+  }
+}
\ No newline at end of file
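
As a rough illustration of the Event schema above, the sketch below builds an event dictionary with the six required fields plus the two optional ones. `make_event` is a hypothetical helper, not a memabra API; the UUID-based `event_id` and the UTC timestamp are assumptions chosen only because they satisfy the schema's non-empty string and `date-time` constraints.

```python
import uuid
from datetime import datetime, timezone

# Enum values copied from the schema's "stage" and "event_type" properties.
STAGES = {"retrieval", "policy", "execution", "evaluation", "memory_writeback"}
EVENT_TYPES = {
    "task_received", "context_summarized", "candidates_recalled",
    "candidate_scored", "action_selected", "tool_called", "tool_result",
    "skill_loaded", "memory_injected", "user_clarified", "user_corrected",
    "reward_computed", "memory_written", "memory_revoked",
    "task_completed", "task_failed",
}


def make_event(trajectory_id, stage, event_type, payload, parent_event_id=None):
    """Build a dict shaped like the Event schema (hypothetical helper)."""
    if not trajectory_id:
        raise ValueError("trajectory_id must be a non-empty string")
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type!r}")
    if not isinstance(payload, dict):
        raise ValueError("payload must be an object (dict)")
    return {
        "event_id": uuid.uuid4().hex,  # assumption: any non-empty string is valid
        "trajectory_id": trajectory_id,
        # isoformat() on an aware datetime yields an RFC 3339 style timestamp.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "event_type": event_type,
        "payload": payload,
        "metrics": {},
        "parent_event_id": parent_event_id,
    }


event = make_event("traj-1", "policy", "action_selected", {"action": "clarify"})
print(sorted(event))
```

Rejecting unknown `stage` and `event_type` values at construction time mirrors the schema's enums, so a malformed event fails before it is written anywhere.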
diff --git a/docs/schemas/memory_record.schema.json b/docs/schemas/memory_record.schema.json
new file mode 100644
index 0000000..ec402c7
--- /dev/null
+++ b/docs/schemas/memory_record.schema.json
@@ -0,0 +1,75 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://memabra.local/schemas/memory_record.schema.json",
+  "title": "MemoryRecord",
+  "description": "Long-term memory record stored by memabra, with an explicit memory type (layer), fact status, and verification metadata.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": [
+    "id",
+    "memory_type",
+    "fact_status",
+    "content",
+    "summary",
+    "source",
+    "confidence",
+    "created_at",
+    "updated_at",
+    "verification"
+  ],
+  "properties": {
+    "id": {"type": "string", "minLength": 1},
+    "memory_type": {
+      "type": "string",
+      "enum": ["semantic", "procedural", "episodic", "working"]
+    },
+    "fact_status": {
+      "type": "string",
+      "enum": ["draft", "assumed", "verified", "deprecated", "revoked"]
+    },
+    "content": {"type": "string", "minLength": 1},
+    "summary": {"type": "string", "minLength": 1},
+    "source": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["kind", "ref"],
+      "properties": {
+        "kind": {"type": "string", "enum": ["user", "session", "tool", "import", "system"]},
+        "ref": {"type": "string", "minLength": 1}
+      }
+    },
+    "confidence": {"type": "number", "minimum": 0, "maximum": 1},
+    "tags": {
+      "type": "array",
+      "items": {"type": "string"},
+      "default": []
+    },
+    "related_entities": {
+      "type": "array",
+      "items": {"type": "string"},
+      "default": []
+    },
+    "created_at": {"type": "string", "format": "date-time"},
+    "updated_at": {"type": "string", "format": "date-time"},
+    "last_used_at": {"type": ["string", "null"], "format": "date-time"},
+    "expires_at": {"type": ["string", "null"], "format": "date-time"},
+    "verification": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["status", "last_checked_at", "check_method"],
+      "properties": {
+        "status": {"type": "string", "enum": ["unknown", "pending", "confirmed", "disputed", "failed"]},
+        "last_checked_at": {"type": ["string", "null"], "format": "date-time"},
+        "check_method": {"type": ["string", "null"]}
+      }
+    },
+    "revocation": {
+      "type": ["object", "null"],
+      "additionalProperties": false,
+      "properties": {
+        "reason": {"type": "string"},
+        "revoked_at": {"type": "string", "format": "date-time"}
+      }
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/schemas/trajectory.schema.json b/docs/schemas/trajectory.schema.json
new file mode 100644
index 0000000..61d2218
--- /dev/null
+++ b/docs/schemas/trajectory.schema.json
@@ -0,0 +1,121 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://memabra.local/schemas/trajectory.schema.json",
+  "title": "Trajectory",
+  "description": "Replayable task-level trace for routing and learning in memabra.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": [
+    "trajectory_id",
+    "task",
+    "context_snapshot",
+    "candidate_sets",
+    "decisions",
+    "events",
+    "outcome",
+    "reward"
+  ],
+  "properties": {
+    "trajectory_id": {"type": "string", "minLength": 1},
+    "task": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["task_id", "input", "channel", "created_at"],
+      "properties": {
+        "task_id": {"type": "string", "minLength": 1},
+        "input": {"type": "string", "minLength": 1},
+        "channel": {"type": "string", "minLength": 1},
+        "created_at": {"type": "string", "format": "date-time"},
+        "user_id": {"type": ["string", "null"]}
+      }
+    },
+    "context_snapshot": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["conversation_summary", "environment_summary"],
+      "properties": {
+        "conversation_summary": {"type": "string"},
+        "environment_summary": {"type": "string"},
+        "recent_failures": {
+          "type": "array",
+          "items": {"type": "string"},
+          "default": []
+        }
+      }
+    },
+    "candidate_sets": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["memory", "skill", "tool"],
+      "properties": {
+        "memory": {"type": "array", "items": {"$ref": "candidate_object.schema.json"}},
+        "skill": {"type": "array", "items": {"$ref": "candidate_object.schema.json"}},
+        "tool": {"type": "array", "items": {"$ref": "candidate_object.schema.json"}}
+      }
+    },
+    "decisions": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "additionalProperties": false,
+        "required": ["step", "decision_type", "selected_ids", "rationale"],
+        "properties": {
+          "step": {"type": "integer", "minimum": 1},
+          "decision_type": {
+            "type": "string",
+            "enum": ["direct_answer", "inject_memory", "load_skill", "call_tool", "clarify", "composite_action"]
+          },
+          "selected_ids": {
+            "type": "array",
+            "items": {"type": "string"}
+          },
+          "rejected_ids": {
+            "type": "array",
+            "items": {"type": "string"},
+            "default": []
+          },
+          "rationale": {"type": "string"},
+          "estimated_cost": {"type": ["number", "null"]}
+        }
+      }
+    },
+    "events": {
+      "type": "array",
+      "items": {"$ref": "event.schema.json"}
+    },
+    "outcome": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["status", "steps", "latency_ms", "user_corrections"],
+      "properties": {
+        "status": {"type": "string", "enum": ["success", "partial_success", "failure"]},
+        "steps": {"type": "integer", "minimum": 0},
+        "latency_ms": {"type": "integer", "minimum": 0},
+        "user_corrections": {"type": "integer", "minimum": 0},
+        "tool_errors": {"type": "integer", "minimum": 0},
+        "notes": {"type": ["string", "null"]}
+      }
+    },
+    "reward": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["total", "components"],
+      "properties": {
+        "total": {"type": "number"},
+        "components": {
+          "type": "object",
+          "required": ["task_success", "retrieval_hit", "tool_error", "user_correction", "latency", "context_cost", "useful_reuse"],
+          "properties": {
+            "task_success": {"type": "number"},
+            "retrieval_hit": {"type": "number"},
+            "tool_error": {"type": "number"},
+            "user_correction": {"type": "number"},
+            "latency": {"type": "number"},
+            "context_cost": {"type": "number"},
+            "useful_reuse": {"type": "number"}
+          }
+        }
+      }
+    }
+  }
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0036bcfb-88dc-4636-897e-89fc909a810e.json b/docs/training-reports/report-0036bcfb-88dc-4636-897e-89fc909a810e.json
new file mode 100644
index 0000000..b2865d7
--- /dev/null
+++ b/docs/training-reports/report-0036bcfb-88dc-4636-897e-89fc909a810e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0036bcfb-88dc-4636-897e-89fc909a810e",
+  "timestamp": "2026-04-14T16:51:38.314846+00:00",
+  "source_trajectory_ids": [
+    "traj-22d17281-9e5c-435d-852e-fa646d15afc4",
+    "traj-29a77a54-36ed-4885-b77f-ffc131425d2c",
+    "traj-40bce4b3-20ba-47ab-ac8d-4f3c494bffd1",
+    "traj-6ce2c5e5-6d58-439a-82ec-21f77f6de860",
+    "traj-76480a70-fbe1-4481-848b-a7e8d37643f5",
+    "traj-9a588dc5-9ef2-4290-8712-0b31946536a2",
+    "traj-b43b4a4e-4dfb-4ba9-8c56-29ea09e00e17",
+    "traj-ba03c72c-b782-400f-a9b1-4a4f6c0d7769",
+    "traj-be3bf833-bc49-4852-9ea2-ca04aeea8f31",
+    "traj-ebafcf74-923e-4af1-b64d-45c7cdbb4b04"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-009a9d41-ba23-4e38-85ad-cd6af5971d8b.json b/docs/training-reports/report-009a9d41-ba23-4e38-85ad-cd6af5971d8b.json
new file mode 100644
index 0000000..aab2739
--- /dev/null
+++ b/docs/training-reports/report-009a9d41-ba23-4e38-85ad-cd6af5971d8b.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-009a9d41-ba23-4e38-85ad-cd6af5971d8b",
+  "timestamp": "2026-04-14T19:41:33.462482+00:00",
+  "source_trajectory_ids": [
+    "traj-0e089eaf-e132-405d-992f-a912f6baaaea",
+    "traj-2881966c-ad32-44c8-9c05-a50b0a2b784c",
+    "traj-5b351fac-7019-4807-a18a-c66b1c95c3e0",
+    "traj-6ef4ff84-d199-4864-8c99-6cd9efded1c6",
+    "traj-a62f2760-76ec-41f9-a20a-3ba8912c7c55",
+    "traj-b50b6662-ff12-4ec6-a112-c56e989bd768",
+    "traj-c1683dc5-e3d0-4421-aad7-fa42581096b2",
+    "traj-c3c9bd98-8c59-4cc4-8ad9-6d7b3c0be987",
+    "traj-d9a4fcc7-e929-48e6-8153-5d5c9c04f798",
+    "traj-e357c149-301f-4826-8812-6a1dab9087bd"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-025f0317-eb57-4357-a944-57c83e768e2b.json b/docs/training-reports/report-025f0317-eb57-4357-a944-57c83e768e2b.json
new file mode 100644
index 0000000..d970059
--- /dev/null
+++ b/docs/training-reports/report-025f0317-eb57-4357-a944-57c83e768e2b.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-025f0317-eb57-4357-a944-57c83e768e2b",
+  "timestamp": "2026-04-14T20:54:35.912785+00:00",
+  "source_trajectory_ids": [
+    "traj-0a386589-4f3d-4427-8bb6-984395bc391e",
+    "traj-0f04b540-f8bb-46d7-aeb4-ea65a723b82e",
+    "traj-14e30ab1-29e9-4356-ae7e-a8cea48c0b60",
+    "traj-4143c1db-ac63-4bc9-b427-a8f4d64c63f8",
+    "traj-43de5dee-3e20-42cf-91c1-2371b2f31329",
+    "traj-55766bd5-37dc-4216-9e29-3aea0a8a5095",
+    "traj-745e3299-1fd3-4af8-b6e2-4ebc4a47d389",
+    "traj-7c938b98-8346-48f8-a676-adb2e72e7259",
+    "traj-7dd2e59b-f65f-4870-b09e-69b95438b57b",
+    "traj-ab956b24-6aaa-49a2-8841-544cf9555959"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0335fde2-290a-4346-91b0-d1224cb1253f.json b/docs/training-reports/report-0335fde2-290a-4346-91b0-d1224cb1253f.json
new file mode 100644
index 0000000..1cb03da
--- /dev/null
+++ b/docs/training-reports/report-0335fde2-290a-4346-91b0-d1224cb1253f.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-0335fde2-290a-4346-91b0-d1224cb1253f",
+  "timestamp": "2026-04-14T15:26:25.556320+00:00",
+  "source_trajectory_ids": [
+    "traj-0ccf1900-1e3b-4465-8f02-c51d07d7934c",
+    "traj-22a75db4-1794-4b10-ba4f-61539ae28352",
+    "traj-2ec475f3-4500-4c56-b317-ddb692e6eae5",
+    "traj-35007253-de45-43f6-a64c-121230ae0e1f",
+    "traj-3d3548d3-1981-46ad-be73-33b0420e58f4",
+    "traj-4d5bb70e-9529-4c2c-bb5b-da7f7d09f1f4",
+    "traj-5a663b45-d37f-489f-a403-6dd73d7b2b52",
+    "traj-bdbf6fab-cccd-4381-b3dc-ee7533b5be0e",
+    "traj-c4ea76f2-4403-430f-8821-91f14822e41f",
+    "traj-f2bf1402-39da-4ec2-97ed-b9349ca87581"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-152625"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-04b8cf41-45f2-4870-ba8b-b509f7d3da48.json b/docs/training-reports/report-04b8cf41-45f2-4870-ba8b-b509f7d3da48.json
new file mode 100644
index 0000000..326dc24
--- /dev/null
+++ b/docs/training-reports/report-04b8cf41-45f2-4870-ba8b-b509f7d3da48.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-04b8cf41-45f2-4870-ba8b-b509f7d3da48",
+  "timestamp": "2026-04-14T18:58:37.161636+00:00",
+  "source_trajectory_ids": [
+    "traj-10fd1aac-8da8-4f5d-be73-feea5fb4e60d",
+    "traj-4b7226f5-e3ed-47de-b0bb-febcad399f82",
+    "traj-6364e000-05f1-4de2-b018-090d2dd922bf",
+    "traj-6ea75734-5be4-4d8c-b5c2-88d971a12763",
+    "traj-7e17a2ac-0aaf-49a8-aed5-552ce80dcfc8",
+    "traj-92c21045-ff0f-4ad0-855f-307a9f509ef7",
+    "traj-b8cefeb5-17ae-4be4-a756-a4c9c453d3c2",
+    "traj-c6a2dba6-dd4f-4c9f-9455-4fb3db4d44b1",
+    "traj-dcac6477-8278-43ab-8efe-226cc8acdeaf",
+    "traj-eec22ce4-682b-4694-bdb8-657a84c4a76c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-07a477c9-2b2f-4505-a392-5dce58b67829.json b/docs/training-reports/report-07a477c9-2b2f-4505-a392-5dce58b67829.json
new file mode 100644
index 0000000..dc34954
--- /dev/null
+++ b/docs/training-reports/report-07a477c9-2b2f-4505-a392-5dce58b67829.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-07a477c9-2b2f-4505-a392-5dce58b67829",
+  "timestamp": "2026-04-14T18:01:06.160145+00:00",
+  "source_trajectory_ids": [
+    "traj-04f74afc-d341-4f63-b5ab-32f6d0fb33fb",
+    "traj-1fce1f44-c31b-4143-a0af-05b14783299c",
+    "traj-5fd71ba8-a8ed-4c52-bd5c-3dc0196b954a",
+    "traj-66dbaed9-42ee-4736-bed9-2a7d8260b81e",
+    "traj-89ceb3cf-0bfa-477b-b80c-76392bc7e9db",
+    "traj-8a7a589b-422d-4c51-b209-1f8a28bbe624",
+    "traj-b885ff21-6df2-4ea2-a39e-bb47a5aca56e",
+    "traj-df5f18eb-825e-4c80-b047-f9798bbeb654",
+    "traj-f47da721-1886-4466-b389-32ef359b58e6",
+    "traj-f673ef5f-700a-4dc8-b5ce-0ae2d3ebeeab"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180106"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0856c8c4-bc0a-402d-8e4c-2e946029226b.json b/docs/training-reports/report-0856c8c4-bc0a-402d-8e4c-2e946029226b.json
new file mode 100644
index 0000000..ee955bc
--- /dev/null
+++ b/docs/training-reports/report-0856c8c4-bc0a-402d-8e4c-2e946029226b.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0856c8c4-bc0a-402d-8e4c-2e946029226b",
+  "timestamp": "2026-04-14T21:22:02.802049+00:00",
+  "source_trajectory_ids": [
+    "traj-29cd218f-e9b2-487d-ab34-620450a27cf7",
+    "traj-41331e52-1bb0-48c7-a65e-2749f3341018",
+    "traj-42b96e93-b37e-4518-84ce-90b243a4a9e2",
+    "traj-42c90394-8ac2-4a7d-8c12-4b4a78ab7a87",
+    "traj-47a0fae8-60f3-4a1a-90a1-0f643e2d9920",
+    "traj-7deb603d-b31e-4625-abaf-344ec12efe44",
+    "traj-dc35bca8-1bca-442a-93a1-4d77e360aba0",
+    "traj-e135ebd2-c850-4f9b-a6df-24b1d7eff190",
+    "traj-e87835bb-ba03-453e-8a50-49ddeeb7268d",
+    "traj-f1557075-9c9a-4f2a-bc48-6a919a379ae0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212202",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-08ef866e-b477-4e72-a32c-30003d2b91e9.json b/docs/training-reports/report-08ef866e-b477-4e72-a32c-30003d2b91e9.json
new file mode 100644
index 0000000..965e996
--- /dev/null
+++ b/docs/training-reports/report-08ef866e-b477-4e72-a32c-30003d2b91e9.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-08ef866e-b477-4e72-a32c-30003d2b91e9",
+  "timestamp": "2026-04-14T20:54:35.812564+00:00",
+  "source_trajectory_ids": [
+    "traj-0c9390f7-31ef-48fa-896b-093f9cd4c0ce",
+    "traj-1822f88a-0a09-4536-8022-24a7a73ba6df",
+    "traj-28ba821c-6a0c-4d40-a008-14497585c3d7",
+    "traj-483b03b2-41ee-4228-9d26-bb4e45eb241c",
+    "traj-73718bd7-97ad-424e-a049-e1ecc05ad770",
+    "traj-b4089209-fcf2-4139-9ed1-e9db5caaff69",
+    "traj-c097a470-093d-4dd1-a6c0-d21110fea346",
+    "traj-fc0408dc-5429-4af8-87d0-8c6212ae2623",
+    "traj-fd8f8a32-d9cb-4bda-a70b-bd95e604e037",
+    "traj-fe7cd38e-3e5d-4ae3-9ee0-6a2c0caafb2b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205435",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-09798c98-3bcb-4298-a546-2f531f875853.json b/docs/training-reports/report-09798c98-3bcb-4298-a546-2f531f875853.json
new file mode 100644
index 0000000..121baa6
--- /dev/null
+++ b/docs/training-reports/report-09798c98-3bcb-4298-a546-2f531f875853.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-09798c98-3bcb-4298-a546-2f531f875853",
+  "timestamp": "2026-04-14T22:10:23.222959+00:00",
+  "source_trajectory_ids": [
+    "traj-221f0c59-ad6b-4526-ae14-b5bb558b01ca",
+    "traj-24cbe596-6ee7-444a-b600-32d9b55422db",
+    "traj-2c5a6c34-df10-411d-9709-2a2e07cfca5e",
+    "traj-58e6dcd6-c688-4a66-a1a8-2fc64b06452a",
+    "traj-6a713624-ac97-42db-9946-9919da454d47",
+    "traj-813e4f86-4aab-420b-86fc-8a8694670c84",
+    "traj-a4c8fee4-428b-471c-8634-05d09c430b32",
+    "traj-caa40a45-47ec-4369-9519-ebdb038d5d6a",
+    "traj-d0d72631-39d8-47d7-83a5-76f424553eca",
+    "traj-d5dde094-d5e7-465d-8d7d-0f55356ae159"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221023",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-097c5767-cf9b-42f5-9f8b-dee8a6224a67.json b/docs/training-reports/report-097c5767-cf9b-42f5-9f8b-dee8a6224a67.json
new file mode 100644
index 0000000..dce1d39
--- /dev/null
+++ b/docs/training-reports/report-097c5767-cf9b-42f5-9f8b-dee8a6224a67.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-097c5767-cf9b-42f5-9f8b-dee8a6224a67",
+  "timestamp": "2026-04-14T21:42:45.782035+00:00",
+  "source_trajectory_ids": [
+    "traj-1212bad2-f0fe-4d95-afd8-2a711775ccfb",
+    "traj-4d53da5b-5a10-4689-8dd9-0a1b2fa74083",
+    "traj-50942176-422b-4653-9477-48e1d16c0d34",
+    "traj-6d886402-bc74-4c9e-998e-ad9e4177b08d",
+    "traj-7d91fc48-dd1d-4c40-9785-5ebef05378a4",
+    "traj-8c012adb-959e-4eab-ac6c-3c5c4854720c",
+    "traj-8df099f2-5180-4ac1-8519-3204b9cffe07",
+    "traj-9c21877b-f093-44cc-af53-9b2961a4dd46",
+    "traj-ff0df310-0704-4619-9346-b27f1df0f237",
+    "traj-ffb39e69-6327-4985-8858-730a3c00a806"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214245",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-09ac51b7-988c-4b9c-ba38-3511d728c61d.json b/docs/training-reports/report-09ac51b7-988c-4b9c-ba38-3511d728c61d.json
new file mode 100644
index 0000000..f4050d3
--- /dev/null
+++ b/docs/training-reports/report-09ac51b7-988c-4b9c-ba38-3511d728c61d.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-09ac51b7-988c-4b9c-ba38-3511d728c61d",
+  "timestamp": "2026-04-14T22:10:23.371730+00:00",
+  "source_trajectory_ids": [
+    "traj-00565435-0e10-46c5-82bd-3ba97f356fb2",
+    "traj-0249861e-b2f3-4e73-8e38-0599d0b7e8f0",
+    "traj-283bbc3b-5f88-48de-87b8-192418f70445",
+    "traj-4c674844-5148-4428-8607-22ae4ad7361d",
+    "traj-5689e1e1-0a03-45e8-b006-c692325fcc45",
+    "traj-a809b624-6317-4ce7-b809-4ab3479566ee",
+    "traj-b1e3a397-8678-436c-a9ed-17ea168c203a",
+    "traj-b87b739b-6daa-49a8-a7de-8d4509659328",
+    "traj-bc0076e9-0167-44ac-89ae-634b02890cb5",
+    "traj-d42ef1c3-cd8f-4474-a5d6-f0a42ff0a2f3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0a3e40be-b389-4041-bab7-cd99e4c8eac0.json b/docs/training-reports/report-0a3e40be-b389-4041-bab7-cd99e4c8eac0.json
new file mode 100644
index 0000000..e27d86a
--- /dev/null
+++ b/docs/training-reports/report-0a3e40be-b389-4041-bab7-cd99e4c8eac0.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-0a3e40be-b389-4041-bab7-cd99e4c8eac0",
+  "timestamp": "2026-04-14T20:07:38.841838+00:00",
+  "source_trajectory_ids": [
+    "traj-04eb60db-62bd-46c2-afe3-ecba6eac900a",
+    "traj-26387046-6b12-4841-9129-735599f13261",
+    "traj-34b22e88-a95e-4f12-84e5-da9af52a7381",
+    "traj-4f4eb7ad-1d11-4852-adde-eab50619c2bc",
+    "traj-51ecc36d-be08-4bcf-b645-7553e9b03992",
+    "traj-6b9f4f38-dc89-4abd-870e-c48c92d2b40e",
+    "traj-82fb2e11-fb35-4960-90d2-b2e53a1ea2ed",
+    "traj-ba5fa9da-693f-4a36-ab0f-c2efbe798ece",
+    "traj-c9c3403c-2ff3-4d84-85dd-731620583118",
+    "traj-ede4f925-e445-4cae-a3ba-0d30973294ae"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0a675757-4870-4b12-98fb-ab093889eff3.json b/docs/training-reports/report-0a675757-4870-4b12-98fb-ab093889eff3.json
new file mode 100644
index 0000000..e9f69f4
--- /dev/null
+++ b/docs/training-reports/report-0a675757-4870-4b12-98fb-ab093889eff3.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-0a675757-4870-4b12-98fb-ab093889eff3",
+  "timestamp": "2026-04-15T01:57:32.814873+00:00",
+  "source_trajectory_ids": [
+    "traj-1119ae3b-8cb6-4391-b283-bfeefdd12afe",
+    "traj-11afd403-d5d6-4af6-85b8-b015ed5bb1d3",
+    "traj-5c68a94c-d276-4e0f-9356-b99b6163e4e5",
+    "traj-5cdcaf3b-851f-45da-9fe1-253060428059",
+    "traj-5fd13613-7463-4969-ab53-c2e7e8555df3",
+    "traj-617478e4-408a-4ed6-a06d-b84da7be94b1",
+    "traj-a3a5c39f-592b-4b4b-94e9-59f63039e53e",
+    "traj-b5ffa504-3f59-4a75-8c10-1f4f5b5aa8c8",
+    "traj-c4e73e5c-5b4d-4f4c-a209-3f6147263622",
+    "traj-cccb6c09-84cd-4247-aa40-9ec6e0a9f1bd"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0b64fe15-dd10-4f78-916b-200ec6483fcd.json b/docs/training-reports/report-0b64fe15-dd10-4f78-916b-200ec6483fcd.json
new file mode 100644
index 0000000..d0baa1b
--- /dev/null
+++ b/docs/training-reports/report-0b64fe15-dd10-4f78-916b-200ec6483fcd.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0b64fe15-dd10-4f78-916b-200ec6483fcd",
+  "timestamp": "2026-04-14T14:59:48.944796+00:00",
+  "source_trajectory_ids": [
+    "traj-05a12459-be9e-484e-9a58-83b465e24092",
+    "traj-0924a001-1055-4126-b241-bcdd2c078494",
+    "traj-17ae339d-c886-414e-94b3-5e570093c8e4",
+    "traj-18e10e43-4694-43e1-be9a-f16fdf123e35",
+    "traj-5460af6d-b2c1-4a71-aaff-0060c05a4421",
+    "traj-5517734f-c4fe-497e-a402-aa5228395d34",
+    "traj-59b1a050-9be6-4aca-bbfb-0e1da246da2d",
+    "traj-7094a080-3592-4e33-9ba3-f32fdbe02e76",
+    "traj-8f62ae27-0a61-43bf-a8e6-7b73e9a1c888",
+    "traj-967b670c-429a-4be2-a8c2-ec341ff3106e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0c59209d-fc75-4b15-bcdb-239138c12b79.json b/docs/training-reports/report-0c59209d-fc75-4b15-bcdb-239138c12b79.json
new file mode 100644
index 0000000..411c092
--- /dev/null
+++ b/docs/training-reports/report-0c59209d-fc75-4b15-bcdb-239138c12b79.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0c59209d-fc75-4b15-bcdb-239138c12b79",
+  "timestamp": "2026-04-15T01:41:52.505512+00:00",
+  "source_trajectory_ids": [
+    "traj-0ddf19dd-e828-4035-bd6e-29e627769d2e",
+    "traj-1c185db1-61db-4f0c-b069-8e8e4ced92e7",
+    "traj-23b5c08b-2fca-4f55-84a6-7068af698780",
+    "traj-3f243fab-5841-41e6-acf6-f8c9f40cf515",
+    "traj-713b43da-4b0f-4fbe-b190-9b508d1244f0",
+    "traj-9764ffe9-c580-4b3f-88a3-beead04a1df3",
+    "traj-a6e4d148-6744-4fda-a1bf-26603166117c",
+    "traj-cc7aa6c3-de72-4bfd-83fc-761eaa8cc8a7",
+    "traj-d12e9387-0fd2-4a4f-b387-3cd58cbf12f4",
+    "traj-fdb7911a-4cc0-4906-b5bc-658be058653e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-014152",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0cb5da05-94e7-4f55-b759-1338cebaf5fd.json b/docs/training-reports/report-0cb5da05-94e7-4f55-b759-1338cebaf5fd.json
new file mode 100644
index 0000000..1edd189
--- /dev/null
+++ b/docs/training-reports/report-0cb5da05-94e7-4f55-b759-1338cebaf5fd.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0cb5da05-94e7-4f55-b759-1338cebaf5fd",
+  "timestamp": "2026-04-14T21:22:02.988940+00:00",
+  "source_trajectory_ids": [
+    "traj-2704df3d-419c-4206-af3c-afd5466b305c",
+    "traj-49134729-36f2-467c-83ac-da261acc561b",
+    "traj-7adf97fd-549f-4123-9340-5b49f024f6d7",
+    "traj-852b116d-998b-48fd-aa71-293f8b31c6e4",
+    "traj-92a1293d-5d98-4efb-88ba-125ca308d246",
+    "traj-ab7f2171-8a7c-4315-a80f-fb88168b794e",
+    "traj-c3788641-46cf-43fe-bb92-4b3871f1b20e",
+    "traj-cf3e5d99-8b30-4877-a93d-481968801eaf",
+    "traj-e051963a-bada-41f6-9fc5-4ba429d136c9",
+    "traj-f5d94b37-3215-4a5f-9528-8b13dd9f4ceb"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212202",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-0e0b0f65-2073-445d-8b24-753642e15b88.json b/docs/training-reports/report-0e0b0f65-2073-445d-8b24-753642e15b88.json
new file mode 100644
index 0000000..2b74bb3
--- /dev/null
+++ b/docs/training-reports/report-0e0b0f65-2073-445d-8b24-753642e15b88.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-0e0b0f65-2073-445d-8b24-753642e15b88",
+  "timestamp": "2026-04-15T02:31:17.525516+00:00",
+  "source_trajectory_ids": [
+    "traj-22f50846-2268-4d3f-94ab-cef4813aa471",
+    "traj-34c1633d-c680-49ae-81de-f9c4942f3d1f",
+    "traj-3fb919b1-e148-4764-9197-aea2c313ec6a",
+    "traj-4dd3e06b-8c8d-41c2-a06a-006f024b868a",
+    "traj-6b505d4e-b7da-4f0d-81a5-1c37f89ca93e",
+    "traj-84925606-3ef8-47bc-8f58-97a22083b6ad",
+    "traj-d532be68-7050-4ec1-bf21-06085d8894f9",
+    "traj-e6eec076-e362-4dc1-8444-af9f6bb659b2",
+    "traj-f273ccb2-a6d6-40dd-836f-e6835a7aa55c",
+    "traj-ff6a2137-d4ab-406a-b43a-d2741f6dd91b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023117",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-10a2f403-eca3-4ec5-ac0e-8e907322679d.json b/docs/training-reports/report-10a2f403-eca3-4ec5-ac0e-8e907322679d.json
new file mode 100644
index 0000000..04bf5bf
--- /dev/null
+++ b/docs/training-reports/report-10a2f403-eca3-4ec5-ac0e-8e907322679d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-10a2f403-eca3-4ec5-ac0e-8e907322679d",
+  "timestamp": "2026-04-14T18:58:37.085590+00:00",
+  "source_trajectory_ids": [
+    "traj-6aa4c009-836a-4013-887a-07ed5b767a2f",
+    "traj-6c9e26f0-ed6d-4bcb-aa25-e79789688ccb",
+    "traj-8898f73e-1ce5-4770-9966-359bef9958ae",
+    "traj-9055a563-4353-4975-a970-5ef46a472d45",
+    "traj-91d1f2db-f516-4fc9-8a74-8bc6ebe0be47",
+    "traj-b5b5fd0f-a745-48aa-a6b9-8c50451a7b07",
+    "traj-d3f8003f-39fd-4eb7-a68a-41d137da964a",
+    "traj-dfd75ad1-1e73-45d5-8dd0-aad7e860fdcc",
+    "traj-e178b3b0-8872-4f89-9d25-4e55d3a7aaf2",
+    "traj-e7bafd52-4193-4bdd-9fc1-d46658003751"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185837",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-13b9b11b-b1cd-4d70-9fcc-a972cbd54805.json b/docs/training-reports/report-13b9b11b-b1cd-4d70-9fcc-a972cbd54805.json
new file mode 100644
index 0000000..a0f6a6d
--- /dev/null
+++ b/docs/training-reports/report-13b9b11b-b1cd-4d70-9fcc-a972cbd54805.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-13b9b11b-b1cd-4d70-9fcc-a972cbd54805",
+  "timestamp": "2026-04-15T01:21:53.876489+00:00",
+  "source_trajectory_ids": [
+    "traj-4095188b-6d84-4ee1-a3fc-a4a147b2e983",
+    "traj-4f2d3ac1-211c-417a-9836-87126bd0aa35",
+    "traj-5aa5c2f9-8ac1-4031-a12c-e6c4e8d0ece0",
+    "traj-60f0860e-9a83-4826-966d-40cf15d4fcb9",
+    "traj-74fc1f64-33b4-443d-9c1d-a4a0e60f7ff0",
+    "traj-8bc691d7-e5bc-4b7b-8621-8401f36a5f4d",
+    "traj-8d7af32a-19d7-4ff9-8f6d-a2337280cc4c",
+    "traj-b22e4a02-c74d-4e10-91f6-8ac4b7a5c2ea",
+    "traj-f833d392-3250-4caa-8acb-90fc49d3b3c1",
+    "traj-f9702b75-4b65-4f83-8ef1-c2594c87db8a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012153",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-13f1b744-a87c-48f0-b024-d48396ae1c25.json b/docs/training-reports/report-13f1b744-a87c-48f0-b024-d48396ae1c25.json
new file mode 100644
index 0000000..564db83
--- /dev/null
+++ b/docs/training-reports/report-13f1b744-a87c-48f0-b024-d48396ae1c25.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-13f1b744-a87c-48f0-b024-d48396ae1c25",
+  "timestamp": "2026-04-14T20:06:16.345373+00:00",
+  "source_trajectory_ids": [
+    "traj-06da564b-232b-4496-ae9d-81306a08cc7b",
+    "traj-6b557f04-4b89-4628-8d4f-acb8d5b060df",
+    "traj-7e9f9a58-1594-44aa-9712-215e130a7dd6",
+    "traj-8765d4ea-b4c4-45dd-8830-df92cc3f3aba",
+    "traj-ac14447b-6afb-4d39-bd92-8172d4f50c8e",
+    "traj-b70ae420-ea46-47d8-8640-f0b21e659a81",
+    "traj-bb4b6108-ade8-421c-96f1-f35c36677029",
+    "traj-c81f1a4e-7182-4560-91c7-86bdc4ccfa03",
+    "traj-d6c3266e-8d3e-4e42-bf5b-8522ca351241",
+    "traj-eca10f07-c2e4-42bd-9ec7-21c3fba82752"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200616",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-15c39b58-4792-486c-88c2-8fa95f34f0e7.json b/docs/training-reports/report-15c39b58-4792-486c-88c2-8fa95f34f0e7.json
new file mode 100644
index 0000000..f78b045
--- /dev/null
+++ b/docs/training-reports/report-15c39b58-4792-486c-88c2-8fa95f34f0e7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-15c39b58-4792-486c-88c2-8fa95f34f0e7",
+  "timestamp": "2026-04-14T21:18:36.157196+00:00",
+  "source_trajectory_ids": [
+    "traj-0d2380fb-a3a9-4c3b-bb01-4390198f0e60",
+    "traj-1147c420-d6bd-45a6-9071-a38f96205f7b",
+    "traj-436265cb-1612-4ef2-94e5-311619f97900",
+    "traj-4672cbe1-bc67-4378-b517-e4f0c23395c7",
+    "traj-4cd057fa-bca1-4172-805b-cf0aac1191ce",
+    "traj-7fae7ddb-e46d-411c-ae1c-b17ed44159e6",
+    "traj-beac32c5-da6d-48e2-aab1-140041c46a80",
+    "traj-c08cf9ff-2fee-4071-bb39-91955125de74",
+    "traj-c32944a0-6f3d-4fc4-9cb7-f8a5b581445b",
+    "traj-d59a5c84-eea1-44b3-b182-9f0b4119b448"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-211836",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-16240412-414a-48fb-a5de-244647601b99.json b/docs/training-reports/report-16240412-414a-48fb-a5de-244647601b99.json
new file mode 100644
index 0000000..4111fcd
--- /dev/null
+++ b/docs/training-reports/report-16240412-414a-48fb-a5de-244647601b99.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-16240412-414a-48fb-a5de-244647601b99",
+  "timestamp": "2026-04-14T21:44:48.199092+00:00",
+  "source_trajectory_ids": [
+    "traj-3d852c8f-73dd-454f-a35b-4d22d5dd187e",
+    "traj-5d8dd2a4-854f-41dd-9f14-6a08657fc60e",
+    "traj-64125a25-a99f-42c4-9d76-1b68c45809a4",
+    "traj-6fe3e14c-b6cd-4a04-9fe3-4cfd011f880b",
+    "traj-94ddd7b7-7a70-45d6-986c-9a22512fb6b8",
+    "traj-afbbdc73-f4a9-4d42-a8d5-9f41464b0e20",
+    "traj-bca371df-ce7e-427c-acd3-83c712cb11db",
+    "traj-c4849366-5ee8-4442-80cb-fb6207f59d48",
+    "traj-c6a1ed31-c1d7-4147-a035-fef03423d0b6",
+    "traj-e9a5cbf2-4716-462c-8f00-5e75da4636a4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214448",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-197593f2-1928-428f-a143-d59574a1070f.json b/docs/training-reports/report-197593f2-1928-428f-a143-d59574a1070f.json
new file mode 100644
index 0000000..b3002f7
--- /dev/null
+++ b/docs/training-reports/report-197593f2-1928-428f-a143-d59574a1070f.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-197593f2-1928-428f-a143-d59574a1070f",
+  "timestamp": "2026-04-15T01:41:52.446867+00:00",
+  "source_trajectory_ids": [
+    "traj-032f9293-f5f9-4f2d-8724-4df72b6e2def",
+    "traj-1c8cbbb2-d4b0-427b-8009-57141775873f",
+    "traj-28f049b4-aac9-40d6-8e65-4f8d612fe1cb",
+    "traj-29337650-4610-4468-adf5-39cd2a095750",
+    "traj-568702f9-d5ce-440d-8056-faf70ca7492a",
+    "traj-76b8ddce-0f28-4fc1-8191-82a313b854e7",
+    "traj-d1ede161-80e7-4d3d-b09e-e65972bbbc61",
+    "traj-d3528eb9-0934-476e-9291-a0b616686308",
+    "traj-e4743774-7609-4749-bdc7-bdfe31107cd3",
+    "traj-eac8b576-7a44-472f-9553-b68773ac4bda"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-1a6693a9-d9aa-4fb7-9c37-8eca70db8ff2.json b/docs/training-reports/report-1a6693a9-d9aa-4fb7-9c37-8eca70db8ff2.json
new file mode 100644
index 0000000..722be19
--- /dev/null
+++ b/docs/training-reports/report-1a6693a9-d9aa-4fb7-9c37-8eca70db8ff2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-1a6693a9-d9aa-4fb7-9c37-8eca70db8ff2",
+  "timestamp": "2026-04-14T20:56:07.953209+00:00",
+  "source_trajectory_ids": [
+    "traj-1d87ed37-a6de-437b-b1f7-655e1465ae99",
+    "traj-59bfcf1f-5462-419c-8571-56960b954a7a",
+    "traj-71a16960-4bab-4e3a-8187-aff9a04774f4",
+    "traj-89b30f5d-96f7-4748-b44a-f03efa183c0c",
+    "traj-a67da73d-9e4b-4082-9f02-09c1b04a30c7",
+    "traj-a82d0c43-6d5d-4c2c-b4b1-9ba06a1f8433",
+    "traj-adb2c5dd-78a0-4a5b-98a7-1d78f8e7e680",
+    "traj-b5f04bf0-caf1-4523-9b30-5a094185428c",
+    "traj-d67e548f-de22-4a83-ba4a-9b737dbeefa0",
+    "traj-f840a2e0-d314-4dcc-9c02-262a92a093e8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205607",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-1b339815-f279-409b-ab77-c5c5c31744f7.json b/docs/training-reports/report-1b339815-f279-409b-ab77-c5c5c31744f7.json
new file mode 100644
index 0000000..9138079
--- /dev/null
+++ b/docs/training-reports/report-1b339815-f279-409b-ab77-c5c5c31744f7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-1b339815-f279-409b-ab77-c5c5c31744f7",
+  "timestamp": "2026-04-14T20:56:11.542333+00:00",
+  "source_trajectory_ids": [
+    "traj-02aee60f-4b45-4ee1-9341-c60be647ff1b",
+    "traj-0a7673fb-d561-44f6-9cb8-aa87122018a3",
+    "traj-441ae78a-e0c9-408e-87e4-421c9a96fc5e",
+    "traj-4d2d110d-4309-4728-b072-b1785e6df45a",
+    "traj-629ca24d-3b2f-4c8f-8be4-0b1f8bc21df7",
+    "traj-64f19728-5662-43f3-84da-9476b25db403",
+    "traj-963f8b19-2650-428c-9135-1500fd1d7ded",
+    "traj-c3aed0f9-1919-49a3-ab90-84d89f423d33",
+    "traj-e9567bfd-bec6-4836-8f23-341447fd7a9c",
+    "traj-ea6e5ff5-936f-4aa1-839b-2199ec7f925b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205611",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-1d45c6bd-7847-46d4-a3aa-953bcdce24ec.json b/docs/training-reports/report-1d45c6bd-7847-46d4-a3aa-953bcdce24ec.json
new file mode 100644
index 0000000..8d0d923
--- /dev/null
+++ b/docs/training-reports/report-1d45c6bd-7847-46d4-a3aa-953bcdce24ec.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-1d45c6bd-7847-46d4-a3aa-953bcdce24ec",
+  "timestamp": "2026-04-14T21:18:36.139079+00:00",
+  "source_trajectory_ids": [
+    "traj-200f901f-aecf-43e4-a9cd-f3ebfed82ed0",
+    "traj-233795c8-8d47-4a3b-86f7-9d2af40b89cb",
+    "traj-42523abe-2654-46d6-8cce-4154ab093cf2",
+    "traj-49763bb6-634e-4fa0-a23b-0f537e97262d",
+    "traj-593e6516-7ec2-4226-83fb-65a4a5274616",
+    "traj-617282fd-7c85-4583-a59f-54315bbf9e40",
+    "traj-7f6074d8-ea0d-4d48-94f3-4a67d0ee92a5",
+    "traj-8a3543c1-3007-4271-9862-06ae3202f039",
+    "traj-9121275f-b96c-4cc5-a45d-1c50532c6409",
+    "traj-c73e41e8-483d-435a-bd6e-e868a445bd30"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-211836",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-1e0bf809-418e-4720-ab59-b8d7401ce94c.json b/docs/training-reports/report-1e0bf809-418e-4720-ab59-b8d7401ce94c.json
new file mode 100644
index 0000000..b22d99c
--- /dev/null
+++ b/docs/training-reports/report-1e0bf809-418e-4720-ab59-b8d7401ce94c.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-1e0bf809-418e-4720-ab59-b8d7401ce94c",
+  "timestamp": "2026-04-14T20:30:08.746811+00:00",
+  "source_trajectory_ids": [
+    "traj-09d6c0c8-8c5b-4264-aaf4-78b6ec7689b2",
+    "traj-11b5e179-af57-4df4-a07e-6263f6e82ddd",
+    "traj-12d9e29a-03cb-4242-91ee-de30aacb0e50",
+    "traj-41580331-af54-47ed-9aab-2fab2fc8c3a0",
+    "traj-707b126a-6164-475f-81a3-4a34fe624639",
+    "traj-903e62d9-1478-44de-8348-4e08531a9178",
+    "traj-9ebcf874-21b8-453f-817f-f7038907608c",
+    "traj-9fc3b26a-7ba9-4d9a-a732-17db84494c48",
+    "traj-a16724af-e0aa-4aa3-9615-c8c3b14173a7",
+    "traj-b1235cd4-6b9f-4b86-bca3-39f48ee4c1ea"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-1e1d679b-3b0d-4cb9-b474-8302992df5ba.json b/docs/training-reports/report-1e1d679b-3b0d-4cb9-b474-8302992df5ba.json
new file mode 100644
index 0000000..a6696da
--- /dev/null
+++ b/docs/training-reports/report-1e1d679b-3b0d-4cb9-b474-8302992df5ba.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-1e1d679b-3b0d-4cb9-b474-8302992df5ba",
+  "timestamp": "2026-04-15T01:25:33.772160+00:00",
+  "source_trajectory_ids": [
+    "traj-0091cdde-9035-4995-9b30-ba3e52a4e74b",
+    "traj-21d17e25-de55-46e8-b31f-0e6d6a045351",
+    "traj-52d9d18b-f37f-4a42-bcc8-ce7e28277942",
+    "traj-5f2a03a7-33d9-48f9-b1cb-bb5ac6e1f21c",
+    "traj-86cb57eb-74c0-4ea5-bce7-3aa1690b9599",
+    "traj-87e89d6e-3ef0-4ee4-ab62-c0002b4d1b22",
+    "traj-a7dd2fb7-6756-430e-9964-dfca0f3a6981",
+    "traj-aa624e2b-6012-406a-8542-2ffed00096bc",
+    "traj-ef61014c-791a-4535-a78f-7ab715a7c3bb",
+    "traj-fbab542f-9c5f-4b90-889e-f2d253862441"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012533",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-20611e96-6b00-4a0d-9be5-5d5e968b3371.json b/docs/training-reports/report-20611e96-6b00-4a0d-9be5-5d5e968b3371.json
new file mode 100644
index 0000000..8d92afa
--- /dev/null
+++ b/docs/training-reports/report-20611e96-6b00-4a0d-9be5-5d5e968b3371.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-20611e96-6b00-4a0d-9be5-5d5e968b3371",
+  "timestamp": "2026-04-14T21:22:14.567603+00:00",
+  "source_trajectory_ids": [
+    "traj-25fd7e3b-5c72-4b15-a847-70a35bc85f1f",
+    "traj-2c2d6197-5e0e-4ca8-a762-b5aca1b4d486",
+    "traj-36a27b22-b36b-4912-8dbe-72dd0138c06f",
+    "traj-590f4b56-7a29-4080-80ba-23bfc984a935",
+    "traj-7298713f-34a4-48bc-8e41-f0b0d6de8778",
+    "traj-82acfa6b-76d4-4a01-a9e5-3f989a3f2684",
+    "traj-8ab1b3d9-a6fa-4bf3-8d5b-5298ac85afb2",
+    "traj-98cfa826-215f-40a9-9fb9-7d8b41640295",
+    "traj-cd585938-d8b0-4773-8fe7-655cba23cbec",
+    "traj-ec26bcd5-3463-4a25-9ebc-b046e66020cf"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212214",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-20ea6367-8e42-4f6a-b90d-60eb441aa9f8.json b/docs/training-reports/report-20ea6367-8e42-4f6a-b90d-60eb441aa9f8.json
new file mode 100644
index 0000000..7893291
--- /dev/null
+++ b/docs/training-reports/report-20ea6367-8e42-4f6a-b90d-60eb441aa9f8.json
@@ -0,0 +1,42 @@
+{
+  "report_id": "report-20ea6367-8e42-4f6a-b90d-60eb441aa9f8",
+  "timestamp": "2026-04-14T18:31:07.921735+00:00",
+  "source_trajectory_ids": [
+    "traj-253bd144-3ad8-4dcc-951e-535f7fa444c6",
+    "traj-2ba9feaf-ffd1-41f8-b492-365351133a96",
+    "traj-35c97cf2-e5d0-4d9c-be42-d46b14f8afa7",
+    "traj-47b3674d-73db-41dd-b8db-472f91f864a0",
+    "traj-4d744d56-15bd-4e54-ba65-29f6135dda22",
+    "traj-4e3fcf21-0020-4585-b277-9c8a03081c06",
+    "traj-655aea4f-bc1d-47c3-85ac-73c8bb63d7e4",
+    "traj-66d29b59-21be-474b-a6a8-9d7a0e36b8bb",
+    "traj-8c7ed834-ada2-44f2-9a67-e7066d0bafe6",
+    "traj-bd4fdcd8-1577-41a3-a081-a738b17bb9c1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-183107",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-21a16654-d936-4393-88d7-9e0e00d98fec.json b/docs/training-reports/report-21a16654-d936-4393-88d7-9e0e00d98fec.json
new file mode 100644
index 0000000..d528483
--- /dev/null
+++ b/docs/training-reports/report-21a16654-d936-4393-88d7-9e0e00d98fec.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-21a16654-d936-4393-88d7-9e0e00d98fec",
+  "timestamp": "2026-04-14T16:49:44.845808+00:00",
+  "source_trajectory_ids": [
+    "traj-2999497a-dba8-4215-8f44-f7371fb4c18d",
+    "traj-4603b0e1-ef1e-4f44-a5bf-7994eeb97fd2",
+    "traj-49f42054-4065-41c6-8d70-e89801df29dc",
+    "traj-53b375aa-ba3f-4518-973c-6c8c1b704fd1",
+    "traj-68124be2-5cc4-4c52-b891-fc5cb253b3ea",
+    "traj-828c6c7b-72ed-44ee-8628-f1bee3080ce1",
+    "traj-a4516393-6015-4029-910a-15955a283aec",
+    "traj-ba4984ba-48a3-43ac-8726-d73db56f5a5e",
+    "traj-ccb9f3ca-26e2-4efb-9b1a-b55a913b55cd",
+    "traj-f9aa9adc-232e-4f92-9161-23165cb9dca4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-164944"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-21e68879-c123-49e0-8af9-9f8e9dc76ecf.json b/docs/training-reports/report-21e68879-c123-49e0-8af9-9f8e9dc76ecf.json
new file mode 100644
index 0000000..7ee5219
--- /dev/null
+++ b/docs/training-reports/report-21e68879-c123-49e0-8af9-9f8e9dc76ecf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-21e68879-c123-49e0-8af9-9f8e9dc76ecf",
+  "timestamp": "2026-04-14T18:58:37.102905+00:00",
+  "source_trajectory_ids": [
+    "traj-1f23a1c3-4ba3-412b-9df5-61bda0396bc8",
+    "traj-24c1c82b-1fac-4365-a643-faa65082b8d8",
+    "traj-5f848d9c-294e-4586-9951-30a03588cc26",
+    "traj-8bdac473-266e-454b-a2d6-267f6189850e",
+    "traj-b05d7433-97be-4604-bcee-a18cb1102a80",
+    "traj-c0a59283-e7e0-41fe-925d-79b4e076b9f6",
+    "traj-cd1ef62e-83f9-4b1b-99b7-ff1e1586afb9",
+    "traj-e2f37ecf-a075-4b1b-ad12-9c3e7be77fc7",
+    "traj-e3bb7c13-fe7c-462c-9f88-d7f40092669c",
+    "traj-e8bc452a-b3c7-4448-a994-51cc81b30730"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185837",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-22b88101-5979-4011-b85d-c3bb3e1f84ae.json b/docs/training-reports/report-22b88101-5979-4011-b85d-c3bb3e1f84ae.json
new file mode 100644
index 0000000..cb6f0bf
--- /dev/null
+++ b/docs/training-reports/report-22b88101-5979-4011-b85d-c3bb3e1f84ae.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-22b88101-5979-4011-b85d-c3bb3e1f84ae",
+  "timestamp": "2026-04-14T15:25:06.028393+00:00",
+  "source_trajectory_ids": [
+    "traj-263efcd1-4b24-4302-a9cf-5c5778297ac2",
+    "traj-29855290-bd18-44cb-b1d2-da2bd3eff5b3",
+    "traj-5d04fe16-cac2-4f5d-b469-eb4ff0c3e66a",
+    "traj-67d1127e-70c1-4f86-93e2-8d0e7e6df433",
+    "traj-8e66108b-fa8c-426e-99e9-f7df432b7436",
+    "traj-a0df3bda-d35b-4c76-b68e-2c9dbb47f6f2",
+    "traj-bc36ba31-9c1f-4902-9f79-acf7653d0e86",
+    "traj-c62fe68a-4ad5-403a-b747-2648ae56392b",
+    "traj-c74f4755-6cf3-4f1e-950e-672a946a7b4e",
+    "traj-ff0f3558-e0cd-49e0-974b-1b5bd9cb5af1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2493e5d6-4be2-49a6-8e84-6f3fda442ff5.json b/docs/training-reports/report-2493e5d6-4be2-49a6-8e84-6f3fda442ff5.json
new file mode 100644
index 0000000..b99f3b7
--- /dev/null
+++ b/docs/training-reports/report-2493e5d6-4be2-49a6-8e84-6f3fda442ff5.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2493e5d6-4be2-49a6-8e84-6f3fda442ff5",
+  "timestamp": "2026-04-14T16:54:50.564245+00:00",
+  "source_trajectory_ids": [
+    "traj-0ef5e19a-7c30-45a4-979f-67d46413ee95",
+    "traj-11483ef1-e410-47c5-a265-53bff1968182",
+    "traj-59cbb530-0ae7-41e6-a033-250431a20bb8",
+    "traj-671f19f7-cd71-4539-83ea-5441807561c9",
+    "traj-6c693d47-22b0-42e1-a76f-9ba625d79a70",
+    "traj-97c4d19b-9aff-4f15-b1e3-c82c8da598e0",
+    "traj-cd8780de-7bc7-4735-acdd-e66ed407619f",
+    "traj-e4b2af56-8718-41c4-bd7f-6c479b1fb7f3",
+    "traj-e63b675d-9bba-41bf-a472-a068cf2437fd",
+    "traj-f10cfe93-7fe1-4c84-bfba-2cb3c3892a9e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-24f69dfe-4955-4c6a-8421-5c1bdd0bdfda.json b/docs/training-reports/report-24f69dfe-4955-4c6a-8421-5c1bdd0bdfda.json
new file mode 100644
index 0000000..7eb5e32
--- /dev/null
+++ b/docs/training-reports/report-24f69dfe-4955-4c6a-8421-5c1bdd0bdfda.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-24f69dfe-4955-4c6a-8421-5c1bdd0bdfda",
+  "timestamp": "2026-04-14T21:42:45.897635+00:00",
+  "source_trajectory_ids": [
+    "traj-11726363-9ef7-47e4-9e77-4e2b1fbefbd3",
+    "traj-28ecbc8b-0dd9-4e4e-a4d0-cfe262c4a812",
+    "traj-3abe1edd-f76a-4040-a471-0ad8535ff553",
+    "traj-3ed7731b-def2-459e-948d-a45cd595c4de",
+    "traj-4620d7ac-94ad-4310-8dfe-3f1c9124ceb9",
+    "traj-5a041387-76c4-483c-8df9-a6ee410a3264",
+    "traj-6f340ce0-d606-4890-b647-ad57360f8566",
+    "traj-82ebfe0c-08de-4c00-9a8e-8f293191c97a",
+    "traj-a5442e4e-57fb-4af6-941f-90c25a3862dc",
+    "traj-c2c5e763-e809-4264-af82-610fbe7c5fd1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2522e075-f011-4379-989c-f413d768a957.json b/docs/training-reports/report-2522e075-f011-4379-989c-f413d768a957.json
new file mode 100644
index 0000000..db4db2b
--- /dev/null
+++ b/docs/training-reports/report-2522e075-f011-4379-989c-f413d768a957.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2522e075-f011-4379-989c-f413d768a957",
+  "timestamp": "2026-04-14T15:52:51.189860+00:00",
+  "source_trajectory_ids": [
+    "traj-0c96d155-112e-4658-bf90-3b35da7e7c2f",
+    "traj-22a09d6a-b9ee-4ff4-b324-a14cb8f33a91",
+    "traj-36fcb806-8f3f-44cc-99a8-1b4f6170a18e",
+    "traj-42da2c6e-15c2-4c40-92f3-14b5eef5e681",
+    "traj-6b083428-c47a-4d83-80dc-4db2c66887d7",
+    "traj-a9c06cd6-3332-41b9-bb21-3afc06e6f701",
+    "traj-b45b09b4-c348-4215-9cba-4adbe8a76410",
+    "traj-c7b2b172-8223-423c-abb4-79bfeb1cbe94",
+    "traj-cd806afc-6846-4361-9692-dff3469717a8",
+    "traj-d0622a9d-4e59-44be-8231-2b27a25d47ac"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-263a0d30-9096-4e1c-a406-43927ac46d80.json b/docs/training-reports/report-263a0d30-9096-4e1c-a406-43927ac46d80.json
new file mode 100644
index 0000000..c69ebd3
--- /dev/null
+++ b/docs/training-reports/report-263a0d30-9096-4e1c-a406-43927ac46d80.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-263a0d30-9096-4e1c-a406-43927ac46d80",
+  "timestamp": "2026-04-14T20:28:05.472990+00:00",
+  "source_trajectory_ids": [
+    "traj-02ffe167-5287-4a72-8e9e-623baef314d8",
+    "traj-58518476-0941-437f-86d7-80f000a35ae7",
+    "traj-73a97752-0bb5-4f24-92ec-8f3cec52ed4f",
+    "traj-77750db3-022e-4dfc-a19b-a45e2eb41923",
+    "traj-a7d6998f-0355-408e-acde-7f84033a7712",
+    "traj-b73c6762-b738-4156-bf01-38661089bd01",
+    "traj-d90086be-591d-4c4a-a220-6e35a125cc62",
+    "traj-dcada7b1-e74e-42ee-b117-aea8d121247b",
+    "traj-e5b873ad-8e62-4b91-a714-16eca70dbae3",
+    "traj-f63e8157-40e6-480a-a2d2-bd7a257636dc"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-202805",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-26782aa8-a2a0-45f9-8ac3-861fc1364431.json b/docs/training-reports/report-26782aa8-a2a0-45f9-8ac3-861fc1364431.json
new file mode 100644
index 0000000..13a980f
--- /dev/null
+++ b/docs/training-reports/report-26782aa8-a2a0-45f9-8ac3-861fc1364431.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-26782aa8-a2a0-45f9-8ac3-861fc1364431",
+  "timestamp": "2026-04-15T01:57:32.697736+00:00",
+  "source_trajectory_ids": [
+    "traj-0269da6e-275b-4eae-8a43-1b89a74a87c2",
+    "traj-28052977-cc23-4c06-9343-dce8b4ca5ee3",
+    "traj-3f7b8039-6d47-4ce4-9c3b-6d80d4548825",
+    "traj-4e37cd2f-c830-4eba-84d9-87cffe9bcec3",
+    "traj-74071894-0764-410c-8460-cebb98b80fa4",
+    "traj-8c547ed6-61ba-4c6e-a8e2-990264cf77b9",
+    "traj-a65f8feb-1368-43c0-a3b9-97f6a3420741",
+    "traj-b80a3d53-4ffe-4147-b8d6-b619fd951f58",
+    "traj-bb4444ec-7bd6-42e3-b9e1-4a680d368d50",
+    "traj-c5cea58f-f981-4f73-8f3a-e35d7f3befb7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-015732",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-279b3b5c-bf69-4d8a-9be7-372086c295c9.json b/docs/training-reports/report-279b3b5c-bf69-4d8a-9be7-372086c295c9.json
new file mode 100644
index 0000000..8122792
--- /dev/null
+++ b/docs/training-reports/report-279b3b5c-bf69-4d8a-9be7-372086c295c9.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-279b3b5c-bf69-4d8a-9be7-372086c295c9",
+  "timestamp": "2026-04-14T20:06:16.411538+00:00",
+  "source_trajectory_ids": [
+    "traj-2382eb44-3957-434e-a171-04e6ecd5a0ce",
+    "traj-28e66c25-3314-43ce-8a7f-911d8943ea11",
+    "traj-3b62b30e-9c8f-433b-bf40-16820db431aa",
+    "traj-8b3b0d30-f9bb-4323-9403-23970be3a4e6",
+    "traj-8b84c7fe-b827-4b35-8f2e-6538fbc684fb",
+    "traj-9cc1eef1-d47d-481f-851e-5913609c8740",
+    "traj-a960cca8-bcc9-4332-9523-3170ef7c5355",
+    "traj-b34c9f41-87f6-40f8-8d3b-0878f4d61911",
+    "traj-b4ba32fb-2bee-41b6-a46e-0a094079f40c",
+    "traj-f5dd1000-03ed-4b5a-b00f-e6437ba56426"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2957a4db-25be-4a31-be96-bb53b60b0574.json b/docs/training-reports/report-2957a4db-25be-4a31-be96-bb53b60b0574.json
new file mode 100644
index 0000000..2d74de6
--- /dev/null
+++ b/docs/training-reports/report-2957a4db-25be-4a31-be96-bb53b60b0574.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2957a4db-25be-4a31-be96-bb53b60b0574",
+  "timestamp": "2026-04-15T01:29:18.124253+00:00",
+  "source_trajectory_ids": [
+    "traj-24875022-ace1-4d16-b802-9e19c0345039",
+    "traj-2adcad42-6922-4c27-b879-75aac21c94ba",
+    "traj-35512ca0-78e3-4d99-8efa-056013aefbbd",
+    "traj-60c4586a-c6ba-421c-939c-04a3c497ee3b",
+    "traj-64a60af0-4d68-4370-a4a0-6f5ebdc4b7ad",
+    "traj-8ef62528-7f24-495a-8c51-7936b30c02ec",
+    "traj-c52966b8-939c-493d-b528-716ca6e0c4e5",
+    "traj-d31381b8-70b3-44fa-9daa-053e9d517b8f",
+    "traj-e7852591-231f-4f0f-8032-fb872fa5e220",
+    "traj-eeb9b89c-6edf-4354-aafb-ce7fe0212dab"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012918",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2b3e115b-fae8-4813-b53d-1c8501010bf6.json b/docs/training-reports/report-2b3e115b-fae8-4813-b53d-1c8501010bf6.json
new file mode 100644
index 0000000..075f40a
--- /dev/null
+++ b/docs/training-reports/report-2b3e115b-fae8-4813-b53d-1c8501010bf6.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2b3e115b-fae8-4813-b53d-1c8501010bf6",
+  "timestamp": "2026-04-14T15:53:50.791993+00:00",
+  "source_trajectory_ids": [
+    "traj-03447345-9e58-46c4-9ba0-8db3c0e720ee",
+    "traj-1b3c8854-3738-444e-b1ee-2ca9d728b580",
+    "traj-23d09f97-9f5e-4d4f-9e69-f70e66e665ce",
+    "traj-606fe284-63f2-40b0-88bd-fcb9a3b27738",
+    "traj-68c91a21-306c-462f-a9e6-f5b9b149de8e",
+    "traj-6ee3651f-9ccd-4d20-afc6-ac40d1a8dd9f",
+    "traj-7a5f901f-d5b4-41bb-a790-a9a82c521bee",
+    "traj-b725122c-6c00-40ae-acb0-c2ad58eaf075",
+    "traj-c160f961-cd03-404c-9ffa-037d1e196e9f",
+    "traj-ed0113b3-3c2a-4d16-b31f-8c1fcde61291"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2bbfec49-07f0-43c8-9c0d-f9dbc33b8b53.json b/docs/training-reports/report-2bbfec49-07f0-43c8-9c0d-f9dbc33b8b53.json
new file mode 100644
index 0000000..0965831
--- /dev/null
+++ b/docs/training-reports/report-2bbfec49-07f0-43c8-9c0d-f9dbc33b8b53.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2bbfec49-07f0-43c8-9c0d-f9dbc33b8b53",
+  "timestamp": "2026-04-14T22:10:15.279833+00:00",
+  "source_trajectory_ids": [
+    "traj-0fd602e1-438a-42d6-b684-09db49c96b27",
+    "traj-354a350a-d325-4233-afe7-387d05eba246",
+    "traj-4737d84e-ffc5-4187-b37b-c1581c9197c5",
+    "traj-54880be9-eac7-4a8a-82c2-aee098b966a1",
+    "traj-5d14290d-8afc-4a98-9df0-44db13f9bc33",
+    "traj-659b7f39-a95a-485a-aab7-65018cc206ed",
+    "traj-77720324-df56-4f6f-a732-824e97d9c7fe",
+    "traj-ed1a3760-846c-4534-9ea7-e6f54a2e1414",
+    "traj-f6fef155-e568-49f9-a286-c56c6b729c0d",
+    "traj-f8488008-391b-4fae-9d98-b9a379eca15e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221015",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2ce71481-bf67-4581-b35e-65b83189c959.json b/docs/training-reports/report-2ce71481-bf67-4581-b35e-65b83189c959.json
new file mode 100644
index 0000000..e966380
--- /dev/null
+++ b/docs/training-reports/report-2ce71481-bf67-4581-b35e-65b83189c959.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2ce71481-bf67-4581-b35e-65b83189c959",
+  "timestamp": "2026-04-14T21:22:14.626005+00:00",
+  "source_trajectory_ids": [
+    "traj-0a77ca1d-28c9-4148-b099-2990f38d701f",
+    "traj-51afda16-46e8-4fc6-aba5-1a02600624de",
+    "traj-5586372a-1b39-4028-95ae-31095cf3136d",
+    "traj-a1333bdd-5b03-45c7-853f-e49c1767031a",
+    "traj-a6a93535-fce3-4374-9cb6-0321a5a4769f",
+    "traj-b40704cc-e206-4da6-930c-1d0f1b7234a5",
+    "traj-b7c19bf6-3da3-451d-ba41-4019ef8e92d5",
+    "traj-c6f07968-deb4-48eb-a4ef-401339afb5fd",
+    "traj-d0fd2e96-e15a-4f2e-b51d-34f482772ea1",
+    "traj-e74aefaa-647c-4b9f-8a6c-e7cb2bbe0780"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212214",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2e6d5c48-47b9-4bd6-9a9b-117e9d646ccc.json b/docs/training-reports/report-2e6d5c48-47b9-4bd6-9a9b-117e9d646ccc.json
new file mode 100644
index 0000000..87496e3
--- /dev/null
+++ b/docs/training-reports/report-2e6d5c48-47b9-4bd6-9a9b-117e9d646ccc.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-2e6d5c48-47b9-4bd6-9a9b-117e9d646ccc",
+  "timestamp": "2026-04-14T21:18:36.299274+00:00",
+  "source_trajectory_ids": [
+    "traj-06944244-d2e8-4e6f-a49f-da5c792befce",
+    "traj-2a694d25-5898-4d2d-9bac-bcdc01d2d442",
+    "traj-4c7f18bb-78aa-4170-99e7-12e5ace54340",
+    "traj-5d7ed2c2-17b0-4daf-8661-51ec1da1fd60",
+    "traj-649669bb-b814-4114-9205-a328137d5bf7",
+    "traj-768ecbc8-30bf-4c05-82e6-c346736eea24",
+    "traj-7bb5cc17-a8e6-4c0a-b525-e2cd671187a2",
+    "traj-d23853f5-359e-4c3b-97e7-d239b5d7a152",
+    "traj-ec11358e-da16-426e-b74a-42f2e95db560",
+    "traj-f486a57e-e7b2-4a34-a6ed-6bfa836045be"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-211836",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-2f0f6640-8049-4de2-bbb9-71a76fd8be67.json b/docs/training-reports/report-2f0f6640-8049-4de2-bbb9-71a76fd8be67.json
new file mode 100644
index 0000000..aa51a58
--- /dev/null
+++ b/docs/training-reports/report-2f0f6640-8049-4de2-bbb9-71a76fd8be67.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-2f0f6640-8049-4de2-bbb9-71a76fd8be67",
+  "timestamp": "2026-04-14T15:29:35.959531+00:00",
+  "source_trajectory_ids": [
+    "traj-13fe5578-4d8a-4781-9338-1a612a1e5a06",
+    "traj-1ed5b4dc-8872-45f8-81a9-91b46545097b",
+    "traj-4e0d0d23-c5d7-4211-a6ef-31fb93bd62aa",
+    "traj-6728130e-69dd-4b4e-beb4-d7c0a898d962",
+    "traj-74c20720-f0c5-45dc-8ee1-b78a38d9b967",
+    "traj-c3d1a98e-86fc-491c-8ab1-acb7f890f2c9",
+    "traj-c7a69537-49c5-42b2-8e87-1a05699bbb15",
+    "traj-dddd03c8-8ec2-4571-8646-2c3dfd9eefe1",
+    "traj-df54d24d-cfbc-4475-8d79-c77f4c11407b",
+    "traj-f1e396ed-ac25-4bb1-bb8e-55fc4b405819"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-152935"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-31835629-98b1-4a08-8a42-e80702dd3ff7.json b/docs/training-reports/report-31835629-98b1-4a08-8a42-e80702dd3ff7.json
new file mode 100644
index 0000000..c8189d3
--- /dev/null
+++ b/docs/training-reports/report-31835629-98b1-4a08-8a42-e80702dd3ff7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-31835629-98b1-4a08-8a42-e80702dd3ff7",
+  "timestamp": "2026-04-14T22:08:19.690919+00:00",
+  "source_trajectory_ids": [
+    "traj-062ae1e2-aedc-46a1-a8e7-585f1bfd6968",
+    "traj-0aaadbdb-15cb-466b-9271-ecec84e0b21f",
+    "traj-87b00206-b80a-4572-a12b-d441e45f4374",
+    "traj-93bdfd26-c5b0-43d1-926d-303f3ea7d176",
+    "traj-95de8601-6f2b-4313-ad1f-dc248c4e6d78",
+    "traj-a72b7a28-3293-4d78-8a65-411a0ca6aefb",
+    "traj-af818f60-198b-4b46-9a36-5012b467d867",
+    "traj-bc6fedae-308b-491e-bbea-419817542e18",
+    "traj-dba716d0-5bdc-425e-8b6a-043f16e1d9b1",
+    "traj-e573c252-18ea-4f15-8630-568320f4d3c3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220819",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-319b9b10-5c39-46eb-a905-638876b20b78.json b/docs/training-reports/report-319b9b10-5c39-46eb-a905-638876b20b78.json
new file mode 100644
index 0000000..c5a7a60
--- /dev/null
+++ b/docs/training-reports/report-319b9b10-5c39-46eb-a905-638876b20b78.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-319b9b10-5c39-46eb-a905-638876b20b78",
+  "timestamp": "2026-04-14T20:56:11.631566+00:00",
+  "source_trajectory_ids": [
+    "traj-046f3b45-5b6d-4883-801a-4b674ac9a0f6",
+    "traj-1fc6191a-e9e5-4b9c-9294-ab68a7992506",
+    "traj-4370c224-46cc-44bf-aa52-c1ae9b9884be",
+    "traj-520f51ad-e1ea-42c9-b402-edb618a95020",
+    "traj-579262b0-c64c-41fa-802f-b3800e44d890",
+    "traj-5e8a36bb-1586-4843-9c4b-8e981c02342a",
+    "traj-b1cf2655-7694-4656-94f1-0cd6a2f4e195",
+    "traj-b92d6fd6-6421-42e3-9a1e-d0d5ea7a3ce5",
+    "traj-cb26f2fe-8169-4616-b124-47594ea88495",
+    "traj-fd456c78-f35a-47e3-9491-d462f28ea5dd"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-31c3b5b5-84dc-4a81-91c6-d663f0856347.json b/docs/training-reports/report-31c3b5b5-84dc-4a81-91c6-d663f0856347.json
new file mode 100644
index 0000000..b11169d
--- /dev/null
+++ b/docs/training-reports/report-31c3b5b5-84dc-4a81-91c6-d663f0856347.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-31c3b5b5-84dc-4a81-91c6-d663f0856347",
+  "timestamp": "2026-04-14T20:57:03.309395+00:00",
+  "source_trajectory_ids": [
+    "traj-0f36d4fa-6aa7-4647-8434-939727c2c38b",
+    "traj-2abf9d11-6e8f-41c7-9554-ff424498b905",
+    "traj-43b8c849-cff6-4a05-b166-19252a8b4758",
+    "traj-78074052-0bf6-466c-bdfe-6cffd970494c",
+    "traj-7bda4651-8a18-49f1-957e-9163f264035b",
+    "traj-8c8c5944-1d6f-44e1-a6e8-119d8fd904d7",
+    "traj-91f84675-8772-44f6-8916-61f9277a9af7",
+    "traj-cf026710-8f03-45da-8327-9f5e5a671c24",
+    "traj-e11c2598-a12f-43cf-b547-98a44c373c30",
+    "traj-fa2448c9-d2da-43d2-802b-bb2585e5c8d4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205703",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-32ee1906-6846-46b2-99b6-0df8aa632f18.json b/docs/training-reports/report-32ee1906-6846-46b2-99b6-0df8aa632f18.json
new file mode 100644
index 0000000..674098b
--- /dev/null
+++ b/docs/training-reports/report-32ee1906-6846-46b2-99b6-0df8aa632f18.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-32ee1906-6846-46b2-99b6-0df8aa632f18",
+  "timestamp": "2026-04-14T21:42:45.766701+00:00",
+  "source_trajectory_ids": [
+    "traj-1e14ac8f-dac5-4cab-a3c6-5b4fa97fb779",
+    "traj-58dec6f6-4712-4747-b189-9428512d8069",
+    "traj-59664fbf-ce9a-4b4f-a039-2625767e85ae",
+    "traj-ae3d3da7-773c-48f7-8c7d-35c9f3d6cccf",
+    "traj-c399734a-4bbc-49f0-ae6f-c8fa23f6a482",
+    "traj-c3dff94e-43fd-4d04-820c-47c9b7f01dfb",
+    "traj-ca22da3d-09d6-493a-b338-444343e0b252",
+    "traj-eacacd75-cd33-40e3-874b-8032ddd42175",
+    "traj-f61b2282-3674-4a29-867b-0dbf664bd116",
+    "traj-ffd6b64c-3b24-4b83-b72b-01b083f6e4b8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214245",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-33a0b960-112f-4976-a8db-1b15177f7e8e.json b/docs/training-reports/report-33a0b960-112f-4976-a8db-1b15177f7e8e.json
new file mode 100644
index 0000000..0f97b22
--- /dev/null
+++ b/docs/training-reports/report-33a0b960-112f-4976-a8db-1b15177f7e8e.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-33a0b960-112f-4976-a8db-1b15177f7e8e",
+  "timestamp": "2026-04-14T20:32:37.730415+00:00",
+  "source_trajectory_ids": [
+    "traj-23a871e5-2833-48ee-b3c1-5ed94138fcaa",
+    "traj-3cd1b98f-d6c3-4452-bfbf-71a3cf56415c",
+    "traj-4e67d82a-e669-4074-8dd9-450c8dd5102a",
+    "traj-58dd91bb-8f28-45f6-ab96-c61aca61c671",
+    "traj-627cad0d-6b5d-4631-b0fa-ffd16e1435c9",
+    "traj-867989fb-734f-441c-8f54-7177f83bb7b9",
+    "traj-bdc6723e-392d-4fe3-94fb-950af73ecfbe",
+    "traj-d2a41298-1da9-4fc5-8ba1-4ff15425c2f4",
+    "traj-d7d6e4c5-b66f-4456-8105-fcdedf467877",
+    "traj-fcd41273-8d85-4d4d-a0aa-5313347fe699"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-33cb91ff-6fc2-4cff-9e7c-7c80d67e9beb.json b/docs/training-reports/report-33cb91ff-6fc2-4cff-9e7c-7c80d67e9beb.json
new file mode 100644
index 0000000..5e1a83e
--- /dev/null
+++ b/docs/training-reports/report-33cb91ff-6fc2-4cff-9e7c-7c80d67e9beb.json
@@ -0,0 +1,44 @@
+{
+  "report_id": "report-33cb91ff-6fc2-4cff-9e7c-7c80d67e9beb",
+  "timestamp": "2026-04-14T18:31:07.979291+00:00",
+  "source_trajectory_ids": [
+    "traj-1724a751-6423-4fff-b9c6-ef92845b7297",
+    "traj-1f6f99c3-21de-4a14-840e-180021048a34",
+    "traj-2733b029-d10a-4902-922a-644d329a17c1",
+    "traj-294aadaf-5f60-409d-80f4-c640d8d82abd",
+    "traj-3282276b-ea3a-40d8-babb-62fcad0fa27d",
+    "traj-45b4d7c4-367f-4cbc-825f-a112137275ef",
+    "traj-5035c348-520b-468b-8e12-5de07b0ca885",
+    "traj-c86af3f9-c441-4267-bd37-b8a0a2d182da",
+    "traj-ccbb4aaf-7cff-4711-b987-c4e61ec1a4a8",
+    "traj-cd5f83d9-4ef4-42e4-af07-7c3345c26fe8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-349d27df-93ce-49a1-ad9d-9c2ecdb5f9c1.json b/docs/training-reports/report-349d27df-93ce-49a1-ad9d-9c2ecdb5f9c1.json
new file mode 100644
index 0000000..3cf8aa0
--- /dev/null
+++ b/docs/training-reports/report-349d27df-93ce-49a1-ad9d-9c2ecdb5f9c1.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-349d27df-93ce-49a1-ad9d-9c2ecdb5f9c1",
+  "timestamp": "2026-04-14T20:03:02.621111+00:00",
+  "source_trajectory_ids": [
+    "traj-40ac86e5-85ee-4468-92ca-7be25e3e7442",
+    "traj-4c1bed8a-7db6-46f3-9ae9-1844cbbba837",
+    "traj-4f8e5d10-ce5d-4249-83d0-63e528af3bcd",
+    "traj-519e05a4-eec3-44ae-b07e-0c78565e6065",
+    "traj-5d84864a-c5e9-4674-a0ec-47ccedf609a9",
+    "traj-8fb37e36-f171-4285-8a37-905f6f7a34d5",
+    "traj-a0ab9293-99af-4b7b-894c-f0d6dec9fd40",
+    "traj-beead0f0-adcb-496a-8b73-ffe1bef4ee42",
+    "traj-d909fed4-7e99-46e0-a770-7d9e629cff7c",
+    "traj-e2d3db6a-5db2-41e8-8ecc-909f5d433324"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-36954641-b7fb-4fdd-895c-590d3ec6e0b8.json b/docs/training-reports/report-36954641-b7fb-4fdd-895c-590d3ec6e0b8.json
new file mode 100644
index 0000000..f12a020
--- /dev/null
+++ b/docs/training-reports/report-36954641-b7fb-4fdd-895c-590d3ec6e0b8.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-36954641-b7fb-4fdd-895c-590d3ec6e0b8",
+  "timestamp": "2026-04-14T16:53:59.545841+00:00",
+  "source_trajectory_ids": [
+    "traj-6ba2785a-40a7-4248-9959-81d676a53741",
+    "traj-6cc684a4-a0f5-4068-bb18-70049e38ad2f",
+    "traj-750702b7-0344-4ef3-bdb6-8f62486d8788",
+    "traj-978cf9b1-4066-4c0b-8612-f7432a602153",
+    "traj-a85c7d2a-aab9-4508-bee1-4da6e793166f",
+    "traj-ab6fd543-950f-4b00-8a2c-6fbe49ca70d0",
+    "traj-b40f890b-49cc-4c2e-b16b-35cf68e30ee1",
+    "traj-de829d20-8677-46d6-a1c5-bd9501eab3ce",
+    "traj-eb695119-3ef5-4bc3-8c01-1855bab0cd0b",
+    "traj-f13331d8-e087-4d5a-88a6-bc9f754791a0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165359"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-36b3533a-655c-4a02-b65e-850d90a1c320.json b/docs/training-reports/report-36b3533a-655c-4a02-b65e-850d90a1c320.json
new file mode 100644
index 0000000..9f16e8d
--- /dev/null
+++ b/docs/training-reports/report-36b3533a-655c-4a02-b65e-850d90a1c320.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-36b3533a-655c-4a02-b65e-850d90a1c320",
+  "timestamp": "2026-04-14T20:56:11.681988+00:00",
+  "source_trajectory_ids": [
+    "traj-22545cf9-c2c8-4ef9-abcd-271a216d7b39",
+    "traj-487aba11-f414-4f54-ab08-635e4436ce00",
+    "traj-652cb0d0-0132-44b7-b88b-15cc073fa6b6",
+    "traj-6e8fb3a1-d05c-4f21-acac-751c59695c26",
+    "traj-726e0d27-aadc-487a-a2c2-2245705d78bb",
+    "traj-996ecef9-5116-4cc9-aca0-ed85c27666bb",
+    "traj-a93a36a4-0c2b-441d-8b9a-8aeb61685092",
+    "traj-d1beb072-dee1-4ed3-bbfb-3c9462a713ca",
+    "traj-d58e0089-340c-4910-8c06-bfba7862d075",
+    "traj-fd45b279-677a-4653-bebf-784271ce95a1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205611",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-37a28243-ed47-4bc7-b260-9bb38b5c0f99.json b/docs/training-reports/report-37a28243-ed47-4bc7-b260-9bb38b5c0f99.json
new file mode 100644
index 0000000..2bd3be7
--- /dev/null
+++ b/docs/training-reports/report-37a28243-ed47-4bc7-b260-9bb38b5c0f99.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-37a28243-ed47-4bc7-b260-9bb38b5c0f99",
+  "timestamp": "2026-04-14T18:27:21.104548+00:00",
+  "source_trajectory_ids": [
+    "traj-21a6856f-bdd7-4851-b9e0-219458e43cdb",
+    "traj-31dffa4d-dd6e-4d32-831d-41e83119f7fc",
+    "traj-372f8001-3343-4e51-b415-ad3231658ffe",
+    "traj-37658dc1-0bd6-4245-88f2-5a12b901d82f",
+    "traj-62dfe993-4c43-4122-9dde-39cfe6ea5fc1",
+    "traj-6cdaf4fa-5553-46c4-b550-3a01fd2b6371",
+    "traj-cafc62d9-5964-4a94-ac55-26e194aed032",
+    "traj-cc0843b4-d2fd-4a69-8b82-b55ad4dc8c79",
+    "traj-e5a7d153-7a31-4618-b117-ec13d8c192db",
+    "traj-f2aeb199-2ef4-4d29-a523-740d1d10aeb7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-182721"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-38b4ed01-cbaa-4ec2-9b31-af634d9786b1.json b/docs/training-reports/report-38b4ed01-cbaa-4ec2-9b31-af634d9786b1.json
new file mode 100644
index 0000000..17caffe
--- /dev/null
+++ b/docs/training-reports/report-38b4ed01-cbaa-4ec2-9b31-af634d9786b1.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-38b4ed01-cbaa-4ec2-9b31-af634d9786b1",
+  "timestamp": "2026-04-14T15:04:26.185363+00:00",
+  "source_trajectory_ids": [
+    "traj-5e3f3a25-8ba2-4047-9cc5-ec1072ef1eec",
+    "traj-68138afe-056b-4e15-b67a-cbcaf8c17ff5",
+    "traj-6c8c007c-7781-4b15-a369-572a00f40457",
+    "traj-76fbd22e-1f59-4855-afe9-ebf90f8d59e6",
+    "traj-7b21dec9-e7b7-4f68-a5a0-c50221ee37aa",
+    "traj-a967b477-d265-414f-a33e-2582a8f0e086",
+    "traj-b0d09d4f-d041-41ce-94aa-1b7221e182ca",
+    "traj-bb05f70d-64f5-49ad-bbb2-9a63bf255d5b",
+    "traj-e4bd70d9-0680-4385-9397-dfb3299f9be0",
+    "traj-f4fed21e-bf33-479b-a05e-379b6997ca42"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-150426"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-3c6cc2dd-862a-480a-913d-5554b4058d11.json b/docs/training-reports/report-3c6cc2dd-862a-480a-913d-5554b4058d11.json
new file mode 100644
index 0000000..973b886
--- /dev/null
+++ b/docs/training-reports/report-3c6cc2dd-862a-480a-913d-5554b4058d11.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-3c6cc2dd-862a-480a-913d-5554b4058d11",
+  "timestamp": "2026-04-14T18:01:06.225115+00:00",
+  "source_trajectory_ids": [
+    "traj-0f44d7ff-d89a-4c71-8c9a-1302bf13c23b",
+    "traj-34cf1d79-5fff-4bf2-a4a0-cfb8336821a4",
+    "traj-43bc3505-9d70-4a13-9212-da1cfe4e09d1",
+    "traj-566c33fe-5f4e-4697-9166-0420d63623fb",
+    "traj-56ac21d8-1dcf-4525-95ea-93d3ca52647e",
+    "traj-787abb45-cb75-40b7-b734-71d7cf60a180",
+    "traj-a3da8105-84a6-459d-a28f-6892b09afcdc",
+    "traj-d4cbefc9-9112-490a-a27c-f07f6e3d9662",
+    "traj-dfda911e-a036-443f-a5dd-28c8f1d84bca",
+    "traj-f75507d0-c01d-44d5-aaf3-9ff5f7f6c682"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-3d4ecf12-c252-493e-8ba5-2bfe51399190.json b/docs/training-reports/report-3d4ecf12-c252-493e-8ba5-2bfe51399190.json
new file mode 100644
index 0000000..8cb061a
--- /dev/null
+++ b/docs/training-reports/report-3d4ecf12-c252-493e-8ba5-2bfe51399190.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-3d4ecf12-c252-493e-8ba5-2bfe51399190",
+  "timestamp": "2026-04-14T16:52:07.890026+00:00",
+  "source_trajectory_ids": [
+    "traj-2d72ff63-f4d4-43e4-b7d6-a159d321ef0e",
+    "traj-2ebae5a7-9c72-4178-b019-3380e17dea1d",
+    "traj-7123a23c-15ce-400a-9aff-1e4e2251695f",
+    "traj-7a9be049-d77e-4bf8-9776-045cad6a88b0",
+    "traj-a0786cfa-9353-4740-a8f1-473809ca7cd5",
+    "traj-a1a1766b-f5c3-4070-ad40-590985c65ff7",
+    "traj-ab013496-4365-4881-b625-e466728a83e3",
+    "traj-d8dc62b0-35bb-4e47-b50e-44f98de2eaf4",
+    "traj-e0529bff-6c1c-4afd-9ff3-195690e941dc",
+    "traj-eb197e1a-479a-4bea-8fdc-f48d2a053b7c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165207"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-3e0d3ae0-e20c-40bc-981c-3f582cfbb7b2.json b/docs/training-reports/report-3e0d3ae0-e20c-40bc-981c-3f582cfbb7b2.json
new file mode 100644
index 0000000..7c448bf
--- /dev/null
+++ b/docs/training-reports/report-3e0d3ae0-e20c-40bc-981c-3f582cfbb7b2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-3e0d3ae0-e20c-40bc-981c-3f582cfbb7b2",
+  "timestamp": "2026-04-14T18:06:25.522532+00:00",
+  "source_trajectory_ids": [
+    "traj-13bc1070-0b04-4664-b4e9-13400d3a9362",
+    "traj-4151835b-6ce4-46d5-9fdc-ee914931954a",
+    "traj-5a342582-6740-452b-a17c-36cb459966e8",
+    "traj-946fe752-e504-4791-aa5f-cb0af58f66f9",
+    "traj-99c0c477-db3e-4c2c-b95c-40c3489923d8",
+    "traj-afed1dbc-f4b5-4826-8124-a448a1ec0b86",
+    "traj-b0e8b9fe-261b-4673-9a17-8e3f4f64ae2e",
+    "traj-b2579e6a-2ee4-4572-affd-33537db667df",
+    "traj-cf639c55-a6a6-48e3-941d-9bc0b9bb1b88",
+    "traj-e3a07906-0351-4b52-a1a0-f5e1391c93e3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-401afaba-f468-45bf-8442-2bf14c8316a2.json b/docs/training-reports/report-401afaba-f468-45bf-8442-2bf14c8316a2.json
new file mode 100644
index 0000000..c78f28b
--- /dev/null
+++ b/docs/training-reports/report-401afaba-f468-45bf-8442-2bf14c8316a2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-401afaba-f468-45bf-8442-2bf14c8316a2",
+  "timestamp": "2026-04-15T01:25:33.810966+00:00",
+  "source_trajectory_ids": [
+    "traj-13292acc-308a-43f7-9716-87c8b938eb0f",
+    "traj-223e79f2-f626-4ae1-a0ad-2a6d053af25c",
+    "traj-519063dd-7bf3-461a-875b-bb7a4ecc2893",
+    "traj-66938ede-2373-46f8-a459-b5f291f3bc2b",
+    "traj-868a893d-8fba-45ba-875d-6a6e1f2dce8e",
+    "traj-916f484c-13a0-415b-aefa-832c07dfcf03",
+    "traj-9e1cead1-15bb-4098-bf5c-8b9b810988d9",
+    "traj-b3dc9704-72c5-4b75-a238-ab66b22dd766",
+    "traj-c6ec0501-d96d-4ad8-bf88-072640e22e4d",
+    "traj-cee07144-c2df-407d-9c43-29a76de1a48a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012533",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4105b4e5-179f-431f-9f11-b110077fb2bc.json b/docs/training-reports/report-4105b4e5-179f-431f-9f11-b110077fb2bc.json
new file mode 100644
index 0000000..69b7c0e
--- /dev/null
+++ b/docs/training-reports/report-4105b4e5-179f-431f-9f11-b110077fb2bc.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4105b4e5-179f-431f-9f11-b110077fb2bc",
+  "timestamp": "2026-04-14T21:18:36.114147+00:00",
+  "source_trajectory_ids": [
+    "traj-2fce941b-cf65-4efc-a9bc-c83672376d6e",
+    "traj-62d39291-7f83-4655-a5d0-c992d9ecdf04",
+    "traj-69975215-b8ff-4faa-b5f2-88ad26805b28",
+    "traj-81882b7c-ba4c-4ffc-89f4-7803c3aeee01",
+    "traj-8816a644-f2ef-44c1-8185-ea2cb83afb06",
+    "traj-9480f7c2-8853-4971-b36f-19c9f1592285",
+    "traj-956fdf6d-ef5a-4a7b-b32f-029f38e72533",
+    "traj-a1e6c947-e6e2-4f90-9a6e-22d99a656a2b",
+    "traj-b2d31d5b-12b7-4694-a26f-e91e1e3be8a7",
+    "traj-d03bb05c-acd4-452d-af83-77571b30009a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-211836",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4109400b-cfbb-4d5a-a26a-e0f4fc054541.json b/docs/training-reports/report-4109400b-cfbb-4d5a-a26a-e0f4fc054541.json
new file mode 100644
index 0000000..2e4a2c3
--- /dev/null
+++ b/docs/training-reports/report-4109400b-cfbb-4d5a-a26a-e0f4fc054541.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-4109400b-cfbb-4d5a-a26a-e0f4fc054541",
+  "timestamp": "2026-04-14T15:50:36.233055+00:00",
+  "source_trajectory_ids": [
+    "traj-096259f5-d76c-4c42-b42e-fa16c5d2935e",
+    "traj-19070f8a-b395-4564-a3f0-4cb00e20ea3c",
+    "traj-226fd70d-da48-4f47-9f63-b6df7efc0175",
+    "traj-2614cf9a-bebc-44ca-99f6-508a266e2f42",
+    "traj-4a954778-c76c-4c98-9923-9f97365efc12",
+    "traj-5b0281de-bb94-4864-85a9-3a6b10d38121",
+    "traj-5bd45032-16b1-438f-b3a4-ba400ac52884",
+    "traj-82903c13-0579-4ff4-9d16-631ae7174d9e",
+    "traj-9ba2db0d-281d-44dc-a50e-09df89bee6f9",
+    "traj-a6c6d8c3-8073-457e-b55c-48b78eead9ba"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-155036"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-413d07cb-cf68-42a8-87fd-19fef8e752d1.json b/docs/training-reports/report-413d07cb-cf68-42a8-87fd-19fef8e752d1.json
new file mode 100644
index 0000000..4cfd397
--- /dev/null
+++ b/docs/training-reports/report-413d07cb-cf68-42a8-87fd-19fef8e752d1.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-413d07cb-cf68-42a8-87fd-19fef8e752d1",
+  "timestamp": "2026-04-14T22:05:59.221755+00:00",
+  "source_trajectory_ids": [
+    "traj-00f15b5e-a641-42bd-b3d0-58fe2d9ab635",
+    "traj-0fc4a255-480c-49a2-b281-420b14c89d71",
+    "traj-1e601826-f496-492b-9001-033c1f4bf38f",
+    "traj-2055c676-3aa8-47c9-838b-a464e2599090",
+    "traj-35633916-ceff-4ef2-b270-260ef43f068e",
+    "traj-37f59692-f0db-42c7-9fb8-12a24fe65336",
+    "traj-5f43857c-36d8-464f-97e4-17d3532babfb",
+    "traj-7dc6d914-fc99-4995-a701-4c544444e421",
+    "traj-99143129-4966-49d8-8c79-0bae868f22e6",
+    "traj-f28f6c36-e9f9-4745-98a3-082eccb8e7a2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220559",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-41510fac-07d4-48d7-b0d3-435308de8a9b.json b/docs/training-reports/report-41510fac-07d4-48d7-b0d3-435308de8a9b.json
new file mode 100644
index 0000000..6e0bcd9
--- /dev/null
+++ b/docs/training-reports/report-41510fac-07d4-48d7-b0d3-435308de8a9b.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-41510fac-07d4-48d7-b0d3-435308de8a9b",
+  "timestamp": "2026-04-14T22:09:39.015784+00:00",
+  "source_trajectory_ids": [
+    "traj-0679d7fb-16f2-4815-b422-28bfad02aa05",
+    "traj-4cacef32-7bd5-478d-9811-3cf929fdf4cb",
+    "traj-72eec91d-72f8-49ad-b56f-d40b65f4cd76",
+    "traj-7784f333-f35b-4cc6-9258-71cb52ed5d62",
+    "traj-ad9737bf-684f-42df-927b-a8c239a9e63e",
+    "traj-d1536852-2f04-44a2-b18a-6b25acacff35",
+    "traj-ef00ca26-5cfc-4a06-9dcd-e6f678c85c8f",
+    "traj-f33cfd42-2102-4873-a3f0-5a2be2be4d69",
+    "traj-f74fe8d9-81b7-4d72-8514-e5bce6b23716",
+    "traj-f772c53c-6d53-4fa3-9147-b6642fa4a1e8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220939",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-43d4c045-9a5c-40d5-8b1d-1d7c6f77adb6.json b/docs/training-reports/report-43d4c045-9a5c-40d5-8b1d-1d7c6f77adb6.json
new file mode 100644
index 0000000..81df3ae
--- /dev/null
+++ b/docs/training-reports/report-43d4c045-9a5c-40d5-8b1d-1d7c6f77adb6.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-43d4c045-9a5c-40d5-8b1d-1d7c6f77adb6",
+  "timestamp": "2026-04-14T18:57:10.528189+00:00",
+  "source_trajectory_ids": [
+    "traj-1c8080c5-67cb-43d0-80ac-b5e2ec996a3f",
+    "traj-3f7ac604-e2ed-445e-a8d4-dbd94524609b",
+    "traj-46540622-c2d7-4783-a309-3833fa1a3f70",
+    "traj-4e5d9253-feaa-4641-bb84-680f81e38c57",
+    "traj-7bca135b-7fe2-444a-b112-10570666956d",
+    "traj-9aec1caa-9415-45a9-ad1d-e5dddda3985b",
+    "traj-badf1655-c1c5-45ca-b7d5-3f3ab5593848",
+    "traj-bc375f1b-d8ad-459c-9f08-880f95e1b8d9",
+    "traj-da9c3b03-5ff3-4f35-bbe1-be521909b86e",
+    "traj-ec6156e1-6013-45a6-8e3b-66af0214feb7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185710",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-43d6e714-88cc-4a65-a101-0d295c3dd389.json b/docs/training-reports/report-43d6e714-88cc-4a65-a101-0d295c3dd389.json
new file mode 100644
index 0000000..e5b1c31
--- /dev/null
+++ b/docs/training-reports/report-43d6e714-88cc-4a65-a101-0d295c3dd389.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-43d6e714-88cc-4a65-a101-0d295c3dd389",
+  "timestamp": "2026-04-14T20:58:05.550865+00:00",
+  "source_trajectory_ids": [
+    "traj-4605e012-4600-41ea-87ef-de75d36cb859",
+    "traj-59306aa1-bcd1-42d8-9bfd-bd8f8d26120a",
+    "traj-73d23896-0973-442f-87f8-0a80e64d51cb",
+    "traj-85d6d2cd-b787-42d1-9ed4-623353e3c13f",
+    "traj-afd7b74c-dd4d-4407-9223-73b0d6fdb58c",
+    "traj-d2c33fae-90f5-42ae-a928-75cc1f2f4475",
+    "traj-deb4aa39-0602-4dd7-b06c-57f7fdda2053",
+    "traj-ee7a8746-7cb2-416b-9f11-afefd425ed0a",
+    "traj-f369cb32-35a0-4ea2-92a1-f5980c79fd06",
+    "traj-f5d33f02-0607-4730-894e-4b06be60d7d8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205805",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-43e5bf98-8055-4b9f-8640-f2414224d4bf.json b/docs/training-reports/report-43e5bf98-8055-4b9f-8640-f2414224d4bf.json
new file mode 100644
index 0000000..64edcf1
--- /dev/null
+++ b/docs/training-reports/report-43e5bf98-8055-4b9f-8640-f2414224d4bf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-43e5bf98-8055-4b9f-8640-f2414224d4bf",
+  "timestamp": "2026-04-14T22:05:44.115652+00:00",
+  "source_trajectory_ids": [
+    "traj-04405ec0-7ca9-4465-9518-7dab5d020c30",
+    "traj-0c0ddc68-0063-41c0-ab91-8e966b62f705",
+    "traj-0f652d30-d7e1-47fa-b124-cd288a12a14c",
+    "traj-1ce33c0d-597a-49e6-93dc-88f069e4c302",
+    "traj-24e41e0a-09f1-4d0c-8899-d4bb69ac9ddf",
+    "traj-7ab55256-fcd2-453f-abbf-4790df4f6ca8",
+    "traj-82348fa8-183a-4a40-a366-b6ac6a87309a",
+    "traj-b2eeb3af-3de0-44a4-b183-060b3a6d81d7",
+    "traj-dcf0d891-e3b8-49e5-ac91-569f3ceb3a90",
+    "traj-eabcb8e0-3bc8-47f5-a8dc-4221362a102f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220544",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-45467fbb-5fdb-45fe-b4a5-293b6560c08c.json b/docs/training-reports/report-45467fbb-5fdb-45fe-b4a5-293b6560c08c.json
new file mode 100644
index 0000000..94e78d9
--- /dev/null
+++ b/docs/training-reports/report-45467fbb-5fdb-45fe-b4a5-293b6560c08c.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-45467fbb-5fdb-45fe-b4a5-293b6560c08c",
+  "timestamp": "2026-04-15T01:33:34.775057+00:00",
+  "source_trajectory_ids": [
+    "traj-0bd98b63-fe62-407b-a82a-22240f040427",
+    "traj-0d032b19-011f-48c7-9bee-508b57b44d26",
+    "traj-15f2455d-56db-46e2-bd41-6418ba23a463",
+    "traj-4781ddd9-6ee8-43a1-9047-40503b2e5fae",
+    "traj-9e942dcc-7508-4779-91e2-a03c1b82c597",
+    "traj-b09495ff-1be2-4c94-bde8-dd0ea7cfbcad",
+    "traj-cda9f732-c8a1-4d7f-81bc-21318c922894",
+    "traj-e2626858-fc8f-476a-913f-fa9b1944c6a4",
+    "traj-ea4fed23-c0b9-4f76-9489-967dd42ed5ec",
+    "traj-fd5d13e3-ee3b-468b-99ea-8e85c7e0393b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013334",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-45f61edc-3f4c-4c6c-822f-d4e97628aa77.json b/docs/training-reports/report-45f61edc-3f4c-4c6c-822f-d4e97628aa77.json
new file mode 100644
index 0000000..7c41aca
--- /dev/null
+++ b/docs/training-reports/report-45f61edc-3f4c-4c6c-822f-d4e97628aa77.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-45f61edc-3f4c-4c6c-822f-d4e97628aa77",
+  "timestamp": "2026-04-14T22:08:19.673227+00:00",
+  "source_trajectory_ids": [
+    "traj-02f093fc-763e-4d3e-bafc-bb8aeaea75f1",
+    "traj-0a2744e8-e06b-4ecd-b0df-fa0a1c886b23",
+    "traj-28d82cd2-2f61-4a7d-903c-76c29caa93a2",
+    "traj-36f2201d-9728-480b-8c84-b40a7062e4e6",
+    "traj-7bba52c5-33d5-40e6-83a8-0444853df15d",
+    "traj-8a29f02e-99b7-4169-bff5-061f9aaa82a3",
+    "traj-8f82347d-7a20-4f92-9491-6ea282ed4c9e",
+    "traj-d76fd602-b881-4484-afd1-ef3c2f26f839",
+    "traj-f000568b-7ac3-4a67-a8ce-63d1d9beb690",
+    "traj-ffa74e58-35c4-42dd-a83d-e454392c2914"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220819",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-461a5c98-57e6-40c8-b2ce-5a70f64072d2.json b/docs/training-reports/report-461a5c98-57e6-40c8-b2ce-5a70f64072d2.json
new file mode 100644
index 0000000..c57dd40
--- /dev/null
+++ b/docs/training-reports/report-461a5c98-57e6-40c8-b2ce-5a70f64072d2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-461a5c98-57e6-40c8-b2ce-5a70f64072d2",
+  "timestamp": "2026-04-15T01:41:52.334143+00:00",
+  "source_trajectory_ids": [
+    "traj-3493a4b4-50a6-4a28-8a28-b6811f7dd289",
+    "traj-363519ec-b624-4a15-857c-7dd124594ef9",
+    "traj-68a740ca-fdb0-4fe4-9a32-a05ea194f9eb",
+    "traj-73adf39a-8d2a-465b-815b-d6b59ce127f9",
+    "traj-7a2de94c-df1d-4b74-83c9-1170b533d845",
+    "traj-7dd9573a-f665-43a7-b819-cb5ed45dc137",
+    "traj-afaf90c3-3df5-4f20-ad12-ed790fe8d8aa",
+    "traj-b1e72746-d5a7-44b8-b666-aa8c48168305",
+    "traj-e1bc8e6c-e77a-4cb1-8175-8099e93b48de",
+    "traj-f98d3663-69cd-46b4-ae0a-c20bc80e86f2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-014152",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-48ee7d3f-d744-43de-93ad-88b94206d59a.json b/docs/training-reports/report-48ee7d3f-d744-43de-93ad-88b94206d59a.json
new file mode 100644
index 0000000..0ed466a
--- /dev/null
+++ b/docs/training-reports/report-48ee7d3f-d744-43de-93ad-88b94206d59a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-48ee7d3f-d744-43de-93ad-88b94206d59a",
+  "timestamp": "2026-04-15T02:33:47.729485+00:00",
+  "source_trajectory_ids": [
+    "traj-03f64fa4-12e6-43d3-bdd4-850386d7941f",
+    "traj-231b0863-7611-4fb2-9638-476e7663da5d",
+    "traj-552c1c6d-4bde-446e-bbf8-c762fed02e81",
+    "traj-55c1df2f-4e21-463f-8be8-d412eb14da62",
+    "traj-686ff2d2-9a3c-43e3-8b40-f96962f5647a",
+    "traj-6e82a8d9-15a2-4891-97c8-ecb3ec3f7192",
+    "traj-91c9e740-03c7-41b0-883d-b5eeef1a3cc0",
+    "traj-9d012535-4e80-4aed-b515-97c6a01f4d53",
+    "traj-ad68a77b-9601-42f2-9c42-0ca21bb7c73f",
+    "traj-b428fffd-fd29-488a-9d12-95db39eac38e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023347",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4a5187ed-5a28-4ee0-8b40-2e821323693d.json b/docs/training-reports/report-4a5187ed-5a28-4ee0-8b40-2e821323693d.json
new file mode 100644
index 0000000..00994b3
--- /dev/null
+++ b/docs/training-reports/report-4a5187ed-5a28-4ee0-8b40-2e821323693d.json
@@ -0,0 +1,44 @@
+{
+  "report_id": "report-4a5187ed-5a28-4ee0-8b40-2e821323693d",
+  "timestamp": "2026-04-14T18:51:33.158431+00:00",
+  "source_trajectory_ids": [
+    "traj-137b6353-1315-4d48-9b83-ad28569d0c96",
+    "traj-5bd3ab53-8268-4fe0-94a3-490a58200f6f",
+    "traj-60d28c42-0680-46f8-8de2-f61ec935db8f",
+    "traj-7886fd07-f658-4bc2-8e44-232a6aad7480",
+    "traj-868a1eeb-dfad-449d-a725-227ea2c01931",
+    "traj-950f2bf9-4a9a-4498-a7fc-79abfbc47937",
+    "traj-a411bae6-de2f-4f81-bc94-6e7075c24b4b",
+    "traj-a46a7e48-9852-4132-bc5b-44c0dcd5744b",
+    "traj-eb000b22-fc6f-4a47-9287-7b960627725b",
+    "traj-f45c3555-2524-44df-9644-979f68d3ed71"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4b6ecd0d-81b5-4e2c-b52a-d9907a011b20.json b/docs/training-reports/report-4b6ecd0d-81b5-4e2c-b52a-d9907a011b20.json
new file mode 100644
index 0000000..d93ccea
--- /dev/null
+++ b/docs/training-reports/report-4b6ecd0d-81b5-4e2c-b52a-d9907a011b20.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4b6ecd0d-81b5-4e2c-b52a-d9907a011b20",
+  "timestamp": "2026-04-14T18:58:16.264614+00:00",
+  "source_trajectory_ids": [
+    "traj-59f457a5-cb29-40d8-bc7d-8730e06986df",
+    "traj-7102877e-30e6-448b-bb99-9b0ac908a73e",
+    "traj-79299b5a-8e49-4ccb-8881-948903c39580",
+    "traj-968451f5-70ec-4d23-b040-9e638a324b78",
+    "traj-c28c9b70-91b0-412e-af2d-f4bd5551cee4",
+    "traj-c2e9306c-f40a-45e9-8954-3fc538faaf6d",
+    "traj-d09a7893-3785-4c50-8a70-69f3bc43173b",
+    "traj-d205135d-bff0-4f1b-8626-912fee42f566",
+    "traj-e2741ebc-dc3f-490d-9a8a-aba7d6ac5f0d",
+    "traj-e408ab2c-8fdb-4a61-a71d-30f9abb9ff3d"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185816",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4d50ee54-0bfa-4e34-926b-45900bfd3f8d.json b/docs/training-reports/report-4d50ee54-0bfa-4e34-926b-45900bfd3f8d.json
new file mode 100644
index 0000000..eb4f1d7
--- /dev/null
+++ b/docs/training-reports/report-4d50ee54-0bfa-4e34-926b-45900bfd3f8d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4d50ee54-0bfa-4e34-926b-45900bfd3f8d",
+  "timestamp": "2026-04-14T20:58:05.422298+00:00",
+  "source_trajectory_ids": [
+    "traj-067e1a4c-17cd-4374-b0c6-bab2ee05aeab",
+    "traj-241638b5-a874-45da-9be1-cd0541337e08",
+    "traj-38541187-375c-438a-be39-837e2330e11a",
+    "traj-5be14b5b-4c9c-459f-afc9-22a5935a2cc8",
+    "traj-6c2e9b3a-0327-43ca-81e5-56db7ae936b7",
+    "traj-7025a020-2103-4fe2-baa5-a0da4c7b9cbb",
+    "traj-79299683-0b2d-4e4e-a124-1fa48c59ef36",
+    "traj-96c030f3-e898-42c0-9911-338093d1dc60",
+    "traj-a573418c-2c39-41cf-9440-477dea42c020",
+    "traj-b1f69fcc-66d9-4f33-8b73-312398f8217e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205805",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4dc7057d-d4f5-4250-b949-bc20d6f7521a.json b/docs/training-reports/report-4dc7057d-d4f5-4250-b949-bc20d6f7521a.json
new file mode 100644
index 0000000..0b59172
--- /dev/null
+++ b/docs/training-reports/report-4dc7057d-d4f5-4250-b949-bc20d6f7521a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4dc7057d-d4f5-4250-b949-bc20d6f7521a",
+  "timestamp": "2026-04-15T02:33:47.693933+00:00",
+  "source_trajectory_ids": [
+    "traj-0087d4b2-4bf7-4f26-aa54-1cb943e94dd0",
+    "traj-21bb050c-500e-41e1-87b9-d43e9488d12c",
+    "traj-282b3e0e-3c53-4580-bdcb-b25d66b9f3dd",
+    "traj-42d8a589-60b0-4001-8165-25c36e4e6d09",
+    "traj-508a9891-7a54-4b4c-8dd7-943e5539100e",
+    "traj-510f9b87-c96a-4b30-897a-96b86c6d15c4",
+    "traj-6df384c0-0ce4-4049-8983-87ce2880ac89",
+    "traj-7b69d26f-cbd3-4b5e-ac40-44133fe58a39",
+    "traj-cf692ef7-f7ff-467f-85a4-dff08182e3e8",
+    "traj-f44fae13-f32f-41da-a739-cec74835bd37"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023347",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4e7caa32-18a9-456f-a773-1c67fe43ce31.json b/docs/training-reports/report-4e7caa32-18a9-456f-a773-1c67fe43ce31.json
new file mode 100644
index 0000000..411a33c
--- /dev/null
+++ b/docs/training-reports/report-4e7caa32-18a9-456f-a773-1c67fe43ce31.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4e7caa32-18a9-456f-a773-1c67fe43ce31",
+  "timestamp": "2026-04-14T20:32:37.607581+00:00",
+  "source_trajectory_ids": [
+    "traj-20efc089-2454-4f47-aad0-a67f25a73279",
+    "traj-31875fd0-e57c-42f6-b659-3140e3f06f5b",
+    "traj-440f0075-1325-461f-b67c-e68605eea57e",
+    "traj-462b7514-5775-4ee2-a228-8923fc083277",
+    "traj-5eaada3f-e728-41be-a722-eaacc2343b27",
+    "traj-60259dde-b711-40e5-b639-b935ef71307f",
+    "traj-951afa2a-5840-4fc2-a770-8e5624551899",
+    "traj-ae15a5aa-85e9-4879-b3f5-ce1fc5882cd1",
+    "traj-d76ddb92-8283-4600-9081-ca6a81af9cd4",
+    "traj-fb2ecbaa-cbfc-4c35-8c7e-08290bb2763f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203237",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-4ecb8a73-0bf3-4c31-ba10-1be9db3146f9.json b/docs/training-reports/report-4ecb8a73-0bf3-4c31-ba10-1be9db3146f9.json
new file mode 100644
index 0000000..422bcb7
--- /dev/null
+++ b/docs/training-reports/report-4ecb8a73-0bf3-4c31-ba10-1be9db3146f9.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-4ecb8a73-0bf3-4c31-ba10-1be9db3146f9",
+  "timestamp": "2026-04-14T20:31:11.222850+00:00",
+  "source_trajectory_ids": [
+    "traj-0681f8e7-a761-4503-99eb-1e4bc006ec12",
+    "traj-0ce9f156-20c5-49a3-8b0a-1fc9e028996c",
+    "traj-13c2f422-b99b-4c93-b92b-35180e88519e",
+    "traj-2dfba0de-8e4b-4e07-b6b3-0752f278c8e9",
+    "traj-5c09b804-4076-41bf-ae6f-17fbc024ee42",
+    "traj-77c07fd7-4479-494b-bdff-0a353664e330",
+    "traj-a26447e5-654d-4ecd-9af9-270d8ef26e6b",
+    "traj-bd4660e9-a9a8-4aad-8215-cfc5708b4e04",
+    "traj-c059fabf-2e18-481d-a645-3718b74b9963",
+    "traj-dbf9d2d7-1476-4b77-8f64-9b9e0c16535f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203111",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5012fdef-1b08-4522-a946-072c23b71714.json b/docs/training-reports/report-5012fdef-1b08-4522-a946-072c23b71714.json
new file mode 100644
index 0000000..c4abe09
--- /dev/null
+++ b/docs/training-reports/report-5012fdef-1b08-4522-a946-072c23b71714.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-5012fdef-1b08-4522-a946-072c23b71714",
+  "timestamp": "2026-04-14T18:00:27.756619+00:00",
+  "source_trajectory_ids": [
+    "traj-13febbec-cc01-4c68-8895-953e3a458337",
+    "traj-19937555-e13c-4714-9414-1c3332af7ac8",
+    "traj-28634bdc-f068-476a-a40a-8318e8786aa9",
+    "traj-43926528-d16f-41e7-b0d5-ef6e49c1bc05",
+    "traj-896a8b18-30c4-4d70-98e0-e283f8fc5517",
+    "traj-91eb1710-444e-4c6d-90e1-aae28c98f4cf",
+    "traj-96e9bf84-ff75-4b6e-aa70-c328fc951b36",
+    "traj-a626ce77-02ac-4915-9889-4c2364adeeef",
+    "traj-b40efd95-f438-4215-ad9a-e46af8638c07",
+    "traj-fcff5db2-b6bc-46dc-b07d-d0029f88706e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180027"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-508eb935-fe06-4e84-92b6-a1fe22bc6159.json b/docs/training-reports/report-508eb935-fe06-4e84-92b6-a1fe22bc6159.json
new file mode 100644
index 0000000..342dd4f
--- /dev/null
+++ b/docs/training-reports/report-508eb935-fe06-4e84-92b6-a1fe22bc6159.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-508eb935-fe06-4e84-92b6-a1fe22bc6159",
+  "timestamp": "2026-04-14T20:58:05.539426+00:00",
+  "source_trajectory_ids": [
+    "traj-0253e6ba-f504-4379-ae3d-83016e88c09d",
+    "traj-2144ea5b-3969-4e06-a893-330db7c757bc",
+    "traj-22880d82-81b8-4803-a7d8-72f7acb5712e",
+    "traj-4595b5b9-6e23-42f5-8a61-5fde96159580",
+    "traj-9c89d043-5997-4480-9de6-9262bdc02a31",
+    "traj-ad14e644-0b4f-4ef7-a62f-79bdea3c6222",
+    "traj-cf28e6e2-68f8-4351-9f0f-eb6e9cdc3cd7",
+    "traj-d1b53585-ba7d-4492-8164-95dd51f0fbf8",
+    "traj-de3972a7-1bed-44e6-adee-e4b2c7c56456",
+    "traj-f8bc3f68-896b-4b6f-b0f3-0a6381f70ac2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-50e8ccdd-d716-40cf-b02c-5c349565a955.json b/docs/training-reports/report-50e8ccdd-d716-40cf-b02c-5c349565a955.json
new file mode 100644
index 0000000..dbfe9f3
--- /dev/null
+++ b/docs/training-reports/report-50e8ccdd-d716-40cf-b02c-5c349565a955.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-50e8ccdd-d716-40cf-b02c-5c349565a955",
+  "timestamp": "2026-04-14T20:02:20.807645+00:00",
+  "source_trajectory_ids": [
+    "traj-1304d305-b55f-4849-a6d0-df7f3e0c1fb9",
+    "traj-3156c560-d459-42b3-a70d-df0a6e9af578",
+    "traj-3cb2d518-f74a-45bc-b232-a216d3f28abc",
+    "traj-61a1791a-5a2c-4c7d-a4fc-e4b82952c51c",
+    "traj-621b385d-d516-4506-8e27-5468be804754",
+    "traj-abc09bf9-adce-4fcd-8c1a-a151bc749971",
+    "traj-b08d47a2-57d9-417f-a0fd-87661137c0e4",
+    "traj-b2b54604-eece-4f93-96b7-df5cbe2f11b8",
+    "traj-e31d9867-6180-4867-9946-c32e219ed44e",
+    "traj-ee895acf-27e2-4efb-868b-793b5b0f5f59"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200220",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-51b272a3-aa2a-42a9-b406-2af6bb6c13d5.json b/docs/training-reports/report-51b272a3-aa2a-42a9-b406-2af6bb6c13d5.json
new file mode 100644
index 0000000..9541729
--- /dev/null
+++ b/docs/training-reports/report-51b272a3-aa2a-42a9-b406-2af6bb6c13d5.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-51b272a3-aa2a-42a9-b406-2af6bb6c13d5",
+  "timestamp": "2026-04-15T01:33:34.743842+00:00",
+  "source_trajectory_ids": [
+    "traj-019b1e3d-a4d5-414c-98f2-14566d4e1a73",
+    "traj-638e4b82-6bb8-4f95-9112-39f1d153140f",
+    "traj-7c90795a-8999-4733-9e20-aed2c20d11ce",
+    "traj-890c85dd-37ab-4949-b72a-6dbd86b22610",
+    "traj-8ef247e7-fc36-47b4-9c35-7791149a4bc6",
+    "traj-ca8173f4-ba99-4421-a150-3c04e00c1504",
+    "traj-e0823914-aeec-402a-b02c-27ddf95ab381",
+    "traj-f7357e60-cc1d-4ca7-a7b6-b00f30d5263a",
+    "traj-f8c00ccd-24d1-4f0a-a4d4-808d17529466",
+    "traj-f9e58e88-b618-4dc2-9723-09b4cf5db05b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013334",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-547cf777-50f1-4848-b5c6-e79110f6b4fa.json b/docs/training-reports/report-547cf777-50f1-4848-b5c6-e79110f6b4fa.json
new file mode 100644
index 0000000..8e45a77
--- /dev/null
+++ b/docs/training-reports/report-547cf777-50f1-4848-b5c6-e79110f6b4fa.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-547cf777-50f1-4848-b5c6-e79110f6b4fa",
+  "timestamp": "2026-04-14T21:18:36.371846+00:00",
+  "source_trajectory_ids": [
+    "traj-008e3a6c-d014-441e-a368-3a298a94b570",
+    "traj-11992679-5342-4506-8f5e-88df8f2e5b37",
+    "traj-1bd76b8d-7968-4870-adbc-7d6f4d968d4d",
+    "traj-3c86846f-d086-4198-984b-b34a3d9dc32f",
+    "traj-559b9c31-c008-4957-8774-539f3225d67b",
+    "traj-680bba7c-58b7-4b87-be69-3aaeeceb8de6",
+    "traj-99e17c4d-a8bf-4c3a-8b1e-4e16c640c8aa",
+    "traj-cd676023-60fa-49d1-bc8d-8b58a8152df0",
+    "traj-dd628bf0-b6e0-4312-a006-9fcb36cdacd3",
+    "traj-e10d1f31-6253-4508-a4e9-984894d56caf"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-54f51896-f163-490b-a35f-5f7921b9fcae.json b/docs/training-reports/report-54f51896-f163-490b-a35f-5f7921b9fcae.json
new file mode 100644
index 0000000..b6e117d
--- /dev/null
+++ b/docs/training-reports/report-54f51896-f163-490b-a35f-5f7921b9fcae.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-54f51896-f163-490b-a35f-5f7921b9fcae",
+  "timestamp": "2026-04-14T15:52:51.142603+00:00",
+  "source_trajectory_ids": [
+    "traj-2263147a-44b3-4e96-94d2-dea2dc31cc58",
+    "traj-2e08b4bd-1a4d-4ae2-a8cf-355f3812ac0b",
+    "traj-39cf7b85-237a-4676-8d6a-716ba807446a",
+    "traj-4f9ac486-f0ae-49d4-b79e-e5c9370c3def",
+    "traj-6fbd4429-9ab3-4077-b8d0-5ff6a64965ba",
+    "traj-9ad2532a-4a42-4b8f-90bd-0f1d03fc7d34",
+    "traj-ba2fa5f5-95e5-48c8-96c0-b8f00d6a7d36",
+    "traj-c77a8e7a-d1fe-4881-9288-3bb4707e42d0",
+    "traj-dabcee54-f3d9-412d-b392-32badcbce80c",
+    "traj-db70effe-7b6c-491f-93c9-4892e67507c7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-155251"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-58eb6ade-acce-4120-87bc-17252a66f5c1.json b/docs/training-reports/report-58eb6ade-acce-4120-87bc-17252a66f5c1.json
new file mode 100644
index 0000000..c5138ed
--- /dev/null
+++ b/docs/training-reports/report-58eb6ade-acce-4120-87bc-17252a66f5c1.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-58eb6ade-acce-4120-87bc-17252a66f5c1",
+  "timestamp": "2026-04-14T22:05:44.095159+00:00",
+  "source_trajectory_ids": [
+    "traj-04176304-8400-459c-8345-1804b8a1857b",
+    "traj-2ee8e3e5-1274-456d-b60a-4455e6f1d688",
+    "traj-3c740a85-d4bc-45e4-8aa0-6ad8de199cd5",
+    "traj-4a69b684-5185-4792-ba8c-82ecd5ea831c",
+    "traj-5af6d0c4-0d32-4c1b-8003-cbc3b16af816",
+    "traj-7a6aa489-6c95-4c72-a1da-63acc995c51e",
+    "traj-9876e667-7cc5-4f93-a3f9-53ba44b6b343",
+    "traj-bb97b314-3287-4d08-a5be-4cfe60a70959",
+    "traj-be45651c-16e4-46a8-80f6-b53319a18f4c",
+    "traj-e1b4102d-ce4a-448d-9060-1648dae074b6"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5996831f-a897-4eec-8f40-1786539febde.json b/docs/training-reports/report-5996831f-a897-4eec-8f40-1786539febde.json
new file mode 100644
index 0000000..e916ca3
--- /dev/null
+++ b/docs/training-reports/report-5996831f-a897-4eec-8f40-1786539febde.json
@@ -0,0 +1,42 @@
+{
+  "report_id": "report-5996831f-a897-4eec-8f40-1786539febde",
+  "timestamp": "2026-04-14T18:30:24.385886+00:00",
+  "source_trajectory_ids": [
+    "traj-0b57378b-d41c-4cd5-9c8b-cbb6d0cb6c6e",
+    "traj-1b519f5c-8226-45d9-97a9-7057df5389bd",
+    "traj-5062f87c-5bf2-4b62-b72e-6964492f4517",
+    "traj-62dc0190-fa0c-48cd-bd64-7f7d8caaffcb",
+    "traj-9bb3feb1-c576-4e24-8127-6ca03d2a0c08",
+    "traj-ca8b06b6-af43-418a-bff1-fe3323e43a01",
+    "traj-e042d848-7805-42f2-ad1b-8500c6e93136",
+    "traj-e3de23bb-643d-4833-a5c2-fe98182292ce",
+    "traj-ea84f194-39a1-40bf-8a70-592f80ba40e0",
+    "traj-ec288161-f7b1-46df-b7c4-26aa79f5e5c5"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-183024",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5bbea27a-8808-4142-a300-a5ea9d1b8abb.json b/docs/training-reports/report-5bbea27a-8808-4142-a300-a5ea9d1b8abb.json
new file mode 100644
index 0000000..9ced682
--- /dev/null
+++ b/docs/training-reports/report-5bbea27a-8808-4142-a300-a5ea9d1b8abb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-5bbea27a-8808-4142-a300-a5ea9d1b8abb",
+  "timestamp": "2026-04-14T20:07:38.784383+00:00",
+  "source_trajectory_ids": [
+    "traj-24636cff-0df9-464a-aadc-49e7d5bb4b19",
+    "traj-38b267f5-2839-41fb-8600-e3eff1ebc850",
+    "traj-49244af8-1ff7-4381-9bb7-68279aab96a3",
+    "traj-59765e1c-4ebc-42fd-af01-e36389ff3e39",
+    "traj-7c3f75af-6df0-485a-af83-714eb5595dbc",
+    "traj-97a16c7c-4b65-486f-920f-b488ee581a57",
+    "traj-b386fe62-333b-4cc1-be14-473188c861ed",
+    "traj-e07c427f-bdfb-43d8-953d-e3a7e91113c2",
+    "traj-e6458a19-1a8e-4d61-8444-57dfb89649f1",
+    "traj-eea30f77-6daf-4f21-9e90-8be64b29e654"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200738",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5bbfa4bf-2555-42d8-bc71-2dd94bf9a1aa.json b/docs/training-reports/report-5bbfa4bf-2555-42d8-bc71-2dd94bf9a1aa.json
new file mode 100644
index 0000000..0109c2a
--- /dev/null
+++ b/docs/training-reports/report-5bbfa4bf-2555-42d8-bc71-2dd94bf9a1aa.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-5bbfa4bf-2555-42d8-bc71-2dd94bf9a1aa",
+  "timestamp": "2026-04-15T01:57:32.884356+00:00",
+  "source_trajectory_ids": [
+    "traj-1d68febb-363d-405a-8456-10d070a0df6f",
+    "traj-5314e5cf-7c86-4ace-9c2c-34b169b9e455",
+    "traj-5f54288b-b434-4a88-be89-e4115b32c589",
+    "traj-67e64768-970a-4fde-8b7a-1e5c4821d0e9",
+    "traj-76a7c854-ff21-4f1e-9a45-f37d288b69bb",
+    "traj-788a70d9-4154-4526-af64-d3572ee9f3c5",
+    "traj-9472520c-14be-4841-b662-3506fc829022",
+    "traj-bfe0c5b0-1109-471d-b5e8-383f4a6f6219",
+    "traj-c314335d-9b65-4ccd-b6e0-37fff704101a",
+    "traj-cd70da62-bdba-4e86-ab7e-346b2f439fbb"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-015732",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5c909aba-8438-4c52-9ac5-aa82c8552a5c.json b/docs/training-reports/report-5c909aba-8438-4c52-9ac5-aa82c8552a5c.json
new file mode 100644
index 0000000..52ee59b
--- /dev/null
+++ b/docs/training-reports/report-5c909aba-8438-4c52-9ac5-aa82c8552a5c.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-5c909aba-8438-4c52-9ac5-aa82c8552a5c",
+  "timestamp": "2026-04-14T21:21:15.173091+00:00",
+  "source_trajectory_ids": [
+    "traj-0ae4d970-1f54-4fca-acf9-49d9786ed602",
+    "traj-2be44f9a-80d1-4985-9976-a84cdc65ec77",
+    "traj-54917937-5e1d-4736-9fec-011478a063d3",
+    "traj-6c11024c-1199-4136-9cad-ebdb70a93b1d",
+    "traj-74c8f20b-b1fe-4880-8311-32bf22fb2d97",
+    "traj-921d7a96-8875-4c2f-a1be-e4d59c99b5a2",
+    "traj-96bfd297-d95c-4190-bac0-01c314cedd02",
+    "traj-9f46c7fe-20d4-48bc-a430-4dac0a33b36d",
+    "traj-a9f90b8e-b3e3-4c38-b379-5bae85c0017e",
+    "traj-f4214286-6b97-416a-98be-2af5379bc7a1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5ce6674e-f6e1-4622-aba2-79ed9fb07bc9.json b/docs/training-reports/report-5ce6674e-f6e1-4622-aba2-79ed9fb07bc9.json
new file mode 100644
index 0000000..a7cb30a
--- /dev/null
+++ b/docs/training-reports/report-5ce6674e-f6e1-4622-aba2-79ed9fb07bc9.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-5ce6674e-f6e1-4622-aba2-79ed9fb07bc9",
+  "timestamp": "2026-04-15T01:21:53.799961+00:00",
+  "source_trajectory_ids": [
+    "traj-1222b185-244e-46f7-a60d-f7f024819f22",
+    "traj-1d47bcaa-3b89-47a0-9e11-bd5b0cada785",
+    "traj-250b5772-1194-44f7-9c27-1eab15bf6428",
+    "traj-56d2f9d5-9608-4392-886e-0e7693c2ae7b",
+    "traj-befbea50-2481-4558-96f3-a46a4bcae55f",
+    "traj-d834d9a9-bb52-418a-8ac9-7ef9aac0d506",
+    "traj-e551dfaf-20cc-4f16-bc94-693d8ba2355b",
+    "traj-e5763320-3dbd-4f49-80be-3a0126e26e09",
+    "traj-eabda2c0-ccee-4fe5-b023-e7dc80cae38a",
+    "traj-f37c8ecb-eef8-41a2-93cb-7860a8341ae0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5d3ab2d4-362f-48ea-8e43-98794a729853.json b/docs/training-reports/report-5d3ab2d4-362f-48ea-8e43-98794a729853.json
new file mode 100644
index 0000000..41f7846
--- /dev/null
+++ b/docs/training-reports/report-5d3ab2d4-362f-48ea-8e43-98794a729853.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-5d3ab2d4-362f-48ea-8e43-98794a729853",
+  "timestamp": "2026-04-14T20:31:11.240432+00:00",
+  "source_trajectory_ids": [
+    "traj-052c2084-bf8f-45b3-a48b-eaa9b95f5cdc",
+    "traj-063c87a1-bde2-43e3-ace5-e7c29a9ad296",
+    "traj-08ac8afd-1f85-4f80-928a-c014756ad1ad",
+    "traj-254ee33c-82e6-4078-9431-5f421a88eac7",
+    "traj-3bb69895-8479-48dc-8685-971a00d2dd42",
+    "traj-5271851b-e3b0-4e5d-981b-2d20551680b1",
+    "traj-9909611d-1af8-4657-91d2-5bb784c28df7",
+    "traj-d7ec1de8-c98b-4489-9802-6118696a128d",
+    "traj-d8d708d8-2e49-49a9-a701-6c08ec33ce8c",
+    "traj-e3629c94-90cd-4ade-92b6-6b12d088150e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203111",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5f3cc25f-f580-4610-8939-f275b48348aa.json b/docs/training-reports/report-5f3cc25f-f580-4610-8939-f275b48348aa.json
new file mode 100644
index 0000000..0139b50
--- /dev/null
+++ b/docs/training-reports/report-5f3cc25f-f580-4610-8939-f275b48348aa.json
@@ -0,0 +1,42 @@
+{
+  "report_id": "report-5f3cc25f-f580-4610-8939-f275b48348aa",
+  "timestamp": "2026-04-14T18:51:33.102552+00:00",
+  "source_trajectory_ids": [
+    "traj-12af88a3-cd2a-47eb-9f64-cfb6e22e54ce",
+    "traj-567e562b-b673-4216-a97c-e0342f30a392",
+    "traj-68ba5f57-66c1-442a-876b-358a1c3ec8a7",
+    "traj-88df2859-a5f8-41a0-a6b2-b69c7b7534e6",
+    "traj-a9dd0423-655e-4de1-b438-f129b88eacec",
+    "traj-ba9449c4-6593-416c-bb47-3aa2e47c8ec7",
+    "traj-bf6a9a13-af26-4ff1-8ec5-b087bbfde8d1",
+    "traj-c3f4305f-ca99-44e7-a009-35a3e996acd2",
+    "traj-e45c104e-f29b-435d-8eb0-cf7c3da2fb2d",
+    "traj-f71ab09f-bd40-4637-a9ba-999c1e049c47"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185133",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-5f961f5f-f3de-463d-8586-b770aa2e951f.json b/docs/training-reports/report-5f961f5f-f3de-463d-8586-b770aa2e951f.json
new file mode 100644
index 0000000..cbc9f34
--- /dev/null
+++ b/docs/training-reports/report-5f961f5f-f3de-463d-8586-b770aa2e951f.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-5f961f5f-f3de-463d-8586-b770aa2e951f",
+  "timestamp": "2026-04-15T02:31:17.458912+00:00",
+  "source_trajectory_ids": [
+    "traj-157f8c21-719f-4ba8-8243-f22d95e30091",
+    "traj-1daf4a22-f63c-4119-8f98-426a42756b47",
+    "traj-3abf53ed-a0db-4b62-a9ee-b24cb3e74359",
+    "traj-4d732a6b-1a09-4547-91b0-da7c3b72f9c0",
+    "traj-5efe2891-e10a-44f1-8882-fa9f35ef0ff5",
+    "traj-6ea2a8f6-86ce-406d-ad52-54de0aabfe6a",
+    "traj-7653da9a-96a4-4a34-938d-8251fcf02ea4",
+    "traj-8b17c700-e430-4c91-928f-25db8f0d73f2",
+    "traj-c4efa221-68bc-4d01-8bf8-eed2268f0c7e",
+    "traj-ff6a9712-5857-48ce-96f4-b744d655bd18"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6579e004-98ff-4fd0-b710-2116b47bbe9e.json b/docs/training-reports/report-6579e004-98ff-4fd0-b710-2116b47bbe9e.json
new file mode 100644
index 0000000..84c7a57
--- /dev/null
+++ b/docs/training-reports/report-6579e004-98ff-4fd0-b710-2116b47bbe9e.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-6579e004-98ff-4fd0-b710-2116b47bbe9e",
+  "timestamp": "2026-04-15T01:25:33.889628+00:00",
+  "source_trajectory_ids": [
+    "traj-1a1d2d0d-9f79-4edf-805b-d8b4bbd920f4",
+    "traj-28a1f4a0-7446-4c16-886a-99013507cf20",
+    "traj-3501ab39-8337-47f7-9ae7-7700c41d160f",
+    "traj-96eff406-cf30-49aa-9ee8-e278a1a1789c",
+    "traj-a63dbc77-e3e2-4db7-935e-22a088879ff7",
+    "traj-aefe0c29-8f97-46d5-8e33-23bc714c7151",
+    "traj-b10077c3-22ca-46ba-9a3d-70c3474f2449",
+    "traj-c7ee4596-ca05-40b3-8e24-c0b6f95fb0b8",
+    "traj-d0429a8b-3e38-4316-9f61-09b5847908e9",
+    "traj-d128fdeb-ec2a-44db-b1fd-f49f63c33536"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-657d2b18-44a2-42a7-821f-5697611b403f.json b/docs/training-reports/report-657d2b18-44a2-42a7-821f-5697611b403f.json
new file mode 100644
index 0000000..5cd7820
--- /dev/null
+++ b/docs/training-reports/report-657d2b18-44a2-42a7-821f-5697611b403f.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-657d2b18-44a2-42a7-821f-5697611b403f",
+  "timestamp": "2026-04-14T20:54:35.829616+00:00",
+  "source_trajectory_ids": [
+    "traj-29e7080a-f38d-4eff-8885-01625f706940",
+    "traj-4406b1af-13c0-4edb-937e-70a4ad05e9c4",
+    "traj-74fc5cb4-bcf2-4400-adb4-10f29ad17c13",
+    "traj-86c672fc-af0b-47c3-9e29-b0ebadab6c3c",
+    "traj-aef7759d-0c2e-44aa-af65-f40bd4a217ce",
+    "traj-d087f258-1c76-47e0-b532-7effcece43eb",
+    "traj-d8f2b0a7-2a9c-4354-b4f1-4fd3de378881",
+    "traj-e392e827-019b-4679-88aa-b0ca9b9e1a37",
+    "traj-eddec9a7-94c9-4787-8826-bbaf84c0be35",
+    "traj-effafeb3-8762-41b6-8594-c1267172b81a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205435",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-66aa10dc-0527-4459-bb2e-b5f4da1272c1.json b/docs/training-reports/report-66aa10dc-0527-4459-bb2e-b5f4da1272c1.json
new file mode 100644
index 0000000..df424cd
--- /dev/null
+++ b/docs/training-reports/report-66aa10dc-0527-4459-bb2e-b5f4da1272c1.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-66aa10dc-0527-4459-bb2e-b5f4da1272c1",
+  "timestamp": "2026-04-14T19:21:09.831626+00:00",
+  "source_trajectory_ids": [
+    "traj-0be01236-de1b-4a17-8d26-ada21cca007a",
+    "traj-13ae15bb-9bcf-46d7-a8f7-318880c697b2",
+    "traj-488630e3-898b-4668-9f93-8e3f388bb3c0",
+    "traj-4e3fc1d9-87f7-4acd-8d9e-8f8caa259c16",
+    "traj-594da9b5-d065-43c6-8581-e1c325917c5c",
+    "traj-5cd271b2-b721-4cb3-9d73-35f493d076e7",
+    "traj-61257688-a738-4164-aeae-6b8ca5b4ccfc",
+    "traj-7a425e1b-ad77-42eb-bc99-e83d37f0d13c",
+    "traj-94d587fa-4217-4d91-8119-15c33648ed5b",
+    "traj-e9bbac88-b876-432a-8208-b45250af8e42"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-192109",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-66beff81-2ce9-4bf5-9d18-6518f371347c.json b/docs/training-reports/report-66beff81-2ce9-4bf5-9d18-6518f371347c.json
new file mode 100644
index 0000000..87504b5
--- /dev/null
+++ b/docs/training-reports/report-66beff81-2ce9-4bf5-9d18-6518f371347c.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-66beff81-2ce9-4bf5-9d18-6518f371347c",
+  "timestamp": "2026-04-14T16:54:50.517635+00:00",
+  "source_trajectory_ids": [
+    "traj-20e81f4f-5c98-4f41-b892-81be69efa36c",
+    "traj-21e3d99b-04f5-4402-9733-f8beaaf8c044",
+    "traj-54727683-a607-494e-9944-a51bc38d5d22",
+    "traj-62e7f1b3-ca1f-439e-90ba-081637093125",
+    "traj-9a7b407c-622b-4adf-9143-513bd58d480e",
+    "traj-a9ebbf52-9235-4227-8d40-f40cd19b5231",
+    "traj-b3b68970-229e-40e4-a98c-11a909ba9f7d",
+    "traj-b786daaf-4ee7-4d79-b09e-0aec9a2b16a3",
+    "traj-b9b0d092-8782-4f66-beef-4d0995a32fa0",
+    "traj-c153d7b1-4379-439d-b808-9378fe9f054e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165450"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-679f7a61-bd73-423f-81c8-9836584ce96f.json b/docs/training-reports/report-679f7a61-bd73-423f-81c8-9836584ce96f.json
new file mode 100644
index 0000000..fc85e23
--- /dev/null
+++ b/docs/training-reports/report-679f7a61-bd73-423f-81c8-9836584ce96f.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-679f7a61-bd73-423f-81c8-9836584ce96f",
+  "timestamp": "2026-04-14T15:01:27.746539+00:00",
+  "source_trajectory_ids": [
+    "traj-008e354d-cf3f-4838-885d-887add59b833",
+    "traj-0408386b-4086-4e8b-aab0-34092888ab20",
+    "traj-41487090-aab2-4c39-bb32-3f08b34253c3",
+    "traj-49d76619-87bd-4968-ab72-fe2cc4fd687d",
+    "traj-4ba6da5f-0faa-4792-ab0d-6c44ec286fc3",
+    "traj-4f8edddf-0ee2-49ac-8d3a-292447189a95",
+    "traj-6155b78f-ccdd-4141-b0ce-d2d9f3553550",
+    "traj-6ac43687-13a2-4987-94cf-c7032bc6a8d2",
+    "traj-7b19cd3d-55e0-4da2-a517-96051a53e63c",
+    "traj-f2ad8bea-5289-443c-b99d-6e81a5b3f33a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-68d38c2f-d8dc-4b33-8514-8fa9410b0961.json b/docs/training-reports/report-68d38c2f-d8dc-4b33-8514-8fa9410b0961.json
new file mode 100644
index 0000000..2fbfde1
--- /dev/null
+++ b/docs/training-reports/report-68d38c2f-d8dc-4b33-8514-8fa9410b0961.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-68d38c2f-d8dc-4b33-8514-8fa9410b0961",
+  "timestamp": "2026-04-14T20:57:28.152811+00:00",
+  "source_trajectory_ids": [
+    "traj-1c05d834-9c1d-4a43-ad8b-8c6c8087256a",
+    "traj-386bdc31-2972-4831-9c43-01a2aae8d9f4",
+    "traj-730465f5-2014-4bea-b491-4bb8fe7c64a4",
+    "traj-79375497-a0ac-45dd-ae4d-b09bff4be7ed",
+    "traj-871a56b9-b8bc-40bf-9c2c-9b6358ab53cc",
+    "traj-ac621263-b868-4692-ab37-7a86bff5a35d",
+    "traj-b077ff51-61cb-4947-a8b8-0f5f7c5ebea7",
+    "traj-b342a862-14e1-4635-b4b5-d8e36d10d29b",
+    "traj-ce5c7f40-792e-43f1-b727-31fd4fdb05e8",
+    "traj-f9deff53-f6e6-4bdb-a58e-16c927c90d80"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-690aa1c7-7761-49c2-b0a3-9bddfb75dc47.json b/docs/training-reports/report-690aa1c7-7761-49c2-b0a3-9bddfb75dc47.json
new file mode 100644
index 0000000..1d4731c
--- /dev/null
+++ b/docs/training-reports/report-690aa1c7-7761-49c2-b0a3-9bddfb75dc47.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-690aa1c7-7761-49c2-b0a3-9bddfb75dc47",
+  "timestamp": "2026-04-14T21:22:14.790781+00:00",
+  "source_trajectory_ids": [
+    "traj-0558d6b6-9b4f-4580-a6fa-534dac75f3c4",
+    "traj-1da973fc-88c3-4ea1-9574-3b7a1558d315",
+    "traj-2adae2da-c0c1-4cff-ba3d-c16500853ca6",
+    "traj-41302519-e96f-4a36-bd95-982f8690569e",
+    "traj-5c77ffb7-aee1-4b79-8e88-7d9116c70629",
+    "traj-77ad9e4d-6265-41e9-8af3-9cef86345cf9",
+    "traj-aa6c3083-3d8d-4b8f-93eb-674346ef5005",
+    "traj-aec8383e-152a-46c8-83ec-0fc181a93516",
+    "traj-c4130706-d699-4990-805a-a8a90d5b7b8d",
+    "traj-f0a1f252-5a21-47c6-bfdd-3ff5026b1fec"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6bf4158b-c9d8-4eab-ba70-6eb707d0bca1.json b/docs/training-reports/report-6bf4158b-c9d8-4eab-ba70-6eb707d0bca1.json
new file mode 100644
index 0000000..178e936
--- /dev/null
+++ b/docs/training-reports/report-6bf4158b-c9d8-4eab-ba70-6eb707d0bca1.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6bf4158b-c9d8-4eab-ba70-6eb707d0bca1",
+  "timestamp": "2026-04-14T15:29:36.013045+00:00",
+  "source_trajectory_ids": [
+    "traj-1d69c6a6-e7d9-446a-911e-96fa227a3c16",
+    "traj-1dcf8869-c1d1-4fb0-93f0-6fc42870ef9d",
+    "traj-3f9baf44-61f6-4428-9e9f-8c3f007e522d",
+    "traj-5018de53-b5b9-4e3e-b160-00c07286de44",
+    "traj-5dbdcc78-ec34-4272-99b1-bd91c940a8b7",
+    "traj-a2a2f17f-a173-47b5-859e-bc7774081f6a",
+    "traj-adf886a2-770f-49bf-b3e2-c6f43d7f138e",
+    "traj-b2621d11-6b94-45b9-93fb-a777e0d900dc",
+    "traj-e5f11c2b-56d6-4b1d-b827-e2542af15171",
+    "traj-f152c0cc-5cc8-4ba1-836f-57bd6d8afe05"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6ea1be0d-d5b7-40f4-8b56-2519612367c7.json b/docs/training-reports/report-6ea1be0d-d5b7-40f4-8b56-2519612367c7.json
new file mode 100644
index 0000000..3130e1c
--- /dev/null
+++ b/docs/training-reports/report-6ea1be0d-d5b7-40f4-8b56-2519612367c7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6ea1be0d-d5b7-40f4-8b56-2519612367c7",
+  "timestamp": "2026-04-14T15:29:42.041754+00:00",
+  "source_trajectory_ids": [
+    "traj-0ab04ba8-0292-4c2c-bfa4-8a13a848da6c",
+    "traj-2db80c63-593c-4d37-b90e-582192c5ba19",
+    "traj-2f821626-4bb5-4af3-a24d-58c8e188e23e",
+    "traj-68f4d4af-0e40-48d4-b031-0260bc21effe",
+    "traj-6b2eb327-63dc-49f6-9e97-9393bce968ab",
+    "traj-b0ad6eaa-82c7-4abe-b854-b48732b81eaa",
+    "traj-b544cece-67a9-4920-8c5f-cacf4088ac44",
+    "traj-c035a938-9de2-4346-8407-6b47d0f38597",
+    "traj-d2fb2aca-ae01-4c1b-923f-5f1fe9d8e92b",
+    "traj-f7dbd739-5c42-4ba5-bca2-b4f390c0c277"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6ecc1806-5caf-4def-b8e7-64f9b44be9fc.json b/docs/training-reports/report-6ecc1806-5caf-4def-b8e7-64f9b44be9fc.json
new file mode 100644
index 0000000..7106dea
--- /dev/null
+++ b/docs/training-reports/report-6ecc1806-5caf-4def-b8e7-64f9b44be9fc.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6ecc1806-5caf-4def-b8e7-64f9b44be9fc",
+  "timestamp": "2026-04-14T20:58:05.406666+00:00",
+  "source_trajectory_ids": [
+    "traj-14f93591-742c-468d-b37b-93e19bfcc79f",
+    "traj-4a9b5ebf-5c1f-442a-a0e4-8f14a4e9f678",
+    "traj-4df1b668-4114-499a-ac9e-fddd57e0b1c8",
+    "traj-63d5932c-183a-4fb0-95d8-18523c4c486d",
+    "traj-64f1feba-60b7-40e3-aea8-0d1a1c0defbb",
+    "traj-687aee87-9150-442b-aed7-45b98399cd02",
+    "traj-69e4c95d-6d60-4466-a582-d0741d22f04d",
+    "traj-88339f4c-3c81-4137-b233-eea51c1a6b8e",
+    "traj-8b107137-35d2-42f3-a5b8-14e0eaa7aa30",
+    "traj-92c65d68-f839-49ac-8fb7-7a29f9af19f0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205805",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6ecec7b8-074a-46ef-8b29-3b683738074e.json b/docs/training-reports/report-6ecec7b8-074a-46ef-8b29-3b683738074e.json
new file mode 100644
index 0000000..8877268
--- /dev/null
+++ b/docs/training-reports/report-6ecec7b8-074a-46ef-8b29-3b683738074e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6ecec7b8-074a-46ef-8b29-3b683738074e",
+  "timestamp": "2026-04-14T17:15:16.862902+00:00",
+  "source_trajectory_ids": [
+    "traj-082da722-82fa-460c-b98f-22fad1f364d8",
+    "traj-09e7688c-92a9-4910-a0f7-92a0c35eb252",
+    "traj-1118a441-97ff-4431-8427-c0d5535751a0",
+    "traj-1ed861c0-1fa5-4052-9dc6-d26c9d738de6",
+    "traj-65d5dc7d-e9d3-48bf-a3ae-34e55c870195",
+    "traj-8f0f2c25-2565-4144-889c-7a34adb30d45",
+    "traj-a04080e7-f1fa-4047-9f58-0cd5231e5e36",
+    "traj-b69e9885-37d2-4c22-9ac1-b0f7b60b06a9",
+    "traj-bb6f76bf-a5d5-4cfa-8be9-1d54c21e8591",
+    "traj-dae57c63-fbd6-4a87-bd9e-4ac58a662543"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6f6c29a6-873b-478d-8847-24480f44f8a6.json b/docs/training-reports/report-6f6c29a6-873b-478d-8847-24480f44f8a6.json
new file mode 100644
index 0000000..11c4ced
--- /dev/null
+++ b/docs/training-reports/report-6f6c29a6-873b-478d-8847-24480f44f8a6.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6f6c29a6-873b-478d-8847-24480f44f8a6",
+  "timestamp": "2026-04-14T19:19:01.987820+00:00",
+  "source_trajectory_ids": [
+    "traj-0bd48b9e-1baf-4ced-ad03-1a98ce4dbc80",
+    "traj-1edfc2ce-d806-41cf-87b1-53c49a1c8f44",
+    "traj-3de2ffc5-5a19-40a5-a1ab-3b9c0d01823f",
+    "traj-4d76447d-0c25-424d-9863-0d47c54d7736",
+    "traj-7b87b3fb-9469-4cfc-896d-615c860a1e87",
+    "traj-888a4065-b13c-4214-8259-40e12210bc35",
+    "traj-9703ae83-7f2e-4f15-bcc9-03979412bcd6",
+    "traj-9b5fa2de-1d78-4f95-9e08-f00a367871ec",
+    "traj-9f10367b-dc6a-4da0-90e7-f2b5e69665ed",
+    "traj-fda90a71-144b-4109-919f-68ed99b1dcfe"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-191901",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-6ff5d88c-42b7-4513-af9b-da0d1cfb5d0e.json b/docs/training-reports/report-6ff5d88c-42b7-4513-af9b-da0d1cfb5d0e.json
new file mode 100644
index 0000000..18a6436
--- /dev/null
+++ b/docs/training-reports/report-6ff5d88c-42b7-4513-af9b-da0d1cfb5d0e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-6ff5d88c-42b7-4513-af9b-da0d1cfb5d0e",
+  "timestamp": "2026-04-14T22:08:19.799797+00:00",
+  "source_trajectory_ids": [
+    "traj-00579592-c3af-41d1-9f1f-7ef863150303",
+    "traj-486e0adb-b3a7-424b-9803-4adef3754d11",
+    "traj-4e60a033-8d1d-4d40-97ff-285f1aadb44d",
+    "traj-56e0b16c-a542-4212-ab69-d60ff62a8579",
+    "traj-68706353-1cf2-451f-909a-ed43ee8d922a",
+    "traj-789d830c-1db1-4671-a4c1-060c056b9da7",
+    "traj-7c4c4cb3-aae7-48c7-ae5e-8bd6263f8e30",
+    "traj-a822f692-f0a2-433a-ab51-96870b509a22",
+    "traj-e298e14d-2a48-405e-a530-90b77be9905a",
+    "traj-feb5e08d-5b97-436e-87e4-ce2d2de84414"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220819",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7078b6d8-f026-4c36-89d9-12d02f651dd3.json b/docs/training-reports/report-7078b6d8-f026-4c36-89d9-12d02f651dd3.json
new file mode 100644
index 0000000..6ac60f8
--- /dev/null
+++ b/docs/training-reports/report-7078b6d8-f026-4c36-89d9-12d02f651dd3.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-7078b6d8-f026-4c36-89d9-12d02f651dd3",
+  "timestamp": "2026-04-14T22:09:38.785450+00:00",
+  "source_trajectory_ids": [
+    "traj-031720a7-fea3-4e49-b135-b3b5591307d1",
+    "traj-427426c8-07fb-49b7-b19a-e59471bb5b03",
+    "traj-64093076-b764-4a6a-9eea-157054480539",
+    "traj-8cfbfc9c-4d1c-42f1-9361-38a8db2af9ec",
+    "traj-9baadeac-6b04-46b5-b132-e963061732b6",
+    "traj-acdc31dd-34e4-4016-906e-77b711e81c66",
+    "traj-b09d085d-c4c4-4df3-89a1-3980140aa96e",
+    "traj-c31478e4-761a-43e0-b895-6a33b7eebb40",
+    "traj-d1bccd73-8fb2-4ced-a684-773fd49ee78e",
+    "traj-fb63c834-a6f7-495a-b9ca-a232678a981f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220938",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-716d3658-e56e-475c-b87d-1904b15184e8.json b/docs/training-reports/report-716d3658-e56e-475c-b87d-1904b15184e8.json
new file mode 100644
index 0000000..ac103a6
--- /dev/null
+++ b/docs/training-reports/report-716d3658-e56e-475c-b87d-1904b15184e8.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-716d3658-e56e-475c-b87d-1904b15184e8",
+  "timestamp": "2026-04-14T16:53:16.879667+00:00",
+  "source_trajectory_ids": [
+    "traj-3fb70daf-c014-498c-8066-490d02db130b",
+    "traj-6954f688-1c63-4cc2-ae3c-1082dfc4c74a",
+    "traj-6cb082e1-f2af-495d-b463-43e725544f22",
+    "traj-73af4833-54c5-43b8-8c16-0e3bc4598b6f",
+    "traj-89b55a45-581f-4bbf-854a-9eff339e8b7f",
+    "traj-8e73d10b-a90a-41fc-be34-f348129d0c9a",
+    "traj-d3aba8b1-b712-44db-8f32-bc002a83e482",
+    "traj-e537400c-4cab-4ac1-9dc2-887f70ba9e19",
+    "traj-edc9669c-9425-4802-9d91-93ec63388e0a",
+    "traj-f006c107-3802-4eb1-9824-3b19415b0cc9"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165316"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-71d18e5e-93f1-4973-b205-5e21ae4cf132.json b/docs/training-reports/report-71d18e5e-93f1-4973-b205-5e21ae4cf132.json
new file mode 100644
index 0000000..313da04
--- /dev/null
+++ b/docs/training-reports/report-71d18e5e-93f1-4973-b205-5e21ae4cf132.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-71d18e5e-93f1-4973-b205-5e21ae4cf132",
+  "timestamp": "2026-04-14T21:44:48.185492+00:00",
+  "source_trajectory_ids": [
+    "traj-07cc6d5b-8d85-4ac6-9d4b-4bfc129ab4e5",
+    "traj-35e4dc69-fbcb-4378-88e1-02f93fe78a31",
+    "traj-422c85ac-d2e1-4587-b3d1-c87881007757",
+    "traj-6f5c6331-735e-49ee-b402-062b3e711058",
+    "traj-8555bf2a-03c1-4d66-bd82-6edd583564f7",
+    "traj-98976a8d-1e1e-4e15-887c-567487a5a8bb",
+    "traj-b5331542-8962-4f0e-8be7-bda2dbe60fb0",
+    "traj-d63b7563-2553-4440-b637-72a225eef4e0",
+    "traj-d8334f2e-5b4c-47ab-bba9-49a3fa2f4b27",
+    "traj-ddc27f2e-73ed-4fbe-ab56-5c612dcebfe3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214448",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-71df18e0-e98e-4663-9279-46bd619196be.json b/docs/training-reports/report-71df18e0-e98e-4663-9279-46bd619196be.json
new file mode 100644
index 0000000..03927f9
--- /dev/null
+++ b/docs/training-reports/report-71df18e0-e98e-4663-9279-46bd619196be.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-71df18e0-e98e-4663-9279-46bd619196be",
+  "timestamp": "2026-04-15T01:36:36.536819+00:00",
+  "source_trajectory_ids": [
+    "traj-155f162e-c6eb-4b33-82f8-ddab0dd2d63f",
+    "traj-15d916ea-8e00-4301-afd8-69c8371ca19d",
+    "traj-45d60fba-bb81-4fe8-b1a2-fc39ee4e8175",
+    "traj-45e653cb-f3f1-4e4a-bf5a-879a1d45c0f6",
+    "traj-ac56a61d-3b28-4fc5-89fe-1585d7a095df",
+    "traj-b95675d2-2eef-4563-b00f-0ab16366164d",
+    "traj-e23436a3-9fbf-450a-9c68-d5b9ce50b6ec",
+    "traj-f180b809-b918-4227-9b34-1c4f6c9448e2",
+    "traj-f1c411ee-fd4a-4e25-8fc9-ec36dae50b69",
+    "traj-f8fb5253-f07a-44de-a8c4-87123e70f689"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013636",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-72b168b7-0904-4df5-8c0e-40e83d404554.json b/docs/training-reports/report-72b168b7-0904-4df5-8c0e-40e83d404554.json
new file mode 100644
index 0000000..bf4d90b
--- /dev/null
+++ b/docs/training-reports/report-72b168b7-0904-4df5-8c0e-40e83d404554.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-72b168b7-0904-4df5-8c0e-40e83d404554",
+  "timestamp": "2026-04-15T01:57:32.711488+00:00",
+  "source_trajectory_ids": [
+    "traj-10e25201-4b4b-4019-90b1-83d1680be21e",
+    "traj-32e4d84c-a667-4a47-9298-0125e963c079",
+    "traj-52faa2ef-b9ee-4ecb-81bd-ade6c2469dd3",
+    "traj-57abbb97-de7e-454e-b71b-5903aafc818d",
+    "traj-7f3fe1da-8dbf-456d-9d56-e6a4685a4072",
+    "traj-8bf39705-498f-4531-80d3-484748a4679f",
+    "traj-ade910a9-6578-4a87-89ea-2fff0db2f89c",
+    "traj-cc534e68-0d81-4ebd-8f2c-31740d8015a6",
+    "traj-d05c409b-7536-4bf7-b012-7677cfed5906",
+    "traj-f40e3dd4-496a-47bc-8671-befc06b8f071"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-015732",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-73683712-e416-4693-9f7c-85f78bc5c34b.json b/docs/training-reports/report-73683712-e416-4693-9f7c-85f78bc5c34b.json
new file mode 100644
index 0000000..f281ac6
--- /dev/null
+++ b/docs/training-reports/report-73683712-e416-4693-9f7c-85f78bc5c34b.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-73683712-e416-4693-9f7c-85f78bc5c34b",
+  "timestamp": "2026-04-14T22:08:57.531386+00:00",
+  "source_trajectory_ids": [
+    "traj-0bea8430-0d7a-458c-8389-103a79e81a6f",
+    "traj-240d14c8-0d5e-4e90-9752-eab26315ea26",
+    "traj-40dc0cbc-37c3-48d5-94fa-85153d444a20",
+    "traj-5acb9c23-702b-4be4-ac29-a6052782260b",
+    "traj-5b5167db-b303-48bb-b78f-14f8c140633c",
+    "traj-6e2a8741-7e18-48e1-aa1d-d60d7fcc143f",
+    "traj-aa1c75b4-20aa-4d03-a45c-30a0850eabe2",
+    "traj-b1b01b2a-dd5f-4203-bf99-0aecbbced7c5",
+    "traj-dff09def-3106-44ad-a2fa-41cb9a5e9778",
+    "traj-e21ef421-3e49-4f23-843b-7c2cf17ae89d"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220857",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7576a115-0b27-46f0-b9a8-51b9a18a0fe6.json b/docs/training-reports/report-7576a115-0b27-46f0-b9a8-51b9a18a0fe6.json
new file mode 100644
index 0000000..742021f
--- /dev/null
+++ b/docs/training-reports/report-7576a115-0b27-46f0-b9a8-51b9a18a0fe6.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-7576a115-0b27-46f0-b9a8-51b9a18a0fe6",
+  "timestamp": "2026-04-14T18:06:25.465766+00:00",
+  "source_trajectory_ids": [
+    "traj-08a6168f-f0b7-42bb-aef8-ac2215c45cad",
+    "traj-0df55922-a115-41b3-a721-e861afe1fb5b",
+    "traj-3209ba26-ea03-4788-8bd7-9b5fb6aaf830",
+    "traj-3845105a-b0fb-4a28-8322-002af1fdc5f3",
+    "traj-5b1ca6c3-3cbc-4c40-91e4-918988937861",
+    "traj-8f00713c-623b-430f-a3f3-bb0929291c72",
+    "traj-b0d7dae3-3d72-43ce-9c86-f9407b257b75",
+    "traj-b470b811-25ca-4c7c-b217-bd92d415c196",
+    "traj-dafe6056-749b-4aae-9984-e640634ac83a",
+    "traj-fda175cf-e74b-4e4b-ac9a-99a3189ff6cf"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180625"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-75df8940-51be-4562-afe7-0f1f374219b7.json b/docs/training-reports/report-75df8940-51be-4562-afe7-0f1f374219b7.json
new file mode 100644
index 0000000..baba9a6
--- /dev/null
+++ b/docs/training-reports/report-75df8940-51be-4562-afe7-0f1f374219b7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-75df8940-51be-4562-afe7-0f1f374219b7",
+  "timestamp": "2026-04-14T20:57:28.014592+00:00",
+  "source_trajectory_ids": [
+    "traj-27837c4d-d551-459c-a927-b0e32bec2891",
+    "traj-66a29122-5fbd-4e9b-a175-53382743ae4f",
+    "traj-7803135e-0e45-4ee8-a903-3920eda92711",
+    "traj-782f247f-39c1-464f-a5d6-56e7f9b444a7",
+    "traj-8a851754-1403-488d-a3e2-b3a2765ded34",
+    "traj-8cba8586-5cad-4c07-bbbb-f6c8f93a8895",
+    "traj-a6632a98-e33a-477d-b148-af244c6386bf",
+    "traj-cc07e16e-8ff7-4c82-b234-77d8335242d8",
+    "traj-e7327928-4089-49bb-9060-3828e65f83fe",
+    "traj-f49e6b3e-6ad8-459e-ac2f-9bc3a3297517"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205728",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-76faf179-ae5b-489f-ba05-ba89f22aa9ca.json b/docs/training-reports/report-76faf179-ae5b-489f-ba05-ba89f22aa9ca.json
new file mode 100644
index 0000000..5ef012f
--- /dev/null
+++ b/docs/training-reports/report-76faf179-ae5b-489f-ba05-ba89f22aa9ca.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-76faf179-ae5b-489f-ba05-ba89f22aa9ca",
+  "timestamp": "2026-04-14T19:19:02.003460+00:00",
+  "source_trajectory_ids": [
+    "traj-06cce6e6-05e6-4a4a-8254-8b7d21a43e58",
+    "traj-38840dd8-8d74-4d1e-95ab-74b45736838d",
+    "traj-432599a4-b898-4b86-9025-ecd601192efa",
+    "traj-a7dfcce9-8068-4b98-972d-7cb2975d6224",
+    "traj-b9b42927-828b-40d9-806b-8cd1b211ae9d",
+    "traj-bf1b9759-81c5-403a-8548-e1d674ed10de",
+    "traj-d16ffb2b-0768-49fa-8f19-9d7b57f5af08",
+    "traj-d9afd41e-580e-4efc-82ee-e2f525f39b59",
+    "traj-e57518db-770e-4dcc-9507-04bf043c240f",
+    "traj-fd97119b-4e71-457d-8fad-d00d7c291454"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-191902",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-774e00aa-847e-4329-b4e4-6745a7510deb.json b/docs/training-reports/report-774e00aa-847e-4329-b4e4-6745a7510deb.json
new file mode 100644
index 0000000..4d38320
--- /dev/null
+++ b/docs/training-reports/report-774e00aa-847e-4329-b4e4-6745a7510deb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-774e00aa-847e-4329-b4e4-6745a7510deb",
+  "timestamp": "2026-04-14T18:28:06.116516+00:00",
+  "source_trajectory_ids": [
+    "traj-0680f16c-a3e2-43d9-92bc-413943c87fb2",
+    "traj-11dd5532-5b59-4562-aba9-ec1e3fb314c9",
+    "traj-45cddb43-6197-4bfc-88b5-94e0b896054f",
+    "traj-4de4cc12-7e2d-45cb-a523-1141dbeb15e2",
+    "traj-72bc5ccc-1dfd-4d8f-a5be-4198aacad1dc",
+    "traj-95c5c1ae-ef96-40b5-8f0b-a9e156d3657a",
+    "traj-976dcb41-a7c5-473a-b907-753fe7c6721a",
+    "traj-c8100fdb-052f-401e-b04f-b99a62d84289",
+    "traj-c9ba219d-f53d-4292-a2e5-f91378600a09",
+    "traj-e7644f00-be04-4d15-9489-7b3668004f36"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-77d96a9c-b260-4ae8-b3e8-444604bf75f8.json b/docs/training-reports/report-77d96a9c-b260-4ae8-b3e8-444604bf75f8.json
new file mode 100644
index 0000000..35131da
--- /dev/null
+++ b/docs/training-reports/report-77d96a9c-b260-4ae8-b3e8-444604bf75f8.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-77d96a9c-b260-4ae8-b3e8-444604bf75f8",
+  "timestamp": "2026-04-14T16:53:16.938055+00:00",
+  "source_trajectory_ids": [
+    "traj-0f512c57-990c-4129-bfdd-e2cf71cee0cd",
+    "traj-3d5e314b-b7bd-42ec-9f3f-546c0f1556fc",
+    "traj-6846c005-3a1d-4dd5-86d0-4fd770d0befb",
+    "traj-68ce854e-0101-4ddd-9a3f-7ebf3e2a328a",
+    "traj-6d63a4e3-9db1-423b-b014-724509fb2efd",
+    "traj-8058f54c-1268-4908-9234-ff9fa6b6f886",
+    "traj-ccab552c-e82c-459a-95c2-f777bcb821be",
+    "traj-ce59c4db-28ec-4aa3-ac23-00ecc5046e11",
+    "traj-e93a74a3-f222-436f-bd4c-31001be3add3",
+    "traj-fad54386-752b-4be8-a770-72c86b3a0301"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-78c72165-f016-49da-bd17-edcaf3d0fe93.json b/docs/training-reports/report-78c72165-f016-49da-bd17-edcaf3d0fe93.json
new file mode 100644
index 0000000..6c62ea8
--- /dev/null
+++ b/docs/training-reports/report-78c72165-f016-49da-bd17-edcaf3d0fe93.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-78c72165-f016-49da-bd17-edcaf3d0fe93",
+  "timestamp": "2026-04-14T22:09:38.807229+00:00",
+  "source_trajectory_ids": [
+    "traj-1bebede9-ff62-4edf-88f7-4b7c8c01db7e",
+    "traj-22856fc1-cd85-43a0-8df6-ce9493104a32",
+    "traj-369510fa-53f7-42ba-afec-ddacb87240cd",
+    "traj-4e5fa0e0-da1a-49a1-bcc2-553f2414285c",
+    "traj-503891d2-34f4-4ae2-b7b9-c72426249514",
+    "traj-561ee822-6e49-4800-b88a-122054d153b8",
+    "traj-87b9f160-8f3e-4fd8-b096-420d56a6701a",
+    "traj-9d6f08f3-7494-4235-a955-1b692ed56477",
+    "traj-d082d1a8-aebc-402a-b8cc-b5d4ea291ccf",
+    "traj-f3e5cf58-8a67-47a9-a3c3-166628b75865"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220938",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-78cd7466-8da2-476d-a066-ad24d259993b.json b/docs/training-reports/report-78cd7466-8da2-476d-a066-ad24d259993b.json
new file mode 100644
index 0000000..b4cf881
--- /dev/null
+++ b/docs/training-reports/report-78cd7466-8da2-476d-a066-ad24d259993b.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-78cd7466-8da2-476d-a066-ad24d259993b",
+  "timestamp": "2026-04-14T20:02:20.878572+00:00",
+  "source_trajectory_ids": [
+    "traj-06e3f022-7237-4f78-a932-446c23355912",
+    "traj-14d729fc-5e4e-4ddf-be4d-1cdf0925404e",
+    "traj-50af15e1-95d7-4712-b1ea-e4451b1f4f0e",
+    "traj-6265a728-1cda-4ce1-a43e-d4dbe5c11617",
+    "traj-7a4a0e3d-cbef-4eb1-a5af-d18a7f25fbcb",
+    "traj-8ffc2bc7-5b72-4bc7-b1b1-f534e08d2175",
+    "traj-b11030bf-9f24-4d96-8f73-f3b2ff2ad84a",
+    "traj-c830bc23-3c04-4440-b45f-ef27207f95d8",
+    "traj-cc66bddc-541e-46b4-b8c6-6139e4a399ca",
+    "traj-fe47844e-144c-4b1b-9e0f-79377dff71f0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-79ed3a93-77a0-43f0-8292-2bcf92efece9.json b/docs/training-reports/report-79ed3a93-77a0-43f0-8292-2bcf92efece9.json
new file mode 100644
index 0000000..d68565f
--- /dev/null
+++ b/docs/training-reports/report-79ed3a93-77a0-43f0-8292-2bcf92efece9.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-79ed3a93-77a0-43f0-8292-2bcf92efece9",
+  "timestamp": "2026-04-14T22:08:57.523349+00:00",
+  "source_trajectory_ids": [
+    "traj-0ea806a1-936e-4175-a018-12f69a729e23",
+    "traj-0eaa5a43-8655-49ec-9c6d-622f479c6835",
+    "traj-148638fa-2385-407e-9916-e7ddeddc1c27",
+    "traj-1c46e94e-5b91-467b-8b4e-1d57480cc745",
+    "traj-20925b8c-f507-4085-baa2-ab417bc7d9c2",
+    "traj-59d8245c-88c5-4b47-9d4d-0f9560fd18e4",
+    "traj-90e42a4c-df1f-4586-9bbb-ce430321fe6f",
+    "traj-a7604864-dc57-4681-b654-73f9253ad476",
+    "traj-af7b7001-38e6-4067-aa03-46f0466d9a9a",
+    "traj-b049ab69-573a-462f-bc7b-0c5867825251"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7b49086a-603a-475c-b68b-f1f91dadd5f7.json b/docs/training-reports/report-7b49086a-603a-475c-b68b-f1f91dadd5f7.json
new file mode 100644
index 0000000..5af77dd
--- /dev/null
+++ b/docs/training-reports/report-7b49086a-603a-475c-b68b-f1f91dadd5f7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-7b49086a-603a-475c-b68b-f1f91dadd5f7",
+  "timestamp": "2026-04-14T20:04:58.773686+00:00",
+  "source_trajectory_ids": [
+    "traj-02be2bfc-7a5c-4a78-a3bd-46c947889687",
+    "traj-40f61062-4dfc-485d-93dd-a0f5cff744f1",
+    "traj-479b19c2-ceef-45ff-8765-4b88448470d6",
+    "traj-70b19941-f21d-4e73-85c6-b007cc4a01d9",
+    "traj-7de17462-21f7-4ed2-9e1a-98978596cb95",
+    "traj-9c681ffd-1c59-4eb9-b297-766896c09b44",
+    "traj-ad285c8c-036c-49cf-8c18-47595e1beb1d",
+    "traj-b433a2d0-39b3-4743-a619-7d26d6ba7170",
+    "traj-d0689fdc-4a0a-45a6-bb87-84b4907f09d8",
+    "traj-d5a5a060-9280-429e-adf0-25a31aff7503"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200458",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7caea702-2511-4a95-a864-3689aa2ad0d9.json b/docs/training-reports/report-7caea702-2511-4a95-a864-3689aa2ad0d9.json
new file mode 100644
index 0000000..8bf9088
--- /dev/null
+++ b/docs/training-reports/report-7caea702-2511-4a95-a864-3689aa2ad0d9.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-7caea702-2511-4a95-a864-3689aa2ad0d9",
+  "timestamp": "2026-04-14T15:53:50.734704+00:00",
+  "source_trajectory_ids": [
+    "traj-39cac35e-7c25-4060-a318-abc3b5348d4e",
+    "traj-51e0c7e0-92c9-48c7-a729-68ba3938f6b4",
+    "traj-56d1aa4a-1405-42bb-8f92-2a40256550c5",
+    "traj-58f3999c-11a3-4ec7-8045-ecf9c8a975e4",
+    "traj-5b061c74-cf63-4e25-be8e-57d5d37ba1fa",
+    "traj-6f128a84-d017-4a7d-9fd8-542a4bea0dce",
+    "traj-782f6872-920c-44ea-bb75-bdd6f8183a3a",
+    "traj-c01ae2e8-be7f-45d9-a96f-347cdac3df72",
+    "traj-d65b30df-971a-44e7-a639-e27b43547483",
+    "traj-edaf3d1f-5dac-4be8-a83a-741a5b5a573a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-155350"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7d902557-f70d-4fef-9d62-6aa342e8e377.json b/docs/training-reports/report-7d902557-f70d-4fef-9d62-6aa342e8e377.json
new file mode 100644
index 0000000..b4625b7
--- /dev/null
+++ b/docs/training-reports/report-7d902557-f70d-4fef-9d62-6aa342e8e377.json
@@ -0,0 +1,44 @@
+{
+  "report_id": "report-7d902557-f70d-4fef-9d62-6aa342e8e377",
+  "timestamp": "2026-04-14T18:30:24.441520+00:00",
+  "source_trajectory_ids": [
+    "traj-1223e20d-cc08-4cde-a091-d4d93076754f",
+    "traj-1c5231fe-2568-46fe-bede-5df699d68252",
+    "traj-2def2337-04e5-460e-b304-c56cd94cfa0c",
+    "traj-866ba769-fd58-4198-8cbf-b58536a6c8ad",
+    "traj-a4ac7fac-fb17-48aa-8ad3-29ab4ff2d82b",
+    "traj-c855a1b9-2b55-4ff8-8860-bdffa0007d9b",
+    "traj-cc900807-0dd5-4678-93e9-b7a5fcbb115b",
+    "traj-d7a83b94-ab03-47ae-bd34-43e3569b7661",
+    "traj-dcf39e28-ba1d-4f59-96ce-8f2f37392e0f",
+    "traj-f2106a1a-e7d9-4eed-adba-3b4efb403870"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7e1496f0-7f0c-4a63-b692-43cfbc20fa08.json b/docs/training-reports/report-7e1496f0-7f0c-4a63-b692-43cfbc20fa08.json
new file mode 100644
index 0000000..9f72d6b
--- /dev/null
+++ b/docs/training-reports/report-7e1496f0-7f0c-4a63-b692-43cfbc20fa08.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-7e1496f0-7f0c-4a63-b692-43cfbc20fa08",
+  "timestamp": "2026-04-14T18:06:58.447400+00:00",
+  "source_trajectory_ids": [
+    "traj-24b4113e-b5b7-4e95-8d60-cf8bffd93a31",
+    "traj-2b78cd34-d97f-47c5-a969-76261eb42302",
+    "traj-3a298882-04a8-4499-abc9-2cb86807d866",
+    "traj-45aeb4e1-7356-40b7-a3bb-a6a04c45053e",
+    "traj-4e4cb938-5ad3-4404-b4bd-470fe315478a",
+    "traj-52a8714e-18d3-4e6a-a86a-537d7fd019e0",
+    "traj-58810fbe-1d12-4865-873f-f305864cd277",
+    "traj-7ab1758b-3dc7-4fe4-ae8a-925b996b5b27",
+    "traj-d1e97442-2516-4929-8ca3-9d9c94fa38dc",
+    "traj-f4e5e924-7f05-4633-87a8-f14aeab64e4e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7e50ca6a-54a5-4bbd-81af-df48c6e8914f.json b/docs/training-reports/report-7e50ca6a-54a5-4bbd-81af-df48c6e8914f.json
new file mode 100644
index 0000000..22a0da4
--- /dev/null
+++ b/docs/training-reports/report-7e50ca6a-54a5-4bbd-81af-df48c6e8914f.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-7e50ca6a-54a5-4bbd-81af-df48c6e8914f",
+  "timestamp": "2026-04-14T20:57:28.147337+00:00",
+  "source_trajectory_ids": [
+    "traj-0d7435b2-984e-40a8-8edb-adf22890d7b8",
+    "traj-0f38e5b4-f39c-4d9f-afcc-717e81dbf51d",
+    "traj-16289330-a579-4aa4-a189-c9b31aeae31a",
+    "traj-3f346979-7403-4fe7-806e-991fc05eac08",
+    "traj-804e7f40-b6a6-41c0-bfbd-cee5faca7126",
+    "traj-8ec9f4da-e7f5-45d4-99ce-d17b78277424",
+    "traj-964f433b-b7d5-44d3-9ed9-0acf65540800",
+    "traj-bc2474b1-5269-4ddf-8142-a2d93ac72616",
+    "traj-e8a0b6c5-48bf-47c8-b33a-41a88c735c42",
+    "traj-fb49b641-4ab6-4415-a853-1cb76506cb8e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205728",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-7ed3bf36-e982-41d3-959a-1156ffd1999a.json b/docs/training-reports/report-7ed3bf36-e982-41d3-959a-1156ffd1999a.json
new file mode 100644
index 0000000..d6f1e64
--- /dev/null
+++ b/docs/training-reports/report-7ed3bf36-e982-41d3-959a-1156ffd1999a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-7ed3bf36-e982-41d3-959a-1156ffd1999a",
+  "timestamp": "2026-04-15T01:36:36.549118+00:00",
+  "source_trajectory_ids": [
+    "traj-1cfffe68-99f3-4b79-acaf-492f98bfb30e",
+    "traj-1d01aaea-2f82-4aab-808e-22b582cf3804",
+    "traj-46735e8f-ebaf-4e64-b7ff-fadbc7661eda",
+    "traj-4ed1dd60-007c-4b67-9bdb-5bc25301e439",
+    "traj-73c8466c-8313-4ac0-af99-97792bee79a7",
+    "traj-7df1403a-6441-486a-9486-5e8a09430c42",
+    "traj-905c6036-4972-407f-bec0-d0ceeeb0168f",
+    "traj-95a86969-c5c1-48a5-81f1-2f767b79ae75",
+    "traj-b11258b0-0969-48b0-96a3-d1dc48d6d4ba",
+    "traj-b6c9f538-3585-4653-a7dd-64ee41deb323"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013636",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-80957f87-ab7d-48b9-a539-135f6a2264c3.json b/docs/training-reports/report-80957f87-ab7d-48b9-a539-135f6a2264c3.json
new file mode 100644
index 0000000..93d61cb
--- /dev/null
+++ b/docs/training-reports/report-80957f87-ab7d-48b9-a539-135f6a2264c3.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-80957f87-ab7d-48b9-a539-135f6a2264c3",
+  "timestamp": "2026-04-14T20:34:01.478949+00:00",
+  "source_trajectory_ids": [
+    "traj-03c82893-5c8d-4a00-8d7d-fab40579de5c",
+    "traj-3556fa38-6b54-4852-a431-de7c4c8bf65c",
+    "traj-46299d00-2b46-4ad4-a54e-62da45757215",
+    "traj-489c5c00-15d0-41c9-876d-1cf33499efb6",
+    "traj-5b197c9d-6186-4c47-8b04-ffd9febc4c5e",
+    "traj-71b1132e-4113-45ce-a58f-4a433ea7b378",
+    "traj-8c57805b-3f69-4b28-ad9c-b40a0ac62980",
+    "traj-ca0abd8a-3477-4155-b959-b96fdfeafabd",
+    "traj-d4f2ec7a-e604-4fa1-a271-aa3c3578787b",
+    "traj-d78066c1-6ad5-4fe1-8f21-882a2eb1fb50"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203401",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-80e0b670-ea2a-47b3-ab54-6481afcb7c23.json b/docs/training-reports/report-80e0b670-ea2a-47b3-ab54-6481afcb7c23.json
new file mode 100644
index 0000000..79ab97a
--- /dev/null
+++ b/docs/training-reports/report-80e0b670-ea2a-47b3-ab54-6481afcb7c23.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-80e0b670-ea2a-47b3-ab54-6481afcb7c23",
+  "timestamp": "2026-04-14T22:10:23.390552+00:00",
+  "source_trajectory_ids": [
+    "traj-3a34ed0e-6006-4fcf-a401-d583ea95266e",
+    "traj-3ee70787-9427-42c4-9993-e6a7406401c7",
+    "traj-3f0b5592-b637-4d45-9a00-722be627c706",
+    "traj-495ace0b-e88e-4a49-99ea-1e9982bfdb9f",
+    "traj-6dafc03e-d667-439d-b691-237a437a0cd2",
+    "traj-74950b70-551f-4ef0-9d50-d36552403354",
+    "traj-837ddc95-4152-461c-9bd7-fd789734a9d4",
+    "traj-9ce5a24d-35f4-49fb-932d-8053c8ccfea3",
+    "traj-c1f1da9d-c325-4207-b54c-f0a79575ab02",
+    "traj-cfcbeec2-de79-4382-83be-ed7382811289"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221023",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-813a3b23-c617-45bd-bcd6-e941054922cb.json b/docs/training-reports/report-813a3b23-c617-45bd-bcd6-e941054922cb.json
new file mode 100644
index 0000000..169301a
--- /dev/null
+++ b/docs/training-reports/report-813a3b23-c617-45bd-bcd6-e941054922cb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-813a3b23-c617-45bd-bcd6-e941054922cb",
+  "timestamp": "2026-04-14T18:57:10.511393+00:00",
+  "source_trajectory_ids": [
+    "traj-11604d98-9fda-4d93-b95a-b8890c081387",
+    "traj-41d5bde2-6c0c-4d64-a950-caa5dab1ec28",
+    "traj-5c7b8795-4ce4-45f7-8672-dc60e183adb5",
+    "traj-7e40294c-b7b4-4556-ae27-f505f784b349",
+    "traj-96987742-db16-4bbe-8a86-c3b8b8a1b200",
+    "traj-aa145147-f81e-47b7-9e7a-1e5d854d7709",
+    "traj-c1e9f16a-65d1-4e40-9dac-bf9395931317",
+    "traj-e0256034-a98c-4f72-b038-2c7cefb311d3",
+    "traj-ea461c8f-b1f9-4eee-86c3-dbf6f899f5b2",
+    "traj-eba44f0c-d723-4561-b9fc-85a3343a0144"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185710",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-81576e60-8de8-4f78-bfbc-d9868a297c74.json b/docs/training-reports/report-81576e60-8de8-4f78-bfbc-d9868a297c74.json
new file mode 100644
index 0000000..c81faed
--- /dev/null
+++ b/docs/training-reports/report-81576e60-8de8-4f78-bfbc-d9868a297c74.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-81576e60-8de8-4f78-bfbc-d9868a297c74",
+  "timestamp": "2026-04-14T19:41:33.388720+00:00",
+  "source_trajectory_ids": [
+    "traj-0ba4e422-2a4a-437b-b206-7c75cd7b1fd2",
+    "traj-130fb6f1-d212-45bd-a37d-c0d4679f3824",
+    "traj-5deb5b82-ffc9-4609-a1fe-201d34ab4f84",
+    "traj-776a9217-14a1-47ee-8931-fc68ecd29df4",
+    "traj-8b25fac4-7465-4cfd-9e04-3867a7564d89",
+    "traj-9e9bcd8b-4283-405c-b7da-f05ba2078152",
+    "traj-adf8cfbc-bc31-46cf-8371-8329f7781625",
+    "traj-c1872f64-599f-4e00-94af-ab3082573351",
+    "traj-c1a75b20-8145-4744-bfb7-5cc4f38814d2",
+    "traj-e71f334a-047d-4976-b2b9-17156cefa676"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-194133",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-83931755-d4b6-4ffc-a3ba-bb42a57c2703.json b/docs/training-reports/report-83931755-d4b6-4ffc-a3ba-bb42a57c2703.json
new file mode 100644
index 0000000..98f5b14
--- /dev/null
+++ b/docs/training-reports/report-83931755-d4b6-4ffc-a3ba-bb42a57c2703.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-83931755-d4b6-4ffc-a3ba-bb42a57c2703",
+  "timestamp": "2026-04-14T20:32:37.622875+00:00",
+  "source_trajectory_ids": [
+    "traj-1b10bd44-58ca-4529-ac5a-450101520ead",
+    "traj-1c07e980-c3b3-4d31-8a92-6917518f6d0f",
+    "traj-26404c84-9032-48e9-a252-5e5c759d3ed9",
+    "traj-5032358e-b033-477c-aa80-6d443eeec3e5",
+    "traj-525efb64-bd21-4e59-b0b4-7e95d5ee8bc3",
+    "traj-5c574fba-966a-44f9-907c-71203298ce82",
+    "traj-7a0c61a4-6e75-4269-82c7-ee6cf735bb8a",
+    "traj-98490501-2251-4637-a29e-3dd00705bbc3",
+    "traj-bf6cb52a-66f2-4455-a212-0fa93adabafa",
+    "traj-fcc0b20d-a2e0-4493-9960-a284d8df8c75"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203237",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-846503cc-7539-411b-bec8-797423739b33.json b/docs/training-reports/report-846503cc-7539-411b-bec8-797423739b33.json
new file mode 100644
index 0000000..b4528d0
--- /dev/null
+++ b/docs/training-reports/report-846503cc-7539-411b-bec8-797423739b33.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-846503cc-7539-411b-bec8-797423739b33",
+  "timestamp": "2026-04-14T21:21:15.060958+00:00",
+  "source_trajectory_ids": [
+    "traj-037ac519-ba97-4b7a-97d5-f97f9b85a151",
+    "traj-3d24b7c0-5651-4fee-a4e8-fa304770b9cd",
+    "traj-472a907c-e919-46db-8f7d-20f3158fb8b9",
+    "traj-613362df-8139-4c00-a325-75fcb3d43d0f",
+    "traj-7fc820e9-f38d-4e3a-832b-6e1789e754bc",
+    "traj-834a1362-b9c3-425f-a814-3e37c0d9b3b4",
+    "traj-94fdc828-4d7b-4174-813d-64937f461ff8",
+    "traj-c6477509-0608-44ed-8044-6178c3840771",
+    "traj-ce1d2299-fd21-4b8f-9c52-63ace4b36ad9",
+    "traj-d0a029ec-7272-4ba1-9c5d-ff6339a88cd5"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212115",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-84999c85-add6-4217-a619-d7e9b885ebd4.json b/docs/training-reports/report-84999c85-add6-4217-a619-d7e9b885ebd4.json
new file mode 100644
index 0000000..e5be4a3
--- /dev/null
+++ b/docs/training-reports/report-84999c85-add6-4217-a619-d7e9b885ebd4.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-84999c85-add6-4217-a619-d7e9b885ebd4",
+  "timestamp": "2026-04-14T21:22:14.902538+00:00",
+  "source_trajectory_ids": [
+    "traj-25d53d2d-ad4c-47ef-b69f-7775319a0425",
+    "traj-42e48398-78b7-4d4c-9857-2e508d277631",
+    "traj-84cb72ab-2972-4243-a4c7-92aa08e25061",
+    "traj-99c0ca36-dc99-4a93-8dcb-15eb9d4d42b4",
+    "traj-b3ab1fd1-e41f-48a2-9a96-366685a6ee6b",
+    "traj-b99760af-283f-4abc-b9d6-08e7ae18be5a",
+    "traj-cc1b2fea-8ec3-4258-a76d-f78ea128e199",
+    "traj-d252955f-ad96-4cd3-b56e-08907e50aefa",
+    "traj-f029e126-7286-45ef-8877-6a6361f13949",
+    "traj-fff66343-2d72-43ff-b5a4-0c5bf6f6945f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212214",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-861ec0ac-12df-4364-8353-833b01551326.json b/docs/training-reports/report-861ec0ac-12df-4364-8353-833b01551326.json
new file mode 100644
index 0000000..7b4c599
--- /dev/null
+++ b/docs/training-reports/report-861ec0ac-12df-4364-8353-833b01551326.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-861ec0ac-12df-4364-8353-833b01551326",
+  "timestamp": "2026-04-14T20:28:05.488890+00:00",
+  "source_trajectory_ids": [
+    "traj-076d450f-b0f8-4056-b3e8-6a6bd4358853",
+    "traj-109d9788-f729-4c43-b1a8-fd9760e58823",
+    "traj-10cd4d15-2952-4d92-9d69-9bae45de014b",
+    "traj-33a631a9-9133-4de1-8076-09b7a492a98b",
+    "traj-3c98074c-1094-4dd9-bc2b-ce86893a8020",
+    "traj-6dd93046-7964-4d07-8a68-0f3e21ffb971",
+    "traj-b632f2e8-de53-48d0-b80f-ce84f693585f",
+    "traj-c0bbb5e5-14c1-4b13-b104-a383a388da99",
+    "traj-c3a0367b-b9b6-4fd7-bf96-9e300c213a8d",
+    "traj-df9f86c0-b697-48e7-b4ee-284708dd389f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-202805",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-8632d57e-b8cd-478e-9a0f-67e80fdb083a.json b/docs/training-reports/report-8632d57e-b8cd-478e-9a0f-67e80fdb083a.json
new file mode 100644
index 0000000..87dd2d9
--- /dev/null
+++ b/docs/training-reports/report-8632d57e-b8cd-478e-9a0f-67e80fdb083a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-8632d57e-b8cd-478e-9a0f-67e80fdb083a",
+  "timestamp": "2026-04-15T01:41:52.320732+00:00",
+  "source_trajectory_ids": [
+    "traj-0a0b4023-fc35-42f9-b6ba-3702f51d1ee5",
+    "traj-0e1a3b4d-bd5e-4b77-abd3-8d919381c39b",
+    "traj-3d790042-c656-4156-bde8-c09f26ca5e29",
+    "traj-3ec762af-785f-45d3-8fc4-97e5552d87a3",
+    "traj-46bdb0d8-e0f4-4f40-a827-bb5af885ac63",
+    "traj-5263feac-df7e-46eb-9d43-c7a2854dadce",
+    "traj-9c78fed2-f613-4fa3-9cf6-149392bf85db",
+    "traj-a8aa0a98-39cf-4967-92cf-3989f9160491",
+    "traj-ac9f41a7-bd74-487b-a0ee-099f16376dde",
+    "traj-b8b25237-11ef-4d11-b103-19336da2a915"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-014152",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-8648638d-4733-4f01-b253-ed8bdd622f31.json b/docs/training-reports/report-8648638d-4733-4f01-b253-ed8bdd622f31.json
new file mode 100644
index 0000000..1a702de
--- /dev/null
+++ b/docs/training-reports/report-8648638d-4733-4f01-b253-ed8bdd622f31.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-8648638d-4733-4f01-b253-ed8bdd622f31",
+  "timestamp": "2026-04-14T21:42:45.793182+00:00",
+  "source_trajectory_ids": [
+    "traj-191d8496-6a70-498f-a9b8-0ed466282bef",
+    "traj-1ce69adc-9039-4855-a1c8-0f09bc98e480",
+    "traj-5ba7c32d-3f3f-48e3-a21e-ca00432b6336",
+    "traj-6cdc3520-b9bf-4755-bdbd-0d557bfd6861",
+    "traj-8d5974c0-1ff8-4d40-833a-c34092a41190",
+    "traj-a6673469-e3bc-450b-bba3-705533546f28",
+    "traj-c3f46d8b-05a2-4f61-9bfe-611178e853ac",
+    "traj-dc2894ca-7ce8-4832-b202-4e881ec6c321",
+    "traj-ed936ed8-a13c-484c-a81b-9660ac914a77",
+    "traj-f24b1f5a-c880-4e57-aa0b-aa925ff7b3b8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214245",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-87e8c52b-b9d4-4313-9427-6ff15435f288.json b/docs/training-reports/report-87e8c52b-b9d4-4313-9427-6ff15435f288.json
new file mode 100644
index 0000000..f494641
--- /dev/null
+++ b/docs/training-reports/report-87e8c52b-b9d4-4313-9427-6ff15435f288.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-87e8c52b-b9d4-4313-9427-6ff15435f288",
+  "timestamp": "2026-04-14T20:57:28.031518+00:00",
+  "source_trajectory_ids": [
+    "traj-24e288c1-5b5f-466b-8d57-8947efe30c3d",
+    "traj-419f0bc4-2099-4702-8cae-e09559835562",
+    "traj-4b96d746-8f00-43d8-964d-d89daa892714",
+    "traj-5daee68f-b5d3-49d4-b443-df61bb4fd1dd",
+    "traj-70cba1f5-10a1-4b5e-b3a7-b2e58376e304",
+    "traj-79c24421-c9d5-4cd1-bdcf-f7fa5bca13bb",
+    "traj-8800b4d2-7581-49c0-9f0f-b7b2a8c80e51",
+    "traj-8f4500be-5f43-4bab-8727-209a40cbdd80",
+    "traj-cc04b521-7699-4d1d-ad39-0d17f8ec039c",
+    "traj-d313b94a-5d22-44aa-bd16-cad1abd710ee"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205728",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-883d758b-a1da-4510-88c5-cebf8ce0ba31.json b/docs/training-reports/report-883d758b-a1da-4510-88c5-cebf8ce0ba31.json
new file mode 100644
index 0000000..ee0c5c7
--- /dev/null
+++ b/docs/training-reports/report-883d758b-a1da-4510-88c5-cebf8ce0ba31.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-883d758b-a1da-4510-88c5-cebf8ce0ba31",
+  "timestamp": "2026-04-14T17:16:51.844103+00:00",
+  "source_trajectory_ids": [
+    "traj-10750d7e-aec2-49e4-8c6a-49bf8d2bf620",
+    "traj-10fcbc94-01ac-42e2-804f-1d54ef7f5a87",
+    "traj-472748b4-dc44-408f-9ee5-a7991053cb4b",
+    "traj-52cf389a-e84e-4490-aa75-a0c116f8f82b",
+    "traj-66a4e8e5-42d0-4046-a9be-4c02c50325a5",
+    "traj-6d41eb83-b2cb-4987-a44f-16eb13d32d26",
+    "traj-9552c648-2d0b-4ca7-84ea-d738964bb54c",
+    "traj-aa5b30f3-a9f4-46c8-951e-4d3360ca1c15",
+    "traj-f4a7999c-0982-4ae0-ab0e-770de4c4ab8c",
+    "traj-f9a2ae8d-8fff-4e7b-951a-2e536bd8c96e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-88e3948c-3625-405b-af33-c51e9135aff2.json b/docs/training-reports/report-88e3948c-3625-405b-af33-c51e9135aff2.json
new file mode 100644
index 0000000..6897b60
--- /dev/null
+++ b/docs/training-reports/report-88e3948c-3625-405b-af33-c51e9135aff2.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-88e3948c-3625-405b-af33-c51e9135aff2",
+  "timestamp": "2026-04-14T16:52:41.146294+00:00",
+  "source_trajectory_ids": [
+    "traj-13e828a9-1ba3-4a80-9047-cbda9d2d5b6e",
+    "traj-50cd11c9-13b9-4265-ad3a-9ec07cd54504",
+    "traj-5837b135-1320-4be2-927e-3d9196e2b9da",
+    "traj-5dab0e5f-e8d1-40af-b2ea-fa691b9e9615",
+    "traj-740b3a62-9784-461c-a817-8f0615946d63",
+    "traj-9f118b9b-63fe-400f-ab25-959518aa658b",
+    "traj-cb67afcf-8c40-42bb-a469-b838e74c046a",
+    "traj-d0c75c4c-6241-4a19-9e12-a0c052d51108",
+    "traj-e4db8044-df9c-4c4c-ab98-97c839d8945e",
+    "traj-fe162343-b729-4d58-a395-a85ea4dd1921"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165241"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-8c7eae98-78c8-4392-b3bb-e08f1b55b477.json b/docs/training-reports/report-8c7eae98-78c8-4392-b3bb-e08f1b55b477.json
new file mode 100644
index 0000000..4241dbc
--- /dev/null
+++ b/docs/training-reports/report-8c7eae98-78c8-4392-b3bb-e08f1b55b477.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-8c7eae98-78c8-4392-b3bb-e08f1b55b477",
+  "timestamp": "2026-04-14T15:02:28.783291+00:00",
+  "source_trajectory_ids": [
+    "traj-3fc37a97-5031-4cf8-8d66-42bc06486572",
+    "traj-49691566-2f6c-4774-bcf2-7e92cf7177f9",
+    "traj-678c08d1-4c36-43e0-bca4-962a057e50ab",
+    "traj-889ee048-f42d-493b-83c5-46e4d0bf1af7",
+    "traj-8b29c640-6379-4155-909f-1e0aad020cfe",
+    "traj-98bac095-cb81-4b20-82b4-60241df64bc5",
+    "traj-9c775265-5f9a-4552-836d-a6cd2c50067c",
+    "traj-abf09758-c6a2-493f-94ab-3939964b6ece",
+    "traj-c7bf056e-fe8a-42b4-a0e6-8bb111616e4c",
+    "traj-d923f51a-8d41-4cf4-b2e1-5f6aa3cfbc40"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-150228"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-8d24c2b9-cd2a-4637-b590-6c4bb256b30e.json b/docs/training-reports/report-8d24c2b9-cd2a-4637-b590-6c4bb256b30e.json
new file mode 100644
index 0000000..bc1455e
--- /dev/null
+++ b/docs/training-reports/report-8d24c2b9-cd2a-4637-b590-6c4bb256b30e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-8d24c2b9-cd2a-4637-b590-6c4bb256b30e",
+  "timestamp": "2026-04-14T16:52:07.930194+00:00",
+  "source_trajectory_ids": [
+    "traj-4a66d419-43d3-4300-b341-7bee0dbad001",
+    "traj-67c2f7d0-bd50-4fe0-b7f5-9cf6280a8b15",
+    "traj-6aee8a88-7456-477d-803e-84b29665e300",
+    "traj-6ba96b06-a520-4750-be96-f6e7d47bb05c",
+    "traj-74f20e0d-f7a3-4919-acac-b9d17bb668bf",
+    "traj-7f78f0f4-db8b-4f69-9820-f76ee0b3bb51",
+    "traj-af8a8b2a-a62d-4d47-b8c9-3072d23cd291",
+    "traj-b06ec696-af9c-48ee-949e-59fcc53c27ad",
+    "traj-d3b29b21-6816-4602-a43e-d9e7999e370d",
+    "traj-f6ab046b-4bb3-4f69-85d3-411b3c53c871"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-8d4c7747-fa7c-4644-be2f-970c83f00bf9.json b/docs/training-reports/report-8d4c7747-fa7c-4644-be2f-970c83f00bf9.json
new file mode 100644
index 0000000..0cc1dda
--- /dev/null
+++ b/docs/training-reports/report-8d4c7747-fa7c-4644-be2f-970c83f00bf9.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-8d4c7747-fa7c-4644-be2f-970c83f00bf9",
+  "timestamp": "2026-04-14T20:33:28.701057+00:00",
+  "source_trajectory_ids": [
+    "traj-052faf89-7da2-47b6-8d10-c6e2a9ebcbd4",
+    "traj-086a884a-9385-446c-a84d-a2b913fc999d",
+    "traj-0f01bc07-6889-437c-a17d-62282f3268b9",
+    "traj-13b10580-5268-42e1-b694-30b2d6b6a805",
+    "traj-57b3a72c-9b3e-4bf3-85ce-0ecaf911ca7e",
+    "traj-7d9dfe23-94ff-4780-93f7-95d52c7514f1",
+    "traj-bf08149d-17e5-4852-b729-22b0c0dd6722",
+    "traj-d1e4a1f4-94ea-48c0-b14f-e84c22bda1ed",
+    "traj-ea5e5e8a-b34a-4dbe-a904-077a747c61eb",
+    "traj-f69f3469-1f86-41e7-9de6-33f517bbcd56"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-929ce385-3224-4343-b6a3-623d743d00d0.json b/docs/training-reports/report-929ce385-3224-4343-b6a3-623d743d00d0.json
new file mode 100644
index 0000000..53ecdda
--- /dev/null
+++ b/docs/training-reports/report-929ce385-3224-4343-b6a3-623d743d00d0.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-929ce385-3224-4343-b6a3-623d743d00d0",
+  "timestamp": "2026-04-14T17:15:16.809604+00:00",
+  "source_trajectory_ids": [
+    "traj-09e794a0-c453-468e-abf4-671f75d09b27",
+    "traj-17352f71-f442-4a14-a968-3ac95478d226",
+    "traj-45c6096f-f884-454d-8ee0-86796ed28e3c",
+    "traj-b6c5ae89-313e-4c08-a3cf-44019a6b54f9",
+    "traj-bea9c4d9-616a-4982-b984-04fb347f3833",
+    "traj-c19e693c-5497-43f5-8279-97e33e6ae6bd",
+    "traj-d71c1ee7-c50b-4c24-9c9c-99a25c682734",
+    "traj-dc39ead7-258a-4b55-aa7a-5d93bddd628b",
+    "traj-dd1a3c6d-497d-4926-8073-755e7d6a83f0",
+    "traj-fd07ca81-b6de-4afe-afa3-54855388b6d5"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-171516"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-9411737d-5da5-4423-afb8-960331cdc84e.json b/docs/training-reports/report-9411737d-5da5-4423-afb8-960331cdc84e.json
new file mode 100644
index 0000000..7a5a666
--- /dev/null
+++ b/docs/training-reports/report-9411737d-5da5-4423-afb8-960331cdc84e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-9411737d-5da5-4423-afb8-960331cdc84e",
+  "timestamp": "2026-04-14T22:08:19.704176+00:00",
+  "source_trajectory_ids": [
+    "traj-0e884ed2-e5a4-4958-8ccd-949789f2f075",
+    "traj-1ecb2bbb-062f-414f-a766-168911840e45",
+    "traj-315f5e26-3ada-4e15-9cd2-cc3bbe56961c",
+    "traj-535bbe8b-024a-4f48-a2d5-7b47d96eeff1",
+    "traj-59d74691-087b-4fb8-be2e-8494044a5c5e",
+    "traj-73365bf3-01a6-4b89-aebc-868662c93d53",
+    "traj-a2fc2a8a-0625-49ee-9dce-55f898108099",
+    "traj-b87b6d30-5cd5-47ea-906d-574c4d159708",
+    "traj-d84b9e7c-c4aa-431f-805d-11a3e9bb0e5f",
+    "traj-e3db39fc-ae68-4eb3-b0d7-9ba77f6605f6"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220819",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-94760e0f-e122-41d8-af83-333a8c37a193.json b/docs/training-reports/report-94760e0f-e122-41d8-af83-333a8c37a193.json
new file mode 100644
index 0000000..2e2c8ea
--- /dev/null
+++ b/docs/training-reports/report-94760e0f-e122-41d8-af83-333a8c37a193.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-94760e0f-e122-41d8-af83-333a8c37a193",
+  "timestamp": "2026-04-14T18:27:21.162078+00:00",
+  "source_trajectory_ids": [
+    "traj-01e96207-75ce-48c2-8091-10f0bb7e37c4",
+    "traj-06eb55c2-4f97-4fc8-a784-ef6c4d817073",
+    "traj-20570b08-b2a6-454e-80ce-caa0fca0383d",
+    "traj-2131dbb2-21f4-4eb0-abe4-5ed360d9edc1",
+    "traj-5079dd02-a733-49f6-9e78-725036d4ea60",
+    "traj-6c16d814-aa3f-447c-b7b6-5fdf70746110",
+    "traj-a426c334-3e74-4b85-bab6-78d3cffbe889",
+    "traj-c0a40c81-335f-48c1-b2f4-62c5d2395083",
+    "traj-e65f84b2-506d-43c1-8348-6c6d31a92cb9",
+    "traj-f4e4a5d7-614d-4ddb-964c-0ff97b4117ba"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-95ce94d7-1f5a-4b01-abfc-33221db5edb8.json b/docs/training-reports/report-95ce94d7-1f5a-4b01-abfc-33221db5edb8.json
new file mode 100644
index 0000000..1d364bb
--- /dev/null
+++ b/docs/training-reports/report-95ce94d7-1f5a-4b01-abfc-33221db5edb8.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-95ce94d7-1f5a-4b01-abfc-33221db5edb8",
+  "timestamp": "2026-04-14T22:05:59.116543+00:00",
+  "source_trajectory_ids": [
+    "traj-0bfe93a3-bae9-4362-8f20-9c72c4b1f327",
+    "traj-0f156cb4-a83d-4e61-a20b-7f79ad707683",
+    "traj-4895bea3-bbc0-41a5-90e4-cb1d3dda6a67",
+    "traj-51649ce3-4499-4149-90db-62efd52e79f4",
+    "traj-584e3954-521b-4418-90e5-2b61caa50403",
+    "traj-9cadb24c-e9a3-4d40-824c-597569d7c2cf",
+    "traj-9f822bbd-502d-485b-b54a-deb4470e555a",
+    "traj-c7b2b31b-fd5d-478b-add5-9ed96b171036",
+    "traj-df615790-893c-4421-92b0-e0bdfdd94844",
+    "traj-e553ba9f-3ab9-4463-91bb-b651bd213dda"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220559",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-968dc267-95ba-4796-a51e-76f67e534009.json b/docs/training-reports/report-968dc267-95ba-4796-a51e-76f67e534009.json
new file mode 100644
index 0000000..f006aff
--- /dev/null
+++ b/docs/training-reports/report-968dc267-95ba-4796-a51e-76f67e534009.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-968dc267-95ba-4796-a51e-76f67e534009",
+  "timestamp": "2026-04-14T15:01:27.701445+00:00",
+  "source_trajectory_ids": [
+    "traj-083bd3bc-cae0-4982-b7ce-fdeb1bc18028",
+    "traj-19771c35-b6ec-4863-8543-c08370ca0822",
+    "traj-1c6864b5-22a1-4b81-8516-50ebe8eef5ae",
+    "traj-3af19065-9017-465b-a183-f3b860a1d228",
+    "traj-43adff43-d81c-4f0e-86da-b27c07f8a6ca",
+    "traj-5167e5bc-e287-47e7-8d5c-b8a1171f7978",
+    "traj-543b5227-9ea6-4007-affd-de515c82aa82",
+    "traj-d75e3381-3303-4e9b-ba2c-a0937bd825c0",
+    "traj-e0df49ba-e771-4e0f-aa5d-92f5677b6f0b",
+    "traj-f8efbc7a-8f37-4edd-85b1-d7b8ae8021eb"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-150127"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-9693116a-6a49-479d-bd5c-d7e2e31d762e.json b/docs/training-reports/report-9693116a-6a49-479d-bd5c-d7e2e31d762e.json
new file mode 100644
index 0000000..df1fac5
--- /dev/null
+++ b/docs/training-reports/report-9693116a-6a49-479d-bd5c-d7e2e31d762e.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-9693116a-6a49-479d-bd5c-d7e2e31d762e",
+  "timestamp": "2026-04-14T22:08:19.819939+00:00",
+  "source_trajectory_ids": [
+    "traj-13a9f6c9-7e91-4697-8fbe-4566db4e6b7c",
+    "traj-2d8bd089-7a55-4b37-9779-2d6eff8b5c75",
+    "traj-430c8552-caa3-48d4-a04a-50d33b2f4651",
+    "traj-4dbf435e-2948-49b9-8128-e9ad6aecb83b",
+    "traj-7efca5b5-d777-4d02-bd1c-a1bcc894946e",
+    "traj-81d83ce2-17e2-48e3-a0f8-6daf658bfb98",
+    "traj-81ef2ecc-2b92-4311-af28-b603b6e1db14",
+    "traj-b9fa615c-fced-4c42-b354-0b653b56dbb5",
+    "traj-bef1d857-2276-47c8-b453-cff7d0aba2b2",
+    "traj-fccde1c2-f299-456f-a020-23e38a9677a2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-97d7333f-8fff-433c-b976-1bf7c91084ef.json b/docs/training-reports/report-97d7333f-8fff-433c-b976-1bf7c91084ef.json
new file mode 100644
index 0000000..3aab3dc
--- /dev/null
+++ b/docs/training-reports/report-97d7333f-8fff-433c-b976-1bf7c91084ef.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-97d7333f-8fff-433c-b976-1bf7c91084ef",
+  "timestamp": "2026-04-15T02:31:17.347142+00:00",
+  "source_trajectory_ids": [
+    "traj-02d262d8-1363-4c07-88b3-a31809c1e92c",
+    "traj-042ab936-62ca-4a6f-bd33-0bee39011ebc",
+    "traj-28f6ddb1-92a8-4869-9213-72d44636c31c",
+    "traj-2fe5a024-3ac8-42cc-a54b-1ef4c2c18ccc",
+    "traj-4a3a1415-6b20-493c-a507-cec184b19efc",
+    "traj-9dcb498e-8acc-47d8-9783-e6e21fd7911d",
+    "traj-d43c9fb3-11e2-4df1-97cb-1bef6aeb5cb1",
+    "traj-e8776a75-5865-4d50-95ac-f7c02a96754a",
+    "traj-f3f945da-3c7d-48d9-ae0f-779552c8e39f",
+    "traj-fa89579f-35d2-405e-8d65-bdc096d7d3a3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023117",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-98271ceb-8f12-4af9-8107-75c1d4417a18.json b/docs/training-reports/report-98271ceb-8f12-4af9-8107-75c1d4417a18.json
new file mode 100644
index 0000000..f5cc2aa
--- /dev/null
+++ b/docs/training-reports/report-98271ceb-8f12-4af9-8107-75c1d4417a18.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-98271ceb-8f12-4af9-8107-75c1d4417a18",
+  "timestamp": "2026-04-15T01:21:53.704331+00:00",
+  "source_trajectory_ids": [
+    "traj-1cb29d73-0930-4e93-9405-f3eba3ec5645",
+    "traj-246371c8-1d8f-415a-a1e7-e3a71e34013f",
+    "traj-4111d916-a401-40f3-808f-0a7fe984a576",
+    "traj-4af2386b-ddba-44e3-a9de-69167dacff25",
+    "traj-4dd46793-9506-460b-abbe-cab2513e68bb",
+    "traj-678326cb-a48e-4d57-9370-4f836d87d237",
+    "traj-6f8de3f1-e141-4b45-af97-54bb1bd89215",
+    "traj-8765e3b5-169e-44e7-b2e3-a89b8accbc48",
+    "traj-89094a9c-093b-4350-85df-1e45b82394cb",
+    "traj-fe60dc09-9b7c-422a-ac77-609981c40f2f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012153",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-983c0662-bcf9-4a16-9973-ddaddad78011.json b/docs/training-reports/report-983c0662-bcf9-4a16-9973-ddaddad78011.json
new file mode 100644
index 0000000..a7e667f
--- /dev/null
+++ b/docs/training-reports/report-983c0662-bcf9-4a16-9973-ddaddad78011.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-983c0662-bcf9-4a16-9973-ddaddad78011",
+  "timestamp": "2026-04-14T20:31:11.334248+00:00",
+  "source_trajectory_ids": [
+    "traj-06c09b49-ff21-4254-98ee-db1561f48e77",
+    "traj-146bd07d-3d43-4282-a0c8-9225ba05bc69",
+    "traj-1fa2cd91-42e3-421a-a1d3-1737dabb196f",
+    "traj-25b3b238-59dd-47dd-ad44-2663febeebfa",
+    "traj-3c10bf10-20ed-4e2a-b405-d72c9e00fec8",
+    "traj-3cf8ae87-53e2-41f7-b16e-4dace2d06754",
+    "traj-9c7bffcf-cd71-4509-96f3-1f62b79857fd",
+    "traj-a8b04220-2c5f-42da-9727-6ad8ce8c256e",
+    "traj-b67d4dbb-a598-4536-973a-eb8f5d5c099c",
+    "traj-feae3677-adb6-431b-bd8f-6f32bbcfff23"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-986b8fb7-7bfb-49b1-86c5-361d0ef3faec.json b/docs/training-reports/report-986b8fb7-7bfb-49b1-86c5-361d0ef3faec.json
new file mode 100644
index 0000000..6dcf352
--- /dev/null
+++ b/docs/training-reports/report-986b8fb7-7bfb-49b1-86c5-361d0ef3faec.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-986b8fb7-7bfb-49b1-86c5-361d0ef3faec",
+  "timestamp": "2026-04-14T19:19:02.064004+00:00",
+  "source_trajectory_ids": [
+    "traj-01798b8c-8f40-477c-a29c-20f9b63b2c39",
+    "traj-183830ec-e6b5-4fa6-9c21-ab37be03948c",
+    "traj-2e1c9767-6883-4102-a0c1-a37fd58a33dc",
+    "traj-2f68cc6e-84f9-468c-8651-6b798da1bad8",
+    "traj-34ebba6f-8f77-4bf9-b196-1c789e12d7b1",
+    "traj-6f7b3641-34a1-4635-b993-c456c7a68c56",
+    "traj-8ac3d02f-1ba7-45d6-9d51-32f7f7a698ca",
+    "traj-b7458383-db3a-46ac-992e-28cb0da0f5c8",
+    "traj-dfc99b79-5474-428d-959b-7e3187494d41",
+    "traj-f7e2bb59-2b0a-43d5-bacd-ae1b7ecde76b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-98f18532-e2a1-49c1-8b57-b6e3dd45052b.json b/docs/training-reports/report-98f18532-e2a1-49c1-8b57-b6e3dd45052b.json
new file mode 100644
index 0000000..a76bb8d
--- /dev/null
+++ b/docs/training-reports/report-98f18532-e2a1-49c1-8b57-b6e3dd45052b.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-98f18532-e2a1-49c1-8b57-b6e3dd45052b",
+  "timestamp": "2026-04-14T17:16:51.793811+00:00",
+  "source_trajectory_ids": [
+    "traj-0d1add2e-cc1f-4d95-93e7-83632005460c",
+    "traj-19552c13-b01d-4f63-95c3-8050ee73732e",
+    "traj-29336be2-22cb-483e-8481-59c99da47b6e",
+    "traj-363e23a6-ad90-4821-985b-d041c97275d5",
+    "traj-51055cfe-31df-473d-90c6-d6ef3d76b794",
+    "traj-51b8e02e-f36e-40f7-b9cf-c076c60fac7e",
+    "traj-88467392-2831-4a14-88f8-f21c7dd17487",
+    "traj-91b019f6-8e7e-4de8-84b6-7cb00976d1c6",
+    "traj-bc32252b-6a48-48ad-baa5-332a75fee261",
+    "traj-bd5b5c36-2f01-469d-b162-75d64359da0c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-171651"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-995ea2f1-7710-45f1-ada9-f5e91cc6563a.json b/docs/training-reports/report-995ea2f1-7710-45f1-ada9-f5e91cc6563a.json
new file mode 100644
index 0000000..b8494d3
--- /dev/null
+++ b/docs/training-reports/report-995ea2f1-7710-45f1-ada9-f5e91cc6563a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-995ea2f1-7710-45f1-ada9-f5e91cc6563a",
+  "timestamp": "2026-04-14T18:05:15.833596+00:00",
+  "source_trajectory_ids": [
+    "traj-0c8a0005-00ef-4c60-8fc3-c989d42758ea",
+    "traj-29cd85b7-bd1e-4f34-bba9-b49a61dd2fe5",
+    "traj-2c0e79d9-e25c-4303-9b5d-b562050bf5c7",
+    "traj-6af102dd-1d08-41d4-8165-ea4c8bc5d578",
+    "traj-6c109019-7ef3-41df-be70-9b33e9b0db92",
+    "traj-990cf702-35f5-45fd-be4e-8b437cc0860a",
+    "traj-a3383362-b355-4e3d-9e2b-e63cc2e5a536",
+    "traj-aaa92abe-ca48-4f6d-9fca-6977b0ba2763",
+    "traj-b8be0603-142d-4575-a0bc-1a4a324c3f8e",
+    "traj-bb66e0f8-0f96-46b0-9b80-565f35306d1a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-99e7039d-ad7d-40f3-8df1-921a6b4a755a.json b/docs/training-reports/report-99e7039d-ad7d-40f3-8df1-921a6b4a755a.json
new file mode 100644
index 0000000..253995a
--- /dev/null
+++ b/docs/training-reports/report-99e7039d-ad7d-40f3-8df1-921a6b4a755a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-99e7039d-ad7d-40f3-8df1-921a6b4a755a",
+  "timestamp": "2026-04-15T02:31:17.330200+00:00",
+  "source_trajectory_ids": [
+    "traj-01934281-c2cd-4655-9713-dcf9a6423612",
+    "traj-078d9a1d-75bb-43ce-a0c6-363ace32baae",
+    "traj-2bbef0ab-e642-474b-a4cb-c0c988f313bb",
+    "traj-2cb7f855-20ad-4a8b-b11c-30665c47d857",
+    "traj-85a46d03-4ccd-4002-88cc-1b3e30406b77",
+    "traj-8b61fd64-ea3c-4246-821d-84210bc8cc14",
+    "traj-ce33a0f6-414b-4d11-95a2-9146c8bb9125",
+    "traj-d4fe5501-4fcc-46ad-b75e-fc5ef9a3934e",
+    "traj-fa0fbbfb-e0b0-4452-8172-0c382518cf1d",
+    "traj-fdc82c0a-b867-4053-b8c4-44bdc0561d1f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023117",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-9d2e3473-442e-4fcb-b2dc-57b7e80ef623.json b/docs/training-reports/report-9d2e3473-442e-4fcb-b2dc-57b7e80ef623.json
new file mode 100644
index 0000000..bf90f3e
--- /dev/null
+++ b/docs/training-reports/report-9d2e3473-442e-4fcb-b2dc-57b7e80ef623.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-9d2e3473-442e-4fcb-b2dc-57b7e80ef623",
+  "timestamp": "2026-04-14T22:05:59.105612+00:00",
+  "source_trajectory_ids": [
+    "traj-1ca21b0f-fdc4-409e-a6a7-ee09a5f888b2",
+    "traj-2a2b3fc7-a3ec-421a-be23-470b44771087",
+    "traj-51813a04-e1db-4f5a-a1b4-3df19be7e08c",
+    "traj-637bc4f2-c93f-4ed9-b4d1-794b4cf086f4",
+    "traj-803fcf08-6779-49cd-9324-6b26b24f40fb",
+    "traj-9f3acfbb-ee22-41fc-9c81-ab6b6bbf8a0a",
+    "traj-a3928831-c8be-4ef5-b7fb-c993635f5d04",
+    "traj-b11b9446-05a2-405b-92ab-e759fd3e821a",
+    "traj-c36ad7bd-7533-4772-9ae0-39b06970a6ec",
+    "traj-f3fdf61f-eee6-46df-ae61-6aa2e1e2b167"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220559",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-9fd6cd04-3f46-4363-bc6c-65a68c7669e7.json b/docs/training-reports/report-9fd6cd04-3f46-4363-bc6c-65a68c7669e7.json
new file mode 100644
index 0000000..8dd1c1b
--- /dev/null
+++ b/docs/training-reports/report-9fd6cd04-3f46-4363-bc6c-65a68c7669e7.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-9fd6cd04-3f46-4363-bc6c-65a68c7669e7",
+  "timestamp": "2026-04-14T21:22:02.820380+00:00",
+  "source_trajectory_ids": [
+    "traj-14515db2-6257-46d3-8111-ee5298cb8707",
+    "traj-25357165-32ee-4c36-af02-2770f53f6207",
+    "traj-7f2fa927-eb98-4ed7-916f-fdbc2b1e8f03",
+    "traj-87eda674-8b88-4843-852b-c64e472ac374",
+    "traj-a4833d9a-3ee5-4cb4-bd1f-fa8b0a619b87",
+    "traj-afa6a33f-ff16-4fc4-a9ec-79ee42ca88d3",
+    "traj-b5789cbe-1ef7-4e48-a8eb-a09e0beb91c9",
+    "traj-ba7db9ce-e019-413c-b45e-50d2b03c7fb1",
+    "traj-e063b56f-1b26-4955-908a-e87e175b3e01",
+    "traj-f344b472-15ad-4e29-b084-de2185896a81"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212202",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a08199f7-c94c-4829-bace-e904a7f4ac74.json b/docs/training-reports/report-a08199f7-c94c-4829-bace-e904a7f4ac74.json
new file mode 100644
index 0000000..cb9bc08
--- /dev/null
+++ b/docs/training-reports/report-a08199f7-c94c-4829-bace-e904a7f4ac74.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a08199f7-c94c-4829-bace-e904a7f4ac74",
+  "timestamp": "2026-04-14T21:21:15.044727+00:00",
+  "source_trajectory_ids": [
+    "traj-0e13473a-9ee2-4567-a265-c6e2b9fd5100",
+    "traj-1e705a1c-5e10-4a9b-a00b-3653cffaa31c",
+    "traj-27c9411b-8b91-4382-9276-dca0733b2e36",
+    "traj-2a4f54be-13d9-49c8-9621-3de69fc023c0",
+    "traj-47cfb39a-67a1-488a-a060-484958e90abc",
+    "traj-5514a1b9-11ac-4039-85e3-7d8a509dbf3d",
+    "traj-5a38919f-4707-4604-94ca-8d383a89eaf7",
+    "traj-69386c22-782d-4c78-bd4d-b4c4c3e4d87c",
+    "traj-6c67ad21-4da3-4e2f-9a74-c5d457d3fccd",
+    "traj-c37ea4a2-83b9-4968-8a3c-8cc5c55c8899"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212115",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a0a7ee60-aa2d-4f53-8686-e6a72ed1d192.json b/docs/training-reports/report-a0a7ee60-aa2d-4f53-8686-e6a72ed1d192.json
new file mode 100644
index 0000000..6a5cac4
--- /dev/null
+++ b/docs/training-reports/report-a0a7ee60-aa2d-4f53-8686-e6a72ed1d192.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-a0a7ee60-aa2d-4f53-8686-e6a72ed1d192",
+  "timestamp": "2026-04-14T18:03:43.972514+00:00",
+  "source_trajectory_ids": [
+    "traj-016b4424-cdd6-453a-98a8-f4fa6da5f166",
+    "traj-0aacdc90-9e9e-48d2-876b-bc68e5af08e6",
+    "traj-141b40bc-85df-4d3d-a3e1-a1ee36a8cf48",
+    "traj-233b089d-0899-497d-a3a4-a9af408cfc58",
+    "traj-5fbeaa00-d1a7-4b4e-a151-6d71347e2f94",
+    "traj-7dbfdbc7-3113-48cd-bcf8-4066d6f8b98a",
+    "traj-7e3d6383-adb2-4f62-97fb-259f9d1778a3",
+    "traj-ea3648dd-a2e5-4d04-bc49-8c330f279b13",
+    "traj-ed863566-5d50-41a7-b23d-30f9229286fc",
+    "traj-f9c2b526-29bc-48c7-a150-2782b4a524cb"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180343"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a155a209-f98a-4672-984a-5822384f5c1a.json b/docs/training-reports/report-a155a209-f98a-4672-984a-5822384f5c1a.json
new file mode 100644
index 0000000..7a70a9a
--- /dev/null
+++ b/docs/training-reports/report-a155a209-f98a-4672-984a-5822384f5c1a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a155a209-f98a-4672-984a-5822384f5c1a",
+  "timestamp": "2026-04-15T02:33:47.917171+00:00",
+  "source_trajectory_ids": [
+    "traj-130fc82e-79ff-4b26-8521-cab929ec178d",
+    "traj-3f07d696-a342-4d07-8134-5c7dc5226ddf",
+    "traj-83b19d1e-ce8f-4dd3-a940-0f991a477eff",
+    "traj-87a2d328-8aa6-46e2-b84a-07ec14128b7f",
+    "traj-a5c5a69d-46ea-4cf2-8dc1-9258fff743c7",
+    "traj-acad6133-2bac-4e71-a546-adc0b110b975",
+    "traj-c94b9c62-c5f3-421b-916c-ae1028e3db57",
+    "traj-d78205e9-74e9-4de8-87f2-8871c81fa9ec",
+    "traj-d910f0c5-5ea6-472a-a94b-cfcd98e0b42b",
+    "traj-f9c6c857-b7cc-4b49-8991-85b43c869373"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023347",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a15627c6-080a-406e-a549-9e6fa828c1bf.json b/docs/training-reports/report-a15627c6-080a-406e-a549-9e6fa828c1bf.json
new file mode 100644
index 0000000..0f42b0e
--- /dev/null
+++ b/docs/training-reports/report-a15627c6-080a-406e-a549-9e6fa828c1bf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a15627c6-080a-406e-a549-9e6fa828c1bf",
+  "timestamp": "2026-04-14T20:30:08.656393+00:00",
+  "source_trajectory_ids": [
+    "traj-516c4b52-83f2-490d-8c8f-3321b1733b26",
+    "traj-5564bdb2-35f1-46b9-95b8-567e85e606bb",
+    "traj-66d9de3c-86db-4f36-ac6b-f210d98f6839",
+    "traj-87819fe2-3abf-4f0d-a924-8386a5c0683b",
+    "traj-9a2eb8ef-0d4c-45cf-9773-a86d86a7c8d5",
+    "traj-a29b6097-ceec-4d7a-aea8-f0efb4436486",
+    "traj-af5323f0-3699-4d16-9274-363a474477b9",
+    "traj-c9f5678e-5fa8-43db-bb96-45d8f67fdb2d",
+    "traj-cdce8964-60be-4a15-b9ae-37ec6300470d",
+    "traj-e168fb14-1e5e-4b33-9819-02e33d435812"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203008",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a41767b9-4256-4d1d-afcc-3b6f9dc85ddf.json b/docs/training-reports/report-a41767b9-4256-4d1d-afcc-3b6f9dc85ddf.json
new file mode 100644
index 0000000..cf03e9a
--- /dev/null
+++ b/docs/training-reports/report-a41767b9-4256-4d1d-afcc-3b6f9dc85ddf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a41767b9-4256-4d1d-afcc-3b6f9dc85ddf",
+  "timestamp": "2026-04-14T19:41:58.743590+00:00",
+  "source_trajectory_ids": [
+    "traj-44afabc6-efd0-4aaa-b815-361bc2c996a0",
+    "traj-5e29c8b5-bc67-48d4-8ce4-e44f87bb0f2e",
+    "traj-69fcd481-11f6-42e8-b89d-77cf0a841404",
+    "traj-6d52b9eb-0b50-42e8-b336-b44387d147c6",
+    "traj-a4508037-1c11-4e3a-b8a6-cfd453fb44ff",
+    "traj-c9e7e3f1-1971-47c7-8f9d-ed1f6faa4bd6",
+    "traj-cd23f8c5-adcb-456f-9682-d97629ea389e",
+    "traj-ce025b94-b21b-496f-895f-a8b4f241fd60",
+    "traj-da027372-9d50-4d5a-875d-732a504e5ff3",
+    "traj-da44d36d-600e-45bf-b66e-bce65a28e40b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-194158",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a4439f74-2df0-4bf0-a0e1-9a6f3b6bf4e4.json b/docs/training-reports/report-a4439f74-2df0-4bf0-a0e1-9a6f3b6bf4e4.json
new file mode 100644
index 0000000..a0cf123
--- /dev/null
+++ b/docs/training-reports/report-a4439f74-2df0-4bf0-a0e1-9a6f3b6bf4e4.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a4439f74-2df0-4bf0-a0e1-9a6f3b6bf4e4",
+  "timestamp": "2026-04-14T22:10:15.298250+00:00",
+  "source_trajectory_ids": [
+    "traj-0a4226f6-e916-4e96-9454-798fec39ff67",
+    "traj-0f7ea76d-a5d2-409e-9b21-7dee91b4854a",
+    "traj-17f515e4-5577-4107-8ced-644b541bf429",
+    "traj-280e5a6b-8d2a-4b79-8c3a-5d3f1b066c6a",
+    "traj-7414d872-e032-471c-9796-24dffd19426b",
+    "traj-846c0282-d6c3-490a-8cac-85576955c43d",
+    "traj-8a9d55c1-f594-48aa-8fc2-8bfc66ca5dab",
+    "traj-a657c7b6-7967-409e-8466-a9a65aa796d5",
+    "traj-ca1fc6dd-3168-4035-9a16-8f0653ca6306",
+    "traj-faf255fb-9f40-498a-be06-e77f3c04a763"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221015",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a699cdfc-432b-4561-8991-7e0129ec50f3.json b/docs/training-reports/report-a699cdfc-432b-4561-8991-7e0129ec50f3.json
new file mode 100644
index 0000000..6e8a816
--- /dev/null
+++ b/docs/training-reports/report-a699cdfc-432b-4561-8991-7e0129ec50f3.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a699cdfc-432b-4561-8991-7e0129ec50f3",
+  "timestamp": "2026-04-14T17:17:57.987874+00:00",
+  "source_trajectory_ids": [
+    "traj-1420cb03-d185-4412-992c-54f7db4e96aa",
+    "traj-3557f94b-279c-4c8c-bb33-77ff5d84937c",
+    "traj-4606f2f9-64a9-40df-b76e-580c15d39441",
+    "traj-5f8d3fe1-d60e-4c33-a569-ff507724503c",
+    "traj-5ffe92b7-295f-4e8e-9b95-2c108126e5a6",
+    "traj-6d7f1a66-5b58-4791-af4c-fd7a8a2ad21e",
+    "traj-8f4ceb49-eca2-4b8b-bbed-c55f1e37cca7",
+    "traj-a4f28432-26e3-4cb7-9f8d-a753dd265ca5",
+    "traj-a85fb06c-1c5a-4bda-bb76-71fa64108ea9",
+    "traj-e9894efb-bc73-4246-9ff5-346bf361c645"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a7c13fda-3bc4-40fe-8005-1ea30070e87d.json b/docs/training-reports/report-a7c13fda-3bc4-40fe-8005-1ea30070e87d.json
new file mode 100644
index 0000000..57857de
--- /dev/null
+++ b/docs/training-reports/report-a7c13fda-3bc4-40fe-8005-1ea30070e87d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a7c13fda-3bc4-40fe-8005-1ea30070e87d",
+  "timestamp": "2026-04-14T15:02:28.835899+00:00",
+  "source_trajectory_ids": [
+    "traj-1bbd6764-060e-4e1b-a736-85d1cdeac96c",
+    "traj-24593f48-53db-4ffb-9686-71c03a8d76bb",
+    "traj-390f0671-c917-43ce-af6c-e560adf1e853",
+    "traj-4c5c72fd-1211-441d-ad74-58209bbe0e07",
+    "traj-6810ea4f-454a-4cbe-95f8-477b310d6d28",
+    "traj-712830a1-eae2-4988-9e58-3916202f2602",
+    "traj-807823b8-f5c6-4c6f-9c7b-4450bf382446",
+    "traj-c3e7e656-bcba-482b-af68-8d7074a26a88",
+    "traj-c8ad6290-bf84-47a4-a0d1-10a56818483c",
+    "traj-fe08dd3e-9fbe-463d-8c17-8f0b168c963a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a8a3ceca-3d16-4dcd-a615-92350dcca0d0.json b/docs/training-reports/report-a8a3ceca-3d16-4dcd-a615-92350dcca0d0.json
new file mode 100644
index 0000000..d7032ff
--- /dev/null
+++ b/docs/training-reports/report-a8a3ceca-3d16-4dcd-a615-92350dcca0d0.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-a8a3ceca-3d16-4dcd-a615-92350dcca0d0",
+  "timestamp": "2026-04-14T20:28:05.555096+00:00",
+  "source_trajectory_ids": [
+    "traj-03d4d829-ffeb-4969-83b9-dae08e858a8e",
+    "traj-061c8e42-7cd6-4cca-b9b8-95041eec844d",
+    "traj-0b075101-cf3f-4793-89cf-c10a5c2d8292",
+    "traj-176a444f-0601-4c9a-905d-677c3eb2da22",
+    "traj-18aa001a-3d41-405c-8c0c-188611c8e2c6",
+    "traj-48364e4f-449b-4ad7-acab-0fe6fbbef59b",
+    "traj-727c5e76-1001-47d7-8d62-fc8764e6abf4",
+    "traj-83effcd6-7c25-462c-8754-290a782ddf5c",
+    "traj-cc60ce08-0bab-435c-8f77-34f82773c046",
+    "traj-f31a284d-ea48-4b74-92eb-e8002e85f14b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a8dc1af4-2dff-48ed-acca-064453de957b.json b/docs/training-reports/report-a8dc1af4-2dff-48ed-acca-064453de957b.json
new file mode 100644
index 0000000..d32de26
--- /dev/null
+++ b/docs/training-reports/report-a8dc1af4-2dff-48ed-acca-064453de957b.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-a8dc1af4-2dff-48ed-acca-064453de957b",
+  "timestamp": "2026-04-14T17:16:23.799704+00:00",
+  "source_trajectory_ids": [
+    "traj-17ce2ec3-0e16-4bd4-ac67-019e51caadb8",
+    "traj-2ae009a2-a5e8-4ff2-9f43-d792771aae64",
+    "traj-3ef172c6-6ea9-4fd6-83db-177b84191c1e",
+    "traj-6f9c4819-85bf-4f05-b9d3-79d4e1bc48f6",
+    "traj-929223b3-aadb-4f68-84d0-46e31c1fd289",
+    "traj-a9f4c049-d2d8-4c92-bcb3-c0faf0ea367c",
+    "traj-aa1aa2de-f72e-4672-b435-c20b90e6170c",
+    "traj-ab8d1feb-96cb-44f1-9712-13ede898eb56",
+    "traj-b032db6f-0262-4078-8110-ce02963d6f4a",
+    "traj-c98ae591-93dd-4d1c-acc8-77f17792725d"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-171623"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a9207b6a-4ce4-4a2e-8c68-0f8bc824d25a.json b/docs/training-reports/report-a9207b6a-4ce4-4a2e-8c68-0f8bc824d25a.json
new file mode 100644
index 0000000..08357bd
--- /dev/null
+++ b/docs/training-reports/report-a9207b6a-4ce4-4a2e-8c68-0f8bc824d25a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a9207b6a-4ce4-4a2e-8c68-0f8bc824d25a",
+  "timestamp": "2026-04-14T20:03:02.565465+00:00",
+  "source_trajectory_ids": [
+    "traj-02511035-45f1-445a-aba9-628c6cd28531",
+    "traj-3275a8f1-b2bd-433e-b303-e03b4ebbb9a0",
+    "traj-43ea388a-0a38-4650-a607-019c02e994c2",
+    "traj-781042f3-d3f1-491a-8140-2f0b29aaafb2",
+    "traj-897a2ffd-6b31-4b78-8573-fbb9298c1308",
+    "traj-c7d1ccc9-5436-4380-8d91-54129f3ca332",
+    "traj-c7fa45db-f91c-485c-aee5-4d8ae31f7c27",
+    "traj-cff8e2f2-27b2-4064-89ae-bfe9ac4f5a4b",
+    "traj-f3e9f993-55ba-41da-8c9f-ec7b2a5a1d85",
+    "traj-fee1b187-7073-4490-b82a-1a66ae1f061e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200302",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-a94d4e71-5183-4bc7-9093-13afd6d4af21.json b/docs/training-reports/report-a94d4e71-5183-4bc7-9093-13afd6d4af21.json
new file mode 100644
index 0000000..a6868aa
--- /dev/null
+++ b/docs/training-reports/report-a94d4e71-5183-4bc7-9093-13afd6d4af21.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-a94d4e71-5183-4bc7-9093-13afd6d4af21",
+  "timestamp": "2026-04-14T18:03:44.027491+00:00",
+  "source_trajectory_ids": [
+    "traj-10901c12-d7d2-4575-bab0-481bdf4cad64",
+    "traj-11df76ec-bd7c-45cc-bdb9-c596ab840488",
+    "traj-2c7bd714-3320-4f81-b173-714c94b15d32",
+    "traj-3657359a-8270-4599-89bb-a8d1f746f151",
+    "traj-4782717b-8feb-40db-97fa-d40f2f3ed3f7",
+    "traj-5a3851a6-3a65-4cfe-a680-6e93ff270b2f",
+    "traj-6a428f29-648e-492c-951e-f66313f5a223",
+    "traj-bc4eeb6a-53ef-42be-9358-97f14f5faa51",
+    "traj-def08080-e44b-4962-9600-f9c93bf1735f",
+    "traj-e1819f76-b5fe-4a3d-a0eb-0e63baa71367"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-abcad13b-e47e-422d-899b-bd3506c50f90.json b/docs/training-reports/report-abcad13b-e47e-422d-899b-bd3506c50f90.json
new file mode 100644
index 0000000..afc091d
--- /dev/null
+++ b/docs/training-reports/report-abcad13b-e47e-422d-899b-bd3506c50f90.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-abcad13b-e47e-422d-899b-bd3506c50f90",
+  "timestamp": "2026-04-14T14:58:22.633922+00:00",
+  "source_trajectory_ids": [
+    "traj-03ec4e07-01bd-4a3e-92b8-7d47b9f4f38d",
+    "traj-162d3154-0c71-4219-b995-a25767b32ce3",
+    "traj-2b57f91d-15e9-4758-a2c9-233bb766bd6a",
+    "traj-40731eb6-ff31-4441-bd55-68f96738601d",
+    "traj-45ab0672-f94c-4823-b102-ef12fccf1058",
+    "traj-51596c64-2922-4446-b61d-d312ab01bc60",
+    "traj-614127bd-e303-48b1-a863-3d8eecbaae13",
+    "traj-b1e405c0-f500-4b8b-b651-37be305ab040",
+    "traj-d21d0b04-b11b-4052-81bc-b3691d043dab",
+    "traj-e8dafb23-71db-4d33-993e-3af2cbfbd39d"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ac39add9-b527-4a69-b204-88ddb1520840.json b/docs/training-reports/report-ac39add9-b527-4a69-b204-88ddb1520840.json
new file mode 100644
index 0000000..1d3c022
--- /dev/null
+++ b/docs/training-reports/report-ac39add9-b527-4a69-b204-88ddb1520840.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ac39add9-b527-4a69-b204-88ddb1520840",
+  "timestamp": "2026-04-14T15:26:25.610444+00:00",
+  "source_trajectory_ids": [
+    "traj-1ceccbf7-6330-47ed-8847-4b91daf7cc4b",
+    "traj-2b893213-85dc-4790-9bb2-0629483e4d3c",
+    "traj-36b3fa2f-33a1-49ff-898d-762620569a41",
+    "traj-5195dada-dc07-4e09-9975-6a89db4235a6",
+    "traj-5dc5d158-1159-4e4f-93ff-c41e767488c4",
+    "traj-7790964a-282d-4dcc-b485-747c6c7184d4",
+    "traj-7bacbf96-41a3-4086-bdec-27eab7b218f5",
+    "traj-7e67c0b7-9b4f-4abe-b47d-b19ab291cd87",
+    "traj-d9d8a7de-3f14-4831-98f4-f16fed122d00",
+    "traj-f6480177-ca71-4aeb-8ffa-2123c92f25b8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ad614434-8387-4c0a-9170-9821f5b6e5dc.json b/docs/training-reports/report-ad614434-8387-4c0a-9170-9821f5b6e5dc.json
new file mode 100644
index 0000000..4981773
--- /dev/null
+++ b/docs/training-reports/report-ad614434-8387-4c0a-9170-9821f5b6e5dc.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-ad614434-8387-4c0a-9170-9821f5b6e5dc",
+  "timestamp": "2026-04-14T16:51:38.267196+00:00",
+  "source_trajectory_ids": [
+    "traj-009e84d5-54ae-45fe-be70-318309d44e88",
+    "traj-03204054-d990-4ff9-8a78-5ce4ad733771",
+    "traj-092679d0-2802-4a05-8a78-08d0018eff91",
+    "traj-1482c6be-621f-48a8-8c86-5f76b6228610",
+    "traj-9647873e-9662-4522-9686-340d41b70a7e",
+    "traj-af69ebb9-eb2e-42ae-b6ea-34f0e01492a8",
+    "traj-c01cf256-50d1-4987-8ff2-c4296d49f786",
+    "traj-dbb31292-2d6c-4f7b-98cb-3c25282b6332",
+    "traj-e49d6087-1a59-4b20-b94e-08dd152d4538",
+    "traj-fdc62729-ef73-449e-9939-7a1e7346c9b6"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-165138"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-afb90250-beba-4250-823f-151413488736.json b/docs/training-reports/report-afb90250-beba-4250-823f-151413488736.json
new file mode 100644
index 0000000..7f4699d
--- /dev/null
+++ b/docs/training-reports/report-afb90250-beba-4250-823f-151413488736.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-afb90250-beba-4250-823f-151413488736",
+  "timestamp": "2026-04-14T20:33:28.562913+00:00",
+  "source_trajectory_ids": [
+    "traj-0471e097-5d37-4815-b0eb-e94932d89df6",
+    "traj-136d1e6b-1172-40a9-a711-9c1f31c05269",
+    "traj-38f6850d-18fd-40a9-b84c-882662737e0a",
+    "traj-40d5edf5-8ce9-41bb-8de5-fe9fa43c6691",
+    "traj-4a46b326-a971-4fda-9cb0-50f32fc5e360",
+    "traj-50d662c4-240b-4d1e-aa13-87d60a81753c",
+    "traj-8af47f18-6a0e-429e-827c-f4f7c861bc43",
+    "traj-9fb29645-8d07-4924-9b8b-6e29c6e27b91",
+    "traj-c7e2dcfd-096e-4c3b-8aa3-4f3f5e9216e0",
+    "traj-d60f2070-0f4e-4365-9e1a-94217c7f85b2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203328",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b08b0c58-bf8a-4778-8be1-14297ef8da75.json b/docs/training-reports/report-b08b0c58-bf8a-4778-8be1-14297ef8da75.json
new file mode 100644
index 0000000..6e7f027
--- /dev/null
+++ b/docs/training-reports/report-b08b0c58-bf8a-4778-8be1-14297ef8da75.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b08b0c58-bf8a-4778-8be1-14297ef8da75",
+  "timestamp": "2026-04-14T16:52:41.195817+00:00",
+  "source_trajectory_ids": [
+    "traj-09039ff7-415b-4f7d-aa5b-9631409dfaeb",
+    "traj-11f1581f-23fb-464a-96da-70bee88c77b3",
+    "traj-191e9df4-0fad-43c5-84ca-e975604888ee",
+    "traj-1a6d2e5d-0b27-4385-9fd3-eede85e9905b",
+    "traj-1c98843e-f8ae-4591-802f-7becc8ced646",
+    "traj-8910116d-dd38-4b77-bed2-d68167e051aa",
+    "traj-916583ca-2b06-4872-9982-a0bf53812925",
+    "traj-a420b171-d66c-41fc-a0ab-587552e66a08",
+    "traj-ac38e061-4d94-46e5-841f-51ef534af6fa",
+    "traj-fbae6722-a3a4-4718-b051-ad1e63d1d501"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b15b1ed1-f644-49a7-953f-15462a81088a.json b/docs/training-reports/report-b15b1ed1-f644-49a7-953f-15462a81088a.json
new file mode 100644
index 0000000..fb150b4
--- /dev/null
+++ b/docs/training-reports/report-b15b1ed1-f644-49a7-953f-15462a81088a.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-b15b1ed1-f644-49a7-953f-15462a81088a",
+  "timestamp": "2026-04-14T22:10:15.388367+00:00",
+  "source_trajectory_ids": [
+    "traj-037d286c-f051-41dd-b52b-6c0bf50ee9b1",
+    "traj-144db647-0c11-40a7-9025-1309dcf66d3b",
+    "traj-428ffedd-fa2b-48f9-9ad3-18ede627b50a",
+    "traj-649cc27f-7580-4951-b29c-fa6c7dbe33aa",
+    "traj-68e97826-cf15-4ea0-8ff7-b8acfa708ef5",
+    "traj-6debbbb7-fcdb-440a-8507-e57788eb909b",
+    "traj-858a7b11-0c27-4dc2-9a01-e48cbcad6557",
+    "traj-8cf01efb-3af0-48a4-a6c3-2d01c5a6180d",
+    "traj-9c0bb987-beb6-454b-bd62-4d7eb69ecdfc",
+    "traj-dbf587c3-637c-4d42-a0ea-c5c15ec1c1af"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b3889917-dee5-44e9-93e2-d02bac4f591a.json b/docs/training-reports/report-b3889917-dee5-44e9-93e2-d02bac4f591a.json
new file mode 100644
index 0000000..8c84a18
--- /dev/null
+++ b/docs/training-reports/report-b3889917-dee5-44e9-93e2-d02bac4f591a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b3889917-dee5-44e9-93e2-d02bac4f591a",
+  "timestamp": "2026-04-14T20:04:58.790133+00:00",
+  "source_trajectory_ids": [
+    "traj-065ad0b9-8718-4195-9f4a-a2e6193e3919",
+    "traj-09fc27f8-ac42-4ba0-a222-2393aa78f02e",
+    "traj-3e24ada7-d0a1-4096-9fec-cfc046b00f29",
+    "traj-6d80b27c-d907-4907-ad07-02f4a43a5bd4",
+    "traj-98ae6e53-996e-4227-8b13-d960e85f64dc",
+    "traj-a0fa2cd8-d8a6-4fd2-ae4c-f82d6231ed6d",
+    "traj-bf2d8b66-a2b2-4a9b-a500-ce8c9106b3f8",
+    "traj-c4703e13-6da0-48a1-8539-0389b4326a2b",
+    "traj-d6bc19c6-4ea5-4f3e-8aae-e44bb574aa99",
+    "traj-dae8fad0-a9fd-4189-bebf-5255c217d9ca"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200458",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b5ca812a-32d1-4db3-b051-fc048d104bfa.json b/docs/training-reports/report-b5ca812a-32d1-4db3-b051-fc048d104bfa.json
new file mode 100644
index 0000000..cc4e33c
--- /dev/null
+++ b/docs/training-reports/report-b5ca812a-32d1-4db3-b051-fc048d104bfa.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b5ca812a-32d1-4db3-b051-fc048d104bfa",
+  "timestamp": "2026-04-14T18:57:05.769250+00:00",
+  "source_trajectory_ids": [
+    "traj-05888383-9224-4b2d-8242-91c084b1c134",
+    "traj-2c77022b-6f74-4421-bda0-18898681bf4a",
+    "traj-3593a811-c2b7-43c5-87d9-e5fb423653b2",
+    "traj-3ce3f763-87a2-4873-a9ec-eafd7c8456ad",
+    "traj-5b32d46d-3026-443b-822e-b1b4feca14c0",
+    "traj-6d8730a1-be49-443d-bb69-b804f56cf9a0",
+    "traj-783b4a60-78ec-45f6-bf55-68a62a29e288",
+    "traj-be504263-ca1d-429a-8526-16de4be65b3b",
+    "traj-c19440a5-c45b-4541-85fb-42bcfd8d61c5",
+    "traj-c291fee4-83f2-4f3b-8d3f-6598bb3cf2de"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185705",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b5cb4ed3-3658-4aa7-b82a-2bb69051bf66.json b/docs/training-reports/report-b5cb4ed3-3658-4aa7-b82a-2bb69051bf66.json
new file mode 100644
index 0000000..4fa134c
--- /dev/null
+++ b/docs/training-reports/report-b5cb4ed3-3658-4aa7-b82a-2bb69051bf66.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b5cb4ed3-3658-4aa7-b82a-2bb69051bf66",
+  "timestamp": "2026-04-14T21:22:14.602041+00:00",
+  "source_trajectory_ids": [
+    "traj-0228a534-97db-4d3f-b9e2-1ae0747bb805",
+    "traj-0fc13a52-b42b-4bbe-aab7-6c6e947e7307",
+    "traj-200269e8-5edc-4498-b8f0-fa5697514404",
+    "traj-6a820e5e-46ab-496f-a3e3-29a3f0d57db0",
+    "traj-6c9870b9-6588-43cc-b567-1797b6c4c718",
+    "traj-7817c80c-c9d3-4aa2-90f5-56d349c3b7f1",
+    "traj-b9a8ad9a-f2ed-4359-a6aa-665179284484",
+    "traj-d2a809e9-4a0a-47c2-a10f-d8d7d8ddfe13",
+    "traj-d6482725-07df-443e-899d-fe4c4d48a428",
+    "traj-e100291e-f2b3-413e-a4ec-af5b3266d115"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212214",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b624fb18-a44b-4165-a263-67e79606bab7.json b/docs/training-reports/report-b624fb18-a44b-4165-a263-67e79606bab7.json
new file mode 100644
index 0000000..ae6d916
--- /dev/null
+++ b/docs/training-reports/report-b624fb18-a44b-4165-a263-67e79606bab7.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-b624fb18-a44b-4165-a263-67e79606bab7",
+  "timestamp": "2026-04-14T17:17:57.927250+00:00",
+  "source_trajectory_ids": [
+    "traj-02e88bb2-fa0a-4a26-88c5-b64123fe93f7",
+    "traj-03ca6759-f0a6-4ce6-8ec3-909c0511e7cc",
+    "traj-0b150651-df2f-4892-beaa-cbf3b3f5cf49",
+    "traj-832c8656-ecbf-43f9-8a27-788c3c909a28",
+    "traj-9607751f-10b9-4182-a9ac-8d8435bf223a",
+    "traj-a4867557-b4a4-4d55-a129-ff075331c72c",
+    "traj-a77e1df5-c366-4922-a43f-fbfef5ac2caa",
+    "traj-c4b15c6b-5945-4442-a1d2-ed9bae9fd580",
+    "traj-d0df5946-d4d0-4578-9119-8c3ea6919ef6",
+    "traj-e60df802-1d11-48a7-b6cb-a6c96a330ec4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-171757"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b6a8a135-b797-4239-a193-ed50dac34580.json b/docs/training-reports/report-b6a8a135-b797-4239-a193-ed50dac34580.json
new file mode 100644
index 0000000..85b74e6
--- /dev/null
+++ b/docs/training-reports/report-b6a8a135-b797-4239-a193-ed50dac34580.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b6a8a135-b797-4239-a193-ed50dac34580",
+  "timestamp": "2026-04-14T22:10:23.259626+00:00",
+  "source_trajectory_ids": [
+    "traj-21d4470f-f925-416d-baef-066f54f03eb9",
+    "traj-2986a37a-ef01-44a2-b284-321dde5911b6",
+    "traj-359e0bb9-9d3d-4bff-8732-75627ff4d637",
+    "traj-76524185-f3a4-482a-a200-873003aadf08",
+    "traj-8881428d-0f16-47b9-bfa6-f33a9c277f36",
+    "traj-a9ac163e-f0fb-4422-999d-b88049f587b6",
+    "traj-af08ed4c-f920-4bff-be83-46b595ecb048",
+    "traj-bcd03d93-f9ac-47d6-9763-d924052351fb",
+    "traj-bfb72fd6-34d4-4d08-bc51-a9cf99367220",
+    "traj-ec36d500-0fee-428c-a2eb-b01c432cea15"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221023",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b7383132-01b9-4367-a7f2-1aba245fbadd.json b/docs/training-reports/report-b7383132-01b9-4367-a7f2-1aba245fbadd.json
new file mode 100644
index 0000000..cf730e0
--- /dev/null
+++ b/docs/training-reports/report-b7383132-01b9-4367-a7f2-1aba245fbadd.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b7383132-01b9-4367-a7f2-1aba245fbadd",
+  "timestamp": "2026-04-14T18:00:27.811321+00:00",
+  "source_trajectory_ids": [
+    "traj-0689f2e4-311d-487b-89ff-ef90068fadb7",
+    "traj-0aa5a60c-8c01-4861-9183-8d6074eef288",
+    "traj-30e01a6f-7e1c-447b-a542-399735b194a9",
+    "traj-4998be70-3b0a-4d6e-b22b-58aee0911c38",
+    "traj-52a6a35d-2ddf-48e8-a490-8894d96391a7",
+    "traj-7a5a71d3-458c-4738-991d-cc130067e868",
+    "traj-9a52134d-6090-41da-9b9c-3fe33edf484b",
+    "traj-9d05beab-6c51-4737-ad99-b3f00e950c24",
+    "traj-9dfbb478-577a-46e0-9550-0262d307b96a",
+    "traj-b791de9c-eee5-4166-abe7-7080e25fa9e7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b7ac968d-1404-47b1-81bd-0d78f65dbeca.json b/docs/training-reports/report-b7ac968d-1404-47b1-81bd-0d78f65dbeca.json
new file mode 100644
index 0000000..519612c
--- /dev/null
+++ b/docs/training-reports/report-b7ac968d-1404-47b1-81bd-0d78f65dbeca.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b7ac968d-1404-47b1-81bd-0d78f65dbeca",
+  "timestamp": "2026-04-14T19:41:33.403940+00:00",
+  "source_trajectory_ids": [
+    "traj-0fedb768-b218-4e2a-ab56-d0655d6c947a",
+    "traj-15327bb3-7a2f-420d-97c6-ea80afc54b59",
+    "traj-21e16eee-5ac2-4b74-a05c-c0dc7020a25e",
+    "traj-62984473-f827-4a12-b084-651eb6ea553d",
+    "traj-66bb33bb-6b33-4ec5-9f37-6bf2f6d41bdb",
+    "traj-6886aa3a-62b8-4cfd-bb14-8cd430e0cba7",
+    "traj-8b3ed693-3a22-4e20-8f2a-8030fb132dcc",
+    "traj-8fb431e8-7ec8-4f41-8d74-8d6103513ac4",
+    "traj-af35eb2e-9d6c-4305-8833-4513c5d23e7f",
+    "traj-b0dc17ea-3956-404a-a52c-87fbf4b5c546"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-194133",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b96a7d12-192c-4a4b-9bb4-6d420db36e62.json b/docs/training-reports/report-b96a7d12-192c-4a4b-9bb4-6d420db36e62.json
new file mode 100644
index 0000000..67cc5ae
--- /dev/null
+++ b/docs/training-reports/report-b96a7d12-192c-4a4b-9bb4-6d420db36e62.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b96a7d12-192c-4a4b-9bb4-6d420db36e62",
+  "timestamp": "2026-04-14T22:05:43.990256+00:00",
+  "source_trajectory_ids": [
+    "traj-00ce0e5c-3e46-47b0-b69f-131cfd13e311",
+    "traj-071fd37a-7fe1-4299-8f3b-64013316eb20",
+    "traj-1a05680f-94fd-4fae-92a9-2cbb55041263",
+    "traj-4711d2da-0d1e-4d33-863b-d1b1769c7780",
+    "traj-56fb49ad-8eaf-4c31-82d8-4f99688d0865",
+    "traj-78ed00a1-be06-4efd-959f-76d172d02081",
+    "traj-a0e1ce5d-1089-44c2-900f-7c3b298c0234",
+    "traj-a44ca7d1-78db-46de-b144-42d70a1d0bfc",
+    "traj-bd696e6c-e004-49f7-9967-b991bbe5369f",
+    "traj-fdbb3b87-c254-4d81-a06d-ea0ceb7e3093"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220543",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-b9b63402-56c4-4082-ab3b-ce9d6bcb7300.json b/docs/training-reports/report-b9b63402-56c4-4082-ab3b-ce9d6bcb7300.json
new file mode 100644
index 0000000..589ac84
--- /dev/null
+++ b/docs/training-reports/report-b9b63402-56c4-4082-ab3b-ce9d6bcb7300.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-b9b63402-56c4-4082-ab3b-ce9d6bcb7300",
+  "timestamp": "2026-04-14T22:09:38.829716+00:00",
+  "source_trajectory_ids": [
+    "traj-26d89301-9410-4f3a-b5e3-abe70b34449b",
+    "traj-36e6c163-f3c1-4455-a2e3-2086b49ad2ff",
+    "traj-48e59c9e-bd24-43b3-9f45-40df2a51b0e9",
+    "traj-69ed9ad6-3c96-46ab-9e03-4293ab89f00e",
+    "traj-77b4ee5c-3632-4f28-afb3-2d36355a709a",
+    "traj-8cde9570-81f2-4bf3-b052-5a7e570fe584",
+    "traj-b1522698-1324-4157-a938-e4eaca620616",
+    "traj-db8ac9c4-cf34-4202-8392-a84d57eba527",
+    "traj-f6027573-cd03-4538-9f9f-71702bf9dcf1",
+    "traj-fa4760e8-f762-4d68-93a9-3a222336aff2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220938",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ba1732f5-2baf-4c43-933f-988b5154d0f8.json b/docs/training-reports/report-ba1732f5-2baf-4c43-933f-988b5154d0f8.json
new file mode 100644
index 0000000..d3e71f1
--- /dev/null
+++ b/docs/training-reports/report-ba1732f5-2baf-4c43-933f-988b5154d0f8.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ba1732f5-2baf-4c43-933f-988b5154d0f8",
+  "timestamp": "2026-04-14T22:10:15.421472+00:00",
+  "source_trajectory_ids": [
+    "traj-0011f4e3-1e88-4436-8c14-c7d141d79d46",
+    "traj-1ecd6c57-826c-49db-ab52-e1591940b7ba",
+    "traj-22b8c4d4-87b2-4c18-a6e3-4bd606bd668c",
+    "traj-2c1fc276-d7df-49c0-a650-ac3dc36ce843",
+    "traj-398e90e6-7085-49f5-b2ad-3d3e5af28be4",
+    "traj-9a591a6c-8109-49c7-9ea5-a13a1be0072e",
+    "traj-a1a7f2aa-c82a-4c37-a683-fd3133e1cb41",
+    "traj-c37d86df-2122-4727-a07b-3a385cec8928",
+    "traj-cba4cb94-1e32-4157-9d37-81eacb4bb837",
+    "traj-e850792b-93b4-4902-8882-00b7458dcf0a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221015",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ba50f102-0064-4a66-94b0-f70255912adf.json b/docs/training-reports/report-ba50f102-0064-4a66-94b0-f70255912adf.json
new file mode 100644
index 0000000..3de4e76
--- /dev/null
+++ b/docs/training-reports/report-ba50f102-0064-4a66-94b0-f70255912adf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ba50f102-0064-4a66-94b0-f70255912adf",
+  "timestamp": "2026-04-14T18:05:53.342527+00:00",
+  "source_trajectory_ids": [
+    "traj-23a6ed04-1fca-4471-9829-d6380b446e4d",
+    "traj-27e30234-7703-41b6-a67c-ea578304d23c",
+    "traj-387b8923-472c-472f-b164-8f8e4e10c109",
+    "traj-50440795-3a97-4965-baa4-c2e4e879f04e",
+    "traj-5251de53-5592-463c-9ff4-c4a355dd79ac",
+    "traj-576d8a39-b1cc-4b0f-b579-9db70c57dad8",
+    "traj-61d38340-1cfe-478a-8a3d-fa70a0e659db",
+    "traj-7cde25d9-6c7c-4f50-ae43-cd69f41fdce3",
+    "traj-9435c839-1ae3-46e7-a1e6-0b2a2558dbc6",
+    "traj-d2d5388f-f2a9-487c-a491-ecae4e887756"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-bac33bbb-f742-4452-ad1e-3f38f1b1da41.json b/docs/training-reports/report-bac33bbb-f742-4452-ad1e-3f38f1b1da41.json
new file mode 100644
index 0000000..ff99991
--- /dev/null
+++ b/docs/training-reports/report-bac33bbb-f742-4452-ad1e-3f38f1b1da41.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-bac33bbb-f742-4452-ad1e-3f38f1b1da41",
+  "timestamp": "2026-04-15T01:41:52.348551+00:00",
+  "source_trajectory_ids": [
+    "traj-1acb8315-4183-408e-baac-6c5d10a76ad0",
+    "traj-6f54e908-779a-46ea-9b64-0eaa71eb0e79",
+    "traj-75460b31-a146-4e33-bafd-11431cde6ab5",
+    "traj-85ab9708-1e7b-430a-9bdd-58f7edf87a91",
+    "traj-8da986f4-6c63-4cbf-a7b1-7863f137d09f",
+    "traj-9b2f3762-dc26-4325-9576-1b6ea291af0d",
+    "traj-a13a088f-68c6-47cf-ac0c-62918b634ecd",
+    "traj-bd3aabad-dea4-40fe-8eae-1863e68bae66",
+    "traj-c181d665-ffb1-48a3-9351-70d90f300877",
+    "traj-e96d6fe3-30e8-429a-b6de-07d9d65b8614"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-014152",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-bdbfaea5-f10d-4d52-aab6-0c51903a34c6.json b/docs/training-reports/report-bdbfaea5-f10d-4d52-aab6-0c51903a34c6.json
new file mode 100644
index 0000000..af9719e
--- /dev/null
+++ b/docs/training-reports/report-bdbfaea5-f10d-4d52-aab6-0c51903a34c6.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-bdbfaea5-f10d-4d52-aab6-0c51903a34c6",
+  "timestamp": "2026-04-14T22:05:43.973354+00:00",
+  "source_trajectory_ids": [
+    "traj-013904bf-451f-4fc8-886e-8bd677c396fa",
+    "traj-12db6565-4add-40a6-bd7a-bdcba778740c",
+    "traj-23de983d-e612-4b47-a896-5882bad7f55c",
+    "traj-4410b217-3e73-4d24-93fc-3ecc16709ffe",
+    "traj-4be8e3ad-07ac-4f8b-adc5-6e9246a666fa",
+    "traj-b6bcacb9-35e8-4058-8dbd-d8d2efcf82be",
+    "traj-c48aef4e-be06-4295-8934-0efa2fa62e64",
+    "traj-c657e1e0-7974-4bdd-8189-386644098129",
+    "traj-e6af881a-4dcb-456d-9418-874d1fb53f0b",
+    "traj-eaac1f04-0cb0-4a2a-aa5d-75cc8b312645"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220543",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-bf874f6f-6b22-40fb-881b-3a657de8c947.json b/docs/training-reports/report-bf874f6f-6b22-40fb-881b-3a657de8c947.json
new file mode 100644
index 0000000..430eed4
--- /dev/null
+++ b/docs/training-reports/report-bf874f6f-6b22-40fb-881b-3a657de8c947.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-bf874f6f-6b22-40fb-881b-3a657de8c947",
+  "timestamp": "2026-04-14T15:04:26.240344+00:00",
+  "source_trajectory_ids": [
+    "traj-1682ca68-24d4-400f-819b-159269099ef9",
+    "traj-1a0710b3-1ce2-4f92-a905-f6bb334c2e03",
+    "traj-31847413-2d52-477a-8cda-f5c6aaa96d4e",
+    "traj-61cbf705-2d7d-4c98-9342-ca21d2acc5a5",
+    "traj-8a23fecf-d6ca-428e-bbdb-97f932c9d837",
+    "traj-b880afc6-d668-4cd7-a3c7-1c97e835ba47",
+    "traj-c95e2379-7f34-4c82-8afb-54934a62dabb",
+    "traj-d7e0cc79-197b-4e09-9902-be1168fce911",
+    "traj-e4303b1e-47a3-4a03-be69-523bc53db3dc",
+    "traj-e7fc1eb5-c4e9-4b1e-8abf-34cd01612431"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c01cc9e5-1955-4dd9-9c76-c47a81862cbb.json b/docs/training-reports/report-c01cc9e5-1955-4dd9-9c76-c47a81862cbb.json
new file mode 100644
index 0000000..e877722
--- /dev/null
+++ b/docs/training-reports/report-c01cc9e5-1955-4dd9-9c76-c47a81862cbb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c01cc9e5-1955-4dd9-9c76-c47a81862cbb",
+  "timestamp": "2026-04-14T17:38:32.809545+00:00",
+  "source_trajectory_ids": [
+    "traj-3972e30c-c93d-4a1e-b53b-f2dee9ec910e",
+    "traj-51f1987a-30d2-44bd-b0c0-a958bc8cdc04",
+    "traj-564404fe-f2c1-4962-81a1-6d1c44cfa858",
+    "traj-652c319b-368f-43a5-a00d-5f5eed161e12",
+    "traj-6578adb1-ca9c-46ba-b544-d8e3bdd2ae22",
+    "traj-6b30fbd2-751c-4f50-9bd0-a689840b0687",
+    "traj-76e06af8-ed94-4866-93ab-1f874dce5926",
+    "traj-809634f9-e464-4b97-9db7-f574c9160993",
+    "traj-b99be6b0-71ae-4b3c-9d6a-d3a0159fdf92",
+    "traj-c24e00a1-8ce2-4178-80b0-ace21f9b3c8b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c06df51a-9e80-452e-b1cc-97ff0464287a.json b/docs/training-reports/report-c06df51a-9e80-452e-b1cc-97ff0464287a.json
new file mode 100644
index 0000000..3e809a2
--- /dev/null
+++ b/docs/training-reports/report-c06df51a-9e80-452e-b1cc-97ff0464287a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c06df51a-9e80-452e-b1cc-97ff0464287a",
+  "timestamp": "2026-04-14T19:41:58.758717+00:00",
+  "source_trajectory_ids": [
+    "traj-04c786ef-d1e2-49fa-9b09-a257924e8a8a",
+    "traj-141c99cd-6de0-4b94-b1fd-080bdd9682c7",
+    "traj-31f846ae-2ed7-4b40-8b12-d50dd072068a",
+    "traj-3f63215d-575c-4cf9-b203-573eca4be6eb",
+    "traj-5e54b1a3-07be-45b7-8398-341aa6dbfadf",
+    "traj-86f55c72-139c-43e1-82e5-faf2346bdef8",
+    "traj-9969138c-855c-4bcd-b3f6-c41c716e4afe",
+    "traj-9a3b3648-0d8c-4329-9c6b-e05a130b4fe8",
+    "traj-d0ea1484-2964-4b98-90fe-74c59b0b7e14",
+    "traj-dbccd463-80bf-4cbc-890b-154c9fd631ca"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-194158",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c072450b-837b-4c9d-b890-906af6470902.json b/docs/training-reports/report-c072450b-837b-4c9d-b890-906af6470902.json
new file mode 100644
index 0000000..77137df
--- /dev/null
+++ b/docs/training-reports/report-c072450b-837b-4c9d-b890-906af6470902.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-c072450b-837b-4c9d-b890-906af6470902",
+  "timestamp": "2026-04-14T21:44:48.318468+00:00",
+  "source_trajectory_ids": [
+    "traj-173485d8-32c3-4e6f-bd00-c3fbab201bd9",
+    "traj-2ce59a34-0c0f-4109-b7c8-e1b2e881ed9c",
+    "traj-388456f0-a36a-4ad5-a647-763b00b46a54",
+    "traj-41c56bb8-ddf9-43fb-9609-3a7b3ca2722d",
+    "traj-7a4e855e-ce55-4b30-9c94-da0cb7b9337a",
+    "traj-7fe376fc-2c1d-4d89-8f79-eb74d80f0609",
+    "traj-8c1525e4-67ce-428d-bb68-5721fc48af03",
+    "traj-a51b2451-eeab-4b01-ae03-0cfd8a0c751a",
+    "traj-acf5c8eb-c562-445f-9509-36fb199a5278",
+    "traj-ff4c62b3-f124-4d4e-8375-393213f36ec0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c0c51d04-9150-4d99-a072-cb74bd449d57.json b/docs/training-reports/report-c0c51d04-9150-4d99-a072-cb74bd449d57.json
new file mode 100644
index 0000000..54fdb51
--- /dev/null
+++ b/docs/training-reports/report-c0c51d04-9150-4d99-a072-cb74bd449d57.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c0c51d04-9150-4d99-a072-cb74bd449d57",
+  "timestamp": "2026-04-15T01:29:18.109113+00:00",
+  "source_trajectory_ids": [
+    "traj-0ae4c52d-e2b9-4c1e-9e82-9e4643eac994",
+    "traj-0e675169-7841-47a6-b686-fbc5b709e2b5",
+    "traj-1ed2bcc2-f67a-49be-b23b-36c6576e0f38",
+    "traj-4a6a7c91-8852-4d20-9e68-b44512d859b3",
+    "traj-706288b8-dba8-49eb-b4fc-c370c2e70da8",
+    "traj-7c985ba5-58cd-4890-b237-62476f9b1a64",
+    "traj-a924efc9-ec0d-4cd3-aaff-35140f3137b7",
+    "traj-d4e28964-e670-44de-9038-9fe75aec5439",
+    "traj-ddad6c58-d793-48c8-bd7c-8ec8255eb34a",
+    "traj-f773c13d-5a33-4fae-bf4e-735aba862ccb"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012918",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c0ea3db2-6e3c-4bb6-974e-4b151528897a.json b/docs/training-reports/report-c0ea3db2-6e3c-4bb6-974e-4b151528897a.json
new file mode 100644
index 0000000..94fa481
--- /dev/null
+++ b/docs/training-reports/report-c0ea3db2-6e3c-4bb6-974e-4b151528897a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c0ea3db2-6e3c-4bb6-974e-4b151528897a",
+  "timestamp": "2026-04-15T01:25:33.790818+00:00",
+  "source_trajectory_ids": [
+    "traj-0e621cdc-27e6-4aea-8237-1913e07e3a04",
+    "traj-1f607627-554f-470c-b755-1a345abdf2fe",
+    "traj-3f268ad9-c65d-4977-a300-92085c2eceb6",
+    "traj-7e08836a-280c-445b-9bf1-79da285d17cf",
+    "traj-930c7c4e-914a-45f4-8c8c-4abf43ddf22e",
+    "traj-9fbe5b46-1c80-444a-b6fa-3f666bbdfb84",
+    "traj-addf8af3-81ad-4577-b0ed-3d2603965271",
+    "traj-c1f3e575-d22b-4d20-a06d-9d5c736ce9ad",
+    "traj-e192a163-fc63-487d-8dc7-e7480cb1b3bd",
+    "traj-ee92b488-6994-489a-88cb-03d2cd57c852"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012533",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c17a47a0-db7c-4a1b-93ad-9b9b3cb91be6.json b/docs/training-reports/report-c17a47a0-db7c-4a1b-93ad-9b9b3cb91be6.json
new file mode 100644
index 0000000..cb3d8f5
--- /dev/null
+++ b/docs/training-reports/report-c17a47a0-db7c-4a1b-93ad-9b9b3cb91be6.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c17a47a0-db7c-4a1b-93ad-9b9b3cb91be6",
+  "timestamp": "2026-04-14T20:02:20.820321+00:00",
+  "source_trajectory_ids": [
+    "traj-15d398ff-cf8a-4740-98b9-6f767eb2701c",
+    "traj-20a09faf-8c47-47a7-96b3-b5540f4f9741",
+    "traj-23b2ea87-fb1e-47a6-94c6-f87c6dbd6a6f",
+    "traj-340baf51-ca18-4930-807e-92da95cab7d1",
+    "traj-3ae727c5-8028-45a1-b721-8b8cd450c9a6",
+    "traj-89b5b1a5-4794-402c-94ac-ae889bfd0330",
+    "traj-8fb8a6da-0eeb-4ed9-a2d9-b4327dcc58f6",
+    "traj-99d911b8-6058-4968-95ad-f65b90564dc9",
+    "traj-b78b77c4-3968-4a6e-b690-89b24989fc51",
+    "traj-c294451a-3fcf-4d6d-be30-3dd35d97d46e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200220",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c19bb0a6-2d9a-4989-9457-8af5e912ec9b.json b/docs/training-reports/report-c19bb0a6-2d9a-4989-9457-8af5e912ec9b.json
new file mode 100644
index 0000000..be6c979
--- /dev/null
+++ b/docs/training-reports/report-c19bb0a6-2d9a-4989-9457-8af5e912ec9b.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c19bb0a6-2d9a-4989-9457-8af5e912ec9b",
+  "timestamp": "2026-04-14T20:58:05.443004+00:00",
+  "source_trajectory_ids": [
+    "traj-2ea69728-7b80-4764-b555-c3a0d6505837",
+    "traj-3461130a-cd1d-4690-a394-8c05edcbfde2",
+    "traj-35e46fbd-f374-489d-97b9-37a5fd6be082",
+    "traj-3f7b645c-ff20-41ee-8bd5-1a1d8e89710f",
+    "traj-6f6b27ec-fe35-49c7-b32b-2376e7319c5b",
+    "traj-81bf4283-a56f-4746-8169-fa772d827a58",
+    "traj-c4717ea8-6eb9-4376-99d6-9de31e014efd",
+    "traj-c4ca1daa-bf73-4eaa-8822-88bb915f3956",
+    "traj-d4c19cb0-13fe-4e5f-af0b-e539c1097a3b",
+    "traj-f13f5053-3a00-448a-b6db-6be812ed26e4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205805",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c1a8c365-3589-492b-a5ec-bb642ee5a49d.json b/docs/training-reports/report-c1a8c365-3589-492b-a5ec-bb642ee5a49d.json
new file mode 100644
index 0000000..fc2e2d7
--- /dev/null
+++ b/docs/training-reports/report-c1a8c365-3589-492b-a5ec-bb642ee5a49d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c1a8c365-3589-492b-a5ec-bb642ee5a49d",
+  "timestamp": "2026-04-15T01:36:36.519137+00:00",
+  "source_trajectory_ids": [
+    "traj-11094fa0-83dd-4230-b7c8-61d9e49012e0",
+    "traj-21156bd2-738c-4c8d-ba5a-1c43c57e9c8a",
+    "traj-8323ac23-e4be-45f9-80a8-342f454ebb9b",
+    "traj-89986fa7-3cbb-4cbd-9c3a-b3ef8bf379bf",
+    "traj-93ae8e66-ab43-41fb-8d4a-a4897238859a",
+    "traj-97448ce5-33e4-4730-af43-6eeddbaecf41",
+    "traj-a345d9ec-af0a-4111-838a-fc42f11a9722",
+    "traj-adb59401-f166-4167-9800-49782b5cf8ae",
+    "traj-b37dd814-79b3-4987-a87e-bcdf016d3fd0",
+    "traj-d3f4a3cf-d3e8-4bfb-9bc3-b339d5587645"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013636",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c3173371-285f-494d-9b27-9bfe27bd4e32.json b/docs/training-reports/report-c3173371-285f-494d-9b27-9bfe27bd4e32.json
new file mode 100644
index 0000000..6a72630
--- /dev/null
+++ b/docs/training-reports/report-c3173371-285f-494d-9b27-9bfe27bd4e32.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c3173371-285f-494d-9b27-9bfe27bd4e32",
+  "timestamp": "2026-04-15T01:57:32.724964+00:00",
+  "source_trajectory_ids": [
+    "traj-45a98f88-a40c-474a-9096-9eccbd472cff",
+    "traj-51e9d1a3-e056-4dac-8f8f-2682112787c2",
+    "traj-54b8b23e-bce6-4cce-a33f-9bf27b44d0ac",
+    "traj-7283e59e-f4ba-46b7-a6d9-4195d74aab77",
+    "traj-8505e1bd-f4ee-41ed-8301-20dc0b16d263",
+    "traj-a9787b27-0d54-40c0-810b-3f689dcf3a29",
+    "traj-c7bdd3d8-358a-44cc-85f7-b0f0c5503df0",
+    "traj-da3a21e8-ffc5-4424-9fcc-558e6ac6dab0",
+    "traj-e1938779-6d55-47bc-a36d-e7314f5c8444",
+    "traj-eed42b8e-78bb-49f3-8976-7982dd41fe55"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-015732",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c3e96ae8-32ea-4bb4-a431-b24e68aee196.json b/docs/training-reports/report-c3e96ae8-32ea-4bb4-a431-b24e68aee196.json
new file mode 100644
index 0000000..ce151e2
--- /dev/null
+++ b/docs/training-reports/report-c3e96ae8-32ea-4bb4-a431-b24e68aee196.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-c3e96ae8-32ea-4bb4-a431-b24e68aee196",
+  "timestamp": "2026-04-14T15:25:30.504606+00:00",
+  "source_trajectory_ids": [
+    "traj-26936c0b-44c0-436d-b733-3a58d091df06",
+    "traj-2b2f336b-f53d-446a-b8ba-92dd960b587a",
+    "traj-42034694-89c7-468c-990e-10fd7fe171ed",
+    "traj-84bb3272-3a6a-490c-8a77-b54fec9ef499",
+    "traj-8c507785-8bba-4070-b93b-1c08dfebb435",
+    "traj-93c7668d-cd94-45a0-8c7c-3e77814d8ff7",
+    "traj-9750eb36-1d03-4ebd-9b5c-8fae31366bdf",
+    "traj-aeb8e090-5253-4a3a-a70c-ef7d4184c24f",
+    "traj-c7421208-ae56-4be1-95dc-508009d01a72",
+    "traj-fdb9a0ee-a2c1-48ee-8734-033f9604cd16"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-152530"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c5aa0414-d36d-4ebe-86ab-ca6b800fa588.json b/docs/training-reports/report-c5aa0414-d36d-4ebe-86ab-ca6b800fa588.json
new file mode 100644
index 0000000..9b23fd6
--- /dev/null
+++ b/docs/training-reports/report-c5aa0414-d36d-4ebe-86ab-ca6b800fa588.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c5aa0414-d36d-4ebe-86ab-ca6b800fa588",
+  "timestamp": "2026-04-14T16:49:44.884587+00:00",
+  "source_trajectory_ids": [
+    "traj-2d8901ee-5d76-4dd9-a49d-1c883a02bdea",
+    "traj-35fe497d-2203-4d17-bc82-3dddfa76d541",
+    "traj-4c2fcb31-652d-4e72-84e9-cc2ddc0c4601",
+    "traj-5bdda2b8-0593-4c8e-aa6a-95ac8fa0d108",
+    "traj-6571e341-23a4-4e58-b377-7929e3e0c3fa",
+    "traj-90e0e4bc-c9ac-422b-a689-4208b9759f1e",
+    "traj-b96791b9-6ed9-40b6-919c-58e2af3b7561",
+    "traj-c25c91bf-dc9e-42f4-ac1a-caae371bf3c3",
+    "traj-da88d8b5-cf0a-43e8-bd28-600e518d00fb",
+    "traj-e422e41c-901e-431f-9753-bd0b33c9de36"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c852d6d2-752e-4047-8ea1-974f09b93f9a.json b/docs/training-reports/report-c852d6d2-752e-4047-8ea1-974f09b93f9a.json
new file mode 100644
index 0000000..e608710
--- /dev/null
+++ b/docs/training-reports/report-c852d6d2-752e-4047-8ea1-974f09b93f9a.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-c852d6d2-752e-4047-8ea1-974f09b93f9a",
+  "timestamp": "2026-04-14T20:34:01.592740+00:00",
+  "source_trajectory_ids": [
+    "traj-05c1d7fa-b0b4-4604-bdcb-d9e7177bdd42",
+    "traj-0a4153d5-3c8e-4bef-af61-50dc3d96e882",
+    "traj-148c678f-f8a0-4e20-b0eb-11a16efffdcb",
+    "traj-6039897d-4781-4c47-88f5-66ddb64671a0",
+    "traj-91b6d544-77db-4ae3-923b-98d1c29a5eea",
+    "traj-a02972fc-6eab-4bfe-88a5-8d0e8e76ecb7",
+    "traj-a36478d9-a059-4b41-8c83-5c6274ccf24d",
+    "traj-b1830d6e-8a4d-483c-9a0e-c12b36538204",
+    "traj-dcd714ed-f057-4072-b956-920d641aa8da",
+    "traj-e4a26d08-16fd-442d-a3ad-f59498229e69"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-c9311f0d-4049-477f-afe6-7076642f0ec2.json b/docs/training-reports/report-c9311f0d-4049-477f-afe6-7076642f0ec2.json
new file mode 100644
index 0000000..c13b492
--- /dev/null
+++ b/docs/training-reports/report-c9311f0d-4049-477f-afe6-7076642f0ec2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-c9311f0d-4049-477f-afe6-7076642f0ec2",
+  "timestamp": "2026-04-14T20:57:28.048313+00:00",
+  "source_trajectory_ids": [
+    "traj-0c670378-8585-43d8-9842-6271b7f464fe",
+    "traj-2440d4a0-4f59-4085-a2da-9b8a9106421c",
+    "traj-70adbbbe-86b6-438e-80be-392553e727cf",
+    "traj-73e465b5-794e-494b-b2c0-d7c6eb6d62ac",
+    "traj-9d7ada37-f233-4e88-9a2d-6fc499b29163",
+    "traj-a2a2f85a-f7d0-43da-8207-a55c60fa7853",
+    "traj-c1877544-d735-460a-a663-89a62e8e0d90",
+    "traj-c6ffcb61-0919-40be-baef-14fbf8b1a2d8",
+    "traj-c9914af3-c974-4e7f-adfb-22576edb65b4",
+    "traj-da28d36d-fe4d-4c68-9995-10d5193fb01c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205728",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ca9f7219-01ad-42e2-b6e5-47d0654c30d9.json b/docs/training-reports/report-ca9f7219-01ad-42e2-b6e5-47d0654c30d9.json
new file mode 100644
index 0000000..6814e3f
--- /dev/null
+++ b/docs/training-reports/report-ca9f7219-01ad-42e2-b6e5-47d0654c30d9.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ca9f7219-01ad-42e2-b6e5-47d0654c30d9",
+  "timestamp": "2026-04-14T22:05:44.007084+00:00",
+  "source_trajectory_ids": [
+    "traj-08efb59a-fa69-4b7f-8be6-b53e3521474e",
+    "traj-2758fe69-867a-44fc-b2f9-fdaac8e3dcc9",
+    "traj-455ff9bb-cd0c-46ab-ae65-6f040ed26b3b",
+    "traj-5172d14f-0bd7-4ef9-9e60-44badea48055",
+    "traj-5ec85b29-41c4-4ae0-b293-7d0e9158ff6e",
+    "traj-611d0c0a-96db-486a-905e-5df1aa2a8d98",
+    "traj-634297e6-68ad-4f79-8816-11a66d5ea299",
+    "traj-7accfe4e-961d-49c6-bc04-31e8f94453da",
+    "traj-9065b828-578a-4b33-8ee9-86ec033a78e4",
+    "traj-a14c8e2a-f256-47fc-96db-ccbb86340e25"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220544",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-cd34b1ad-03d6-4caa-a013-dc3344fbadfc.json b/docs/training-reports/report-cd34b1ad-03d6-4caa-a013-dc3344fbadfc.json
new file mode 100644
index 0000000..3719ed3
--- /dev/null
+++ b/docs/training-reports/report-cd34b1ad-03d6-4caa-a013-dc3344fbadfc.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-cd34b1ad-03d6-4caa-a013-dc3344fbadfc",
+  "timestamp": "2026-04-14T20:56:11.558946+00:00",
+  "source_trajectory_ids": [
+    "traj-189f9afd-a820-4b6c-9113-90b783b9221a",
+    "traj-1a836db9-5cc0-49d7-97ba-652559847588",
+    "traj-262c5f9b-f588-4f52-93c5-aa91ab26b4b5",
+    "traj-3e5d9867-cdba-4a2f-a0ee-45927ba29a6b",
+    "traj-7b20761c-cfed-41a9-9ce2-3e8a51b8a44d",
+    "traj-9155d91f-1e61-4e48-b7be-c08ce26ac19a",
+    "traj-ab6306ba-730b-4489-bdf4-fc672b34b542",
+    "traj-b77cc94e-8069-44bc-9c74-1377ab559833",
+    "traj-c753ac1b-5934-48b0-8592-b00926b43b21",
+    "traj-e2616df0-0f8a-4747-b3e5-2f8a8a7e51e6"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-205611",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-cf57461d-9cd9-4091-954c-2116899fa819.json b/docs/training-reports/report-cf57461d-9cd9-4091-954c-2116899fa819.json
new file mode 100644
index 0000000..33ee044
--- /dev/null
+++ b/docs/training-reports/report-cf57461d-9cd9-4091-954c-2116899fa819.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-cf57461d-9cd9-4091-954c-2116899fa819",
+  "timestamp": "2026-04-14T22:09:38.908061+00:00",
+  "source_trajectory_ids": [
+    "traj-00dc33e8-b08d-455e-bfc0-ea7b2a61d37b",
+    "traj-2d49f999-6433-4c08-aff0-c397a3510e19",
+    "traj-5d53bd03-4d74-46d4-8a2e-6a1ea0058b2d",
+    "traj-6febe1f4-9d67-4c49-a2ec-7982b0f02178",
+    "traj-743f7d85-4529-4e29-8b5f-4664740fca18",
+    "traj-8d0bdac7-9384-45dd-99d1-924962681054",
+    "traj-a3a28ab1-dbf8-4b5f-95eb-5676d2e1f4a3",
+    "traj-aaf34328-4ac9-4562-95d2-7dd7e38aae1d",
+    "traj-c5b9b9c4-4a62-4177-b262-62a37b11497c",
+    "traj-f10615e9-a2c9-46f0-950b-3a6d935bd4ea"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-cfd806b9-d65f-42de-aa48-83d459f3f19c.json b/docs/training-reports/report-cfd806b9-d65f-42de-aa48-83d459f3f19c.json
new file mode 100644
index 0000000..22c90b3
--- /dev/null
+++ b/docs/training-reports/report-cfd806b9-d65f-42de-aa48-83d459f3f19c.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-cfd806b9-d65f-42de-aa48-83d459f3f19c",
+  "timestamp": "2026-04-15T01:36:36.645965+00:00",
+  "source_trajectory_ids": [
+    "traj-2b1c7197-9cf6-4832-8373-8d9623ffdbe9",
+    "traj-4be5befc-b1d7-44a9-829b-b62f9b618e79",
+    "traj-51263133-7a86-4619-9a3b-c77c72cf6dcb",
+    "traj-61e9ab6e-ad84-4b9b-bf1f-a854003ea0ef",
+    "traj-65bc098c-dedc-4c3a-89c9-7bb69de25bf4",
+    "traj-786afb59-6853-4862-8746-a8e627b4e946",
+    "traj-b301d62d-79d1-4b7e-9cc6-1ec377d5e674",
+    "traj-bf2d6c27-aacf-4596-9fae-96882bdebb22",
+    "traj-c22c3912-924b-4d21-a713-215416810099",
+    "traj-c64acab7-5425-4ffd-a28a-4ed346193aad"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d02ccedd-fdf8-4dd7-af8f-d132765084f9.json b/docs/training-reports/report-d02ccedd-fdf8-4dd7-af8f-d132765084f9.json
new file mode 100644
index 0000000..870ecc8
--- /dev/null
+++ b/docs/training-reports/report-d02ccedd-fdf8-4dd7-af8f-d132765084f9.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d02ccedd-fdf8-4dd7-af8f-d132765084f9",
+  "timestamp": "2026-04-14T20:34:01.494134+00:00",
+  "source_trajectory_ids": [
+    "traj-05f6993d-cdf9-45b4-9c4e-90eb1cbbf143",
+    "traj-46cdd34c-3c0e-49a0-80c5-f25305b57abb",
+    "traj-70bd5eaf-962d-4f2f-829a-0ec1c92fe092",
+    "traj-78071eb9-5083-4961-a87c-f6e4569ad858",
+    "traj-88c5e46a-d37d-415a-adcf-7b74faa641e5",
+    "traj-8a7671bc-4e2e-45f6-9dff-4ff281592f1d",
+    "traj-b1ff2f33-7b98-4eed-940b-1c9ecd2acc43",
+    "traj-b76252cb-642c-4996-a8ba-b65e868b6efc",
+    "traj-d321e753-9f95-43dc-99bf-379509a3911a",
+    "traj-ef722e2c-76ec-4a6a-a984-69da5b33ff1f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203401",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d0cd90a5-ef4a-4167-b3db-67c417669aea.json b/docs/training-reports/report-d0cd90a5-ef4a-4167-b3db-67c417669aea.json
new file mode 100644
index 0000000..484ff4c
--- /dev/null
+++ b/docs/training-reports/report-d0cd90a5-ef4a-4167-b3db-67c417669aea.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d0cd90a5-ef4a-4167-b3db-67c417669aea",
+  "timestamp": "2026-04-14T20:03:02.548083+00:00",
+  "source_trajectory_ids": [
+    "traj-4399ca04-9011-45c9-91b0-663c823ad37f",
+    "traj-45de2e2d-cfb7-44cc-ac79-9dd18f865fd4",
+    "traj-942fd417-caa3-4e68-bb36-23fb2bfc7fd5",
+    "traj-95cad70c-56c6-4222-9323-6a561bf5d7dc",
+    "traj-9a185707-02dd-49d7-8c34-b0ba058b3f22",
+    "traj-a19dd27e-6a49-4dd5-b2d6-98334ed6c971",
+    "traj-a9dd1c05-92b9-49fa-9b3a-a58932d5dcbe",
+    "traj-abc842b8-ad40-41bb-882b-2209b409f32a",
+    "traj-b0b00a31-288d-42bd-9feb-66a9a7c4f816",
+    "traj-cb670c02-9592-4d5a-bb11-8ea6d004b3e8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200302",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d34a861c-41f3-430e-af58-0fe460911260.json b/docs/training-reports/report-d34a861c-41f3-430e-af58-0fe460911260.json
new file mode 100644
index 0000000..490fa3e
--- /dev/null
+++ b/docs/training-reports/report-d34a861c-41f3-430e-af58-0fe460911260.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d34a861c-41f3-430e-af58-0fe460911260",
+  "timestamp": "2026-04-14T21:44:48.308452+00:00",
+  "source_trajectory_ids": [
+    "traj-0268003b-2c0a-417e-95fb-7ad9eac382c6",
+    "traj-11815e71-e969-46fd-b92a-b9d30c27f818",
+    "traj-1b21bd87-6895-49a4-aa1a-0e295420a772",
+    "traj-29625285-f1bd-4114-ad94-099624a58846",
+    "traj-2e1e2c62-3c9e-4593-bc32-d6e8a8b40821",
+    "traj-76d68926-c753-4936-b577-d5c30fb06968",
+    "traj-8138144a-221f-4177-82b6-0eb5fa4660b3",
+    "traj-b5ae8e3a-3b60-458d-9bcb-a2b47be458b9",
+    "traj-cf563c03-cd68-44ab-af02-8ddb54dab197",
+    "traj-efa123d6-2c74-4c0e-beae-761934a045f2"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214448",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d448c7ff-8162-4ae7-b489-68c67675a752.json b/docs/training-reports/report-d448c7ff-8162-4ae7-b489-68c67675a752.json
new file mode 100644
index 0000000..cc6804c
--- /dev/null
+++ b/docs/training-reports/report-d448c7ff-8162-4ae7-b489-68c67675a752.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d448c7ff-8162-4ae7-b489-68c67675a752",
+  "timestamp": "2026-04-14T21:21:15.238780+00:00",
+  "source_trajectory_ids": [
+    "traj-263df466-1bd9-490d-8225-69fa64b10c8b",
+    "traj-3a6196d2-f94e-4149-bc8c-4637d1939a79",
+    "traj-4c2cf6fb-dd03-477f-8570-932f2254af01",
+    "traj-81cc8735-94cd-4a94-99df-19ce67d282a1",
+    "traj-91245c94-2620-4ee2-ab0a-20542cc83bb5",
+    "traj-91fcb473-9f0e-4187-a3a1-84454daea52e",
+    "traj-ad751016-d9a6-4464-9286-694dc8f9272e",
+    "traj-b62924a5-5ccb-40e6-87d7-bc42bad5ad4f",
+    "traj-b9223b5f-8dca-4965-a4cd-08e060a91542",
+    "traj-d453b4a0-666c-4ea9-a930-faaef9bf8f4b"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212115",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d46767dc-7142-47c7-9456-44db4df475a7.json b/docs/training-reports/report-d46767dc-7142-47c7-9456-44db4df475a7.json
new file mode 100644
index 0000000..6008c45
--- /dev/null
+++ b/docs/training-reports/report-d46767dc-7142-47c7-9456-44db4df475a7.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-d46767dc-7142-47c7-9456-44db4df475a7",
+  "timestamp": "2026-04-14T18:05:53.292364+00:00",
+  "source_trajectory_ids": [
+    "traj-055b1e88-d24e-4b66-8638-4ac1afbb7193",
+    "traj-069542e2-fbf2-478a-845e-43094feb4ce1",
+    "traj-306f78fb-0151-4f58-9ae5-4959fc2ea2a6",
+    "traj-3b806f93-2ed5-4961-b742-5d3d21a7acf4",
+    "traj-4e391bc5-c6bf-461b-ba26-aadbb0de214a",
+    "traj-5c096700-bfca-4f6a-9abc-af1d895cd62f",
+    "traj-6d99cef7-f649-4ed5-9915-76717baba1a1",
+    "traj-8a0d9f86-076e-44fe-b0d8-9b879b8db845",
+    "traj-bdd8ffd2-5bb1-4c4a-ba73-7f5b1f2703be",
+    "traj-d87d887a-96f2-403c-a5a9-fdfcf7cc846c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180553"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d5d4708a-a9c1-4e32-a1eb-80a156055374.json b/docs/training-reports/report-d5d4708a-a9c1-4e32-a1eb-80a156055374.json
new file mode 100644
index 0000000..b2c75e2
--- /dev/null
+++ b/docs/training-reports/report-d5d4708a-a9c1-4e32-a1eb-80a156055374.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-d5d4708a-a9c1-4e32-a1eb-80a156055374",
+  "timestamp": "2026-04-15T01:29:18.250752+00:00",
+  "source_trajectory_ids": [
+    "traj-239200f5-a6ad-4189-bad1-cc34d0ec57b5",
+    "traj-8a3f24bf-578a-472e-bbb1-3e2c701d1bfc",
+    "traj-9fb35b1f-b9db-4bed-9fcc-6d812b8eefaa",
+    "traj-a6180f6a-42b5-4cda-886b-2c50c9056f31",
+    "traj-a80dad6a-e710-4a45-ab31-151ffa342603",
+    "traj-b47c1410-ad27-449b-b446-905bd47ce262",
+    "traj-b640d93e-1258-4c59-9157-ab6c6d1d95d8",
+    "traj-cb46a313-1a0c-4be9-b757-9d122db50752",
+    "traj-ce579216-ac1c-40a3-9fe7-500d6cad014f",
+    "traj-eb4c2c46-7e44-44a6-9aa5-1b909cfc7dd4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d607641c-8460-44ea-b9f0-1fbb9ca0dad5.json b/docs/training-reports/report-d607641c-8460-44ea-b9f0-1fbb9ca0dad5.json
new file mode 100644
index 0000000..23fbb9b
--- /dev/null
+++ b/docs/training-reports/report-d607641c-8460-44ea-b9f0-1fbb9ca0dad5.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d607641c-8460-44ea-b9f0-1fbb9ca0dad5",
+  "timestamp": "2026-04-14T21:44:48.170031+00:00",
+  "source_trajectory_ids": [
+    "traj-132ee788-dba2-4fe7-ace2-f86e4e58e012",
+    "traj-32a84d4f-8c0b-4f35-aaac-3f90b3e3aa3e",
+    "traj-3868936e-a6bb-4b79-bdcb-2818757709a8",
+    "traj-391c8367-54ab-464b-84a4-b87c18a398cf",
+    "traj-4c2774f5-414b-47fb-942e-bcd42b3dfd7f",
+    "traj-6d5eb242-82dd-4c6c-8b42-b7491b15c83d",
+    "traj-94718bcc-1b5e-4651-ad96-6e00a9be07bd",
+    "traj-c344578f-70c3-490a-b813-4e4ce9dbf99c",
+    "traj-e88dda13-55bf-4428-abe7-de91a2b40d20",
+    "traj-fd02e009-81c4-4280-a3e5-9405570c36c1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214448",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d629a53f-1527-4984-ad23-a344dd0ee965.json b/docs/training-reports/report-d629a53f-1527-4984-ad23-a344dd0ee965.json
new file mode 100644
index 0000000..9ca01ed
--- /dev/null
+++ b/docs/training-reports/report-d629a53f-1527-4984-ad23-a344dd0ee965.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d629a53f-1527-4984-ad23-a344dd0ee965",
+  "timestamp": "2026-04-14T22:10:23.241693+00:00",
+  "source_trajectory_ids": [
+    "traj-26a8e8b2-d28f-482d-bb80-153f522871bd",
+    "traj-2905e431-4574-4e8f-a8f8-34aa87bcff50",
+    "traj-41318a60-642e-4c1b-bbaf-51f083dd0063",
+    "traj-62037c5e-06d5-4319-803b-5769f511f89a",
+    "traj-71ae1c04-b498-4a33-94af-ec86e9e198b0",
+    "traj-b578d761-010a-4851-bff5-258ed555a9e8",
+    "traj-c3a05916-8867-4733-99a1-f3f83ff6aae6",
+    "traj-ec177665-1bad-42b6-b316-ee9b28205f4a",
+    "traj-f97e531f-e0a2-4bef-b6bc-bdc777029898",
+    "traj-fdb67019-e369-4a27-a150-c17ee293a493"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221023",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d71aa4f7-2e1e-4bae-b564-3f9fc6c6cd3d.json b/docs/training-reports/report-d71aa4f7-2e1e-4bae-b564-3f9fc6c6cd3d.json
new file mode 100644
index 0000000..7ff9a0a
--- /dev/null
+++ b/docs/training-reports/report-d71aa4f7-2e1e-4bae-b564-3f9fc6c6cd3d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d71aa4f7-2e1e-4bae-b564-3f9fc6c6cd3d",
+  "timestamp": "2026-04-15T01:33:34.954131+00:00",
+  "source_trajectory_ids": [
+    "traj-0d72cd8c-e2fa-4096-a8ec-9e2bb8c17761",
+    "traj-3e3680b5-5eca-4f19-b92a-d2b6819fb0d1",
+    "traj-7429f563-4a87-49aa-b8c3-d4a37535dcfc",
+    "traj-77212f26-a4d8-41b3-9a49-9dc59e01d192",
+    "traj-8de21233-aef7-46f8-a320-b0c7892cdb0d",
+    "traj-8e287f1a-b36e-4394-b333-a391248606ac",
+    "traj-9f1cdd1a-64dd-4e8f-aaf6-0c891dc2fbd3",
+    "traj-b4ef0eca-88c9-4bda-8599-5846267e0536",
+    "traj-c41750ac-d969-47de-8375-694911136b5b",
+    "traj-df7662d9-c080-43b8-b86b-a2452b120f9c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013334",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d7b956e1-c2c8-4105-9765-2c11a59903c2.json b/docs/training-reports/report-d7b956e1-c2c8-4105-9765-2c11a59903c2.json
new file mode 100644
index 0000000..c842607
--- /dev/null
+++ b/docs/training-reports/report-d7b956e1-c2c8-4105-9765-2c11a59903c2.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d7b956e1-c2c8-4105-9765-2c11a59903c2",
+  "timestamp": "2026-04-14T17:16:23.861811+00:00",
+  "source_trajectory_ids": [
+    "traj-5771e541-b118-49dc-9a61-cd498528ea45",
+    "traj-67b4ae15-c4a8-439b-b5af-e137fc243746",
+    "traj-7210de2c-aaaa-4bcb-a2e4-51585e1e2867",
+    "traj-72d6eb97-1251-4017-92bd-6e8030293d32",
+    "traj-84fada4a-85b7-4e67-a54f-1481c33d20c6",
+    "traj-972862cf-1d1e-46b9-8510-5314d5c42da5",
+    "traj-aedce72f-34fb-4d38-9c5d-2b4d8b0e8961",
+    "traj-d5ecb440-0907-419e-9d6d-024e703168da",
+    "traj-db015c45-fd5a-445a-b5fb-07a14e0d764e",
+    "traj-e923acdf-8fbc-4164-bd3b-50b633f273a6"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-d8a735d5-1511-4867-ae34-d5450578b4eb.json b/docs/training-reports/report-d8a735d5-1511-4867-ae34-d5450578b4eb.json
new file mode 100644
index 0000000..17da59f
--- /dev/null
+++ b/docs/training-reports/report-d8a735d5-1511-4867-ae34-d5450578b4eb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-d8a735d5-1511-4867-ae34-d5450578b4eb",
+  "timestamp": "2026-04-14T20:33:28.544173+00:00",
+  "source_trajectory_ids": [
+    "traj-05634505-966b-4ca5-bee5-2acc0f375da9",
+    "traj-356893fa-ddfb-4cee-bc10-2c80601a964a",
+    "traj-5de517e3-4302-45d5-ad1a-8995423cb0d8",
+    "traj-6095abea-d8e2-4f2c-ad89-1a8ffc170286",
+    "traj-86b7818f-5f1c-4c2c-ab23-5facae4b6861",
+    "traj-b37c76d7-bbd0-47b0-88f8-054667e29d70",
+    "traj-c0483a4c-2f25-4c22-9f8d-72846331c765",
+    "traj-dda694aa-dc3f-4b5e-9f0e-9721b1d04905",
+    "traj-de3c0563-fcd6-4499-a349-1679468d6ab8",
+    "traj-fc6e6767-f20e-44c6-a2a1-3807426c1e61"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203328",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-da9f5483-f9e7-45b1-a5a6-8293c3fa1997.json b/docs/training-reports/report-da9f5483-f9e7-45b1-a5a6-8293c3fa1997.json
new file mode 100644
index 0000000..7381412
--- /dev/null
+++ b/docs/training-reports/report-da9f5483-f9e7-45b1-a5a6-8293c3fa1997.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-da9f5483-f9e7-45b1-a5a6-8293c3fa1997",
+  "timestamp": "2026-04-14T20:04:58.852352+00:00",
+  "source_trajectory_ids": [
+    "traj-3e4bd63d-9d95-4c40-add0-d4a2bfe7b522",
+    "traj-4d0ff1e6-774b-4f7f-992e-3b8e0ab255ad",
+    "traj-5ff66524-3a06-4d48-8252-7fac0cd6d9af",
+    "traj-9c02aa8c-185e-4bb7-b7bc-4d4bdb5889ef",
+    "traj-b7459c9c-73d3-430f-aeed-2371437f31d4",
+    "traj-bfab0455-55af-4c50-b7fd-0fda053de88e",
+    "traj-c4b55069-fa9d-48f0-b949-7481d22a16fb",
+    "traj-e137f479-cd23-4b44-94aa-4fb4119fdac9",
+    "traj-e5a044db-fd31-4e93-b533-12ce84634bd9",
+    "traj-ff43d8b5-55ff-4491-81b3-7429fa9c3455"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-daf13edb-5a29-462a-b09c-f5f3ca73e757.json b/docs/training-reports/report-daf13edb-5a29-462a-b09c-f5f3ca73e757.json
new file mode 100644
index 0000000..b357dd4
--- /dev/null
+++ b/docs/training-reports/report-daf13edb-5a29-462a-b09c-f5f3ca73e757.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-daf13edb-5a29-462a-b09c-f5f3ca73e757",
+  "timestamp": "2026-04-15T02:33:47.842875+00:00",
+  "source_trajectory_ids": [
+    "traj-1020f220-92a2-4654-baa6-72e52ec71053",
+    "traj-47295e54-e0dc-4ef5-82f4-96954496198d",
+    "traj-4db1eccf-61c3-4333-8345-eddacc8e27cc",
+    "traj-5eadd414-270d-455b-8df9-a827f8f5c585",
+    "traj-75ce6d90-cf75-4852-afb1-f42bf31120a6",
+    "traj-7ebb66f4-5ade-4815-b147-33179c8d0cfe",
+    "traj-823c8d29-ee1f-4701-a18f-bd7a847452c7",
+    "traj-8df5ba18-78e0-4546-913b-01b887bf1015",
+    "traj-e06e46d9-485c-43ec-a2d8-7c166781dff6",
+    "traj-f857f2ad-8f5a-4770-ae6f-1d07d91e2be3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-db8988e0-a578-4fe7-9cdb-4676e7df9533.json b/docs/training-reports/report-db8988e0-a578-4fe7-9cdb-4676e7df9533.json
new file mode 100644
index 0000000..1be9d3b
--- /dev/null
+++ b/docs/training-reports/report-db8988e0-a578-4fe7-9cdb-4676e7df9533.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-db8988e0-a578-4fe7-9cdb-4676e7df9533",
+  "timestamp": "2026-04-14T22:10:15.259647+00:00",
+  "source_trajectory_ids": [
+    "traj-05daae2e-2ddf-451c-bbed-0a541389b238",
+    "traj-3cc8cddd-56b4-4e19-aa8d-607a780d51f9",
+    "traj-65595747-c587-473b-af09-536b4a1ed09f",
+    "traj-87ffdde5-e669-4c1b-96f1-0750f43c2e72",
+    "traj-9e62f3ea-6852-425e-9495-1a4f4f434834",
+    "traj-a7b4ed18-5bfa-4c66-bbd5-ba68b1faf8d9",
+    "traj-b2d30603-6c44-417b-9ffe-fc10ca98cd85",
+    "traj-bba3a9f1-fe90-49c1-b589-bba3492b7a6e",
+    "traj-eee3a4a3-74be-4b5a-940c-3b28df374b3a",
+    "traj-f08030e2-1ddc-4447-9546-ea1ade988688"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-221015",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-deb97363-3e48-4492-b1fe-81c5b2f0cbe9.json b/docs/training-reports/report-deb97363-3e48-4492-b1fe-81c5b2f0cbe9.json
new file mode 100644
index 0000000..8a76f1d
--- /dev/null
+++ b/docs/training-reports/report-deb97363-3e48-4492-b1fe-81c5b2f0cbe9.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-deb97363-3e48-4492-b1fe-81c5b2f0cbe9",
+  "timestamp": "2026-04-14T20:07:38.768519+00:00",
+  "source_trajectory_ids": [
+    "traj-0b76be8b-2770-49d3-b2f6-d7aaf678fdda",
+    "traj-63111548-ac20-481f-908e-fb426bfa3000",
+    "traj-73abe76c-84e9-42b2-baa4-6eb180186c38",
+    "traj-73b8a06f-e93f-4282-a10b-f1bd4f83d317",
+    "traj-770a367b-9cb8-4b13-aef8-78ccd6992052",
+    "traj-ad4a2a80-851a-4bac-92e7-26854f165be9",
+    "traj-cfc3d5e2-00d6-4394-a7f6-40128541dd79",
+    "traj-ec0c07bb-a8cb-4816-98ec-e749a98971e3",
+    "traj-ee41c065-c055-4797-8d3f-730e63732f0f",
+    "traj-ef5a45f8-be53-4450-812d-0bbd1fdb13d8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200738",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-df7c7b8d-f5a1-45e0-ac6d-8812a3cfdd82.json b/docs/training-reports/report-df7c7b8d-f5a1-45e0-ac6d-8812a3cfdd82.json
new file mode 100644
index 0000000..85c53e5
--- /dev/null
+++ b/docs/training-reports/report-df7c7b8d-f5a1-45e0-ac6d-8812a3cfdd82.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-df7c7b8d-f5a1-45e0-ac6d-8812a3cfdd82",
+  "timestamp": "2026-04-15T01:21:53.687956+00:00",
+  "source_trajectory_ids": [
+    "traj-080955e6-3524-4392-aac5-36c7edabbf77",
+    "traj-16b8d58c-1d3d-4b2a-9600-c4107f1ba2cd",
+    "traj-4383bc8f-cdf0-46c7-b5ad-6eb4317b7761",
+    "traj-6de7e63e-ad6d-45f3-823f-07a37bbc1cee",
+    "traj-a9f6837a-bfdb-4438-aa70-f1bef125ff76",
+    "traj-e087b8ea-3d7d-41d7-91c7-371b55220151",
+    "traj-e45207bc-8291-471a-8b05-2aae927274e7",
+    "traj-f4cc6c86-e6ee-4a4b-8352-ecd52ee6bda4",
+    "traj-f6e18ddd-c91e-4da0-840c-cc78e953b8e2",
+    "traj-f9ea3c5f-7b2e-4c8d-bff7-64e56391c5e4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012153",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-df953479-f64a-4d9b-b437-0820f8ade9cb.json b/docs/training-reports/report-df953479-f64a-4d9b-b437-0820f8ade9cb.json
new file mode 100644
index 0000000..ac70f52
--- /dev/null
+++ b/docs/training-reports/report-df953479-f64a-4d9b-b437-0820f8ade9cb.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-df953479-f64a-4d9b-b437-0820f8ade9cb",
+  "timestamp": "2026-04-14T18:05:15.780624+00:00",
+  "source_trajectory_ids": [
+    "traj-224d7e4b-dee5-459e-94dc-41a7c5d1cafe",
+    "traj-3be09bf7-ccd8-4e5a-9d16-43e0dd11cec4",
+    "traj-44741aba-78ff-429e-87ae-13a3f4178304",
+    "traj-46518a93-9201-4745-92c6-799205b18a1d",
+    "traj-4f5ca0b0-e9fc-44c7-a997-1c690cc69095",
+    "traj-50b3c733-51a6-47d6-89b9-cdaec97cce2b",
+    "traj-791b50b2-c319-425c-91cd-76fa51a09df1",
+    "traj-888930f8-507f-408d-98e7-a331705ae548",
+    "traj-9de333f2-5263-4c42-b3ac-870836328a9c",
+    "traj-d76fb56e-ebc2-4a33-a298-5000ad94778f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180515"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e0290bbc-50f9-4284-b44e-953f7b686a87.json b/docs/training-reports/report-e0290bbc-50f9-4284-b44e-953f7b686a87.json
new file mode 100644
index 0000000..889bb4e
--- /dev/null
+++ b/docs/training-reports/report-e0290bbc-50f9-4284-b44e-953f7b686a87.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e0290bbc-50f9-4284-b44e-953f7b686a87",
+  "timestamp": "2026-04-15T01:25:33.972369+00:00",
+  "source_trajectory_ids": [
+    "traj-190b4a5c-bdac-4aff-94cd-0e28366ce081",
+    "traj-3beb0139-7c6c-493e-a6d9-a02264781d7c",
+    "traj-4e922a44-282e-47ae-8e45-741189d712d2",
+    "traj-5dc65f8c-ea10-4291-a7d2-9b4a784c3a48",
+    "traj-6ee5e2fa-2d80-49bf-8216-d5582320bcc0",
+    "traj-887c6376-9df1-47d6-a9f0-84df3de7aaab",
+    "traj-8a3b3a1c-73bd-449e-a090-a688246361e9",
+    "traj-af0dcd29-6552-4092-b9a6-3daf4580d06b",
+    "traj-c46cff1d-05f9-4233-a7eb-6c89e019c827",
+    "traj-fc51fe5c-b4d1-4245-88c8-0fa34c5b4820"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012533",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e0d5b2d1-9c97-4e39-938e-fa26725a38ea.json b/docs/training-reports/report-e0d5b2d1-9c97-4e39-938e-fa26725a38ea.json
new file mode 100644
index 0000000..97b1e1c
--- /dev/null
+++ b/docs/training-reports/report-e0d5b2d1-9c97-4e39-938e-fa26725a38ea.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-e0d5b2d1-9c97-4e39-938e-fa26725a38ea",
+  "timestamp": "2026-04-14T17:38:32.766277+00:00",
+  "source_trajectory_ids": [
+    "traj-066155cb-7006-450d-be8c-38f5c0551da5",
+    "traj-0fa878bc-b6a3-4642-83ff-41148eb173ad",
+    "traj-2f30ecb8-f2fe-4d2b-997c-8852b6fd33b0",
+    "traj-47893dab-ad1d-4778-a5b1-c51835d34a60",
+    "traj-5ba0e654-2be3-407c-85a8-358be8b81d09",
+    "traj-681cab8d-832c-487d-9dff-ba2d7016a72a",
+    "traj-753ac1b1-416c-429a-a6d3-01d8eac1d125",
+    "traj-beafa3ca-5c96-4449-839a-ce18c14c641e",
+    "traj-cd97d02c-56b4-479f-83cd-0a1541cbcbec",
+    "traj-e2a7ff20-3e59-4ec1-a4b3-a7d8781d81a8"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-173832"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e16add80-5ba4-44b5-9432-56da24eb0ffa.json b/docs/training-reports/report-e16add80-5ba4-44b5-9432-56da24eb0ffa.json
new file mode 100644
index 0000000..870ec62
--- /dev/null
+++ b/docs/training-reports/report-e16add80-5ba4-44b5-9432-56da24eb0ffa.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e16add80-5ba4-44b5-9432-56da24eb0ffa",
+  "timestamp": "2026-04-14T22:08:57.425891+00:00",
+  "source_trajectory_ids": [
+    "traj-137adbba-3299-4e2a-80bc-dea7feddd5f7",
+    "traj-1f6bee3f-e284-4ab5-9137-7f3edab1ef3c",
+    "traj-2ea814f1-5344-465e-bb79-be735b66dd8d",
+    "traj-66cffda6-b6ee-498b-b8bb-cdbb00e85004",
+    "traj-916df637-b8bf-4330-b800-d1085ebf8671",
+    "traj-ad7652bf-b165-4da4-b365-28a848fd95b5",
+    "traj-d1a1e96b-9858-4b7c-a1a6-136eb3db8c3f",
+    "traj-de3bc12f-5083-4544-b3b1-d3f3285e37af",
+    "traj-f24ae176-1313-4812-86c9-e6ff010967fa",
+    "traj-fd65a143-4f58-4cd4-95be-bc832d7991f9"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220857",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e1e80ed1-a1c1-4853-a7c6-2a64a7298bfe.json b/docs/training-reports/report-e1e80ed1-a1c1-4853-a7c6-2a64a7298bfe.json
new file mode 100644
index 0000000..058d943
--- /dev/null
+++ b/docs/training-reports/report-e1e80ed1-a1c1-4853-a7c6-2a64a7298bfe.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-e1e80ed1-a1c1-4853-a7c6-2a64a7298bfe",
+  "timestamp": "2026-04-15T01:33:34.877747+00:00",
+  "source_trajectory_ids": [
+    "traj-09fe4d7c-5710-4124-8e32-4f2a5f2acc33",
+    "traj-0ebd77c5-f37c-4dff-a674-5681833cef6d",
+    "traj-2a5f1f91-44b1-4f46-892f-17c17dbb3e4a",
+    "traj-5ee5bcc4-4b7f-4d87-8f45-00216a4988b5",
+    "traj-7a0eda78-1571-4600-97eb-f2038850125f",
+    "traj-8587d4ad-d609-41b1-8c1f-22b1db7b9f4e",
+    "traj-b62704ca-b248-4d6d-aa6e-95b5cb202311",
+    "traj-b64969d6-2863-4bf1-acd7-14b1b445d276",
+    "traj-bf0fd2a9-03a6-4e7c-920a-20d38586ae1a",
+    "traj-fff422bb-122b-4c00-996b-be99e5ee72de"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e417488b-5ae1-453c-a778-048212c656be.json b/docs/training-reports/report-e417488b-5ae1-453c-a778-048212c656be.json
new file mode 100644
index 0000000..6e29e12
--- /dev/null
+++ b/docs/training-reports/report-e417488b-5ae1-453c-a778-048212c656be.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-e417488b-5ae1-453c-a778-048212c656be",
+  "timestamp": "2026-04-14T22:05:59.262693+00:00",
+  "source_trajectory_ids": [
+    "traj-00ea1a3d-6882-4346-9644-003aeb653785",
+    "traj-340a9b1a-f6a1-4096-9e39-880299c5c17b",
+    "traj-5c44a81a-2c38-4de5-91b9-cadcab8bf75e",
+    "traj-7d181b97-19e6-4f2d-954f-e0d2bf05d3f5",
+    "traj-7e6381ae-9f4a-410b-9389-0bbcb0f5006b",
+    "traj-81b00183-53f7-44d6-aa30-6cc5bf18498e",
+    "traj-a060dbb8-3c32-4bbc-b22a-be280ca4abec",
+    "traj-a66d985e-a7b7-4c12-8c0c-16cad34dc1f8",
+    "traj-c3b2bdfe-fa31-4bf4-8356-80ce021888e4",
+    "traj-e2b08721-e6b2-4cdb-93b6-4332eaa24f7e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e5d1ccc5-92e7-4b4e-970d-3c295320874d.json b/docs/training-reports/report-e5d1ccc5-92e7-4b4e-970d-3c295320874d.json
new file mode 100644
index 0000000..05dd838
--- /dev/null
+++ b/docs/training-reports/report-e5d1ccc5-92e7-4b4e-970d-3c295320874d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e5d1ccc5-92e7-4b4e-970d-3c295320874d",
+  "timestamp": "2026-04-14T20:06:16.329964+00:00",
+  "source_trajectory_ids": [
+    "traj-00cd4485-1631-4d40-8808-9c58ff380031",
+    "traj-01cbcaf2-a53a-4d43-9147-fc68e054ef42",
+    "traj-4dbf5b3b-687e-4376-8ec5-5c2a74dedd1d",
+    "traj-6aeabfa2-c245-4687-8d69-45bcffc27c9a",
+    "traj-a2f6af6e-ec0b-4117-9bc6-a77d6982b811",
+    "traj-a47370d1-89ec-4f55-841c-cdcd4a2a5807",
+    "traj-a69dc48e-b005-490e-ad65-812a0bc4bdf6",
+    "traj-ca5a2a5d-5b1e-4ed5-9239-5b528d2fa1ed",
+    "traj-cb7942a4-a3c7-40ff-ad16-4a32b71bd14d",
+    "traj-ffd0712c-7a9b-414c-96f2-a611100cb668"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-200616",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e691ec63-98f9-4239-820c-5bf6d57de243.json b/docs/training-reports/report-e691ec63-98f9-4239-820c-5bf6d57de243.json
new file mode 100644
index 0000000..31ca5c0
--- /dev/null
+++ b/docs/training-reports/report-e691ec63-98f9-4239-820c-5bf6d57de243.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e691ec63-98f9-4239-820c-5bf6d57de243",
+  "timestamp": "2026-04-14T15:25:30.542190+00:00",
+  "source_trajectory_ids": [
+    "traj-040359a5-00c1-4a34-a9fb-ba92c413b7b3",
+    "traj-0d41a9da-0c99-4eb2-9d3a-708b9b7b82db",
+    "traj-3df2f77a-e3ec-4417-8791-26ce0676f84a",
+    "traj-8e50fb5a-5cc0-4365-adbe-198bef7e4420",
+    "traj-8f0b868a-d9dd-4dbe-a02e-ce9b8c04c2ea",
+    "traj-c5783fab-637b-46e1-958c-5e92ed130ee9",
+    "traj-ce5196c6-665c-4279-8985-13bc6537cf00",
+    "traj-db9d61be-2302-48d0-a646-c0f2308789b1",
+    "traj-e72c549b-6f44-48cf-b2df-0c236e44cfa6",
+    "traj-ea4ec480-c25d-477e-9f46-79783b2bf7c4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e6964e86-8d88-44b5-b71f-646d511c1760.json b/docs/training-reports/report-e6964e86-8d88-44b5-b71f-646d511c1760.json
new file mode 100644
index 0000000..510ae48
--- /dev/null
+++ b/docs/training-reports/report-e6964e86-8d88-44b5-b71f-646d511c1760.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e6964e86-8d88-44b5-b71f-646d511c1760",
+  "timestamp": "2026-04-14T22:05:59.127680+00:00",
+  "source_trajectory_ids": [
+    "traj-2809c098-f8eb-4033-b427-a1877fae79ce",
+    "traj-2a4ac1d6-e8b4-4470-b6df-5cf001962453",
+    "traj-79d8dcbd-2ffd-4052-9851-4606f29a7924",
+    "traj-89c138ae-808c-4e52-a721-dcdb1d5ad6cb",
+    "traj-a24b86ef-0afd-48df-be9c-fb4a0c2c32fc",
+    "traj-abf933d5-9e37-478c-ad3a-ec5292d3f091",
+    "traj-acb50486-6301-49af-9dc1-38ea94a3671a",
+    "traj-b3caf571-6c26-4fd9-83fc-7e6ad2142f90",
+    "traj-c9179c98-1c94-4df9-b829-46bea8866bc6",
+    "traj-e85ff765-bb03-4ddd-8907-5f3a460e5ee1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220559",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e7c8c800-0285-451c-a381-9d973b09a28d.json b/docs/training-reports/report-e7c8c800-0285-451c-a381-9d973b09a28d.json
new file mode 100644
index 0000000..dd7654b
--- /dev/null
+++ b/docs/training-reports/report-e7c8c800-0285-451c-a381-9d973b09a28d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e7c8c800-0285-451c-a381-9d973b09a28d",
+  "timestamp": "2026-04-14T16:53:59.603802+00:00",
+  "source_trajectory_ids": [
+    "traj-021fe9c7-2219-4720-a611-00be43fd45a5",
+    "traj-3057337f-dfa6-4cad-9a75-0fd7578ad26e",
+    "traj-4be526b9-7ece-4472-969f-958300b7de46",
+    "traj-5d648d76-9044-4bfb-8698-5d82edaaae41",
+    "traj-8d9f783d-4a01-4336-b021-34e0619afa39",
+    "traj-bf8c7b99-aed9-4c4f-8fb8-779b87ddc8f7",
+    "traj-dfff1285-8e2a-4cac-8f2b-a75ebb7caae6",
+    "traj-ec2344f9-6741-4806-8b0c-1de561cf4b3d",
+    "traj-f0f0723d-da65-4db5-a8ac-8cc96fafe09d",
+    "traj-ff4b3b70-e611-4736-8494-fc580cacb7cc"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e816962d-950c-44d5-9c56-84fbac38ad03.json b/docs/training-reports/report-e816962d-950c-44d5-9c56-84fbac38ad03.json
new file mode 100644
index 0000000..66b6cf8
--- /dev/null
+++ b/docs/training-reports/report-e816962d-950c-44d5-9c56-84fbac38ad03.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-e816962d-950c-44d5-9c56-84fbac38ad03",
+  "timestamp": "2026-04-14T15:29:41.997128+00:00",
+  "source_trajectory_ids": [
+    "traj-0f91f7d5-8baa-41af-acf4-89fd1912dadd",
+    "traj-363ee6c2-07ec-4694-bcc1-34f1f5be50fe",
+    "traj-3b44d3de-1632-4c52-9703-839eb64cd8c0",
+    "traj-497af52f-3274-4304-b628-f877c7ea9b05",
+    "traj-54d7eb22-d6ef-4e27-a62c-aa29ac1cad80",
+    "traj-7a4077ab-a811-41d3-88a5-d88ca6af6697",
+    "traj-c3f0b796-d860-4cb0-96d7-9908b4c6765a",
+    "traj-d0b1e945-a422-4d68-979d-92a86c6dc849",
+    "traj-d0f36666-a9b4-419c-8bb1-5e3a675331d5",
+    "traj-e0bc1956-9463-4d5a-bd67-da6f6b375604"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-152941"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e8c7784a-ae81-43d4-b452-69c1f3cd4b57.json b/docs/training-reports/report-e8c7784a-ae81-43d4-b452-69c1f3cd4b57.json
new file mode 100644
index 0000000..ca66f8c
--- /dev/null
+++ b/docs/training-reports/report-e8c7784a-ae81-43d4-b452-69c1f3cd4b57.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e8c7784a-ae81-43d4-b452-69c1f3cd4b57",
+  "timestamp": "2026-04-14T22:08:57.406399+00:00",
+  "source_trajectory_ids": [
+    "traj-13684ccc-9d94-4432-9e30-bbc6c63867c3",
+    "traj-1572c18d-e850-40e5-a174-64077d127dee",
+    "traj-21d8d6cc-82cf-4dc9-9bf3-2ed4bee40057",
+    "traj-3ff1b908-b792-4706-b8b4-8c34dbeb478e",
+    "traj-91502573-f6b7-4242-8794-c4540083911a",
+    "traj-9db8409b-0701-46b2-9170-75be6ce920e5",
+    "traj-b6cff214-92d6-498a-adb0-69de05645bae",
+    "traj-de8f5c5b-a1b1-43f0-a462-1feb799b74d4",
+    "traj-ea05f470-b5b4-44b1-9550-1d511ff336ed",
+    "traj-ffaa1acf-fb13-4afe-a1fe-9e837c6eaab1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220857",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e906bfbd-7e74-4fa0-b5fc-af95662e3cbb.json b/docs/training-reports/report-e906bfbd-7e74-4fa0-b5fc-af95662e3cbb.json
new file mode 100644
index 0000000..6fc6b1f
--- /dev/null
+++ b/docs/training-reports/report-e906bfbd-7e74-4fa0-b5fc-af95662e3cbb.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e906bfbd-7e74-4fa0-b5fc-af95662e3cbb",
+  "timestamp": "2026-04-14T20:30:08.670779+00:00",
+  "source_trajectory_ids": [
+    "traj-102953fc-c25f-467b-aa45-df989baf932d",
+    "traj-32aaed3b-2202-4501-84c9-f37da80b2b61",
+    "traj-6462d1e8-96f7-4afa-bb9d-738cf14f5b78",
+    "traj-a0f3e55c-d632-4398-87d3-0e5ac29e0004",
+    "traj-b342e4ab-e4ec-47e3-9778-82ce5b2448cd",
+    "traj-c6e60f16-bddb-40cf-b802-7791797f1dd2",
+    "traj-cc0a72ed-b91f-477f-990f-1d1d5e49d4d7",
+    "traj-e5155977-eefa-4516-9d71-5c4e1e45e895",
+    "traj-ee2f8242-c041-4ed7-872e-76f174eb7b19",
+    "traj-f47630d3-e921-4255-9d54-31446b16e80c"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-203008",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-e90f9007-de11-4d53-bb49-02ea80f41f16.json b/docs/training-reports/report-e90f9007-de11-4d53-bb49-02ea80f41f16.json
new file mode 100644
index 0000000..11f1643
--- /dev/null
+++ b/docs/training-reports/report-e90f9007-de11-4d53-bb49-02ea80f41f16.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-e90f9007-de11-4d53-bb49-02ea80f41f16",
+  "timestamp": "2026-04-15T01:36:36.721484+00:00",
+  "source_trajectory_ids": [
+    "traj-133b537d-e6ac-412b-b587-f1284e53871f",
+    "traj-2009aee1-92c6-46ed-a2e5-301b3d290aa2",
+    "traj-62e63e82-0507-4d50-b427-c1c98b7a413a",
+    "traj-9e44bc91-5a5f-45ab-8a4e-63c8f3db2863",
+    "traj-ac7b3556-f7fd-4851-a247-d711d861dca2",
+    "traj-acaf1d77-43b3-4126-b334-fc40b3acfc00",
+    "traj-b93d9b0f-4b85-4702-9434-a155fca3c42f",
+    "traj-d3c4168c-738c-429f-a905-696e99d70394",
+    "traj-d5ec2f13-247e-4145-aa3a-7fec0909763a",
+    "traj-d8413c3d-f807-49a1-9adf-2e085c285b92"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013636",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ea1359a3-de4d-41ea-bb38-f0686988dfad.json b/docs/training-reports/report-ea1359a3-de4d-41ea-bb38-f0686988dfad.json
new file mode 100644
index 0000000..4c6bc59
--- /dev/null
+++ b/docs/training-reports/report-ea1359a3-de4d-41ea-bb38-f0686988dfad.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-ea1359a3-de4d-41ea-bb38-f0686988dfad",
+  "timestamp": "2026-04-14T18:57:10.584479+00:00",
+  "source_trajectory_ids": [
+    "traj-0f8f2837-462f-4e68-81e5-95269261f9c3",
+    "traj-15dd9093-770a-4618-8f81-a44802fe739a",
+    "traj-17fb2b75-5c45-40a6-b7fb-70f5232f13ab",
+    "traj-41782cfe-0342-4316-80f6-87068179a731",
+    "traj-6fd3bd56-8bfd-49e2-85a7-7879ed9edda9",
+    "traj-729c9408-62d8-4d1f-854a-fd4f6b3b724a",
+    "traj-90a72499-3dcb-42fd-8422-a1a5707fdfb7",
+    "traj-9872615d-1abf-4dc7-839d-6d7311da5b29",
+    "traj-ab17bef8-d6de-41cf-a20f-9322534a2e6a",
+    "traj-ceac8d83-5f7a-4b4e-83a4-db73f487446f"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-eb4bdf38-2de7-4fa3-9551-38db92dc4112.json b/docs/training-reports/report-eb4bdf38-2de7-4fa3-9551-38db92dc4112.json
new file mode 100644
index 0000000..bb5f561
--- /dev/null
+++ b/docs/training-reports/report-eb4bdf38-2de7-4fa3-9551-38db92dc4112.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-eb4bdf38-2de7-4fa3-9551-38db92dc4112",
+  "timestamp": "2026-04-14T19:21:09.906852+00:00",
+  "source_trajectory_ids": [
+    "traj-3568d203-d9a8-457f-a795-f02cccd74226",
+    "traj-3efaca24-0feb-4713-bb93-1dcbe8f78e05",
+    "traj-43152ba0-8459-4c74-aede-f42320839358",
+    "traj-6d7a01bd-727d-4c61-9245-a52743781eca",
+    "traj-797b472e-ad48-460c-bfda-f96a8411c97c",
+    "traj-9070ba4b-6cac-4ac2-b2eb-086d070539e4",
+    "traj-b0513ac4-2a90-4c07-a6b2-1aba4640f4d0",
+    "traj-c7737f31-0869-47bb-9b9f-36e3650f4816",
+    "traj-cb2f8ab9-6fab-442b-af9a-e933a4e1857f",
+    "traj-f98e8f81-3dc5-4e33-9726-8b3b06fca5b3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-eb61d92d-9e0b-42d8-88a4-3f9103d3e107.json b/docs/training-reports/report-eb61d92d-9e0b-42d8-88a4-3f9103d3e107.json
new file mode 100644
index 0000000..7725639
--- /dev/null
+++ b/docs/training-reports/report-eb61d92d-9e0b-42d8-88a4-3f9103d3e107.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-eb61d92d-9e0b-42d8-88a4-3f9103d3e107",
+  "timestamp": "2026-04-15T01:33:34.762735+00:00",
+  "source_trajectory_ids": [
+    "traj-00bb52b0-1516-4293-9c3a-da247dcd0b2b",
+    "traj-3d122123-93ef-4f7f-9f27-2a98e4eb125b",
+    "traj-4e1c3b97-e24a-4659-a53a-31ca7f50317e",
+    "traj-5bd04398-7767-40c2-860a-4b9d5feb9d0d",
+    "traj-76065493-0f85-4e97-bed8-91c7157f5505",
+    "traj-9855c8b7-86e0-4a96-9835-12fa75e8d0e5",
+    "traj-a7fa856c-76fc-42ae-889f-6f9f18ba22de",
+    "traj-ab48e55f-3106-44ad-b6b1-88c9d5e44cca",
+    "traj-bc54320b-6e2a-41a8-a234-df11f3205bea",
+    "traj-d7b3eb1f-0f60-4f0b-ba1e-84d5de43ff6e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-013334",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ec16f041-8bb5-4b1d-9511-9e69bfac708e.json b/docs/training-reports/report-ec16f041-8bb5-4b1d-9511-9e69bfac708e.json
new file mode 100644
index 0000000..3ceb6f3
--- /dev/null
+++ b/docs/training-reports/report-ec16f041-8bb5-4b1d-9511-9e69bfac708e.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ec16f041-8bb5-4b1d-9511-9e69bfac708e",
+  "timestamp": "2026-04-14T15:50:36.264321+00:00",
+  "source_trajectory_ids": [
+    "traj-08aa6357-9b21-45b3-ac1c-9b28411977f8",
+    "traj-31a561de-fa5e-4198-8153-a514c6221614",
+    "traj-3f5ea542-7acd-4464-923c-bb9a7669c4ce",
+    "traj-487a4ee1-408d-412e-808e-0216683d12d1",
+    "traj-801ff984-f4be-45a7-a8bc-50549b44eda9",
+    "traj-b07bfe9f-97c8-4056-9570-97f890b43fef",
+    "traj-c1558560-0a56-4fe4-b1fe-03da1b894c8b",
+    "traj-d9b6253c-f864-4458-8aa9-8481b6f6dd3f",
+    "traj-ea2ed025-6f31-4714-a752-85b6e11b28f3",
+    "traj-ec0709c5-ab4c-4670-a4c7-539256f04ad1"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ec302092-0762-4d16-aebc-9876b6bc7a73.json b/docs/training-reports/report-ec302092-0762-4d16-aebc-9876b6bc7a73.json
new file mode 100644
index 0000000..4ba92f3
--- /dev/null
+++ b/docs/training-reports/report-ec302092-0762-4d16-aebc-9876b6bc7a73.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ec302092-0762-4d16-aebc-9876b6bc7a73",
+  "timestamp": "2026-04-15T01:29:18.092249+00:00",
+  "source_trajectory_ids": [
+    "traj-0162689c-caf0-4ea3-94f2-ab2093ea43d0",
+    "traj-163f5c06-b304-4f1d-a39b-d1ce661ab37d",
+    "traj-4ce69f4d-3f69-408e-9539-8e72db29b69c",
+    "traj-5e496f48-0f92-4a3d-b11a-cedf1b00dd63",
+    "traj-73bab35f-f2a9-44e6-b2e0-d2afbfb23ee0",
+    "traj-b7712775-b9a0-4a34-ba53-bc3da7555f80",
+    "traj-bd3c6896-8ac1-43be-82e1-35cdef359fb9",
+    "traj-c659e864-7823-4171-a435-495bd0cfabac",
+    "traj-dd684833-24e4-4fff-84c6-ac60c39dd99e",
+    "traj-e165e57d-c47a-4106-a714-e6668bfe19b5"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012918",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ece2bad2-a0c3-4e1c-8a0f-3ef7d117a020.json b/docs/training-reports/report-ece2bad2-a0c3-4e1c-8a0f-3ef7d117a020.json
new file mode 100644
index 0000000..d4ffe9c
--- /dev/null
+++ b/docs/training-reports/report-ece2bad2-a0c3-4e1c-8a0f-3ef7d117a020.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ece2bad2-a0c3-4e1c-8a0f-3ef7d117a020",
+  "timestamp": "2026-04-15T01:21:53.722270+00:00",
+  "source_trajectory_ids": [
+    "traj-09dd77dc-cb65-4842-ae03-2f122122a4a8",
+    "traj-1d0a6fd7-e0b1-4b46-84ee-20efcc7068e0",
+    "traj-2d3ef0e1-b3f0-4489-90a6-fae5ff89612f",
+    "traj-5abdeaf3-5e14-4ec8-8ecc-d6747aef72ca",
+    "traj-6224679f-fde2-4e8d-9b79-ecca1a069fdd",
+    "traj-6c13be64-cf54-4193-bbb5-ef24fbd7807a",
+    "traj-8e38d5b7-2b09-4e89-8aaf-f569f26af6be",
+    "traj-b08fb543-4b37-4f0d-ae4a-c9456c27a82f",
+    "traj-db14df9c-e710-43d4-8458-bfe644b901ce",
+    "traj-fd6b49e0-fa1a-4149-84ae-c1259715af98"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012153",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-edc77cfc-4d81-4c19-af38-251cab92970d.json b/docs/training-reports/report-edc77cfc-4d81-4c19-af38-251cab92970d.json
new file mode 100644
index 0000000..877801f
--- /dev/null
+++ b/docs/training-reports/report-edc77cfc-4d81-4c19-af38-251cab92970d.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-edc77cfc-4d81-4c19-af38-251cab92970d",
+  "timestamp": "2026-04-15T02:33:47.711827+00:00",
+  "source_trajectory_ids": [
+    "traj-11a04d19-980c-4253-bc08-dc0b12d64997",
+    "traj-508b04f7-0766-406b-98fd-06f3e646cb9f",
+    "traj-77b5433a-8c6a-4641-b874-c2d9ff3e106e",
+    "traj-8b92455b-c396-4db5-853b-ddfea0490004",
+    "traj-a9ddc0cd-b1d4-4bdf-a6df-c439bc1578e9",
+    "traj-c2aaca11-d2f9-4c22-937e-ccbe63e8813b",
+    "traj-cb03e2f0-4727-4aac-9b3d-e85a34debe3f",
+    "traj-e1af53f8-c308-4cf6-a5a2-430858cdd615",
+    "traj-ea713056-2e30-43e6-a7b5-5fb21bad52b7",
+    "traj-f53ce840-b5ad-45d1-bb3b-07161ad67487"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023347",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ef6332a7-6380-4bac-b564-07efde275167.json b/docs/training-reports/report-ef6332a7-6380-4bac-b564-07efde275167.json
new file mode 100644
index 0000000..7784a33
--- /dev/null
+++ b/docs/training-reports/report-ef6332a7-6380-4bac-b564-07efde275167.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ef6332a7-6380-4bac-b564-07efde275167",
+  "timestamp": "2026-04-14T21:21:15.079182+00:00",
+  "source_trajectory_ids": [
+    "traj-1eba9860-e518-47dd-a740-fdab80c52396",
+    "traj-2a0349b1-6962-47df-bef1-d8ad96ffc420",
+    "traj-3115b21b-166d-4491-a11c-f4964af6a33f",
+    "traj-3e7be7f5-3de5-48ab-b1d2-8d648af7f4e6",
+    "traj-3eca0352-8b8c-4b4e-8c62-c844caae2a3a",
+    "traj-3fa32280-6653-488d-8c88-f6f9ea771191",
+    "traj-59b08da1-87fc-4739-ad8a-de335000ed7a",
+    "traj-64807cf0-c788-4627-b76b-798aca232a74",
+    "traj-760bf2da-d4a5-4e2e-844a-3985ef77895d",
+    "traj-c33fd39d-479c-421f-abf5-39704848038e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212115",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f2e9a5a6-2e1e-479f-b2ed-596df0646cc7.json b/docs/training-reports/report-f2e9a5a6-2e1e-479f-b2ed-596df0646cc7.json
new file mode 100644
index 0000000..338b06c
--- /dev/null
+++ b/docs/training-reports/report-f2e9a5a6-2e1e-479f-b2ed-596df0646cc7.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-f2e9a5a6-2e1e-479f-b2ed-596df0646cc7",
+  "timestamp": "2026-04-14T15:25:05.984437+00:00",
+  "source_trajectory_ids": [
+    "traj-001f5738-83e9-4dbc-aadf-7521abadfd40",
+    "traj-2d2e0090-3e96-427d-b74a-3e1189efb210",
+    "traj-5efe4bed-ff80-4c62-82ee-e324b9dd157c",
+    "traj-6283a946-ff0f-4360-8cff-40c3272c12b7",
+    "traj-702659a1-47b5-44a7-b402-c4fcae372e58",
+    "traj-7cf0b837-d9cf-43ee-afc2-81ca9eba0d6f",
+    "traj-9a14d982-2ee2-4023-98f3-4ae15bbd18bf",
+    "traj-be916cd5-8cb6-460d-98d9-067f4fad75f2",
+    "traj-ceafa0b8-210c-4bc9-b5aa-0192faaf3d48",
+    "traj-f4223a89-a6d7-4e5d-950e-c982243f87e3"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-152505"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f31d1eaf-ab94-4afc-93db-987f0c86d72f.json b/docs/training-reports/report-f31d1eaf-ab94-4afc-93db-987f0c86d72f.json
new file mode 100644
index 0000000..0ae3d93
--- /dev/null
+++ b/docs/training-reports/report-f31d1eaf-ab94-4afc-93db-987f0c86d72f.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-f31d1eaf-ab94-4afc-93db-987f0c86d72f",
+  "timestamp": "2026-04-15T01:29:18.310997+00:00",
+  "source_trajectory_ids": [
+    "traj-043c0a2b-fe0c-47f7-942e-37cd9f1a2c52",
+    "traj-2b197bf7-fbf9-49c5-a7d6-bab8fc42eb17",
+    "traj-3c404c3d-bb3b-4069-adf2-23cc85c31953",
+    "traj-53a6e6a3-0fae-49fe-b1bd-f1d246934bea",
+    "traj-55cbd2c3-e0c3-4258-8f1b-02c7bf2a722b",
+    "traj-895ae645-9e2e-4239-b26d-136f3c7a6169",
+    "traj-a19458bf-6b48-46eb-b5b5-1a67d924faf4",
+    "traj-d438f7ce-b221-44da-9bb5-0e1bbf1bf08d",
+    "traj-d9182e76-e379-46f8-a250-5c46138f564c",
+    "traj-f6f774df-1db7-42e0-9bb0-f0aa3dbb4e9e"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-012918",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f614c9d8-c842-4ded-9133-f12f7e335170.json b/docs/training-reports/report-f614c9d8-c842-4ded-9133-f12f7e335170.json
new file mode 100644
index 0000000..8b4140f
--- /dev/null
+++ b/docs/training-reports/report-f614c9d8-c842-4ded-9133-f12f7e335170.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-f614c9d8-c842-4ded-9133-f12f7e335170",
+  "timestamp": "2026-04-14T18:06:58.399935+00:00",
+  "source_trajectory_ids": [
+    "traj-2fb36bd0-d817-49a5-b3e6-56438fda87f0",
+    "traj-60e6ea02-5d69-4f87-82dd-2d115a0fb374",
+    "traj-9200a377-e6ef-4d92-b179-0198e98d7223",
+    "traj-94fff4cb-960d-4888-8c5a-355139ff2c47",
+    "traj-9a524afd-a257-4358-9f6e-211598546789",
+    "traj-cfb9a25d-26b1-43e0-8d9a-5c8199e790e1",
+    "traj-d7d4c8be-12a4-45fe-81e2-d89d49e82222",
+    "traj-ec8805d1-ec26-40ec-bde2-fc84b15e5f23",
+    "traj-f1bd10c8-39ae-48bc-878f-7e858b116185",
+    "traj-f3f8e2b2-7312-465d-95e2-3cc557ce6630"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-180658"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f7cc3f6c-f59d-4e4d-9dca-fcb871db1610.json b/docs/training-reports/report-f7cc3f6c-f59d-4e4d-9dca-fcb871db1610.json
new file mode 100644
index 0000000..cde1c0e
--- /dev/null
+++ b/docs/training-reports/report-f7cc3f6c-f59d-4e4d-9dca-fcb871db1610.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-f7cc3f6c-f59d-4e4d-9dca-fcb871db1610",
+  "timestamp": "2026-04-14T21:22:02.917939+00:00",
+  "source_trajectory_ids": [
+    "traj-1da5679f-c0f8-490b-b08c-903522c31a0a",
+    "traj-48c2b7d6-6fe6-43db-9e92-bf845b2ab039",
+    "traj-7dc29339-2171-4562-af26-66ad33333cea",
+    "traj-7e2d006f-eab7-4a6f-b45f-3a1cd336f311",
+    "traj-8b18e462-7b9a-48d0-8106-27bc0da7b559",
+    "traj-8fb9dfcf-c792-4080-9897-23f71262ca58",
+    "traj-a258951e-5e02-4594-8600-8baa199f1cf3",
+    "traj-e593ee82-9029-4896-aac6-5599dc41975e",
+    "traj-e807c756-0bb9-474f-9664-4f3f1a01113d",
+    "traj-fc39007a-f9a8-4592-adb4-69e51ab99714"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f83a4926-75db-4068-9162-e10b3f1e1f04.json b/docs/training-reports/report-f83a4926-75db-4068-9162-e10b3f1e1f04.json
new file mode 100644
index 0000000..6cc3c72
--- /dev/null
+++ b/docs/training-reports/report-f83a4926-75db-4068-9162-e10b3f1e1f04.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-f83a4926-75db-4068-9162-e10b3f1e1f04",
+  "timestamp": "2026-04-14T14:58:26.730544+00:00",
+  "source_trajectory_ids": [
+    "traj-1c6f3ea2-dd6d-4f3d-86a8-7993c299bb17",
+    "traj-2210d72c-4be6-4c03-891e-17188094863f",
+    "traj-286fefbc-1242-4a01-82c6-504b524180e6",
+    "traj-3c66b440-7854-47fe-aa20-16015d87abc5",
+    "traj-6fe14a99-029d-4613-8ef1-5ff8f771b6d5",
+    "traj-73aa330c-05cc-4f69-b115-2884e9e71893",
+    "traj-9e60ffa0-2303-463d-bb2c-c1466ae52fc6",
+    "traj-a5454786-4882-4e76-9e9a-78f7402296c9",
+    "traj-b074de5f-5d0b-49e4-83aa-528d94881baa",
+    "traj-b21963be-1683-4287-b037-ff4d206a5583"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-f86438b5-d5a5-45c9-9a02-edc6fc1beea6.json b/docs/training-reports/report-f86438b5-d5a5-45c9-9a02-edc6fc1beea6.json
new file mode 100644
index 0000000..7fb69f2
--- /dev/null
+++ b/docs/training-reports/report-f86438b5-d5a5-45c9-9a02-edc6fc1beea6.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-f86438b5-d5a5-45c9-9a02-edc6fc1beea6",
+  "timestamp": "2026-04-14T19:41:58.818358+00:00",
+  "source_trajectory_ids": [
+    "traj-000c5d65-b669-4762-b1ec-ff532e4f175b",
+    "traj-2a5e6577-cce2-4dcb-8e6c-342882dc80be",
+    "traj-4a06eada-beb2-40a7-889f-65c30690ba8d",
+    "traj-4a752185-8f2b-41ed-b37e-e27a48f5cf65",
+    "traj-57c9952e-0e3c-4f15-b848-304fbdedd1af",
+    "traj-6113d0eb-95ed-4087-8761-93797aef2de7",
+    "traj-6f942715-d773-4dcb-8e17-390ad54e6a1c",
+    "traj-84022ea5-84d2-4bee-84c9-4145d1ce2252",
+    "traj-b09eb135-5dd2-45fc-931a-462b0820ad69",
+    "traj-cbd0f9e5-c314-474f-a89a-f5c08f7094c4"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-facd9bf0-010d-4367-acc7-9f3ecf45ff2a.json b/docs/training-reports/report-facd9bf0-010d-4367-acc7-9f3ecf45ff2a.json
new file mode 100644
index 0000000..cc3699d
--- /dev/null
+++ b/docs/training-reports/report-facd9bf0-010d-4367-acc7-9f3ecf45ff2a.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-facd9bf0-010d-4367-acc7-9f3ecf45ff2a",
+  "timestamp": "2026-04-14T21:22:02.838599+00:00",
+  "source_trajectory_ids": [
+    "traj-1e2f6e46-e1f7-4f85-96e5-2c0a402b4730",
+    "traj-68aa8a8d-0a4e-4b7d-9a9a-4dabdd9314ba",
+    "traj-7731fbbb-5783-4448-bbd8-33d490678333",
+    "traj-7b0500f4-5a59-4027-8158-163a8c418cf4",
+    "traj-8c978479-17bd-4a26-8f92-e96cfc9c0939",
+    "traj-b1c96539-10c6-4846-b1a8-60dc1bcd530b",
+    "traj-c3e65f27-e23f-4671-bad0-a93b45312dc7",
+    "traj-d337c078-637a-43ac-b40d-25cb7fd9c7a5",
+    "traj-e9f0dacb-86bd-49f9-a3eb-3dd58c8cd3b2",
+    "traj-fae22657-16c7-4936-8f9b-e68c7044fb87"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-212202",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-fb17f5e0-1dda-45bc-9b62-e81f630036fc.json b/docs/training-reports/report-fb17f5e0-1dda-45bc-9b62-e81f630036fc.json
new file mode 100644
index 0000000..54c9db7
--- /dev/null
+++ b/docs/training-reports/report-fb17f5e0-1dda-45bc-9b62-e81f630036fc.json
@@ -0,0 +1,45 @@
+{
+  "report_id": "report-fb17f5e0-1dda-45bc-9b62-e81f630036fc",
+  "timestamp": "2026-04-14T18:58:16.333603+00:00",
+  "source_trajectory_ids": [
+    "traj-05893f5d-61e8-49bf-a396-dda96476bf3f",
+    "traj-20d60006-2a38-44f9-90a2-dce4d2f10f57",
+    "traj-2a867bbd-0dca-413f-917b-5c38013f8c34",
+    "traj-6abe7a50-b7f4-426a-99ea-84c6601b7cc6",
+    "traj-7464de1e-e8a9-4057-a457-9ce9f0b87fe4",
+    "traj-985f0d65-67ca-42d2-9ec8-93a97eb393df",
+    "traj-a3e1a3d3-6a9d-4d9c-9630-a14fc6e59364",
+    "traj-a5876914-5344-498d-92f2-e0be0345508c",
+    "traj-d63ccd70-74cd-4120-9cb4-3e81625440b2",
+    "traj-e0cdedd2-5bbb-4703-ad55-791637c624c0"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Reward delta 0.0000 below minimum 1.0"
+    ],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": null,
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-fbee2bc3-2d42-4054-8977-0de42e4f9677.json b/docs/training-reports/report-fbee2bc3-2d42-4054-8977-0de42e4f9677.json
new file mode 100644
index 0000000..893d2a8
--- /dev/null
+++ b/docs/training-reports/report-fbee2bc3-2d42-4054-8977-0de42e4f9677.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-fbee2bc3-2d42-4054-8977-0de42e4f9677",
+  "timestamp": "2026-04-14T18:28:06.051428+00:00",
+  "source_trajectory_ids": [
+    "traj-1806edd7-1665-464d-8fe8-81e08b68508c",
+    "traj-1853aaf9-58cd-482d-add6-20732f06d63f",
+    "traj-18d7697a-b1dc-4756-aee5-05f9abdc92a9",
+    "traj-5be27c08-5b53-45c6-9426-25bcc1c9e586",
+    "traj-aa39d764-ad5c-4f6f-b111-6865899d2039",
+    "traj-b5eb758a-12b7-4a88-8706-b0dd75af3682",
+    "traj-b73439b8-a2a2-4b95-a664-988d94633db9",
+    "traj-bc97b4da-12c5-4c11-9411-55c09801bb77",
+    "traj-c197926c-7548-4fc5-8411-eb524e80f8f4",
+    "traj-fd6129ca-2a05-4fad-b498-5b648ae60313"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-182806"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-fc2280f9-4f45-4c80-863f-9f3230680adf.json b/docs/training-reports/report-fc2280f9-4f45-4c80-863f-9f3230680adf.json
new file mode 100644
index 0000000..7389a8b
--- /dev/null
+++ b/docs/training-reports/report-fc2280f9-4f45-4c80-863f-9f3230680adf.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-fc2280f9-4f45-4c80-863f-9f3230680adf",
+  "timestamp": "2026-04-14T21:42:45.963717+00:00",
+  "source_trajectory_ids": [
+    "traj-04ce8768-6cdf-4845-a641-4a5399053067",
+    "traj-6b1f94e1-effd-4194-be51-14fc8df6c591",
+    "traj-739ff5b8-6fd7-4626-81bc-6f8090a51af6",
+    "traj-7452c5c4-4024-465c-8c77-6dbd5ea25a92",
+    "traj-79cd3a80-9ae2-419a-9862-cd1d75791dab",
+    "traj-94e88578-9401-4e6f-af9d-1bdae819061f",
+    "traj-96f4c259-0f1f-441d-8804-4fc44cd9596c",
+    "traj-d5525102-21c8-4d3c-aaa5-c28be4f954a4",
+    "traj-fb5b223a-6284-4b29-88cd-4c54a4706a15",
+    "traj-feba8df4-9acc-423d-a9e6-d57fc11e8121"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-214245",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-fc5436d6-b262-4bd1-b49e-868ed99182ac.json b/docs/training-reports/report-fc5436d6-b262-4bd1-b49e-868ed99182ac.json
new file mode 100644
index 0000000..a8eddb5
--- /dev/null
+++ b/docs/training-reports/report-fc5436d6-b262-4bd1-b49e-868ed99182ac.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-fc5436d6-b262-4bd1-b49e-868ed99182ac",
+  "timestamp": "2026-04-15T02:31:17.315117+00:00",
+  "source_trajectory_ids": [
+    "traj-44949aa6-dd35-4dd1-9559-e57ae1858444",
+    "traj-5e16afa7-2990-4fde-b21a-e34e6b1bd4d7",
+    "traj-7d92d69d-0718-4486-89e7-1b8646058403",
+    "traj-84ede30b-c5bf-43f4-85b7-0876cc551cb9",
+    "traj-bbbb4bfd-5a77-4784-a3fb-64b7ef345cff",
+    "traj-c17c9054-f6c6-4fe9-a146-81b927c1f422",
+    "traj-d53098c2-e164-4539-bc67-88625530eac5",
+    "traj-d8c78344-76c8-417d-bc40-135e6a6b132f",
+    "traj-de33c53a-a9e2-4015-8005-64b98d6fe5c1",
+    "traj-ee7a4e9f-bdf8-4a82-8829-fd02d60a045a"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260415-023117",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-fe3e7a38-a614-4a5d-966a-917f76bbb2a1.json b/docs/training-reports/report-fe3e7a38-a614-4a5d-966a-917f76bbb2a1.json
new file mode 100644
index 0000000..0b13cf3
--- /dev/null
+++ b/docs/training-reports/report-fe3e7a38-a614-4a5d-966a-917f76bbb2a1.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-fe3e7a38-a614-4a5d-966a-917f76bbb2a1",
+  "timestamp": "2026-04-14T19:21:09.845980+00:00",
+  "source_trajectory_ids": [
+    "traj-008b6a44-c3f9-471c-877a-7d611030dc20",
+    "traj-07292070-2eb1-4f02-bf11-100203c05ee6",
+    "traj-1f81e7ba-64be-4375-bade-2be3a7a664ba",
+    "traj-295fcff4-376d-4070-af6b-373692f36397",
+    "traj-2da2ec0d-4ca4-4a1e-b506-9605a1f3b1e9",
+    "traj-595b8736-b78d-419f-9053-2159ce9120be",
+    "traj-76d980b2-94c8-4a9e-ab06-8ea3ba1e1714",
+    "traj-8a315c2a-cb9d-429e-a092-d7d0e1fd33b2",
+    "traj-d97d51a1-af38-46e2-9701-22721a93fc8a",
+    "traj-f2c72660-912a-467e-b2f9-a42253485446"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-192109",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-feb730d9-1798-4ca6-aa23-808735bcf1f0.json b/docs/training-reports/report-feb730d9-1798-4ca6-aa23-808735bcf1f0.json
new file mode 100644
index 0000000..1aaf62c
--- /dev/null
+++ b/docs/training-reports/report-feb730d9-1798-4ca6-aa23-808735bcf1f0.json
@@ -0,0 +1,41 @@
+{
+  "report_id": "report-feb730d9-1798-4ca6-aa23-808735bcf1f0",
+  "timestamp": "2026-04-14T15:01:23.368695+00:00",
+  "source_trajectory_ids": [
+    "traj-21260677-0787-42a3-b8bd-6c95997ae207",
+    "traj-433b056e-c53a-43c3-8dd8-67b13348b777",
+    "traj-4b742a19-8d3f-4f03-bd27-30a0037a6922",
+    "traj-5766dd21-89ea-410c-b7f9-683ec31c6688",
+    "traj-7e676561-ad0d-42ff-8816-efa7d046bd50",
+    "traj-91077d5b-f8bf-4c87-8d09-d6e093b5bb88",
+    "traj-b5efa56d-f6c0-458c-9e36-7eacd5e3aecd",
+    "traj-c1ae0253-b64c-4729-8091-86f4cf1db32f",
+    "traj-dac9bf7d-93a8-4152-9e15-cd9d3ec106ac",
+    "traj-fd5264e2-8196-440d-824d-178554b9f3b5"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-150123"
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ff04fe09-ef40-4a66-9686-d40e1e44a435.json b/docs/training-reports/report-ff04fe09-ef40-4a66-9686-d40e1e44a435.json
new file mode 100644
index 0000000..cc77888
--- /dev/null
+++ b/docs/training-reports/report-ff04fe09-ef40-4a66-9686-d40e1e44a435.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ff04fe09-ef40-4a66-9686-d40e1e44a435",
+  "timestamp": "2026-04-14T22:08:57.386324+00:00",
+  "source_trajectory_ids": [
+    "traj-01aac751-1501-416a-9dc0-6a468a475b86",
+    "traj-576398f8-1d5a-48c9-b034-773996eb879b",
+    "traj-612968e5-793c-4837-a47d-ba9ae22a1b29",
+    "traj-6dac8d03-c298-423b-9b69-f3bd76274710",
+    "traj-961d6d99-672d-41b2-a1b5-67aeb2de2adf",
+    "traj-a5ea2801-0117-4ec3-859b-ecef47344247",
+    "traj-a97fe01e-4d79-4db4-8e56-8bac791ce374",
+    "traj-c866e29e-706b-41da-aa85-ba32786b1589",
+    "traj-d909c5a3-00c7-49e9-9a5e-871622fe5c1c",
+    "traj-f468128e-3e64-4537-99d8-39168035c1cf"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": 0.0,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": 0.0,
+      "baseline_avg_reward": 0.44,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-220857",
+  "baseline_version_id": null,
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-ff8f03c8-8567-4a59-af68-fc54fe4f5240.json b/docs/training-reports/report-ff8f03c8-8567-4a59-af68-fc54fe4f5240.json
new file mode 100644
index 0000000..0ae04b5
--- /dev/null
+++ b/docs/training-reports/report-ff8f03c8-8567-4a59-af68-fc54fe4f5240.json
@@ -0,0 +1,43 @@
+{
+  "report_id": "report-ff8f03c8-8567-4a59-af68-fc54fe4f5240",
+  "timestamp": "2026-04-14T18:58:16.279667+00:00",
+  "source_trajectory_ids": [
+    "traj-02b7fb3c-5b02-4951-9591-595dd475bbb2",
+    "traj-09e1f07a-a743-4ed4-8960-3714fcdf9b63",
+    "traj-0e2dc646-2625-4875-aaef-1509c73d6318",
+    "traj-11563e48-055c-48b7-9811-ea869d92942e",
+    "traj-3032352a-d96b-4eea-8f98-361689d8c2a2",
+    "traj-77944f35-49b0-4673-b74a-4721236d6bf7",
+    "traj-90ca8f92-7a02-4f4b-85ae-46acfba2710c",
+    "traj-aac78c6e-3ad4-4d52-9d77-1e734e9b267f",
+    "traj-d2e80efb-0a35-4a3d-82a6-50161a81693c",
+    "traj-d3d3f7f6-52e6-4142-8e5a-6a98c54636b7"
+  ],
+  "sample_count": 10,
+  "baseline_metrics": {
+    "task_count": 1,
+    "avg_reward": 1.032,
+    "error_rate": 0.0,
+    "avg_latency_ms": 42.0
+  },
+  "challenger_metrics": {
+    "task_count": 1,
+    "avg_reward": 0.44,
+    "error_rate": 0.0,
+    "avg_latency_ms": 0.0
+  },
+  "promotion_decision": {
+    "accepted": true,
+    "reasons": [],
+    "metrics": {
+      "reward_delta": -0.592,
+      "error_rate_delta": 0.0,
+      "latency_delta_ms": -42.0,
+      "baseline_avg_reward": 1.032,
+      "challenger_avg_reward": 0.44
+    }
+  },
+  "promoted_version_id": "20260414-185816",
+  "baseline_version_id": "v-baseline",
+  "dry_run": false
+}
\ No newline at end of file
diff --git a/docs/training-reports/report-skipped-0.json b/docs/training-reports/report-skipped-0.json
new file mode 100644
index 0000000..82de6ac
--- /dev/null
+++ b/docs/training-reports/report-skipped-0.json
@@ -0,0 +1,17 @@
+{
+  "report_id": "report-skipped-0",
+  "timestamp": "2026-04-15T02:33:47.840638+00:00",
+  "source_trajectory_ids": [],
+  "sample_count": 0,
+  "baseline_metrics": {},
+  "challenger_metrics": {},
+  "promotion_decision": {
+    "accepted": false,
+    "reasons": [
+      "Too few new trajectories (2 < 5)"
+    ],
+    "metrics": {}
+  },
+  "promoted_version_id": null,
+  "skipped": true
+}
\ No newline at end of file
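
Reviewer note: the training reports added above share one JSON shape (`report_id`, `promotion_decision.accepted`, `promotion_decision.metrics.reward_delta`, and an optional `skipped` flag). Several of them, for example `report-fe3e7a38-…` and `report-ff8f03c8-…`, record `"accepted": true` alongside a `reward_delta` of `-0.592`. Whether that is intended depends on promotion policy thresholds that are not visible in these files. The sketch below is one way to tally outcomes and surface such cases. The function name `summarize_reports` and the `negative_delta_accepts` key are made up for illustration and are not part of memabra.

```python
import json
from pathlib import Path


def summarize_reports(report_dir):
    """Tally promotion outcomes across report-*.json files in report_dir.

    Hypothetical helper: it only assumes the JSON keys visible in the reports.
    """
    summary = {"accepted": 0, "rejected": 0, "skipped": 0, "negative_delta_accepts": []}
    for path in sorted(Path(report_dir).glob("report-*.json")):
        report = json.loads(path.read_text(encoding="utf-8"))
        # "skipped" is absent from most reports, so read it with a default.
        if report.get("skipped", False):
            summary["skipped"] += 1
            continue
        decision = report.get("promotion_decision", {})
        if not decision.get("accepted", False):
            summary["rejected"] += 1
            continue
        summary["accepted"] += 1
        # Record accepted promotions whose challenger scored below the baseline.
        delta = decision.get("metrics", {}).get("reward_delta", 0.0)
        if delta < 0:
            summary["negative_delta_accepts"].append(report["report_id"])
    return summary
```

Pointing this at `docs/training-reports` would list the report IDs worth a second look before trusting the promoted versions.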
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 0000000..6d98fb0
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,27 @@
+[project]
+name = "memabra"
+version = "0.1.0"
+description = "An intuition-driven control plane for agent memory and action selection."
+readme = "README.md"
+requires-python = ">=3.11"
+dependencies = [
+    "pyyaml>=6.0",
+]
+
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0",
+]
+
+[project.scripts]
+memabra = "memabra.cli:main"
+
+[build-system]
+requires = ["setuptools>=61.0", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[tool.setuptools.packages.find]
+where = ["src"]
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
diff --git a/src/memabra/__init__.py b/src/memabra/__init__.py
new file mode 100644
index 0000000..0874f52
--- /dev/null
+++ b/src/memabra/__init__.py
@@ -0,0 +1,73 @@
+"""memabra: intuition-driven control plane for agent memory and action selection."""
+
+from . import (
+    app,
+    artifact_index,
+    benchmarks,
+    candidate_types,
+    case_index,
+    dataset,
+    evaluator,
+    execution,
+    memory_store,
+    online_learning,
+    outcome,
+    persistence,
+    promotion,
+    replay,
+    retrieval,
+    reward,
+    router,
+    router_versioning,
+    runner,
+    schemas,
+    telemetry,
+    training_reports,
+    trajectory_summary,
+)
+from .benchmarks import BenchmarkSuite, BenchmarkTask
+from .case_index import CaseIndex
+from .online_learning import OnlineLearningCoordinator
+from .promotion import PromotionDecision, PromotionPolicy
+from .training_reports import TrainingReportStore
+
+__all__ = [
+    "app",
+    "artifact_index",
+    "benchmarks",
+    "BenchmarkSuite",
+    "BenchmarkTask",
+    "candidate_types",
+    "case_index",
+    "CaseIndex",
+    "cli",
+    "dataset",
+    "evaluator",
+    "execution",
+    "memory_store",
+    "online_learning",
+    "OnlineLearningCoordinator",
+    "outcome",
+    "persistence",
+    "promotion",
+    "PromotionDecision",
+    "PromotionPolicy",
+    "replay",
+    "retrieval",
+    "reward",
+    "router",
+    "router_versioning",
+    "runner",
+    "schemas",
+    "telemetry",
+    "training_reports",
+    "trajectory_summary",
+    "TrainingReportStore",
+]
+
+
+def __getattr__(name: str):
+    if name == "cli":  # "cli" is listed in __all__ but imported lazily, on first access
+        from . import cli as _cli
+        return _cli
+    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
diff --git a/src/memabra/app.py b/src/memabra/app.py
new file mode 100644
index 0000000..37562f5
--- /dev/null
+++ b/src/memabra/app.py
@@ -0,0 +1,308 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+from .artifact_index import ArtifactIndex
+from .candidate_types import CandidateObject
+from .case_index import CaseIndex
+from .dataset import DatasetBuilder
+from .execution import ExecutionEngine, FileSystemSkillBackend
+from .memory_store import InMemoryMemoryStore, MemoryRecord, MemorySource
+from .online_learning import OnlineLearningCoordinator
+from .persistence import PersistenceStore
+from .promotion import PromotionPolicy
+from .replay import ReplaySummary, TrajectoryReplay
+from .retrieval import CandidateRetriever, InMemoryCandidateProvider
+from .router import RouterProtocol, RuleBasedRouter, SimpleLearningRouter, TaskContext
+from .router_versioning import RouterVersionStore
+from .runner import MemabraRunner
+
+
+class DemoToolBackend:
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict | None = None) -> dict:
+        return {
+            "status": "success",
+            "output": f"demo-result-for:{tool_id}",
+            "error": None,
+            "latency_ms": 42,
+        }
+
+
+class DemoSkillBackend:
+    def load_skill(self, skill_id: str) -> dict:
+        return {
+            "skill_id": skill_id,
+            "instructions": "Demo skill payload loaded successfully.",
+        }
+
+
+@dataclass(slots=True)
+class MemabraApp:
+    runner: MemabraRunner
+    persistence_store: PersistenceStore
+    case_index: CaseIndex | None = None
+
+    def run_task(self, user_input: str, *, channel: str = "local", user_id: str | None = None) -> dict:
+        return self.runner.run(
+            context=TaskContext(user_input=user_input),
+            channel=channel,
+            user_id=user_id,
+            persist=True,
+        )
+
+    def replay_summary(self) -> ReplaySummary:
+        return TrajectoryReplay().summarize_persistence_store(self.persistence_store)
+
+    def artifact_index(self) -> ArtifactIndex:
+        return ArtifactIndex(persistence_store=self.persistence_store)
+
+    def set_router(self, router: RouterProtocol) -> None:
+        self.runner.router = router
+
+    def train_learning_router(self) -> SimpleLearningRouter:
+        index = self.artifact_index()
+        trajectories = index.query()
+        if not trajectories:
+            return SimpleLearningRouter()
+        builder = DatasetBuilder()
+        samples = builder.build(trajectories)
+        router = SimpleLearningRouter()
+        router.fit(samples)
+        return router
+
+    def save_learning_router(
+        self,
+        version_id: str | None = None,
+        base_dir: str | Path = "docs/projects/memabra/router-versions",
+        metadata: dict[str, Any] | None = None,
+    ) -> dict[str, Any]:
+        if not isinstance(self.runner.router, SimpleLearningRouter):
+            raise TypeError("Current router is not a SimpleLearningRouter.")
+        store = RouterVersionStore(base_dir=base_dir)
+        return store.save(self.runner.router, version_id=version_id, metadata=metadata)
+
+    def load_learning_router(
+        self,
+        version_id: str | None = None,
+        base_dir: str | Path = "docs/projects/memabra/router-versions",
+    ) -> SimpleLearningRouter:
+        store = RouterVersionStore(base_dir=base_dir)
+        router = store.load(version_id)
+        self.runner.router = router
+        return router
+
+    def list_router_versions(
+        self,
+        base_dir: str | Path = "docs/projects/memabra/router-versions",
+    ) -> list[dict[str, Any]]:
+        store = RouterVersionStore(base_dir=base_dir)
+        return store.list_versions()
+
+    def run_online_learning_cycle(
+        self,
+        policy: PromotionPolicy,
+        benchmark_tasks: list,
+        min_new_trajectories: int = 5,
+        version_store_base_dir: str | Path = "docs/projects/memabra/router-versions",
+        report_store_base_dir: str | Path = "docs/projects/memabra/training-reports",
+        seen_trajectory_store: str | Path | None = None,
+        dry_run: bool = False,
+        baseline_version_id: str | None = None,
+        case_index_path: str | Path | None = None,
+    ) -> dict[str, Any]:
+        coordinator = OnlineLearningCoordinator(
+            app=self,
+            policy=policy,
+            benchmark_tasks=benchmark_tasks,
+            min_new_trajectories=min_new_trajectories,
+            version_store_base_dir=version_store_base_dir,
+            report_store_base_dir=report_store_base_dir,
+            seen_trajectory_store=seen_trajectory_store,
+            case_index_path=case_index_path,
+        )
+        return coordinator.run_cycle(dry_run=dry_run, baseline_version_id=baseline_version_id)
+
+    def build_case_index(self) -> CaseIndex:
+        index = self.artifact_index()
+        case_index = CaseIndex()
+        for trajectory in index.query():
+            case_index.add(trajectory)
+        self.case_index = case_index
+        self.runner.case_index = case_index
+        return case_index
+
+    def save_case_index(self, path: str | Path) -> None:
+        if self.case_index is None:
+            raise RuntimeError("No case index loaded. Call build_case_index() or load_case_index() first.")
+        self.case_index.save(path)
+
+    def load_case_index(self, path: str | Path) -> CaseIndex:
+        case_index = CaseIndex.load(path)
+        self.case_index = case_index
+        self.runner.case_index = case_index
+        return case_index
+
+    def best_trajectory_for(self, input_text: str) -> str | None:
+        if self.case_index is None:
+            return None
+        return self.case_index.best(input_text)
+
+
+def build_demo_app(*, base_dir: str | Path = "artifacts") -> MemabraApp:
+    memory_store = InMemoryMemoryStore()
+    memory_store.upsert(
+        MemoryRecord(
+            id="mem-telegram-pref",
+            memory_type="semantic",
+            fact_status="verified",
+            content="Prefer plain text on Telegram.",
+            summary="Telegram plain-text preference",
+            source=MemorySource(kind="user", ref="demo-seed"),
+            confidence=0.95,
+            tags=["telegram", "output"],
+        )
+    )
+
+    providers = [
+        InMemoryCandidateProvider(
+            candidate_type="memory",
+            candidates=[
+                CandidateObject(
+                    id="mem-telegram-pref",
+                    type="memory",
+                    title="Telegram preference",
+                    summary="Prefer plain text on Telegram.",
+                    triggers=["telegram", "preference", "answer"],
+                    confidence=0.95,
+                    success_rate=0.9,
+                    freshness=0.9,
+                    tags=["output"],
+                    source="user",
+                )
+            ],
+        ),
+        InMemoryCandidateProvider(
+            candidate_type="skill",
+            candidates=[
+                CandidateObject(
+                    id="skill-deploy",
+                    type="skill",
+                    title="Deploy workflow",
+                    summary="Reusable deployment workflow.",
+                    triggers=["deploy", "workflow", "service"],
+                    confidence=0.8,
+                    success_rate=0.9,
+                    freshness=0.8,
+                    tags=["ops"],
+                    source="system",
+                )
+            ],
+        ),
+        InMemoryCandidateProvider(
+            candidate_type="tool",
+            candidates=[
+                CandidateObject(
+                    id="tool-terminal",
+                    type="tool",
+                    title="terminal",
+                    summary="Run terminal-style inspection commands.",
+                    triggers=["check", "current", "status", "system"],
+                    confidence=0.95,
+                    success_rate=0.9,
+                    freshness=1.0,
+                    tags=["inspection"],
+                    source="system",
+                )
+            ],
+        ),
+    ]
+
+    persistence_store = PersistenceStore(base_dir=base_dir)
+    runner = MemabraRunner(
+        retriever=CandidateRetriever(providers),
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(tool_backend=DemoToolBackend(), skill_backend=DemoSkillBackend()),
+        persistence_store=persistence_store,
+        memory_store=memory_store,
+    )
+    return MemabraApp(runner=runner, persistence_store=persistence_store)
+
+
+def build_app_with_skills(
+    *,
+    base_dir: str | Path = "artifacts",
+    skill_search_paths: list[str | Path] | None = None,
+) -> MemabraApp:
+    """Build a MemabraApp that loads real skills from the filesystem.
+
+    By default it searches ~/.hermes/skills for SKILL.md files.
+    If a requested skill_id is not found on disk, the skill executor
+    will return an error payload in the trajectory events.
+    """
+    memory_store = InMemoryMemoryStore()
+    memory_store.upsert(
+        MemoryRecord(
+            id="mem-telegram-pref",
+            memory_type="semantic",
+            fact_status="verified",
+            content="Prefer plain text on Telegram.",
+            summary="Telegram plain-text preference",
+            source=MemorySource(kind="user", ref="demo-seed"),
+            confidence=0.95,
+            tags=["telegram", "output"],
+        )
+    )
+
+    providers = [
+        InMemoryCandidateProvider(
+            candidate_type="memory",
+            candidates=[
+                CandidateObject(
+                    id="mem-telegram-pref",
+                    type="memory",
+                    title="Telegram preference",
+                    summary="Prefer plain text on Telegram.",
+                    triggers=["telegram", "preference", "answer"],
+                    confidence=0.95,
+                    success_rate=0.9,
+                    freshness=0.9,
+                    tags=["output"],
+                    source="user",
+                )
+            ],
+        ),
+        InMemoryCandidateProvider(
+            candidate_type="skill",
+            candidates=[],
+        ),
+        InMemoryCandidateProvider(
+            candidate_type="tool",
+            candidates=[
+                CandidateObject(
+                    id="tool-terminal",
+                    type="tool",
+                    title="terminal",
+                    summary="Run terminal-style inspection commands.",
+                    triggers=["check", "current", "status", "system"],
+                    confidence=0.95,
+                    success_rate=0.9,
+                    freshness=1.0,
+                    tags=["inspection"],
+                    source="system",
+                )
+            ],
+        ),
+    ]
+
+    skill_backend = FileSystemSkillBackend(search_paths=skill_search_paths)
+    persistence_store = PersistenceStore(base_dir=base_dir)
+    runner = MemabraRunner(
+        retriever=CandidateRetriever(providers),
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(tool_backend=DemoToolBackend(), skill_backend=skill_backend),
+        persistence_store=persistence_store,
+        memory_store=memory_store,
+    )
+    return MemabraApp(runner=runner, persistence_store=persistence_store)
diff --git a/src/memabra/artifact_index.py b/src/memabra/artifact_index.py
new file mode 100644
index 0000000..55e56e4
--- /dev/null
+++ b/src/memabra/artifact_index.py
@@ -0,0 +1,104 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from .persistence import PersistenceStore
+
+
+@dataclass
+class ArtifactIndex:
+    persistence_store: PersistenceStore | None = None
+    base_dir: str | Path | None = None
+    _trajectories: list[dict[str, Any]] = field(default_factory=list, repr=False)
+
+    def __post_init__(self):
+        if self.persistence_store is None and self.base_dir is None:
+            raise ValueError("Either persistence_store or base_dir must be provided")
+        self.refresh()
+
+    def refresh(self) -> None:
+        paths = self._list_trajectory_paths()
+        self._trajectories = []
+        for path in paths:
+            try:
+                trajectory = self._load_trajectory(path)
+                self._trajectories.append(trajectory)
+            except Exception:
+                continue  # silently skip any trajectory that fails to load
+
+    def query(
+        self,
+        *,
+        status: str | None = None,
+        min_reward: float | None = None,
+        max_reward: float | None = None,
+        decision_type: str | None = None,
+        channel: str | None = None,
+        min_tool_errors: int | None = None,
+        min_user_corrections: int | None = None,
+        input_contains: str | None = None,
+    ) -> list[dict[str, Any]]:
+        results = []
+        for trajectory in self._trajectories:
+            if status is not None and trajectory["outcome"]["status"] != status:
+                continue
+            reward_total = trajectory["reward"]["total"]
+            if min_reward is not None and reward_total < min_reward:
+                continue
+            if max_reward is not None and reward_total > max_reward:
+                continue
+            if decision_type is not None:
+                decisions = trajectory.get("decisions", [])
+                if not any(d["decision_type"] == decision_type for d in decisions):
+                    continue
+            if channel is not None and trajectory["task"]["channel"] != channel:
+                continue
+            if min_tool_errors is not None and trajectory["outcome"]["tool_errors"] < min_tool_errors:
+                continue
+            if min_user_corrections is not None and trajectory["outcome"]["user_corrections"] < min_user_corrections:
+                continue
+            if input_contains is not None:
+                task_input = trajectory["task"]["input"]
+                if input_contains.lower() not in task_input.lower():
+                    continue
+            results.append(trajectory)
+        return results
+
+    def slice_dataset(
+        self,
+        *,
+        status: str | None = None,
+        min_reward: float | None = None,
+        max_reward: float | None = None,
+        decision_type: str | None = None,
+        channel: str | None = None,
+        min_tool_errors: int | None = None,
+        min_user_corrections: int | None = None,
+        input_contains: str | None = None,
+    ) -> list[str]:
+        results = self.query(
+            status=status,
+            min_reward=min_reward,
+            max_reward=max_reward,
+            decision_type=decision_type,
+            channel=channel,
+            min_tool_errors=min_tool_errors,
+            min_user_corrections=min_user_corrections,
+            input_contains=input_contains,
+        )
+        return [r["trajectory_id"] for r in results]
+
+    def _list_trajectory_paths(self) -> list[Path]:
+        if self.persistence_store is not None:
+            return self.persistence_store.list_trajectory_paths()
+        return sorted(Path(self.base_dir).glob("*.json"))
+
+    def _load_trajectory(self, path: Path) -> dict[str, Any]:
+        if self.persistence_store is not None:
+            trajectory_id = path.stem
+            return self.persistence_store.load_trajectory(trajectory_id)
+        import json
+
+        return json.loads(path.read_text(encoding="utf-8"))
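
Reviewer note: `ArtifactIndex.query` indexes directly into each trajectory dict, so it implies a minimum schema: `task.input`, `task.channel`, `outcome.status`, `outcome.tool_errors`, `outcome.user_corrections`, `reward.total`, and optionally `decisions[].decision_type`. The sketch below restates a subset of that filter logic on a plain dict so the expected shape is easy to see. The `matches` function, the `example` trajectory, and the `"success"` status value are illustrative assumptions, not code or data from the package.

```python
from typing import Any


def matches(trajectory: dict[str, Any], *, status=None, min_reward=None,
            decision_type=None, input_contains=None) -> bool:
    """Return True if the trajectory passes every filter that is not None."""
    if status is not None and trajectory["outcome"]["status"] != status:
        return False
    if min_reward is not None and trajectory["reward"]["total"] < min_reward:
        return False
    if decision_type is not None:
        # "decisions" is the only key read with a default in the original query.
        decisions = trajectory.get("decisions", [])
        if not any(d["decision_type"] == decision_type for d in decisions):
            return False
    if input_contains is not None:
        # Substring match is case-insensitive, as in the original query.
        if input_contains.lower() not in trajectory["task"]["input"].lower():
            return False
    return True


# Hypothetical trajectory carrying only the keys the filters read.
example = {
    "trajectory_id": "traj-example",
    "task": {"input": "Check current system status", "channel": "local"},
    "decisions": [{"decision_type": "call_tool"}],
    "outcome": {"status": "success", "tool_errors": 0, "user_corrections": 0},
    "reward": {"total": 0.44},
}
```

A trajectory missing `reward` or `outcome` would raise `KeyError` under this logic, which may be worth guarding against if partially written trajectory files can exist on disk.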
diff --git a/src/memabra/benchmarks.py b/src/memabra/benchmarks.py
new file mode 100644
index 0000000..f0e172d
--- /dev/null
+++ b/src/memabra/benchmarks.py
@@ -0,0 +1,63 @@
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+
+from .evaluator import BenchmarkTask
+
+
+@dataclass(slots=True)
+class BenchmarkSuite:
+    name: str
+    tasks: list[BenchmarkTask] = field(default_factory=list)
+    metadata: dict[str, Any] = field(default_factory=dict)
+
+
+def save_benchmark_suite(suite: BenchmarkSuite, path: str | Path) -> None:
+    path = Path(path)
+    record = {
+        "name": suite.name,
+        "tasks": [
+            {
+                "user_input": t.user_input,
+                "channel": t.channel,
+                "user_id": t.user_id,
+            }
+            for t in suite.tasks
+        ],
+        "metadata": suite.metadata,
+    }
+    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
+
+
+def load_benchmark_suite(path: str | Path) -> BenchmarkSuite:
+    path = Path(path)
+    record = json.loads(path.read_text(encoding="utf-8"))
+    tasks = [
+        BenchmarkTask(
+            user_input=t["user_input"],
+            channel=t.get("channel", "local"),
+            user_id=t.get("user_id"),
+        )
+        for t in record.get("tasks", [])
+    ]
+    return BenchmarkSuite(
+        name=record.get("name", "unnamed"),
+        tasks=tasks,
+        metadata=record.get("metadata", {}),
+    )
+
+
+def default_benchmark_suite() -> BenchmarkSuite:
+    return BenchmarkSuite(
+        name="default",
+        tasks=[
+            BenchmarkTask(user_input="Recall my saved preference from memory."),
+            BenchmarkTask(user_input="Run the deploy workflow skill."),
+            BenchmarkTask(user_input="Check current system status with a tool."),
+            BenchmarkTask(user_input="Use multiple capabilities: memory, skill, and tool."),
+        ],
+        metadata={"source": "seed", "description": "Coverage over memory, skill, tool, and composite tasks"},
+    )
diff --git a/src/memabra/candidate_types.py b/src/memabra/candidate_types.py
new file mode 100644
index 0000000..a13b06f
--- /dev/null
+++ b/src/memabra/candidate_types.py
@@ -0,0 +1,30 @@
+from dataclasses import dataclass, field
+from typing import Any, Literal
+
+CandidateType = Literal["memory", "skill", "tool"]
+DecisionType = Literal[
+    "direct_answer",
+    "inject_memory",
+    "load_skill",
+    "call_tool",
+    "clarify",
+    "composite_action",
+]
+
+
+@dataclass(slots=True)
+class CandidateObject:
+    id: str
+    type: CandidateType
+    title: str
+    summary: str
+    triggers: list[str] = field(default_factory=list)
+    cost: float = 0.0
+    confidence: float = 0.0
+    success_rate: float = 0.0
+    freshness: float = 0.0
+    risk: float = 0.0
+    tags: list[str] = field(default_factory=list)
+    source: str = "generated"
+    preconditions: list[str] = field(default_factory=list)
+    type_payload: dict[str, Any] = field(default_factory=dict)
diff --git a/src/memabra/case_index.py b/src/memabra/case_index.py
new file mode 100644
index 0000000..3b28d7b
--- /dev/null
+++ b/src/memabra/case_index.py
@@ -0,0 +1,48 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+
+class CaseIndex:
+    """Simple JSON-backed index that maps normalized task inputs to the best trajectory ID."""
+
+    def __init__(self) -> None:
+        self._index: dict[str, tuple[str, float]] = {}
+
+    @staticmethod
+    def _normalize(text: str) -> str:
+        return " ".join(text.strip().lower().split())
+
+    def add(self, trajectory: dict[str, Any]) -> None:
+        trajectory_id = trajectory["trajectory_id"]
+        task_input = self._normalize(trajectory["task"]["input"])
+        reward = float(trajectory["reward"]["total"])
+        existing = self._index.get(task_input)
+        if existing is None or reward > existing[1]:
+            self._index[task_input] = (trajectory_id, reward)
+
+    def best(self, input_text: str) -> str | None:
+        normalized = self._normalize(input_text)
+        entry = self._index.get(normalized)
+        if entry is None:
+            return None
+        return entry[0]
+
+    def save(self, path: str | Path) -> None:
+        data = {
+            "cases": {
+                task_input: {"trajectory_id": traj_id, "reward": reward}
+                for task_input, (traj_id, reward) in self._index.items()
+            }
+        }
+        Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")
+
+    @classmethod
+    def load(cls, path: str | Path) -> CaseIndex:
+        data = json.loads(Path(path).read_text(encoding="utf-8"))
+        index = cls()
+        for task_input, entry in data.get("cases", {}).items():
+            index._index[task_input] = (entry["trajectory_id"], float(entry["reward"]))
+        return index
diff --git a/src/memabra/cli.py b/src/memabra/cli.py
new file mode 100644
index 0000000..f832582
--- /dev/null
+++ b/src/memabra/cli.py
@@ -0,0 +1,411 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+from .app import build_demo_app
+from .benchmarks import default_benchmark_suite
+from .evaluator import BenchmarkTask, Evaluator
+from .promotion import PromotionPolicy
+
+
+def run_wrapup_workflow(*, base_dir: str | Path = "artifacts") -> dict[str, Any]:
+    base_path = Path(base_dir)
+    app = build_demo_app(base_dir=base_path)
+
+    seed_prompts = [
+        "Use my telegram preference for this answer.",
+        "Check the current system status.",
+        "Deploy this service with the usual workflow.",
+    ]
+    for prompt in seed_prompts:
+        app.run_task(prompt, channel="telegram", user_id="oza")
+
+    seed_summary = app.replay_summary()
+    learning_router = app.train_learning_router()
+
+    evaluator = Evaluator(app)
+    benchmark_tasks = [
+        BenchmarkTask(user_input="Use my telegram preference for this answer.", channel="telegram", user_id="oza"),
+        BenchmarkTask(user_input="Check the current system status.", channel="local", user_id="oza"),
+        BenchmarkTask(user_input="Deploy this service with the usual workflow.", channel="local", user_id="oza"),
+    ]
+    baseline = evaluator.run(benchmark_tasks)
+    challenger = evaluator.run(benchmark_tasks, router=learning_router)
+    comparison = {
+        "baseline": {
+            "avg_reward": baseline.avg_reward,
+            "error_rate": baseline.error_rate,
+            "avg_latency_ms": baseline.avg_latency_ms,
+            "decision_distribution": baseline.decision_distribution,
+        },
+        "challenger": {
+            "avg_reward": challenger.avg_reward,
+            "error_rate": challenger.error_rate,
+            "avg_latency_ms": challenger.avg_latency_ms,
+            "decision_distribution": challenger.decision_distribution,
+        },
+        **evaluator.compare(baseline, challenger),
+    }
+
+    app.set_router(learning_router)
+    saved_version = app.save_learning_router(
+        base_dir=base_path / "router-versions",
+        metadata={
+            "avg_reward": challenger.avg_reward,
+            "task_count": challenger.task_count,
+            "source": "wrapup_workflow",
+        },
+    )
+
+    return {
+        "seed_summary": {
+            "trajectories": seed_summary.trajectories,
+            "success_count": seed_summary.success_count,
+            "failure_count": seed_summary.failure_count,
+            "average_reward": seed_summary.average_reward,
+        },
+        "comparison": comparison,
+        "saved_version": saved_version,
+    }
+
+
+def run_online_learning_workflow(
+    *,
+    base_dir: str | Path = "artifacts",
+    min_new_trajectories: int = 3,
+    seen_trajectory_store: str | Path | None = None,
+    dry_run: bool = False,
+    baseline_version: str | None = None,
+    case_index_path: str | Path | None = None,
+    rebuild_case_index: bool = False,
+) -> dict[str, Any]:
+    base_path = Path(base_dir)
+    app = build_demo_app(base_dir=base_path)
+
+    # Seed demo tasks if no artifacts exist yet
+    if not any((base_path / "trajectories").glob("*.json")):
+        seed_prompts = [
+            "Use my telegram preference for this answer.",
+            "Check the current system status.",
+            "Deploy this service with the usual workflow.",
+            "Recall my saved preference from memory.",
+            "Run the deploy workflow skill.",
+        ]
+        for prompt in seed_prompts:
+            app.run_task(prompt, channel="local")
+
+    # Handle case index loading or rebuilding
+    if case_index_path is not None:
+        case_index_file = Path(case_index_path)
+        if rebuild_case_index:
+            app.build_case_index()
+            app.save_case_index(case_index_file)
+        elif case_index_file.exists():
+            app.load_case_index(case_index_file)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+    benchmark_tasks = default_benchmark_suite().tasks
+
+    result = app.run_online_learning_cycle(
+        policy=policy,
+        benchmark_tasks=benchmark_tasks,
+        min_new_trajectories=min_new_trajectories,
+        version_store_base_dir=base_path / "router-versions",
+        report_store_base_dir=base_path / "training-reports",
+        seen_trajectory_store=seen_trajectory_store,
+        dry_run=dry_run,
+        baseline_version_id=baseline_version,
+        case_index_path=case_index_path,
+    )
+
+    # Serialize dataclass objects for JSON compatibility
+    from dataclasses import asdict
+
+    serializable = {}
+    for key, value in result.items():
+        if hasattr(value, "__dataclass_fields__"):
+            serializable[key] = asdict(value)
+        else:
+            serializable[key] = value
+    return serializable
+
+
+def show_status(*, base_dir: str | Path = "artifacts") -> dict[str, Any]:
+    base_path = Path(base_dir)
+    from .router_versioning import RouterVersionStore
+    from .training_reports import TrainingReportStore
+
+    version_store = RouterVersionStore(base_dir=base_path / "router-versions")
+    report_store = TrainingReportStore(base_dir=base_path / "training-reports")
+
+    current = version_store.get_current()
+    versions = version_store.list_versions()
+    reports = report_store.list_reports()
+    latest_report = reports[-1] if reports else None
+
+    trajectory_dir = base_path / "trajectories"
+    trajectory_count = len(list(trajectory_dir.glob("*.json"))) if trajectory_dir.exists() else 0
+
+    return {
+        "base_dir": str(base_path),
+        "current_version_id": current.get("current_version_id"),
+        "version_count": len(versions),
+        "trajectory_count": trajectory_count,
+        "report_count": len(reports),
+        "latest_report": {
+            "report_id": latest_report.get("report_id"),
+            "timestamp": latest_report.get("timestamp"),
+            "promoted": (latest_report.get("promotion_decision") or {}).get("accepted"),
+        } if latest_report else None,
+    }
+
+
+def format_output(payload: dict[str, Any], *, output_format: str, mode: str) -> str:
+    if output_format == "json":
+        return json.dumps(payload, indent=2, ensure_ascii=False)
+
+    def _as_mapping(value: Any) -> dict[str, Any]:
+        if value is None:
+            return {}
+        if isinstance(value, dict):
+            return value
+        if hasattr(value, "__dataclass_fields__"):
+            from dataclasses import asdict
+
+            return asdict(value)
+        return {
+            key: getattr(value, key)
+            for key in ("avg_reward", "error_rate", "avg_latency_ms", "metrics", "reasons", "accepted")
+            if hasattr(value, key)
+        }
+
+    def _fmt_bool(value: Any) -> str:
+        if value is None:
+            return "none"
+        return "yes" if bool(value) else "no"
+
+    def _fmt_number(value: Any, *, digits: int = 4) -> str:
+        if value is None:
+            return "none"
+        if isinstance(value, bool):
+            return _fmt_bool(value)
+        if isinstance(value, (int, float)):
+            return f"{float(value):.{digits}f}"
+        return str(value)
+
+    if mode == "status":
+        latest_report = payload.get("latest_report") or {}
+        lines = [
+            "Memabra status",
+            "Summary",
+            f"Base dir: {payload.get('base_dir') or 'none'}",
+            f"Current version: {payload.get('current_version_id') or 'none'}",
+            f"Saved versions: {payload.get('version_count', 0)}",
+            f"Trajectory count: {payload.get('trajectory_count', 0)}",
+            f"Training reports: {payload.get('report_count', 0)}",
+            f"Latest report: {latest_report.get('report_id') or 'none'}",
+        ]
+        if latest_report.get("timestamp") is not None:
+            lines.append(f"Latest report time: {latest_report.get('timestamp')}")
+        if latest_report.get("promoted") is not None:
+            lines.append(f"Latest promotion accepted: {_fmt_bool(latest_report.get('promoted'))}")
+        return "\n".join(lines)
+
+    if mode == "list_versions":
+        versions = payload.get("versions", [])
+        current_version_id = payload.get("current_version_id")
+        lines = [f"Saved router versions ({len(versions)} total)"]
+        lines.append(f"Current version: {current_version_id or 'none'}")
+        if not versions:
+            lines.append("(none)")
+            return "\n".join(lines)
+        for index, version in enumerate(versions, start=1):
+            metadata = version.get("metadata") or {}
+            metadata_parts = []
+            if version.get("version_id") == current_version_id:
+                metadata_parts.append("current")
+            if metadata.get("source") is not None:
+                metadata_parts.append(f"source={metadata['source']}")
+            if metadata.get("avg_reward") is not None:
+                metadata_parts.append(f"avg_reward={metadata['avg_reward']}")
+            suffix = f" ({', '.join(metadata_parts)})" if metadata_parts else ""
+            lines.append(f"{index}. {version.get('version_id')}{suffix}")
+        return "\n".join(lines)
+
+    if mode == "rollback":
+        return f"Rolled back current version to: {payload.get('current_version_id') or 'none'}"
+
+    if mode == "workflow":
+        report_id = payload.get("report_id") or "none"
+        lines = [
+            "Memabra online learning result",
+            "Summary",
+            f"Report ID: {report_id}",
+            f"Skipped: {_fmt_bool(payload.get('skipped'))}",
+            f"Promoted: {_fmt_bool(payload.get('promoted'))}",
+        ]
+        if "dry_run" in payload:
+            lines.append(f"Dry run: {_fmt_bool(payload.get('dry_run'))}")
+
+        baseline_metrics = _as_mapping(payload.get("baseline_metrics"))
+        challenger_metrics = _as_mapping(payload.get("challenger_metrics"))
+        decision = _as_mapping(payload.get("decision"))
+        decision_metrics = _as_mapping(decision.get("metrics"))
+
+        if baseline_metrics:
+            lines.extend([
+                "Baseline",
+                f"Reward: {_fmt_number(baseline_metrics.get('avg_reward'))}",
+                f"Error rate: {_fmt_number(baseline_metrics.get('error_rate'))}",
+                f"Latency (ms): {_fmt_number(baseline_metrics.get('avg_latency_ms'))}",
+            ])
+        if challenger_metrics:
+            lines.extend([
+                "Challenger",
+                f"Reward: {_fmt_number(challenger_metrics.get('avg_reward'))}",
+                f"Error rate: {_fmt_number(challenger_metrics.get('error_rate'))}",
+                f"Latency (ms): {_fmt_number(challenger_metrics.get('avg_latency_ms'))}",
+            ])
+        if decision_metrics:
+            lines.extend([
+                "Deltas",
+                f"Reward delta: {_fmt_number(decision_metrics.get('reward_delta'))}",
+                f"Error rate delta: {_fmt_number(decision_metrics.get('error_rate_delta'))}",
+                f"Latency delta (ms): {_fmt_number(decision_metrics.get('latency_delta_ms'))}",
+            ])
+
+        reason = payload.get("reason")
+        if not reason:
+            decision_reasons = decision.get("reasons") or []
+            if decision_reasons:
+                reason = "; ".join(str(item) for item in decision_reasons)
+
+        error = payload.get("error")
+        if reason or error or decision.get("accepted") is not None:
+            lines.append("Decision")
+        if decision.get("accepted") is not None:
+            lines.append(f"Accepted: {_fmt_bool(decision.get('accepted'))}")
+        if reason:
+            lines.append(f"Reason: {reason}")
+        if error:
+            lines.append(f"Error: {error}")
+
+        version_id = payload.get("version_id") or payload.get("promoted_version_id")
+        if version_id:
+            lines.append(f"Version ID: {version_id}")
+        return "\n".join(lines)
+
+    return json.dumps(payload, indent=2, ensure_ascii=False)
+
+
+def main(argv: list[str] | None = None) -> int:
+    import argparse
+    import sys
+
+    if argv is None:
+        argv = sys.argv[1:]
+        # When running under pytest without explicit args, default to run subcommand
+        # to avoid argparse picking up pytest's own command-line arguments.
+        if "pytest" in sys.modules:
+            argv = ["run"]
+
+    # Backward compat: default to 'run' when invoked without a known subcommand
+    known_commands = {"run", "status", "version"}
+    if not argv or argv[0] not in known_commands:
+        if argv and argv[0] in ("-h", "--help"):
+            pass  # let top-level parser show help
+        else:
+            argv = ["run"] + list(argv)
+
+    parser = argparse.ArgumentParser(description="memabra CLI")
+    subparsers = parser.add_subparsers(dest="command")
+
+    run_parser = subparsers.add_parser("run", help="Run the online learning workflow")
+    run_parser.add_argument("--base-dir", default="artifacts", help="Base directory for artifacts")
+    run_parser.add_argument("--min-new-trajectories", type=int, default=3, help="Minimum new trajectories required to run a cycle")
+    run_parser.add_argument("--seen-trajectory-store", default=None, help="Path to persist seen trajectory IDs (defaults to <base-dir>/seen-trajectories.json)")
+    run_parser.add_argument("--dry-run", action="store_true", help="Train and evaluate but do not promote or save a new router version")
+    run_parser.add_argument("--baseline-version", default=None, help="Load a specific router version as the baseline for evaluation")
+    run_parser.add_argument("--case-index", default=None, help="Path to a case index JSON file for episodic retrieval")
+    run_parser.add_argument("--rebuild-case-index", action="store_true", help="Rebuild and save the case index from existing trajectories before running")
+    run_parser.add_argument("--format", choices=("json", "text"), default="json", help="Output format for CLI results")
+
+    status_parser = subparsers.add_parser("status", help="Show system status")
+    status_parser.add_argument("--base-dir", default="artifacts", help="Base directory for artifacts")
+    status_parser.add_argument("--format", choices=("json", "text"), default="json", help="Output format for CLI results")
+
+    version_parser = subparsers.add_parser("version", help="Manage router versions")
+    version_subparsers = version_parser.add_subparsers(dest="version_command")
+
+    list_parser = version_subparsers.add_parser("list", help="List all saved router versions")
+    list_parser.add_argument("--base-dir", default="artifacts", help="Base directory for artifacts")
+    list_parser.add_argument("--format", choices=("json", "text"), default="json", help="Output format for CLI results")
+
+    rollback_parser = version_subparsers.add_parser("rollback", help="Rollback to a specific router version")
+    rollback_parser.add_argument("version_id", help="Router version ID to rollback to")
+    rollback_parser.add_argument("--base-dir", default="artifacts", help="Base directory for artifacts")
+    rollback_parser.add_argument("--format", choices=("json", "text"), default="json", help="Output format for CLI results")
+
+    args = parser.parse_args(args=argv)
+
+    base_path = Path(getattr(args, "base_dir", "artifacts"))  # bare `version` defines no --base-dir
+
+    if args.command == "status":
+        result = show_status(base_dir=base_path)
+        print(format_output(result, output_format=args.format, mode="status"))
+        return 0
+
+    if args.command == "version":
+        from .router_versioning import RouterVersionStore
+
+        store = RouterVersionStore(base_dir=base_path / "router-versions")
+        if args.version_command == "rollback":
+            try:
+                rollback_result = store.rollback(args.version_id)
+            except ValueError as exc:
+                print(str(exc), file=sys.stderr)
+                return 1
+            current_version_id = rollback_result.get("current_version_id")
+            if current_version_id is None:
+                current = store.get_current()
+                current_version_id = current.get("current_version_id")
+            print(format_output({"current_version_id": current_version_id}, output_format=args.format, mode="rollback"))
+            return 0
+        elif args.version_command == "list":
+            versions = store.list_versions()
+            current = store.get_current()
+            print(format_output({"versions": versions, "current_version_id": current.get("current_version_id")}, output_format=args.format, mode="list_versions"))
+            return 0
+        else:
+            version_parser.print_help()
+            return 2
+
+    if args.command == "run":
+        seen_store = args.seen_trajectory_store or str(base_path / "seen-trajectories.json")
+        case_index_path = args.case_index or (str(base_path / "case-index.json") if args.rebuild_case_index else None)
+
+        result = run_online_learning_workflow(
+            base_dir=base_path,
+            min_new_trajectories=args.min_new_trajectories,
+            seen_trajectory_store=seen_store,
+            dry_run=args.dry_run,
+            baseline_version=args.baseline_version,
+            case_index_path=case_index_path,
+            rebuild_case_index=args.rebuild_case_index,
+        )
+        print(format_output(result, output_format=args.format, mode="workflow"))
+        return 0
+
+    parser.print_help()
+    return 2
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/src/memabra/dataset.py b/src/memabra/dataset.py
new file mode 100644
index 0000000..ba96a8f
--- /dev/null
+++ b/src/memabra/dataset.py
@@ -0,0 +1,48 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(slots=True)
+class TrainingSample:
+    input_text: str
+    features: dict[str, float]
+    label: str
+    reward: float
+
+
+class DatasetBuilder:
+    def build(self, trajectories: list[dict[str, Any]]) -> list[TrainingSample]:
+        samples: list[TrainingSample] = []
+        for trajectory in trajectories:
+            task_input = trajectory["task"]["input"]
+            candidate_sets = trajectory["candidate_sets"]
+            decisions = trajectory.get("decisions", [])
+            label = decisions[0]["decision_type"] if decisions else "clarify"
+            reward_total = trajectory["reward"]["total"]
+
+            memory = candidate_sets.get("memory", [])
+            skill = candidate_sets.get("skill", [])
+            tool = candidate_sets.get("tool", [])
+
+            features: dict[str, float] = {
+                "input_length": float(len(task_input)),
+                "memory_count": float(len(memory)),
+                "skill_count": float(len(skill)),
+                "tool_count": float(len(tool)),
+                "top_memory_confidence": max((c.get("confidence", 0.0) for c in memory), default=0.0),
+                "top_skill_success_rate": max((c.get("success_rate", 0.0) for c in skill), default=0.0),
+                "top_tool_confidence": max((c.get("confidence", 0.0) for c in tool), default=0.0),
+                "top_tool_risk": max((c.get("risk", 0.0) for c in tool), default=0.0),
+            }
+
+            samples.append(
+                TrainingSample(
+                    input_text=task_input,
+                    features=features,
+                    label=label,
+                    reward=reward_total,
+                )
+            )
+        return samples
diff --git a/src/memabra/evaluator.py b/src/memabra/evaluator.py
new file mode 100644
index 0000000..39ec575
--- /dev/null
+++ b/src/memabra/evaluator.py
@@ -0,0 +1,94 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING, Any
+
+from .dataset import DatasetBuilder
+from .router import SimpleLearningRouter, TaskContext
+
+if TYPE_CHECKING:
+    from .app import MemabraApp
+
+
+@dataclass(slots=True)
+class BenchmarkTask:
+    user_input: str
+    channel: str = "local"
+    user_id: str | None = None
+
+
+@dataclass(slots=True)
+class EvaluationResult:
+    task_count: int = 0
+    trajectories: list[dict[str, Any]] = field(default_factory=list)
+    avg_reward: float = 0.0
+    decision_distribution: dict[str, int] = field(default_factory=dict)
+    error_rate: float = 0.0
+    avg_latency_ms: float = 0.0
+
+
+class Evaluator:
+    def __init__(self, app: MemabraApp):
+        self.app = app
+
+    def run(self, tasks: list[BenchmarkTask], router=None) -> EvaluationResult:
+        original_router = self.app.runner.router
+        if router is not None:
+            self.app.runner.router = router
+
+        trajectories: list[dict[str, Any]] = []
+        try:
+            for task in tasks:
+                trajectory = self.app.run_task(
+                    task.user_input,
+                    channel=task.channel,
+                    user_id=task.user_id,
+                )
+                trajectories.append(trajectory)
+        finally:
+            self.app.runner.router = original_router
+
+        return self._analyze(trajectories)
+
+    def _analyze(self, trajectories: list[dict[str, Any]]) -> EvaluationResult:
+        if not trajectories:
+            return EvaluationResult()
+
+        total_reward = sum(t["reward"]["total"] for t in trajectories)
+        decisions = [t["decisions"][0]["decision_type"] for t in trajectories if t.get("decisions")]
+        distribution: dict[str, int] = {}
+        for d in decisions:
+            distribution[d] = distribution.get(d, 0) + 1
+
+        error_count = sum(1 for t in trajectories if t["outcome"]["status"] == "error")
+        total_latency = sum(t["outcome"]["latency_ms"] for t in trajectories)
+
+        return EvaluationResult(
+            task_count=len(trajectories),
+            trajectories=trajectories,
+            avg_reward=round(total_reward / len(trajectories), 4),
+            decision_distribution=distribution,
+            error_rate=round(error_count / len(trajectories), 4),
+            avg_latency_ms=round(total_latency / len(trajectories), 4),
+        )
+
+    def compare(self, baseline: EvaluationResult, challenger: EvaluationResult) -> dict[str, Any]:
+        reward_delta = round(challenger.avg_reward - baseline.avg_reward, 4)
+        error_delta = round(challenger.error_rate - baseline.error_rate, 4)
+        latency_delta = round(challenger.avg_latency_ms - baseline.avg_latency_ms, 4)
+
+        if reward_delta > 0.001:
+            winner = "challenger"
+        elif reward_delta < -0.001:
+            winner = "baseline"
+        else:
+            winner = "tie"
+
+        return {
+            "winner": winner,
+            "avg_reward_delta": reward_delta,
+            "error_rate_delta": error_delta,
+            "avg_latency_ms_delta": latency_delta,
+            "baseline_avg_reward": baseline.avg_reward,
+            "challenger_avg_reward": challenger.avg_reward,
+        }
diff --git a/src/memabra/execution.py b/src/memabra/execution.py
new file mode 100644
index 0000000..b4133d4
--- /dev/null
+++ b/src/memabra/execution.py
@@ -0,0 +1,296 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any, Protocol
+
+import yaml
+
+from .memory_store import InMemoryMemoryStore
+from .router import RouteDecision, TaskContext
+from .telemetry import Event
+
+
+@dataclass(slots=True)
+class ActionResult:
+    decision_type: str
+    status: str
+    details: dict[str, Any] = field(default_factory=dict)
+    events: list[Event] = field(default_factory=list)
+
+
+class ToolBackend(Protocol):
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict[str, Any] | None = None) -> dict[str, Any]:
+        ...
+
+
+class SkillBackend(Protocol):
+    def load_skill(self, skill_id: str) -> dict[str, Any]:
+        ...
+
+
+@dataclass(slots=True)
+class FileSystemSkillBackend:
+    search_paths: list[str | Path] = field(default_factory=lambda: [Path.home() / ".hermes" / "skills"])
+
+    def _discover(self) -> dict[str, Path]:
+        index: dict[str, Path] = {}
+        for base in self.search_paths:
+            base_path = Path(base)
+            if not base_path.exists():
+                continue
+            for skill_file in base_path.rglob("SKILL.md"):
+                frontmatter = self._parse_frontmatter(skill_file)
+                name = frontmatter.get("name") if frontmatter else None
+                if name:
+                    index[name] = skill_file
+        return index
+
+    def _parse_frontmatter(self, path: Path) -> dict[str, Any] | None:
+        text = path.read_text(encoding="utf-8")
+        if not text.startswith("---"):
+            return None
+        try:
+            _, rest = text.split("---", 1)
+            fm_text, _ = rest.split("---", 1)
+            return yaml.safe_load(fm_text) or {}
+        except Exception:
+            return None
+
+    def load_skill(self, skill_id: str) -> dict[str, Any]:
+        index = self._discover()
+        skill_path = index.get(skill_id)
+        if skill_path is None:
+            return {
+                "skill_id": skill_id,
+                "status": "error",
+                "error": f"Skill '{skill_id}' not found in {self.search_paths}.",
+            }
+        text = skill_path.read_text(encoding="utf-8")
+        frontmatter = self._parse_frontmatter(skill_path) or {}
+        body = text
+        if text.startswith("---"):
+            try:
+                _, rest = text.split("---", 1)
+                _, body = rest.split("---", 1)
+            except Exception:
+                pass
+        return {
+            **frontmatter,  # extra frontmatter keys first, so the explicit fields below win
+            "skill_id": skill_id,
+            "status": "success",
+            "name": frontmatter.get("name", skill_id),
+            "description": frontmatter.get("description", ""),
+            "version": frontmatter.get("version", ""),
+            "author": frontmatter.get("author", ""),
+            "content": body.strip(),
+            "path": str(skill_path),
+        }
+
+
+@dataclass(slots=True)
+class LocalFunctionToolAdapter:
+    func: Any
+
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict[str, Any] | None = None) -> dict[str, Any]:
+        import time
+        start = time.time()
+        try:
+            output = self.func(**(params or {}))
+            latency_ms = int((time.time() - start) * 1000)
+            return {"status": "success", "output": output, "error": None, "latency_ms": latency_ms}
+        except Exception as exc:
+            latency_ms = int((time.time() - start) * 1000)
+            return {"status": "error", "output": None, "error": str(exc), "latency_ms": latency_ms}
+
+
+@dataclass(slots=True)
+class SubprocessToolAdapter:
+    command: str
+
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict[str, Any] | None = None) -> dict[str, Any]:
+        import subprocess
+        import time
+
+        start = time.time()
+        try:
+            proc = subprocess.run(self.command, shell=True, capture_output=True, text=True, timeout=30)
+            latency_ms = int((time.time() - start) * 1000)
+            if proc.returncode == 0:
+                return {"status": "success", "output": proc.stdout.strip(), "error": None, "latency_ms": latency_ms}
+            return {"status": "error", "output": proc.stdout.strip(), "error": proc.stderr.strip(), "latency_ms": latency_ms}
+        except Exception as exc:
+            latency_ms = int((time.time() - start) * 1000)
+            return {"status": "error", "output": None, "error": str(exc), "latency_ms": latency_ms}
+
+
+@dataclass(slots=True)
+class ToolRegistry:
+    _tools: dict[str, ToolBackend] = field(default_factory=dict)
+
+    def register(self, tool_id: str, backend: ToolBackend) -> None:
+        self._tools[tool_id] = backend
+
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict[str, Any] | None = None) -> dict[str, Any]:
+        backend = self._tools.get(tool_id)
+        if backend is None:
+            return {"status": "error", "output": None, "error": f"Tool '{tool_id}' not found in registry.", "latency_ms": 0}
+        return backend.run_tool(tool_id, context, params)
+
+
+@dataclass(slots=True)
+class MemoryExecutor:
+    memory_store: InMemoryMemoryStore | None = None
+
+    def execute(self, decision: RouteDecision, context: TaskContext, trajectory_id: str) -> ActionResult:
+        selected_ids = list(decision.selected_ids)
+        events: list[Event] = []
+        for record_id in selected_ids:
+            if self.memory_store is not None and self.memory_store.get(record_id) is not None:
+                self.memory_store.mark_used(record_id)
+            events.append(
+                Event(
+                    event_id=f"evt-memory-{trajectory_id}-{record_id}",
+                    trajectory_id=trajectory_id,
+                    stage="execution",
+                    event_type="memory_injected",
+                    payload={"record_id": record_id, "input": context.user_input},
+                )
+            )
+        return ActionResult(
+            decision_type=decision.decision_type,
+            status="executed" if selected_ids else "skipped",
+            details={"selected_ids": selected_ids, "latency_ms": 0},
+            events=events,
+        )
+
+
+@dataclass(slots=True)
+class SkillExecutor:
+    backend: SkillBackend | None = None
+
+    def execute(self, decision: RouteDecision, context: TaskContext, trajectory_id: str) -> ActionResult:
+        selected_ids = list(decision.selected_ids)
+        payloads: list[dict[str, Any]] = []
+        events: list[Event] = []
+        for skill_id in selected_ids:
+            payload = self.backend.load_skill(skill_id) if self.backend is not None else {"skill_id": skill_id}
+            payloads.append(payload)
+            event_payload = {"skill_id": skill_id, "input": context.user_input, **payload}
+            events.append(
+                Event(
+                    event_id=f"evt-skill-{trajectory_id}-{skill_id}",
+                    trajectory_id=trajectory_id,
+                    stage="execution",
+                    event_type="skill_loaded",
+                    payload=event_payload,
+                )
+            )
+        return ActionResult(
+            decision_type=decision.decision_type,
+            status="executed" if selected_ids else "skipped",
+            details={"selected_ids": selected_ids, "payloads": payloads, "latency_ms": 0},
+            events=events,
+        )
+
+
+@dataclass(slots=True)
+class ToolExecutor:
+    backend: ToolBackend | None = None
+
+    def execute(self, decision: RouteDecision, context: TaskContext, trajectory_id: str) -> ActionResult:
+        selected_ids = list(decision.selected_ids)
+        events: list[Event] = []
+        result_status = "executed" if selected_ids else "skipped"
+        result_payloads: list[dict[str, Any]] = []
+        max_latency = 0
+        for idx, tool_id in enumerate(selected_ids):
+            params = decision.selected_payloads[idx] if idx < len(decision.selected_payloads) else {}
+            backend_result = (
+                self.backend.run_tool(tool_id, context, params=params)
+                if self.backend is not None
+                else {"status": "success", "output": "mock-success", "error": None, "latency_ms": 0}
+            )
+            result_payloads.append({"tool_id": tool_id, **backend_result})
+            max_latency = max(max_latency, int(backend_result.get("latency_ms", 0) or 0))
+            if backend_result.get("status") == "error":
+                result_status = "error"
+            events.extend(
+                [
+                    Event(
+                        event_id=f"evt-tool-{trajectory_id}-{tool_id}",
+                        trajectory_id=trajectory_id,
+                        stage="execution",
+                        event_type="tool_called",
+                        payload={"tool_id": tool_id, "input": context.user_input},
+                    ),
+                    Event(
+                        event_id=f"evt-tool-result-{trajectory_id}-{tool_id}",
+                        trajectory_id=trajectory_id,
+                        stage="execution",
+                        event_type="tool_result",
+                        payload={"tool_id": tool_id, **backend_result},
+                    ),
+                ]
+            )
+        return ActionResult(
+            decision_type=decision.decision_type,
+            status=result_status,
+            details={"selected_ids": selected_ids, "results": result_payloads, "latency_ms": max_latency},
+            events=events,
+        )
+
+
+# Backward compatibility alias
+MockToolExecutor = ToolExecutor
+
+
+@dataclass(slots=True)
+class ExecutionEngine:
+    memory_executor: MemoryExecutor = field(default_factory=MemoryExecutor)
+    skill_executor: SkillExecutor = field(default_factory=SkillExecutor)
+    tool_executor: ToolExecutor = field(default_factory=ToolExecutor)
+
+    def __init__(
+        self,
+        memory_executor: MemoryExecutor | None = None,
+        skill_executor: SkillExecutor | None = None,
+        tool_executor: ToolExecutor | None = None,
+        tool_backend: ToolBackend | None = None,
+        skill_backend: SkillBackend | None = None,
+    ):
+        self.memory_executor = memory_executor or MemoryExecutor()
+        self.skill_executor = skill_executor or SkillExecutor(backend=skill_backend)
+        self.tool_executor = tool_executor or ToolExecutor(backend=tool_backend)
+
+    def execute(self, decision: RouteDecision, context: TaskContext, trajectory_id: str) -> ActionResult:
+        if decision.decision_type == "inject_memory":
+            return self.memory_executor.execute(decision, context, trajectory_id)
+        if decision.decision_type == "load_skill":
+            return self.skill_executor.execute(decision, context, trajectory_id)
+        if decision.decision_type == "call_tool":
+            return self.tool_executor.execute(decision, context, trajectory_id)
+        if decision.decision_type == "composite_action":
+            events: list[Event] = []
+            steps: list[dict[str, Any]] = []
+            total_latency = 0
+            status = "executed"
+            for step in decision.composite_steps:
+                step_result = self.execute(step, context, trajectory_id)
+                events.extend(step_result.events)
+                steps.append({"decision_type": step.decision_type, "status": step_result.status, "details": step_result.details})
+                total_latency += int(step_result.details.get("latency_ms", 0) or 0)
+                if step_result.status in ("error", "failure"):
+                    status = "error"
+            return ActionResult(
+                decision_type="composite_action",
+                status=status,
+                details={"steps": steps, "latency_ms": total_latency},
+                events=events,
+            )
+        return ActionResult(
+            decision_type=decision.decision_type,
+            status="noop",
+            details={"reason": "No executor needed for this decision type.", "latency_ms": 0},
+            events=[],
+        )
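
Note on the skill loader in the diff above: it separates a leading frontmatter block from the body with two `split("---", 1)` calls. A minimal standalone sketch of that split follows. The helper name `split_frontmatter` is hypothetical (it is not part of the commit), and the sketch returns the raw frontmatter text rather than parsing it.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (raw_frontmatter, body) for a document that may start with '---'."""
    if not text.startswith("---"):
        return "", text.strip()
    # First split drops the opening delimiter; second split finds the closing one.
    _, rest = text.split("---", 1)
    parts = rest.split("---", 1)
    if len(parts) != 2:
        # No closing delimiter: treat the whole text as the body.
        return "", text.strip()
    return parts[0].strip(), parts[1].strip()


closed = split_frontmatter("---\nname: demo\n---\nBody text\n")
plain = split_frontmatter("Just a body\n")
unclosed = split_frontmatter("---\nname: demo\n")
```

An unclosed block falls back to treating the whole text as the body, which mirrors the loader's `except` path.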
diff --git a/src/memabra/memory_store.py b/src/memabra/memory_store.py
new file mode 100644
index 0000000..dd9883b
--- /dev/null
+++ b/src/memabra/memory_store.py
@@ -0,0 +1,107 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from datetime import UTC, datetime
+from typing import Literal
+
+MemoryType = Literal["semantic", "procedural", "episodic", "working"]
+FactStatus = Literal["draft", "assumed", "verified", "deprecated", "revoked"]
+
+
+@dataclass(slots=True)
+class MemorySource:
+    kind: str
+    ref: str
+
+
+@dataclass(slots=True)
+class VerificationState:
+    status: str = "unknown"
+    last_checked_at: str | None = None
+    check_method: str | None = None
+
+
+@dataclass(slots=True)
+class MemoryRecord:
+    id: str
+    memory_type: MemoryType
+    fact_status: FactStatus
+    content: str
+    summary: str
+    source: MemorySource
+    confidence: float
+    created_at: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
+    updated_at: str = field(default_factory=lambda: datetime.now(UTC).isoformat())
+    tags: list[str] = field(default_factory=list)
+    related_entities: list[str] = field(default_factory=list)
+    last_used_at: str | None = None
+    expires_at: str | None = None
+    verification: VerificationState = field(default_factory=VerificationState)
+    revocation: dict[str, str] | None = None
+
+    def to_dict(self) -> dict:
+        return {
+            "id": self.id,
+            "memory_type": self.memory_type,
+            "fact_status": self.fact_status,
+            "content": self.content,
+            "summary": self.summary,
+            "source": {"kind": self.source.kind, "ref": self.source.ref},
+            "confidence": self.confidence,
+            "tags": list(self.tags),
+            "related_entities": list(self.related_entities),
+            "created_at": self.created_at,
+            "updated_at": self.updated_at,
+            "last_used_at": self.last_used_at,
+            "expires_at": self.expires_at,
+            "verification": {
+                "status": self.verification.status,
+                "last_checked_at": self.verification.last_checked_at,
+                "check_method": self.verification.check_method,
+            },
+            "revocation": self.revocation,
+        }
+
+
+class InMemoryMemoryStore:
+    def __init__(self):
+        self._records: dict[str, MemoryRecord] = {}
+
+    def upsert(self, record: MemoryRecord) -> None:
+        record.updated_at = datetime.now(UTC).isoformat()
+        self._records[record.id] = record
+
+    def get(self, record_id: str) -> MemoryRecord | None:
+        return self._records.get(record_id)
+
+    def list_by_type(self, memory_type: MemoryType | None = None) -> list[MemoryRecord]:
+        records = list(self._records.values())
+        if memory_type is None:
+            return records
+        return [record for record in records if record.memory_type == memory_type]
+
+    def mark_used(self, record_id: str) -> None:
+        record = self._require_record(record_id)
+        now = datetime.now(UTC).isoformat()
+        record.last_used_at = now
+        record.updated_at = now
+
+    def verify(self, record_id: str, *, status: str, check_method: str) -> None:
+        record = self._require_record(record_id)
+        now = datetime.now(UTC).isoformat()
+        record.fact_status = "verified" if status == "confirmed" else record.fact_status
+        record.verification = VerificationState(status=status, last_checked_at=now, check_method=check_method)
+        record.updated_at = now
+
+    def revoke(self, record_id: str, *, reason: str) -> None:
+        record = self._require_record(record_id)
+        now = datetime.now(UTC).isoformat()
+        record.fact_status = "revoked"
+        record.revocation = {"reason": reason, "revoked_at": now}
+        record.updated_at = now
+
+    def _require_record(self, record_id: str) -> MemoryRecord:
+        record = self.get(record_id)
+        if record is None:
+            raise KeyError(f"Unknown memory record: {record_id}")
+        return record
diff --git a/src/memabra/online_learning.py b/src/memabra/online_learning.py
new file mode 100644
index 0000000..d64cc70
--- /dev/null
+++ b/src/memabra/online_learning.py
@@ -0,0 +1,175 @@
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+from .benchmarks import BenchmarkTask
+from .dataset import DatasetBuilder
+from .evaluator import Evaluator, EvaluationResult
+from .promotion import PromotionDecision, PromotionPolicy
+from .router import SimpleLearningRouter
+from .router_versioning import RouterVersionStore
+from .training_reports import TrainingReportStore, build_report
+
+
+@dataclass
+class OnlineLearningCoordinator:
+    app: Any
+    policy: PromotionPolicy
+    benchmark_tasks: list[BenchmarkTask]
+    min_new_trajectories: int = 5
+    version_store_base_dir: str | Path = "docs/projects/memabra/router-versions"
+    report_store_base_dir: str | Path = "docs/projects/memabra/training-reports"
+    seen_trajectory_store: str | Path | None = None
+    case_index_path: str | Path | None = None
+    _seen_trajectory_ids: set[str] = field(default_factory=set, repr=False)
+
+    def __post_init__(self):
+        if self.seen_trajectory_store is not None:
+            path = Path(self.seen_trajectory_store)
+            if path.exists():
+                data = json.loads(path.read_text(encoding="utf-8"))
+                self._seen_trajectory_ids = set(data.get("seen_trajectory_ids", []))
+
+    def _version_store(self) -> RouterVersionStore:
+        return RouterVersionStore(base_dir=self.version_store_base_dir)
+
+    def _save_seen_trajectories(self) -> None:
+        if self.seen_trajectory_store is not None:
+            path = Path(self.seen_trajectory_store)
+            path.write_text(
+                json.dumps({"seen_trajectory_ids": sorted(self._seen_trajectory_ids)}, indent=2),
+                encoding="utf-8",
+            )
+
+    def run_cycle(self, dry_run: bool = False, baseline_version_id: str | None = None) -> dict[str, Any]:
+        index = self.app.artifact_index()
+        all_trajectories = index.query()
+        new_trajectories = [
+            t for t in all_trajectories if t["trajectory_id"] not in self._seen_trajectory_ids
+        ]
+
+        if len(new_trajectories) < self.min_new_trajectories:
+            report = {
+                "report_id": f"report-skipped-{len(self._seen_trajectory_ids)}",
+                "timestamp": datetime.now(timezone.utc).isoformat(),
+                "source_trajectory_ids": [],
+                "sample_count": 0,
+                "baseline_metrics": {},
+                "challenger_metrics": {},
+                "promotion_decision": {"accepted": False, "reasons": [f"Too few new trajectories ({len(new_trajectories)} < {self.min_new_trajectories})"], "metrics": {}},
+                "promoted_version_id": None,
+                "skipped": True,
+            }
+            self._save_report(report)
+            return {
+                "skipped": True,
+                "reason": f"Too few new trajectories ({len(new_trajectories)} < {self.min_new_trajectories})",
+                "new_count": len(new_trajectories),
+                "min_required": self.min_new_trajectories,
+                "report_id": report["report_id"],
+            }
+
+        original_router = self.app.runner.router  # captured before `try` so `finally` can always restore it
+        try:
+            # Train challenger on all available trajectories
+            dataset_builder = DatasetBuilder()
+            samples = dataset_builder.build(all_trajectories)
+            challenger = SimpleLearningRouter()
+            if samples:
+                challenger.fit(samples)
+
+            # Load baseline version if specified
+            if baseline_version_id is not None:
+                baseline_router = self._version_store().load(baseline_version_id)
+                self.app.set_router(baseline_router)
+
+            # Evaluate baseline vs challenger
+            evaluator = Evaluator(self.app)
+            baseline_result = evaluator.run(self.benchmark_tasks)
+            challenger_result = evaluator.run(self.benchmark_tasks, router=challenger)
+        except Exception as exc:
+            report = build_report(
+                source_trajectory_ids=[t["trajectory_id"] for t in all_trajectories],
+                baseline=EvaluationResult(task_count=0, trajectories=[], avg_reward=0.0, error_rate=0.0, avg_latency_ms=0.0, decision_distribution={}),
+                challenger=EvaluationResult(task_count=0, trajectories=[], avg_reward=0.0, error_rate=0.0, avg_latency_ms=0.0, decision_distribution={}),
+                decision=PromotionDecision(accepted=False, reasons=[f"Cycle failed: {exc}"], metrics={}),
+                promoted_version_id=None,
+            )
+            self._save_report(report)
+            return {
+                "skipped": False,
+                "promoted": False,
+                "error": str(exc),
+                "report_id": report["report_id"],
+            }
+        finally:
+            # Restore original router if a baseline version was loaded
+            if baseline_version_id is not None:
+                self.app.set_router(original_router)
+
+        # Refresh index to capture trajectories generated during evaluation
+        # and mark everything as seen so benchmark runs don't retrigger cycles.
+        index.refresh()
+        post_eval_trajectories = index.query()
+        for t in post_eval_trajectories:
+            self._seen_trajectory_ids.add(t["trajectory_id"])
+        self._save_seen_trajectories()
+
+        if self.case_index_path is not None:
+            self.app.build_case_index()
+            self.app.save_case_index(self.case_index_path)
+
+        decision = self.policy.evaluate(baseline_result, challenger_result)
+
+        version_id: str | None = None
+        if decision.accepted and not dry_run:
+            store = self._version_store()
+            version_record = store.save(
+                challenger,
+                metadata={
+                    "source": "online_learning",
+                    "benchmark_summary": decision.metrics,
+                },
+            )
+            version_id = version_record["version_id"]
+            self.app.set_router(challenger)
+
+        report = build_report(
+            source_trajectory_ids=[t["trajectory_id"] for t in all_trajectories],
+            baseline=baseline_result,
+            challenger=challenger_result,
+            decision=decision,
+            promoted_version_id=version_id,
+            baseline_version_id=baseline_version_id,
+        )
+        report["dry_run"] = dry_run
+        self._save_report(report)
+
+        if not decision.accepted or dry_run:
+            return {
+                "skipped": False,
+                "promoted": False,
+                "decision": decision,
+                "baseline_metrics": baseline_result,
+                "challenger_metrics": challenger_result,
+                "report_id": report["report_id"],
+                "dry_run": dry_run,
+            }
+
+        return {
+            "skipped": False,
+            "promoted": True,
+            "decision": decision,
+            "version_id": version_id,
+            "baseline_metrics": baseline_result,
+            "challenger_metrics": challenger_result,
+            "report_id": report["report_id"],
+        }
+
+    def _save_report(self, report: dict[str, Any]) -> None:
+        store = TrainingReportStore(base_dir=self.report_store_base_dir)
+        store.save(report)
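
Note on `run_cycle` above: a cycle only proceeds once enough unseen trajectories have accumulated, although the challenger is then trained on all trajectories, not just the new ones. The gate can be sketched in isolation roughly as follows. `should_run_cycle` and the sample IDs are illustrative names, not part of the commit.

```python
from typing import Any


def should_run_cycle(
    all_trajectories: list[dict[str, Any]],
    seen_ids: set[str],
    min_new: int = 5,
) -> tuple[bool, list[dict[str, Any]]]:
    """Return (ready, new_trajectories) using the same unseen-ID filter as run_cycle."""
    new = [t for t in all_trajectories if t["trajectory_id"] not in seen_ids]
    return len(new) >= min_new, new


trajectories = [{"trajectory_id": f"traj-{i}"} for i in range(6)]
seen = {"traj-0", "traj-1"}

ready_strict, new_items = should_run_cycle(trajectories, seen, min_new=5)
ready_loose, _ = should_run_cycle(trajectories, seen, min_new=4)
```

With six trajectories and two already seen, four are new, so a threshold of 5 skips the cycle and a threshold of 4 lets it run.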
diff --git a/src/memabra/outcome.py b/src/memabra/outcome.py
new file mode 100644
index 0000000..bdf080a
--- /dev/null
+++ b/src/memabra/outcome.py
@@ -0,0 +1,138 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+from .execution import ActionResult
+from .retrieval import RetrievalResult
+from .router import RouteDecision
+from .telemetry import RewardBreakdown
+
+
+@dataclass(slots=True)
+class Outcome:
+    status: str
+    steps: int
+    latency_ms: int
+    user_corrections: int
+    tool_errors: int
+    notes: str | None = None
+
+
+class OutcomeEngine:
+    def build_outcome(self, decision: RouteDecision, execution_result: ActionResult | None = None) -> Outcome:
+        latency_ms = int((execution_result.details.get("latency_ms", 0) if execution_result is not None else 0) or 0)
+        steps = 1
+        user_corrections = 0
+
+        if execution_result is not None and execution_result.status == "error":
+            tool_errors = self._count_tool_errors(decision, execution_result)
+            status = self._resolve_status(decision, execution_result, tool_errors)
+            notes = "Execution failed during runner dispatch."
+            if status == "partial_success":
+                notes = "Some tools succeeded, but errors were encountered."
+            return Outcome(
+                status=status,
+                steps=steps,
+                latency_ms=latency_ms,
+                user_corrections=user_corrections,
+                tool_errors=tool_errors,
+                notes=notes,
+            )
+
+        status = "partial_success" if decision.decision_type == "clarify" else "success"
+        notes = "Draft trajectory generated by MemabraRunner with execution hooks." if execution_result else "Draft trajectory generated by MemabraRunner."
+        return Outcome(
+            status=status,
+            steps=steps,
+            latency_ms=latency_ms,
+            user_corrections=user_corrections,
+            tool_errors=0,
+            notes=notes,
+        )
+
+    def _count_tool_errors(self, decision: RouteDecision, execution_result: ActionResult) -> int:
+        if decision.decision_type != "call_tool":
+            return 0
+        results = execution_result.details.get("results", [])
+        if not results:
+            return 1
+        return sum(1 for r in results if r.get("status") == "error")
+
+    def _resolve_status(self, decision: RouteDecision, execution_result: ActionResult, tool_errors: int) -> str:
+        if decision.decision_type != "call_tool":
+            return "failure"
+        total_tools = max(len(decision.selected_ids), len(execution_result.details.get("results", [])))
+        if total_tools > 0 and 0 < tool_errors < total_tools:
+            return "partial_success"
+        return "failure"
+
+
+class RewardEngine:
+    def compute(
+        self,
+        decision: RouteDecision,
+        outcome: Outcome,
+        execution_result: ActionResult | None = None,
+        retrieval_result: RetrievalResult | None = None,
+    ) -> RewardBreakdown:
+        latency_ms = outcome.latency_ms
+        latency_penalty = self._latency_tier_penalty(latency_ms)
+        tool_error = self._tool_error_penalty(outcome)
+        context_cost = self._context_cost(retrieval_result)
+
+        if decision.decision_type == "clarify":
+            return RewardBreakdown(
+                task_success=0.4,
+                retrieval_hit=0.1,
+                user_correction=0.0,
+                latency=latency_penalty,
+                context_cost=context_cost,
+                tool_error=tool_error,
+            )
+
+        if decision.decision_type == "call_tool":
+            task_success = self._tool_task_success(tool_error, outcome)
+            useful_reuse = 0.05 if outcome.status in ("success", "partial_success") and tool_error == 0.0 else 0.0
+            return RewardBreakdown(
+                task_success=task_success,
+                retrieval_hit=0.25,
+                useful_reuse=useful_reuse,
+                latency=latency_penalty,
+                context_cost=context_cost,
+                tool_error=tool_error,
+            )
+
+        return RewardBreakdown(
+            task_success=0.8 if outcome.status == "success" else 0.5,
+            retrieval_hit=0.2,
+            useful_reuse=0.1 if outcome.status == "success" else 0.0,
+            latency=latency_penalty,
+            context_cost=context_cost,
+            tool_error=tool_error,
+        )
+
+    def _latency_tier_penalty(self, latency_ms: int) -> float:
+        if latency_ms < 500:
+            return round(latency_ms / 5000, 3)
+        if latency_ms < 1500:
+            return round(latency_ms / 2000, 3)
+        return round(latency_ms / 1000, 3)
+
+    def _tool_error_penalty(self, outcome: Outcome) -> float:
+        base = 0.35 if outcome.tool_errors > 0 else 0.0
+        extra = 0.15 * max(0, outcome.tool_errors - 1)
+        return round(min(base + extra, 1.0), 3)
+
+    def _context_cost(self, retrieval_result: RetrievalResult | None) -> float:
+        if retrieval_result is None:
+            return 0.0
+        total = len(retrieval_result.memory) + len(retrieval_result.skill) + len(retrieval_result.tool)
+        return round(total * 0.02, 3)
+
+    def _tool_task_success(self, tool_error: float, outcome: Outcome) -> float:
+        if tool_error == 0.0:
+            return 0.8
+        if outcome.status == "partial_success":
+            return max(0.2, 0.6 - tool_error)
+        return max(0.0, 0.2 - tool_error)
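
Note on `RewardEngine` above: the latency penalty is tiered, so it jumps at the 500 ms and 1500 ms boundaries instead of growing smoothly, and the tool-error penalty is capped at 1.0. The standalone sketch below copies the two formulas into free functions (the function names are illustrative) to make the numbers easy to check.

```python
def latency_tier_penalty(latency_ms: int) -> float:
    # Same three tiers as RewardEngine._latency_tier_penalty.
    if latency_ms < 500:
        return round(latency_ms / 5000, 3)
    if latency_ms < 1500:
        return round(latency_ms / 2000, 3)
    return round(latency_ms / 1000, 3)


def tool_error_penalty(tool_errors: int) -> float:
    # 0.35 for the first error, 0.15 for each further one, capped at 1.0.
    base = 0.35 if tool_errors > 0 else 0.0
    extra = 0.15 * max(0, tool_errors - 1)
    return round(min(base + extra, 1.0), 3)


latency_examples = {ms: latency_tier_penalty(ms) for ms in (100, 500, 1500)}
error_examples = {n: tool_error_penalty(n) for n in (0, 1, 2, 6)}
```

Under these formulas a 100 ms call costs 0.02, a 500 ms call costs 0.25, and a 1500 ms call costs 1.5, so crossing a tier boundary matters far more than moving within a tier.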
diff --git a/src/memabra/persistence.py b/src/memabra/persistence.py
new file mode 100644
index 0000000..e9323e9
--- /dev/null
+++ b/src/memabra/persistence.py
@@ -0,0 +1,40 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+from .memory_store import MemoryRecord
+
+
+class PersistenceStore:
+    def __init__(self, base_dir: str | Path = "docs/projects/memabra/artifacts"):
+        self.base_dir = Path(base_dir)
+        self.trajectories_dir = self.base_dir / "trajectories"
+        self.memories_dir = self.base_dir / "memories"
+        self.trajectories_dir.mkdir(parents=True, exist_ok=True)
+        self.memories_dir.mkdir(parents=True, exist_ok=True)
+
+    def save_trajectory(self, trajectory: dict[str, Any]) -> Path:
+        path = self.trajectories_dir / f"{trajectory['trajectory_id']}.json"
+        path.write_text(json.dumps(trajectory, indent=2, ensure_ascii=False), encoding="utf-8")
+        return path
+
+    def load_trajectory(self, trajectory_id: str) -> dict[str, Any]:
+        path = self.trajectories_dir / f"{trajectory_id}.json"
+        return json.loads(path.read_text(encoding="utf-8"))
+
+    def list_trajectory_paths(self) -> list[Path]:
+        return sorted(self.trajectories_dir.glob("*.json"))
+
+    def save_memory_record(self, record: MemoryRecord) -> Path:
+        path = self.memories_dir / f"{record.id}.json"
+        path.write_text(json.dumps(record.to_dict(), indent=2, ensure_ascii=False), encoding="utf-8")
+        return path
+
+    def load_memory_record(self, record_id: str) -> dict[str, Any]:
+        path = self.memories_dir / f"{record_id}.json"
+        return json.loads(path.read_text(encoding="utf-8"))
+
+    def list_memory_paths(self) -> list[Path]:
+        return sorted(self.memories_dir.glob("*.json"))
diff --git a/src/memabra/promotion.py b/src/memabra/promotion.py
new file mode 100644
index 0000000..921a1cb
--- /dev/null
+++ b/src/memabra/promotion.py
@@ -0,0 +1,59 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+from .evaluator import EvaluationResult
+
+
+@dataclass(slots=True)
+class PromotionDecision:
+    accepted: bool
+    reasons: list[str]
+    metrics: dict[str, Any]
+
+
+@dataclass(slots=True)
+class PromotionPolicy:
+    min_reward_delta: float
+    max_error_rate_increase: float
+    max_latency_increase_ms: float
+    required_task_count: int
+
+    def evaluate(self, baseline: EvaluationResult, challenger: EvaluationResult) -> PromotionDecision:
+        reasons: list[str] = []
+        reward_delta = challenger.avg_reward - baseline.avg_reward
+        error_rate_delta = challenger.error_rate - baseline.error_rate
+        latency_delta_ms = challenger.avg_latency_ms - baseline.avg_latency_ms
+
+        if challenger.task_count < self.required_task_count:
+            reasons.append(
+                f"Task count {challenger.task_count} below required {self.required_task_count}"
+            )
+
+        if reward_delta < self.min_reward_delta:
+            reasons.append(
+                f"Reward delta {reward_delta:.4f} below minimum {self.min_reward_delta}"
+            )
+
+        if error_rate_delta > self.max_error_rate_increase:
+            reasons.append(
+                f"Error rate increase {error_rate_delta:.4f} exceeds max {self.max_error_rate_increase}"
+            )
+
+        if latency_delta_ms > self.max_latency_increase_ms:
+            reasons.append(
+                f"Latency increase {latency_delta_ms:.1f}ms exceeds max {self.max_latency_increase_ms}ms"
+            )
+
+        return PromotionDecision(
+            accepted=len(reasons) == 0,
+            reasons=reasons,
+            metrics={
+                "reward_delta": round(reward_delta, 4),
+                "error_rate_delta": round(error_rate_delta, 4),
+                "latency_delta_ms": round(latency_delta_ms, 4),
+                "baseline_avg_reward": baseline.avg_reward,
+                "challenger_avg_reward": challenger.avg_reward,
+            },
+        )
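
Note on `PromotionPolicy.evaluate` above: a challenger is accepted only if no check adds a reason. The sketch below restates the four checks against a hypothetical stand-in for `EvaluationResult` (the `Result` class, the `promotion_reasons` function, the shortened reason strings, and the threshold values are all assumptions for illustration).

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical stand-in: only the fields the policy reads.
    task_count: int
    avg_reward: float
    error_rate: float
    avg_latency_ms: float


def promotion_reasons(
    baseline: Result,
    challenger: Result,
    *,
    min_reward_delta: float,
    max_error_rate_increase: float,
    max_latency_increase_ms: float,
    required_task_count: int,
) -> list[str]:
    """Collect rejection reasons; an empty list means the challenger is accepted."""
    reasons: list[str] = []
    if challenger.task_count < required_task_count:
        reasons.append("task count below required")
    if challenger.avg_reward - baseline.avg_reward < min_reward_delta:
        reasons.append("reward delta below minimum")
    if challenger.error_rate - baseline.error_rate > max_error_rate_increase:
        reasons.append("error rate increase too large")
    if challenger.avg_latency_ms - baseline.avg_latency_ms > max_latency_increase_ms:
        reasons.append("latency increase too large")
    return reasons


limits = dict(
    min_reward_delta=0.05,
    max_error_rate_increase=0.05,
    max_latency_increase_ms=50.0,
    required_task_count=5,
)
baseline = Result(task_count=10, avg_reward=0.60, error_rate=0.10, avg_latency_ms=100.0)
strong = Result(task_count=10, avg_reward=0.68, error_rate=0.12, avg_latency_ms=130.0)
weak = Result(task_count=10, avg_reward=0.62, error_rate=0.12, avg_latency_ms=130.0)

strong_reasons = promotion_reasons(baseline, strong, **limits)
weak_reasons = promotion_reasons(baseline, weak, **limits)
```

Here `strong` clears every threshold, while `weak` is rejected solely because its reward gain of about 0.02 falls short of the 0.05 minimum.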
diff --git a/src/memabra/replay.py b/src/memabra/replay.py
new file mode 100644
index 0000000..ebe8508
--- /dev/null
+++ b/src/memabra/replay.py
@@ -0,0 +1,88 @@
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+from .persistence import PersistenceStore
+
+
+@dataclass(slots=True)
+class ReplaySummary:
+    trajectories: int
+    success_count: int
+    partial_success_count: int
+    failure_count: int
+    average_reward: float
+    average_latency_ms: float
+    average_steps: float
+    average_user_corrections: float
+    direct_answer_count: int
+    memory_action_count: int
+    skill_action_count: int
+    tool_action_count: int
+    clarify_count: int
+    composite_action_count: int
+
+
+class TrajectoryReplay:
+    def load(self, path: str | Path) -> dict[str, Any]:
+        trajectory_path = Path(path)
+        with trajectory_path.open("r", encoding="utf-8") as handle:
+            return json.load(handle)
+
+    def load_many(self, paths: list[str | Path]) -> list[dict[str, Any]]:
+        return [self.load(path) for path in paths]
+
+    def summarize(self, trajectories: list[dict[str, Any]]) -> ReplaySummary:
+        total = len(trajectories)
+        if total == 0:
+            return ReplaySummary(0, 0, 0, 0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0)
+
+        success_count = sum(1 for t in trajectories if t["outcome"]["status"] == "success")
+        partial_success_count = sum(1 for t in trajectories if t["outcome"]["status"] == "partial_success")
+        failure_count = sum(1 for t in trajectories if t["outcome"]["status"] == "failure")
+        average_reward = sum(t["reward"]["total"] for t in trajectories) / total
+        average_latency_ms = sum(t["outcome"]["latency_ms"] for t in trajectories) / total
+        average_steps = sum(t["outcome"]["steps"] for t in trajectories) / total
+        average_user_corrections = sum(t["outcome"]["user_corrections"] for t in trajectories) / total
+
+        decisions = [decision for trajectory in trajectories for decision in trajectory.get("decisions", [])]
+        counts = {
+            "direct_answer": 0,
+            "inject_memory": 0,
+            "load_skill": 0,
+            "call_tool": 0,
+            "clarify": 0,
+            "composite_action": 0,
+        }
+        for decision in decisions:
+            decision_type = decision["decision_type"]
+            counts[decision_type] = counts.get(decision_type, 0) + 1
+
+        return ReplaySummary(
+            trajectories=total,
+            success_count=success_count,
+            partial_success_count=partial_success_count,
+            failure_count=failure_count,
+            average_reward=average_reward,
+            average_latency_ms=average_latency_ms,
+            average_steps=average_steps,
+            average_user_corrections=average_user_corrections,
+            direct_answer_count=counts["direct_answer"],
+            memory_action_count=counts["inject_memory"],
+            skill_action_count=counts["load_skill"],
+            tool_action_count=counts["call_tool"],
+            clarify_count=counts["clarify"],
+            composite_action_count=counts["composite_action"],
+        )
+
+    def summarize_directory(self, directory: str | Path) -> ReplaySummary:
+        base = Path(directory)
+        paths = sorted(base.glob("*.json"))
+        trajectories = self.load_many(paths)
+        return self.summarize(trajectories)
+
+    def summarize_persistence_store(self, persistence_store: PersistenceStore) -> ReplaySummary:
+        return self.summarize(self.load_many(persistence_store.list_trajectory_paths()))
diff --git a/src/memabra/retrieval.py b/src/memabra/retrieval.py
new file mode 100644
index 0000000..90753f4
--- /dev/null
+++ b/src/memabra/retrieval.py
@@ -0,0 +1,88 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Iterable, Protocol
+
+from .candidate_types import CandidateObject, CandidateType
+from .router import TaskContext
+
+
+class CandidateProvider(Protocol):
+    candidate_type: CandidateType
+
+    def list_candidates(self) -> Iterable[CandidateObject]:
+        """Return all available candidates for this provider."""
+
+
+@dataclass(slots=True)
+class InMemoryCandidateProvider:
+    candidate_type: CandidateType
+    candidates: list[CandidateObject]
+
+    def list_candidates(self) -> Iterable[CandidateObject]:
+        return list(self.candidates)
+
+
+@dataclass(slots=True)
+class RetrievalResult:
+    memory: list[CandidateObject]
+    skill: list[CandidateObject]
+    tool: list[CandidateObject]
+
+
+class CandidateRetriever:
+    def __init__(self, providers: Iterable[CandidateProvider]):
+        self.providers = list(providers)
+
+    def retrieve(self, context: TaskContext, top_k: int = 3) -> RetrievalResult:
+        grouped: dict[CandidateType, list[CandidateObject]] = {
+            "memory": [],
+            "skill": [],
+            "tool": [],
+        }
+
+        for provider in self.providers:
+            candidates = [candidate for candidate in provider.list_candidates() if candidate.type == provider.candidate_type]
+            ranked = sorted(
+                candidates,
+                key=lambda candidate: self._score_candidate(candidate, context),
+                reverse=True,
+            )
+            grouped[provider.candidate_type].extend(ranked[:top_k])
+
+        return RetrievalResult(
+            memory=self._dedupe_and_rank(grouped["memory"], context, top_k),
+            skill=self._dedupe_and_rank(grouped["skill"], context, top_k),
+            tool=self._dedupe_and_rank(grouped["tool"], context, top_k),
+        )
+
+    def _dedupe_and_rank(
+        self,
+        candidates: list[CandidateObject],
+        context: TaskContext,
+        top_k: int,
+    ) -> list[CandidateObject]:
+        deduped: dict[str, CandidateObject] = {}
+        for candidate in candidates:
+            current = deduped.get(candidate.id)
+            if current is None or self._score_candidate(candidate, context) > self._score_candidate(current, context):
+                deduped[candidate.id] = candidate
+
+        return sorted(
+            deduped.values(),
+            key=lambda candidate: self._score_candidate(candidate, context),
+            reverse=True,
+        )[:top_k]
+
+    def _score_candidate(self, candidate: CandidateObject, context: TaskContext) -> float:
+        text = " ".join(
+            [
+                context.user_input.lower(),
+                context.conversation_summary.lower(),
+                context.environment_summary.lower(),
+            ]
+        )
+        lexical_hits = sum(1 for token in candidate.triggers + candidate.tags if token.lower() in text)
+        base = candidate.confidence + candidate.success_rate + candidate.freshness
+        penalty = candidate.cost + candidate.risk
+        return base + (0.2 * lexical_hits) - penalty
diff --git a/src/memabra/reward.py b/src/memabra/reward.py
new file mode 100644
index 0000000..7fd2fd2
--- /dev/null
+++ b/src/memabra/reward.py
@@ -0,0 +1,22 @@
+from .telemetry import RewardBreakdown
+
+
+def compute_reward(
+    *,
+    task_success: float,
+    retrieval_hit: float,
+    tool_error: float,
+    user_correction: float,
+    latency: float,
+    context_cost: float,
+    useful_reuse: float,
+) -> RewardBreakdown:
+    return RewardBreakdown(
+        task_success=task_success,
+        retrieval_hit=retrieval_hit,
+        tool_error=tool_error,
+        user_correction=user_correction,
+        latency=latency,
+        context_cost=context_cost,
+        useful_reuse=useful_reuse,
+    )
diff --git a/src/memabra/router.py b/src/memabra/router.py
new file mode 100644
index 0000000..e730c97
--- /dev/null
+++ b/src/memabra/router.py
@@ -0,0 +1,337 @@
+from dataclasses import dataclass, field
+from typing import Any, Iterable, Protocol, runtime_checkable
+
+from .candidate_types import CandidateObject, DecisionType
+from .dataset import TrainingSample
+
+
+@runtime_checkable
+class RouterProtocol(Protocol):
+    def choose(
+        self,
+        context: "TaskContext",
+        memory_candidates: Iterable[CandidateObject],
+        skill_candidates: Iterable[CandidateObject],
+        tool_candidates: Iterable[CandidateObject],
+    ) -> "RouteDecision":
+        ...
+
+
+@dataclass(slots=True)
+class RouteDecision:
+    decision_type: DecisionType
+    selected_ids: list[str] = field(default_factory=list)
+    selected_payloads: list[dict[str, Any]] = field(default_factory=list)
+    rationale: str = ""
+    estimated_cost: float = 0.0
+    score_breakdown: dict[str, float] = field(default_factory=dict)
+    composite_steps: list["RouteDecision"] = field(default_factory=list)
+
+
+@dataclass(slots=True)
+class TaskContext:
+    user_input: str
+    conversation_summary: str = ""
+    environment_summary: str = ""
+    recent_failures: list[str] = field(default_factory=list)
+
+
+class RuleBasedRouter:
+    """Baseline placeholder router for Phase 1.
+
+    Keyword rules are tried in this order and fall through when nothing qualifies:
+    - direct answer for reasoning-first tasks with no strong tool trigger
+    - tool when current state or side effects must be observed
+    - memory when stable user/project facts appear relevant
+    - skill when a reusable procedure is clearly triggered; otherwise clarify
+    """
+
+    def choose(
+        self,
+        context: TaskContext,
+        memory_candidates: Iterable[CandidateObject],
+        skill_candidates: Iterable[CandidateObject],
+        tool_candidates: Iterable[CandidateObject],
+    ) -> RouteDecision:
+        text = context.user_input.lower()
+        if any(token in text for token in ["why", "think", "design", "name"]):
+            return RouteDecision(
+                decision_type="direct_answer",
+                rationale="Looks like a reasoning-first task with no strong tool trigger.",
+            )
+
+        tool_matches = [c for c in tool_candidates if c.confidence >= 0.6 and c.risk <= 0.7]
+        if any(token in text for token in ["check", "run", "open", "current", "list", "time"]):
+            if tool_matches:
+                best = max(tool_matches, key=lambda c: c.confidence + c.success_rate - c.cost)
+                return RouteDecision(
+                    decision_type="call_tool",
+                    selected_ids=[best.id],
+                    selected_payloads=[dict(best.type_payload)],
+                    rationale="Task asks for current state or external action; tool use is justified.",
+                    estimated_cost=best.cost,
+                )
+
+        memory_matches = [c for c in memory_candidates if c.confidence >= 0.65 and c.freshness >= 0.3]
+        if any(token in text for token in ["prefer", "remember", "usually", "my", "our"]):
+            if memory_matches:
+                best = max(memory_matches, key=lambda c: c.confidence + c.freshness + c.success_rate)
+                return RouteDecision(
+                    decision_type="inject_memory",
+                    selected_ids=[best.id],
+                    selected_payloads=[dict(best.type_payload)],
+                    rationale="Task likely depends on stable user/project facts.",
+                    estimated_cost=best.cost,
+                )
+
+        skill_matches = [c for c in skill_candidates if c.confidence >= 0.55 and c.success_rate >= 0.4]
+        if any(token in text for token in ["fix", "deploy", "review", "setup", "workflow"]):
+            if skill_matches:
+                best = max(skill_matches, key=lambda c: c.success_rate + c.confidence - c.cost)
+                return RouteDecision(
+                    decision_type="load_skill",
+                    selected_ids=[best.id],
+                    selected_payloads=[dict(best.type_payload)],
+                    rationale="Task resembles a reusable procedure; load a skill before action.",
+                    estimated_cost=best.cost,
+                )
+
+        return RouteDecision(
+            decision_type="clarify",
+            rationale="No high-confidence route found from the current heuristic baseline.",
+        )
+
+
+class FeatureScoringRouter:
+    """Router v2 with explicit feature scoring, failure penalties, and composite action preconditions."""
+
+    def choose(
+        self,
+        context: TaskContext,
+        memory_candidates: Iterable[CandidateObject],
+        skill_candidates: Iterable[CandidateObject],
+        tool_candidates: Iterable[CandidateObject],
+    ) -> RouteDecision:
+        scored: list[tuple[CandidateObject, str, float]] = []
+        breakdown: dict[str, float] = {}
+
+        for c in memory_candidates:
+            score = self._score(c, "memory", context)
+            scored.append((c, "memory", score))
+            breakdown[c.id] = score
+
+        for c in skill_candidates:
+            score = self._score(c, "skill", context)
+            scored.append((c, "skill", score))
+            breakdown[c.id] = score
+
+        for c in tool_candidates:
+            score = self._score(c, "tool", context)
+            scored.append((c, "tool", score))
+            breakdown[c.id] = score
+
+        filtered = [item for item in scored if self._passes_threshold(item[0], item[1])]
+
+        if not filtered:
+            return RouteDecision(
+                decision_type="clarify",
+                rationale="No high-confidence route found from feature scoring.",
+                score_breakdown=breakdown,
+            )
+
+        best_candidate, best_type, best_score = max(filtered, key=lambda x: x[2])
+
+        if best_candidate.preconditions:
+            composite_steps: list[RouteDecision] = []
+            for precondition in best_candidate.preconditions:
+                pre_candidate = self._find_best_precondition(precondition, scored)
+                if pre_candidate is not None:
+                    pre_type = precondition
+                    composite_steps.append(
+                        RouteDecision(
+                            decision_type=self._decision_type_for_candidate_type(pre_type),
+                            selected_ids=[pre_candidate.id],
+                            selected_payloads=[dict(pre_candidate.type_payload)],
+                            rationale=f"Satisfy precondition for {best_candidate.id}.",
+                            estimated_cost=pre_candidate.cost,
+                        )
+                    )
+            if composite_steps:
+                composite_steps.append(
+                    RouteDecision(
+                        decision_type=self._decision_type_for_candidate_type(best_type),
+                        selected_ids=[best_candidate.id],
+                        selected_payloads=[dict(best_candidate.type_payload)],
+                        rationale=f"Best {best_type} candidate after feature scoring.",
+                        estimated_cost=best_candidate.cost,
+                        score_breakdown={best_candidate.id: best_score},
+                    )
+                )
+                return RouteDecision(
+                    decision_type="composite_action",
+                    rationale=f"Composite action required for {best_candidate.id}.",
+                    composite_steps=composite_steps,
+                    score_breakdown=breakdown,
+                )
+
+        return RouteDecision(
+            decision_type=self._decision_type_for_candidate_type(best_type),
+            selected_ids=[best_candidate.id],
+            selected_payloads=[dict(best_candidate.type_payload)],
+            rationale=f"Best {best_type} candidate after feature scoring.",
+            estimated_cost=best_candidate.cost,
+            score_breakdown=breakdown,
+        )
+
+    def _score(self, candidate: CandidateObject, candidate_type: str, context: TaskContext) -> float:
+        if candidate_type == "memory":
+            score = (
+                candidate.confidence * 0.35
+                + candidate.freshness * 0.25
+                + candidate.success_rate * 0.25
+                - candidate.cost * 0.1
+                - candidate.risk * 0.05
+            )
+        elif candidate_type == "skill":
+            score = (
+                candidate.confidence * 0.25
+                + candidate.success_rate * 0.35
+                - candidate.cost * 0.2
+                - candidate.risk * 0.2
+            )
+        else:  # tool
+            score = (
+                candidate.confidence * 0.3
+                + candidate.success_rate * 0.3
+                - candidate.cost * 0.1
+                - candidate.risk * 0.3
+            )
+        if candidate.id in context.recent_failures:
+            score -= 0.5
+        return round(score, 4)
+
+    def _passes_threshold(self, candidate: CandidateObject, candidate_type: str) -> bool:
+        if candidate_type == "memory":
+            return candidate.confidence >= 0.65 and candidate.freshness >= 0.3
+        if candidate_type == "skill":
+            return candidate.confidence >= 0.55 and candidate.success_rate >= 0.4
+        if candidate_type == "tool":
+            return candidate.confidence >= 0.6 and candidate.risk <= 0.7
+        return True
+
+    def _decision_type_for_candidate_type(self, candidate_type: str) -> DecisionType:
+        if candidate_type == "memory":
+            return "inject_memory"
+        if candidate_type == "skill":
+            return "load_skill"
+        if candidate_type == "tool":
+            return "call_tool"
+        return "clarify"
+
+    def _find_best_precondition(
+        self,
+        precondition: str,
+        scored: list[tuple[CandidateObject, str, float]],
+    ) -> CandidateObject | None:
+        matches = [
+            item for item in scored if item[1] == precondition and self._passes_threshold(item[0], precondition)
+        ]
+        if not matches:
+            return None
+        best, _, _ = max(matches, key=lambda x: x[2])
+        return best
+
+
+def _extract_features(
+    context: TaskContext,
+    memory_candidates: Iterable[CandidateObject],
+    skill_candidates: Iterable[CandidateObject],
+    tool_candidates: Iterable[CandidateObject],
+) -> dict[str, float]:
+    memory = list(memory_candidates)
+    skill = list(skill_candidates)
+    tool = list(tool_candidates)
+    return {
+        "input_length": float(len(context.user_input)),
+        "memory_count": float(len(memory)),
+        "skill_count": float(len(skill)),
+        "tool_count": float(len(tool)),
+        "top_memory_confidence": max((c.confidence for c in memory), default=0.0),
+        "top_skill_success_rate": max((c.success_rate for c in skill), default=0.0),
+        "top_tool_confidence": max((c.confidence for c in tool), default=0.0),
+        "top_tool_risk": max((c.risk for c in tool), default=0.0),
+    }
+
+
+class SimpleLearningRouter:
+    """Lightweight learning router that trains reward-weighted feature vectors per decision type."""
+
+    def __init__(self) -> None:
+        self._weights: dict[str, dict[str, float]] = {}
+        self._feature_keys: list[str] = []
+
+    def fit(self, samples: list[TrainingSample]) -> None:
+        from collections import defaultdict
+
+        sums: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))
+        counts: dict[str, float] = defaultdict(float)
+        for sample in samples:
+            label = sample.label
+            reward = sample.reward
+            for key, value in sample.features.items():
+                sums[label][key] += value * reward
+            counts[label] += reward
+            if not self._feature_keys:
+                self._feature_keys = list(sample.features.keys())
+
+        self._weights = {}
+        for label, feature_sums in sums.items():
+            total_reward = counts[label]
+            if total_reward == 0:
+                continue
+            self._weights[label] = {k: v / total_reward for k, v in feature_sums.items()}
+
+    def choose(
+        self,
+        context: TaskContext,
+        memory_candidates: Iterable[CandidateObject],
+        skill_candidates: Iterable[CandidateObject],
+        tool_candidates: Iterable[CandidateObject],
+    ) -> RouteDecision:
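+        memory_candidates = list(memory_candidates)  # materialize: read again below, may be one-shot iterators
+        skill_candidates = list(skill_candidates)
+        tool_candidates = list(tool_candidates)
+        features = _extract_features(context, memory_candidates, skill_candidates, tool_candidates)
+        if not self._weights:
+            return RouteDecision(decision_type="clarify", rationale="Learning router has not been trained yet.")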
+
+        best_label: str | None = None
+        best_score = float("-inf")
+        for label, weights in self._weights.items():
+            score = sum(features.get(k, 0.0) * w for k, w in weights.items())
+            if score > best_score:
+                best_score = score
+                best_label = label
+
+        assert best_label is not None
+        selected_ids: list[str] = []
+        selected_payloads: list[dict[str, Any]] = []
+        if best_label == "inject_memory" and memory_candidates:
+            best = max(memory_candidates, key=lambda c: c.confidence)
+            selected_ids = [best.id]
+            selected_payloads = [dict(best.type_payload)]
+        elif best_label == "load_skill" and skill_candidates:
+            best = max(skill_candidates, key=lambda c: c.success_rate)
+            selected_ids = [best.id]
+            selected_payloads = [dict(best.type_payload)]
+        elif best_label == "call_tool" and tool_candidates:
+            best = max(tool_candidates, key=lambda c: c.confidence - c.risk)
+            selected_ids = [best.id]
+            selected_payloads = [dict(best.type_payload)]
+
+        return RouteDecision(
+            decision_type=best_label,
+            selected_ids=selected_ids,
+            selected_payloads=selected_payloads,
+            rationale=f"Predicted by learning router (score={round(best_score, 4)}).",
+        )
diff --git a/src/memabra/router_versioning.py b/src/memabra/router_versioning.py
new file mode 100644
index 0000000..11cb968
--- /dev/null
+++ b/src/memabra/router_versioning.py
@@ -0,0 +1,97 @@
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+from .router import SimpleLearningRouter
+
+
+@dataclass
+class RouterVersionStore:
+    base_dir: str | Path = field(default="docs/projects/memabra/router-versions")
+
+    def __post_init__(self):
+        self._base = Path(self.base_dir)
+        self._versions_dir = self._base / "versions"
+        self._versions_dir.mkdir(parents=True, exist_ok=True)
+        self._current_file = self._base / "current.json"
+
+    def save(
+        self,
+        router: SimpleLearningRouter,
+        version_id: str | None = None,
+        metadata: dict[str, Any] | None = None,
+    ) -> dict[str, Any]:
+        if version_id is None:
+            version_id = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
+
+        version_path = self._versions_dir / f"{version_id}.json"
+        record = {
+            "version_id": version_id,
+            "weights": router._weights,
+            "feature_keys": router._feature_keys,
+            "metadata": metadata or {},
+        }
+        version_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
+
+        prior = self._read_current()
+        prior_version_id = prior.get("current_version_id")
+
+        current_record = {
+            "current_version_id": version_id,
+            "promotion_source": (metadata or {}).get("promotion_source"),
+            "benchmark_summary": (metadata or {}).get("benchmark_summary"),
+            "prior_version_id": prior_version_id,
+            "saved_at": datetime.now(timezone.utc).isoformat(),
+        }
+        self._current_file.write_text(json.dumps(current_record, indent=2), encoding="utf-8")
+        return record
+
+    def load(self, version_id: str | None = None) -> SimpleLearningRouter:
+        if version_id is None:
+            current = self._read_current()
+            version_id = current.get("current_version_id")
+        if version_id is None:
+            raise ValueError("No version_id provided and no current version set.")
+
+        version_path = self._versions_dir / f"{version_id}.json"
+        record = json.loads(version_path.read_text(encoding="utf-8"))
+        router = SimpleLearningRouter()
+        router._weights = record.get("weights", {})
+        router._feature_keys = record.get("feature_keys", [])
+        return router
+
+    def list_versions(self) -> list[dict[str, Any]]:
+        versions = []
+        for path in sorted(self._versions_dir.glob("*.json")):
+            record = json.loads(path.read_text(encoding="utf-8"))
+            versions.append({
+                "version_id": record.get("version_id"),
+                "metadata": record.get("metadata", {}),
+            })
+        return versions
+
+    def rollback(self, version_id: str) -> dict[str, Any]:
+        version_path = self._versions_dir / f"{version_id}.json"
+        if not version_path.exists():
+            raise ValueError(f"Version '{version_id}' not found.")
+        prior = self._read_current()
+        current_record = {
+            "current_version_id": version_id,
+            "rollback_from": prior.get("current_version_id"),
+            "rolled_back_at": datetime.now(timezone.utc).isoformat(),
+            "prior_version_id": prior.get("prior_version_id"),
+        }
+        self._current_file.write_text(json.dumps(current_record, indent=2), encoding="utf-8")
+        return {"current_version_id": version_id}
+
+    def get_current(self) -> dict[str, Any]:
+        return self._read_current()
+
+    def _read_current(self) -> dict[str, Any]:
+        if not self._current_file.exists():
+            return {}
+        return json.loads(self._current_file.read_text(encoding="utf-8"))
diff --git a/src/memabra/runner.py b/src/memabra/runner.py
new file mode 100644
index 0000000..0bf8dd4
--- /dev/null
+++ b/src/memabra/runner.py
@@ -0,0 +1,237 @@
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from datetime import UTC, datetime
+from typing import Any
+from uuid import uuid4
+
+from .candidate_types import CandidateObject
+from .case_index import CaseIndex
+from .execution import ExecutionEngine
+from .memory_store import InMemoryMemoryStore, MemoryRecord, MemorySource
+from .outcome import OutcomeEngine, RewardEngine
+from .persistence import PersistenceStore
+from .replay import TrajectoryReplay
+from .retrieval import CandidateRetriever, RetrievalResult
+from .router import RouteDecision, RuleBasedRouter, TaskContext
+from .telemetry import Event, RewardBreakdown
+from .trajectory_summary import TrajectorySummarizer
+
+
+@dataclass(slots=True)
+class MemabraRunner:
+    retriever: CandidateRetriever
+    router: RuleBasedRouter
+    execution_engine: ExecutionEngine | None = None
+    persistence_store: PersistenceStore | None = None
+    memory_store: InMemoryMemoryStore | None = None
+    case_index: CaseIndex | None = None
+    outcome_engine: OutcomeEngine = field(default_factory=OutcomeEngine)
+    reward_engine: RewardEngine = field(default_factory=RewardEngine)
+
+    def run(
+        self,
+        *,
+        context: TaskContext,
+        channel: str = "local",
+        user_id: str | None = None,
+        top_k: int = 3,
+        persist: bool = False,
+    ) -> dict[str, Any]:
+        trajectory_id = f"traj-{uuid4()}"
+        task_id = f"task-{uuid4()}"
+        started_at = datetime.now(UTC).isoformat()
+
+        retrieval_result = self.retriever.retrieve(context, top_k=top_k)
+        if self.case_index is not None:
+            best_trajectory_id = self.case_index.best(context.user_input)
+            if best_trajectory_id is not None:
+                summary = f"Previous successful trajectory: {best_trajectory_id}"
+                if self.persistence_store is not None:
+                    try:
+                        past_trajectory = self.persistence_store.load_trajectory(best_trajectory_id)
+                        summary = TrajectorySummarizer().summarize(past_trajectory)
+                    except Exception:
+                        pass  # best effort: keep the ID-only summary if the past trajectory cannot be loaded
+                episodic_candidate = CandidateObject(
+                    id=f"episodic-{best_trajectory_id}",
+                    type="memory",
+                    title="Episodic case",
+                    summary=summary,
+                    triggers=["episodic"],
+                    confidence=0.95,
+                    success_rate=0.95,
+                    freshness=1.0,
+                    tags=["episodic"],
+                    source="case_index",
+                )
+                retrieval_result.memory.insert(0, episodic_candidate)
+        decision = self.router.choose(
+            context,
+            retrieval_result.memory,
+            retrieval_result.skill,
+            retrieval_result.tool,
+        )
+        events = self._build_events(trajectory_id, context, retrieval_result, decision)
+        execution_result = None
+        if self.execution_engine is not None:
+            execution_result = self.execution_engine.execute(decision, context, trajectory_id)
+            events.extend(execution_result.events)
+            self._write_back_memory(decision, context, execution_result)
+        outcome = self.outcome_engine.build_outcome(decision, execution_result)
+        reward = self.reward_engine.compute(
+            decision,
+            outcome,
+            execution_result=execution_result,
+            retrieval_result=retrieval_result,
+        )
+        outcome_dict = {
+            "status": outcome.status,
+            "steps": outcome.steps,
+            "latency_ms": outcome.latency_ms,
+            "user_corrections": outcome.user_corrections,
+            "tool_errors": outcome.tool_errors,
+            "notes": outcome.notes,
+        }
+
+        trajectory = {
+            "trajectory_id": trajectory_id,
+            "task": {
+                "task_id": task_id,
+                "input": context.user_input,
+                "channel": channel,
+                "created_at": started_at,
+                "user_id": user_id,
+            },
+            "context_snapshot": {
+                "conversation_summary": context.conversation_summary,
+                "environment_summary": context.environment_summary,
+                "recent_failures": list(context.recent_failures),
+            },
+            "candidate_sets": {
+                "memory": [self._candidate_to_dict(candidate) for candidate in retrieval_result.memory],
+                "skill": [self._candidate_to_dict(candidate) for candidate in retrieval_result.skill],
+                "tool": [self._candidate_to_dict(candidate) for candidate in retrieval_result.tool],
+            },
+            "decisions": [self._decision_to_dict(decision)],
+            "events": [self._event_to_dict(event) for event in events],
+            "outcome": outcome_dict,
+            "reward": {
+                "total": reward.total,
+                "components": {
+                    "task_success": reward.task_success,
+                    "retrieval_hit": reward.retrieval_hit,
+                    "tool_error": reward.tool_error,
+                    "user_correction": reward.user_correction,
+                    "latency": reward.latency,
+                    "context_cost": reward.context_cost,
+                    "useful_reuse": reward.useful_reuse,
+                },
+            },
+        }
+        if persist and self.persistence_store is not None:
+            self.persistence_store.save_trajectory(trajectory)
+        return trajectory
+
+    def summarize_runs(self, trajectories: list[dict[str, Any]]):
+        replay = TrajectoryReplay()
+        return replay.summarize(trajectories)
+
+    def _write_back_memory(self, decision: RouteDecision, context: TaskContext, execution_result) -> None:
+        if self.memory_store is None:
+            return
+        if decision.decision_type == "inject_memory":
+            for record_id in decision.selected_ids:
+                if self.memory_store.get(record_id) is None:
+                    self.memory_store.upsert(
+                        MemoryRecord(
+                            id=record_id,
+                            memory_type="semantic",
+                            fact_status="assumed",
+                            content=context.user_input,
+                            summary=f"Writeback placeholder for {record_id}",
+                            source=MemorySource(kind="system", ref="runner-writeback"),
+                            confidence=0.5,
+                        )
+                    )
+                self.memory_store.mark_used(record_id)
+
+    def _build_events(
+        self,
+        trajectory_id: str,
+        context: TaskContext,
+        retrieval_result: RetrievalResult,
+        decision: RouteDecision,
+    ) -> list[Event]:
+        task_event_id = f"evt-{uuid4()}"
+        retrieve_event_id = f"evt-{uuid4()}"
+        decision_event_id = f"evt-{uuid4()}"
+        return [
+            Event(
+                event_id=task_event_id,
+                trajectory_id=trajectory_id,
+                stage="retrieval",
+                event_type="task_received",
+                payload={"input": context.user_input},
+            ),
+            Event(
+                event_id=retrieve_event_id,
+                trajectory_id=trajectory_id,
+                stage="retrieval",
+                event_type="candidates_recalled",
+                parent_event_id=task_event_id,
+                payload={
+                    "memory_ids": [candidate.id for candidate in retrieval_result.memory],
+                    "skill_ids": [candidate.id for candidate in retrieval_result.skill],
+                    "tool_ids": [candidate.id for candidate in retrieval_result.tool],
+                },
+            ),
+            Event(
+                event_id=decision_event_id,
+                trajectory_id=trajectory_id,
+                stage="policy",
+                event_type="action_selected",
+                parent_event_id=retrieve_event_id,
+                payload=self._decision_to_dict(decision),
+            ),
+        ]
+
+    def _candidate_to_dict(self, candidate) -> dict[str, Any]:
+        return {
+            "id": candidate.id,
+            "type": candidate.type,
+            "title": candidate.title,
+            "summary": candidate.summary,
+            "triggers": list(candidate.triggers),
+            "cost": candidate.cost,
+            "confidence": candidate.confidence,
+            "success_rate": candidate.success_rate,
+            "freshness": candidate.freshness,
+            "risk": candidate.risk,
+            "tags": list(candidate.tags),
+            "source": candidate.source,
+            "type_payload": dict(candidate.type_payload),
+        }
+
+    def _decision_to_dict(self, decision: RouteDecision) -> dict[str, Any]:
+        return {
+            "step": 1,
+            "decision_type": decision.decision_type,
+            "selected_ids": list(decision.selected_ids),
+            "selected_payloads": [dict(payload) for payload in decision.selected_payloads],
+            "rejected_ids": [],
+            "rationale": decision.rationale,
+            "estimated_cost": decision.estimated_cost,
+        }
+
+    def _event_to_dict(self, event: Event) -> dict[str, Any]:
+        return {
+            "event_id": event.event_id,
+            "trajectory_id": event.trajectory_id,
+            "timestamp": event.timestamp,
+            "stage": event.stage,
+            "event_type": event.event_type,
+            "payload": event.payload,
+            "metrics": event.metrics,
+            "parent_event_id": event.parent_event_id,
+        }
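Note on the event chain: the three events built in `_build_events` are linked through `parent_event_id` (task_received, then candidates_recalled, then action_selected). The sketch below shows one way a consumer of the serialized events might recover that order. `order_by_parent` is a hypothetical helper that is not part of memabra, and it assumes a single linear chain with exactly one root event.

```python
from typing import Any


def order_by_parent(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return events root-first by following parent_event_id links.

    Hypothetical helper: assumes one root and at most one child per event,
    which matches the three-event chain built by the runner.
    """
    # Map each parent id to the event that names it as its parent.
    children = {e["parent_event_id"]: e for e in events if e.get("parent_event_id")}
    roots = [e for e in events if not e.get("parent_event_id")]
    if len(roots) != 1:
        raise ValueError("expected exactly one root event")
    ordered = [roots[0]]
    while ordered[-1]["event_id"] in children:
        ordered.append(children[ordered[-1]["event_id"]])
    return ordered


events = [
    {"event_id": "evt-3", "event_type": "action_selected", "parent_event_id": "evt-2"},
    {"event_id": "evt-1", "event_type": "task_received", "parent_event_id": None},
    {"event_id": "evt-2", "event_type": "candidates_recalled", "parent_event_id": "evt-1"},
]
chain = [e["event_type"] for e in order_by_parent(events)]
print(chain)
```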
diff --git a/src/memabra/schemas.py b/src/memabra/schemas.py
new file mode 100644
index 0000000..bb36ff4
--- /dev/null
+++ b/src/memabra/schemas.py
@@ -0,0 +1,44 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+
+class SchemaValidationError(ValueError):
+    pass
+
+
+class SchemaRegistry:
+    def __init__(self, schema_dir: str | Path = "docs/projects/memabra/schemas"):
+        self.schema_dir = Path(schema_dir)
+
+    def load_schema(self, name: str) -> dict[str, Any]:
+        path = self.schema_dir / name
+        with path.open("r", encoding="utf-8") as handle:
+            return json.load(handle)
+
+    def validate_trajectory(self, document: dict[str, Any]) -> None:
+        self._require_keys(document, ["trajectory_id", "task", "context_snapshot", "candidate_sets", "decisions", "events", "outcome", "reward"])
+        self._require_keys(document["task"], ["task_id", "input", "channel", "created_at"])
+        self._require_keys(document["context_snapshot"], ["conversation_summary", "environment_summary"])
+        self._require_keys(document["candidate_sets"], ["memory", "skill", "tool"])
+        self._require_keys(document["outcome"], ["status", "steps", "latency_ms", "user_corrections"])
+        self._require_keys(document["reward"], ["total", "components"])
+        self._require_keys(
+            document["reward"]["components"],
+            ["task_success", "retrieval_hit", "tool_error", "user_correction", "latency", "context_cost", "useful_reuse"],
+        )
+
+    def validate_memory_record(self, document: dict[str, Any]) -> None:
+        self._require_keys(
+            document,
+            ["id", "memory_type", "fact_status", "content", "summary", "source", "confidence", "created_at", "updated_at", "verification"],
+        )
+        self._require_keys(document["source"], ["kind", "ref"])
+        self._require_keys(document["verification"], ["status", "last_checked_at", "check_method"])
+
+    def _require_keys(self, document: dict[str, Any], keys: list[str]) -> None:
+        missing = [key for key in keys if key not in document]
+        if missing:
+            raise SchemaValidationError(f"Missing required keys: {', '.join(missing)}")
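Note on validation scope: `validate_trajectory` and `validate_memory_record` check only that required keys are present; they do not check value types. The sketch below re-creates the key check in isolation to show the error a caller would see. It is a standalone copy for illustration, not an import from memabra, and the sample record is made up.

```python
from typing import Any


class SchemaValidationError(ValueError):
    """Standalone copy of the exception defined in schemas.py."""


def require_keys(document: dict[str, Any], keys: list[str]) -> None:
    # Collect every missing key so the error names all of them at once.
    missing = [key for key in keys if key not in document]
    if missing:
        raise SchemaValidationError(f"Missing required keys: {', '.join(missing)}")


# Illustrative record whose "source" lacks the required "ref" key.
record = {"id": "mem-1", "source": {"kind": "system"}}
try:
    require_keys(record["source"], ["kind", "ref"])
    message = ""
except SchemaValidationError as exc:
    message = str(exc)
print(message)
```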
diff --git a/src/memabra/telemetry.py b/src/memabra/telemetry.py
new file mode 100644
index 0000000..ef7abda
--- /dev/null
+++ b/src/memabra/telemetry.py
@@ -0,0 +1,38 @@
+from dataclasses import dataclass, field
+from datetime import datetime, timezone
+from typing import Any
+
+
+@dataclass(slots=True)
+class Event:
+    event_id: str
+    trajectory_id: str
+    stage: str
+    event_type: str
+    payload: dict[str, Any]
+    metrics: dict[str, Any] = field(default_factory=dict)
+    parent_event_id: str | None = None
+    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
+
+
+@dataclass(slots=True)
+class RewardBreakdown:
+    task_success: float = 0.0
+    retrieval_hit: float = 0.0
+    tool_error: float = 0.0
+    user_correction: float = 0.0
+    latency: float = 0.0
+    context_cost: float = 0.0
+    useful_reuse: float = 0.0
+
+    @property
+    def total(self) -> float:
+        return (
+            self.task_success
+            + self.retrieval_hit
+            - self.tool_error
+            - self.user_correction
+            - self.latency
+            - self.context_cost
+            + self.useful_reuse
+        )
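Note on sign conventions: `total` adds `task_success`, `retrieval_hit`, and `useful_reuse`, and subtracts the other four components. Penalties therefore appear to be meant as non-negative magnitudes, which matches the test fixture's `0.1 * tool_errors`. A small worked example, using a standalone copy of the dataclass (slots omitted) and made-up values:

```python
from dataclasses import dataclass


@dataclass
class RewardBreakdown:
    # Standalone copy of the fields in telemetry.py, for illustration only.
    task_success: float = 0.0
    retrieval_hit: float = 0.0
    tool_error: float = 0.0
    user_correction: float = 0.0
    latency: float = 0.0
    context_cost: float = 0.0
    useful_reuse: float = 0.0

    @property
    def total(self) -> float:
        # Bonuses are added; penalties are subtracted.
        return (
            self.task_success
            + self.retrieval_hit
            - self.tool_error
            - self.user_correction
            - self.latency
            - self.context_cost
            + self.useful_reuse
        )


# 1.0 + 0.5 - 0.25 - 0.0 - 0.125 - 0.0 + 0.25
reward = RewardBreakdown(
    task_success=1.0, retrieval_hit=0.5, tool_error=0.25, latency=0.125, useful_reuse=0.25
)
print(reward.total)
```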
diff --git a/src/memabra/training_reports.py b/src/memabra/training_reports.py
new file mode 100644
index 0000000..c1d06f3
--- /dev/null
+++ b/src/memabra/training_reports.py
@@ -0,0 +1,75 @@
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+from uuid import uuid4
+
+from .evaluator import EvaluationResult
+from .promotion import PromotionDecision
+
+
+def build_report(
+    *,
+    source_trajectory_ids: list[str],
+    baseline: EvaluationResult,
+    challenger: EvaluationResult,
+    decision: PromotionDecision,
+    promoted_version_id: str | None = None,
+    baseline_version_id: str | None = None,
+) -> dict[str, Any]:
+    return {
+        "report_id": f"report-{uuid4()}",
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+        "source_trajectory_ids": source_trajectory_ids,
+        "sample_count": len(source_trajectory_ids),
+        "baseline_metrics": {
+            "task_count": baseline.task_count,
+            "avg_reward": baseline.avg_reward,
+            "error_rate": baseline.error_rate,
+            "avg_latency_ms": baseline.avg_latency_ms,
+        },
+        "challenger_metrics": {
+            "task_count": challenger.task_count,
+            "avg_reward": challenger.avg_reward,
+            "error_rate": challenger.error_rate,
+            "avg_latency_ms": challenger.avg_latency_ms,
+        },
+        "promotion_decision": {
+            "accepted": decision.accepted,
+            "reasons": decision.reasons,
+            "metrics": decision.metrics,
+        },
+        "promoted_version_id": promoted_version_id,
+        "baseline_version_id": baseline_version_id,
+    }
+
+
+@dataclass
+class TrainingReportStore:
+    base_dir: str | Path = field(default="docs/projects/memabra/training-reports")
+
+    def __post_init__(self):
+        self._base = Path(self.base_dir)
+        self._base.mkdir(parents=True, exist_ok=True)
+
+    def save(self, report: dict[str, Any]) -> dict[str, Any]:
+        report_id = report["report_id"]
+        path = self._base / f"{report_id}.json"
+        path.write_text(json.dumps(report, indent=2), encoding="utf-8")
+        return {"report_id": report_id, "path": str(path)}
+
+    def list_reports(self) -> list[dict[str, Any]]:
+        reports = []
+        for path in sorted(self._base.glob("*.json")):
+            record = json.loads(path.read_text(encoding="utf-8"))
+            reports.append(record)
+        return reports
+
+    def get_report(self, report_id: str) -> dict[str, Any] | None:
+        path = self._base / f"{report_id}.json"
+        if not path.exists():
+            return None
+        return json.loads(path.read_text(encoding="utf-8"))
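Note on ordering: `list_reports` sorts by file name, and report IDs are built from random UUIDs, so the returned order is not chronological. A caller that wants the most recent report could select it by the `timestamp` field instead. `latest_report` below is a hypothetical helper, not part of memabra, and the sample reports are made up.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any


def latest_report(reports: list[dict[str, Any]]) -> dict[str, Any] | None:
    """Return the report with the newest ISO-8601 timestamp, or None if empty."""
    if not reports:
        return None
    # Parse rather than compare strings so differing UTC offsets still order correctly.
    return max(reports, key=lambda r: datetime.fromisoformat(r["timestamp"]))


reports = [
    {"report_id": "report-b", "timestamp": "2026-04-15T06:00:00+00:00"},
    {"report_id": "report-a", "timestamp": "2026-04-15T07:30:00+00:00"},
]
newest = latest_report(reports)
print(newest["report_id"])
```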
diff --git a/src/memabra/trajectory_summary.py b/src/memabra/trajectory_summary.py
new file mode 100644
index 0000000..0a67a75
--- /dev/null
+++ b/src/memabra/trajectory_summary.py
@@ -0,0 +1,35 @@
+from __future__ import annotations
+
+from typing import Any
+
+
+class TrajectorySummarizer:
+    def summarize(self, trajectory: dict[str, Any]) -> str:
+        task_input = ""
+        if "task" in trajectory and isinstance(trajectory["task"], dict):
+            task_input = trajectory["task"].get("input") or ""
+        if len(task_input) > 60:
+            task_input = task_input[:57] + "..."
+
+        decisions = trajectory.get("decisions", [])
+        action_types = [d.get("decision_type", "unknown") for d in decisions] if isinstance(decisions, list) else []
+        action_str = " -> ".join(action_types) if action_types else "none"
+
+        outcome = trajectory.get("outcome", {}) if isinstance(trajectory.get("outcome"), dict) else {}
+        status = outcome.get("status", "unknown")
+        reward = trajectory.get("reward", {}).get("total", 0.0) if isinstance(trajectory.get("reward"), dict) else 0.0
+        steps = outcome.get("steps", 0)
+        tool_errors = outcome.get("tool_errors", 0)
+        user_corrections = outcome.get("user_corrections", 0)
+
+        parts = [
+            f"Task: '{task_input}'",
+            f"Actions: {action_str}",
+            f"Outcome: {status} (reward={reward}, steps={steps})",
+        ]
+        if tool_errors:
+            parts.append(f"Tool errors: {tool_errors}")
+        if user_corrections:
+            parts.append(f"User corrections: {user_corrections}")
+
+        return " | ".join(parts)
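Note on truncation: `summarize` caps the task input at 60 characters, keeping the first 57 and appending `...`. The sketch below isolates that rule; `truncate_input` is a hypothetical name used only for illustration.

```python
def truncate_input(text: str, limit: int = 60) -> str:
    """Cap text at `limit` characters, counting the trailing ellipsis."""
    if len(text) > limit:
        return text[: limit - 3] + "..."
    return text


short = truncate_input("Check the current system status.")
long = truncate_input("x" * 80)
print(short)
print(len(long), long.endswith("..."))
```

Inputs of exactly 60 characters pass through unchanged, because the comparison is strict.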
diff --git a/tests/test_app.py b/tests/test_app.py
new file mode 100644
index 0000000..69b063f
--- /dev/null
+++ b/tests/test_app.py
@@ -0,0 +1,197 @@
+from pathlib import Path
+
+from memabra.app import MemabraApp, build_app_with_skills, build_demo_app
+
+
+def test_build_demo_app_runs_task_and_produces_summary(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+
+    trajectory = app.run_task("Use my telegram preference for this answer.", channel="telegram", user_id="oza")
+    summary = app.replay_summary()
+
+    assert trajectory["trajectory_id"].startswith("traj-")
+    assert summary.trajectories == 1
+    assert any(event["event_type"] == "memory_injected" for event in trajectory["events"])
+    assert len(list((tmp_path / "demo-artifacts" / "trajectories").glob("*.json"))) == 1
+
+
+def test_app_can_run_tool_task_with_demo_backend(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+
+    trajectory = app.run_task("Check the current system status.")
+
+    assert trajectory["decisions"][0]["decision_type"] == "call_tool"
+    assert any(event["event_type"] == "tool_result" for event in trajectory["events"])
+    assert trajectory["outcome"]["status"] == "success"
+
+
+def test_build_app_with_skills_loads_real_skill_from_filesystem(tmp_path: Path):
+    skill_dir = tmp_path / "skills" / "github-auth"
+    skill_dir.mkdir(parents=True)
+    (skill_dir / "SKILL.md").write_text(
+        "---\n"
+        "name: github-auth\n"
+        "description: Authenticate with GitHub.\n"
+        "---\n\n"
+        "# GitHub Auth\n\n"
+        "Use git or gh.\n"
+    )
+
+    app = build_app_with_skills(base_dir=tmp_path / "artifacts", skill_search_paths=[tmp_path / "skills"])
+
+    # github-auth is not in the candidate set by default, so the router won't select it.
+    # Here we only check that the app builds and that a memory task still works.
+    trajectory = app.run_task("Use my telegram preference for this answer.", channel="telegram", user_id="oza")
+    assert trajectory["decisions"][0]["decision_type"] == "inject_memory"
+
+    # Now verify the skill backend is actually wired up by loading the skill directly.
+    backend = app.runner.execution_engine.skill_executor.backend
+    payload = backend.load_skill("github-auth")
+    assert payload["name"] == "github-auth"
+    assert "Use git or gh." in payload["content"]
+
+
+def test_app_artifact_index_queries_persisted_trajectories(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+
+    app.run_task("Use my telegram preference for this answer.", channel="telegram", user_id="u1")
+    app.run_task("Check the current system status.", channel="local", user_id="u2")
+
+    index = app.artifact_index()
+    telegram_trajs = index.query(channel="telegram")
+    tool_trajs = index.query(decision_type="call_tool")
+
+    assert len(telegram_trajs) == 1
+    assert telegram_trajs[0]["task"]["input"] == "Use my telegram preference for this answer."
+    assert len(tool_trajs) == 1
+    assert tool_trajs[0]["task"]["input"] == "Check the current system status."
+
+    slice_ids = index.slice_dataset(channel="local")
+    assert len(slice_ids) == 1
+
+
+def test_app_run_online_learning_cycle_returns_report(tmp_path: Path):
+    from memabra.benchmarks import BenchmarkTask
+    from memabra.promotion import PromotionPolicy
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    # Seed trajectories
+    for i in range(10):
+        app.run_task(f"Task {i}")
+
+    result = app.run_online_learning_cycle(
+        policy=PromotionPolicy(
+            min_reward_delta=-1.0,
+            max_error_rate_increase=1.0,
+            max_latency_increase_ms=10000.0,
+            required_task_count=1,
+        ),
+        benchmark_tasks=[BenchmarkTask(user_input="Task 0")],
+        min_new_trajectories=1,
+    )
+
+    assert "skipped" in result
+    assert "promoted" in result or result["skipped"] is True
+    assert "report_id" in result
+
+
+def test_app_run_online_learning_cycle_uses_baseline_version(tmp_path: Path):
+    from memabra.benchmarks import BenchmarkTask
+    from memabra.promotion import PromotionPolicy
+    from memabra.router import SimpleLearningRouter
+    from memabra.router_versioning import RouterVersionStore
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    for i in range(10):
+        app.run_task(f"Task {i}")
+
+    # Save a baseline version
+    baseline_router = SimpleLearningRouter()
+    baseline_router._weights = {"call_tool": {"input_length": 0.99}}
+    baseline_router._feature_keys = ["input_length"]
+    version_dir = tmp_path / "versions"
+    store = RouterVersionStore(base_dir=version_dir)
+    store.save(baseline_router, version_id="v-baseline")
+
+    # Change current router
+    app.set_router(SimpleLearningRouter())
+
+    result = app.run_online_learning_cycle(
+        policy=PromotionPolicy(
+            min_reward_delta=-1.0,
+            max_error_rate_increase=1.0,
+            max_latency_increase_ms=10000.0,
+            required_task_count=1,
+        ),
+        benchmark_tasks=[BenchmarkTask(user_input="Task 0")],
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        baseline_version_id="v-baseline",
+    )
+
+    assert result["skipped"] is False
+    assert "baseline_metrics" in result
+    assert "challenger_metrics" in result
+
+
+def test_app_run_online_learning_cycle_rebuilds_case_index(tmp_path: Path):
+    from memabra.benchmarks import BenchmarkTask
+    from memabra.promotion import PromotionPolicy
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    for i in range(10):
+        app.run_task(f"Task {i}")
+
+    case_index_path = tmp_path / "case-index.json"
+    result = app.run_online_learning_cycle(
+        policy=PromotionPolicy(
+            min_reward_delta=-1.0,
+            max_error_rate_increase=1.0,
+            max_latency_increase_ms=10000.0,
+            required_task_count=1,
+        ),
+        benchmark_tasks=[BenchmarkTask(user_input="Task 0")],
+        min_new_trajectories=1,
+        case_index_path=case_index_path,
+    )
+
+    assert result["skipped"] is False
+    assert case_index_path.exists()
+    from memabra.case_index import CaseIndex
+
+    index = CaseIndex.load(case_index_path)
+    assert index.best("Task 0") is not None
+
+
+def test_app_build_case_index_from_trajectories(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    app.run_task("Hello world", channel="local", user_id="u1")
+    app.run_task("Hello world", channel="local", user_id="u2")
+
+    case_index = app.build_case_index()
+
+    assert case_index.best("Hello world") is not None
+
+
+def test_app_save_and_load_case_index(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    app.run_task("Persist this case", channel="local", user_id="u1")
+
+    case_index_path = tmp_path / "case-index.json"
+    app.build_case_index()
+    app.save_case_index(case_index_path)
+    loaded_app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    loaded_app.load_case_index(case_index_path)
+
+    assert loaded_app.case_index is not None
+    assert loaded_app.case_index.best("Persist this case") is not None
+
+
+def test_app_best_trajectory_for_input(tmp_path: Path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    trajectory = app.run_task("Find the best trajectory", channel="local", user_id="u1")
+
+    app.build_case_index()
+    best_id = app.best_trajectory_for("Find the best trajectory")
+
+    assert best_id == trajectory["trajectory_id"]
diff --git a/tests/test_artifact_index.py b/tests/test_artifact_index.py
new file mode 100644
index 0000000..e62495a
--- /dev/null
+++ b/tests/test_artifact_index.py
@@ -0,0 +1,169 @@
+from pathlib import Path
+
+from memabra.persistence import PersistenceStore
+from memabra.artifact_index import ArtifactIndex
+
+
+def _make_trajectory(
+    trajectory_id: str,
+    *,
+    status: str = "success",
+    decision_type: str = "direct_answer",
+    channel: str = "local",
+    reward_total: float = 1.0,
+    latency_ms: int = 100,
+    tool_errors: int = 0,
+    user_corrections: int = 0,
+    input_text: str = "Hello",
+    created_at: str = "2026-01-15T10:00:00Z",
+):
+    return {
+        "trajectory_id": trajectory_id,
+        "task": {
+            "task_id": f"task-{trajectory_id}",
+            "input": input_text,
+            "channel": channel,
+            "created_at": created_at,
+            "user_id": None,
+        },
+        "context_snapshot": {"conversation_summary": "", "environment_summary": "", "recent_failures": []},
+        "candidate_sets": {"memory": [], "skill": [], "tool": []},
+        "decisions": [
+            {
+                "step": 1,
+                "decision_type": decision_type,
+                "selected_ids": [],
+                "selected_payloads": [],
+                "rejected_ids": [],
+                "rationale": "",
+                "estimated_cost": 0.0,
+            }
+        ],
+        "events": [],
+        "outcome": {
+            "status": status,
+            "steps": 1,
+            "latency_ms": latency_ms,
+            "user_corrections": user_corrections,
+            "tool_errors": tool_errors,
+            "notes": None,
+        },
+        "reward": {
+            "total": reward_total,
+            "components": {
+                "task_success": 1.0 if status == "success" else 0.0,
+                "retrieval_hit": 0.0,
+                "tool_error": 0.1 * tool_errors,
+                "user_correction": 0.1 * user_corrections,
+                "latency": 0.0,
+                "context_cost": 0.0,
+                "useful_reuse": 0.0,
+            },
+        },
+    }
+
+
+def test_artifact_index_lists_all_trajectories(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", status="success"))
+    persistence.save_trajectory(_make_trajectory("traj-2", status="failure"))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    results = index.query()
+
+    assert len(results) == 2
+    assert {r["trajectory_id"] for r in results} == {"traj-1", "traj-2"}
+
+
+def test_artifact_index_filters_by_status(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", status="success"))
+    persistence.save_trajectory(_make_trajectory("traj-2", status="failure"))
+    persistence.save_trajectory(_make_trajectory("traj-3", status="partial_success"))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    successes = index.query(status="success")
+    failures = index.query(status="failure")
+
+    assert len(successes) == 1
+    assert successes[0]["trajectory_id"] == "traj-1"
+    assert len(failures) == 1
+    assert failures[0]["trajectory_id"] == "traj-2"
+
+
+def test_artifact_index_filters_by_reward_range(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", reward_total=0.9))
+    persistence.save_trajectory(_make_trajectory("traj-2", reward_total=0.5))
+    persistence.save_trajectory(_make_trajectory("traj-3", reward_total=-0.2))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    high = index.query(min_reward=0.6)
+    low = index.query(max_reward=0.0)
+
+    assert len(high) == 1 and high[0]["trajectory_id"] == "traj-1"
+    assert len(low) == 1 and low[0]["trajectory_id"] == "traj-3"
+
+
+def test_artifact_index_filters_by_decision_type_and_channel(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", decision_type="direct_answer", channel="local"))
+    persistence.save_trajectory(_make_trajectory("traj-2", decision_type="call_tool", channel="telegram"))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    tools = index.query(decision_type="call_tool")
+    telegram = index.query(channel="telegram")
+
+    assert len(tools) == 1 and tools[0]["trajectory_id"] == "traj-2"
+    assert len(telegram) == 1 and telegram[0]["trajectory_id"] == "traj-2"
+
+
+def test_artifact_index_filters_by_tool_errors_and_user_corrections(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", tool_errors=0, user_corrections=0))
+    persistence.save_trajectory(_make_trajectory("traj-2", tool_errors=2, user_corrections=1))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    with_errors = index.query(min_tool_errors=1)
+    with_corrections = index.query(min_user_corrections=1)
+
+    assert len(with_errors) == 1 and with_errors[0]["trajectory_id"] == "traj-2"
+    assert len(with_corrections) == 1 and with_corrections[0]["trajectory_id"] == "traj-2"
+
+
+def test_artifact_index_filters_by_input_text(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", input_text="Deploy the service"))
+    persistence.save_trajectory(_make_trajectory("traj-2", input_text="Check status"))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    deploy = index.query(input_contains="deploy")
+    status = index.query(input_contains="STATUS")
+
+    assert len(deploy) == 1 and deploy[0]["trajectory_id"] == "traj-1"
+    assert len(status) == 1 and status[0]["trajectory_id"] == "traj-2"
+
+
+def test_artifact_index_slice_dataset_returns_ids(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1", status="success", reward_total=0.9))
+    persistence.save_trajectory(_make_trajectory("traj-2", status="failure", reward_total=-0.1))
+    persistence.save_trajectory(_make_trajectory("traj-3", status="success", reward_total=0.95))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    slice_ids = index.slice_dataset(status="success", min_reward=0.8)
+
+    assert slice_ids == ["traj-1", "traj-3"]
+
+
+def test_artifact_index_refresh_picks_up_new_files(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(_make_trajectory("traj-1"))
+
+    index = ArtifactIndex(persistence_store=persistence)
+    assert len(index.query()) == 1
+
+    persistence.save_trajectory(_make_trajectory("traj-2"))
+    index.refresh()
+
+    assert len(index.query()) == 2
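Note on query semantics: the tests above suggest that filters narrow the result together (for example `status` plus `min_reward`) and that `input_contains` matches case-insensitively. The `ArtifactIndex` implementation is not shown in this part of the diff, so the following is only a sketch of that behavior under those assumptions; the free function `query` and the sample data are hypothetical.

```python
from __future__ import annotations

from typing import Any


def query(
    trajectories: list[dict[str, Any]],
    *,
    status: str | None = None,
    min_reward: float | None = None,
    input_contains: str | None = None,
) -> list[dict[str, Any]]:
    """Return trajectories that satisfy every filter that was supplied."""
    results = []
    for traj in trajectories:
        if status is not None and traj["outcome"]["status"] != status:
            continue
        if min_reward is not None and traj["reward"]["total"] < min_reward:
            continue
        # Case-insensitive substring match on the task input.
        if input_contains is not None and input_contains.lower() not in traj["task"]["input"].lower():
            continue
        results.append(traj)
    return results


trajectories = [
    {"trajectory_id": "traj-1", "task": {"input": "Deploy the service"},
     "outcome": {"status": "success"}, "reward": {"total": 0.9}},
    {"trajectory_id": "traj-2", "task": {"input": "Check status"},
     "outcome": {"status": "failure"}, "reward": {"total": -0.1}},
    {"trajectory_id": "traj-3", "task": {"input": "Deploy again"},
     "outcome": {"status": "success"}, "reward": {"total": 0.95}},
]
good_ids = [t["trajectory_id"] for t in query(trajectories, status="success", min_reward=0.8)]
status_ids = [t["trajectory_id"] for t in query(trajectories, input_contains="STATUS")]
print(good_ids, status_ids)
```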
diff --git a/tests/test_benchmarks.py b/tests/test_benchmarks.py
new file mode 100644
index 0000000..fff3554
--- /dev/null
+++ b/tests/test_benchmarks.py
@@ -0,0 +1,38 @@
+from __future__ import annotations
+
+from memabra.benchmarks import BenchmarkSuite, BenchmarkTask, save_benchmark_suite, load_benchmark_suite, default_benchmark_suite
+
+
+def test_benchmark_suite_roundtrip(tmp_path):
+    path = tmp_path / "suite.json"
+    suite = BenchmarkSuite(
+        name="test-suite",
+        tasks=[
+            BenchmarkTask(user_input="Hello", channel="local", user_id="u1"),
+            BenchmarkTask(user_input="World", channel="telegram"),
+        ],
+    )
+
+    save_benchmark_suite(suite, path)
+    loaded = load_benchmark_suite(path)
+
+    assert loaded.name == "test-suite"
+    assert len(loaded.tasks) == 2
+    assert loaded.tasks[0].user_input == "Hello"
+    assert loaded.tasks[0].channel == "local"
+    assert loaded.tasks[0].user_id == "u1"
+    assert loaded.tasks[1].user_input == "World"
+    assert loaded.tasks[1].channel == "telegram"
+    assert loaded.tasks[1].user_id is None
+
+
+def test_default_benchmark_suite_covers_expected_categories():
+    suite = default_benchmark_suite()
+
+    assert suite.name == "default"
+    assert len(suite.tasks) >= 4
+    inputs = [t.user_input.lower() for t in suite.tasks]
+    assert any("memory" in i or "preference" in i for i in inputs)
+    assert any("skill" in i or "deploy" in i for i in inputs)
+    assert any("tool" in i or "status" in i for i in inputs)
+    assert any("composite" in i or "multiple" in i for i in inputs)
diff --git a/tests/test_case_index.py b/tests/test_case_index.py
new file mode 100644
index 0000000..e2d5a06
--- /dev/null
+++ b/tests/test_case_index.py
@@ -0,0 +1,50 @@
+from memabra.case_index import CaseIndex
+
+
+def test_case_index_adds_and_retrieves_best_trajectory():
+    index = CaseIndex()
+    trajectory = {
+        "trajectory_id": "traj-1",
+        "task": {"input": "Hello world"},
+        "outcome": {"status": "success"},
+        "reward": {"total": 1.0},
+    }
+    index.add(trajectory)
+    assert index.best("Hello world") == "traj-1"
+
+
+def test_case_index_returns_none_for_unknown_input():
+    index = CaseIndex()
+    assert index.best("Unknown input") is None
+
+
+def test_case_index_keeps_higher_reward_for_same_input():
+    index = CaseIndex()
+    index.add({
+        "trajectory_id": "traj-low",
+        "task": {"input": "Same input"},
+        "outcome": {"status": "success"},
+        "reward": {"total": 0.5},
+    })
+    index.add({
+        "trajectory_id": "traj-high",
+        "task": {"input": "Same input"},
+        "outcome": {"status": "success"},
+        "reward": {"total": 1.5},
+    })
+    assert index.best("Same input") == "traj-high"
+
+
+def test_case_index_save_and_round_trip(tmp_path):
+    index = CaseIndex()
+    index.add({
+        "trajectory_id": "traj-save",
+        "task": {"input": "Persist me"},
+        "outcome": {"status": "success"},
+        "reward": {"total": 2.0},
+    })
+    path = tmp_path / "case_index.json"
+    index.save(path)
+
+    loaded = CaseIndex.load(path)
+    assert loaded.best("Persist me") == "traj-save"
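Note on the behavior these tests pin down: for a given input, the index keeps the trajectory with the highest reward, and the mapping survives a save/load round trip. The real `CaseIndex` is not shown in this part of the diff; `MiniCaseIndex` below is a hypothetical stand-in written only from the tests. Whether the real index also filters on outcome status, or normalizes the input text, is not visible here.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Any


class MiniCaseIndex:
    """Hypothetical sketch: map each exact task input to its best trajectory."""

    def __init__(self) -> None:
        self._best: dict[str, dict[str, Any]] = {}

    def add(self, trajectory: dict[str, Any]) -> None:
        key = trajectory["task"]["input"]
        reward = trajectory["reward"]["total"]
        current = self._best.get(key)
        # Replace the stored entry only when the new reward is strictly higher.
        if current is None or reward > current["reward"]:
            self._best[key] = {"trajectory_id": trajectory["trajectory_id"], "reward": reward}

    def best(self, task_input: str) -> str | None:
        entry = self._best.get(task_input)
        return entry["trajectory_id"] if entry else None

    def save(self, path: Path) -> None:
        Path(path).write_text(json.dumps(self._best), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "MiniCaseIndex":
        index = cls()
        index._best = json.loads(Path(path).read_text(encoding="utf-8"))
        return index


index = MiniCaseIndex()
index.add({"trajectory_id": "traj-low", "task": {"input": "Same input"}, "reward": {"total": 0.5}})
index.add({"trajectory_id": "traj-high", "task": {"input": "Same input"}, "reward": {"total": 1.5}})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "case_index.json"
    index.save(path)
    loaded = MiniCaseIndex.load(path)

print(loaded.best("Same input"))
```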
diff --git a/tests/test_cli_workflow.py b/tests/test_cli_workflow.py
new file mode 100644
index 0000000..a6d3d39
--- /dev/null
+++ b/tests/test_cli_workflow.py
@@ -0,0 +1,574 @@
+from pathlib import Path
+
+from memabra.cli import format_output, run_online_learning_workflow, run_wrapup_workflow
+
+
+def test_run_wrapup_workflow_trains_evaluates_and_versions_router(tmp_path: Path):
+    result = run_wrapup_workflow(base_dir=tmp_path / "demo-artifacts")
+
+    assert result["seed_summary"]["trajectories"] >= 3
+    assert "baseline" in result["comparison"]
+    assert "challenger" in result["comparison"]
+    assert result["saved_version"]["version_id"]
+    assert (tmp_path / "demo-artifacts" / "router-versions" / "current.json").exists()
+
+
+def test_run_online_learning_workflow_runs_cycle_and_returns_report(tmp_path: Path):
+    result = run_online_learning_workflow(base_dir=tmp_path / "demo-artifacts")
+
+    assert "skipped" in result
+    assert "report_id" in result
+    # The workflow seeds its own tasks, so the cycle should not be skipped.
+    assert result["skipped"] is False
+    assert result["promoted"] is True
+    assert (tmp_path / "demo-artifacts" / "training-reports").exists()
+
+
+def test_format_output_workflow_text_includes_decision_reason_and_dry_run():
+    payload = {
+        "report_id": "report-123",
+        "skipped": False,
+        "promoted": False,
+        "dry_run": True,
+        "decision": {
+            "accepted": False,
+            "reasons": ["Reward delta too small", "Latency increased"],
+            "metrics": {
+                "reward_delta": -0.12,
+                "error_rate_delta": 0.02,
+                "latency_delta_ms": 12.5,
+            },
+        },
+        "baseline_metrics": {
+            "avg_reward": 1.0,
+            "error_rate": 0.1,
+            "avg_latency_ms": 120.0,
+        },
+        "challenger_metrics": {
+            "avg_reward": 0.88,
+            "error_rate": 0.12,
+            "avg_latency_ms": 132.5,
+        },
+    }
+
+    rendered = format_output(payload, output_format="text", mode="workflow")
+
+    assert "Memabra online learning result" in rendered
+    assert "Summary" in rendered
+    assert "Report ID: report-123" in rendered
+    assert "Skipped: no" in rendered
+    assert "Promoted: no" in rendered
+    assert "Dry run: yes" in rendered
+    assert "Baseline" in rendered
+    assert "Reward: 1.0000" in rendered
+    assert "Error rate: 0.1000" in rendered
+    assert "Latency (ms): 120.0000" in rendered
+    assert "Challenger" in rendered
+    assert "Reward: 0.8800" in rendered
+    assert "Deltas" in rendered
+    assert "Reward delta: -0.1200" in rendered
+    assert "Error rate delta: 0.0200" in rendered
+    assert "Latency delta (ms): 12.5000" in rendered
+    assert "Decision" in rendered
+    assert "Reason: Reward delta too small; Latency increased" in rendered
+
+
+def test_format_output_workflow_text_includes_error_details():
+    payload = {
+        "report_id": "report-err",
+        "skipped": False,
+        "promoted": False,
+        "error": "benchmark crashed",
+    }
+
+    rendered = format_output(payload, output_format="text", mode="workflow")
+
+    assert "Error: benchmark crashed" in rendered
+
+
+def test_format_output_status_text_includes_latest_report_details():
+    payload = {
+        "base_dir": "/tmp/demo-artifacts",
+        "current_version_id": "v2",
+        "version_count": 2,
+        "trajectory_count": 8,
+        "report_count": 3,
+        "latest_report": {
+            "report_id": "report-9",
+            "timestamp": "2026-04-15T06:00:00+00:00",
+            "promoted": True,
+        },
+    }
+
+    rendered = format_output(payload, output_format="text", mode="status")
+
+    assert "Memabra status" in rendered
+    assert "Current version: v2" in rendered
+    assert "Latest report: report-9" in rendered
+    assert "Latest report time: 2026-04-15T06:00:00+00:00" in rendered
+    assert "Latest promotion accepted: yes" in rendered
+
+
+def test_format_output_list_versions_text_marks_current_version():
+    payload = {
+        "current_version_id": "v2",
+        "versions": [
+            {"version_id": "v1", "metadata": {"source": "seed", "avg_reward": 1.2}},
+            {"version_id": "v2", "metadata": {"source": "online_learning", "avg_reward": 1.4}},
+        ],
+    }
+
+    rendered = format_output(payload, output_format="text", mode="list_versions")
+
+    assert "Saved router versions (2 total)" in rendered
+    assert "Current version: v2" in rendered
+    assert "1. v1 (source=seed, avg_reward=1.2)" in rendered
+    assert "2. v2 (current, source=online_learning, avg_reward=1.4)" in rendered
+
+
+def test_main_entrypoint_uses_online_learning_workflow(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, **kwargs):
+        calls.append({"base_dir": str(base_dir), "min_new_trajectories": min_new_trajectories, "seen_trajectory_store": seen_trajectory_store})
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main()
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["min_new_trajectories"] == 3
+
+
+def test_main_entrypoint_parses_base_dir_argument(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, **kwargs):
+        calls.append({"base_dir": str(base_dir) if base_dir else None, "min_new_trajectories": min_new_trajectories, "seen_trajectory_store": seen_trajectory_store})
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--base-dir", "/custom/path"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["base_dir"] == "/custom/path"
+
+
+def test_main_entrypoint_parses_min_new_trajectories_argument(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, **kwargs):
+        calls.append({"base_dir": str(base_dir) if base_dir else None, "min_new_trajectories": min_new_trajectories, "seen_trajectory_store": seen_trajectory_store})
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--min-new-trajectories", "10"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["min_new_trajectories"] == 10
+
+
+def test_run_online_learning_workflow_skips_on_second_run_when_seen_store_provided(tmp_path: Path):
+    base_dir = tmp_path / "demo-artifacts"
+    seen_store = tmp_path / "seen.json"
+
+    result1 = run_online_learning_workflow(
+        base_dir=base_dir,
+        min_new_trajectories=1,
+        seen_trajectory_store=seen_store,
+    )
+    assert result1["skipped"] is False
+
+    result2 = run_online_learning_workflow(
+        base_dir=base_dir,
+        min_new_trajectories=1,
+        seen_trajectory_store=seen_store,
+    )
+    assert result2["skipped"] is True
+    assert "too few new trajectories" in result2["reason"].lower()
+
+
+def test_main_entrypoint_passes_default_seen_trajectory_store(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, dry_run=False, **kwargs):
+        calls.append({
+            "base_dir": str(base_dir) if base_dir else None,
+            "min_new_trajectories": min_new_trajectories,
+            "seen_trajectory_store": str(seen_trajectory_store) if seen_trajectory_store else None,
+            "dry_run": dry_run,
+        })
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main()
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["seen_trajectory_store"] is not None
+    assert "seen-trajectories.json" in calls[0]["seen_trajectory_store"]
+    assert calls[0]["dry_run"] is False
+
+
+def test_main_entrypoint_passes_dry_run_flag(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, dry_run=False, **kwargs):
+        calls.append({
+            "base_dir": str(base_dir) if base_dir else None,
+            "min_new_trajectories": min_new_trajectories,
+            "seen_trajectory_store": str(seen_trajectory_store) if seen_trajectory_store else None,
+            "dry_run": dry_run,
+            "baseline_version": kwargs.get("baseline_version"),
+        })
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--dry-run"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["dry_run"] is True
+
+
+def test_main_entrypoint_passes_baseline_version_flag(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, dry_run=False, baseline_version=None, **kwargs):
+        calls.append({
+            "base_dir": str(base_dir) if base_dir else None,
+            "min_new_trajectories": min_new_trajectories,
+            "seen_trajectory_store": str(seen_trajectory_store) if seen_trajectory_store else None,
+            "dry_run": dry_run,
+            "baseline_version": baseline_version,
+        })
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--baseline-version", "v1"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["baseline_version"] == "v1"
+
+
+def test_main_entrypoint_supports_text_format_for_workflow(monkeypatch, capsys):
+    from memabra import cli
+
+    def mock_online_learning_workflow(**kwargs):
+        return {
+            "skipped": False,
+            "promoted": False,
+            "report_id": "report-text",
+            "dry_run": True,
+            "decision": {
+                "accepted": False,
+                "reasons": ["Dry run requested"],
+                "metrics": {
+                    "reward_delta": 0.05,
+                    "error_rate_delta": 0.0,
+                    "latency_delta_ms": 4.0,
+                },
+            },
+            "baseline_metrics": {
+                "avg_reward": 0.8,
+                "error_rate": 0.1,
+                "avg_latency_ms": 90.0,
+            },
+            "challenger_metrics": {
+                "avg_reward": 0.85,
+                "error_rate": 0.1,
+                "avg_latency_ms": 94.0,
+            },
+        }
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--format", "text", "--dry-run"])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert "Memabra online learning result" in captured.out
+    assert "Summary" in captured.out
+    assert "Dry run: yes" in captured.out
+    assert "Baseline" in captured.out
+    assert "Reward: 0.8000" in captured.out
+    assert "Challenger" in captured.out
+    assert "Reward: 0.8500" in captured.out
+    assert "Deltas" in captured.out
+    assert "Reward delta: 0.0500" in captured.out
+    assert "Reason: Dry run requested" in captured.out
+
+
+def test_main_entrypoint_passes_case_index_flags(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, dry_run=False, baseline_version=None, case_index_path=None, rebuild_case_index=False, **kwargs):
+        calls.append({
+            "base_dir": str(base_dir) if base_dir else None,
+            "case_index_path": str(case_index_path) if case_index_path else None,
+            "rebuild_case_index": rebuild_case_index,
+        })
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--case-index", "/tmp/cases.json", "--rebuild-case-index"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["case_index_path"] == "/tmp/cases.json"
+    assert calls[0]["rebuild_case_index"] is True
+
+
+def test_run_online_learning_workflow_loads_existing_case_index(tmp_path: Path):
+    base_dir = tmp_path / "demo-artifacts"
+    case_index_path = tmp_path / "case-index.json"
+
+    # Run once to create trajectories and rebuild case index
+    result1 = run_online_learning_workflow(base_dir=base_dir, min_new_trajectories=1, rebuild_case_index=True, case_index_path=case_index_path)
+    assert result1["skipped"] is False
+    assert case_index_path.exists()
+
+    # Second run should load the existing case index
+    result2 = run_online_learning_workflow(base_dir=base_dir, min_new_trajectories=1, rebuild_case_index=False, case_index_path=case_index_path)
+    assert result2["skipped"] is False
+
+
+def test_run_online_learning_workflow_rebuilds_case_index_after_cycle(tmp_path: Path):
+    base_dir = tmp_path / "demo-artifacts"
+    case_index_path = tmp_path / "case-index.json"
+
+    result = run_online_learning_workflow(
+        base_dir=base_dir,
+        min_new_trajectories=1,
+        case_index_path=case_index_path,
+    )
+    assert result["skipped"] is False
+    assert case_index_path.exists()
+    from memabra.case_index import CaseIndex
+
+    index = CaseIndex.load(case_index_path)
+    # The benchmark task during the cycle should produce a trajectory that gets indexed
+    assert index.best("Use my telegram preference for this answer.") is not None
+
+
+def test_main_entrypoint_defaults_case_index_path_when_rebuild_flag_set(monkeypatch):
+    from memabra import cli
+
+    calls = []
+
+    def mock_online_learning_workflow(*, base_dir=None, min_new_trajectories=3, seen_trajectory_store=None, dry_run=False, baseline_version=None, case_index_path=None, rebuild_case_index=False, **kwargs):
+        calls.append({
+            "base_dir": str(base_dir) if base_dir else None,
+            "case_index_path": str(case_index_path) if case_index_path else None,
+            "rebuild_case_index": rebuild_case_index,
+        })
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    rc = cli.main(["--rebuild-case-index"])
+
+    assert rc == 0
+    assert len(calls) == 1
+    assert calls[0]["rebuild_case_index"] is True
+    assert calls[0]["case_index_path"] is not None
+    assert "case-index.json" in calls[0]["case_index_path"]
+
+
+def test_main_status_flag_prints_status_and_skips_workflow(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+
+    workflow_calls = []
+
+    def mock_online_learning_workflow(**kwargs):
+        workflow_calls.append(kwargs)
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["status", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert len(workflow_calls) == 0
+    assert "current_version_id" in captured.out
+
+
+def test_main_status_flag_supports_text_format(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+
+    workflow_calls = []
+
+    def mock_online_learning_workflow(**kwargs):
+        workflow_calls.append(kwargs)
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["status", "--format", "text", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert len(workflow_calls) == 0
+    assert "Memabra status" in captured.out
+    assert "Current version:" in captured.out
+    assert "Trajectory count:" in captured.out
+
+
+def test_main_rollback_flag_rolls_back_and_skips_workflow(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+    from memabra.router_versioning import RouterVersionStore
+
+    workflow_calls = []
+    rollback_calls = []
+
+    def mock_online_learning_workflow(**kwargs):
+        workflow_calls.append(kwargs)
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    def mock_rollback(self, version_id: str):
+        rollback_calls.append(version_id)
+        return {"current_version_id": version_id}
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+    monkeypatch.setattr(RouterVersionStore, "rollback", mock_rollback)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["version", "rollback", "v1", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert len(workflow_calls) == 0
+    assert len(rollback_calls) == 1
+    assert rollback_calls[0] == "v1"
+    assert "current_version_id" in captured.out
+
+
+def test_main_rollback_flag_supports_text_format(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+    from memabra.router_versioning import RouterVersionStore
+
+    def mock_rollback(self, version_id: str):
+        return {"current_version_id": version_id}
+
+    monkeypatch.setattr(RouterVersionStore, "rollback", mock_rollback)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["version", "rollback", "v1", "--format", "text", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert "Rolled back current version to: v1" in captured.out
+
+
+def test_main_rollback_missing_version_prints_error_and_exits_nonzero(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+    from memabra.router_versioning import RouterVersionStore
+
+    def mock_rollback(self, version_id: str):
+        raise ValueError(f"Version '{version_id}' not found.")
+
+    monkeypatch.setattr(RouterVersionStore, "rollback", mock_rollback)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["version", "rollback", "v99", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 1
+    assert "not found" in captured.err.lower()
+
+
+def test_main_list_versions_flag_prints_versions_and_skips_workflow(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+    from memabra.router_versioning import RouterVersionStore
+
+    workflow_calls = []
+
+    def mock_online_learning_workflow(**kwargs):
+        workflow_calls.append(kwargs)
+        return {"skipped": False, "promoted": True, "report_id": "report-test"}
+
+    def mock_list_versions(self):
+        return [
+            {"version_id": "v1", "metadata": {"source": "test"}},
+            {"version_id": "v2", "metadata": {"source": "test"}},
+        ]
+
+    monkeypatch.setattr(cli, "run_online_learning_workflow", mock_online_learning_workflow)
+    monkeypatch.setattr(RouterVersionStore, "list_versions", mock_list_versions)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["version", "list", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert len(workflow_calls) == 0
+    assert "v1" in captured.out
+    assert "v2" in captured.out
+
+
+def test_main_list_versions_flag_supports_text_format(tmp_path: Path, monkeypatch, capsys):
+    from memabra import cli
+    from memabra.router_versioning import RouterVersionStore
+
+    def mock_list_versions(self):
+        return [
+            {"version_id": "v1", "metadata": {"source": "seed", "avg_reward": 1.2}},
+            {"version_id": "v2", "metadata": {"source": "online_learning", "avg_reward": 1.4}},
+        ]
+
+    def mock_get_current(self):
+        return {"current_version_id": "v2"}
+
+    monkeypatch.setattr(RouterVersionStore, "list_versions", mock_list_versions)
+    monkeypatch.setattr(RouterVersionStore, "get_current", mock_get_current)
+
+    base_dir = tmp_path / "demo-artifacts"
+    base_dir.mkdir(parents=True, exist_ok=True)
+
+    rc = cli.main(["version", "list", "--format", "text", "--base-dir", str(base_dir)])
+
+    captured = capsys.readouterr()
+    assert rc == 0
+    assert "Saved router versions (2 total)" in captured.out
+    assert "Current version: v2" in captured.out
+    assert "2. v2 (current, source=online_learning, avg_reward=1.4)" in captured.out
diff --git a/tests/test_dataset.py b/tests/test_dataset.py
new file mode 100644
index 0000000..2617c66
--- /dev/null
+++ b/tests/test_dataset.py
@@ -0,0 +1,49 @@
+from memabra.dataset import DatasetBuilder
+
+
+def test_dataset_builder_extracts_features_and_label():
+    trajectories = [
+        {
+            "task": {"input": "hello world"},
+            "candidate_sets": {
+                "memory": [{"confidence": 0.8}],
+                "skill": [{"success_rate": 0.9}],
+                "tool": [{"confidence": 0.7, "risk": 0.2}],
+            },
+            "decisions": [{"decision_type": "direct_answer"}],
+            "reward": {"total": 0.95},
+        }
+    ]
+    builder = DatasetBuilder()
+    samples = builder.build(trajectories)
+    assert len(samples) == 1
+    sample = samples[0]
+    assert sample.input_text == "hello world"
+    assert sample.label == "direct_answer"
+    assert sample.reward == 0.95
+    assert sample.features["input_length"] == 11
+    assert sample.features["memory_count"] == 1
+    assert sample.features["skill_count"] == 1
+    assert sample.features["tool_count"] == 1
+    assert sample.features["top_memory_confidence"] == 0.8
+    assert sample.features["top_skill_success_rate"] == 0.9
+    assert sample.features["top_tool_confidence"] == 0.7
+    assert sample.features["top_tool_risk"] == 0.2
+
+
+def test_dataset_builder_handles_empty_candidates():
+    trajectories = [
+        {
+            "task": {"input": "hi"},
+            "candidate_sets": {"memory": [], "skill": [], "tool": []},
+            "decisions": [{"decision_type": "clarify"}],
+            "reward": {"total": 0.0},
+        }
+    ]
+    builder = DatasetBuilder()
+    samples = builder.build(trajectories)
+    assert len(samples) == 1
+    assert samples[0].features["top_memory_confidence"] == 0.0
+    assert samples[0].features["top_skill_success_rate"] == 0.0
+    assert samples[0].features["top_tool_confidence"] == 0.0
+    assert samples[0].features["top_tool_risk"] == 0.0
diff --git a/tests/test_evaluator.py b/tests/test_evaluator.py
new file mode 100644
index 0000000..f6bbc89
--- /dev/null
+++ b/tests/test_evaluator.py
@@ -0,0 +1,54 @@
+from memabra.app import build_demo_app
+from memabra.evaluator import BenchmarkTask, Evaluator
+
+
+def test_evaluator_runs_benchmark_and_reports_metrics(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    evaluator = Evaluator(app)
+    tasks = [
+        BenchmarkTask(user_input="Use my telegram preference."),
+        BenchmarkTask(user_input="Check the current system status."),
+    ]
+    result = evaluator.run(tasks)
+
+    assert result.task_count == 2
+    assert result.avg_reward >= 0.0
+    assert "inject_memory" in result.decision_distribution
+    assert "call_tool" in result.decision_distribution
+    assert result.error_rate == 0.0
+
+
+def test_evaluator_ab_compares_two_routers(tmp_path):
+    from memabra.router import RuleBasedRouter
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    evaluator = Evaluator(app)
+    tasks = [
+        BenchmarkTask(user_input="Use my telegram preference."),
+        BenchmarkTask(user_input="Check the current system status."),
+    ]
+
+    baseline = evaluator.run(tasks, router=RuleBasedRouter())
+    # Both arms use the same router here, so this only checks the shape of the comparison; a real A/B test would compare different routers
+    challenger = evaluator.run(tasks, router=RuleBasedRouter())
+    comparison = evaluator.compare(baseline, challenger)
+
+    assert comparison["winner"] in ("baseline", "challenger", "tie")
+    assert "avg_reward_delta" in comparison
+    assert "error_rate_delta" in comparison
+
+
+def test_app_trains_learning_router_from_artifact_index(tmp_path):
+    from memabra.router import SimpleLearningRouter
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    # Generate some training data
+    app.run_task("Use my telegram preference.", channel="local")
+    app.run_task("Check the current system status.", channel="local")
+
+    router = app.train_learning_router()
+
+    assert isinstance(router, SimpleLearningRouter)
+    # Smoke check after training: a known pattern still runs end to end (asserts only a non-negative reward, not the decision type)
+    trajectory = app.run_task("Use my telegram preference.", channel="local")
+    assert trajectory["reward"]["total"] >= 0.0
diff --git a/tests/test_execution_persistence.py b/tests/test_execution_persistence.py
new file mode 100644
index 0000000..ea44bba
--- /dev/null
+++ b/tests/test_execution_persistence.py
@@ -0,0 +1,265 @@
+from pathlib import Path
+
+from memabra.candidate_types import CandidateObject
+from memabra.execution import ExecutionEngine, MemoryExecutor, ToolExecutor
+from memabra.memory_store import InMemoryMemoryStore, MemoryRecord, MemorySource
+from memabra.persistence import PersistenceStore
+from memabra.retrieval import CandidateRetriever, InMemoryCandidateProvider
+from memabra.router import RouteDecision, RuleBasedRouter, TaskContext
+from memabra.runner import MemabraRunner
+from memabra.schemas import SchemaRegistry
+
+
+class FailingToolBackend:
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict | None = None) -> dict:
+        return {"status": "error", "output": None, "error": f"{tool_id} failed", "latency_ms": 123}
+
+
+class MixedResultToolBackend:
+    def run_tool(self, tool_id: str, context: TaskContext, params: dict | None = None) -> dict:
+        if tool_id == "tool-ok":
+            return {"status": "success", "output": "ok", "error": None, "latency_ms": 50}
+        return {"status": "error", "output": None, "error": f"{tool_id} failed", "latency_ms": 100}
+
+
+class StaticSkillBackend:
+    def load_skill(self, skill_id: str) -> dict:
+        return {"skill_id": skill_id, "instructions": "Follow the documented deployment workflow."}
+
+
+def test_execution_engine_marks_memory_used_and_runner_persists(tmp_path: Path):
+    memory_store = InMemoryMemoryStore()
+    memory_store.upsert(
+        MemoryRecord(
+            id="mem-telegram-pref",
+            memory_type="semantic",
+            fact_status="verified",
+            content="Prefer plain text on Telegram.",
+            summary="Telegram preference",
+            source=MemorySource(kind="user", ref="session-1"),
+            confidence=0.95,
+        )
+    )
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="memory",
+                candidates=[
+                    CandidateObject(
+                        id="mem-telegram-pref",
+                        type="memory",
+                        title="Telegram preference",
+                        summary="Prefer plain text on Telegram.",
+                        triggers=["telegram", "preference"],
+                        confidence=0.95,
+                        success_rate=0.9,
+                        freshness=0.9,
+                    )
+                ],
+            )
+        ]
+    )
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    runner = MemabraRunner(
+        retriever=retriever,
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(memory_executor=MemoryExecutor(memory_store=memory_store)),
+        persistence_store=persistence,
+        memory_store=memory_store,
+    )
+
+    trajectory = runner.run(
+        context=TaskContext(user_input="Use my telegram preference for this answer."),
+        channel="telegram",
+        user_id="oza",
+        persist=True,
+    )
+
+    SchemaRegistry().validate_trajectory(trajectory)
+    assert any(event["event_type"] == "memory_injected" for event in trajectory["events"])
+    assert memory_store.get("mem-telegram-pref").last_used_at is not None
+    assert persistence.load_trajectory(trajectory["trajectory_id"])["trajectory_id"] == trajectory["trajectory_id"]
+
+
+def test_persistence_store_round_trip_memory_record(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    record = MemoryRecord(
+        id="mem-1",
+        memory_type="semantic",
+        fact_status="assumed",
+        content="User likes concise replies.",
+        summary="Concise reply preference",
+        source=MemorySource(kind="user", ref="session-2"),
+        confidence=0.7,
+    )
+
+    persistence.save_memory_record(record)
+    loaded = persistence.load_memory_record("mem-1")
+    assert loaded["id"] == "mem-1"
+    assert len(persistence.list_memory_paths()) == 1
+
+
+def test_runner_records_tool_failures_in_outcome_and_reward(tmp_path: Path):
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="tool",
+                candidates=[
+                    CandidateObject(
+                        id="tool-terminal",
+                        type="tool",
+                        title="terminal",
+                        summary="Run terminal commands.",
+                        triggers=["check", "current"],
+                        confidence=0.95,
+                        success_rate=0.9,
+                        freshness=1.0,
+                    )
+                ],
+            )
+        ]
+    )
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    runner = MemabraRunner(
+        retriever=retriever,
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(tool_backend=FailingToolBackend()),
+        persistence_store=persistence,
+    )
+
+    trajectory = runner.run(
+        context=TaskContext(user_input="Check the current status."),
+        channel="telegram",
+        persist=True,
+    )
+
+    assert trajectory["outcome"]["status"] == "failure"
+    assert trajectory["outcome"]["tool_errors"] == 1
+    assert trajectory["reward"]["components"]["tool_error"] > 0
+    assert trajectory["reward"]["components"]["latency"] > 0
+    assert any(event["event_type"] == "tool_result" for event in trajectory["events"])
+
+
+def test_runner_loads_skill_payload_from_backend():
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="skill",
+                candidates=[
+                    CandidateObject(
+                        id="skill-deploy",
+                        type="skill",
+                        title="deploy workflow",
+                        summary="Reusable deployment procedure.",
+                        triggers=["deploy", "workflow"],
+                        confidence=0.9,
+                        success_rate=0.95,
+                        freshness=0.8,
+                    )
+                ],
+            )
+        ]
+    )
+    runner = MemabraRunner(
+        retriever=retriever,
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(skill_backend=StaticSkillBackend()),
+    )
+
+    trajectory = runner.run(context=TaskContext(user_input="Deploy this service with the usual workflow."))
+
+    skill_events = [event for event in trajectory["events"] if event["event_type"] == "skill_loaded"]
+    assert skill_events
+    assert skill_events[0]["payload"]["instructions"] == "Follow the documented deployment workflow."
+
+
+def test_runner_detects_partial_success_for_mixed_tool_results():
+    class BothToolsRouter:
+        def choose(self, context, memory, skill, tool):
+            from memabra.router import RouteDecision
+            return RouteDecision(
+                decision_type="call_tool",
+                selected_ids=["tool-ok", "tool-fail"],
+                selected_payloads=[{}, {}],
+                rationale="Force both tools for testing.",
+            )
+
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="tool",
+                candidates=[
+                    CandidateObject(
+                        id="tool-ok",
+                        type="tool",
+                        title="ok tool",
+                        summary="Always succeeds.",
+                        triggers=["check", "current"],
+                        confidence=0.95,
+                        success_rate=0.9,
+                        freshness=1.0,
+                    ),
+                    CandidateObject(
+                        id="tool-fail",
+                        type="tool",
+                        title="failing tool",
+                        summary="Always fails.",
+                        triggers=["check", "current"],
+                        confidence=0.9,
+                        success_rate=0.5,
+                        freshness=1.0,
+                    ),
+                ],
+            )
+        ]
+    )
+    runner = MemabraRunner(
+        retriever=retriever,
+        router=BothToolsRouter(),
+        execution_engine=ExecutionEngine(tool_backend=MixedResultToolBackend()),
+    )
+
+    trajectory = runner.run(
+        context=TaskContext(user_input="Check the current status."),
+        channel="local",
+    )
+
+    assert trajectory["outcome"]["status"] == "partial_success"
+    assert trajectory["outcome"]["tool_errors"] == 1
+    assert trajectory["reward"]["components"]["tool_error"] > 0
+    assert trajectory["reward"]["components"]["context_cost"] > 0
+
+
+def test_execution_engine_executes_composite_action_sequentially():
+    memory_store = InMemoryMemoryStore()
+    memory_store.upsert(
+        MemoryRecord(
+            id="mem-1",
+            memory_type="semantic",
+            fact_status="verified",
+            content="Prefer concise replies.",
+            summary="Concise preference",
+            source=MemorySource(kind="user", ref="session-1"),
+            confidence=0.9,
+        )
+    )
+    engine = ExecutionEngine(
+        memory_executor=MemoryExecutor(memory_store=memory_store),
+        tool_executor=ToolExecutor(backend=MixedResultToolBackend()),
+    )
+    decision = RouteDecision(
+        decision_type="composite_action",
+        composite_steps=[
+            RouteDecision(decision_type="inject_memory", selected_ids=["mem-1"]),
+            RouteDecision(decision_type="call_tool", selected_ids=["tool-ok"], selected_payloads=[{}]),
+        ],
+    )
+    result = engine.execute(decision, TaskContext(user_input="composite test"), trajectory_id="traj-comp")
+
+    assert result.status == "executed"
+    assert any(event.event_type == "memory_injected" for event in result.events)
+    assert any(event.event_type == "tool_result" for event in result.events)
+    assert len(result.details["steps"]) == 2
+    assert result.details["steps"][0]["decision_type"] == "inject_memory"
+    assert result.details["steps"][1]["decision_type"] == "call_tool"
+
diff --git a/tests/test_learning_router.py b/tests/test_learning_router.py
new file mode 100644
index 0000000..568adb5
--- /dev/null
+++ b/tests/test_learning_router.py
@@ -0,0 +1,91 @@
+from memabra.candidate_types import CandidateObject
+from memabra.dataset import TrainingSample
+from memabra.router import SimpleLearningRouter, TaskContext
+
+
+def test_learning_router_fits_and_predicts():
+    router = SimpleLearningRouter()
+    samples = [
+        TrainingSample(
+            input_text="run tool",
+            features={
+                "input_length": 8,
+                "memory_count": 0,
+                "skill_count": 0,
+                "tool_count": 1,
+                "top_memory_confidence": 0.0,
+                "top_skill_success_rate": 0.0,
+                "top_tool_confidence": 0.9,
+                "top_tool_risk": 0.1,
+            },
+            label="call_tool",
+            reward=1.0,
+        ),
+        TrainingSample(
+            input_text="remember",
+            features={
+                "input_length": 8,
+                "memory_count": 1,
+                "skill_count": 0,
+                "tool_count": 0,
+                "top_memory_confidence": 0.9,
+                "top_skill_success_rate": 0.0,
+                "top_tool_confidence": 0.0,
+                "top_tool_risk": 0.0,
+            },
+            label="inject_memory",
+            reward=1.0,
+        ),
+    ]
+    router.fit(samples)
+
+    tool = CandidateObject(
+        id="t1",
+        type="tool",
+        title="t",
+        summary="s",
+        triggers=[],
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.1,
+    )
+    decision = router.choose(
+        TaskContext(user_input="run tool"),
+        memory_candidates=[],
+        skill_candidates=[],
+        tool_candidates=[tool],
+    )
+    assert decision.decision_type == "call_tool"
+
+    mem = CandidateObject(
+        id="m1",
+        type="memory",
+        title="m",
+        summary="s",
+        triggers=[],
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.0,
+    )
+    decision = router.choose(
+        TaskContext(user_input="remember"),
+        memory_candidates=[mem],
+        skill_candidates=[],
+        tool_candidates=[],
+    )
+    assert decision.decision_type == "inject_memory"
+
+
+def test_learning_router_falls_back_to_clarify_when_untrained():
+    router = SimpleLearningRouter()
+    decision = router.choose(
+        TaskContext(user_input="hi"),
+        memory_candidates=[],
+        skill_candidates=[],
+        tool_candidates=[],
+    )
+    assert decision.decision_type == "clarify"
diff --git a/tests/test_memory_store.py b/tests/test_memory_store.py
new file mode 100644
index 0000000..cf8c382
--- /dev/null
+++ b/tests/test_memory_store.py
@@ -0,0 +1,27 @@
+from memabra.memory_store import InMemoryMemoryStore, MemoryRecord, MemorySource
+from memabra.schemas import SchemaRegistry
+
+
+def test_memory_store_verify_and_revoke_round_trip():
+    store = InMemoryMemoryStore()
+    record = MemoryRecord(
+        id="mem-pref-1",
+        memory_type="semantic",
+        fact_status="assumed",
+        content="User prefers plain text on Telegram.",
+        summary="Telegram plain-text preference",
+        source=MemorySource(kind="user", ref="session-1"),
+        confidence=0.9,
+    )
+    store.upsert(record)
+    store.verify("mem-pref-1", status="confirmed", check_method="user-confirmed")
+    store.mark_used("mem-pref-1")
+    store.revoke("mem-pref-1", reason="User changed preference")
+
+    updated = store.get("mem-pref-1")
+    assert updated is not None
+    assert updated.verification.status == "confirmed"
+    assert updated.last_used_at is not None
+    assert updated.fact_status == "revoked"
+
+    SchemaRegistry().validate_memory_record(updated.to_dict())
diff --git a/tests/test_online_learning.py b/tests/test_online_learning.py
new file mode 100644
index 0000000..bd5e40e
--- /dev/null
+++ b/tests/test_online_learning.py
@@ -0,0 +1,347 @@
+from __future__ import annotations
+
+from memabra.app import build_demo_app
+from memabra.benchmarks import BenchmarkTask
+from memabra.dataset import DatasetBuilder
+from memabra.online_learning import OnlineLearningCoordinator
+from memabra.online_learning import OnlineLearningCoordinator
+from memabra.promotion import PromotionPolicy
+from memabra.router_versioning import RouterVersionStore
+
+
+def _seed_trajectories(app, count: int):
+    for i in range(count):
+        app.run_task(f"Test task {i}", channel="local")
+
+
+def test_coordinator_skips_when_too_few_new_trajectories(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 2)
+
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=PromotionPolicy(
+            min_reward_delta=0.01,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=100.0,
+            required_task_count=1,
+        ),
+        benchmark_tasks=[BenchmarkTask(user_input="test")],
+        min_new_trajectories=5,
+    )
+
+    result = coordinator.run_cycle()
+
+    assert result["skipped"] is True
+    assert "too few new trajectories" in result["reason"].lower()
+
+
+def test_coordinator_rejects_when_policy_fails(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    # Seed enough trajectories for training and benchmarking
+    _seed_trajectories(app, 10)
+
+    # Use a very strict policy that will reject any challenger
+    policy = PromotionPolicy(
+        min_reward_delta=1.0,  # impossible to meet
+        max_error_rate_increase=0.0,
+        max_latency_increase_ms=0.0,
+        required_task_count=1,
+    )
+
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        version_store_base_dir=tmp_path / "versions",
+    )
+
+    result = coordinator.run_cycle()
+
+    assert result["skipped"] is False
+    assert result["promoted"] is False
+    assert "decision" in result
+    assert result["decision"].accepted is False
+
+
+def test_coordinator_accepts_and_saves_version_when_policy_passes(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    # Lenient policy that should pass
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,  # always passes
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+
+    version_dir = tmp_path / "versions"
+    report_dir = tmp_path / "reports"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        report_store_base_dir=report_dir,
+    )
+
+    result = coordinator.run_cycle()
+
+    assert result["skipped"] is False
+    assert result["promoted"] is True
+    assert "version_id" in result
+    assert result["decision"].accepted is True
+
+    # Verify version was saved
+    store = RouterVersionStore(base_dir=version_dir)
+    versions = store.list_versions()
+    assert len(versions) == 1
+    assert versions[0]["version_id"] == result["version_id"]
+
+    # Verify report was saved
+    from memabra.training_reports import TrainingReportStore
+    report_store = TrainingReportStore(base_dir=report_dir)
+    reports = report_store.list_reports()
+    assert len(reports) == 1
+    assert reports[0]["promoted_version_id"] == result["version_id"]
+
+
+def test_coordinator_saves_report_on_rejection(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    policy = PromotionPolicy(
+        min_reward_delta=1.0,
+        max_error_rate_increase=0.0,
+        max_latency_increase_ms=0.0,
+        required_task_count=1,
+    )
+
+    report_dir = tmp_path / "reports"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        report_store_base_dir=report_dir,
+    )
+
+    result = coordinator.run_cycle()
+
+    assert result["promoted"] is False
+    from memabra.training_reports import TrainingReportStore
+    report_store = TrainingReportStore(base_dir=report_dir)
+    reports = report_store.list_reports()
+    assert len(reports) == 1
+    assert reports[0]["promotion_decision"]["accepted"] is False
+
+
+def test_coordinator_catches_training_exception_and_returns_error_report(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+
+    report_dir = tmp_path / "reports"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        report_store_base_dir=report_dir,
+    )
+
+    # Force a training failure by patching DatasetBuilder.build to raise.
+    # patch.object restores the original method when the block exits.
+    from unittest.mock import patch
+
+    with patch.object(
+        DatasetBuilder, "build", side_effect=RuntimeError("simulated training failure")
+    ):
+        result = coordinator.run_cycle()
+
+    assert result["skipped"] is False
+    assert result["promoted"] is False
+    assert "error" in result
+    assert "simulated training failure" in result["error"]
+
+    # Verify error report was saved
+    from memabra.training_reports import TrainingReportStore
+    report_store = TrainingReportStore(base_dir=report_dir)
+    reports = report_store.list_reports()
+    assert len(reports) == 1
+    assert reports[0]["promotion_decision"]["accepted"] is False
+    assert "simulated training failure" in reports[0]["promotion_decision"]["reasons"][0]
+
+
+def test_coordinator_persists_seen_trajectory_ids_across_restarts(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 5)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+    benchmark_tasks = [BenchmarkTask(user_input="Test task 0")]
+    seen_store = tmp_path / "seen_trajectories.json"
+    version_dir = tmp_path / "versions"
+    report_dir = tmp_path / "reports"
+
+    coordinator1 = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=benchmark_tasks,
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        report_store_base_dir=report_dir,
+        seen_trajectory_store=seen_store,
+    )
+    result1 = coordinator1.run_cycle()
+    assert result1["skipped"] is False
+
+    # New coordinator instance pointing to same store
+    coordinator2 = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=benchmark_tasks,
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        report_store_base_dir=report_dir,
+        seen_trajectory_store=seen_store,
+    )
+    result2 = coordinator2.run_cycle()
+    assert result2["skipped"] is True
+    assert "too few new trajectories" in result2["reason"].lower()
+
+
+def test_coordinator_dry_run_does_not_promote_or_save_version(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+
+    version_dir = tmp_path / "versions"
+    report_dir = tmp_path / "reports"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        report_store_base_dir=report_dir,
+    )
+
+    result = coordinator.run_cycle(dry_run=True)
+
+    assert result["skipped"] is False
+    assert result["promoted"] is False
+    assert "decision" in result
+    assert result["decision"].accepted is True  # policy would accept, but dry_run blocks promotion
+
+    # No version should be saved
+    store = RouterVersionStore(base_dir=version_dir)
+    assert len(store.list_versions()) == 0
+
+    # Report should still be saved for audit
+    from memabra.training_reports import TrainingReportStore
+
+    report_store = TrainingReportStore(base_dir=report_dir)
+    reports = report_store.list_reports()
+    assert len(reports) == 1
+    assert reports[0].get("dry_run") is True
+
+
+def test_coordinator_rebuilds_case_index_when_path_provided(tmp_path):
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+
+    case_index_path = tmp_path / "case-index.json"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        case_index_path=case_index_path,
+    )
+
+    result = coordinator.run_cycle()
+
+    assert result["skipped"] is False
+    assert case_index_path.exists()
+    from memabra.case_index import CaseIndex
+
+    index = CaseIndex.load(case_index_path)
+    assert index.best("Test task 0") is not None
+
+
+def test_coordinator_uses_specified_baseline_version(tmp_path):
+    from memabra.router import SimpleLearningRouter
+
+    app = build_demo_app(base_dir=tmp_path / "demo-artifacts")
+    _seed_trajectories(app, 10)
+
+    # Save a baseline version with known weights
+    baseline_router = SimpleLearningRouter()
+    baseline_router._weights = {"call_tool": {"input_length": 0.99}}
+    baseline_router._feature_keys = ["input_length"]
+    version_dir = tmp_path / "versions"
+    store = RouterVersionStore(base_dir=version_dir)
+    store.save(baseline_router, version_id="v-baseline", metadata={"note": "baseline"})
+
+    # Change app's current router to something different
+    different_router = SimpleLearningRouter()
+    different_router._weights = {"clarify": {"input_length": 0.01}}
+    different_router._feature_keys = ["input_length"]
+    app.set_router(different_router)
+
+    policy = PromotionPolicy(
+        min_reward_delta=-1.0,
+        max_error_rate_increase=1.0,
+        max_latency_increase_ms=10000.0,
+        required_task_count=1,
+    )
+
+    report_dir = tmp_path / "reports"
+    coordinator = OnlineLearningCoordinator(
+        app=app,
+        policy=policy,
+        benchmark_tasks=[BenchmarkTask(user_input="Test task 0")],
+        min_new_trajectories=1,
+        version_store_base_dir=version_dir,
+        report_store_base_dir=report_dir,
+    )
+
+    result = coordinator.run_cycle(baseline_version_id="v-baseline")
+
+    assert result["skipped"] is False
+    assert "baseline_metrics" in result
+    assert "challenger_metrics" in result
+
+    # Verify report records the baseline version
+    from memabra.training_reports import TrainingReportStore
+
+    report_store = TrainingReportStore(base_dir=report_dir)
+    reports = report_store.list_reports()
+    assert len(reports) == 1
+    assert reports[0].get("baseline_version_id") == "v-baseline"
diff --git a/tests/test_outcome_reward.py b/tests/test_outcome_reward.py
new file mode 100644
index 0000000..a77c26a
--- /dev/null
+++ b/tests/test_outcome_reward.py
@@ -0,0 +1,125 @@
+from memabra.execution import ActionResult
+from memabra.outcome import OutcomeEngine, RewardEngine
+from memabra.retrieval import RetrievalResult
+from memabra.router import RouteDecision, TaskContext
+
+
+def test_outcome_engine_success_for_memory_injection():
+    engine = OutcomeEngine()
+    decision = RouteDecision(decision_type="inject_memory", selected_ids=["mem-1"])
+    result = ActionResult(decision_type="inject_memory", status="executed", details={"latency_ms": 50})
+
+    outcome = engine.build_outcome(decision, result)
+
+    assert outcome.status == "success"
+    assert outcome.steps == 1
+    assert outcome.latency_ms == 50
+    assert outcome.tool_errors == 0
+
+
+def test_outcome_engine_failure_for_tool_error():
+    engine = OutcomeEngine()
+    decision = RouteDecision(decision_type="call_tool", selected_ids=["tool-1"])
+    result = ActionResult(decision_type="call_tool", status="error", details={"latency_ms": 120})
+
+    outcome = engine.build_outcome(decision, result)
+
+    assert outcome.status == "failure"
+    assert outcome.latency_ms == 120
+    assert outcome.tool_errors == 1
+
+
+def test_outcome_engine_counts_multiple_tool_errors():
+    engine = OutcomeEngine()
+    decision = RouteDecision(decision_type="call_tool", selected_ids=["tool-1", "tool-2"])
+    result = ActionResult(
+        decision_type="call_tool",
+        status="error",
+        details={
+            "latency_ms": 200,
+            "results": [
+                {"tool_id": "tool-1", "status": "error"},
+                {"tool_id": "tool-2", "status": "error"},
+            ],
+        },
+    )
+
+    outcome = engine.build_outcome(decision, result)
+
+    assert outcome.status == "failure"
+    assert outcome.tool_errors == 2
+
+
+def test_outcome_engine_partial_success_for_mixed_tool_results():
+    engine = OutcomeEngine()
+    decision = RouteDecision(decision_type="call_tool", selected_ids=["tool-1", "tool-2"])
+    result = ActionResult(
+        decision_type="call_tool",
+        status="error",
+        details={
+            "latency_ms": 200,
+            "results": [
+                {"tool_id": "tool-1", "status": "success"},
+                {"tool_id": "tool-2", "status": "error"},
+            ],
+        },
+    )
+
+    outcome = engine.build_outcome(decision, result)
+
+    assert outcome.status == "partial_success"
+    assert outcome.tool_errors == 1
+
+
+def test_reward_engine_penalizes_latency_by_tier():
+    outcome_engine = OutcomeEngine()
+    reward_engine = RewardEngine()
+    decision = RouteDecision(decision_type="call_tool")
+    outcome_fast = outcome_engine.build_outcome(decision, ActionResult(decision_type="call_tool", status="success", details={"latency_ms": 200}))
+    outcome_slow = outcome_engine.build_outcome(decision, ActionResult(decision_type="call_tool", status="success", details={"latency_ms": 2500}))
+
+    reward_fast = reward_engine.compute(decision, outcome_fast)
+    reward_slow = reward_engine.compute(decision, outcome_slow)
+
+    assert reward_fast.latency < reward_slow.latency
+    assert reward_slow.latency > 0.5
+
+
+def test_reward_engine_context_cost_based_on_candidate_count():
+    from memabra.candidate_types import CandidateObject
+
+    outcome_engine = OutcomeEngine()
+    reward_engine = RewardEngine()
+    decision = RouteDecision(decision_type="direct_answer")
+    outcome = outcome_engine.build_outcome(decision, ActionResult(decision_type="direct_answer", status="skipped", details={"latency_ms": 0}))
+    dummy_candidate = CandidateObject(id="c1", type="memory", title="t", summary="s", triggers=[])
+    retrieval = RetrievalResult(memory=[dummy_candidate, dummy_candidate, dummy_candidate], skill=[dummy_candidate, dummy_candidate], tool=[dummy_candidate])
+
+    reward = reward_engine.compute(decision, outcome, retrieval_result=retrieval)
+
+    assert reward.context_cost > 0
+
+
+def test_reward_engine_reduces_task_success_for_multiple_errors():
+    outcome_engine = OutcomeEngine()
+    reward_engine = RewardEngine()
+    decision = RouteDecision(decision_type="call_tool")
+    outcome = outcome_engine.build_outcome(
+        decision,
+        ActionResult(
+            decision_type="call_tool",
+            status="error",
+            details={
+                "latency_ms": 100,
+                "results": [
+                    {"tool_id": "tool-1", "status": "error"},
+                    {"tool_id": "tool-2", "status": "error"},
+                ],
+            },
+        ),
+    )
+
+    reward = reward_engine.compute(decision, outcome)
+
+    assert reward.task_success < 0.5
+    assert reward.tool_error >= 0.5
diff --git a/tests/test_package_exports.py b/tests/test_package_exports.py
new file mode 100644
index 0000000..5b9deee
--- /dev/null
+++ b/tests/test_package_exports.py
@@ -0,0 +1,22 @@
+def test_memabra_package_exports_alpha_modules():
+    from src import memabra
+
+    assert hasattr(memabra, "promotion")
+    assert hasattr(memabra, "benchmarks")
+    assert hasattr(memabra, "online_learning")
+    assert hasattr(memabra, "training_reports")
+
+
+def test_memabra_top_level_imports():
+    from memabra import PromotionPolicy, BenchmarkSuite, OnlineLearningCoordinator, TrainingReportStore, CaseIndex
+
+    assert PromotionPolicy is not None
+    assert BenchmarkSuite is not None
+    assert OnlineLearningCoordinator is not None
+    assert TrainingReportStore is not None
+    assert CaseIndex is not None
+
+
+def test_benchmark_task_exported_from_package():
+    from memabra import BenchmarkTask
+    assert BenchmarkTask is not None
diff --git a/tests/test_promotion.py b/tests/test_promotion.py
new file mode 100644
index 0000000..4cce6ac
--- /dev/null
+++ b/tests/test_promotion.py
@@ -0,0 +1,112 @@
+from __future__ import annotations
+
+import pytest
+
+from memabra.promotion import PromotionDecision, PromotionPolicy
+from memabra.evaluator import EvaluationResult
+
+
+class TestPromotionPolicy:
+    def test_accepted_when_challenger_improves_on_all_metrics(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.01,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=100.0,
+            required_task_count=2,
+        )
+        baseline = EvaluationResult(
+            task_count=2,
+            avg_reward=0.5,
+            error_rate=0.1,
+            avg_latency_ms=50.0,
+        )
+        challenger = EvaluationResult(
+            task_count=2,
+            avg_reward=0.6,
+            error_rate=0.05,
+            avg_latency_ms=45.0,
+        )
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert isinstance(decision, PromotionDecision)
+        assert decision.accepted is True
+        assert decision.reasons == []
+        assert decision.metrics["reward_delta"] == pytest.approx(0.1, abs=0.001)
+        assert decision.metrics["error_rate_delta"] == pytest.approx(-0.05, abs=0.001)
+        assert decision.metrics["latency_delta_ms"] == pytest.approx(-5.0, abs=0.001)
+
+    def test_rejected_when_reward_delta_below_minimum(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.1,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=100.0,
+            required_task_count=2,
+        )
+        baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+        challenger = EvaluationResult(task_count=2, avg_reward=0.55, error_rate=0.1, avg_latency_ms=50.0)
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert decision.accepted is False
+        assert any("reward" in r.lower() for r in decision.reasons)
+
+    def test_rejected_when_error_rate_increase_exceeds_max(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.01,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=100.0,
+            required_task_count=2,
+        )
+        baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+        challenger = EvaluationResult(task_count=2, avg_reward=0.6, error_rate=0.2, avg_latency_ms=50.0)
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert decision.accepted is False
+        assert any("error" in r.lower() for r in decision.reasons)
+
+    def test_rejected_when_latency_increase_exceeds_max(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.01,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=10.0,
+            required_task_count=2,
+        )
+        baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+        challenger = EvaluationResult(task_count=2, avg_reward=0.6, error_rate=0.1, avg_latency_ms=65.0)
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert decision.accepted is False
+        assert any("latency" in r.lower() for r in decision.reasons)
+
+    def test_rejected_when_task_count_below_required(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.01,
+            max_error_rate_increase=0.05,
+            max_latency_increase_ms=100.0,
+            required_task_count=5,
+        )
+        baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+        challenger = EvaluationResult(task_count=2, avg_reward=0.6, error_rate=0.1, avg_latency_ms=50.0)
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert decision.accepted is False
+        assert any("task count" in r.lower() for r in decision.reasons)
+
+    def test_multiple_rejection_reasons_accumulate(self):
+        policy = PromotionPolicy(
+            min_reward_delta=0.2,
+            max_error_rate_increase=0.01,
+            max_latency_increase_ms=10.0,
+            required_task_count=10,
+        )
+        baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+        challenger = EvaluationResult(task_count=2, avg_reward=0.55, error_rate=0.15, avg_latency_ms=70.0)
+
+        decision = policy.evaluate(baseline, challenger)
+
+        assert decision.accepted is False
+        assert len(decision.reasons) >= 3
diff --git a/tests/test_replay.py b/tests/test_replay.py
new file mode 100644
index 0000000..2685c31
--- /dev/null
+++ b/tests/test_replay.py
@@ -0,0 +1,57 @@
+from pathlib import Path
+
+from memabra.persistence import PersistenceStore
+from memabra.replay import TrajectoryReplay
+
+
+EXAMPLE_DIR = str(Path(__file__).resolve().parent.parent / "docs" / "examples")
+
+
+def test_replay_summary_counts_outcomes_and_actions():
+    replay = TrajectoryReplay()
+    summary = replay.summarize_directory(EXAMPLE_DIR)
+
+    assert summary.trajectories == 4
+    assert summary.success_count == 2
+    assert summary.partial_success_count == 1
+    assert summary.failure_count == 1
+    assert summary.direct_answer_count == 1
+    assert summary.memory_action_count == 1
+    assert summary.tool_action_count == 2
+    assert summary.skill_action_count == 0
+
+
+def test_replay_can_summarize_persisted_artifacts(tmp_path: Path):
+    persistence = PersistenceStore(base_dir=tmp_path / "artifacts")
+    persistence.save_trajectory(
+        {
+            "trajectory_id": "traj-1",
+            "task": {"task_id": "task-1", "input": "A", "channel": "local", "created_at": "2026-01-01T00:00:00Z", "user_id": None},
+            "context_snapshot": {"conversation_summary": "", "environment_summary": "", "recent_failures": []},
+            "candidate_sets": {"memory": [], "skill": [], "tool": []},
+            "decisions": [{"step": 1, "decision_type": "direct_answer", "selected_ids": [], "rejected_ids": [], "rationale": "", "estimated_cost": 0}],
+            "events": [],
+            "outcome": {"status": "success", "steps": 1, "latency_ms": 10, "user_corrections": 0, "tool_errors": 0, "notes": None},
+            "reward": {"total": 1.0, "components": {"task_success": 1.0, "retrieval_hit": 0.0, "tool_error": 0.0, "user_correction": 0.0, "latency": 0.0, "context_cost": 0.0, "useful_reuse": 0.0}},
+        }
+    )
+    persistence.save_trajectory(
+        {
+            "trajectory_id": "traj-2",
+            "task": {"task_id": "task-2", "input": "B", "channel": "local", "created_at": "2026-01-01T00:00:00Z", "user_id": None},
+            "context_snapshot": {"conversation_summary": "", "environment_summary": "", "recent_failures": []},
+            "candidate_sets": {"memory": [], "skill": [], "tool": []},
+            "decisions": [{"step": 1, "decision_type": "call_tool", "selected_ids": ["tool-1"], "rejected_ids": [], "rationale": "", "estimated_cost": 0.1}],
+            "events": [],
+            "outcome": {"status": "failure", "steps": 1, "latency_ms": 50, "user_corrections": 0, "tool_errors": 1, "notes": None},
+            "reward": {"total": -0.2, "components": {"task_success": 0.2, "retrieval_hit": 0.0, "tool_error": 0.3, "user_correction": 0.0, "latency": 0.05, "context_cost": 0.0, "useful_reuse": 0.0}},
+        }
+    )
+
+    replay = TrajectoryReplay()
+    summary = replay.summarize_persistence_store(persistence)
+
+    assert summary.trajectories == 2
+    assert summary.success_count == 1
+    assert summary.failure_count == 1
+    assert summary.tool_action_count == 1
diff --git a/tests/test_retrieval.py b/tests/test_retrieval.py
new file mode 100644
index 0000000..42b4068
--- /dev/null
+++ b/tests/test_retrieval.py
@@ -0,0 +1,45 @@
+from memabra.candidate_types import CandidateObject
+from memabra.retrieval import CandidateRetriever, InMemoryCandidateProvider
+from memabra.router import TaskContext
+
+
+def test_retriever_ranks_trigger_matches_first():
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="memory",
+                candidates=[
+                    CandidateObject(
+                        id="mem-weak",
+                        type="memory",
+                        title="Generic preference",
+                        summary="A weak preference record",
+                        confidence=0.4,
+                        success_rate=0.4,
+                        freshness=0.4,
+                        triggers=["generic"],
+                    ),
+                    CandidateObject(
+                        id="mem-strong",
+                        type="memory",
+                        title="Formatting preference",
+                        summary="Telegram prefers plain text",
+                        confidence=0.8,
+                        success_rate=0.9,
+                        freshness=0.9,
+                        triggers=["telegram", "formatting"],
+                        tags=["output"],
+                    ),
+                ],
+            )
+        ]
+    )
+
+    result = retriever.retrieve(
+        TaskContext(user_input="Use my telegram formatting preference for the output."),
+        top_k=2,
+    )
+
+    assert [candidate.id for candidate in result.memory] == ["mem-strong", "mem-weak"]
+    assert result.skill == []
+    assert result.tool == []
diff --git a/tests/test_router_feature_scoring.py b/tests/test_router_feature_scoring.py
new file mode 100644
index 0000000..5975eb3
--- /dev/null
+++ b/tests/test_router_feature_scoring.py
@@ -0,0 +1,137 @@
+from memabra.candidate_types import CandidateObject
+from memabra.router import FeatureScoringRouter, TaskContext
+
+
+def test_feature_scoring_router_computes_score_breakdown_and_selects_best():
+    router = FeatureScoringRouter()
+    memory = CandidateObject(
+        id="mem-1",
+        type="memory",
+        title="m1",
+        summary="s1",
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.1,
+        risk=0.1,
+    )
+    tool = CandidateObject(
+        id="tool-1",
+        type="tool",
+        title="t1",
+        summary="s1",
+        confidence=0.8,
+        success_rate=0.8,
+        freshness=0.8,
+        cost=0.1,
+        risk=0.1,
+    )
+    decision = router.choose(
+        TaskContext(user_input="do something"),
+        memory_candidates=[memory],
+        skill_candidates=[],
+        tool_candidates=[tool],
+    )
+    assert decision.decision_type == "inject_memory"
+    assert "mem-1" in decision.score_breakdown
+    assert "tool-1" in decision.score_breakdown
+    assert decision.score_breakdown["mem-1"] > decision.score_breakdown["tool-1"]
+
+
+def test_feature_scoring_router_applies_failure_penalty():
+    router = FeatureScoringRouter()
+    tool_a = CandidateObject(
+        id="tool-a",
+        type="tool",
+        title="ta",
+        summary="sa",
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.0,
+    )
+    tool_b = CandidateObject(
+        id="tool-b",
+        type="tool",
+        title="tb",
+        summary="sb",
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.0,
+    )
+    context = TaskContext(user_input="run tool", recent_failures=["tool-b"])
+    decision = router.choose(
+        context,
+        memory_candidates=[],
+        skill_candidates=[],
+        tool_candidates=[tool_a, tool_b],
+    )
+    assert decision.decision_type == "call_tool"
+    assert decision.selected_ids == ["tool-a"]
+    assert decision.score_breakdown["tool-b"] < decision.score_breakdown["tool-a"]
+
+
+def test_feature_scoring_router_emits_composite_action_for_preconditions():
+    router = FeatureScoringRouter()
+    memory = CandidateObject(
+        id="mem-1",
+        type="memory",
+        title="m1",
+        summary="s1",
+        confidence=0.7,
+        success_rate=0.5,
+        freshness=0.3,
+        cost=0.0,
+        risk=0.0,
+    )
+    tool = CandidateObject(
+        id="tool-1",
+        type="tool",
+        title="t1",
+        summary="s1",
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.0,
+        preconditions=["memory"],
+    )
+    decision = router.choose(
+        TaskContext(user_input="run tool"),
+        memory_candidates=[memory],
+        skill_candidates=[],
+        tool_candidates=[tool],
+    )
+    assert decision.decision_type == "composite_action"
+    assert len(decision.composite_steps) == 2
+    assert decision.composite_steps[0].decision_type == "inject_memory"
+    assert decision.composite_steps[0].selected_ids == ["mem-1"]
+    assert decision.composite_steps[1].decision_type == "call_tool"
+    assert decision.composite_steps[1].selected_ids == ["tool-1"]
+
+
+def test_feature_scoring_router_fallback_when_precondition_missing():
+    router = FeatureScoringRouter()
+    tool = CandidateObject(
+        id="tool-1",
+        type="tool",
+        title="t1",
+        summary="s1",
+        confidence=0.9,
+        success_rate=0.9,
+        freshness=0.9,
+        cost=0.0,
+        risk=0.0,
+        preconditions=["memory"],
+    )
+    decision = router.choose(
+        TaskContext(user_input="run tool"),
+        memory_candidates=[],
+        skill_candidates=[],
+        tool_candidates=[tool],
+    )
+    assert decision.decision_type == "call_tool"
+    assert decision.selected_ids == ["tool-1"]
diff --git a/tests/test_router_protocol.py b/tests/test_router_protocol.py
new file mode 100644
index 0000000..7475110
--- /dev/null
+++ b/tests/test_router_protocol.py
@@ -0,0 +1,12 @@
+from memabra.router import (
+    FeatureScoringRouter,
+    RouterProtocol,
+    RuleBasedRouter,
+    SimpleLearningRouter,
+)
+
+
+def test_all_router_implementations_conform_to_router_protocol():
+    assert isinstance(RuleBasedRouter(), RouterProtocol)
+    assert isinstance(FeatureScoringRouter(), RouterProtocol)
+    assert isinstance(SimpleLearningRouter(), RouterProtocol)
diff --git a/tests/test_router_smoke.py b/tests/test_router_smoke.py
new file mode 100644
index 0000000..08ae048
--- /dev/null
+++ b/tests/test_router_smoke.py
@@ -0,0 +1,25 @@
+from memabra.candidate_types import CandidateObject
+from memabra.router import RuleBasedRouter, TaskContext
+
+
+def test_router_prefers_memory_for_preference_queries():
+    router = RuleBasedRouter()
+    decision = router.choose(
+        TaskContext(user_input="Remember my preferred deployment region"),
+        memory_candidates=[
+            CandidateObject(
+                id="mem-1",
+                type="memory",
+                title="Preferred region",
+                summary="User prefers us-west-2",
+                confidence=0.9,
+                freshness=0.8,
+                success_rate=0.9,
+            )
+        ],
+        skill_candidates=[],
+        tool_candidates=[],
+    )
+
+    assert decision.decision_type == "inject_memory"
+    assert decision.selected_ids == ["mem-1"]
diff --git a/tests/test_router_versioning.py b/tests/test_router_versioning.py
new file mode 100644
index 0000000..367293c
--- /dev/null
+++ b/tests/test_router_versioning.py
@@ -0,0 +1,115 @@
+import json
+from pathlib import Path
+
+from memabra.router import SimpleLearningRouter
+from memabra.router_versioning import RouterVersionStore
+
+
+def test_save_and_load_router_version(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"call_tool": {"input_length": 0.5, "tool_count": 1.2}}
+    router._feature_keys = ["input_length", "tool_count"]
+
+    store.save(router, version_id="v1", metadata={"avg_reward": 0.75})
+    loaded = store.load("v1")
+
+    assert loaded._weights == router._weights
+    assert loaded._feature_keys == router._feature_keys
+
+
+def test_list_versions_returns_metadata(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"inject_memory": {"memory_count": 0.8}}
+    router._feature_keys = ["memory_count"]
+
+    store.save(router, version_id="v1", metadata={"avg_reward": 0.75})
+    store.save(router, version_id="v2", metadata={"avg_reward": 0.82})
+
+    versions = store.list_versions()
+    assert len(versions) == 2
+    assert versions[0]["version_id"] == "v1"
+    assert versions[0]["metadata"]["avg_reward"] == 0.75
+    assert versions[1]["version_id"] == "v2"
+    assert versions[1]["metadata"]["avg_reward"] == 0.82
+
+
+def test_rollback_changes_current_version(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"a": {"x": 1.0}}
+    router._feature_keys = ["x"]
+
+    store.save(router, version_id="v1")
+    store.save(router, version_id="v2")
+    assert store.get_current()["current_version_id"] == "v2"
+
+    store.rollback("v1")
+    current = store.get_current()
+    assert current["current_version_id"] == "v1"
+    assert current.get("rollback_from") == "v2"
+    assert "rolled_back_at" in current
+
+
+def test_save_tracks_active_router_metadata(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"a": {"x": 1.0}}
+    router._feature_keys = ["x"]
+
+    store.save(
+        router,
+        version_id="v1",
+        metadata={"promotion_source": "online_learning", "benchmark_summary": {"reward_delta": 0.1}},
+    )
+
+    current = store.get_current()
+    assert current["current_version_id"] == "v1"
+    assert current["promotion_source"] == "online_learning"
+    assert current["benchmark_summary"]["reward_delta"] == 0.1
+    assert current.get("prior_version_id") is None
+
+
+def test_save_records_prior_version_id(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"a": {"x": 1.0}}
+    router._feature_keys = ["x"]
+
+    store.save(router, version_id="v1")
+    store.save(router, version_id="v2")
+
+    current = store.get_current()
+    assert current["current_version_id"] == "v2"
+    assert current["prior_version_id"] == "v1"
+
+
+def test_load_without_version_uses_current(tmp_path):
+    store = RouterVersionStore(base_dir=tmp_path)
+    router = SimpleLearningRouter()
+    router._weights = {"call_tool": {"input_length": 0.5}}
+    router._feature_keys = ["input_length"]
+
+    store.save(router, version_id="v1")
+    loaded = store.load()
+
+    assert loaded._weights == router._weights
+
+
+def test_app_save_and_load_learning_router(tmp_path):
+    from memabra.app import build_demo_app
+
+    app = build_demo_app(base_dir=tmp_path / "artifacts")
+    router = SimpleLearningRouter()
+    router._weights = {"clarify": {"input_length": 0.1}}
+    router._feature_keys = ["input_length"]
+    app.runner.router = router
+
+    version_dir = tmp_path / "router-versions"
+    app.save_learning_router(version_id="v-test", base_dir=version_dir, metadata={"note": "test"})
+    loaded_app = build_demo_app(base_dir=tmp_path / "artifacts")
+    loaded_app.load_learning_router(version_id="v-test", base_dir=version_dir)
+
+    assert loaded_app.runner.router._weights == router._weights
+    assert loaded_app.runner.router._feature_keys == router._feature_keys
diff --git a/tests/test_runner.py b/tests/test_runner.py
new file mode 100644
index 0000000..bf63247
--- /dev/null
+++ b/tests/test_runner.py
@@ -0,0 +1,96 @@
+from memabra.candidate_types import CandidateObject
+from memabra.retrieval import CandidateRetriever, InMemoryCandidateProvider
+from memabra.router import RuleBasedRouter, TaskContext
+from memabra.runner import MemabraRunner
+from memabra.schemas import SchemaRegistry
+
+
+def test_runner_produces_valid_draft_trajectory():
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="memory",
+                candidates=[
+                    CandidateObject(
+                        id="mem-1",
+                        type="memory",
+                        title="Output preference",
+                        summary="Prefer plain text on Telegram.",
+                        triggers=["telegram", "preference"],
+                        confidence=0.9,
+                        success_rate=0.8,
+                        freshness=0.9,
+                        tags=["output"],
+                    )
+                ],
+            )
+        ]
+    )
+    runner = MemabraRunner(retriever=retriever, router=RuleBasedRouter())
+
+    trajectory = runner.run(
+        context=TaskContext(
+            user_input="Use my telegram preference for this answer.",
+            conversation_summary="User often cares about output formatting.",
+        ),
+        channel="telegram",
+        user_id="oza",
+    )
+
+    SchemaRegistry().validate_trajectory(trajectory)
+    assert trajectory["decisions"][0]["decision_type"] == "inject_memory"
+    assert trajectory["candidate_sets"]["memory"][0]["id"] == "mem-1"
+    assert len(trajectory["events"]) == 3
+
+
+def test_runner_injects_episodic_candidate_when_case_index_matches(tmp_path):
+    from memabra.case_index import CaseIndex
+    from memabra.persistence import PersistenceStore
+
+    store = PersistenceStore(base_dir=tmp_path / "artifacts")
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="memory",
+                candidates=[],
+            ),
+            InMemoryCandidateProvider(
+                candidate_type="skill",
+                candidates=[],
+            ),
+            InMemoryCandidateProvider(
+                candidate_type="tool",
+                candidates=[],
+            ),
+        ]
+    )
+    runner = MemabraRunner(retriever=retriever, router=RuleBasedRouter(), persistence_store=store)
+
+    # First run creates a trajectory
+    trajectory1 = runner.run(
+        context=TaskContext(user_input="Hello world"),
+        channel="local",
+        persist=True,
+    )
+
+    # Build case index from the trajectory
+    case_index = CaseIndex()
+    case_index.add(trajectory1)
+
+    # Second run with case index should inject an episodic candidate
+    runner_with_case = MemabraRunner(
+        retriever=retriever,
+        router=RuleBasedRouter(),
+        persistence_store=store,
+        case_index=case_index,
+    )
+    trajectory2 = runner_with_case.run(
+        context=TaskContext(user_input="Hello world"),
+        channel="local",
+        persist=True,
+    )
+
+    memory_candidates = trajectory2["candidate_sets"]["memory"]
+    assert any(c["id"].startswith("episodic-") for c in memory_candidates)
+    # With a persistence store, the runner should generate a rich episodic summary
+    assert any("Task:" in c["summary"] for c in memory_candidates)
diff --git a/tests/test_schemas.py b/tests/test_schemas.py
new file mode 100644
index 0000000..d53fb10
--- /dev/null
+++ b/tests/test_schemas.py
@@ -0,0 +1,30 @@
+import json
+
+import pytest
+from memabra.schemas import SchemaRegistry, SchemaValidationError
+
+EXAMPLE_TRAJECTORY = "docs/examples/trajectory_success_memory.json"
+
+
+def test_schema_registry_validates_example_trajectory():
+    registry = SchemaRegistry()
+    with open(EXAMPLE_TRAJECTORY, "r", encoding="utf-8") as f:
+        example = json.load(f)
+    registry.validate_trajectory(example)
+
+
+def test_schema_registry_rejects_missing_required_keys():
+    registry = SchemaRegistry()
+    with pytest.raises(SchemaValidationError):
+        registry.validate_trajectory({"trajectory_id": "oops"})
+
+
+def test_no_resource_warning_from_schema_validation():
+    import warnings
+
+    with warnings.catch_warnings(record=True) as w:
+        warnings.simplefilter("always", ResourceWarning)
+        test_schema_registry_validates_example_trajectory()
+
+    resource_warnings = [x for x in w if issubclass(x.category, ResourceWarning)]
+    assert len(resource_warnings) == 0
diff --git a/tests/test_skill_adapters.py b/tests/test_skill_adapters.py
new file mode 100644
index 0000000..baa6d23
--- /dev/null
+++ b/tests/test_skill_adapters.py
@@ -0,0 +1,107 @@
+from pathlib import Path
+
+from memabra.candidate_types import CandidateObject
+from memabra.execution import ExecutionEngine, FileSystemSkillBackend, SkillExecutor
+from memabra.retrieval import CandidateRetriever, InMemoryCandidateProvider
+from memabra.router import RouteDecision, RuleBasedRouter, TaskContext
+from memabra.runner import MemabraRunner
+
+
+def test_filesystem_skill_backend_loads_skill_from_directory(tmp_path: Path):
+    skill_dir = tmp_path / "category-a" / "skill-demo"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        "---\n"
+        "name: skill-demo\n"
+        "description: A demo skill for testing.\n"
+        "version: 1.0.0\n"
+        "---\n\n"
+        "# Demo Skill\n\n"
+        "This is the demo skill body.\n"
+    )
+
+    backend = FileSystemSkillBackend(search_paths=[tmp_path])
+    payload = backend.load_skill("skill-demo")
+
+    assert payload["skill_id"] == "skill-demo"
+    assert payload["name"] == "skill-demo"
+    assert payload["description"] == "A demo skill for testing."
+    assert "This is the demo skill body." in payload["content"]
+
+
+def test_filesystem_skill_backend_returns_error_for_missing_skill(tmp_path: Path):
+    backend = FileSystemSkillBackend(search_paths=[tmp_path])
+    payload = backend.load_skill("nonexistent")
+
+    assert payload["skill_id"] == "nonexistent"
+    assert payload["status"] == "error"
+    assert "not found" in payload["error"].lower()
+
+
+def test_skill_executor_uses_filesystem_backend_to_load_payload(tmp_path: Path):
+    skill_dir = tmp_path / "ops" / "skill-deploy"
+    skill_dir.mkdir(parents=True)
+    skill_file = skill_dir / "SKILL.md"
+    skill_file.write_text(
+        "---\n"
+        "name: skill-deploy\n"
+        "description: Deploy workflow skill.\n"
+        "---\n\n"
+        "# Deploy Workflow\n\n"
+        "1. Build\n2. Test\n3. Deploy\n"
+    )
+
+    backend = FileSystemSkillBackend(search_paths=[tmp_path])
+    executor = SkillExecutor(backend=backend)
+    decision = RouteDecision(decision_type="load_skill", selected_ids=["skill-deploy"])
+    result = executor.execute(decision, TaskContext(user_input="deploy"), trajectory_id="traj-1")
+
+    assert result.status == "executed"
+    assert result.details["payloads"][0]["name"] == "skill-deploy"
+    assert "1. Build" in result.details["payloads"][0]["content"]
+    assert any(event.event_type == "skill_loaded" for event in result.events)
+
+
+def test_execution_engine_runs_skill_path_end_to_end(tmp_path: Path):
+    skill_dir = tmp_path / "ops" / "skill-deploy"
+    skill_dir.mkdir(parents=True)
+    (skill_dir / "SKILL.md").write_text(
+        "---\n"
+        "name: skill-deploy\n"
+        "description: Deploy workflow skill.\n"
+        "---\n\n"
+        "Deploy steps here.\n"
+    )
+
+    retriever = CandidateRetriever(
+        [
+            InMemoryCandidateProvider(
+                candidate_type="skill",
+                candidates=[
+                    CandidateObject(
+                        id="skill-deploy",
+                        type="skill",
+                        title="deploy workflow",
+                        summary="Reusable deployment procedure.",
+                        triggers=["deploy", "workflow"],
+                        confidence=0.9,
+                        success_rate=0.95,
+                        freshness=0.8,
+                    )
+                ],
+            )
+        ]
+    )
+    runner = MemabraRunner(
+        retriever=retriever,
+        router=RuleBasedRouter(),
+        execution_engine=ExecutionEngine(skill_backend=FileSystemSkillBackend(search_paths=[tmp_path])),
+    )
+
+    trajectory = runner.run(context=TaskContext(user_input="Deploy this service with the usual workflow."))
+
+    skill_events = [event for event in trajectory["events"] if event["event_type"] == "skill_loaded"]
+    assert skill_events
+    assert skill_events[0]["payload"]["name"] == "skill-deploy"
+    assert "Deploy steps here." in skill_events[0]["payload"]["content"]
diff --git a/tests/test_tool_adapters.py b/tests/test_tool_adapters.py
new file mode 100644
index 0000000..e1cdd1a
--- /dev/null
+++ b/tests/test_tool_adapters.py
@@ -0,0 +1,66 @@
+from memabra.router import TaskContext
+
+
+def test_local_function_tool_adapter_executes_callable():
+    from memabra.execution import LocalFunctionToolAdapter
+
+    def add(a: int, b: int) -> int:
+        return a + b
+
+    adapter = LocalFunctionToolAdapter(func=add)
+    result = adapter.run_tool("add", TaskContext(user_input="add 1 and 2"), {"a": 1, "b": 2})
+
+    assert result["status"] == "success"
+    assert result["output"] == 3
+    assert result["error"] is None
+
+
+def test_subprocess_tool_adapter_executes_command():
+    from memabra.execution import SubprocessToolAdapter
+
+    adapter = SubprocessToolAdapter(command="echo hello")
+    result = adapter.run_tool("echo", TaskContext(user_input="say hello"))
+
+    assert result["status"] == "success"
+    assert "hello" in result["output"]
+    assert result["error"] is None
+    assert result["latency_ms"] >= 0
+
+
+def test_tool_registry_resolves_and_runs_tools():
+    from memabra.execution import LocalFunctionToolAdapter, ToolRegistry
+
+    registry = ToolRegistry()
+    registry.register("double", LocalFunctionToolAdapter(func=lambda x: x * 2))
+
+    result = registry.run_tool("double", TaskContext(user_input="double 5"), {"x": 5})
+
+    assert result["status"] == "success"
+    assert result["output"] == 10
+
+
+def test_tool_registry_returns_error_for_unknown_tool():
+    from memabra.execution import ToolRegistry
+
+    registry = ToolRegistry()
+    result = registry.run_tool("missing", TaskContext(user_input="missing"))
+
+    assert result["status"] == "error"
+    assert "not found" in result["error"].lower()
+
+
+def test_tool_executor_uses_registry_and_produces_result_events():
+    from memabra.execution import ToolExecutor, ToolRegistry, LocalFunctionToolAdapter
+    from memabra.router import RouteDecision
+
+    registry = ToolRegistry()
+    registry.register("add", LocalFunctionToolAdapter(func=lambda a, b: a + b))
+
+    executor = ToolExecutor(backend=registry)
+    decision = RouteDecision(decision_type="call_tool", selected_ids=["add"], selected_payloads=[{"a": 2, "b": 3}])
+    result = executor.execute(decision, TaskContext(user_input="add 2 and 3"), trajectory_id="traj-1")
+
+    assert result.status == "executed"
+    assert result.details["results"][0]["output"] == 5
+    assert any(event.event_type == "tool_called" for event in result.events)
+    assert any(event.event_type == "tool_result" for event in result.events)
diff --git a/tests/test_training_reports.py b/tests/test_training_reports.py
new file mode 100644
index 0000000..b6f358c
--- /dev/null
+++ b/tests/test_training_reports.py
@@ -0,0 +1,74 @@
+from __future__ import annotations
+
+from datetime import datetime, timezone
+
+from memabra.evaluator import EvaluationResult
+from memabra.promotion import PromotionDecision, PromotionPolicy
+from memabra.training_reports import TrainingReportStore, build_report
+
+
+def test_build_report_includes_all_required_fields():
+    baseline = EvaluationResult(task_count=2, avg_reward=0.5, error_rate=0.1, avg_latency_ms=50.0)
+    challenger = EvaluationResult(task_count=2, avg_reward=0.6, error_rate=0.05, avg_latency_ms=45.0)
+    decision = PromotionDecision(accepted=True, reasons=[], metrics={"reward_delta": 0.1})
+
+    report = build_report(
+        source_trajectory_ids=["t1", "t2"],
+        baseline=baseline,
+        challenger=challenger,
+        decision=decision,
+        promoted_version_id="v-2026",
+    )
+
+    assert report["source_trajectory_ids"] == ["t1", "t2"]
+    assert report["sample_count"] == 2
+    assert "timestamp" in report
+    assert report["promoted_version_id"] == "v-2026"
+    assert report["baseline_metrics"]["avg_reward"] == 0.5
+    assert report["challenger_metrics"]["avg_reward"] == 0.6
+    assert report["promotion_decision"]["accepted"] is True
+
+
+def test_training_report_store_save_and_list(tmp_path):
+    store = TrainingReportStore(base_dir=tmp_path / "reports")
+    report = build_report(
+        source_trajectory_ids=["t1"],
+        baseline=EvaluationResult(task_count=1, avg_reward=0.5, error_rate=0.0, avg_latency_ms=10.0),
+        challenger=EvaluationResult(task_count=1, avg_reward=0.6, error_rate=0.0, avg_latency_ms=10.0),
+        decision=PromotionDecision(accepted=False, reasons=["reward too low"], metrics={}),
+    )
+
+    saved = store.save(report)
+    reports = store.list_reports()
+
+    assert len(reports) == 1
+    assert reports[0]["report_id"] == saved["report_id"]
+    assert reports[0]["promotion_decision"]["accepted"] is False
+
+
+def test_training_report_store_get_report_returns_specific_report(tmp_path):
+    from memabra.training_reports import TrainingReportStore, build_report
+    from memabra.evaluator import EvaluationResult
+    from memabra.promotion import PromotionDecision
+
+    store = TrainingReportStore(base_dir=tmp_path)
+    report = build_report(
+        source_trajectory_ids=["t1", "t2"],
+        baseline=EvaluationResult(task_count=1, trajectories=[], avg_reward=0.5, error_rate=0.0, avg_latency_ms=10.0, decision_distribution={}),
+        challenger=EvaluationResult(task_count=1, trajectories=[], avg_reward=0.6, error_rate=0.0, avg_latency_ms=10.0, decision_distribution={}),
+        decision=PromotionDecision(accepted=True, reasons=[], metrics={}),
+        promoted_version_id="v1",
+    )
+    store.save(report)
+
+    fetched = store.get_report(report["report_id"])
+    assert fetched is not None
+    assert fetched["report_id"] == report["report_id"]
+    assert fetched["promoted_version_id"] == "v1"
+
+
+def test_training_report_store_get_report_missing_returns_none(tmp_path):
+    from memabra.training_reports import TrainingReportStore
+
+    store = TrainingReportStore(base_dir=tmp_path)
+    assert store.get_report("nonexistent") is None
diff --git a/tests/test_trajectory_summary.py b/tests/test_trajectory_summary.py
new file mode 100644
index 0000000..2f91623
--- /dev/null
+++ b/tests/test_trajectory_summary.py
@@ -0,0 +1,58 @@
+from memabra.trajectory_summary import TrajectorySummarizer
+
+
+def test_summarize_direct_answer_success():
+    summarizer = TrajectorySummarizer()
+    trajectory = {
+        "task": {"input": "What is 2+2?"},
+        "decisions": [{"decision_type": "direct_answer"}],
+        "outcome": {"status": "success", "steps": 1, "tool_errors": 0, "user_corrections": 0},
+        "reward": {"total": 1.0},
+    }
+    summary = summarizer.summarize(trajectory)
+    assert "Task: 'What is 2+2?'" in summary
+    assert "Actions: direct_answer" in summary
+    assert "Outcome: success (reward=1.0, steps=1)" in summary
+
+
+def test_summarize_multi_step_with_tool_errors():
+    summarizer = TrajectorySummarizer()
+    trajectory = {
+        "task": {"input": "Run analysis"},
+        "decisions": [
+            {"decision_type": "clarify"},
+            {"decision_type": "call_tool"},
+            {"decision_type": "direct_answer"},
+        ],
+        "outcome": {"status": "partial_success", "steps": 3, "tool_errors": 1, "user_corrections": 1},
+        "reward": {"total": 0.5},
+    }
+    summary = summarizer.summarize(trajectory)
+    assert "Actions: clarify -> call_tool -> direct_answer" in summary
+    assert "Outcome: partial_success (reward=0.5, steps=3)" in summary
+    assert "Tool errors: 1" in summary
+    assert "User corrections: 1" in summary
+
+
+def test_summarize_truncates_long_input():
+    summarizer = TrajectorySummarizer()
+    long_input = "a" * 100
+    trajectory = {
+        "task": {"input": long_input},
+        "decisions": [{"decision_type": "direct_answer"}],
+        "outcome": {"status": "success", "steps": 1, "tool_errors": 0, "user_corrections": 0},
+        "reward": {"total": 0.9},
+    }
+    summary = summarizer.summarize(trajectory)
+    assert "Task: '" in summary
+    assert "..." in summary
+    assert len(summary) < 300
+
+
+def test_summarize_handles_missing_fields_gracefully():
+    summarizer = TrajectorySummarizer()
+    trajectory = {}
+    summary = summarizer.summarize(trajectory)
+    assert "Task: ''" in summary
+    assert "Actions: none" in summary
+    assert "Outcome: unknown (reward=0.0, steps=0)" in summary