Initial standalone memabra release

Carlos Ouyang
2026-04-15 11:06:05 +08:00
commit 58f9f221b1
464 changed files with 30256 additions and 0 deletions

84
README.md Normal file

@@ -0,0 +1,84 @@
# memabra
An intuition-driven control plane for agent memory and action selection.
## What is memabra?
memabra is a local-first, observable, trainable, and replayable agent memory and action orchestration system.
Instead of being a simple memory database, memabra acts as a meta-cognitive controller for agents: given a task, it quickly decides whether to answer directly, recall memory, load a skill, or invoke a tool — and continuously improves this judgment based on task outcomes.
## Install
```bash
git clone https://github.com/TacitLab/memabra.git
cd memabra
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
```
## Quick start
### 1. See the available commands
```bash
memabra --help
```
### 2. Run a dry-run evaluation
A safe way to see the full workflow without actually promoting a new router version:
```bash
memabra run --dry-run --format text
```
### 3. Check system status
```bash
memabra status --format text
```
### 4. List saved router versions
```bash
memabra version list --format text
```
### 5. Roll back to a previous version
```bash
memabra version rollback <version-id> --format text
```
## CLI subcommands
| Command | Description |
|---------|-------------|
| `memabra run` | Run the online learning workflow |
| `memabra status` | Show current system state |
| `memabra version list` | List all saved router versions |
| `memabra version rollback <id>` | Roll back to a specific version |
## Text output format
By default, memabra prints JSON. For operator-friendly summaries, add `--format text`:
- **Status** — current version, trajectory/report counts, latest report timing and promotion outcome.
- **Version list** — total count, current active version highlighted.
- **Workflow** — grouped into Summary, Baseline, Challenger, Deltas, and Decision sections with normalized `yes/no` flags and fixed-precision metrics.
## Running tests
```bash
pytest tests/ -q
```
## Project status
See [docs/PROGRESS.md](docs/PROGRESS.md) for a detailed capability roadmap and [docs/DEMO.md](docs/DEMO.md) for walkthrough examples.
## License
MIT


@@ -0,0 +1,252 @@
# memabra Alpha Iteration 1 Plan
> For Hermes: continue this plan autonomously in small TDD-driven increments. Each run should complete one or more concrete tasks, update this file's progress section, run targeted tests first, then run the full memabra test suite.
Goal: turn memabra from a showable prototype into a safe self-improving alpha by adding an online learning loop with automatic training, evaluation, gated promotion, and rollback-safe router deployment.
Architecture:
- Keep the current layered design.
- Do not replace existing routers; add an orchestration layer around them.
- Promotion must be benchmark-gated: no automatic router switch without passing evaluation thresholds.
- Persist every training/promotion attempt as an auditable artifact.
Tech stack:
- Existing memabra Python package under `src/memabra/`
- Existing pytest suite under `tests/memabra/`
- Existing persistence via JSON artifacts; keep it simple for alpha
---
## Acceptance criteria
Alpha Iteration 1 is complete when memabra can:
1. detect newly accumulated trajectories
2. build a training dataset from eligible trajectories
3. train a challenger router automatically
4. run challenger vs baseline on a fixed benchmark set
5. promote challenger only if thresholds are met
6. save a versioned promoted router
7. keep an auditable training/promotion report
8. leave the currently active router unchanged when challenger loses
---
## Implementation phases
### Phase A — Benchmark-gated online learning loop
#### Task A1: Add a promotion policy object
Objective: define explicit acceptance rules for promoting a challenger router.
Files:
- Create: `src/memabra/promotion.py`
- Create: `tests/memabra/test_promotion.py`
Required behavior:
- Define a `PromotionPolicy` dataclass
- Inputs should include at least:
  - `min_reward_delta`
  - `max_error_rate_increase`
  - `max_latency_increase_ms`
  - `required_task_count`
- Provide `evaluate(baseline, challenger) -> PromotionDecision`
- `PromotionDecision` should include:
  - `accepted: bool`
  - `reasons: list[str]`
  - `metrics: dict`
TDD steps:
1. Write failing tests for accepted and rejected cases.
2. Run targeted tests and verify failure.
3. Implement minimal policy logic.
4. Re-run targeted tests.
5. Re-run full memabra suite.
#### Task A2: Add benchmark suite persistence
Objective: store and load a fixed benchmark task set for repeatable evaluations.
Files:
- Create: `src/memabra/benchmarks.py`
- Create: `tests/memabra/test_benchmarks.py`
Required behavior:
- Define a serializable benchmark suite format
- Load/save benchmark tasks from JSON
- Provide a default benchmark seed for memory/tool/skill/composite coverage
TDD steps:
1. Write failing benchmark round-trip tests.
2. Verify RED.
3. Implement load/save helpers.
4. Verify GREEN.
5. Run full suite.
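The round-trip behavior Task A2 asks for could be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's implementation: the field names (`task_id`, `user_input`, `category`), the `default_suite()` helper, and the sample prompts are all hypothetical; only the names `BenchmarkTask` and `BenchmarkSuite` appear elsewhere in these docs.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class BenchmarkTask:
    # Field names are illustrative assumptions, not the project's schema.
    task_id: str
    user_input: str
    category: str  # "memory", "tool", "skill", or "composite"


@dataclass
class BenchmarkSuite:
    name: str
    tasks: list[BenchmarkTask] = field(default_factory=list)

    def save(self, path: Path) -> None:
        """Write the suite as pretty-printed JSON."""
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "BenchmarkSuite":
        """Rebuild a suite from a file written by save()."""
        raw = json.loads(path.read_text(encoding="utf-8"))
        return cls(name=raw["name"], tasks=[BenchmarkTask(**t) for t in raw["tasks"]])


def default_suite() -> BenchmarkSuite:
    """Seed suite with one task per coverage category (made-up prompts)."""
    return BenchmarkSuite(
        name="default",
        tasks=[
            BenchmarkTask("bench-memory", "Use my saved preference.", "memory"),
            BenchmarkTask("bench-tool", "Check the current system status.", "tool"),
            BenchmarkTask("bench-skill", "Deploy with the usual workflow.", "skill"),
            BenchmarkTask("bench-composite", "Deploy, then report status.", "composite"),
        ],
    )
```

A round-trip test in the TDD steps would then save `default_suite()` to a temporary file, load it back, and assert equality.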
#### Task A3: Add online training coordinator
Objective: orchestrate dataset selection, training, evaluation, and promotion.
Files:
- Create: `src/memabra/online_learning.py`
- Create: `tests/memabra/test_online_learning.py`
Required behavior:
- Define `OnlineLearningCoordinator`
- It should:
  - query trajectories from `ArtifactIndex`
  - enforce minimum new trajectory count
  - train a challenger with `DatasetBuilder`
  - evaluate challenger with `Evaluator`
  - apply `PromotionPolicy`
  - save promoted routers via `RouterVersionStore`
  - emit a structured report whether accepted or rejected
TDD steps:
1. Write failing tests for:
   - skip when too few new trajectories
   - reject when policy fails
   - accept and save version when policy passes
2. Verify failure.
3. Implement minimal coordinator.
4. Verify targeted tests.
5. Run full suite.
### Phase B — Auditability and safe deployment
#### Task B1: Add training run reports
Objective: persist every online-learning attempt, not just successful promotions.
Files:
- Extend: `src/memabra/persistence.py` or create `src/memabra/training_reports.py`
- Create: `tests/memabra/test_training_reports.py`
Required behavior:
- Save a JSON report per training run
- Include:
  - timestamp
  - source trajectory ids
  - sample count
  - baseline metrics
  - challenger metrics
  - promotion decision
  - promoted version id if any
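One possible shape for such a report is sketched below, assuming one JSON file per run named after a report id. The `TrainingReport` class, its `report_id` field, and the `save_report()` helper are hypothetical names used for illustration; the remaining fields follow the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class TrainingReport:
    report_id: str  # hypothetical identifier, used as the file name
    source_trajectory_ids: list[str]
    sample_count: int
    baseline_metrics: dict
    challenger_metrics: dict
    promotion_decision: dict
    promoted_version_id: Optional[str] = None  # stays None when rejected
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def save_report(report: TrainingReport, reports_dir: Path) -> Path:
    """Persist one report as <reports_dir>/<report_id>.json and return the path."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    path = reports_dir / f"{report.report_id}.json"
    path.write_text(
        json.dumps(asdict(report), indent=2, sort_keys=True), encoding="utf-8"
    )
    return path
```

Because rejected runs are saved the same way with `promoted_version_id` left as `None`, every attempt leaves an artifact, which is the point of Task B1.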
#### Task B2: Add active router metadata tracking
Objective: make it obvious which router is active and why.
Files:
- Extend: `src/memabra/router_versioning.py`
- Extend: `tests/memabra/test_router_versioning.py`
Required behavior:
- Track metadata for current active router
- Record promotion source, benchmark result summary, and prior version
- Make rollback preserve audit trail
### Phase C — Product surface and automation
#### Task C1: Add app-level online learning entrypoint
Objective: expose one-call retrain/evaluate/promote behavior from `MemabraApp`.
Files:
- Extend: `src/memabra/app.py`
- Extend: `tests/memabra/test_app.py`
Required behavior:
- Add a method like `run_online_learning_cycle(...)`
- Return a structured result dict/report
#### Task C2: Add CLI entrypoint for the alpha loop
Objective: make the safe online-learning loop runnable from the command line.
Files:
- Extend: `src/memabra/cli.py`
- Extend: `tests/memabra/test_cli_workflow.py`
- Update: `docs/projects/memabra/DEMO.md`
Required behavior:
- Add a callable workflow that:
  - seeds or uses existing artifacts
  - runs one online-learning cycle
  - prints the report JSON
#### Task C3: Update docs and wrap-up materials
Objective: document the alpha loop clearly.
Files:
- Update: `docs/projects/memabra/PROGRESS.md`
- Update: `docs/projects/memabra/ROADMAP.md`
- Update: `docs/projects/memabra/DEMO.md`
- Optional: create `docs/projects/memabra/ONLINE_LEARNING.md`
Required behavior:
- Explain promotion gates
- Explain how to run one cycle manually
- Explain where reports and versions are stored
---
## Suggested run order for autonomous 20-minute cycles
Cycle group 1:
- A1 promotion policy
- A2 benchmark suite persistence
Cycle group 2:
- A3 online training coordinator
Cycle group 3:
- B1 training run reports
- B2 active router metadata tracking
Cycle group 4:
- C1 app-level entrypoint
- C2 CLI workflow
- C3 docs cleanup
---
## Estimated autonomous runs
Recommended initial budget: 18 runs, one every 20 minutes.
Reasoning:
- 3 to 4 runs for Phase A
- 3 to 4 runs for Phase B
- 2 to 3 runs for Phase C
- remaining runs as slack for regression fixes, docs cleanup, and one or two extra quality passes
At 20 minutes per run, 18 runs gives about 6 hours of autonomous iteration, which is a reasonable overnight alpha push.
---
## Progress tracker
- [x] Task A1 — promotion policy
- [x] Task A2 — benchmark suite persistence
- [x] Task A3 — online training coordinator
- [x] Task B1 — training run reports
- [x] Task B2 — active router metadata tracking
- [x] Task C1 — app-level online learning entrypoint
- [x] Task C2 — CLI online learning workflow
- [x] Task C3 — docs cleanup and operator guidance
- [x] Task D1 — baseline version selection for online learning
- [x] Task E1 — task case index for episodic retrieval
## Run log
- 2026-04-14: Plan created. Ready for autonomous overnight execution.
- 2026-04-14 22:52 UTC: Completed Tasks A1 to A3. Promotion policy, benchmark persistence, and online training coordinator implemented with tests. Full suite: 71 passed.
- 2026-04-14 23:22 UTC: Completed Tasks B1 to C3. Training reports, active router metadata tracking, app/CLI entrypoints, and docs implemented with tests. Full suite: 78 passed.
- 2026-04-14 23:24 UTC: Quality pass — CLI main() now defaults to online-learning workflow, fixed schema test resource warning, added missing alpha module exports to package __init__.py. Full suite: 82 passed.
- 2026-04-14 23:50 UTC: Docs and repo hygiene pass — updated DEMO.md and ONLINE_LEARNING.md to reflect that `python -m src.memabra.cli` runs the online-learning workflow; added `docs/projects/memabra/demo-artifacts/` to `.gitignore`; verified CLI end-to-end (promoted=true, version saved, report emitted). Full suite: 82 passed.
- 2026-04-15 00:49 UTC: Safety and usability pass — added exception handling in `OnlineLearningCoordinator` so training/evaluation failures emit error reports instead of crashing; added CLI argument parsing (`--base-dir`, `--min-new-trajectories`); fixed `python -m src.memabra.cli` RuntimeWarning via lazy `cli` import; added `TrainingReportStore.get_report()` for by-id lookup; exported `BenchmarkTask` from package `__init__.py`; updated DEMO.md and ONLINE_LEARNING.md. Full suite: 88 passed.
- 2026-04-15 01:15 UTC: Repo hygiene and commit pass — verified end-to-end CLI workflow produced a promoted router, version, and report; updated `.gitignore` to exclude runtime artifact directories (`router-versions/`, `training-reports/`); committed entire memabra alpha codebase (67 files, 6,818 insertions). Full suite: 88 passed.
- 2026-04-15 02:00 UTC: Persistence pass — `OnlineLearningCoordinator` now supports `seen_trajectory_store` to persist seen trajectory IDs across restarts, preventing duplicate retraining in cron jobs. Added `test_coordinator_persists_seen_trajectory_ids_across_restarts`. Fixed evaluation leakage by refreshing the artifact index after benchmarking and marking post-evaluation trajectories as seen. Wired `seen_trajectory_store` through `app.py` and `cli.py`; CLI now defaults to `<base-dir>/seen-trajectories.json`. Added corresponding tests. Full suite: 91 passed.
- 2026-04-15 02:27 UTC: Dry-run pass — committed pending persistence-pass changes, then added `--dry-run` CLI flag and `dry_run` parameter through the full stack (`OnlineLearningCoordinator`, `app.py`, `cli.py`). In dry-run mode training and evaluation execute but promotion and version saving are skipped; an audit report is still emitted with `dry_run: true`. Added `test_coordinator_dry_run_does_not_promote_or_save_version` and `test_main_entrypoint_passes_dry_run_flag`. Updated `ONLINE_LEARNING.md`. Full suite: 93 passed.
- 2026-04-15 02:51 UTC: Baseline-version pass — added `baseline_version_id` parameter to `OnlineLearningCoordinator.run_cycle()`, `MemabraApp.run_online_learning_cycle()`, and CLI `--baseline-version` flag. This lets operators evaluate a challenger against a specific saved router version rather than the currently active one. Added tests for coordinator, app, and CLI. Updated `ONLINE_LEARNING.md`. Full suite: 96 passed.
- 2026-04-15 03:18 UTC: Verification pass — confirmed all tasks A1 to D1 are complete and stable. Ran full memabra suite (96 passed) and end-to-end CLI workflow (promoted=true, version saved, report emitted). No code changes required; repo is clean and ready for operator review.
- 2026-04-15 04:02 UTC: Started Phase E — added `CaseIndex` (`src/memabra/case_index.py`) for task-level episodic retrieval. Maps normalized task inputs to the highest-reward trajectory ID, with JSON save/load. Added `tests/memabra/test_case_index.py` (4 tests). Full suite: 100 passed.
- 2026-04-15 04:27 UTC: Integrated `CaseIndex` into `MemabraApp` and `MemabraRunner` for episodic retrieval. Added app-level methods (`build_case_index`, `save_case_index`, `load_case_index`, `best_trajectory_for`). Runner now injects an episodic memory candidate when a case index hit occurs. Added CLI flags `--case-index` and `--rebuild-case-index`. Updated docs. Full suite: 107 passed.
- 2026-04-15 04:54 UTC: Added `case_index_path` support to `OnlineLearningCoordinator` so the case index is automatically rebuilt after each online-learning cycle (including benchmark-generated trajectories). Wired parameter through `app.py` and `cli.py`. Added tests for coordinator, app, and CLI. Full suite: 110 passed.
- 2026-04-15 05:18 UTC: Added `TrajectorySummarizer` (`src/memabra/trajectory_summary.py`) for generating human-readable trajectory summaries. Integrated summarizer into `MemabraRunner` so episodic memory candidates contain rich summaries when a `persistence_store` is available. Added `tests/memabra/test_trajectory_summary.py` (4 tests) and updated runner test. Full suite: 114 passed.
- 2026-04-15 05:42 UTC: Added CLI `--status` flag (`src/memabra/cli.py`) to print current system state (active router version, version count, trajectory count, report count, latest report summary) without running a learning cycle. Added `tests/memabra/test_cli_workflow.py::test_main_status_flag_prints_status_and_skips_workflow`. Full suite: 115 passed.
- 2026-04-15 06:05 UTC: Added CLI `--rollback` and `--list-versions` flags for operator-safe router version management. Added error handling for missing rollback targets (exits 1 with clean message). Added corresponding tests. Full suite: 118 passed. Updated `ONLINE_LEARNING.md` and `DEMO.md` documentation.

219
docs/ARCHITECTURE.md Normal file

@@ -0,0 +1,219 @@
# Architecture
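## 1. Problem definition
The problem we want to solve is not "how to make the model remember more", but rather:
When an agent meets a task, how can it quickly decide, under limited context, a limited tool budget, and limited time, whether to call on memory, a skill, or a tool, and how can that decision process be made trainable and correctable?
## 2. System overview
The system uses a four-layer architecture.
### 2.1 Retrieval Layer (candidate recall layer)
Inputs:
- the current user task
- a short summary of the conversation
- the current environment state
- failure history / recent corrections
Outputs:
- top-k memory candidates
- top-k skill candidates
- top-k tool candidates
Responsibilities:
- recall candidates from different sources
- normalize them into a standard candidate format
- make no final decision; only narrow the search space
### 2.2 Policy Layer (intuition / routing layer)
Inputs:
- a representation of the current task
- the candidate set
- historical selection features
- cost and risk signals
Outputs:
- answer directly
- read a specific memory
- load a specific skill
- call a specific tool
- a composite action (for example, skill first, then tool)
- ask for clarification
Responsibilities:
- simulate "intuition"
- make a fast action selection first
- can later be upgraded step by step from rules to a classifier, reranker, bandit, or RL policy
### 2.3 Execution Layer
Responsibilities:
- inject memory into the context
- load skill instructions
- call real tools
- record execution steps, elapsed time, errors, and outputs
### 2.4 Evaluation Layer (feedback / attribution layer)
Responsibilities:
- judge whether the task succeeded
- analyze step count, retry count, error rate, and number of user corrections
- decompose the reward
- produce trainable trajectories
Without this layer there is no real "learning", only parameter tuning by superstition.
## 3. Unified object model
Although memory, skill, and tool differ in nature, they can be unified as candidate objects during the recall and routing stages: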
```json
{
  "id": "string",
  "type": "memory|skill|tool",
  "title": "string",
  "summary": "string",
  "triggers": ["string"],
  "cost": 0.0,
  "confidence": 0.0,
  "success_rate": 0.0,
  "freshness": 0.0,
  "risk": 0.0,
  "embedding": "vector-ref",
  "tags": ["string"],
  "source": "user|system|generated|external"
}
```
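Note: what is unified is the candidate interface, not the semantic ontology.
The three kinds of object must keep their boundaries:
- memory stores facts
- skill stores procedures
- tool stores action capabilities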
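As a rough illustration of "one interface, separate types", the candidate schema could be mirrored by a small typed object that rejects unknown types and is bucketed by type after retrieval. This is a sketch under assumptions: the `Candidate` class, `ALLOWED_TYPES`, and `group_by_type()` are hypothetical names, and only a subset of the schema fields is included.

```python
from dataclasses import dataclass

ALLOWED_TYPES = ("memory", "skill", "tool")


@dataclass(frozen=True)
class Candidate:
    """Shared candidate interface; fields mirror a subset of the JSON schema."""
    id: str
    type: str
    title: str
    summary: str = ""
    cost: float = 0.0
    confidence: float = 0.0
    risk: float = 0.0

    def __post_init__(self) -> None:
        # Enforce the type boundary at construction time.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown candidate type: {self.type!r}")


def group_by_type(candidates):
    """Keep memory, skill, and tool candidates in separate buckets."""
    buckets = {kind: [] for kind in ALLOWED_TYPES}
    for candidate in candidates:
        buckets[candidate.type].append(candidate)
    return buckets
```

The router can score all candidates through the shared fields while execution code only ever receives one bucket, which keeps the three kinds from blurring together.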
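## 4. Memory system layers
### 4.1 Semantic Memory (factual memory)
For example:
- user preferences
- machine environment
- project conventions
- API limits
### 4.2 Procedural Memory
That is, skills:
- the handling workflow for a class of tasks
- lessons learned from past pitfalls
- verification steps
### 4.3 Episodic Memory
- the concrete trajectory of a particular task
- which resources were used at the time
- why it succeeded or failed
### 4.4 Working Memory
- temporary state of the current task
- intermediate products of the current round of reasoning
- should not be written directly into long-term memory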
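## 5. Training strategy: external policy first, end-to-end later
### 5.1 Phase A: leave the base model weights unchanged
First train a small policy model that decides:
- whether to look up memory
- which kind of memory to look up
- whether a skill is needed
- which tool to use first
Possible implementations:
- rules + score fusion
- lightweight classifier
- reranker
- contextual bandit
### 5.2 Phase B: learn reranking / routing from trajectories
Training inputs:
- task context
- candidate set
- the action actually taken
- the resulting reward
Training objectives:
- maximize task completion rate
- minimize useless calls
- reduce how often the user has to repeat information
- reduce unnecessary context bloat
### 5.3 Phase C: end-to-end experiments
Worth considering only when all of the following hold:
- high-quality trajectory data already exists
- credit assignment is feasible
- a stable offline evaluation environment exists
- catastrophic forgetting can be controlled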
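## 6. Feedback & Reward design
The reward cannot look only at whether the task succeeded. It should be split into several terms:
- task_success: whether the task was ultimately completed
- efficiency: how many steps were used
- retrieval_hit: whether the key memory/skill/tool was hit
- user_correction_penalty: whether the user had to correct the agent
- tool_error_penalty: whether invalid tool calls were triggered
- context_cost_penalty: whether the context grew excessively
- latency_penalty: whether the run was too slow
These can be combined as: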
```text
R = a*task_success + b*retrieval_hit - c*tool_error - d*user_correction - e*latency - f*context_cost
```
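The formula above can be written out directly. The sketch below is illustrative only: the `RewardWeights` class, the `total_reward()` function, and the weight values are placeholders invented for this example, and the signals are assumed to arrive as a plain dict keyed by the term names in the formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Placeholder values chosen only for illustration.
    a: float = 1.0   # task_success
    b: float = 0.5   # retrieval_hit
    c: float = 0.5   # tool_error
    d: float = 0.5   # user_correction
    e: float = 0.25  # latency
    f: float = 0.25  # context_cost


def total_reward(signals: dict, w: RewardWeights = RewardWeights()) -> float:
    """R = a*task_success + b*retrieval_hit - c*tool_error
           - d*user_correction - e*latency - f*context_cost."""

    def get(name: str) -> float:
        # Missing signals count as zero.
        return float(signals.get(name, 0.0))

    return (
        w.a * get("task_success")
        + w.b * get("retrieval_hit")
        - w.c * get("tool_error")
        - w.d * get("user_correction")
        - w.e * get("latency")
        - w.f * get("context_cost")
    )
```

With these placeholder weights, a successful run that hit the right memory but triggered one tool error scores 1.0 + 0.5 - 0.5 = 1.0. Keeping the terms separate in the logged trajectory, rather than storing only the total, is what makes later attribution possible.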
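## 7. Key difficulties
### 7.1 Credit Assignment
When a task succeeds, which component deserves the credit?
The candidate set, the final choice, and the unselected alternatives must all be recorded before counterfactual analysis is possible.
### 7.2 False Reinforcement
A wrong memory that keeps getting hit will reinforce itself.
This requires:
- confidence scores
- revocability
- time of most recent verification
- source tracking
### 7.3 Exploitation vs Exploration
Always choosing the safest object makes the system conservative, and it never learns new patterns.
A safe exploration mechanism is needed.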
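The document does not say which exploration mechanism is intended. One common option, shown here purely as an assumption, is epsilon-greedy selection restricted to low-risk candidates: exploit the best-scoring candidate most of the time, and occasionally try an alternative whose `risk` is under a cap. The function name, parameters, and dict keys are hypothetical.

```python
import random


def choose_candidate(candidates, epsilon=0.1, max_risk=0.3, rng=None):
    """Pick the top-scoring candidate, occasionally exploring a low-risk one.

    candidates: non-empty list of dicts with "id", "score", and "risk" keys.
    """
    rng = rng or random.Random()
    best = max(candidates, key=lambda c: c["score"])
    # Only candidates under the risk cap are eligible for exploration.
    safe_alternatives = [
        c for c in candidates if c is not best and c["risk"] <= max_risk
    ]
    if safe_alternatives and rng.random() < epsilon:
        return rng.choice(safe_alternatives)
    return best
```

Setting `epsilon=0.0` turns exploration off entirely, which gives operators a simple switch for fully conservative behavior.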
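### 7.4 Type Boundary Collapse
If memory, skill, and tool are mixed into one large vector pool, the system becomes increasingly blurry.
## 8. Recommended MVP
### MVP-1: Observable system
- define the object schema
- define the event schema
- record trajectories in a unified way
- implement basic retrieval
- route with rules
### MVP-2: Lightweight learned routing
- add a candidate scorer
- train an action selector from high-quality trajectories
- run offline replay evaluation
### MVP-3: Online adaptation
- use bandit / preference updates
- fine-tune the routing policy based on task outcomes
### MVP-4: End-to-end testbed
- small-scale experimental training
- compare against the layered approach
- verify whether there is any real gain
## 9. Core principles
1. Observable first, then learnable
2. Learn routing first, then learn the brain
3. Do layered attribution first, then end-to-end optimization
4. Optimize "when to rely on what" rather than blindly optimizing "the model looks smarter"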

94
docs/DECISIONS.md Normal file

@@ -0,0 +1,94 @@
# Design Decisions
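## D-001: End-to-end training is not a first-phase goal
Decision:
The first phase adopts a layered architecture rather than directly training one large black-box model that maps tasks to actions.
Reasons:
- feedback is sparse
- credit assignment is hard
- with too little data the model easily learns the wrong thing
- interpretability is too poor, which makes debugging hard
Impact:
The project first builds the observability, logging, router, and reward layers.
## D-002: Unify memory, skill, and tool behind one candidate interface without conflating their types
Decision:
During recall and ranking the three share a unified candidate schema; during storage, execution, and evaluation they keep strongly typed boundaries.
Reasons:
- unified recall simplifies routing decisions
- keeping type boundaries avoids semantic collapse
Impact:
Later schema design must support both the unified features and type-specific fields.
## D-003: Memory is split into four layers: facts / procedures / episodes / working
Decision:
The long-term system distinguishes at least:
- facts
- procedures
- episodes
- working memory
Reasons:
"Memory" is not one lump of text; effective human intuition comes from several memory systems working together.
Impact:
Every write must first decide which layer it belongs to instead of being pushed straight into a single vector store.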
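## D-004: Optimize the routing policy first, then consider learning the base model's internal weights
Decision:
The learning target is the external policy first, not the foundation model's parameters.
Reasons:
- small models are cheaper
- training is more stable
- experimental results are easier to compare
- better suited to local deployment
Impact:
Router features, training samples, and an offline evaluation framework need to be designed specifically.
## D-005: The reward must be decomposed rather than a single task success/failure signal
Decision:
The reward is split into factors such as success, efficiency, retrieval_hit, user_correction, tool_error, latency, and context_cost.
Reasons:
Looking only at task success hides many quality problems in intermediate behavior.
Impact:
Event-level logging is required; storing only the final answer is not enough.
## D-006: All learning is built on replayable trajectories
Decision:
Every policy update must be traceable to a complete trajectory.
Reasons:
Without replay, policy degradation cannot be diagnosed, and human audit is impossible.
Impact:
The trajectory schema and replay tooling become infrastructure rather than optional extras.
## D-007: The project is officially named memabra
Decision:
The official project name is `memabra`.
Subtitle:
An intuition-driven control plane for agent memory and action selection.
Reasons:
- a short name that can be branded and spread is needed
- the subtitle supplies the technical essence
- it avoids the old name misleading people into seeing the project as "just a memory management tool"
Impact:
All subsequent prototype code, documentation, schema identifiers, and demo materials use memabra consistently.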

148
docs/DEMO.md Normal file

@@ -0,0 +1,148 @@
# Demo
memabra now has a polished wrap-up workflow in addition to the lower-level demo app.
## Quick run
If you installed the repo in editable mode, prefer the dedicated CLI command:
```bash
source venv/bin/activate
memabra
```
The legacy developer entrypoint still works too:
```bash
source venv/bin/activate
python -m src.memabra.cli
```
This runs the online-learning loop: it seeds demo tasks, trains a challenger router, evaluates it against a benchmark suite, promotes it if thresholds are met, and prints a JSON report.
You can override the default artifact directory and minimum trajectory threshold:
```bash
source venv/bin/activate
memabra run --base-dir /custom/artifacts --min-new-trajectories 5
```
You can also enable episodic retrieval by rebuilding the case index from saved trajectories:
```bash
source venv/bin/activate
memabra run --rebuild-case-index
```
You can check system status, list versions, or roll back without running a learning cycle:
```bash
source venv/bin/activate
memabra status
memabra version list
memabra version rollback 20260414-123456
```
If you want operator-friendly output instead of raw JSON, use `--format text`:
```bash
source venv/bin/activate
memabra status --format text
memabra version list --format text
memabra version rollback 20260414-123456 --format text
memabra run --dry-run --format text
```
The text formatter is aimed at operators: status output includes the latest report timing/outcome, version listings highlight the currently active router version, and workflow output is grouped into summary/baseline/challenger/deltas/decision sections with normalized yes/no and fixed-precision metrics.
You can also call it programmatically:
```bash
source venv/bin/activate
python - <<'PY'
from src.memabra.cli import run_online_learning_workflow
result = run_online_learning_workflow()
print(result)
PY
```
The online-learning workflow will:
1. build a demo app
2. seed example tasks (if no trajectories exist yet)
3. run one online-learning cycle
4. train a challenger router
5. evaluate it against the baseline on a fixed benchmark suite
6. promote it only if the promotion policy accepts
7. persist a training report under `training-reports/`
8. print a JSON report
## Python API
```python
from src.memabra.cli import run_wrapup_workflow, run_online_learning_workflow
# Legacy wrap-up demo
result = run_wrapup_workflow()
print(result)
# Safe online-learning loop with benchmark-gated promotion
result = run_online_learning_workflow()
print(result)
```
## Lower-level demo app
You can still drive the app manually:
```bash
source venv/bin/activate
python - <<'PY'
from src.memabra.app import build_demo_app
app = build_demo_app()
for prompt in [
    'Use my telegram preference for this answer.',
    'Check the current system status.',
    'Deploy this service with the usual workflow.',
]:
    trajectory = app.run_task(prompt, channel='telegram', user_id='oza')
    print(prompt)
    print(trajectory['decisions'][0]['decision_type'], trajectory['outcome']['status'], trajectory['reward']['total'])
    print([event['event_type'] for event in trajectory['events']])
    print('---')
print(app.replay_summary())
PY
```
## Output locations
By default the workflows write to:
- `docs/projects/memabra/demo-artifacts/trajectories/`
- `docs/projects/memabra/demo-artifacts/memories/`
- `docs/projects/memabra/demo-artifacts/router-versions/`
- `docs/projects/memabra/demo-artifacts/training-reports/`
## What this proves
The alpha is able to demonstrate the whole loop:
- retrieval
- routing
- execution
- persistence
- replay
- training
- evaluation
- router versioning
- benchmark-gated promotion
- auditable training reports
## Limits
This is still an alpha:
- learning is lightweight, not a deep model
- storage is JSON-file based
- promotion policy thresholds are manually configured
- tool/skill integration is still narrower than a production agent platform
But it is now a safe, self-improving alpha, not just a pile of modules.


@@ -0,0 +1,77 @@
# Execution and Persistence
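## Goal
Give memabra the two bones that actually ground the system:
- execution: let routing decisions reach an executable action layer
- persistence: let trajectories and memory records be written to disk
## Current implementation
### execution.py
Provides:
- `ActionResult`
- `MemoryExecutor`
- `SkillExecutor`
- `ToolExecutor` (formerly MockToolExecutor, now upgraded so it can connect to real backends)
- `ExecutionEngine`
- `ToolBackend` protocol (supports passing `params`)
- `LocalFunctionToolAdapter`: maps a tool to a local Python function
- `SubprocessToolAdapter`: maps a tool to a shell command
- `ToolRegistry`: registers, looks up, and executes tools by `tool_id`
Current behavior:
- `inject_memory` emits a `memory_injected` event and, when a memory store is present, marks `last_used_at`
- `load_skill` emits a `skill_loaded` event
- `call_tool` calls a real backend through the `ToolBackend` protocol and emits `tool_called` and `tool_result` events
- `RouteDecision` now carries `selected_payloads`, so candidate parameters can be passed through `ToolExecutor` to the backend
- other decision_type values fall through to a noop for now
The significance of this step:
for the first time memabra has an execution stage rather than only a policy stage.
In addition, the tool layer can now connect to real local-function or subprocess backends and is no longer purely mock.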
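To make the pluggable-backend idea concrete, here is a minimal sketch of how a backend protocol, a local-function adapter, and a registry keyed by `tool_id` might fit together. The class names come from the list above, but every method name, signature, and result key (`run`, `register`, `execute`, `status`, `output`) is an assumption made for illustration and may differ from the real `execution.py`.

```python
from typing import Any, Callable, Optional, Protocol


class ToolBackend(Protocol):
    # Assumed method shape; the real protocol may differ.
    def run(self, tool_id: str, params: dict) -> dict: ...


class LocalFunctionToolAdapter:
    """Backend that forwards params to a local Python function."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func

    def run(self, tool_id: str, params: dict) -> dict:
        try:
            output = self._func(**params)
        except Exception as exc:  # report failures instead of crashing the run
            return {"tool_id": tool_id, "status": "error", "error": str(exc)}
        return {"tool_id": tool_id, "status": "success", "output": output}


class ToolRegistry:
    """Registers, looks up, and executes backends by tool_id."""

    def __init__(self) -> None:
        self._backends: dict = {}

    def register(self, tool_id: str, backend: ToolBackend) -> None:
        self._backends[tool_id] = backend

    def execute(self, tool_id: str, params: Optional[dict] = None) -> dict:
        backend = self._backends.get(tool_id)
        if backend is None:
            return {"tool_id": tool_id, "status": "error", "error": "unknown tool"}
        return backend.run(tool_id, params or {})
```

For example, `registry.register("add", LocalFunctionToolAdapter(lambda x, y: x + y))` followed by `registry.execute("add", {"x": 2, "y": 3})` returns a result dict whose `output` is `5`. Returning errors as data rather than raising keeps failed tool calls visible in the trajectory.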
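### persistence.py
Provides:
- `PersistenceStore`
Current capabilities:
- save a trajectory to `artifacts/trajectories/`
- read a trajectory
- list trajectory files
- save a memory record to `artifacts/memories/`
- read a memory record
- list memory files
This means prototype artifacts are no longer only floating in memory.
### runner writeback integration
The runner now supports:
- attaching an execution engine
- attaching a persistence store
- attaching a memory store
- appending execution events after execution
- optionally writing the trajectory to disk
- basic writeback / mark_used for memory inject decisions
## Current closed loop
The minimal system flow is now:
task -> retrieval -> router -> execution -> trajectory -> validation -> persistence -> replay
This is starting to feel like a real agent runtime.
## Current limitations
- ~~tool execution is still mocked~~ upgraded to a pluggable real backend
- skill execution is only at the event level; it does not actually load the skill
- writeback logic is still crude
- persistence is currently JSON files with no index layer
## Suggested next steps
1. ~~Build a real `ToolExecutor` / `SkillExecutor` adapter protocol~~ the tool adapter is done
2. Build a real `SkillExecutor` adapter that loads the skill payload from the file system
3. Make persistence the default data source for replay
4. Give the runner real update logic for outcome / reward
5. Add richer telemetry and attribution of failure events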

48
docs/NAMING.md Normal file

@@ -0,0 +1,48 @@
# Naming
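The final name is:
# memabra
Subtitle:
An intuition-driven control plane for agent memory and action selection.
## Why this name
The name works because it satisfies two things at once:
1. As a brand name, it is short, memorable, and distinctive.
2. As a system name, combined with the subtitle, it accurately conveys that the project is not a "memory store" but an action-selection and control system over memory, skill, and tool.
## Naming strategy
- Brand name: `memabra`
- Technical description: `An intuition-driven control plane for agent memory and action selection.`
With this split:
- `memabra` makes the project memorable
- the subtitle makes it understandable
## Why not a purely functional name
A name that describes the function directly, such as `Agent Memory Manager`, is too narrow:
- it sounds too much like a storage tool
- it does not convey routing / policy / evaluation / learning
- it does not convey that this is the agent's metacognitive controller
## Suggested internal wording
In technical documents, memabra can be described as:
- local-first metacognitive router
- agent memory and action orchestration system
- intuition-driven control plane
These three phrasings suit, respectively:
- research contexts
- engineering contexts
- external introductions
## Conclusion
The naming no longer emphasizes "memory manager" and instead emphasizes "intuition-driven control".
That is closer to the project's real skeleton.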

171
docs/ONLINE_LEARNING.md Normal file

@@ -0,0 +1,171 @@
# Online Learning Operator Guide
## What it does
memabra's online learning loop lets the system safely retrain its router from accumulated trajectories, evaluate the new challenger against the current baseline, and promote it only if explicit thresholds are met.
## How to run one cycle
### From Python
```python
from src.memabra.cli import run_online_learning_workflow
result = run_online_learning_workflow()
print(result)
```
### From the shell
```bash
source venv/bin/activate
python -m src.memabra.cli
```
Or with custom options:
```bash
source venv/bin/activate
python -m src.memabra.cli --base-dir /custom/artifacts --min-new-trajectories 5
```
By default the CLI persists seen trajectory IDs to `<base-dir>/seen-trajectories.json` so repeated runs skip already-processed data. You can override the path:
```bash
source venv/bin/activate
python -m src.memabra.cli --seen-trajectory-store /custom/artifacts/seen.json
```
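The idea behind the seen-trajectory file can be sketched as a small JSON-backed set of IDs. This is an assumption-laden illustration: the class name `SeenTrajectoryStore`, its methods, and the on-disk format (a sorted JSON list) are hypothetical, and the real `seen_trajectory_store` wiring may look different.

```python
import json
from pathlib import Path


class SeenTrajectoryStore:
    """Persists the IDs of already-processed trajectories in one JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def load(self) -> set[str]:
        # A missing file simply means nothing has been processed yet.
        if not self.path.exists():
            return set()
        return set(json.loads(self.path.read_text(encoding="utf-8")))

    def mark_seen(self, trajectory_ids) -> None:
        seen = self.load() | set(trajectory_ids)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(sorted(seen)), encoding="utf-8")

    def unseen(self, trajectory_ids) -> list[str]:
        """Return the IDs not processed yet, preserving input order."""
        seen = self.load()
        return [tid for tid in trajectory_ids if tid not in seen]
```

Because the state lives on disk rather than in the process, a second invocation (for example from cron) sees the IDs marked by the first and skips them.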
### Dry-run mode
To train and evaluate a challenger without actually promoting it or saving a new router version:
```bash
source venv/bin/activate
python -m src.memabra.cli --dry-run
```
This still produces a training report (with `dry_run: true`) so you can inspect what would have happened before allowing a real promotion.
### Evaluate against a specific baseline version
By default the online-learning cycle uses the currently active router as the baseline. You can pin the baseline to a specific saved version instead:
```bash
source venv/bin/activate
python -m src.memabra.cli --baseline-version 20260414-123456
```
This is useful when you want to compare a challenger against a known-good version rather than whatever happens to be active right now. The report will record `baseline_version_id` for audit.
### Episodic retrieval with case index
You can load or rebuild a case index for episodic retrieval during task execution:
```bash
source venv/bin/activate
python -m src.memabra.cli --rebuild-case-index
```
This builds a `CaseIndex` from all saved trajectories and saves it to the default path (`<base-dir>/case-index.json`). On subsequent runs, load it without rebuilding:
```bash
source venv/bin/activate
python -m src.memabra.cli --case-index /custom/artifacts/case-index.json
```
When a case index path is provided, the online-learning cycle automatically rebuilds the index after training and evaluation, so benchmark-generated trajectories are included for future episodic retrieval.
When a case index is loaded, the runner injects an episodic memory candidate into retrieval for inputs that match a previously seen task, surfacing the best past trajectory as a hint to the router.
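The case index described above, a map from normalized task input to the best past trajectory ID with JSON save/load, might look roughly like the sketch below. The normalization rule (lowercase plus whitespace collapsing), the `add()` method, and the stored record shape are assumptions for illustration; only the `CaseIndex` name and the `best_trajectory_for` lookup appear in the project notes.

```python
import json
from pathlib import Path
from typing import Optional


def normalize(text: str) -> str:
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


class CaseIndex:
    def __init__(self) -> None:
        self._best: dict = {}  # normalized input -> {"trajectory_id", "reward"}

    def add(self, task_input: str, trajectory_id: str, reward: float) -> None:
        key = normalize(task_input)
        current = self._best.get(key)
        # Keep only the highest-reward trajectory per normalized input.
        if current is None or reward > current["reward"]:
            self._best[key] = {"trajectory_id": trajectory_id, "reward": reward}

    def best_trajectory_for(self, task_input: str) -> Optional[str]:
        hit = self._best.get(normalize(task_input))
        return hit["trajectory_id"] if hit else None

    def save(self, path: Path) -> None:
        Path(path).write_text(json.dumps(self._best, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "CaseIndex":
        index = cls()
        index._best = json.loads(Path(path).read_text(encoding="utf-8"))
        return index
```

A lookup miss returns `None`, in which case the runner would simply skip injecting an episodic candidate.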
You can also run one cycle inline from the shell:
```bash
source venv/bin/activate
python - <<'PY'
from src.memabra.cli import run_online_learning_workflow
print(run_online_learning_workflow())
PY
```
## Promotion gates
A challenger is promoted only when **all** of the following are true:
- `reward_delta >= min_reward_delta` — the challenger must improve average reward by at least this amount
- `error_rate_delta <= max_error_rate_increase` — the challenger must not increase errors beyond this limit
- `latency_delta_ms <= max_latency_increase_ms` — the challenger must not become slower beyond this limit
- `task_count >= required_task_count` — the benchmark must include at least this many tasks
Default policy in the CLI workflow is lenient for alpha exploration. In production you should tighten these thresholds.
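The four gates can be expressed as a small policy object. The sketch below follows the threshold names and the `evaluate(baseline, challenger) -> PromotionDecision` shape from the iteration plan, but the metric keys it reads (`avg_reward`, `error_rate`, `avg_latency_ms`, `task_count`) are assumptions about what the evaluator returns, and the real `promotion.py` may differ.

```python
from dataclasses import dataclass, field


@dataclass
class PromotionDecision:
    accepted: bool
    reasons: list[str] = field(default_factory=list)
    metrics: dict = field(default_factory=dict)


@dataclass
class PromotionPolicy:
    min_reward_delta: float
    max_error_rate_increase: float
    max_latency_increase_ms: float
    required_task_count: int

    def evaluate(self, baseline: dict, challenger: dict) -> PromotionDecision:
        # Metric key names below are assumed for this sketch.
        metrics = {
            "reward_delta": challenger["avg_reward"] - baseline["avg_reward"],
            "error_rate_delta": challenger["error_rate"] - baseline["error_rate"],
            "latency_delta_ms": challenger["avg_latency_ms"] - baseline["avg_latency_ms"],
            "task_count": challenger["task_count"],
        }
        reasons = []
        if metrics["reward_delta"] < self.min_reward_delta:
            reasons.append("reward_delta below min_reward_delta")
        if metrics["error_rate_delta"] > self.max_error_rate_increase:
            reasons.append("error_rate_delta above max_error_rate_increase")
        if metrics["latency_delta_ms"] > self.max_latency_increase_ms:
            reasons.append("latency_delta_ms above max_latency_increase_ms")
        if metrics["task_count"] < self.required_task_count:
            reasons.append("task_count below required_task_count")
        # Accept only when every gate passed, i.e. no rejection reasons.
        return PromotionDecision(accepted=not reasons, reasons=reasons, metrics=metrics)
```

Collecting every failed gate in `reasons`, instead of stopping at the first, gives the training report a complete explanation of why a challenger was rejected.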
## Where reports and versions are stored
By default everything lands under:
- `docs/projects/memabra/demo-artifacts/trajectories/` — raw task trajectories
- `docs/projects/memabra/demo-artifacts/router-versions/versions/` — versioned router weights
- `docs/projects/memabra/demo-artifacts/router-versions/current.json` — active router metadata (includes promotion source, benchmark summary, prior version, rollback history)
- `docs/projects/memabra/demo-artifacts/training-reports/` — one JSON report per training run
## What happens when the challenger loses
- The active router in the app **remains unchanged**
- A training report is still saved with the rejection reasons
- No new version is registered as current
## Rolling back
You can roll back to any previous version from Python:
```python
from src.memabra.router_versioning import RouterVersionStore
store = RouterVersionStore()
store.rollback("20260414-123456")
current = store.get_current()
print(current)
```
Or from the CLI:
```bash
source venv/bin/activate
python -m src.memabra.cli --rollback 20260414-123456
```
To see all available versions before rolling back:
```bash
source venv/bin/activate
python -m src.memabra.cli --list-versions
```
Rollback preserves an audit trail in `current.json` (`rollback_from`, `rolled_back_at`).
## Status check
To quickly inspect the current system state without running a learning cycle:
```bash
source venv/bin/activate
python -m src.memabra.cli --status
```
## Architecture summary
```
Trajectories -> ArtifactIndex -> DatasetBuilder -> SimpleLearningRouter (challenger)
                                                             |
                                                             v
             BenchmarkSuite -> Evaluator -> baseline vs challenger
                                                             |
                                                             v
                                                PromotionPolicy.evaluate()
                                                             |
                                         +-------------------+-------------------+
                                         | accepted                              | rejected
                                         v                                       v
                             RouterVersionStore.save()                 training report saved
                             app.set_router(challenger)                active router unchanged
```

162
docs/PROGRESS.md Normal file

@@ -0,0 +1,162 @@
# memabra Progress
## Current status
Project status: safe self-improving alpha, benchmark-gated online learning loop complete
Date: 2026-04-15
Project: memabra
Subtitle: An intuition-driven control plane for agent memory and action selection.
## What exists now
memabra now has a complete safe self-improving alpha control-plane loop:
- candidate retrieval
- routing decisions
- memory / skill / tool execution
- telemetry events
- trajectory construction
- runtime validation
- artifact persistence
- replay and analytics
- artifact indexing and dataset slicing
- lightweight learning router training
- A/B evaluation
- router weight versioning and rollback
- benchmark-gated promotion with explicit policy thresholds
- auditable training reports
- exception-safe online learning coordinator
- configurable CLI entrypoint
- persisted seen-trajectory tracking across restarts (safe for cron jobs)
- dry-run mode for training/evaluation without promotion risk
- baseline version selection for challenger evaluation
- task case index (`CaseIndex`) for episodic retrieval: maps normalized inputs to the best past trajectory ID
- `CaseIndex` integration into `MemabraApp` (build, save, load, lookup) and `MemabraRunner` (injects episodic candidate on matching inputs)
- CLI flags `--case-index` and `--rebuild-case-index` for operator-managed episodic retrieval
- `OnlineLearningCoordinator` auto-rebuilds case index after each cycle when `case_index_path` is provided, ensuring benchmark-generated trajectories are indexed
- `TrajectorySummarizer` generates human-readable trajectory summaries from task input, decisions, outcome, and reward
- `MemabraRunner` enriches episodic memory candidate summaries using `TrajectorySummarizer` when `persistence_store` is available
- CLI `--status` flag prints current system state (active router version, counts, latest report) without triggering a learning cycle
- CLI is now subcommand-driven (`run`, `status`, `version list`, `version rollback`) with a dedicated packaged `memabra` entrypoint
- CLI `--format text` mode provides operator-friendly summaries for status checks, version listings, rollbacks, and workflow runs, including latest report details, current-version highlighting, sectioned workflow summaries, normalized yes/no flags, and fixed-precision benchmark/promotion metrics
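
The `CaseIndex` item in the list above (normalized input mapped to the best past trajectory ID) can be sketched roughly as follows. This is a toy illustration under stated assumptions: the class name `TinyCaseIndex`, the normalization rule, and the use of reward to pick the "best" trajectory are hypothetical and may differ from the real implementation in `src/memabra/case_index.py`.

```python
import re


def normalize_input(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace (assumed normalization)."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(cleaned.split())


class TinyCaseIndex:
    def __init__(self):
        # normalized input -> (reward, trajectory_id)
        self._best = {}

    def add(self, task_input: str, trajectory_id: str, reward: float) -> None:
        key = normalize_input(task_input)
        current = self._best.get(key)
        # Keep only the highest-reward trajectory per normalized input.
        if current is None or reward > current[0]:
            self._best[key] = (reward, trajectory_id)

    def lookup(self, task_input: str):
        hit = self._best.get(normalize_input(task_input))
        return hit[1] if hit else None


index = TinyCaseIndex()
index.add("Check the current system status.", "traj-a", reward=0.44)
index.add("check the CURRENT system status", "traj-b", reward=1.04)
print(index.lookup("Check the current system status!"))
```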
## Major completed capabilities
### Foundations
- project naming, architecture, roadmap, decisions, reward spec
- candidate / event / trajectory / memory schemas
- prototype package structure under `src/memabra/`
### Runtime path
- `retrieval.py`: typed candidate retrieval
- `router.py`: heuristic router, feature-scoring router, learning router
- `execution.py`: memory, skill, tool executors and adapters
- `runner.py`: end-to-end task -> trajectory orchestration
- `persistence.py`: trajectory and memory artifact storage
- `replay.py`: replay summaries over examples and persisted runs
- `memory_store.py`: typed memory records with verify/revoke support
### Adapters and evaluation
- real tool adapters:
  - `LocalFunctionToolAdapter`
  - `SubprocessToolAdapter`
  - `ToolRegistry`
- real skill loading:
  - `FileSystemSkillBackend`
- richer evaluation path:
  - `OutcomeEngine`
  - `RewardEngine`
  - `ArtifactIndex`
  - `DatasetBuilder`
  - `Evaluator`
  - `RouterVersionStore`
- Alpha Iteration 1, online learning loop:
  - `PromotionPolicy` with benchmark-gated promotion rules
  - `BenchmarkSuite` persistence (JSON load/save + default seed)
  - `OnlineLearningCoordinator` for retrain/evaluate/promote cycles
  - exception-safe coordinator: training/evaluation failures emit auditable error reports instead of crashing
  - `TrainingReportStore.get_report()` for by-id report lookup
### Product/demo surface
- `app.py`: `MemabraApp`, demo builders, artifact index access, training hooks, `run_online_learning_cycle`
- `cli.py`: wrap-up workflow and `run_online_learning_workflow` with benchmark-gated promotion
- `cli.py`: argument parsing (`--base-dir`, `--min-new-trajectories`) and clean `python -m src.memabra.cli` execution
- `DEMO.md`: runnable walkthrough with CLI options
## Current test status
Command:
`source venv/bin/activate && python -m pytest tests/memabra -q`
Latest result:
`118 passed`
All alpha iteration 1 source, tests, and documentation have been committed to the repository (commit `34cf507c`).
## Most important current files
### Core package
- `src/memabra/app.py`
- `src/memabra/cli.py`
- `src/memabra/router.py`
- `src/memabra/runner.py`
- `src/memabra/execution.py`
- `src/memabra/evaluator.py`
- `src/memabra/router_versioning.py`
- `src/memabra/promotion.py`
- `src/memabra/online_learning.py`
- `src/memabra/training_reports.py`
- `src/memabra/benchmarks.py`
- `src/memabra/case_index.py`
### Tests
- `tests/memabra/test_app.py`
- `tests/memabra/test_cli_workflow.py`
- `tests/memabra/test_package_exports.py`
- `tests/memabra/test_promotion.py`
- `tests/memabra/test_online_learning.py`
- `tests/memabra/test_training_reports.py`
- `tests/memabra/test_benchmarks.py`
- `tests/memabra/test_router_versioning.py`
- `tests/memabra/test_evaluator.py`
- `tests/memabra/test_router_protocol.py`
- `tests/memabra/test_execution_persistence.py`
## Wrap-up status
The project is now in a safe self-improving alpha state.
It can:
- run realistic demo tasks
- persist trajectories
- replay and inspect results
- train a lightweight router from saved artifacts
- compare baseline vs challenger routers
- apply a promotion policy with explicit thresholds
- save and reload router versions with metadata
- emit auditable training reports
- run an online-learning cycle from the CLI
- leave the active router unchanged when challenger fails
- survive training/evaluation failures gracefully and emit error reports
- accept CLI overrides for artifact directory and trajectory thresholds
- persist seen-trajectory state across restarts so cron jobs don't retrain on the same data
- persist seen trajectories to `<base-dir>/seen-trajectories.json` by default when run through the CLI `main()`
- run in dry-run mode to evaluate a challenger without promoting it
- run in baseline-version mode to compare a challenger against a specific saved version instead of the currently active router
- index successful task cases by normalized input for episodic retrieval (`CaseIndex`)
- build/save/load a case index from `MemabraApp`
- inject episodic memory candidates during runner retrieval when a similar past task exists
- use `--case-index` and `--rebuild-case-index` CLI flags to manage episodic retrieval
- automatically refresh the case index after training/evaluation in online-learning cycles when a case-index path is configured
- enrich episodic memory candidates with human-readable summaries when the past trajectory is available via `persistence_store`
- print a quick read-only snapshot of the active router, versions, trajectories, and reports via the CLI `--status` flag
- manage router versions safely from the CLI via the `--rollback` and `--list-versions` flags, without touching code
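
The seen-trajectory persistence mentioned in the list above can be sketched as follows. The class name `SeenTrajectoryTracker` and the on-disk format (a JSON list of trajectory IDs) are assumptions for illustration; only the file name `seen-trajectories.json` comes from this document.

```python
import json
import tempfile
from pathlib import Path


class SeenTrajectoryTracker:
    """Remember which trajectory IDs were already used for training."""

    def __init__(self, path: Path):
        self.path = path
        self.seen = set()
        if path.exists():
            # Assumed format: a JSON array of trajectory ID strings.
            self.seen = set(json.loads(path.read_text(encoding="utf-8")))

    def new_ids(self, all_ids):
        return [tid for tid in all_ids if tid not in self.seen]

    def mark_seen(self, ids) -> None:
        self.seen.update(ids)
        self.path.write_text(json.dumps(sorted(self.seen)), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "seen-trajectories.json"
    SeenTrajectoryTracker(state_file).mark_seen(["traj-a"])
    # A fresh instance, as after a restart or a new cron run, skips "traj-a".
    print(SeenTrajectoryTracker(state_file).new_ids(["traj-a", "traj-b"]))
```

Because the state lives in a file rather than in process memory, a cron job that starts a new process each time still only retrains on trajectories it has not seen.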
## Next sensible frontier
1. tighter integration with real Hermes trajectories
2. multi-turn conversation state and working-memory updates
3. richer real-world tool ecosystem integration (MCP, web, git, files)
4. stronger storage/index backend beyond plain JSON files
## One-line summary
memabra is now a runnable, test-covered safe self-improving alpha for agent memory/action routing, with online learning, benchmark-gated promotion, and auditable reports.

docs/PROTOTYPE_LAYOUT.md
@@ -0,0 +1,90 @@
# Prototype Layout
## Goal
Establish a minimal runnable prototype directory structure for memabra, so that the rule-based router, replay harness, sample trajectories, and training-sample generation that follow all have a clear home.
## Directory structure
```text
src/memabra/
├── __init__.py
├── candidate_types.py   # unified candidate objects and decision types
├── router.py            # rule-based router baseline
├── telemetry.py         # runtime structures for events, rewards, and trajectories
├── reward.py            # reward aggregation logic
├── retrieval.py         # later: candidate retrieval interface
├── memory_store.py      # later: long-term memory storage and access
├── replay.py            # later: trajectory replay and evaluation
└── schemas.py           # later: schema loading and validation
tests/memabra/
└── test_router_smoke.py # baseline smoke test
```
## Implemented so far
Created:
- `src/memabra/__init__.py`
- `src/memabra/candidate_types.py`
- `src/memabra/router.py`
- `src/memabra/telemetry.py`
- `src/memabra/reward.py`
- `tests/memabra/test_router_smoke.py`
## Module boundaries
### candidate_types.py
Responsible for:
- `CandidateObject`
- `DecisionType`
- type-specific adapters for memory/skill/tool, which can be added later
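
A rough sketch of what these two types might look like. The field names mirror the candidate objects in the persisted trajectory JSON in this commit and the action names come from the router baseline document; the defaults, the enum member names, and the use of a plain dataclass are assumptions, not the actual contents of `candidate_types.py`.

```python
from dataclasses import dataclass, field
from enum import Enum


class DecisionType(str, Enum):
    DIRECT_ANSWER = "direct_answer"
    INJECT_MEMORY = "inject_memory"
    LOAD_SKILL = "load_skill"
    CALL_TOOL = "call_tool"
    CLARIFY = "clarify"
    COMPOSITE_ACTION = "composite_action"


@dataclass
class CandidateObject:
    id: str
    type: str  # "memory", "skill", or "tool"
    title: str
    summary: str
    triggers: list = field(default_factory=list)
    cost: float = 0.0
    confidence: float = 0.0
    success_rate: float = 0.0
    freshness: float = 0.0
    risk: float = 0.0
    tags: list = field(default_factory=list)
    source: str = "system"
    type_payload: dict = field(default_factory=dict)  # type-specific extras


tool = CandidateObject(
    id="tool-terminal",
    type="tool",
    title="terminal",
    summary="Run terminal-style inspection commands.",
    triggers=["check", "current", "status", "system"],
    confidence=0.95,
    success_rate=0.9,
    freshness=1.0,
)
print(tool.id, DecisionType.CALL_TOOL.value)
```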
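### router.py
Responsible for:
- `TaskContext`
- `RouteDecision`
- `RuleBasedRouter`
Currently only the baseline heuristics are implemented; planned upgrades:
- feature scorer
- reranker
- learned policy
### telemetry.py
Responsible for:
- atomic event structures
- reward breakdown
- later: trajectory runtime objects
### reward.py
Responsible for:
- reward composition and calculation
- later: weight versioning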
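
A minimal sketch of reward composition, assuming the total is the sum of the positive components minus the penalty components. The component names come from the `reward.components` object in the trajectory JSON in this commit; which components count as penalties is an assumption that happens to reproduce the one sample total (1.04), and the function name `total_reward` is hypothetical.

```python
POSITIVE = ("task_success", "retrieval_hit", "useful_reuse")
# Assumption: these components are subtracted from the total.
PENALTIES = ("tool_error", "user_correction", "latency", "context_cost")


def total_reward(components: dict) -> float:
    gained = sum(components.get(name, 0.0) for name in POSITIVE)
    lost = sum(components.get(name, 0.0) for name in PENALTIES)
    # Round to avoid floating-point noise in reports.
    return round(gained - lost, 3)


sample = {
    "task_success": 0.8,
    "retrieval_hit": 0.2,
    "tool_error": 0.0,
    "user_correction": 0.0,
    "latency": 0.0,
    "context_cost": 0.06,
    "useful_reuse": 0.1,
}
print(total_reward(sample))
```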
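## Design principles
1. Get a runnable baseline first, then abstract more complex interfaces
2. Keep data structures simple at first, but keep field names consistent with the Phase 0 schema
3. Ensure replayability first, then consider performance
4. Do not couple to a database or vector store prematurely
## Next landing points
- `retrieval.py`: define the candidate retrieval interface
- `replay.py`: implement trajectory loading, replay, and metric computation
- `schemas.py`: turn the JSON schemas into a runtime validation entry point
- `sample_data/`: hold example candidates and trajectories
## Suggested verification
From the project root, run:
```bash
source venv/bin/activate
python -m pytest tests/memabra/test_router_smoke.py -q
```
Expected:
- the baseline router smoke test passes
- this shows that the minimal prototype skeleton can be imported and called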

docs/README.md
@@ -0,0 +1,87 @@
# memabra
An intuition-driven control plane for agent memory and action selection.
## Quick start
If you are working from this repository, activate the virtualenv and install the project in editable mode so the dedicated `memabra` command is available:
```bash
source venv/bin/activate
uv pip install -e ".[dev]"
memabra --help
memabra run --base-dir /tmp/memabra-demo --format text --dry-run
```
The dedicated CLI is the fastest way to experience the alpha. It supports subcommands for different operations:
- `memabra run` — run the online-learning loop
- `memabra status` — show system status
- `memabra version list` — list saved router versions
- `memabra version rollback <id>` — roll back to a version
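memabra's goal is not to build a "memory store that merely saves things", but a meta-cognitive controller for local agents:
when facing a task, it should judge quickly, much like human intuition, whether to answer directly, consult memory, load a skill, or call a tool, and keep refining that judgment based on task outcomes.
One-sentence definition:
A local-first, observable, trainable, and replayable agent memory and action orchestration system.
## Why build this
Common problems with traditional agents:
- Context keeps bloating; everything gets stuffed into the prompt
- Memory, skills, and tools are three disconnected systems
- After a success or failure, it is hard to tell which step actually made the difference
- When you want the agent to "learn", there is no attributable trajectory data
The essential problem memabra sets out to solve is:
when to rely on what.
## Core idea
Do not start with one unified end-to-end neural-network training run.
First establish a four-layer structure:
1. Retrieval layer: recall candidate memory / skill / tool items
2. Routing layer: decide what to invoke, and in what order
3. Execution layer: actually inject memory, load skills, and call tools
4. Evaluation layer: record results, assign credit, and produce training samples
If these four layers are not individually visible, training end to end will most likely learn the wrong habit of "calling fewer tools and letting the model guess".
## Project outputs
For now this directory consists mainly of plans and design documents:
- `ARCHITECTURE.md`: system architecture
- `ROADMAP.md`: phased roadmap
- `DECISIONS.md`: key design decisions
- `PROGRESS.md`: current progress and next steps
- `schemas/`: unified Phase 0 schemas
- `reward_spec.md`: reward design draft
To be added later:
- `experiments/`: training and evaluation experiments
- `src/`: prototype code
- `tests/`: verification and regression tests
## Target capabilities
The eventual goals are to:
- manage three kinds of long-term information (facts / procedures / episodes) in a unified way
- build a unified candidate retrieval mechanism for memory, skill, and tool
- let an "intuition policy" make fast action selections
- infer policy quality from task outcomes
- transition gradually from a rule system to a learnable policy
- evolve sustainably in a local environment
## Current status
The project has been initialized and has entered the Phase 0 foundation-definition stage:
- direction clarified
- layered approach established
- naming completed
- project directory created
- first versions of the architecture, roadmap, decisions, and progress documents written
- schema and reward specifications being prepared
Suggested next step: go straight into Phase 0
and define the unified object model, the trajectory log structure, and the reward decomposition scheme.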

@@ -0,0 +1,60 @@
# Replay and Retrieval
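## Goal
Connect memabra's minimal closed loop:
- retrieval recalls the memory / skill / tool candidates
- replay reads trajectories and aggregates behavioral results
Once the two are connected, the system is no longer just static documents plus a standalone router; it has:
- candidate input
- decision output
- trajectory replay
- basic statistics
## Current implementation
### retrieval.py
Provides:
- `CandidateProvider` protocol
- `InMemoryCandidateProvider`
- `CandidateRetriever`
- `RetrievalResult`
Current strategy:
- simple lexical matching of triggers/tags against the task text
- baseline ranking that combines confidence / success_rate / freshness / cost / risk
- per-type aggregation and deduplication of the output from different providers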
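
The lexical-matching strategy above can be sketched as follows. This is an illustrative approximation, not the code in `retrieval.py`: the function names, the tokenization, and the way overlap and quality are combined into a sort key are all assumptions.

```python
import re


def lexical_overlap(task_text: str, candidate: dict) -> int:
    """Count task words that appear among the candidate's triggers or tags."""
    words = set(re.findall(r"\w+", task_text.lower()))
    terms = {t.lower() for t in candidate.get("triggers", []) + candidate.get("tags", [])}
    return len(words & terms)


def baseline_rank(task_text: str, candidates: list) -> list:
    """Keep candidates with at least one lexical hit, best match first."""

    def sort_key(c):
        quality = (c["confidence"] + c["success_rate"] + c["freshness"]
                   - c["cost"] - c["risk"])
        # Assumed ordering: more keyword overlap first, then higher quality.
        return (lexical_overlap(task_text, c), quality)

    matched = [c for c in candidates if lexical_overlap(task_text, c) > 0]
    return sorted(matched, key=sort_key, reverse=True)


candidates = [
    {"id": "mem-telegram-pref", "triggers": ["telegram", "preference", "answer"],
     "tags": ["output"], "confidence": 0.95, "success_rate": 0.9,
     "freshness": 0.9, "cost": 0.0, "risk": 0.0},
    {"id": "tool-terminal", "triggers": ["check", "current", "status", "system"],
     "tags": ["inspection"], "confidence": 0.95, "success_rate": 0.9,
     "freshness": 1.0, "cost": 0.0, "risk": 0.0},
]
print([c["id"] for c in baseline_rank("Check the current system status.", candidates)])
```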
### replay.py
Provides:
- `TrajectoryReplay`
- `ReplaySummary`
Current capabilities:
- load a single trajectory JSON
- load multiple trajectories from a directory
- aggregate outcome counts
- aggregate reward, latency, steps, and user corrections
- count occurrences of each decision_type
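
A small sketch of the kind of aggregation described above, operating on already-loaded trajectory dicts. The key names (`outcome.status`, `reward.total`, `decisions[].decision_type`, `outcome.latency_ms`) follow the trajectory JSON in this commit; the function name `summarize` and the shape of the returned summary are assumptions rather than the real `ReplaySummary`.

```python
from collections import Counter


def summarize(trajectories: list) -> dict:
    """Aggregate outcome counts, averages, and decision-type counts."""
    count = len(trajectories)
    outcomes = Counter(t["outcome"]["status"] for t in trajectories)
    decisions = Counter(
        d["decision_type"] for t in trajectories for d in t["decisions"]
    )
    total_reward = sum(t["reward"]["total"] for t in trajectories)
    total_latency = sum(t["outcome"]["latency_ms"] for t in trajectories)
    return {
        "trajectories": count,
        "outcome_counts": dict(outcomes),
        "decision_counts": dict(decisions),
        "avg_reward": round(total_reward / count, 3) if count else 0.0,
        "avg_latency_ms": round(total_latency / count, 3) if count else 0.0,
    }


demo = [
    {"outcome": {"status": "success", "latency_ms": 0},
     "reward": {"total": 1.04},
     "decisions": [{"decision_type": "load_skill"}]},
    {"outcome": {"status": "success", "latency_ms": 42},
     "reward": {"total": 0.44},
     "decisions": [{"decision_type": "call_tool"}]},
]
print(summarize(demo))
```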
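## Why this step matters
Without retrieval, the router can only go through the motions on an empty candidate set.
Without replay, rewards and trajectories are just JSON specimens lying on disk.
After this step, memabra has its first minimal closed loop:
task -> candidates -> decision -> trajectory -> replay statistics
## Current limitations
- retrieval is still lexical matching, not embeddings or learned ranking
- replay only aggregates; it does no schema validation or counterfactual comparison
- the router and retriever are not yet wired into an end-to-end runner
## Next steps
1. Add runtime validation in `schemas.py`
2. Add `memory_store.py` and the provider interface
3. Use `runner.py` to wire retrieval + router + telemetry together
4. Add baseline comparison and reward-breakdown analysis to replay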

docs/ROADMAP.md
@@ -0,0 +1,136 @@
# Roadmap
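## Overall goal
Build a memory-management and meta-cognitive control system for local agents, so that an agent can make learnable action selections among memory, skill, and tool, and gradually optimize its policy through task feedback.
## Phase 0: Foundations
Goal: define the "objects" and "trajectories" clearly first.
Deliverables:
- unified candidate object schema
- type boundary definitions for memory / skill / tool
- event log schema
- trajectory schema
- reward decomposition draft
- evaluation metrics draft
- prototype directory layout draft
- baseline router design document
- example trajectories
Success criteria:
- for any task, a complete record exists of what was seen, what was chosen, and how it turned out
- the documents are clear enough that later implementation does not rely on guesswork
- a first batch of success / failure trajectory samples is available for replay
Status: done
## Phase 1: Observable MVP
Goal: build a version that does not learn but runs and records end to end.
Deliverables:
- candidate retrieval module
- unified candidate interface for memory/skill/tool
- rule-based or heuristic router
- execution adapter layer
- trajectory logs persisted to disk
- basic visualization / replay capability
Success criteria:
- given a task, the system can select an action
- every action can be reviewed after the fact
- simple metrics can be computed: hit rate, tool-call rate, task completion rate
Status: done
## Phase 2: Learned Router
Goal: make the "intuition" trainable.
Deliverables:
- candidate feature engineering
- training-sample construction pipeline
- lightweight classifier / reranker / bandit
- offline evaluation baseline
- A/B comparison of routing policies
Success criteria:
- the learned router outperforms the rule-based router in offline replay
- clearly ineffective calls are reduced
- high-value memory / skill / tool scenarios can be identified
Status: done (SimpleLearningRouter, DatasetBuilder, Evaluator, A/B comparison, RouterVersionStore)
## Phase 3: Rewarded Adaptation
Goal: use task outcomes to keep updating the policy.
Deliverables:
- reward aggregator
- user-correction signal integration
- online / batch update mechanism
- safe exploration strategy
- memory confidence update mechanism
- benchmark-gated promotion policy
- training run reports
- active router metadata tracking
Success criteria:
- the policy improves over consecutive tasks
- a small amount of bad feedback does not make it collapse quickly
- wrong memories can be identified and down-weighted
- promotion must pass benchmark validation
Status: done (online learning coordinator, promotion policy, training reports, version metadata, benchmark-gated promotion, active router tracking, and app/CLI entrypoints are implemented)
## Phase 4: Episodic Learning
Goal: turn past task trajectories into genuinely useful episodic memory.
Deliverables:
- task case index (done)
- episode retrieval (done, via CaseIndex and runner injection)
- similar-task reuse (done, runner injects an episodic candidate)
- trajectory summarization (done, `TrajectorySummarizer` generates human-readable summaries)
Success criteria:
- for repetitive tasks, the system can reuse historically successful paths
- episodes do not pollute fact memory or the skill library
Status: in progress (core functionality complete)
## Phase 5: End-to-End Experiments
Goal: verify whether it is worth internalizing routing further into neural model weights.
Deliverables:
- training dataset definition
- SFT / preference / RL experiment plans
- comparative evaluation against the layered system
- risk analysis: forgetting, overfitting, behavioral drift
Success criteria:
- outperforms the layered baseline on at least one well-defined set of tasks
- does not significantly reduce interpretability or stability
Status: not started
## Baselines every phase must hold
- must be replayable
- must be attributable
- must keep memory, skill, and tool distinct
- must include failure samples, not only successes
- must be able to revoke wrong memories and wrong policies
## Current priorities
1. real adapters
2. richer reward/outcome updates
3. persistence-backed replay
4. router scoring v2
5. only then revisit the learned router
Without these five steps firmly in place, any later training is a castle in the air.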

docs/ROUTER_BASELINE.md
@@ -0,0 +1,213 @@
# Rule-Based Router Baseline
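## Goal
Define the first routing policy memabra uses in Phase 1. This version does not learn; it selects actions purely from explicit rules and candidate attributes.
Its value lies not in being clever, but in being:
- observable
- explainable
- replayable
- usable as the baseline for a learned router
## Action space
Actions the router currently allows:
1. `direct_answer`
2. `inject_memory`
3. `load_skill`
4. `call_tool`
5. `clarify`
6. `composite_action`
### direct_answer
Applies when:
- the task is pure analysis, naming, structural design, or explanation
- it does not depend on real-time state
- there is no clear need to call external resources
### inject_memory
Applies to:
- user preferences
- project conventions
- environment facts
- stable, previously known facts
### load_skill
Applies when:
- the task resembles a reusable procedure
- a known workflow exists
- reuse has paid off on similar past tasks
### call_tool
Applies when:
- current state must be fetched
- real-time information such as files, the system, web pages, processes, or the time must be accessed
- an action must be executed rather than pure reasoning
### clarify
Applies when:
- risk is high and candidate signals are weak
- missing information would significantly change the action choice
- all candidates have low confidence
### composite_action
Applies to sequences such as:
- memory first, then tool
- skill first, then tool
- memory first, then skill
The current baseline focuses on single actions; composite actions are kept as a reserved action type for now.
## Candidate scoring approach
Every candidate object has these common fields:
- `confidence`
- `success_rate`
- `cost`
- `freshness`
- `risk`
The baseline does no complex learning; it uses a simple linear, intuition-based score.
### memory score
```text
memory_score = confidence + freshness + success_rate - cost - risk
```
### skill score
```text
skill_score = confidence + success_rate - cost - risk
```
### tool score
```text
tool_score = confidence + success_rate - cost - risk
```
Notes:
- memory weighs freshness more
- tool weighs risk more
- skill weighs success_rate more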
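
The three formulas above translate directly into code. The sketch below assumes candidates are plain dicts with the common fields; the function name `baseline_score` is hypothetical. Note that, as written, the skill and tool formulas are identical, so the different emphasis described in the notes is not expressed by the formulas themselves.

```python
def baseline_score(candidate: dict) -> float:
    """Linear baseline score, dispatched on the candidate type."""
    score = (candidate["confidence"] + candidate["success_rate"]
             - candidate["cost"] - candidate["risk"])
    if candidate["type"] == "memory":
        # Only the memory formula includes a freshness term.
        score += candidate["freshness"]
    return score


memory = {"type": "memory", "confidence": 0.95, "success_rate": 0.9,
          "freshness": 0.9, "cost": 0.0, "risk": 0.0}
tool = {"type": "tool", "confidence": 0.95, "success_rate": 0.9,
        "freshness": 1.0, "cost": 0.0, "risk": 0.0}
print(round(baseline_score(memory), 3), round(baseline_score(tool), 3))
```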
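## First-version rules
### Rule 1: reasoning-first tasks prefer direct_answer
If the user input clearly contains signals such as:
- why
- think
- design
- name
and there is no strong tool trigger word, prefer `direct_answer`.
### Rule 2: prefer tool when real-time state is needed
If the input contains:
- check
- run
- open
- current
- list
- time
then prefer a high-confidence `tool` candidate.
Additional thresholds:
- `confidence >= 0.6`
- `risk <= 0.7`
### Rule 3: stable user/project facts prefer memory
If the input contains:
- prefer
- remember
- usually
- my
- our
then prefer a high-confidence, reasonably fresh `memory` candidate.
Additional thresholds:
- `confidence >= 0.65`
- `freshness >= 0.3`
### Rule 4: reusable workflows prefer skill
If the input contains:
- fix
- deploy
- review
- setup
- workflow
then prefer a `skill` candidate with a high success_rate.
Additional thresholds:
- `confidence >= 0.55`
- `success_rate >= 0.4`
### Rule 5: clarify when unsure
If no candidate of any type meets its thresholds, return `clarify`.
This rule is ugly but necessary.
Better to ask one question than to fire off a pile of tools blindly and blow the roof off.
## Conflict resolution order
When multiple actions are triggered at once, the baseline uses this priority:
```text
tool > memory > skill > direct_answer > clarify
```
Rationale:
- real-time information needs are usually the hardest requirement
- factual constraints come next
- skills act more as enhancers
- a plain answer applies when there is clearly no external need
Later versions may switch to:
- task intent classification first
- then per-type ranking
- and finally global arbitration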
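
The five rules and the conflict-resolution order of this baseline can be sketched as a single function. The keyword sets and thresholds are taken from the rules above; everything else (the function name `route`, dict-shaped candidates, picking the highest-confidence candidate that passes) is an assumption and not the actual `RuleBasedRouter`.

```python
import re

TOOL_WORDS = {"check", "run", "open", "current", "list", "time"}
MEMORY_WORDS = {"prefer", "remember", "usually", "my", "our"}
SKILL_WORDS = {"fix", "deploy", "review", "setup", "workflow"}
REASONING_WORDS = {"why", "think", "design", "name"}


def route(task_text: str, candidates: list) -> tuple:
    """Return (decision_type, selected_id) using tool > memory > skill order."""
    words = set(re.findall(r"[a-z]+", task_text.lower()))

    def best(kind, passes):
        pool = [c for c in candidates if c["type"] == kind and passes(c)]
        return max(pool, key=lambda c: c["confidence"], default=None)

    if words & TOOL_WORDS:  # Rule 2
        tool = best("tool", lambda c: c["confidence"] >= 0.6 and c["risk"] <= 0.7)
        if tool:
            return ("call_tool", tool["id"])
    if words & MEMORY_WORDS:  # Rule 3
        mem = best("memory", lambda c: c["confidence"] >= 0.65 and c["freshness"] >= 0.3)
        if mem:
            return ("inject_memory", mem["id"])
    if words & SKILL_WORDS:  # Rule 4
        skill = best("skill", lambda c: c["confidence"] >= 0.55 and c["success_rate"] >= 0.4)
        if skill:
            return ("load_skill", skill["id"])
    if words & REASONING_WORDS and not words & TOOL_WORDS:  # Rule 1
        return ("direct_answer", None)
    return ("clarify", None)  # Rule 5


tools = [{"id": "tool-terminal", "type": "tool", "confidence": 0.95, "risk": 0.0}]
print(route("Check the current system status.", tools))
```

Checking the rules in priority order means the first rule that both triggers and has a candidate above threshold wins, which is one simple way to realize `tool > memory > skill > direct_answer > clarify`.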
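## Known limitations
1. Keyword triggers are too brittle
2. Long-range context is ignored
3. True composite-action planning is not supported
4. No counterfactual comparison of choices
5. Easily misled by surface vocabulary
## What the baseline is really for
Not to pursue high intelligence, but to provide:
- a first runnable system
- a first batch of recordable trajectories
- a first batch of failure samples
- a comparison target for the learned router
## Next steps
There are three ways to grow from this baseline:
1. introduce explicit feature engineering
2. introduce a candidate reranker
3. introduce bandit / lightweight policy learning
Until then, do not rush to paper over heuristics into "pseudo-intelligence". Build replay and metrics first.
---
## Implementation progress: FeatureScoringRouter (v2)
`FeatureScoringRouter` is implemented in `src/memabra/router.py` as an upgrade over `RuleBasedRouter`:
- Explicit feature scoring: memory / skill / tool each use a different weighted combination of `confidence`, `success_rate`, `freshness`, `cost`, and `risk`
- Failure penalty: when a candidate's `id` appears in `TaskContext.recent_failures`, 0.5 points are deducted automatically
- Composite-action preconditions: `CandidateObject` gains a `preconditions` field that can declare prerequisite types such as `["memory"]`
- Composite-action execution: `ExecutionEngine` supports the `composite_action` decision type and executes sub-steps recursively in `composite_steps` order
- Score transparency: `RouteDecision.score_breakdown` records each candidate's final score for traceability and evaluation
`FeatureScoringRouter` stays explainable while providing structured feature output for later learned policies.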
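
A sketch of weighted feature scoring with the failure penalty described above. Only the 0.5 penalty and the feature names come from this document; the per-type weight values in `WEIGHTS` and the function names are invented for illustration and should not be read as the weights used by the real `FeatureScoringRouter`.

```python
FAILURE_PENALTY = 0.5

# Hypothetical per-type weights; the real values are not given in this document.
WEIGHTS = {
    "memory": {"confidence": 1.0, "success_rate": 0.5, "freshness": 1.0,
               "cost": -0.5, "risk": -0.5},
    "skill": {"confidence": 1.0, "success_rate": 1.0, "freshness": 0.2,
              "cost": -0.5, "risk": -0.5},
    "tool": {"confidence": 1.0, "success_rate": 0.5, "freshness": 0.2,
             "cost": -0.5, "risk": -1.0},
}


def feature_score(candidate: dict, recent_failures) -> float:
    weights = WEIGHTS[candidate["type"]]
    score = sum(weight * candidate[name] for name, weight in weights.items())
    if candidate["id"] in recent_failures:
        # Candidates that failed recently are pushed down, not excluded.
        score -= FAILURE_PENALTY
    return score


def score_breakdown(candidates: list, recent_failures) -> dict:
    """Map each candidate id to its final score, for auditing decisions."""
    return {c["id"]: round(feature_score(c, recent_failures), 3) for c in candidates}


tool = {"id": "tool-terminal", "type": "tool", "confidence": 0.95,
        "success_rate": 0.9, "freshness": 1.0, "cost": 0.0, "risk": 0.0}
print(score_breakdown([tool], recent_failures=["tool-terminal"]))
```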

docs/RUNNER_AND_STORE.md
@@ -0,0 +1,83 @@
# Runner, Schemas, and Memory Store
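## Goal
Move memabra from "can retrieve, route, and replay separately" to "can produce valid draft trajectories, validate data, and manage typed memory records".
## Current implementation
### runner.py
Provides:
- `MemabraRunner`
Capabilities:
- accepts a `TaskContext`
- calls the retriever to fetch candidates
- calls the router to produce an action decision
- automatically generates a draft trajectory
- emits a minimal event stream:
  - `task_received`
  - `candidates_recalled`
  - `action_selected`
Significance:
This gives memabra its first real task-to-trajectory entry point.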
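
The minimal event stream above can be sketched like this. The event fields, stage names, and the parent-event chaining follow the trajectory JSON files in this commit; the helper names `make_event` and `draft_trajectory` are hypothetical and the real `MemabraRunner` records more (candidate sets, outcome, reward).

```python
import uuid
from datetime import datetime, timezone


def make_event(trajectory_id, stage, event_type, payload, parent_event_id=None):
    return {
        "event_id": f"evt-{uuid.uuid4()}",
        "trajectory_id": trajectory_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "event_type": event_type,
        "payload": payload,
        "metrics": {},
        "parent_event_id": parent_event_id,
    }


def draft_trajectory(task_input: str, candidate_ids: dict, decision: dict) -> dict:
    """Build a draft trajectory with the three chained events."""
    trajectory_id = f"traj-{uuid.uuid4()}"
    received = make_event(trajectory_id, "retrieval", "task_received",
                          {"input": task_input})
    recalled = make_event(trajectory_id, "retrieval", "candidates_recalled",
                          candidate_ids, parent_event_id=received["event_id"])
    selected = make_event(trajectory_id, "policy", "action_selected",
                          decision, parent_event_id=recalled["event_id"])
    return {"trajectory_id": trajectory_id,
            "decisions": [decision],
            "events": [received, recalled, selected]}


draft = draft_trajectory(
    "Check the current system status.",
    {"memory_ids": [], "skill_ids": [], "tool_ids": ["tool-terminal"]},
    {"step": 1, "decision_type": "call_tool", "selected_ids": ["tool-terminal"]},
)
print([event["event_type"] for event in draft["events"]])
```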
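### schemas.py
Provides:
- `SchemaRegistry`
- `SchemaValidationError`
Current strategy:
- start with lightweight runtime validation
- no external library dependencies
- validate the key required keys first
This is not yet a full JSON Schema engine, but it is enough to hold the floor and keep sample structures from drifting.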
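
A dependency-free required-keys check in the spirit described above. This is a standalone sketch: it defines its own `SchemaValidationError` for illustration, and the `REQUIRED_KEYS` table and `validate` function are assumptions (the trajectory keys are taken from the top-level keys of the trajectory JSON in this commit), not the real `SchemaRegistry`.

```python
class SchemaValidationError(ValueError):
    """Raised when a record is missing required keys."""


# Assumed required keys, based on the persisted trajectory files.
REQUIRED_KEYS = {
    "trajectory": ("trajectory_id", "task", "context_snapshot", "candidate_sets",
                   "decisions", "events", "outcome", "reward"),
    "event": ("event_id", "trajectory_id", "timestamp", "stage", "event_type"),
}


def validate(kind: str, record: dict) -> None:
    if kind not in REQUIRED_KEYS:
        raise SchemaValidationError(f"unknown schema: {kind}")
    missing = [key for key in REQUIRED_KEYS[kind] if key not in record]
    if missing:
        raise SchemaValidationError(f"{kind} is missing keys: {', '.join(missing)}")


try:
    validate("trajectory", {"trajectory_id": "traj-demo"})
except SchemaValidationError as exc:
    print(exc)
```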
### memory_store.py
Provides:
- `MemoryRecord`
- `MemorySource`
- `VerificationState`
- `InMemoryMemoryStore`
Current capabilities:
- upsert
- get
- list_by_type
- mark_used
- verify
- revoke
Significance:
memabra now does more than "talk about memory": it has a typed memory record runtime.
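
The six operations listed above can be sketched with a dict-backed store. The operation names come from this document; the record fields, the `VerificationState` values, and the class name `TinyMemoryStore` are assumptions and may not match `InMemoryMemoryStore`.

```python
from dataclasses import dataclass
from enum import Enum


class VerificationState(str, Enum):
    UNVERIFIED = "unverified"
    VERIFIED = "verified"
    REVOKED = "revoked"


@dataclass
class MemoryRecord:
    id: str
    memory_type: str  # e.g. "fact", "procedure", "episode"
    content: str
    state: VerificationState = VerificationState.UNVERIFIED
    use_count: int = 0


class TinyMemoryStore:
    def __init__(self):
        self._records = {}

    def upsert(self, record: MemoryRecord) -> None:
        self._records[record.id] = record

    def get(self, record_id: str):
        return self._records.get(record_id)

    def list_by_type(self, memory_type: str, include_revoked: bool = False):
        return [r for r in self._records.values()
                if r.memory_type == memory_type
                and (include_revoked or r.state is not VerificationState.REVOKED)]

    def mark_used(self, record_id: str) -> None:
        self._records[record_id].use_count += 1

    def verify(self, record_id: str) -> None:
        self._records[record_id].state = VerificationState.VERIFIED

    def revoke(self, record_id: str) -> None:
        # Revoked records stay stored for auditing but are hidden by default.
        self._records[record_id].state = VerificationState.REVOKED


store = TinyMemoryStore()
store.upsert(MemoryRecord("mem-telegram-pref", "fact", "Prefer plain text on Telegram."))
store.verify("mem-telegram-pref")
store.mark_used("mem-telegram-pref")
print(store.get("mem-telegram-pref"))
```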
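## Current closed loop
Now in place:
- retrieval
- router
- runner
- replay
- memory store
- schema validation
That is:
task -> candidate retrieval -> routing decision -> draft trajectory -> replay statistics
In addition, memory records themselves can be validated and have their state changed.
## What is still missing
- execution adapter (real tool/skill/memory injection)
- full JSON Schema validation
- trajectory persistence layer
- richer reward aggregation
- counterfactual replay
## Suggested next steps
1. Add `execution.py`
2. Add `persistence.py`
3. Connect the runner to the memory store and telemetry writeback
4. Build a richer router scoring v2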

@@ -0,0 +1,13 @@
{
"current_version_id": "20260414-165018",
"promotion_source": null,
"benchmark_summary": {
"reward_delta": -0.446,
"error_rate_delta": 0.0,
"latency_delta_ms": -21.0,
"baseline_avg_reward": 0.886,
"challenger_avg_reward": 0.44
},
"prior_version_id": "20260414-155224",
"saved_at": "2026-04-14T16:50:18.865976+00:00"
}

@@ -0,0 +1,50 @@
{
"version_id": "20260414-143742",
"weights": {
"inject_memory": {
"input_length": 43.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
},
"load_skill": {
"input_length": 44.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
},
"call_tool": {
"input_length": 32.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.9000000000000001,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"avg_reward": 1.04,
"task_count": 3,
"source": "wrapup_workflow"
}
}

@@ -0,0 +1,50 @@
{
"version_id": "20260414-152738",
"weights": {
"load_skill": {
"input_length": 42.15803814713897,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9499999999999997,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.9499999999999997,
"top_tool_risk": 0.0
},
"call_tool": {
"input_length": 32.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.9000000000000001,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
},
"inject_memory": {
"input_length": 42.99999999999999,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.8999999999999999,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"avg_reward": 1.04,
"task_count": 3,
"source": "wrapup_workflow"
}
}

@@ -0,0 +1,55 @@
{
"version_id": "20260414-155224",
"weights": {
"load_skill": {
"input_length": 42.38663484486874,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9499999999999997,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.9499999999999997,
"top_tool_risk": 0.0
},
"call_tool": {
"input_length": 32.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.9000000000000001,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
},
"inject_memory": {
"input_length": 41.75894988066825,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9499999999999997,
"top_skill_success_rate": 0.8999999999999999,
"top_tool_confidence": 0.9499999999999997,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.154,
"error_rate_delta": 0.0,
"latency_delta_ms": -21.0,
"baseline_avg_reward": 0.886,
"challenger_avg_reward": 1.04
}
}
}

@@ -0,0 +1,65 @@
{
"version_id": "20260414-165018",
"weights": {
"load_skill": {
"input_length": 41.594896331738454,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9499999999999998,
"top_skill_success_rate": 0.9000000000000001,
"top_tool_confidence": 0.9499999999999998,
"top_tool_risk": 0.0
},
"call_tool": {
"input_length": 32.85406896551724,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
},
"clarify": {
"input_length": 51.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.95,
"top_skill_success_rate": 0.9,
"top_tool_confidence": 0.95,
"top_tool_risk": 0.0
},
"inject_memory": {
"input_length": 41.45435244161358,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9499999999999996,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9499999999999996,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": -0.446,
"error_rate_delta": 0.0,
"latency_delta_ms": -21.0,
"baseline_avg_reward": 0.886,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,52 @@
{
"report_id": "report-886de309-18d0-4be6-b626-0f7d2edc8b72",
"timestamp": "2026-04-14T15:52:24.610516+00:00",
"source_trajectory_ids": [
"traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"traj-adb05c91-4c0c-493a-af84-517efea3f406",
"traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"traj-dd361c81-40a1-4892-9914-2140870fff95"
],
"sample_count": 21,
"baseline_metrics": {
"task_count": 4,
"avg_reward": 0.886,
"error_rate": 0.0,
"avg_latency_ms": 21.0
},
"challenger_metrics": {
"task_count": 4,
"avg_reward": 1.04,
"error_rate": 0.0,
"avg_latency_ms": 0.0
},
"promotion_decision": {
"accepted": true,
"reasons": [],
"metrics": {
"reward_delta": 0.154,
"error_rate_delta": 0.0,
"latency_delta_ms": -21.0,
"baseline_avg_reward": 0.886,
"challenger_avg_reward": 1.04
}
},
"promoted_version_id": "20260414-155224"
}

@@ -0,0 +1,60 @@
{
"report_id": "report-e7050e1f-fa3c-42e4-9178-e57f69b2dc1d",
"timestamp": "2026-04-14T16:50:18.866221+00:00",
"source_trajectory_ids": [
"traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"traj-217ccafa-716c-4534-813b-a489ed7d6079",
"traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"traj-9edc5088-09cc-42d6-a160-cede5357f535",
"traj-adb05c91-4c0c-493a-af84-517efea3f406",
"traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"traj-dd361c81-40a1-4892-9914-2140870fff95",
"traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"traj-ffb40d01-7956-4d7b-a41c-9618487fe619"
],
"sample_count": 29,
"baseline_metrics": {
"task_count": 4,
"avg_reward": 0.886,
"error_rate": 0.0,
"avg_latency_ms": 21.0
},
"challenger_metrics": {
"task_count": 4,
"avg_reward": 0.44,
"error_rate": 0.0,
"avg_latency_ms": 0.0
},
"promotion_decision": {
"accepted": true,
"reasons": [],
"metrics": {
"reward_delta": -0.446,
"error_rate_delta": 0.0,
"latency_delta_ms": -21.0,
"baseline_avg_reward": 0.886,
"challenger_avg_reward": 0.44
}
},
"promoted_version_id": "20260414-165018"
}

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"task": {
"task_id": "task-5977495f-189b-4a87-8924-4834bded854c",
"input": "Check the current system status.",
"channel": "local",
"created_at": "2026-04-14T14:37:42.381631+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1413.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-501ed3a1-622f-4e8a-b90b-2fb0384d89bd",
"trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"timestamp": "2026-04-14T14:37:42.381702+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-4b6839de-ac61-414f-8939-3ba335a93cfa",
"trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"timestamp": "2026-04-14T14:37:42.381707+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-501ed3a1-622f-4e8a-b90b-2fb0384d89bd"
},
{
"event_id": "evt-1b229a15-af51-4924-932d-4d0318f0ba26",
"trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"timestamp": "2026-04-14T14:37:42.381711+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1413.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-4b6839de-ac61-414f-8939-3ba335a93cfa"
},
{
"event_id": "evt-skill-traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd-skill-deploy",
"trajectory_id": "traj-004e53d5-006c-4e61-91a4-dc51cf7ee9bd",
"timestamp": "2026-04-14T14:37:42.381718+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Check the current system status.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"task": {
"task_id": "task-78a318e6-c8b4-4d05-bfd8-2ebe4b19710f",
"input": "Check the current system status.",
"channel": "local",
"created_at": "2026-04-14T15:27:38.518486+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-be0db4ba-93b9-4cf7-bd76-51c1af70c6d4",
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"timestamp": "2026-04-14T15:27:38.518550+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-fb7734b7-bdab-4e24-8dec-a9debf02529d",
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"timestamp": "2026-04-14T15:27:38.518556+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-be0db4ba-93b9-4cf7-bd76-51c1af70c6d4"
},
{
"event_id": "evt-8ed4e73b-2b45-44a6-9ab6-cc6184202dc0",
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"timestamp": "2026-04-14T15:27:38.518561+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-fb7734b7-bdab-4e24-8dec-a9debf02529d"
},
{
"event_id": "evt-tool-traj-120aec7e-a74d-42d6-8846-c472680cc2f3-tool-terminal",
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"timestamp": "2026-04-14T15:27:38.518572+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-120aec7e-a74d-42d6-8846-c472680cc2f3-tool-terminal",
"trajectory_id": "traj-120aec7e-a74d-42d6-8846-c472680cc2f3",
"timestamp": "2026-04-14T15:27:38.518575+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}
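
The reward blocks in these fixtures appear to follow a simple additive rule: in each trajectory shown here, `total` equals `task_success + retrieval_hit + useful_reuse` minus the four remaining components. This rule is inferred from the recorded numbers only, not from memabra's source, so `recompute_total` below is a hypothetical helper for sanity-checking fixtures, not the project's reward function.

```python
import math

# Assumed split into bonuses and penalties, inferred from the fixture values;
# memabra's actual reward code may weight or combine these differently.
BONUSES = ("task_success", "retrieval_hit", "useful_reuse")
PENALTIES = ("tool_error", "user_correction", "latency", "context_cost")


def recompute_total(components: dict) -> float:
    """Sum the bonus components and subtract the penalty components."""
    gained = sum(components[name] for name in BONUSES)
    lost = sum(components[name] for name in PENALTIES)
    return gained - lost


# Reward block copied from the call_tool trajectory above.
reward = {
    "total": 1.032,
    "components": {
        "task_success": 0.8,
        "retrieval_hit": 0.25,
        "tool_error": 0.0,
        "user_correction": 0.0,
        "latency": 0.008,
        "context_cost": 0.06,
        "useful_reuse": 0.05,
    },
}

# Compare with a tolerance because the values are binary floats.
matches = math.isclose(
    recompute_total(reward["components"]), reward["total"], abs_tol=1e-9
)
print(matches)
```

The tolerance-based comparison avoids false mismatches from floating-point rounding; a fixture whose recomputed total drifts from the stored one would be worth inspecting by hand.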

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"task": {
"task_id": "task-c0d9120f-4b28-4815-bcbc-1ea1cb523129",
"input": "Check the current system status.",
"channel": "telegram",
"created_at": "2026-04-14T15:27:38.512676+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-2e159144-a5dc-4bab-bb15-026b156788a7",
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"timestamp": "2026-04-14T15:27:38.512756+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-84681604-ee59-4618-8b1b-bdc521e58e7d",
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"timestamp": "2026-04-14T15:27:38.512762+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-2e159144-a5dc-4bab-bb15-026b156788a7"
},
{
"event_id": "evt-6404a35f-8775-4fc1-9648-62a27f4a1b23",
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"timestamp": "2026-04-14T15:27:38.512767+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-84681604-ee59-4618-8b1b-bdc521e58e7d"
},
{
"event_id": "evt-tool-traj-179d0c19-3f0f-4429-a85b-3e01802290d3-tool-terminal",
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"timestamp": "2026-04-14T15:27:38.512781+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-179d0c19-3f0f-4429-a85b-3e01802290d3-tool-terminal",
"trajectory_id": "traj-179d0c19-3f0f-4429-a85b-3e01802290d3",
"timestamp": "2026-04-14T15:27:38.512785+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"task": {
"task_id": "task-f3701d8c-4931-4e43-8488-5fc670e5b2b1",
"input": "Deploy this service with the usual workflow.",
"channel": "local",
"created_at": "2026-04-14T14:37:42.380802+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-480e859f-7e5f-42f0-bfcc-f3cb954f75d5",
"trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"timestamp": "2026-04-14T14:37:42.380861+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-398d16c2-3d12-44a7-8af2-aa306e20195c",
"trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"timestamp": "2026-04-14T14:37:42.380867+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-480e859f-7e5f-42f0-bfcc-f3cb954f75d5"
},
{
"event_id": "evt-b63063ea-1ac7-4b85-a6c7-76a03791bc85",
"trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"timestamp": "2026-04-14T14:37:42.380871+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-398d16c2-3d12-44a7-8af2-aa306e20195c"
},
{
"event_id": "evt-skill-traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303-skill-deploy",
"trajectory_id": "traj-1ac5bb3d-f865-4c8c-8ff4-a9c29472b303",
"timestamp": "2026-04-14T14:37:42.380877+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
"task": {
"task_id": "task-bb730dc5-88ed-4455-9dbb-6cbba55ad0ce",
"input": "Check current system status with a tool.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.864549+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2045.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27",
"trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
"timestamp": "2026-04-14T16:50:18.864653+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4",
"trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
"timestamp": "2026-04-14T16:50:18.864663+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27"
},
{
"event_id": "evt-2fc97f2c-8219-44d3-98c7-5a86ad88326d",
"trajectory_id": "traj-1c2b1a9e-7290-4ea4-be52-c6ba60b72da0",
"timestamp": "2026-04-14T16:50:18.864669+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2045.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}
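
In the trajectories shown here, the `task_received`, `candidates_recalled`, and `action_selected` events are linked through `parent_event_id`, while execution events carry a `null` parent. The sketch below walks that chain from a given event back to its root. The `ancestry` function is a hypothetical illustration and is not part of memabra; the event list is a trimmed copy of the trajectory above that keeps only the fields the walk needs.

```python
def ancestry(events: list[dict], event_id: str) -> list[str]:
    """Return event types from the given event back to its root ancestor."""
    by_id = {event["event_id"]: event for event in events}
    chain = []
    seen = set()  # guards against accidental cycles in malformed fixtures
    current = by_id.get(event_id)
    while current is not None and current["event_id"] not in seen:
        seen.add(current["event_id"])
        chain.append(current["event_type"])
        parent_id = current["parent_event_id"]
        current = by_id.get(parent_id) if parent_id is not None else None
    return chain


# Trimmed events from the "clarify" trajectory above.
events = [
    {
        "event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27",
        "event_type": "task_received",
        "parent_event_id": None,
    },
    {
        "event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4",
        "event_type": "candidates_recalled",
        "parent_event_id": "evt-f491ed7a-0017-463f-a346-2b13aac2ef27",
    },
    {
        "event_id": "evt-2fc97f2c-8219-44d3-98c7-5a86ad88326d",
        "event_type": "action_selected",
        "parent_event_id": "evt-9b88da4b-fe41-4522-ba53-e88adf3df3b4",
    },
]

chain = ancestry(events, "evt-2fc97f2c-8219-44d3-98c7-5a86ad88326d")
print(chain)
```

A walk like this can help when replaying a trajectory, since it recovers the retrieval and policy steps that preceded a decision without relying on list order.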

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"task": {
"task_id": "task-c5221ec3-e5b9-4a2f-9774-fbb75018fe08",
"input": "Check current system status with a tool.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.862393+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-93525bc5-5e71-481c-a7d4-0282ef59e0a3",
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"timestamp": "2026-04-14T16:50:18.862483+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-a01d1dff-a6dc-4c25-a5a5-14efd6f182b2",
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"timestamp": "2026-04-14T16:50:18.862492+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-93525bc5-5e71-481c-a7d4-0282ef59e0a3"
},
{
"event_id": "evt-28946864-c699-42fd-9802-dbfe6cb09043",
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"timestamp": "2026-04-14T16:50:18.862498+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-a01d1dff-a6dc-4c25-a5a5-14efd6f182b2"
},
{
"event_id": "evt-tool-traj-1ea60d6e-0b83-4cdf-a601-159373c780ee-tool-terminal",
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"timestamp": "2026-04-14T16:50:18.862511+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-1ea60d6e-0b83-4cdf-a601-159373c780ee-tool-terminal",
"trajectory_id": "traj-1ea60d6e-0b83-4cdf-a601-159373c780ee",
"timestamp": "2026-04-14T16:50:18.862515+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}

@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
"task": {
"task_id": "task-5f14e5ed-0635-44a0-82e8-419187b040f3",
"input": "Use multiple capabilities: memory, skill, and tool.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.605025+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "No high-confidence route found from the current heuristic baseline.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-13ccd07e-9bfd-4ff8-8080-47c400f0be6f",
"trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
"timestamp": "2026-04-14T15:52:24.605116+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use multiple capabilities: memory, skill, and tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-7ecaa289-b7bb-4ac6-ad62-9afb4a49d4a8",
"trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
"timestamp": "2026-04-14T15:52:24.605126+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-13ccd07e-9bfd-4ff8-8080-47c400f0be6f"
},
{
"event_id": "evt-ad398931-c79d-411a-93f8-8c5834f5446d",
"trajectory_id": "traj-217ccafa-716c-4534-813b-a489ed7d6079",
"timestamp": "2026-04-14T15:52:24.605138+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "No high-confidence route found from the current heuristic baseline.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-7ecaa289-b7bb-4ac6-ad62-9afb4a49d4a8"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}

@@ -0,0 +1,185 @@
{
"trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"task": {
"task_id": "task-aeed227c-2e87-45d8-8d98-e270656556b6",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T06:53:08.731336+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-d71b1fdf-5343-4ac1-89a0-75488c1ce30b",
"trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"timestamp": "2026-04-14T06:53:08.731418+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-1f750475-1127-41e5-9f94-c87e4b019ee2",
"trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"timestamp": "2026-04-14T06:53:08.731427+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-d71b1fdf-5343-4ac1-89a0-75488c1ce30b"
},
{
"event_id": "evt-741967a5-41b9-4917-9b95-4047f89e6e19",
"trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"timestamp": "2026-04-14T06:53:08.731432+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-1f750475-1127-41e5-9f94-c87e4b019ee2"
},
{
"event_id": "evt-memory-traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15-mem-telegram-pref",
"trajectory_id": "traj-3f6687ff-3a55-4a26-a7bc-8397d8da7d15",
"timestamp": "2026-04-14T06:53:08.731437+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.1,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.0,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"task": {
"task_id": "task-cde62e1c-0106-4803-9c7d-a0c2f58206d6",
"input": "Check the current system status.",
"channel": "local",
"created_at": "2026-04-14T14:37:42.380386+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-9252427a-3ceb-476a-b72d-a7e4f812194c",
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"timestamp": "2026-04-14T14:37:42.380442+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-333fbd7f-75b1-495f-acfa-6a66348ef16e",
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"timestamp": "2026-04-14T14:37:42.380447+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-9252427a-3ceb-476a-b72d-a7e4f812194c"
},
{
"event_id": "evt-7f4eddba-f609-4d72-bf7c-cd6a938233a7",
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"timestamp": "2026-04-14T14:37:42.380452+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-333fbd7f-75b1-495f-acfa-6a66348ef16e"
},
{
"event_id": "evt-tool-traj-439e4552-f248-43cb-b4eb-25db14da1ebc-tool-terminal",
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"timestamp": "2026-04-14T14:37:42.380461+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-439e4552-f248-43cb-b4eb-25db14da1ebc-tool-terminal",
"trajectory_id": "traj-439e4552-f248-43cb-b4eb-25db14da1ebc",
"timestamp": "2026-04-14T14:37:42.380464+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"task": {
"task_id": "task-0c82e670-45ab-45f9-af74-c5920f5eb9b3",
"input": "Deploy this service with the usual workflow.",
"channel": "telegram",
"created_at": "2026-04-14T14:37:42.378256+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-757f035e-551f-4b55-a506-2aac41134885",
"trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"timestamp": "2026-04-14T14:37:42.378322+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-dfcdd452-1902-4a6c-97fc-fd6a993c2045",
"trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"timestamp": "2026-04-14T14:37:42.378327+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-757f035e-551f-4b55-a506-2aac41134885"
},
{
"event_id": "evt-c680ed8f-a6b0-48d1-bcd4-7423089aa916",
"trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"timestamp": "2026-04-14T14:37:42.378332+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-dfcdd452-1902-4a6c-97fc-fd6a993c2045"
},
{
"event_id": "evt-skill-traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5-skill-deploy",
"trajectory_id": "traj-58ec7a90-3ada-4b78-bc6a-6351be4eb4b5",
"timestamp": "2026-04-14T14:37:42.378339+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"task": {
"task_id": "task-549e2de3-bb55-4797-a862-e59f8d69a7e5",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T15:27:38.519692+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1854.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-369333af-5ca9-4c11-b163-6144d925ba91",
"trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"timestamp": "2026-04-14T15:27:38.519774+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-51d31531-c49b-4af7-86f8-9fc3b5aff7a0",
"trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"timestamp": "2026-04-14T15:27:38.519780+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-369333af-5ca9-4c11-b163-6144d925ba91"
},
{
"event_id": "evt-3a842acf-5111-4b77-98a2-2a18c5a4a61d",
"trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"timestamp": "2026-04-14T15:27:38.519784+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1854.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-51d31531-c49b-4af7-86f8-9fc3b5aff7a0"
},
{
"event_id": "evt-memory-traj-6a5aaff5-9336-4a1d-b102-80f1196427ae-mem-telegram-pref",
"trajectory_id": "traj-6a5aaff5-9336-4a1d-b102-80f1196427ae",
"timestamp": "2026-04-14T15:27:38.519790+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"task": {
"task_id": "task-23d5816f-12f3-4247-8c4f-9c01d13b1fd8",
"input": "Check the current system status.",
"channel": "telegram",
"created_at": "2026-04-14T14:37:42.377746+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-15616207-b055-41b3-98e7-fca3fdd89ce9",
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"timestamp": "2026-04-14T14:37:42.377821+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-431bb458-0488-4712-93d5-d7a689048022",
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"timestamp": "2026-04-14T14:37:42.377827+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-15616207-b055-41b3-98e7-fca3fdd89ce9"
},
{
"event_id": "evt-8bb2db02-56ae-4fad-a0bc-e30cd7fed98e",
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"timestamp": "2026-04-14T14:37:42.377831+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-431bb458-0488-4712-93d5-d7a689048022"
},
{
"event_id": "evt-tool-traj-707b1dec-1d9a-4a71-a07a-54841155103c-tool-terminal",
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"timestamp": "2026-04-14T14:37:42.377843+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-707b1dec-1d9a-4a71-a07a-54841155103c-tool-terminal",
"trajectory_id": "traj-707b1dec-1d9a-4a71-a07a-54841155103c",
"timestamp": "2026-04-14T14:37:42.377846+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}
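
The `reward` blocks in these trajectory fixtures appear to follow a simple additive rule: `total` matches the sum of `task_success`, `retrieval_hit`, and `useful_reuse` minus `tool_error`, `user_correction`, `latency`, and `context_cost`. The sketch below recomputes the total under that assumption. The sign convention is inferred from the sample values only, not taken from memabra's source, and `recompute_total` is a hypothetical helper name.

```python
# Hypothetical helper: the bonus/penalty split is inferred from the sample
# fixtures, not from memabra's actual reward implementation.
BONUSES = ("task_success", "retrieval_hit", "useful_reuse")
PENALTIES = ("tool_error", "user_correction", "latency", "context_cost")


def recompute_total(components: dict) -> float:
    """Sum the bonus components and subtract the penalty components."""
    gained = sum(components.get(name, 0.0) for name in BONUSES)
    lost = sum(components.get(name, 0.0) for name in PENALTIES)
    # Round to suppress floating-point noise from the additions.
    return round(gained - lost, 6)


# Component values copied from a call_tool trajectory (total 1.032).
tool_run = {
    "task_success": 0.8,
    "retrieval_hit": 0.25,
    "tool_error": 0.0,
    "user_correction": 0.0,
    "latency": 0.008,
    "context_cost": 0.06,
    "useful_reuse": 0.05,
}

# Component values copied from an inject_memory trajectory (total 1.04).
memory_run = {
    "task_success": 0.8,
    "retrieval_hit": 0.2,
    "tool_error": 0.0,
    "user_correction": 0.0,
    "latency": 0.0,
    "context_cost": 0.06,
    "useful_reuse": 0.1,
}

tool_total = recompute_total(tool_run)
memory_total = recompute_total(memory_run)
```

A check like this could serve as a fixture sanity test, flagging any trajectory whose stored `total` drifts from its components.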

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"task": {
"task_id": "task-e0c612c6-d846-4dc0-9c30-4a66d0a78d2a",
"input": "Check current system status with a tool.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.604470+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-7befe34c-6cf6-422b-9615-11fd64b50899",
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"timestamp": "2026-04-14T15:52:24.604556+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-8533f7c9-696d-413d-8484-d434ffccdd02",
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"timestamp": "2026-04-14T15:52:24.604565+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-7befe34c-6cf6-422b-9615-11fd64b50899"
},
{
"event_id": "evt-2f878de3-e77d-42f6-8252-b692a11a69ac",
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"timestamp": "2026-04-14T15:52:24.604571+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-8533f7c9-696d-413d-8484-d434ffccdd02"
},
{
"event_id": "evt-tool-traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1-tool-terminal",
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"timestamp": "2026-04-14T15:52:24.604584+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1-tool-terminal",
"trajectory_id": "traj-74e92442-04fd-4f5a-979f-2dd81a7f08e1",
"timestamp": "2026-04-14T15:52:24.604588+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}

@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
"task": {
"task_id": "task-ad6649f7-dcca-4dd3-9521-3409c5f4e746",
"input": "Recall my saved preference from memory.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.861213+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-61f191c1-68c7-4f0b-ab9b-f22b131e2637",
"trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
"timestamp": "2026-04-14T16:50:18.861293+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-f0fbf671-18c5-4db2-86e5-68950b030992",
"trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
"timestamp": "2026-04-14T16:50:18.861299+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-61f191c1-68c7-4f0b-ab9b-f22b131e2637"
},
{
"event_id": "evt-168a76e7-3c64-4f65-8a74-0969942d6d94",
"trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
"timestamp": "2026-04-14T16:50:18.861304+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-f0fbf671-18c5-4db2-86e5-68950b030992"
},
{
"event_id": "evt-memory-traj-77ab4624-013b-4f56-b600-b3e0cbef7a06-mem-telegram-pref",
"trajectory_id": "traj-77ab4624-013b-4f56-b600-b3e0cbef7a06",
"timestamp": "2026-04-14T16:50:18.861310+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"task": {
"task_id": "task-37fe7921-66da-4390-a9bf-31209ae8a890",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T14:37:42.381229+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1897.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-01cb59f2-27b0-4be7-b0f9-c878634363ba",
"trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"timestamp": "2026-04-14T14:37:42.381299+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-4281fe16-c753-4024-a0ff-e82f518e16dc",
"trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"timestamp": "2026-04-14T14:37:42.381305+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-01cb59f2-27b0-4be7-b0f9-c878634363ba"
},
{
"event_id": "evt-ccc6afd4-82c2-4774-ba2a-732ffa9296a4",
"trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"timestamp": "2026-04-14T14:37:42.381309+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1897.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-4281fe16-c753-4024-a0ff-e82f518e16dc"
},
{
"event_id": "evt-skill-traj-80784ce5-fc14-4fee-9f5f-90dcec26179b-skill-deploy",
"trajectory_id": "traj-80784ce5-fc14-4fee-9f5f-90dcec26179b",
"timestamp": "2026-04-14T14:37:42.381314+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Use my telegram preference for this answer.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"task": {
"task_id": "task-8e991184-4d09-47bd-9a70-2f3d591d875c",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T14:37:42.377206+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-79db8272-c394-40e1-b0d3-c905c305ea26",
"trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"timestamp": "2026-04-14T14:37:42.377281+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-22367f19-007b-49cd-9ac4-30bbcc77e8a2",
"trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"timestamp": "2026-04-14T14:37:42.377287+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-79db8272-c394-40e1-b0d3-c905c305ea26"
},
{
"event_id": "evt-84fe05fe-8ccc-4782-8cd2-28d56a659658",
"trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"timestamp": "2026-04-14T14:37:42.377292+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-22367f19-007b-49cd-9ac4-30bbcc77e8a2"
},
{
"event_id": "evt-memory-traj-819443a2-79ea-48b7-a543-8bb7356dba36-mem-telegram-pref",
"trajectory_id": "traj-819443a2-79ea-48b7-a543-8bb7356dba36",
"timestamp": "2026-04-14T14:37:42.377297+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"task": {
"task_id": "task-57677ff6-710a-478e-9a5d-e1367db05212",
"input": "Deploy this service with the usual workflow.",
"channel": "telegram",
"created_at": "2026-04-14T15:27:38.514525+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-fce4e540-2400-45f8-8050-50f7631422e4",
"trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"timestamp": "2026-04-14T15:27:38.514602+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-eef1d203-79d4-4037-ae0b-6dff74e035f5",
"trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"timestamp": "2026-04-14T15:27:38.514609+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-fce4e540-2400-45f8-8050-50f7631422e4"
},
{
"event_id": "evt-da150fe5-beff-45b0-a67d-9860205a9690",
"trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"timestamp": "2026-04-14T15:27:38.514615+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-eef1d203-79d4-4037-ae0b-6dff74e035f5"
},
{
"event_id": "evt-skill-traj-9144cbc3-1ccf-4660-aad9-8db5797461eb-skill-deploy",
"trajectory_id": "traj-9144cbc3-1ccf-4660-aad9-8db5797461eb",
"timestamp": "2026-04-14T15:27:38.514623+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"task": {
"task_id": "task-9f58c7ff-0bfb-4a46-bfbc-94b72b454f44",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T14:37:42.379938+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-1e3b099e-dc40-45e4-9710-3d7f96dc459c",
"trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"timestamp": "2026-04-14T14:37:42.379999+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-95d17ab2-a0af-44e6-97db-55600c5d0517",
"trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"timestamp": "2026-04-14T14:37:42.380024+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-1e3b099e-dc40-45e4-9710-3d7f96dc459c"
},
{
"event_id": "evt-cef79d76-9bcf-41c7-a430-13e18d46e95f",
"trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"timestamp": "2026-04-14T14:37:42.380029+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-95d17ab2-a0af-44e6-97db-55600c5d0517"
},
{
"event_id": "evt-memory-traj-9190707c-5486-4266-a6c8-32f34c6c63ec-mem-telegram-pref",
"trajectory_id": "traj-9190707c-5486-4266-a6c8-32f34c6c63ec",
"timestamp": "2026-04-14T14:37:42.380034+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"task": {
"task_id": "task-18b8251b-4a68-45e1-93ba-645fe21a279f",
"input": "Run the deploy workflow skill.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.603850+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-33bd0017-cf1b-44ac-892b-c2004bc44c1a",
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"timestamp": "2026-04-14T15:52:24.603951+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-de288e29-228d-46d4-a657-34edae35fea4",
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"timestamp": "2026-04-14T15:52:24.603961+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-33bd0017-cf1b-44ac-892b-c2004bc44c1a"
},
{
"event_id": "evt-256cd272-bcee-48e2-b36b-a4048b6aef3e",
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"timestamp": "2026-04-14T15:52:24.603968+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-de288e29-228d-46d4-a657-34edae35fea4"
},
{
"event_id": "evt-tool-traj-9edc5088-09cc-42d6-a160-cede5357f535-tool-terminal",
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"timestamp": "2026-04-14T15:52:24.603984+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-9edc5088-09cc-42d6-a160-cede5357f535-tool-terminal",
"trajectory_id": "traj-9edc5088-09cc-42d6-a160-cede5357f535",
"timestamp": "2026-04-14T15:52:24.603990+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}


@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
"task": {
"task_id": "task-66d9a459-4bad-40a5-beda-a9cb30f2e790",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T15:27:38.517870+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-57882c3b-f081-4cb6-b622-98594bfd7b82",
"trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
"timestamp": "2026-04-14T15:27:38.517938+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-6eafb4ae-7960-4f17-a928-77834f432cbb",
"trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
"timestamp": "2026-04-14T15:27:38.517945+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-57882c3b-f081-4cb6-b622-98594bfd7b82"
},
{
"event_id": "evt-b90de7a7-83a3-4bed-b63d-bf07ba3fc06a",
"trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
"timestamp": "2026-04-14T15:27:38.517950+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-6eafb4ae-7960-4f17-a928-77834f432cbb"
},
{
"event_id": "evt-memory-traj-adb05c91-4c0c-493a-af84-517efea3f406-mem-telegram-pref",
"trajectory_id": "traj-adb05c91-4c0c-493a-af84-517efea3f406",
"timestamp": "2026-04-14T15:27:38.517955+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,186 @@
{
"trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"task": {
"task_id": "task-c88d23cc-88f6-4352-a506-e37187a0e28a",
"input": "Deploy this service with the usual workflow.",
"channel": "telegram",
"created_at": "2026-04-14T06:53:08.732451+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-56b47bb2-7cd9-4d1a-9364-b2b6c2b82759",
"trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"timestamp": "2026-04-14T06:53:08.732515+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-62bc72e7-4b3f-4a72-a98e-1ad5bf86aaa4",
"trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"timestamp": "2026-04-14T06:53:08.732521+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-56b47bb2-7cd9-4d1a-9364-b2b6c2b82759"
},
{
"event_id": "evt-23968c32-845c-4fb2-86bb-723d70dfec80",
"trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"timestamp": "2026-04-14T06:53:08.732525+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-62bc72e7-4b3f-4a72-a98e-1ad5bf86aaa4"
},
{
"event_id": "evt-skill-traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc-skill-deploy",
"trajectory_id": "traj-affbeb5b-eb52-40fd-94cb-48b7c374f1fc",
"timestamp": "2026-04-14T06:53:08.732531+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.1,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.0,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"task": {
"task_id": "task-920b26df-8e03-47b3-af48-99454d142e90",
"input": "Recall my saved preference from memory.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.603298+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-795ad519-4e78-4fdd-b1a9-3e1e2b2cdea0",
"trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"timestamp": "2026-04-14T15:52:24.603384+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-1fbe3cfc-ed78-40f6-b0d9-25ccd14a0110",
"trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"timestamp": "2026-04-14T15:52:24.603390+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-795ad519-4e78-4fdd-b1a9-3e1e2b2cdea0"
},
{
"event_id": "evt-a57f0922-dbfe-424a-a704-2a382ffa219b",
"trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"timestamp": "2026-04-14T15:52:24.603396+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-1fbe3cfc-ed78-40f6-b0d9-25ccd14a0110"
},
{
"event_id": "evt-memory-traj-b786c15f-388d-4228-9da4-c9e82b61570a-mem-telegram-pref",
"trajectory_id": "traj-b786c15f-388d-4228-9da4-c9e82b61570a",
"timestamp": "2026-04-14T15:52:24.603401+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"task": {
"task_id": "task-35b31642-86af-4e2c-a255-cdbe19659101",
"input": "Deploy this service with the usual workflow.",
"channel": "local",
"created_at": "2026-04-14T14:37:42.382074+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1941.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-8f072b70-4161-46fc-bede-cceb930d4cc2",
"trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"timestamp": "2026-04-14T14:37:42.382140+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-fc15daf4-f738-455e-8b12-39143b3c3d6c",
"trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"timestamp": "2026-04-14T14:37:42.382146+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-8f072b70-4161-46fc-bede-cceb930d4cc2"
},
{
"event_id": "evt-d899c751-6157-4548-893e-b766eeafeb3d",
"trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"timestamp": "2026-04-14T14:37:42.382150+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1941.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-fc15daf4-f738-455e-8b12-39143b3c3d6c"
},
{
"event_id": "evt-skill-traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e-skill-deploy",
"trajectory_id": "traj-bcad8fa2-ffd3-4e5b-9ddb-720f3898826e",
"timestamp": "2026-04-14T14:37:42.382155+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,207 @@
{
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"task": {
"task_id": "task-1a24d0bb-b2e6-44f0-8095-2ed74368dc9d",
"input": "Run the deploy workflow skill.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.861760+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-b4437076-cc94-4903-a2c7-3dd7c644dcc5",
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"timestamp": "2026-04-14T16:50:18.861861+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-dd3dad15-7ace-47f7-9dd0-cf4955aa16ec",
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"timestamp": "2026-04-14T16:50:18.861871+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-b4437076-cc94-4903-a2c7-3dd7c644dcc5"
},
{
"event_id": "evt-c3b04a4a-2506-47db-8d08-c8939c0eba08",
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"timestamp": "2026-04-14T16:50:18.861878+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-dd3dad15-7ace-47f7-9dd0-cf4955aa16ec"
},
{
"event_id": "evt-tool-traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43-tool-terminal",
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"timestamp": "2026-04-14T16:50:18.861901+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43-tool-terminal",
"trajectory_id": "traj-c0faa5d1-dcb4-4e86-ac6b-2abb15026f43",
"timestamp": "2026-04-14T16:50:18.861906+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.032,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.008,
"context_cost": 0.06,
"useful_reuse": 0.05
}
}
}


@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"task": {
"task_id": "task-c1f58e80-f0eb-47e9-92ab-9b1a84351dff",
"input": "Use my telegram preference for this answer.",
"channel": "telegram",
"created_at": "2026-04-14T15:27:38.512116+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-212f6d74-bafd-483b-b8ec-cf4a33bf67da",
"trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"timestamp": "2026-04-14T15:27:38.512204+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-34b409a4-9ba9-4921-b3a6-e4c41bf7660c",
"trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"timestamp": "2026-04-14T15:27:38.512211+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-212f6d74-bafd-483b-b8ec-cf4a33bf67da"
},
{
"event_id": "evt-d117772a-0e77-4068-8ca5-0adacfcee184",
"trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"timestamp": "2026-04-14T15:27:38.512216+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task likely depends on stable user/project facts.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-34b409a4-9ba9-4921-b3a6-e4c41bf7660c"
},
{
"event_id": "evt-memory-traj-c5907bfb-61d2-47f9-a6c5-2300701bb551-mem-telegram-pref",
"trajectory_id": "traj-c5907bfb-61d2-47f9-a6c5-2300701bb551",
"timestamp": "2026-04-14T15:27:38.512223+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Use my telegram preference for this answer."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"task": {
"task_id": "task-c08fbd42-a324-4430-8277-94c666661238",
"input": "Check the current system status.",
"channel": "local",
"created_at": "2026-04-14T15:27:38.520185+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1381.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-2e0920c4-6830-4c86-a4a3-139028e46176",
"trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"timestamp": "2026-04-14T15:27:38.520262+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-f81b2a77-a012-4c62-9700-93f1b31daeb2",
"trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"timestamp": "2026-04-14T15:27:38.520268+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-2e0920c4-6830-4c86-a4a3-139028e46176"
},
{
"event_id": "evt-2b1fe09d-30b3-46c9-a706-373d5c8da08e",
"trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"timestamp": "2026-04-14T15:27:38.520273+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1381.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-f81b2a77-a012-4c62-9700-93f1b31daeb2"
},
{
"event_id": "evt-memory-traj-c9c11bdc-852b-4aef-851c-f2968806e535-mem-telegram-pref",
"trajectory_id": "traj-c9c11bdc-852b-4aef-851c-f2968806e535",
"timestamp": "2026-04-14T15:27:38.520280+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,201 @@
{
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"task": {
"task_id": "task-00ccd7d0-72d9-458f-87fa-be0ee5571e44",
"input": "Check the current system status.",
"channel": "telegram",
"created_at": "2026-04-14T06:53:08.731950+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-63d64eb8-16b1-4dc7-ae03-7c094bc6e64f",
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"timestamp": "2026-04-14T06:53:08.732042+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-04ef718b-6973-465d-920e-bc501a6e02ad",
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"timestamp": "2026-04-14T06:53:08.732049+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-63d64eb8-16b1-4dc7-ae03-7c094bc6e64f"
},
{
"event_id": "evt-50f19e1e-8771-42c1-8846-95b5e4a6f491",
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"timestamp": "2026-04-14T06:53:08.732053+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "call_tool",
"selected_ids": [
"tool-terminal"
],
"rejected_ids": [],
"rationale": "Task asks for current state or external action; tool use is justified.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-04ef718b-6973-465d-920e-bc501a6e02ad"
},
{
"event_id": "evt-tool-traj-d2d3a115-36d8-466f-9d14-bf741316f698-tool-terminal",
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"timestamp": "2026-04-14T06:53:08.732064+00:00",
"stage": "execution",
"event_type": "tool_called",
"payload": {
"tool_id": "tool-terminal",
"input": "Check the current system status."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-tool-result-traj-d2d3a115-36d8-466f-9d14-bf741316f698-tool-terminal",
"trajectory_id": "traj-d2d3a115-36d8-466f-9d14-bf741316f698",
"timestamp": "2026-04-14T06:53:08.732068+00:00",
"stage": "execution",
"event_type": "tool_result",
"payload": {
"tool_id": "tool-terminal",
"status": "success",
"output": "demo-result-for:tool-terminal",
"error": null,
"latency_ms": 42
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 42,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.058,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.25,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.042,
"context_cost": 0.0,
"useful_reuse": 0.05
}
}
}


@@ -0,0 +1,191 @@
{
"trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"task": {
"task_id": "task-9db54b7d-a508-49ac-bd3c-bd5af3eabc61",
"input": "Deploy this service with the usual workflow.",
"channel": "local",
"created_at": "2026-04-14T15:27:38.520867+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1897.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-3e658630-fea8-44c3-afd2-fc936a2eed37",
"trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"timestamp": "2026-04-14T15:27:38.520945+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-03990424-3433-4147-a963-353863758b31",
"trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"timestamp": "2026-04-14T15:27:38.520951+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-3e658630-fea8-44c3-afd2-fc936a2eed37"
},
{
"event_id": "evt-10dfab37-ded7-473e-9de9-2f922c5bf7c8",
"trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"timestamp": "2026-04-14T15:27:38.520956+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "inject_memory",
"selected_ids": [
"mem-telegram-pref"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1897.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-03990424-3433-4147-a963-353863758b31"
},
{
"event_id": "evt-memory-traj-d3575889-7458-44b9-b3f1-f04cd766ca76-mem-telegram-pref",
"trajectory_id": "traj-d3575889-7458-44b9-b3f1-f04cd766ca76",
"timestamp": "2026-04-14T15:27:38.520961+00:00",
"stage": "execution",
"event_type": "memory_injected",
"payload": {
"record_id": "mem-telegram-pref",
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
"task": {
"task_id": "task-9cda8e38-dcdf-4877-bc19-48444df0531e",
"input": "Use multiple capabilities: memory, skill, and tool.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.865109+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2606.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-88a21058-c409-4836-a1b8-ef6cc63ac51e",
"trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
"timestamp": "2026-04-14T16:50:18.865214+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use multiple capabilities: memory, skill, and tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-44d46564-2d71-4bed-8a3f-d3fc96fce9ef",
"trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
"timestamp": "2026-04-14T16:50:18.865225+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-88a21058-c409-4836-a1b8-ef6cc63ac51e"
},
{
"event_id": "evt-e21e8afe-d676-4839-b9d0-fd60441b983a",
"trajectory_id": "traj-d99b5307-1749-4e80-867a-877e087f226f",
"timestamp": "2026-04-14T16:50:18.865231+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2606.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-44d46564-2d71-4bed-8a3f-d3fc96fce9ef"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}


@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
"task": {
"task_id": "task-789e89f1-828b-405e-ab11-43dd00107f5f",
"input": "Deploy this service with the usual workflow.",
"channel": "local",
"created_at": "2026-04-14T15:27:38.519101+00:00",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-ec9bd980-c648-43fc-8428-83a6ce0cf375",
"trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
"timestamp": "2026-04-14T15:27:38.519171+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Deploy this service with the usual workflow."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-e0f1f4e9-2a70-424d-bff6-34a156134b0f",
"trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
"timestamp": "2026-04-14T15:27:38.519177+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-ec9bd980-c648-43fc-8428-83a6ce0cf375"
},
{
"event_id": "evt-9b1ea6f8-ac54-4aa4-ae0f-44aa3a0128dd",
"trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
"timestamp": "2026-04-14T15:27:38.519181+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Task resembles a reusable procedure; load a skill before action.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-e0f1f4e9-2a70-424d-bff6-34a156134b0f"
},
{
"event_id": "evt-skill-traj-dd361c81-40a1-4892-9914-2140870fff95-skill-deploy",
"trajectory_id": "traj-dd361c81-40a1-4892-9914-2140870fff95",
"timestamp": "2026-04-14T15:27:38.519188+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Deploy this service with the usual workflow.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"task": {
"task_id": "task-144d7465-796c-4dd0-a4e2-c2be42872c4a",
"input": "Run the deploy workflow skill.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.606059+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1277.214).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-184ab2f3-c1c6-4af1-8241-d55b4731e606",
"trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"timestamp": "2026-04-14T15:52:24.606169+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-9dd959ce-a5ce-42fd-b975-a03dd713adf6",
"trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"timestamp": "2026-04-14T15:52:24.606180+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-184ab2f3-c1c6-4af1-8241-d55b4731e606"
},
{
"event_id": "evt-537a8488-f6eb-4f15-94ac-3e1f195c584a",
"trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"timestamp": "2026-04-14T15:52:24.606193+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1277.214).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-9dd959ce-a5ce-42fd-b975-a03dd713adf6"
},
{
"event_id": "evt-skill-traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb-skill-deploy",
"trajectory_id": "traj-e197ee51-e87c-4203-b9ee-c2f2d530cceb",
"timestamp": "2026-04-14T15:52:24.606202+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Run the deploy workflow skill.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
"task": {
"task_id": "task-f61f5344-3be7-4a7a-9dfa-b8d2a9c30a42",
"input": "Recall my saved preference from memory.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.863539+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1994.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-21762ef9-6490-4e3f-8f3c-2ba17e20c050",
"trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
"timestamp": "2026-04-14T16:50:18.863643+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-40f5b045-1e94-4c07-8cf5-5a245a946b9d",
"trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
"timestamp": "2026-04-14T16:50:18.863652+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-21762ef9-6490-4e3f-8f3c-2ba17e20c050"
},
{
"event_id": "evt-5ed49c2e-d2b3-46ec-859e-ec00f8c001c2",
"trajectory_id": "traj-e9c37170-8764-4d70-ba0d-90213b275229",
"timestamp": "2026-04-14T16:50:18.863659+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1994.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-40f5b045-1e94-4c07-8cf5-5a245a946b9d"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}


@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
"task": {
"task_id": "task-d7578bf3-95da-43f2-9b31-2c80ccb4fe33",
"input": "Run the deploy workflow skill.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.864056+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1535.615).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-4e1aa172-112d-4000-8708-f2184e114ee5",
"trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
"timestamp": "2026-04-14T16:50:18.864163+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Run the deploy workflow skill."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-b9e7f5b9-2f27-4f4d-8f76-8ba6b39620eb",
"trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
"timestamp": "2026-04-14T16:50:18.864173+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-4e1aa172-112d-4000-8708-f2184e114ee5"
},
{
"event_id": "evt-07dcc07d-d9c4-4698-881d-925294dadadf",
"trajectory_id": "traj-ebc0d1f0-d01f-4c1f-8cdb-23c3d184b2c5",
"timestamp": "2026-04-14T16:50:18.864179+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1535.615).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-b9e7f5b9-2f27-4f4d-8f76-8ba6b39620eb"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}


@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"task": {
"task_id": "task-d9131553-8868-4dac-8f06-69be44c43f4e",
"input": "Use multiple capabilities: memory, skill, and tool.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.607062+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2167.3334).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-a8bc2d4a-1557-4029-899f-7fa93b764b11",
"trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"timestamp": "2026-04-14T15:52:24.607165+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use multiple capabilities: memory, skill, and tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-832ac7e6-d619-4e24-ad74-bcca1042806e",
"trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"timestamp": "2026-04-14T15:52:24.607175+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-a8bc2d4a-1557-4029-899f-7fa93b764b11"
},
{
"event_id": "evt-0aaff11d-de9f-4e28-bc92-6def76857a20",
"trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"timestamp": "2026-04-14T15:52:24.607182+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=2167.3334).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-832ac7e6-d619-4e24-ad74-bcca1042806e"
},
{
"event_id": "evt-skill-traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17-skill-deploy",
"trajectory_id": "traj-ed1d8812-f0ac-4994-86ab-21b3cf0fcb17",
"timestamp": "2026-04-14T15:52:24.607192+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Use multiple capabilities: memory, skill, and tool.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"task": {
"task_id": "task-053282d0-1f43-409f-a230-343d3faa02df",
"input": "Check current system status with a tool.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.606551+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1701.0804).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-5900439a-2a97-41fe-a82e-96181c99fee1",
"trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"timestamp": "2026-04-14T15:52:24.606656+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Check current system status with a tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-e0965597-ddee-4ccd-ae72-b51105101428",
"trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"timestamp": "2026-04-14T15:52:24.606666+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-5900439a-2a97-41fe-a82e-96181c99fee1"
},
{
"event_id": "evt-047dc545-d6c2-4a67-b0db-26b79e994e63",
"trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"timestamp": "2026-04-14T15:52:24.606672+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1701.0804).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-e0965597-ddee-4ccd-ae72-b51105101428"
},
{
"event_id": "evt-skill-traj-f1d895a0-5442-448f-8936-4ee8b07822e6-skill-deploy",
"trajectory_id": "traj-f1d895a0-5442-448f-8936-4ee8b07822e6",
"timestamp": "2026-04-14T15:52:24.606681+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Check current system status with a tool.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}


@@ -0,0 +1,170 @@
{
"trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
"task": {
"task_id": "task-c3c52f6d-4793-4687-9838-d98fd99a6074",
"input": "Use multiple capabilities: memory, skill, and tool.",
"channel": "local",
"created_at": "2026-04-14T16:50:18.863031+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "No high-confidence route found from the current heuristic baseline.",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-1cfd8f39-f961-43da-9fb4-9e37dd7072f0",
"trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
"timestamp": "2026-04-14T16:50:18.863119+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Use multiple capabilities: memory, skill, and tool."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-a7f6a38f-76c5-4342-a592-4acbd15efe9f",
"trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
"timestamp": "2026-04-14T16:50:18.863129+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-1cfd8f39-f961-43da-9fb4-9e37dd7072f0"
},
{
"event_id": "evt-79e3d820-34bf-4c20-9286-2e20dd3e068c",
"trajectory_id": "traj-f511e978-ad79-4be6-bbab-461b5ad9ecb3",
"timestamp": "2026-04-14T16:50:18.863136+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "clarify",
"selected_ids": [],
"selected_payloads": [],
"rejected_ids": [],
"rationale": "No high-confidence route found from the current heuristic baseline.",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-a7f6a38f-76c5-4342-a592-4acbd15efe9f"
}
],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 0.44,
"components": {
"task_success": 0.4,
"retrieval_hit": 0.1,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.0
}
}
}

View File

@@ -0,0 +1,192 @@
{
"trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
"task": {
"task_id": "task-f0aed2e6-8d9b-42f8-a20c-5eb8af052d3b",
"input": "Recall my saved preference from memory.",
"channel": "local",
"created_at": "2026-04-14T15:52:24.605509+00:00",
"user_id": null
},
"context_snapshot": {
"conversation_summary": "",
"environment_summary": "",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-telegram-pref",
"type": "memory",
"title": "Telegram preference",
"summary": "Prefer plain text on Telegram.",
"triggers": [
"telegram",
"preference",
"answer"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 0.9,
"risk": 0.0,
"tags": [
"output"
],
"source": "user",
"type_payload": {}
}
],
"skill": [
{
"id": "skill-deploy",
"type": "skill",
"title": "Deploy workflow",
"summary": "Reusable deployment workflow.",
"triggers": [
"deploy",
"workflow",
"service"
],
"cost": 0.0,
"confidence": 0.8,
"success_rate": 0.9,
"freshness": 0.8,
"risk": 0.0,
"tags": [
"ops"
],
"source": "system",
"type_payload": {}
}
],
"tool": [
{
"id": "tool-terminal",
"type": "tool",
"title": "terminal",
"summary": "Run terminal-style inspection commands.",
"triggers": [
"check",
"current",
"status",
"system"
],
"cost": 0.0,
"confidence": 0.95,
"success_rate": 0.9,
"freshness": 1.0,
"risk": 0.0,
"tags": [
"inspection"
],
"source": "system",
"type_payload": {}
}
]
},
"decisions": [
{
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1658.6938).",
"estimated_cost": 0.0
}
],
"events": [
{
"event_id": "evt-44233637-eb1a-47de-972c-942ee409dd78",
"trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
"timestamp": "2026-04-14T15:52:24.605614+00:00",
"stage": "retrieval",
"event_type": "task_received",
"payload": {
"input": "Recall my saved preference from memory."
},
"metrics": {},
"parent_event_id": null
},
{
"event_id": "evt-01123ad4-7d52-4c82-bca1-1a3b5014196f",
"trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
"timestamp": "2026-04-14T15:52:24.605625+00:00",
"stage": "retrieval",
"event_type": "candidates_recalled",
"payload": {
"memory_ids": [
"mem-telegram-pref"
],
"skill_ids": [
"skill-deploy"
],
"tool_ids": [
"tool-terminal"
]
},
"metrics": {},
"parent_event_id": "evt-44233637-eb1a-47de-972c-942ee409dd78"
},
{
"event_id": "evt-a9a657a1-1e3e-49f3-8ea0-9528c12c633f",
"trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
"timestamp": "2026-04-14T15:52:24.605632+00:00",
"stage": "policy",
"event_type": "action_selected",
"payload": {
"step": 1,
"decision_type": "load_skill",
"selected_ids": [
"skill-deploy"
],
"selected_payloads": [
{}
],
"rejected_ids": [],
"rationale": "Predicted by learning router (score=1658.6938).",
"estimated_cost": 0.0
},
"metrics": {},
"parent_event_id": "evt-01123ad4-7d52-4c82-bca1-1a3b5014196f"
},
{
"event_id": "evt-skill-traj-ffb40d01-7956-4d7b-a41c-9618487fe619-skill-deploy",
"trajectory_id": "traj-ffb40d01-7956-4d7b-a41c-9618487fe619",
"timestamp": "2026-04-14T15:52:24.605642+00:00",
"stage": "execution",
"event_type": "skill_loaded",
"payload": {
"skill_id": "skill-deploy",
"input": "Recall my saved preference from memory.",
"instructions": "Demo skill payload loaded successfully."
},
"metrics": {},
"parent_event_id": null
}
],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 0,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Draft trajectory generated by MemabraRunner with execution hooks."
},
"reward": {
"total": 1.04,
"components": {
"task_success": 0.8,
"retrieval_hit": 0.2,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.0,
"context_cost": 0.06,
"useful_reuse": 0.1
}
}
}

View File

@@ -0,0 +1,66 @@
{
"trajectory_id": "traj-failure-missed-memory-001",
"task": {
"task_id": "task-004",
"input": "Use my usual formatting preferences for this write-up.",
"channel": "telegram",
"created_at": "2026-04-14T13:05:00Z",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "User has repeated stable formatting preferences in earlier sessions.",
"environment_summary": "No tool call required.",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-format-1",
"type": "memory",
"title": "Telegram formatting preference",
"summary": "Prefer plain text over markdown for Telegram delivery.",
"triggers": ["format", "telegram", "write-up"],
"cost": 0.05,
"confidence": 0.9,
"success_rate": 0.95,
"freshness": 0.95,
"risk": 0.05,
"tags": ["preference", "output"],
"source": "system"
}
],
"skill": [],
"tool": []
},
"decisions": [
{
"step": 1,
"decision_type": "direct_answer",
"selected_ids": [],
"rejected_ids": ["mem-format-1"],
"rationale": "Router failed to recognize a preference-triggered task and skipped memory injection.",
"estimated_cost": 0.0
}
],
"events": [],
"outcome": {
"status": "partial_success",
"steps": 1,
"latency_ms": 300,
"user_corrections": 1,
"tool_errors": 0,
"notes": "Answer was serviceable but ignored known formatting preference."
},
"reward": {
"total": 0.18,
"components": {
"task_success": 0.5,
"retrieval_hit": -0.1,
"tool_error": 0.0,
"user_correction": 0.2,
"latency": 0.02,
"context_cost": 0.0,
"useful_reuse": 0.0
}
}
}

View File

@@ -0,0 +1,67 @@
{
"trajectory_id": "traj-failure-overtool-001",
"task": {
"task_id": "task-003",
"input": "Name this project.",
"channel": "telegram",
"created_at": "2026-04-14T13:04:00Z",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "User asks for naming help for an agent memory project.",
"environment_summary": "No real-time state lookup required.",
"recent_failures": ["The agent previously overused tools for pure reasoning tasks."]
},
"candidate_sets": {
"memory": [],
"skill": [],
"tool": [
{
"id": "tool-web-1",
"type": "tool",
"title": "web_search",
"summary": "Search the web for information.",
"triggers": ["name", "idea"],
"cost": 0.4,
"confidence": 0.62,
"success_rate": 0.55,
"freshness": 1.0,
"risk": 0.3,
"tags": ["research"],
"source": "system"
}
],
"skill": []
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": ["tool-web-1"],
"rejected_ids": [],
"rationale": "Incorrectly treated naming as a research task rather than a reasoning task.",
"estimated_cost": 0.4
}
],
"events": [],
"outcome": {
"status": "failure",
"steps": 2,
"latency_ms": 2400,
"user_corrections": 1,
"tool_errors": 1,
"notes": "Over-tooled a pure reasoning task and forced unnecessary latency."
},
"reward": {
"total": -0.82,
"components": {
"task_success": -0.3,
"retrieval_hit": 0.0,
"tool_error": 0.35,
"user_correction": 0.25,
"latency": 0.12,
"context_cost": 0.1,
"useful_reuse": 0.0
}
}
}

View File

@@ -0,0 +1,66 @@
{
"trajectory_id": "traj-success-memory-001",
"task": {
"task_id": "task-001",
"input": "Remember my preferred deployment region and use it next time.",
"channel": "telegram",
"created_at": "2026-04-14T13:02:00Z",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "User is defining a local agent memory project and references recurring preferences.",
"environment_summary": "No live tool call required.",
"recent_failures": []
},
"candidate_sets": {
"memory": [
{
"id": "mem-region-1",
"type": "memory",
"title": "Preferred deployment region",
"summary": "User prefers us-west-2 for deployments.",
"triggers": ["deployment", "region", "preference"],
"cost": 0.1,
"confidence": 0.93,
"success_rate": 0.88,
"freshness": 0.9,
"risk": 0.1,
"tags": ["preference", "deployment"],
"source": "user"
}
],
"skill": [],
"tool": []
},
"decisions": [
{
"step": 1,
"decision_type": "inject_memory",
"selected_ids": ["mem-region-1"],
"rejected_ids": [],
"rationale": "User request depends on a stable preference, so memory injection is the lowest-cost correct route.",
"estimated_cost": 0.1
}
],
"events": [],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 350,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Correctly identified preference storage request without unnecessary tools."
},
"reward": {
"total": 1.72,
"components": {
"task_success": 1.0,
"retrieval_hit": 0.45,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.03,
"context_cost": 0.05,
"useful_reuse": 0.35
}
}
}

View File

@@ -0,0 +1,67 @@
{
"trajectory_id": "traj-success-tool-001",
"task": {
"task_id": "task-002",
"input": "Check the current test status for the prototype.",
"channel": "telegram",
"created_at": "2026-04-14T13:03:00Z",
"user_id": "oza"
},
"context_snapshot": {
"conversation_summary": "User wants concrete progress on the memabra prototype.",
"environment_summary": "Pytest is available in the local repo environment.",
"recent_failures": []
},
"candidate_sets": {
"memory": [],
"skill": [],
"tool": [
{
"id": "tool-terminal-1",
"type": "tool",
"title": "terminal",
"summary": "Run shell commands in the local environment.",
"triggers": ["check", "current", "test"],
"cost": 0.2,
"confidence": 0.95,
"success_rate": 0.92,
"freshness": 1.0,
"risk": 0.2,
"tags": ["system", "tests"],
"source": "system"
}
],
"skill": []
},
"decisions": [
{
"step": 1,
"decision_type": "call_tool",
"selected_ids": ["tool-terminal-1"],
"rejected_ids": [],
"rationale": "Current test status is a live system fact and must be observed with a tool.",
"estimated_cost": 0.2
}
],
"events": [],
"outcome": {
"status": "success",
"steps": 1,
"latency_ms": 700,
"user_corrections": 0,
"tool_errors": 0,
"notes": "Terminal used appropriately to inspect live test state."
},
"reward": {
"total": 1.6,
"components": {
"task_success": 1.0,
"retrieval_hit": 0.4,
"tool_error": 0.0,
"user_correction": 0.0,
"latency": 0.08,
"context_cost": 0.02,
"useful_reuse": 0.3
}
}
}

191
docs/reward_spec.md Normal file
View File

@@ -0,0 +1,191 @@
# Reward Specification
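## Goal
memabra's reward does not simply judge whether the task got done. It evaluates:
- whether the right memory / skill / tool was selected
- whether the run was efficient
- whether the behavior was stable
- whether repeated user input and corrections were reduced
- whether tool cost and context cost were kept under control
The reward is not meant to make scores look better; it gives the routing policy an attributable, optimizable training signal.
## Reward components
The total reward is written as: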
```text
R = ws*S + wr*H - we*E - wc*C - wl*L - wx*X + wu*U
```
where:
- `S` = task success
- `H` = retrieval hit quality
- `E` = execution/tool error penalty
- `C` = user correction penalty
- `L` = latency penalty
- `X` = context cost penalty
- `U` = useful reuse bonus
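
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not memabra's actual implementation: the function name `total_reward` and the dictionary layout are assumptions, and the component values in the example are purely illustrative.

```python
# Hypothetical helper: evaluates R = ws*S + wr*H - we*E - wc*C - wl*L - wx*X + wu*U.
# Names and data layout are assumptions made for this sketch.

def total_reward(components: dict, weights: dict) -> float:
    """Combine the seven reward components into a single scalar."""
    return (
        weights["ws"] * components["S"]    # task success
        + weights["wr"] * components["H"]  # retrieval hit quality
        - weights["we"] * components["E"]  # execution/tool error penalty
        - weights["wc"] * components["C"]  # user correction penalty
        - weights["wl"] * components["L"]  # latency penalty
        - weights["wx"] * components["X"]  # context cost penalty
        + weights["wu"] * components["U"]  # useful reuse bonus
    )


# With every weight set to 1.0 the reward is the signed sum of the components.
unit_weights = {k: 1.0 for k in ("ws", "wr", "we", "wc", "wl", "wx", "wu")}
example = {"S": 1.0, "H": 0.45, "E": 0.0, "C": 0.0, "L": 0.03, "X": 0.05, "U": 0.35}
reward = total_reward(example, unit_weights)
```

Keeping penalties as non-negative magnitudes and applying the sign inside the formula means every logged component can be read without knowing whether it helps or hurts.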
## 1. Task Success (`S`)
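Definition: whether the task was ultimately completed, and how well.
Suggested values:
- `1.0`: goal fully achieved
- `0.5`: partially achieved
- `0.0`: not completed
- `-0.5`: clearly misleading or headed in the wrong direction
Data sources:
- automated task acceptance checker
- explicit user feedback
- replay comparison rules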
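
As a rough sketch, the suggested values can be expressed as a lookup from an outcome label to `S`. The label names below are assumptions for illustration (in particular `"misleading"` is hypothetical), and a real acceptance checker might well produce finer-grained scores.

```python
# Hypothetical mapping from an outcome label to the suggested S value.
SUGGESTED_SUCCESS_SCORES = {
    "success": 1.0,          # goal fully achieved
    "partial_success": 0.5,  # partially achieved
    "failure": 0.0,          # not completed
    "misleading": -0.5,      # clearly misleading or wrong direction (assumed label)
}


def task_success_score(status: str) -> float:
    """Return the suggested S value for an outcome label."""
    try:
        return SUGGESTED_SUCCESS_SCORES[status]
    except KeyError:
        raise ValueError(f"unknown outcome status: {status!r}") from None
```

Rejecting unknown labels instead of defaulting to `0.0` keeps a typo in the outcome field from silently looking like an incomplete task.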
## 2. Retrieval Hit Quality (`H`)
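Definition: whether the memory / skill / tool that actually helps the task was hit.
Suggested breakdown:
- `Hm`: memory hit
- `Hs`: skill hit
- `Ht`: tool hit
Scoring approach:
- hitting a high-value candidate that helps reduce steps: positive reward
- recalling many candidates that turn out to be useless: low reward or 0
- missing a key candidate: negative reward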
## 3. Execution / Tool Error Penalty (`E`)
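Definition: whether invalid, wrong, or clearly redundant calls occurred.
Examples:
- calling a tool that should not have been called
- passing clearly wrong tool arguments
- repeating the same ineffective action
- taking a long chain when a direct answer would have sufficed
Suggested values:
- each minor error: `0.1` to `0.3`
- severe error: `0.5` to `1.0`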
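
One way to turn these bands into a number is to place each error inside its band according to an intensity between 0 and 1 and sum the results. This is only a sketch under that assumption: the `intensity` field, the linear interpolation, and the absence of a cap are choices made here, not something the spec prescribes.

```python
# Penalty bands taken from the suggested values above.
PENALTY_BANDS = {
    "minor": (0.1, 0.3),
    "severe": (0.5, 1.0),
}


def error_penalty(errors: list) -> float:
    """Sum per-error penalties; each error is a (severity, intensity) pair.

    intensity is an assumed 0..1 value that positions the error inside
    its band (0 = bottom of the band, 1 = top of the band).
    """
    total = 0.0
    for severity, intensity in errors:
        low, high = PENALTY_BANDS[severity]
        clamped = min(max(intensity, 0.0), 1.0)
        total += low + clamped * (high - low)
    return total


# One mild minor error plus one worst-case severe error.
E = error_penalty([("minor", 0.0), ("severe", 1.0)])
```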
## 4. User Correction Penalty (`C`)
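Definition: whether the user had to supply information the system should already have known, or correct a wrong action.
Examples:
- the user restates a preference
- the user points out that the wrong tool was called
- the user asks to retract an incorrect memory
Rationale:
This term is critical for a long-lived system because it directly reflects whether the system has actually learned.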
## 5. Latency Penalty (`L`)
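Definition: whether the time and steps the system spent completing the task were excessive.
Suggested inputs:
- wall-clock latency
- action count
- retry count
Approach:
- a small amount of extra reasoning is acceptable
- large amounts of pointless detours must be penalized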
## 6. Context Cost Penalty (`X`)
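Definition: whether the context was inflated excessively.
This includes:
- injecting too many irrelevant memories
- loading unnecessary skills
- emitting oversized intermediate output
Reason:
Agents tend to "stuff in a bit more to be safe" and end up bogging down the context.
This cost must enter the reward explicitly.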
## 7. Useful Reuse Bonus (`U`)
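Definition: whether the correct long-term information was reused and actually improved efficiency or quality.
Examples:
- successfully reusing a user preference, avoiding another confirmation
- reusing a verified skill, reducing trial and error
- reusing a similar episode to finish the task faster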
## Suggested initial weights
A naive version to start with:
```text
ws = 1.0
wr = 0.35
we = 0.30
wc = 0.40
wl = 0.15
wx = 0.20
wu = 0.25
```
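Notes:
- success carries the highest weight
- user correction is penalized heavily because it directly exposes that the system has not learned
- retrieval hit has clear value but must not outweigh the outcome
- latency/context matter but should not be weighted too heavily early on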
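
To see how these weights trade off, the sketch below computes each component's signed contribution to the total. The weight values come from the block above; the function name `contributions` and the symbol-keyed dictionaries are assumptions for illustration.

```python
# Initial weights from the spec, keyed by the component symbol they scale.
INITIAL_WEIGHTS = {"S": 1.0, "H": 0.35, "E": 0.30, "C": 0.40, "L": 0.15, "X": 0.20, "U": 0.25}
# Sign of each term in R: bonuses are +1, penalties are -1.
SIGNS = {"S": 1, "H": 1, "E": -1, "C": -1, "L": -1, "X": -1, "U": 1}


def contributions(components: dict) -> dict:
    """Return the signed, weighted contribution of every component."""
    return {
        key: SIGNS[key] * INITIAL_WEIGHTS[key] * components.get(key, 0.0)
        for key in INITIAL_WEIGHTS
    }


# A fully successful task that still needed one full-strength user
# correction and incurred a full-strength latency penalty.
parts = contributions({"S": 1.0, "C": 1.0, "L": 1.0})
total = sum(parts.values())
```

In this example the user correction costs 0.40 while the same-sized latency penalty costs 0.15, which matches the intent stated in the notes: corrections are punished more than slowness.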
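## Signal sources
Reward can come from three kinds of sources:
### A. Explicit signals
- the user says "right" or "wrong"
- the user corrects the system
- the user asks for a redo
### B. Implicit signals
- whether steps were reduced
- whether errors were triggered
- whether the same question was asked repeatedly
- whether a timeout occurred
### C. Programmatic acceptance
- whether tests pass
- whether the target file was generated
- whether the specified fields match
- whether tool execution succeeded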
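## Counterfactual logging requirements
For later training, the following must be recorded:
- which candidates were in the candidate set
- which one was finally selected
- which high-scoring candidates were not selected
- the local outcome of each action
Otherwise the reward can only be assigned to the "whole process", and no specific routing policy can be learned from it.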
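
A record that satisfies these requirements could look roughly like the sketch below. The class and field names are hypothetical, and `confidence` is used as a stand-in for whatever score the router assigns to a candidate; the threshold of 0.8 is likewise an arbitrary choice for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Candidate:
    id: str
    kind: str          # "memory", "skill", or "tool"
    confidence: float  # stand-in for the router's candidate score (assumption)


@dataclass
class CounterfactualRecord:
    candidates: list[Candidate]   # everything that was in the candidate set
    selected_ids: list[str]       # what the router finally chose
    action_outcomes: dict[str, str] = field(default_factory=dict)  # per-action local outcome

    def unselected_high_scoring(self, threshold: float = 0.8) -> list[str]:
        """IDs of candidates scoring at or above threshold that were not chosen."""
        chosen = set(self.selected_ids)
        return [
            c.id
            for c in self.candidates
            if c.id not in chosen and c.confidence >= threshold
        ]


# A direct answer that skipped a high-confidence memory candidate.
record = CounterfactualRecord(
    candidates=[Candidate("mem-format-1", "memory", 0.9)],
    selected_ids=[],
    action_outcomes={"direct_answer": "partial_success"},
)
missed = record.unselected_high_scoring()
```

Storing the skipped candidates next to the chosen ones is what lets a later training step ask "what would have happened if the router had picked this instead".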
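## Early-phase strategy
In Phase 0 / Phase 1, using the reward directly to update large-model weights is not recommended.
Use it first for:
- routing rule evaluation
- sample labeling
- candidate ranking optimization
- bandit / reranker training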
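
To make the bandit option concrete, here is a minimal epsilon-greedy sketch that treats each decision type as an arm and the scalar reward as feedback. It is a generic textbook technique shown under assumptions, not memabra's router: the class name, the arm names, and the default `epsilon` are all illustrative.

```python
import random


class EpsilonGreedyRouter:
    """Tiny epsilon-greedy bandit over decision types (illustrative only)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)        # seeded for reproducible exploration
        self.counts = {arm: 0 for arm in arms}
        self.values = {arm: 0.0 for arm in arms}  # running mean reward per arm

    def select(self):
        """Explore with probability epsilon, otherwise exploit the best arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))
        return max(self.values, key=self.values.get)

    def update(self, arm, reward):
        """Fold one observed reward into the arm's running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


router = EpsilonGreedyRouter(["direct_answer", "inject_memory", "call_tool"], epsilon=0.0)
router.update("inject_memory", 1.72)
router.update("call_tool", -0.82)
best = router.select()
```

A context-free bandit like this ignores the task input entirely, so it is best read as a baseline for checking that the reward signal is usable before investing in a reranker.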
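## Risks
- looking only at success rewards lucky accidents
- looking only at efficiency makes the system afraid to explore
- looking only at user feedback exposes the reward to noise in how users express themselves
- without counterfactual records, training is largely blind
## Current conclusion
In memabra, reward is not an accessory; it is core infrastructure for the learning loop.
If the reward design is unclear, every later "update weights based on outcomes" step turns into pseudo-learning.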

View File

@@ -0,0 +1,13 @@
{
"current_version_id": "20260415-023347",
"promotion_source": null,
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
},
"prior_version_id": "20260415-023347",
"saved_at": "2026-04-15T02:33:47.916903+00:00"
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-150123",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-150127",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-150228",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-150426",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-152505",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-152530",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-152625",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-152935",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-152941",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-155036",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-155251",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-155350",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-164944",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165138",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165207",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165241",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165316",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165359",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-165450",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-171516",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-171623",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-171651",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-171757",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

View File

@@ -0,0 +1,35 @@
{
"version_id": "20260414-173832",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180027",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180106",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180343",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180515",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180553",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180625",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-180658",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-182721",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-182806",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-183024",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

@@ -0,0 +1,35 @@
{
"version_id": "20260414-183107",
"weights": {
"clarify": {
"input_length": 6.0,
"memory_count": 1.0,
"skill_count": 1.0,
"tool_count": 1.0,
"top_memory_confidence": 0.9500000000000001,
"top_skill_success_rate": 0.8999999999999998,
"top_tool_confidence": 0.9500000000000001,
"top_tool_risk": 0.0
}
},
"feature_keys": [
"input_length",
"memory_count",
"skill_count",
"tool_count",
"top_memory_confidence",
"top_skill_success_rate",
"top_tool_confidence",
"top_tool_risk"
],
"metadata": {
"source": "online_learning",
"benchmark_summary": {
"reward_delta": 0.0,
"error_rate_delta": 0.0,
"latency_delta_ms": 0.0,
"baseline_avg_reward": 0.44,
"challenger_avg_reward": 0.44
}
}
}

Some files were not shown because too many files have changed in this diff.