Initial standalone memabra release
docs/DEMO.md
# Demo

memabra now has a polished wrap-up workflow in addition to the lower-level demo app.

## Quick run

If you installed the repo in editable mode, prefer the dedicated CLI command:

```bash
source venv/bin/activate
memabra
```

The legacy developer entrypoint still works too:

```bash
source venv/bin/activate
python -m src.memabra.cli
```

This runs the online-learning loop: it seeds demo tasks, trains a challenger router, evaluates it against a benchmark suite, promotes it if thresholds are met, and prints a JSON report.
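
The "promotes it if thresholds are met" step can be pictured with a small sketch. This is a hypothetical stand-in, not memabra's actual promotion policy: the `success_rate` field name and the threshold values are assumptions for illustration.

```python
# Hypothetical sketch of a benchmark-gated promotion check. The real
# promotion policy, metric names, and thresholds live inside memabra;
# everything here is an illustrative assumption.

def should_promote(baseline: dict, challenger: dict,
                   min_delta: float = 0.0,
                   min_success_rate: float = 0.8) -> bool:
    """Accept the challenger router only if it beats the baseline on the
    benchmark suite and clears an absolute success-rate floor."""
    delta = challenger["success_rate"] - baseline["success_rate"]
    return delta > min_delta and challenger["success_rate"] >= min_success_rate

print(should_promote({"success_rate": 0.70}, {"success_rate": 0.85}))  # True
print(should_promote({"success_rate": 0.90}, {"success_rate": 0.85}))  # False
```

The point of the two-condition gate is that a challenger that merely edges out a weak baseline still cannot be promoted below the absolute floor.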

You can override the default artifact directory and minimum trajectory threshold:

```bash
source venv/bin/activate
memabra run --base-dir /custom/artifacts --min-new-trajectories 5
```

You can also enable episodic retrieval by rebuilding the case index from saved trajectories:

```bash
source venv/bin/activate
memabra run --rebuild-case-index
```
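
To make the rebuild step concrete, here is a minimal sketch of turning saved trajectory files into a retrievable case index. The `outcome.status` and `reward.total` fields mirror what the demo app prints later in this doc, but the top-level `prompt` field and the index shape are assumptions; memabra's actual case-index format is internal.

```python
# Illustrative sketch only: rebuild an episodic case index from saved
# trajectory JSON files. Paths and the `prompt` field are assumptions.
import json
from pathlib import Path

def rebuild_case_index(trajectory_dir: str) -> dict:
    """Map each trajectory's prompt to its outcome and reward so that
    similar future tasks can retrieve past episodes."""
    index = {}
    for path in sorted(Path(trajectory_dir).glob("*.json")):
        record = json.loads(path.read_text())
        index[record["prompt"]] = {
            "outcome": record["outcome"]["status"],
            "reward": record["reward"]["total"],
        }
    return index
```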

You can check system status, list versions, or roll back without running a learning cycle:

```bash
source venv/bin/activate
memabra status
memabra version list
memabra version rollback 20260414-123456
```

If you want operator-friendly output instead of raw JSON, use `--format text`:

```bash
source venv/bin/activate
memabra status --format text
memabra version list --format text
memabra version rollback 20260414-123456 --format text
memabra run --dry-run --format text
```

The text formatter is aimed at operators: status output includes the latest report's timing and outcome, version listings highlight the currently active router version, and workflow output is grouped into summary/baseline/challenger/deltas/decision sections with normalized yes/no values and fixed-precision metrics.
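
A rough sketch of that value normalization (the real formatter does more, such as section grouping and marking the active version; the three-decimal precision here is an assumption):

```python
# Sketch of the yes/no and fixed-precision normalization described above.
# The precision (3 decimals) is an assumed example, not memabra's setting.

def fmt_value(value) -> str:
    """Render booleans as yes/no and floats at fixed precision."""
    if isinstance(value, bool):  # check bool before numeric types
        return "yes" if value else "no"
    if isinstance(value, float):
        return f"{value:.3f}"
    return str(value)

print(fmt_value(True))       # yes
print(fmt_value(0.8571428))  # 0.857
```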

You can also call it programmatically:

```bash
source venv/bin/activate
python - <<'PY'
from src.memabra.cli import run_online_learning_workflow
result = run_online_learning_workflow()
print(result)
PY
```

The online-learning workflow will:
1. build a demo app
2. seed example tasks (if no trajectories exist yet)
3. run one online-learning cycle
4. train a challenger router
5. evaluate it against the baseline on a fixed benchmark suite
6. promote it only if the promotion policy accepts
7. persist a training report under `training-reports/`
8. print a JSON report

## Python API

```python
from src.memabra.cli import run_wrapup_workflow, run_online_learning_workflow

# Legacy wrap-up demo
result = run_wrapup_workflow()
print(result)

# Safe online-learning loop with benchmark-gated promotion
result = run_online_learning_workflow()
print(result)
```

## Lower-level demo app

You can still drive the app manually:

```bash
source venv/bin/activate
python - <<'PY'
from src.memabra.app import build_demo_app

app = build_demo_app()

for prompt in [
    'Use my telegram preference for this answer.',
    'Check the current system status.',
    'Deploy this service with the usual workflow.',
]:
    trajectory = app.run_task(prompt, channel='telegram', user_id='oza')
    print(prompt)
    print(trajectory['decisions'][0]['decision_type'], trajectory['outcome']['status'], trajectory['reward']['total'])
    print([event['event_type'] for event in trajectory['events']])
    print('---')

print(app.replay_summary())
PY
```

## Output locations

By default the workflows write to:
- `docs/projects/memabra/demo-artifacts/trajectories/`
- `docs/projects/memabra/demo-artifacts/memories/`
- `docs/projects/memabra/demo-artifacts/router-versions/`
- `docs/projects/memabra/demo-artifacts/training-reports/`
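
Given that layout, the most recent training report can be picked up with a few lines. `latest_training_report` is a hypothetical helper, not part of memabra; it only assumes reports are JSON files whose sorted names order by time, as timestamped filenames like `20260414-123456.json` do.

```python
import json
from pathlib import Path

# Hypothetical convenience helper; assumes training reports are JSON files
# with timestamp-ordered names under the training-reports/ directory above.
def latest_training_report(base_dir: str) -> dict:
    reports = sorted(Path(base_dir, "training-reports").glob("*.json"))
    if not reports:
        raise FileNotFoundError("no training reports yet; run the workflow first")
    return json.loads(reports[-1].read_text())
```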

## What this proves

The alpha demonstrates the whole loop:
- retrieval
- routing
- execution
- persistence
- replay
- training
- evaluation
- router versioning
- benchmark-gated promotion
- auditable training reports

## Limits

This is still an alpha:
- learning is lightweight, not a deep model
- storage is JSON-file based
- promotion policy thresholds are manually configured
- tool/skill integration is still narrower than a production agent platform

But it is now a safe, self-improving alpha, not just a pile of modules.