Module 01 of 19 · Start Here

🏛 What Paras Is

Paras is your one-person, AI-automated futures-trading system whose entire job is to kill bad strategies cheaply — three Windows apps (PULSE, SENTINEL, AXIOM) over one shared kernel, governed by an 11-principle Constitution.

The touchstone

Parasmani is the legendary stone that turns base metal to gold. Paras is the honest version: it does not promise gold — it touches every metal and reveals which ones were gold all along. Most aren't. That is the point. The system's job is to kill bad strategies cheaply, before they cost you real money. (docs/00 line 2; CLAUDE.md line 3)

Welcome. This is the front door of your Paras knowledge portal. Paras (also called sponaitech-trading) is the complete automated futures-trading system you are building solo with AI automation. It is not one program — it is three Windows desktop apps sitting on top of one shared C# 'kernel', plus a written Constitution that every line of code must obey. This page is the big-picture orientation: what Paras is, why it exists, how the pieces fit, and how to use this portal. Later lessons go deep on each part.

Read the next sentence twice, because it reframes everything: a normal trading platform is built to FIND winners; Paras is built to KILL losers. The whole architecture (pre-registered trial budgets, statistical deflation gates, three independent engines that must agree, a compliance engine that makes your worst habit architecturally impossible) exists to make 'this strategy is fake' the cheap, default, honest answer. An impressive backtest is treated as a suspect until proven otherwise, never as a discovery.

Monitor

PULSE — the heartbeat

An always-on Windows tray monitor. It is the read-only 'face' that watches the data plant and tells you, at a glance, whether your data engine is healthy, syncing, behind, faulted, or simply CLOSED because the market is. PULSE never writes to the database; it only reads status files. Project: Sponaitech.Pulse.

Research

SENTINEL — the laboratory

The research lab. It backfills and maintains deep market history, runs strategy experiments through a two-speed backtest engine, forces every result through deflation gates (DSR/PBO) and a three-engine parity court, and runs a nightly Claude research loop. Its output is never 'best backtest' — it is survivors after deflation, plus a growing corpus of validated failures.

Execution

AXIOM — the fortress

The execution fortress. It hosts only gate-surviving strategies on live Topstep trading, generates signals through the SAME kernel code that backtested them, and routes every order through a deterministic Topstep compliance engine. A local AI sidecar (Gemma) rides along as a conscience that can only say NO — it never touches an order.

Notice the deliberate shape: PULSE watches, SENTINEL discovers, AXIOM executes — and they are kept apart on purpose. Only AXIOM is ever allowed to emit an order; SENTINEL never touches a trading venue. The thing that makes the three apps trustworthy together is the single shared kernel underneath them.

Why one shared kernel matters (Parity by construction)

The kernel is a pure C# library — domain model, components, strategy specs, and the backtest engine — with no venue SDKs, no HTTP, no UI, no LLM calls, no Topstep values, and not even a wall-clock read (the clock is injected). SENTINEL backtests a strategy with the kernel; AXIOM trades that same strategy with the SAME kernel code. So 'the code that trades is the code that backtests.' That is Constitution Principle 3, and it is why a backtest result actually predicts live behavior. (CLAUDE.md line 32; docs/00 Principle 3)
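
The 'clock is injected' rule can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the real kernel is C#, and every name in it (`Clock`, `FixedClock`, `minutes_to_flatten`) is hypothetical rather than taken from the Paras codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Protocol


class Clock(Protocol):
    """Anything that can report the current time (hypothetical interface)."""

    def now(self) -> datetime: ...


@dataclass(frozen=True)
class FixedClock:
    """A clock that always returns the instant it was given (backtests, tests)."""

    instant: datetime

    def now(self) -> datetime:
        return self.instant


def minutes_to_flatten(clock: Clock, flatten_at: datetime) -> float:
    """Pure logic: the function never reads the wall clock itself."""
    remaining: timedelta = flatten_at - clock.now()
    return max(remaining.total_seconds() / 60.0, 0.0)


# A backtest replays history by handing the same function a historical instant.
replay = FixedClock(datetime(2025, 1, 6, 12, 45))
print(minutes_to_flatten(replay, datetime(2025, 1, 6, 13, 0)))  # 15.0
```

Because the time source is a parameter, the same function gives the same answer in a backtest and in live trading, which is the property purity is meant to protect.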

1. A strategy is proposed as a pre-registered hypothesis

Every idea enters as a spec with a stated mechanism (WHY it should work) and a pre-registered trial budget. Pure 'compose combos and see what sticks' pattern-mining is banned — a stated mechanism is required (G0). Every trial is counted; re-rolls face a higher statistical hurdle (DSR / the False Strategy Theorem).

2. SENTINEL backtests it through the kernel and counts the trial

The two-speed engine runs the strategy on deep market history. Friction (commissions, slippage) is modeled pessimistically until Phase-0 calibration replaces guesses with measured fills. Every single run records a trial row in the ledger — there are no untracked runs.

3. Deflation gates + three-engine parity try to KILL it

Results are forced through statistical deflation gates (DSR/PBO) that punish over-searching, and through a parity court where three independent engines (kernel / TradeStation / LEAN) must agree. Most strategies die here. A killed strategy becomes a permanent, valued negative result in learnings.md.

4. Survivors — and only survivors — are promoted to AXIOM

A strategy that survives every gate (a G-gate survivor) is hosted by AXIOM on live Topstep data, running the identical kernel code. Stops, targets, sizing, and the flatten deadline are deterministic code — never an AI decision.

5. The compliance engine enforces Topstep, deterministically

Every order passes through a Topstep compliance engine that makes your documented failure mode — discretionary oversizing after a winning streak — architecturally impossible. The AI sidecar can veto, narrate, resist, and reflect; it can never approve, place, size, or modify an order.

Concrete example. You have a hunch: 'ES opening-range breakouts work better when the prior day closed strong.' In a normal platform you'd tweak parameters until the equity curve looked great and start trading. In Paras you write it as a spec with that exact mechanism and a fixed trial budget. SENTINEL runs it, counts the trial, and the deflation gate asks: 'given how many variations you tried, is this edge real or just luck?' If the three engines disagree, or the deflated metric is below threshold, it dies — and the death is logged so you never waste money re-testing the same dead idea. If it survives, AXIOM trades the identical kernel logic, and the compliance engine refuses to let a hot streak talk you into doubling size. The hunch was a base metal or it was gold; Paras tells you honestly which, cheaply.
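
The first two steps of that lifecycle (a spec with a stated mechanism and a fixed budget, then a ledger that counts every run) might be sketched as follows. This is an illustration in Python, not the Paras ledger: the class names, field names, and the example mechanism sentence are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StrategySpec:
    """A pre-registered hypothesis (field names are illustrative)."""

    name: str
    mechanism: str      # WHY the edge should exist; required before any run
    trial_budget: int   # fixed up front, never raised after the fact


@dataclass
class TrialLedger:
    """Counts every run against the spec's pre-registered budget."""

    rows: list[tuple[str, int]] = field(default_factory=list)

    def record(self, spec: StrategySpec) -> int:
        if not spec.mechanism.strip():
            raise ValueError("no mechanism, no run")
        used = sum(1 for name, _ in self.rows if name == spec.name)
        if used >= spec.trial_budget:
            raise RuntimeError(f"trial budget exhausted for {spec.name}")
        self.rows.append((spec.name, used + 1))
        return used + 1


spec = StrategySpec(
    name="es-orb-strong-close",
    mechanism="Strong prior-day closes carry momentum into the next open.",
    trial_budget=2,
)
ledger = TrialLedger()
print(ledger.record(spec), ledger.record(spec))  # 1 2
```

A third call to `ledger.record(spec)` raises, which is the point: the budget check is plain code that runs before the backtest, so there is no way to spend an uncounted trial.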

Why does this exist for YOU specifically? Paras is a one-person operation that relies on AI automation. You run it on a single personal Windows machine — no VPS, no VPN (Topstep bans them), budget kept at or under $200/month, and zero marginal AI token cost at runtime (a Claude Max plan for build sessions, local Gemma/Ollama for runtime inference). The system has to be honest and self-disciplining precisely because there is no team to catch a mistake. The architecture IS the second pair of eyes.

| The 11-principle Constitution | What it means in one line |
|---|---|
| 1. Honest discovery over impressive results | Kill bad strategies cheaply; that's the job. |
| 2. Every trial is counted | Pre-registered budgets; re-rolls face a higher DSR hurdle. |
| 3. Parity by construction | The code that trades is the code that backtests (the kernel). |
| 4. Deterministic execution | No LLM in the order path; stops/targets/sizing/flatten are code. |
| 5. LLMs reason and veto, never forecast or manage orders | AI is a conscience, not a trader. |
| 6. Costs are first-class | Friction modeled pessimistically until Phase-0 calibration. |
| 7. One source of truth per concern | Bars: DuckDB. Decisions: the ledger. Lessons: learnings.md. |
| 8. Suppression beats optimization | When in doubt, don't trade. |
| 9. Topstep rules are versioned config, never constants | Four rule changes Nov 2025–Feb 2026 prove why. |
| 10. Local-first, budget-true | Personal device only; <= $200/mo; $0 marginal AI. |
| 11. Adapt by saying NO more often | Never quietly become a different system; live params are immutable. |

The hard rules are non-negotiable

Code that violates the Constitution is wrong even if it works. The load-bearing 'never violate' rules: no LLM calls in any order path; the AI sidecar may only veto/narrate/resist/reflect; Topstep rule values live in config/topstep-rules.json, never as constants; the kernel stays a pure library; every backtest records a trial row; live strategy parameters are immutable (fixes go through new spec versions + full gates); and AXIOM config is frozen during market hours (06:00–13:30 PT). (CLAUDE.md 'Hard rules'; docs/00 §2)
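
The market-hours freeze is the kind of rule that reduces to a few lines of deterministic code. The sketch below is an assumption-laden illustration in Python: the function names are hypothetical, the caller is assumed to pass a Pacific-time reading, and treating both ends of the window as inclusive is a choice made here, not something the source specifies.

```python
from datetime import time

# Window taken from the hard rule above; inclusive ends are an assumption.
FREEZE_START = time(6, 0)
FREEZE_END = time(13, 30)


def config_is_frozen(now_pt: time) -> bool:
    """True when a Pacific-time reading falls inside the freeze window."""
    return FREEZE_START <= now_pt <= FREEZE_END


def apply_config_change(now_pt: time) -> str:
    """Refuse any AXIOM config change while the window is active."""
    if config_is_frozen(now_pt):
        return "rejected: AXIOM config is frozen during market hours"
    return "accepted"


print(apply_config_change(time(9, 15)))  # rejected: AXIOM config is frozen during market hours
print(apply_config_change(time(14, 0)))  # accepted
```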

How to use this portal. This is your single go-to library for everything about Paras, written for understanding: plain English first, then precise. Each lesson cites its source docs so you can always check it against the specs of record (docs/00–09 and CLAUDE.md). Start here, then branch into the deeper lessons. Two terms to fix in your head now: a 'model' always means an edge-class trading model (a strategy), never an AI model; and the 'data plant' (the always-on engine, canonically the 'Paras DataPlant') is the run-mode of SENTINEL that owns ingestion. PULSE only watches it, while the SENTINEL dashboard is the on-demand research lab you open when you want to do research.

Every task starts by reading
docs/00_MASTER_PLAN.md §2 (the Constitution) + tracking/STATE.md
PULSE
Always-on tray monitor; read-only heartbeat of the data plant (Sponaitech.Pulse)
SENTINEL
Research laboratory: data, backtests, deflation gates, 3-engine parity, nightly Claude loop, dashboard
AXIOM
Execution fortress: live Topstep trading via a deterministic compliance engine + a veto-only AI sidecar
The kernel
Pure shared C# library — the one place SENTINEL and AXIOM share, so backtest == live
Seed instruments
ES, CL, GC (config-driven; expansion to equities/FX/crypto is a config profile, not a rewrite)
Stack
.NET 8 · C# 12 · WPF (all three apps) · DuckDB · LEAN (pinned) · Ollama local models
Where lessons cite from
docs/00–09, CLAUDE.md, tracking/DECISIONS.md (locked decisions)
The one idea to carry out of this lesson

Paras adapts by saying NO more often — never by quietly becoming a different system. Live strategy parameters are immutable. Adaptation means suppression, retirement, and brand-new pre-registered hypotheses through the full gates. A system that re-fits itself in a drawdown is lying to you; Paras is built so it structurally cannot. (Constitution Principle 11)

What to stay aware of
  • Paras's purpose is to KILL bad strategies cheaply, not to find winners — treat an impressive backtest as a suspect, never a discovery (Principle 1).
  • Only AXIOM ever emits an order; SENTINEL never touches a trading venue. Keep that boundary sacred.
  • No LLM is ever in the order path. Stops, targets, sizing, and the flatten deadline are deterministic code; the AI sidecar can only say NO (Principles 4 & 5).
  • The kernel must stay pure — the moment a venue SDK, HTTP client, UI, LLM call, Topstep value, or wall-clock read leaks in, parity (Principle 3) is broken.
  • Topstep rule values live in config/topstep-rules.json and must be verified against the live rulebook before any eval/live order — never hardcode them (Principle 9).
  • Live strategy parameters are immutable. 'Fixes' go through new spec versions and the full gates; the system adapts by suppression, not by quietly re-fitting (Principle 11).
  • Every backtest run records a trial in the ledger and counts against a pre-registered budget; re-rolls face a higher DSR hurdle (Principle 2).
  • Budget reality: one personal Windows machine, no VPS/VPN, <= $200/mo, $0 marginal AI at runtime (local Ollama only) — if a runtime step 'needs' a cloud model, the step is wrong (Principle 10; D-069).
  • Vocabulary: 'model' = an edge-class trading model (a strategy), NEVER an AI model. 'DataPlant' = the always-on ingestion engine (D-091); the SENTINEL dashboard is the on-demand research lab.
  • Every lesson in this portal cites its source docs (docs/00–09, CLAUDE.md, DECISIONS.md) so claims can be checked against the specs of record — if a detail isn't in the docs, it isn't trusted.

Locked decisions & the why

D-084
All three apps (PULSE, SENTINEL, AXIOM) are Windows WPF (.NET 8) desktop apps; the Next.js/React output from Claude Design is a UI/UX reference prototype only, never a runtime dependency.
Why: Standardizing on one UI stack is the right solo-operator simplification (matches D-073's WPF-over-WinUI reasoning); it supersedes the older 'AXIOM stays WinUI 3' and 'SENTINEL dashboard = Next.js' lines. ASP.NET Core remains only as a headless local ledger API. Kernel purity (D-003) is unaffected.
D-003
The kernel is a pure library — no venue SDKs, HTTP, UI, LLM calls, Topstep values, or wall-clock reads (clock is injected).
Why: Purity is what guarantees Parity by Construction (Principle 3): the same kernel can backtest in SENTINEL and trade in AXIOM only if it carries no environment-specific dependencies. It is also why the kernel is deterministically testable.
D-091
The headless always-on ingestion process is canonically named the 'DataPlant' (operator-facing 'Paras DataPlant') — the run-mode Sponaitech.Sentinel --run, the single DuckDB writer.
Why: Names the always-on engine distinctly from the on-demand SENTINEL research dashboard (the operator's 'why do we need PULSE / what is the always-on thing' clarification): DataPlant = always-on engine; SENTINEL dashboard = on-demand research lab; PULSE = the DataPlant's read-only tray watcher.
D-069
External-framework adoption policy + standing DO-NOTs: no LLM order/sizing/cancellation authority ever; no bar-level LLM calls (session cadence is the runtime ceiling); zero new API token spend at runtime (local Ollama only).
Why: Keeps Constitution Principles 4, 5, and 10 enforceable: the order path is deterministic, the AI is a reasoner/veto only, and runtime AI is $0 marginal cost on the local machine.
D-094
Market-model architecture is asset-class-agnostic and per-instrument (HARD RULE, build M2+): session geometry, holiday/early-close calendars, and bar-expectation logic must be per-instrument data-driven market profiles, never hardcoded to CME-futures assumptions.
Why: US equities, FX, and crypto are explicitly in-scope future asset classes; adding one must be a config + profile, not a rewrite. Motivating evidence: during the M1 soak a single CME-equity calendar over-closed the market (Juneteenth). Kernel purity (D-003) unaffected (TZ + templates injected).
D-080
OS-level default effort is ultracode-xHigh + Workflow on every substantive task; gates are graded by independent auditor subagents (never the author).
Why: Implements Constitution discipline for a solo operator: optimize for the most exhaustive correct answer, and preserve 'the author never grades their own gate' by routing every verdict through separate independent auditor agents plus measured TEST_LOG evidence and green CI.

Sources: CLAUDE.md — header touchstone line; 'What this repo is'; The Constitution (1–11); Hard rules; Stack; The Paras Method (six rings); 'How this OS is organized' · docs/00_MASTER_PLAN.md §1 (What we are building — SENTINEL & AXIOM paragraphs) · docs/00_MASTER_PLAN.md §2 (The Constitution, Principles 1–11) · docs/00_MASTER_PLAN.md §3 (System context; hard boundaries — only AXIOM emits orders; AI sidecar can only reduce risk) · docs/00_MASTER_PLAN.md §4 (Repository layout — kernel, sentinel, pulse, axiom projects) · tracking/DECISIONS.md — D-003, D-069, D-080, D-084, D-091, D-094 · docs/education/README.md — portal conventions ('model' = trading model; cites docs/00–09, CLAUDE.md)
Module 02 of 19 · Foundations

📜 The Constitution (11 Principles)

The 11 load-bearing principles that make Paras an honest strategy-killer — code that violates one is wrong even if it works.

Why this page exists

Every Paras task begins by reading docs/00_MASTER_PLAN.md §2 — the Constitution. These 11 principles are not style preferences or nice-to-haves. The Master Plan states it flatly: 'These are load-bearing. Code that violates them is wrong even if it works.' This lesson walks all 11 in plain English: what each means, an example that would violate it, and why it is load-bearing — so that when a build request collides with one, you can name it, cite it, and stop.

Paras is named after the Parasmani — the touchstone of folklore that turns base metal to gold. The honest version doesn't promise gold; it touches every metal and reveals which were gold all along. Most aren't — that is the point. The Constitution is what keeps the touchstone honest. SENTINEL (research lab) and AXIOM (execution fortress) share one C# kernel, and these 11 principles govern both. Think of them as the system's risk-management rules, the same way Topstep's rules govern your funded account: break one and the result is invalid no matter how good the equity curve looks.

A useful way to hold all 11 in your head is four clusters, drawn from the Constitution quick-reference card: (1) Honesty & counting — Principles 1, 2; (2) The sacred order path — Principles 3, 4, 5; (3) Truth & immutability — Principles 7, 9, 11; (4) Restraint — Principles 6, 8, 10. We'll take them one at a time, but keep the clusters in mind — they explain why the principles defend each other.

Honesty

1 · Honest discovery over impressive results

PLAIN ENGLISH: The platform's job is to kill bad strategies cheaply, not to find a great-looking backtest. A run that produces a clean negative result is a SUCCESS. VIOLATION: A nightly loop that only surfaces top-performing cells and quietly drops the duds; tuning a screen until the headline Sharpe looks good. LOAD-BEARING: A cheap funeral for a bad strategy is the product. Optimizing for 'impressive' is exactly how a solo trader fools himself into trading noise, and the whole edifice exists to prevent that.

Counting

2 · Every trial is counted

PLAIN ENGLISH: Trial budgets are pre-registered, and every re-roll is a new trial judged against a HIGHER hurdle (False Strategy Theorem / Deflated Sharpe Ratio). VIOLATION: Running a backtest without writing a trials row to the ledger; re-running a failed strategy with tweaked params and resetting the hurdle as if it were the first attempt. LOAD-BEARING: If you can quietly try 50 variants and report the one that passed, you will always 'find' an edge — in pure noise. Counting every trial and raising the DSR hurdle on re-rolls is the only honest defense against this.
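
Why the hurdle must rise with the trial count can be made concrete with the expected-maximum approximation commonly cited alongside the Deflated Sharpe Ratio: the best Sharpe you should expect from N trials of pure noise grows with N. The sketch below is in Python for illustration; whether Paras computes its hurdle in exactly this form is not stated in the source, and `sharpe_std` (the spread of Sharpe estimates across trials) is an input assumed here.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Approximate best Sharpe expected from n_trials of pure noise."""
    if n_trials < 2:
        return 0.0  # a single trial has no selection bias to deflate
    z = NormalDist().inv_cdf
    return sharpe_std * (
        (1 - EULER_GAMMA) * z(1 - 1 / n_trials)
        + EULER_GAMMA * z(1 - 1 / (n_trials * math.e))
    )


# The noise-only hurdle climbs as more variants are tried.
for n in (2, 10, 50):
    print(n, round(expected_max_sharpe(n, sharpe_std=1.0), 2))
```

Resetting the count after a failed run would hold the hurdle at its lowest value while the real search kept widening, which is exactly the violation described above.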

Order path

3 · Parity by construction

PLAIN ENGLISH: The code that TRADES is the same code that BACKTESTED it — one shared C# kernel — and independent engines (TradeStation, LEAN) audit that the kernel is right. VIOLATION: AXIOM computing a signal with slightly different live logic than SENTINEL's backtest; a 'quick fix' in the live path that never went back through the kernel. LOAD-BEARING: A backtested edge is worthless if the live code differs even subtly — you'd be trading an unvalidated system. An unexplained delta between the three engines is a STOP-EVERYTHING event.
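
A parity check of this kind can be sketched as a comparison of one metric across the three engines. The Python below is illustrative only: the function name, the choice of metric, the numbers, and the tolerance are all assumptions, since the source does not give the actual comparison rule.

```python
def parity_verdict(pnl_by_engine: dict[str, float], tolerance: float) -> str:
    """Compare one metric across engines; an unexplained gap halts everything."""
    spread = max(pnl_by_engine.values()) - min(pnl_by_engine.values())
    return "AGREE" if spread <= tolerance else "STOP-EVERYTHING"


# Illustrative numbers only; the tolerance is not a documented value.
results = {"kernel": 1250.00, "TradeStation": 1250.00, "LEAN": 1237.50}
print(parity_verdict(results, tolerance=0.01))   # STOP-EVERYTHING
print(parity_verdict(results, tolerance=25.00))  # AGREE
```

The second call shows why the tolerance itself deserves scrutiny: widening it until the engines 'agree' is a way of hiding a delta, not explaining it.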

Order path

4 · Deterministic execution

PLAIN ENGLISH: No LLM anywhere in the order path. Stops, targets, sizing, and flatten deadlines are plain code. VIOLATION: Asking Gemma 'should I widen the stop?' inside the live trading loop; letting any model output a price, size, or order decision. LOAD-BEARING: Models are non-deterministic and can hallucinate. The order path must behave identically every time and be fully auditable. This is what makes your documented failure mode — discretionary oversizing after a win — architecturally impossible.

Order path

5 · LLMs are reasoners and veto filters, never forecasters or order managers

PLAIN ENGLISH: The AI sidecar may only veto / narrate / resist / reflect. It can never place, size, modify, cancel, or APPROVE — and it never predicts price. VIOLATION: An LLM that says 'looks like a strong long, go ahead' and that approval reaching execution; a model used to forecast direction. LOAD-BEARING: An AI that can only reduce risk (say 'no') can never blow up the account. The moment a model can approve or forecast, you've handed a non-deterministic system authority over real money. The sidecar is a conscience, not a trader.
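
The 'can only say no' property has a simple logical shape: the sidecar's input is combined with the deterministic decision by AND-NOT, so it can remove an entry but never create one. A minimal Python sketch follows; the names are hypothetical and the real AXIOM wiring is not shown in the source.

```python
from enum import Enum


class SidecarVerb(Enum):
    """The only four things the AI sidecar may do (per Principle 5)."""

    VETO = "veto"
    NARRATE = "narrate"
    RESIST = "resist"
    REFLECT = "reflect"


def entry_allowed(compliance_ok: bool, sidecar_veto: bool) -> bool:
    """The sidecar can turn a yes into a no, but never a no into a yes."""
    return compliance_ok and not sidecar_veto


# Exhaustive check: without compliance approval, no sidecar input yields an entry.
for veto in (False, True):
    assert entry_allowed(compliance_ok=False, sidecar_veto=veto) is False

print(entry_allowed(compliance_ok=True, sidecar_veto=False))  # True
print(entry_allowed(compliance_ok=True, sidecar_veto=True))   # False
```

Note that the enum has no APPROVE member at all, so an approval cannot even be expressed, let alone reach execution.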

Restraint

6 · Costs are first-class

PLAIN ENGLISH: Friction (commissions, slippage, spread) is modeled PESSIMISTICALLY until Phase-0 calibration replaces assumptions with real measured fills. VIOLATION: A backtest that assumes mid-price fills or zero slippage; relaxing cost assumptions before A4 shadow-fill data justifies it. LOAD-BEARING: Most retail 'edges' are real gross but negative net once true costs are charged. Pessimistic costs by default is how Paras avoids promoting a strategy that only works on paper.
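
The gross-versus-net point is easiest to see with arithmetic. The sketch below, in Python, charges a round-trip commission plus slippage per contract; the class name and every number in it are placeholders chosen for illustration, not calibrated Paras values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrictionModel:
    """Per-contract round-trip costs; all numbers used below are placeholders."""

    commission_usd: float
    slippage_ticks: float
    tick_value_usd: float

    def net_pnl(self, gross_pnl_usd: float, contracts: int) -> float:
        per_contract = self.commission_usd + self.slippage_ticks * self.tick_value_usd
        return gross_pnl_usd - contracts * per_contract


# Assumed values: 2 ticks of slippage at a $12.50 tick, $5.00 commission.
pessimistic = FrictionModel(commission_usd=5.0, slippage_ticks=2.0, tick_value_usd=12.5)
print(pessimistic.net_pnl(gross_pnl_usd=20.0, contracts=1))  # -10.0
```

A trade that looks like a $20 winner before costs is a $10 loser after them, which is how an edge can be real gross and negative net.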

Truth

7 · One source of truth per concern

PLAIN ENGLISH: Each kind of data has exactly ONE home — Bars: DuckDB. Decisions: the experiment ledger. Lessons: ops/learnings.md (append-only). VIOLATION: Caching bars in a second store that can drift; logging research decisions in two places that disagree. LOAD-BEARING: Duplicated state silently diverges, and then you can no longer trust any of it. One writer, one source — so 'what happened' always has a single, auditable answer.

Restraint

8 · Suppression beats optimization

PLAIN ENGLISH: When in doubt, DON'T trade. Regime gates and in-play filters decide WHEN NOT TO. You tune the 'no', not the parameters. VIOLATION: Reaching for a parameter knob to rescue a struggling strategy instead of adding a suppression gate; trading through a regime the strategy was never validated in. LOAD-BEARING: Over-fitting parameters is how edges die; choosing not to trade in unfavorable conditions is robust. The cheapest, safest improvement is almost always to trade less, not to re-tune.
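
'Tune the no, not the parameters' can be sketched as a gate that sits in front of an unchanged strategy signal. The Python below is a hypothetical illustration: `MarketState`, its fields, and the volatility threshold are assumptions, not the regime gates Paras defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketState:
    """Inputs a regime gate might look at (fields are hypothetical)."""

    realized_vol: float
    in_validated_regime: bool


def regime_gate(state: MarketState, max_vol: float) -> bool:
    """True only when conditions match what the strategy was validated in."""
    return state.in_validated_regime and state.realized_vol <= max_vol


def gated_signal(raw_signal: int, state: MarketState, max_vol: float) -> int:
    """The gate can only zero out a signal; it never alters strategy parameters."""
    return raw_signal if regime_gate(state, max_vol) else 0


calm = MarketState(realized_vol=0.12, in_validated_regime=True)
wild = MarketState(realized_vol=0.45, in_validated_regime=True)
print(gated_signal(1, calm, max_vol=0.30))  # 1
print(gated_signal(1, wild, max_vol=0.30))  # 0
```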

Truth

9 · Topstep rules are encoded, versioned config — not constants

PLAIN ENGLISH: Every Topstep limit lives in config/topstep-rules.json and is version-bumped on the weekly rulebook re-check — never hard-coded. (Four major rule changes hit Nov 2025–Feb 2026.) VIOLATION: Writing a daily-loss limit as a literal in C#; assuming last month's rules still hold. LOAD-BEARING: A stale or hard-coded limit silently violates your funded account and ends the Combine. Rules-as-config means a rule change is a one-line config bump that activates at the next PRE_FLIGHT, not a code hunt.
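
Rules-as-config means the limit is read at load time and the check never names a number. A minimal Python sketch, with loud caveats: the JSON field names and the values are placeholders invented for this example, not the real schema or contents of config/topstep-rules.json, and treating a loss exactly at the limit as a breach is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TopstepRules:
    """Typed view of the rules file; field names are assumed, not the real schema."""

    version: str
    daily_loss_limit_usd: float


def load_rules(raw_json: str) -> TopstepRules:
    data = json.loads(raw_json)
    return TopstepRules(
        version=str(data["version"]),
        daily_loss_limit_usd=float(data["daily_loss_limit_usd"]),
    )


def breaches_daily_loss(day_pnl_usd: float, rules: TopstepRules) -> bool:
    """The limit comes from config, so a rule change never touches this code."""
    return day_pnl_usd <= -rules.daily_loss_limit_usd


# Placeholder content standing in for config/topstep-rules.json.
rules = load_rules('{"version": "example-1", "daily_loss_limit_usd": 1000}')
print(rules.version, breaches_daily_loss(-1200.0, rules))  # example-1 True
```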

Restraint

10 · Local-first, budget-true

PLAIN ENGLISH: Everything runs on this one personal machine — Topstep BANS VPS/VPN. Zero marginal AI token cost (Claude Max plan + local Gemma). Total operating budget stays at or under $200/mo. VIOLATION: Spinning up a cloud VPS to run the bot; a runtime step that 'needs' a paid cloud model. LOAD-BEARING: A VPS gets your funded account banned, and runaway cloud spend kills a one-person, budget-conscious operation. The constraint is a feature: it forces a system that is cheap, portable, and rule-compliant by design.

Immutability

11 · The system adapts by saying no more often — never by quietly becoming a different system

PLAIN ENGLISH: Live strategy parameters are NEVER re-fit in response to a drawdown. Adaptation = suppression, retirement, or a NEW pre-registered hypothesis through the full G0–G6 gates. VIOLATION: A drawdown triggers a quiet parameter tweak in the live strategy 'to fix it'. LOAD-BEARING: Re-fitting live params to recent losses is curve-fitting to noise dressed up as 'improvement' — and it means you're now trading an unvalidated system you never tested. The only honest responses to underperformance are: suppress it, retire it, or run a brand-new spec through every gate.
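
Immutability is something a type can enforce rather than a habit to remember. The Python sketch below uses a frozen dataclass as a stand-in; the class and field names are illustrative, and the real kernel would express the same idea in C#.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class LiveParams:
    """Parameters of a promoted strategy (names are illustrative)."""

    spec_version: int
    stop_ticks: int
    target_ticks: int


live = LiveParams(spec_version=1, stop_ticks=8, target_ticks=16)

try:
    live.stop_ticks = 12  # a "quiet fix" during a drawdown
except FrozenInstanceError:
    print("rejected: live parameters are immutable")

# The only way to change a value is a new spec version, which starts at G0 again.
candidate = replace(live, spec_version=2, stop_ticks=12)
print(live.stop_ticks, candidate.spec_version)  # 8 2
```

The original object is untouched; the changed value exists only on a new version that has not yet passed any gate.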

| # | Principle (the tell) | Cluster | Traces to / gate |
|---|---|---|---|
| 1 | A clean negative is a win: kill bad strategies cheaply | Honesty | 00 §2.1; gates G0–G6 |
| 2 | No untracked runs; re-rolls raise the DSR hurdle, never reset it | Counting | 00 §2.2; G3 (DSR>0.95, PBO<0.20) |
| 3 | Same kernel live + backtest; engines agree or STOP-EVERYTHING | Order path | 00 §2.3; G4 parity court |
| 4 | Compliance + execution are pure code; no model in the order path | Order path | 00 §2.4; AXIOM compliance |
| 5 | AI may only veto / narrate / resist / reflect, never approve or forecast | Order path | 00 §2.5; AXIOM sidecar |
| 6 | Pessimistic costs by default; only A4 shadow fills earn relaxation | Restraint | 00 §2.6; A4 calibration |
| 7 | Bars→DuckDB, decisions→ledger, lessons→learnings.md; one writer each | Truth | 00 §2.7 |
| 8 | Tune the 'no'; add a suppression gate before reaching for a knob | Restraint | 00 §2.8; regime gates |
| 9 | Limits live in config/topstep-rules.json; weekly re-check, never constants | Truth | 00 §2.9; D-040/D-063 |
| 10 | One machine, no VPS/VPN, $0 marginal AI, ≤ $200/mo | Restraint | 00 §2.10; D-041 |
| 11 | Live params immutable; adaptation = suppress / retire / new spec through gates | Immutability | 00 §2.11; G0–G6 |

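
The G3 thresholds quoted in row 2 of the table translate directly into a pass/fail check. This Python sketch takes the two thresholds from the table; the function name and the idea of a single boolean verdict are assumptions made for illustration.

```python
DSR_MIN = 0.95  # from the table: G3 requires DSR > 0.95
PBO_MAX = 0.20  # from the table: G3 requires PBO < 0.20


def passes_g3(dsr: float, pbo: float) -> bool:
    """Both conditions are strict inequalities, as written in the table."""
    return dsr > DSR_MIN and pbo < PBO_MAX


print(passes_g3(dsr=0.97, pbo=0.10))  # True
print(passes_g3(dsr=0.95, pbo=0.10))  # False
print(passes_g3(dsr=0.97, pbo=0.35))  # False
```

A result that lands exactly on a threshold fails, and the honest response is a recorded FAIL rather than a nudged constant.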
The sacred order path (3 + 4 + 5 together)

Principles 3, 4, and 5 defend the same thing from three angles and are best memorized as a unit. Parity (3) guarantees the live code IS the validated code. Determinism (4) guarantees that code runs identically and auditably every time. The AI leash (5) guarantees no non-deterministic model can ever touch an order. Together they make the path from signal to fill fully validated, fully repeatable, and AI-proof — which is the architectural reason your worst trading habit (oversizing after a win) cannot happen in AXIOM.

1. Stop

When a build request — yours or an agent's — collides with a principle, halt before writing the code. Do not silently comply (docs/00 §5, §6).

2. Name the principle

State the number and the verbatim intent from docs/00 §2. Precision matters — 'this violates Principle 4 (deterministic execution)' is actionable; 'this feels wrong' is not.

3. Cite the doc section

Point to the owning section (00 §2.x) and any related hard rule in CLAUDE.md or locked decision (D-xxx). The Constitution is not relitigated inside build sessions.

4. Flag to the operator

Surface the conflict to Satya and stop. A red gate is a recorded FAIL — never weaken a threshold or re-roll to green; a determinism or parity delta is STOP-EVERYTHING.

These are never relitigated

The 11 principles are constitutional (DECISIONS.md: 'Locked … Not subject to build-session change'). You don't argue with them mid-build; you obey them or you stop and flag. New ideas that brush against a principle go through a new pre-registered hypothesis and the full G0–G6 gates — they never get a quiet exception. If an agent ever proposes weakening a threshold to make a gate pass, that itself is a Principle-1/-2 violation.

Where the Constitution lives
docs/00_MASTER_PLAN.md §2 (verbatim, of record); restated in CLAUDE.md; quick card in tracking/memory/constitution.md
Status
Load-bearing — code that violates one is wrong even if it works (00 §2 intro)
Locked
Constitutional in DECISIONS.md; not subject to build-session change; never relitigated
How they're enforced
Gate ladder G0–G6, the 3-engine parity court, the PreToolUse Constitution guardrail hook (advisory, D-061), and the /constitution command
When in conflict
Stop → name the principle → cite the section → flag the operator (never silently comply)
What to stay aware of
  • The Constitution is load-bearing: code that violates a principle is wrong even if the backtest looks great. 'It works' is not a defense (docs/00 §2 intro).
  • The sacred order path is Principles 3 + 4 + 5 together — same-kernel signals, deterministic execution, AI only ever says no. Treat any change touching live order logic as constitutional.
  • An unexplained delta in the 3-engine parity court (kernel / TradeStation / LEAN) is a STOP-EVERYTHING event, not a bug to defer (Principle 3 / G4).
  • Every backtest writes a trials row — there are no untracked runs, and re-rolls face the higher DSR hurdle, never a reset (Principle 2).
  • Never hard-code a Topstep limit; values live only in config/topstep-rules.json and are re-verified weekly (Principle 9 / D-063).
  • A drawdown never justifies re-fitting live parameters. The only honest responses are suppress, retire, or run a NEW spec through the full G0–G6 gates (Principle 11).
  • These principles are constitutional and never relitigated inside a build session — if a request collides with one, stop, name it, cite the section, and flag the operator (DECISIONS.md; docs/00 §5, §6).
  • Suppression beats optimization (Principle 8): when a strategy struggles, the cheapest correct move is usually to trade less, not to turn a parameter knob.

Locked decisions & the why

D-001
One shared C# kernel — SENTINEL backtests it, AXIOM trades it; the code that trades is the code that backtests.
Why: This IS Principle 3 (parity by construction) made architectural. A backtested edge only transfers to live if the exact same code produces both — so parity is built in, not hoped for. (DECISIONS.md, docs/01)
D-003
The kernel is a pure library — no venue SDKs, HTTP, UI, LLM calls, Topstep values, or wall-clock reads (clock is injected).
Why: Keeps the validated trading core deterministic and isolated (Principles 3, 4) — nothing non-deterministic or external can leak into the code that both backtests and trades. (DECISIONS.md, docs/01 §7)
D-005
No LLM in any order path; the AI sidecar can only veto / narrate / resist / reflect.
Why: Direct codification of Principles 4 and 5 — a non-deterministic model can never place, size, modify, or approve an order, only reduce risk by saying no. (DECISIONS.md, docs/04 §3)
D-030
Gate ladder G0→G6 with fail-routing; every run is a counted trial; the budget gate is code, not Claude.
Why: Operationalizes Principles 1 and 2 — strategies are killed cheaply through staged gates, and every trial is mechanically counted in the ledger so re-rolls face the higher DSR hurdle. (DECISIONS.md, docs/02 §3)
D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run; pure pattern-mining is banned.
Why: Enforces Principle 1 (honest discovery) at the front door — you can't data-mine a curve and call it an edge; a causal mechanism must be declared before a single trial is spent. (DECISIONS.md, docs/02 §3)
D-040
Topstep rule values are versioned config (config/topstep-rules.json), re-checked on a cadence.
Why: This IS Principle 9 — four major Topstep rule changes Nov 2025–Feb 2026 made hard-coded limits a live-account hazard; config-as-rules lets a change be a one-line bump. (DECISIONS.md)
D-063
Topstep rulebook re-check cadence raised from monthly to WEEKLY, folded into the calendar-news check (/verify-topstep-rules).
Why: Strengthens Principle 9 in practice — given how often Topstep rules move, a weekly verified re-check with a dated log entry keeps the config provably current before any eval/live order. (DECISIONS.md)
D-041
Automation only on Combine / Express Funded; hard-blocked on Live Funded; no VPS/VPN; everything on the personal machine.
Why: Codifies Principle 10 (local-first, budget-true) and the hard boundary — Topstep bans VPS/VPN, so a cloud deploy would get the account banned; everything stays on the one local machine. (DECISIONS.md, docs/00 §3)
D-080
Failure doctrine + independent grading: a red gate is a recorded FAIL; gates are graded by independent auditor subagents, never the author; never weaken a threshold or re-roll to green.
Why: Protects Principles 1 and 2 from the author's own optimism — the person who built a gate can't pass it, and a fail is recorded honestly rather than threshold-shopped into a green. (DECISIONS.md, supersedes D-070)
Sources: docs/00_MASTER_PLAN.md §2 — The Constitution (Principles 1–11), verbatim and of record · docs/00_MASTER_PLAN.md §1 — what SENTINEL and AXIOM are · docs/00_MASTER_PLAN.md §3 — system context and hard boundaries (only AXIOM emits orders; AI can only reduce risk) · docs/00_MASTER_PLAN.md §6 — CLAUDE.md template / hard rules · CLAUDE.md — The Constitution restatement + Hard rules (never violate) · tracking/DECISIONS.md — Constitutional section + D-001, D-003, D-005, D-030, D-031, D-040, D-041, D-063, D-080 · tracking/memory/constitution.md — Constitution quick-reference card (in-practice tells + traceability)
Module 03 of 19 · Foundations

⛔ The Hard Rules (Never Violate)

The short list of construction-level bans that no amount of clever code or good intentions is allowed to override — the rails that keep Paras from quietly becoming a different, more dangerous system.

Read this first

The Constitution (11 principles) is the philosophy — the why. The Hard Rules are the philosophy made enforceable: the concrete, never-violate bans that turn principle into something a reviewer (or an auditor subagent) can point at and say NO. CLAUDE.md states it bluntly: code that violates these is wrong even if it works. They are not style preferences. They are the load-bearing walls.

You are building a system that will one day place real orders on a funded Topstep account with your own money on the line. The danger is never the obvious bug — it is the reasonable-looking shortcut: a quick LLM call to 'improve' a sizing decision, a Topstep dollar amount hard-coded 'just for now', a backtest engine that drifts a little away from the live engine. Each of those is individually defensible and collectively fatal. The Hard Rules exist so that none of them is ever a judgment call. They are pre-decided. This lesson walks each rule: what it bans, why it exists, how it is enforced, and what to watch for.

All of these come straight from the 'Hard rules (never violate)' section of CLAUDE.md, and each traces to a locked decision (D-xxx) in tracking/DECISIONS.md. Nothing here is invented — if a detail isn't in those docs, it isn't on this page.

D-005 / Principle 4

1. No LLM in the order path

No LLM calls in any order path. Compliance and execution are deterministic. Stops, targets, sizing, and the flatten deadline are plain code — never a model's output. The code that decides whether an order is allowed and where the stop goes contains zero AI calls.

D-005 / Principles 4-5 / docs/04 §3

2. The AI sidecar may only veto / narrate / resist / reflect

The runtime AI sidecar has exactly four verbs: veto an entry, narrate what happened, resist an override attempt, reflect after the close. It may NEVER approve, place, size, or modify. Every model in the system can only ever reduce risk.

D-003 / Principle 3 / docs/01 §7

3. The kernel is a pure library

The shared C# kernel contains no venue SDKs, no HTTP clients, no UI, no LLM calls, no Topstep values, and no wall-clock reads (the clock is injected). It is pure deterministic math and logic — the same code that backtests is the code that trades.

D-040 / Principle 9

4. Topstep values are config, not constants

Topstep rule values (loss limits, targets, contract caps) live in config/topstep-rules.json, never as numbers baked into code. The file is transcribed from spec and the operator must verify it against the live rulebook before any eval or live order.

D-069

5. Zero runtime token spend + no bar-level LLM calls

Runtime inference is local Ollama only — zero new API token spend at runtime. And no bar-level LLM calls: session cadence is the runtime ceiling for any model invocation. A model never fires per-bar, and it never costs a cloud token while trading.

D-071 / docs/09

6. The Continuity Rule

Every irreplaceable artifact has a nightly VERIFIED local backup; the migration runbook stays current; any new secret or machine-bound dependency is inventoried in the same change; no absolute paths in code; the restore path is drilled monthly. A backup that has never been restored is a hope, not a backup.

Now the deep dive — each rule with its mechanism and the concrete failure it prevents.

Rule 1 — No LLM in the order path (D-005, Principle 4)

WHAT: From the moment a signal exists to the moment an order is placed, modified, cancelled, or a position is flattened, there are zero LLM calls. Stops, targets, position size, and the end-of-session flatten deadline are all deterministic code.

WHY: An LLM is non-deterministic, can hallucinate, can be slow, and can be coaxed. None of those are acceptable on the path that moves real money. You also cannot replay or prove an LLM-driven order the way you can prove a line of code.

HOW: The compliance engine and execution are plain C#; the AI lives beside that path as advisory context, never inside it.

WATCH FOR: Any feature framed as 'let the model pick the size' or 'let the AI choose the stop'. That is this rule, violated. As a futures trader you know the order path is where ruin happens; this rule keeps the unpredictable thing out of it.

Rule 2 — The sidecar may only veto / narrate / resist / reflect (D-005, docs/04 §3)

WHAT: The local runtime AI has four jobs and four only. It can VETO an entry (say no to a trade), NARRATE what just happened (status text), RESIST an override (talk you out of unlocking something it cannot unlock), and REFLECT after the close (write a session note). It can never approve a trade, place an order, set a size, or modify anything.

HOW: The boundary table in docs/04 §3 is enforced by construction: the veto returns at most {"veto": true/false}, and even a missing or slow veto can never CREATE or ENLARGE a trade (the absence of a 'no' is not a 'yes').

WHY: The runtime AI's failure mode is greed, override, and tilt, so it is bounded so every model can only ever subtract risk, never add it.

WATCH FOR: The asymmetry. Research-time Claude proposes (gate-bounded); the runtime sidecar can only say no (construction-bounded). Two AIs, two clocks, one asymmetry.
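A minimal sketch of the "can only subtract risk" boundary, written in Python for brevity rather than the C# the system uses. The `Decision` shape, the `apply_veto` name, and the `block_when_missing` switch are all hypothetical; in particular, how a missing or timed-out veto is resolved is an assumption of this sketch, so it is exposed as a parameter instead of being decided here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """Output of the deterministic compliance path (hypothetical shape)."""
    allowed: bool
    contracts: int


def apply_veto(decision: Decision, response: Optional[dict],
               block_when_missing: bool = True) -> Decision:
    """Combine a deterministic decision with the sidecar's veto.

    The sidecar can turn an allowed entry into a blocked one and nothing else.
    It can never approve a blocked entry and never change the size upward.
    """
    if not decision.allowed:
        # Nothing the sidecar says can create a trade.
        return decision

    # Only the single boolean is read; every other key is ignored.
    veto = response.get("veto") if isinstance(response, dict) else None
    if isinstance(veto, bool):
        vetoed = veto
    else:
        # Missing, slow, or malformed response: policy is an assumption here.
        vetoed = block_when_missing

    if vetoed:
        return replace(decision, allowed=False, contracts=0)
    return decision
```

Because the function only ever returns the original decision or a blocked copy, an extra key such as "contracts" in the response has no way to reach the order size.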

Rule 3 — The kernel is a pure library (D-003, Principle 3, docs/01 §7)

WHAT: The shared kernel (the math and logic that both SENTINEL backtests and AXIOM trades) imports no venue SDKs, no HTTP clients, no UI framework, no LLM client, no Topstep numbers, and never reads the wall clock (time is passed in / injected).

WHY: This is how parity-by-construction (Principle 3) is achieved. If the kernel could read the real clock or call the network, the backtest and the live run would diverge, and 'the code that trades is the code that backtests' would be a lie. Purity is what makes a backtest a trustworthy prediction of live behavior.

HOW: The kernel is a separate project with a deliberately starved dependency list; the clock is an injected interface so tests and live runs feed it the time.

WATCH FOR: The tempting import: 'I just need an HttpClient here' or 'let me grab DateTime.Now' inside the kernel. Both break purity. Anything impure lives in SENTINEL or AXIOM around the kernel, never in it.
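The injected-clock idea can be sketched in a few lines. This is an illustration in Python, not the kernel's actual C# interface: `Clock`, `FixedClock`, `SystemClock`, and `past_flatten_deadline` are hypothetical names, and the deadline value used in the usage note is made up.

```python
from datetime import datetime, time, timezone
from typing import Protocol


class Clock(Protocol):
    """The only view of time the kernel-style code is given."""
    def now(self) -> datetime: ...


class FixedClock:
    """Test/backtest clock: returns exactly the instant it was handed."""
    def __init__(self, instant: datetime) -> None:
        self._instant = instant

    def now(self) -> datetime:
        return self._instant


class SystemClock:
    """Live clock. It belongs in the host app, outside the kernel."""
    def now(self) -> datetime:
        return datetime.now(timezone.utc)


def past_flatten_deadline(clock: Clock, deadline_utc: time) -> bool:
    """Pure logic: the result depends only on the injected time."""
    return clock.now().time() >= deadline_utc
```

With `FixedClock(datetime(2026, 1, 5, 20, 54, tzinfo=timezone.utc))` and a hypothetical deadline of `time(20, 55)`, the function returns False; one minute later it returns True. The same function runs unchanged under `SystemClock`, which is the point of keeping the wall-clock read out of the kernel.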

Rule 4 — Topstep values are config, not constants (D-040, Principle 9)

WHAT: Every Topstep rule value (daily loss limit, max loss limit, profit target, contract caps, consistency rules) lives in config/topstep-rules.json, never hard-coded.

WHY: Topstep changes its rules, and the rules differ by account type (Combine vs Express Funded × 50K/100K/150K). A number baked into code goes stale silently and can blow your account by enforcing yesterday's limit. Config is versioned, re-checkable, and account-type driven (D-064).

HOW: AXIOM resolves the active account profile and applies its rules at PRE_FLIGHT; rule changes activate at the next pre-flight, never mid-session.

CRITICAL: The file is transcribed from spec. YOU, the operator, must verify it against the live Topstep rulebook before any eval or live order. The /verify-topstep-rules skill runs this WEEKLY (D-063) and appends a dated verification log even when nothing changed.
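A rough sketch of resolving rules from config once, at pre-flight, instead of reading constants. The JSON schema, key names, profile id, and every number below are placeholders invented for illustration; none of them is a real Topstep value or the real layout of config/topstep-rules.json.

```python
import json
from dataclasses import dataclass

# Stand-in for the file contents. Schema and numbers are PLACEHOLDERS,
# not real Topstep values.
EXAMPLE_RULES_JSON = """
{
  "version": "placeholder-1",
  "profiles": {
    "example-profile": {
      "daily_loss_limit": 1.0,
      "max_loss_limit": 2.0,
      "max_contracts": 1
    }
  }
}
"""


@dataclass(frozen=True)
class RuleSet:
    version: str
    daily_loss_limit: float
    max_loss_limit: float
    max_contracts: int


def resolve_rules(raw_json: str, account_profile: str) -> RuleSet:
    """Resolve the active profile once; the result is frozen for the session."""
    doc = json.loads(raw_json)
    profiles = doc["profiles"]
    if account_profile not in profiles:
        # An unknown profile must stop pre-flight rather than fall back to a guess.
        raise KeyError(f"no rules for account profile {account_profile!r}")
    p = profiles[account_profile]
    return RuleSet(
        version=doc["version"],
        daily_loss_limit=float(p["daily_loss_limit"]),
        max_loss_limit=float(p["max_loss_limit"]),
        max_contracts=int(p["max_contracts"]),
    )


def exceeds_contract_cap(rules: RuleSet, requested_contracts: int) -> bool:
    """The check reads the cap from the resolved rules, never from a literal."""
    return requested_contracts > rules.max_contracts
```

Returning a frozen `RuleSet` mirrors the "next pre-flight, never mid-session" behavior: changing the file has no effect until the rules are resolved again.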

Rule 5 — Zero runtime token spend + no bar-level LLM calls (D-069)

WHAT: Two bans in one. (a) Zero new API token spend at runtime: all runtime inference is local Ollama only; if a step 'needs' a cloud model while trading, the step is wrong. (b) No bar-level LLM calls: session cadence is the runtime ceiling for any model invocation, so a model never fires once per bar.

WHY (cost): A per-bar cloud call would be both unbounded spend and a latency bomb, breaking the budget-true principle (≤ $200/mo, $0 marginal AI).

WHY (discipline): Bar-frequency model calls drag an unpredictable, non-replayable component into the fast loop. The sidecar runs at session cadence on local models so it stays $0 marginal and bounded.

WATCH FOR: 'Just ask the model to confirm each bar'. That is both bans at once.

A related ban (D-067/D-068): no verdict- or reflection-derived VALUE may reach execution, sizing, filters, or gates. LLM output is read-only advisory context; behavior changes go through the G0-G6 gates.
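One way to picture both bans as a single guard, sketched in Python. Everything here is hypothetical: the `SessionCadenceGuard` class, the idea of keying calls by (session, purpose), and the reading of "session cadence" as at most one call per purpose per session. The example endpoint in the usage note is Ollama's usual default local address.

```python
from urllib.parse import urlparse


class CadenceError(RuntimeError):
    """Raised when a model call would break the runtime cadence or locality ban."""


class SessionCadenceGuard:
    """Allows at most one model call per (session, purpose), local hosts only."""

    def __init__(self, allowed_hosts=("localhost", "127.0.0.1")):
        self._allowed_hosts = set(allowed_hosts)
        self._seen = set()

    def check(self, session_id: str, purpose: str, endpoint: str) -> None:
        host = urlparse(endpoint).hostname
        if host not in self._allowed_hosts:
            # A non-local endpoint would mean cloud token spend at runtime.
            raise CadenceError(f"non-local model endpoint: {endpoint}")
        key = (session_id, purpose)
        if key in self._seen:
            # A second call for the same purpose is faster than session cadence.
            raise CadenceError(f"already invoked this session: {purpose}")
        self._seen.add(key)
```

Calling `check("2026-01-05", "reflect", "http://localhost:11434")` succeeds once; a second identical call, or any call to a remote host, raises. A per-bar loop would trip the guard on its second bar.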

Rule 6 — The Continuity Rule (D-071, docs/09)

WHAT: Every irreplaceable artifact (the bars vault, the decision ledger, secrets) has a nightly VERIFIED local backup; the machine-migration runbook stays current; any new secret, prerequisite, or machine-bound dependency lands in the docs/09 §4 inventory or scripts/setup-machine.ps1 IN THE SAME CHANGE; no absolute paths or machine assumptions in code; the restore path is drilled monthly during build.

WHY: This is a one-person operation on a personal machine with no VPS. A dead disk or a new laptop must not cost you years of irreplaceable market data and trial history. 'Verified' is the load-bearing word. The doctrine's line is blunt: a backup that has never been restored is a hope, not a backup.

HOW: Tiered artifacts, a 3-2-1-adapted-local scheme (second physical disk + weekly offline copy + git remote for text), migration target under half a day with startup gap-sync as the data healer.

WATCH FOR: Adding a secret or a hard-coded path and 'documenting it later'. The rule says same change, not later.
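To make "verified" concrete, here is a small sketch that copies a file and then re-reads the copy to compare SHA-256 digests. The function names are hypothetical and this is not the project's backup script. A matching hash only proves the copy is byte-identical; it does not replace the monthly restore drill.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so large artifacts do not have to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> str:
    """Copy source into backup_dir, then verify the copy by digest.

    Both paths are supplied by the caller, so no absolute path is baked in.
    Returns the verified digest, or raises if the copy does not match.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)

    expected = sha256_of(source)
    actual = sha256_of(target)
    if expected != actual:
        raise IOError(f"backup verification failed for {source.name}")
    return actual
```

The returned digest is the kind of value a dated verification log could record, so an unverified night is visible as a missing entry rather than an assumption.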

Hard rule | What it bans | Why (the failure it prevents) | Decision
No LLM in order path | Any model call from signal → order/flatten | Non-deterministic, slow, unprovable money path | D-005
Sidecar = veto/narrate/resist/reflect only | AI approving, placing, sizing, modifying | Runtime AI greed/override/tilt; it can only subtract risk | D-005
Kernel is a pure library | SDKs, HTTP, UI, LLM, Topstep values, wall-clock in kernel | Backtest ≠ live drift; breaks parity-by-construction | D-003
Topstep values are config | Hard-coded loss limits / targets / caps | Stale rule blows the account; rules differ by account type | D-040
Zero runtime token spend | Any cloud model call at runtime | Unbounded spend; breaks budget-true ($0 marginal AI) | D-069
No bar-level LLM calls | Model invocation faster than session cadence | Latency + non-replayable component in the fast loop | D-069
No verdict→behavior value | LLM output reaching exec/sizing/filters/gates | Behavior change must pass G0-G6, not an opinion | D-067/D-068
The Continuity Rule | Unverified backups; undocumented secrets/paths | Disk death / machine loss erases irreplaceable data | D-071
A concrete walk-through — the same idea, three ways

Say you have an idea: 'use a model to tighten the stop when volatility spikes.' Watch how the rules route it. (1) At RUNTIME via a per-bar cloud call to set the stop: violates No-LLM-in-order-path, No-bar-level-calls, and Zero-token-spend — three rules at once. Hard NO. (2) At runtime via the local sidecar 'choosing' a tighter stop: still violates the order-path ban and the sidecar's four-verb limit (it cannot size/modify). NO. (3) The legal route: write it as a deterministic rule (stop = f(volatility)), express it as a strategy spec, and run it through the G0-G6 gates. If it survives, it ships as plain code in the kernel. The AI may have helped you THINK of it (research clock), but only proven deterministic code touches the trade.
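Route (3) can be pictured as an ordinary pure function. The sketch below is one hypothetical way to write 'tighten the stop when volatility spikes' as stop = f(volatility); the parameter names and default values are invented for illustration and would themselves have to survive the G0-G6 gates.

```python
def stop_distance_ticks(base_ticks: int, atr_now: float, atr_baseline: float,
                        spike_ratio: float = 1.5, tighten_factor: float = 0.5,
                        min_ticks: int = 2) -> int:
    """Deterministic stop distance: same inputs, same output, every replay.

    When current volatility is at least spike_ratio times the baseline,
    the stop distance is scaled down by tighten_factor (never below min_ticks).
    """
    if atr_baseline <= 0 or atr_now < 0:
        raise ValueError("ATR inputs must be positive (baseline) and non-negative (now)")
    if atr_now / atr_baseline >= spike_ratio:
        return max(min_ticks, int(base_ticks * tighten_factor))
    return base_ticks
```

With a 20-tick base stop, a calm market (`atr_now == atr_baseline`) keeps 20 ticks, while a 1.5x spike tightens it to 10. No model is consulted at any point, which is what makes the behavior replayable.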

1. When a change feels like it touches a hard rule, stop and route it

These are construction-level bans, not preferences. Treat any change near them as needing a deliberate check before you write a line.

2. Ask: does this put a model anywhere in the order path?

If a signal → order/modify/cancel/flatten path gains an LLM call, it is wrong. The path stays deterministic code (Rule 1).

3. Ask: is the AI doing more than veto/narrate/resist/reflect?

If the sidecar approves, places, sizes, or modifies anything, it has exceeded its four verbs (Rule 2). It can only subtract risk.

4. Ask: am I importing impurity into the kernel?

HTTP, UI, SDKs, Topstep numbers, DateTime.Now: none belong in the kernel. Put them in SENTINEL/AXIOM around it (Rule 3).

5. Ask: did I hard-code a Topstep number, or a cloud/per-bar model call?

Topstep values → config/topstep-rules.json (Rule 4). Runtime model calls → local Ollama, session cadence only, no cloud tokens (Rule 5).

6. Ask: did I add a secret, a path, or an artifact without continuity?

New secret/prerequisite/machine dependency → inventory it in the SAME change; verified nightly backup; no absolute paths (Rule 6).

7. If it still conflicts with a locked decision, flag it; do not silently comply

DECISIONS.md is explicit: if a build request conflicts with a locked decision, stop and flag it rather than silently complying.
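Part of step 4 can be mechanized. As an illustration only (this is not the project's guardrail hook), the sketch below scans kernel source text for two of the tempting imports named above. The pattern list is a hypothetical starting point; a text scan is a tripwire, not proof of purity.

```python
import re
from typing import List, Tuple

# Hypothetical starter list: two impurities the lesson names explicitly.
BANNED_IN_KERNEL = {
    "wall-clock read": re.compile(r"\bDateTime\.(Now|UtcNow)\b"),
    "HTTP client": re.compile(r"\bHttpClient\b"),
}


def find_kernel_violations(source_text: str) -> List[Tuple[int, str]]:
    """Return (line number, label) for every banned pattern found."""
    hits = []
    for line_no, line in enumerate(source_text.splitlines(), start=1):
        for label, pattern in BANNED_IN_KERNEL.items():
            if pattern.search(line):
                hits.append((line_no, label))
    return hits
```

A line such as `var t = DateTime.Now;` is reported, while a read through an injected clock such as `clock.Now()` is not, since only the static wall-clock property is matched.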

How these get enforced (you are not the only guard)

You don't have to police these by memory. The constitution-guardian and parity-auditor subagents grade changes against these rules in separate contexts during the ULTRACODE workflow. For example, the D-078 plant-crash fix was checked by constitution-guardian to confirm 'kernel untouched, no order-path change' before it shipped. A PreToolUse guardrail hook also raises advisory (non-blocking) warnings on constitution risks (D-061). The point of writing them down as HARD rules is exactly so an independent auditor, human or agent, can catch a violation the author rationalized.

The one sentence to carry

The Hard Rules are how Paras adapts by saying NO more often — never by quietly becoming a different, riskier system. The order path stays deterministic, the AI can only subtract risk, the kernel stays pure, Topstep is config, runtime is local-and-free, and nothing irreplaceable is one dead disk away from gone. Code that breaks any of these is wrong even if it works.

What to stay aware of
  • Code that violates a hard rule is wrong even if it works — these are construction-level bans, not style preferences; an auditor (or agent) can and will point at them.
  • The order path is sacred: from signal to order/modify/cancel/flatten there are zero LLM calls — stops, targets, sizing, and the flatten deadline are deterministic code (D-005).
  • The runtime sidecar has exactly four verbs (veto/narrate/resist/reflect) and can only subtract risk — a missing or slow veto can never create or enlarge a trade; the absence of a 'no' is not a 'yes'.
  • Kernel purity is non-negotiable: no HTTP, UI, SDKs, Topstep numbers, or DateTime.Now inside the kernel — the clock is injected. Impure code lives in SENTINEL/AXIOM around the kernel.
  • Topstep values live ONLY in config/topstep-rules.json and YOU must verify them against the live rulebook before any eval/live order — run /verify-topstep-rules weekly; rule changes apply at the next PRE_FLIGHT, never mid-session.
  • Runtime inference is local Ollama only — zero cloud tokens, no per-bar model calls; if a runtime step 'needs' a cloud model, the step is wrong (D-069).
  • No LLM verdict or reflection value may reach execution, sizing, filters, or gates — model output is read-only advisory context; behavior changes go through G0-G6 (D-067/D-068).
  • Continuity is part of every change: new secret/prerequisite/machine dependency gets inventoried in the SAME change, backups are nightly AND verified, restore is drilled — an untested backup is just hope.
  • If a build request conflicts with a locked decision, STOP and flag it rather than silently complying (DECISIONS.md standing instruction).
  • The deeper purpose: Paras adapts by saying NO more often (suppression), never by quietly mutating into a different, riskier system (Constitution Principles 8 and 11).

Locked decisions & the why

D-005
No LLM in any order path; the AI sidecar can only veto / narrate / resist / reflect.
Why: Constitution Principles 4-5 made enforceable: execution and compliance must be deterministic and replayable; the runtime AI's failure mode is greed/override/tilt, so by construction every model can only reduce risk, never approve/place/size/modify. (docs/04 §3 boundary table.)
D-003
The kernel is a pure library — no venue SDKs, HTTP, UI, LLM, Topstep values, or wall-clock reads (clock injected).
Why: Purity is what makes parity-by-construction (Principle 3) real: the code that trades is the code that backtests. Any impurity (network, real clock) would make the backtest diverge from live and break the prediction. (docs/01 §7.)
D-040
Topstep rule values are versioned config (config/topstep-rules.json), re-checked weekly; never constants.
Why: Constitution Principle 9. Topstep changes its rules and they differ by account type; a hard-coded limit goes stale silently and can blow the account. Config is versioned, account-type driven (D-064), and verified against the live rulebook before any order (D-062/D-063).
D-069
Standing repo law: no bar-level LLM calls (session cadence is the runtime ceiling); zero new API token spend at runtime (local Ollama only); no reflection-derived values in execution/sizing/filters/gates.
Why: Keeps runtime budget-true ($0 marginal AI, ≤ $200/mo) and keeps an unpredictable, non-replayable component out of the fast loop. A step that 'needs' a cloud model at runtime is, by this rule, the wrong step. (External-framework adoption policy, operator handoff 2026-06-10.)
D-067/D-068
No verdict- or reflection-derived value may reach execution, sizing, filters, or gates — LLM outputs are read-only advisory context (standing risk T-06).
Why: The verdict contract and reflection loop are advisory only; any behavior change must pass through the full G0-G6 gates, never enter live paths as a model opinion. GO-class verdicts carry no permissive power; zero code paths deliver a verdict into compliance-engine inputs.
D-071
Continuity & Portability doctrine — nightly verified local backups, current migration runbook, in-same-change inventory of new secrets/dependencies, no absolute paths, monthly-drilled restore. A new CLAUDE.md hard rule.
Why: One-person, personal-machine, no-VPS operation: a dead disk or new machine must not erase irreplaceable market data and trial history. 'A backup that has never been restored is a hope, not a backup.' (docs/09, risk R-08.)
Sources: CLAUDE.md — 'Hard rules (never violate)' section (the eight bans verbatim) · CLAUDE.md — 'The Constitution' Principles 4, 5, 8, 9, 11 · tracking/DECISIONS.md — D-003, D-005, D-040, D-062, D-063, D-064, D-067, D-068, D-069, D-071, D-078 · tracking/memory/boundaries.md — AI boundary table, the four-verb sidecar limit, veto-timeout fail-safe (source of record docs/04 §3, §2) · docs/01 §7 (kernel purity), docs/04 §3 (Action × Authority boundary table), docs/09 (Continuity & Portability doctrine)
Module 04 of 19 · Foundations

🪙 The Paras Method (6 Build Rings)

Every Paras module passes six rings in fixed order — Design, Data, Concurrency Plan, Build, Wire, Verify — and every task inside those rings runs the ULTRACODE loop (Orient, Plan, Build, Prove, Record) so the factory itself can never lie to you.

Why this page exists

Paras kills bad strategies cheaply. But a sloppy build process produces exactly the impressive, wrong results the Constitution exists to prevent. The Paras Method (the six rings) and ULTRACODE (the per-task loop) are the two disciplines that keep the factory honest. This is the 'how we build' layer — the 'what we build' lives in docs/00-05. Read this before you touch any module.

Think of building a Paras module the way you'd think about taking a trade with a checklist you cannot skip. You don't enter the order, then decide your stop, then check the rules, then size it — you do those in a fixed order, every time, and if any step fails you don't trade. The Paras Method is that checklist for code. A module is not allowed to move to the next ring until the current ring's exit criteria are demonstrably met. ULTRACODE is the smaller loop you run inside every single task (even a one-line fix): orient yourself, write a plan, build it test-first, prove it with real numbers, and record what happened. 'Done' without proof and a recorded trail is not done.

There are two nested loops. The OUTER loop is the Paras Method — six rings a module passes once, in order, on its way from idea to verified. The INNER loop is ULTRACODE — five phases every individual task runs, possibly many times within a single ring. The rings tell you WHAT stage the module is at; ULTRACODE governs HOW each piece of work inside that stage is executed and proven. Both are always on. Neither is a flag you opt into.

design-reviewer

1 · DESIGN

UI/UX prototype on mock data (per the design briefs) for visual modules; or, for headless modules, the public interface + data contract written and reviewed first. You design the shape before you build the thing.

data-plant-engineer

2 · DATA

Schemas and migrations are designed, reviewed, and TESTED — constraints, idempotency, fixtures — before any engine code touches them. The store is proven before the logic arrives.

design-reviewer

3 · CONCURRENCY PLAN

A half-page per module: which threads/processes exist, the shared state named explicitly, the failure modes, and why the chosen isolation is safe. Default doctrine: process isolation for work, threads only over immutable/mmap data, UI on its own dispatcher.

kernel-engineer

4 · BUILD

Test-first where feasible. Unit + property tests are written WITH the code, green locally. The smallest change that satisfies the ring.

module agent

5 · WIRE

Integration: real data replaces the mock contracts via a module SWAP, not a screen rebuild. The interface you designed in ring 1 is exactly the seam you wire here.

test-sentinel → phase-gatekeeper

6 · VERIFY

The module's test gate runs green (audited by test-sentinel), the milestone ceremony is held, and a learnings.md entry is written. Only now is the module real.

The order is load-bearing

None of the six rings is skipped, and they are never blended silently (ULTRACODE Law U2: 'One ring at a time'). A task must NAME its ring. The reason Data comes before Build is that a trading data plant that gets its schema wrong corrupts bars silently — you'd rather find that with a constraint test than with a backtest that quietly lied. The reason Concurrency Plan is its own ring, before Build, is that the determinism law (cached==uncached AND threaded==single-threaded, byte-for-byte) is impossible to retrofit onto threading you never planned.
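The "fixed order, no skipping" rule is a small state machine. The sketch below is a hypothetical illustration in Python (the `Ring`, `ModuleProgress`, and `RingOrderError` names are invented); it is not how the trackers are actually implemented.

```python
from enum import IntEnum
from typing import List, Optional


class Ring(IntEnum):
    DESIGN = 1
    DATA = 2
    CONCURRENCY_PLAN = 3
    BUILD = 4
    WIRE = 5
    VERIFY = 6


class RingOrderError(RuntimeError):
    """Raised when a ring is skipped, repeated, or closed without its exit criteria."""


class ModuleProgress:
    def __init__(self) -> None:
        self.completed: List[Ring] = []

    def next_ring(self) -> Optional[Ring]:
        """The only ring that may be completed next, or None when all six are done."""
        done = len(self.completed)
        return Ring(done + 1) if done < len(Ring) else None

    def complete(self, ring: Ring, exit_criteria_met: bool) -> None:
        if ring != self.next_ring():
            raise RingOrderError(f"{ring.name} is out of order; next is {self.next_ring()}")
        if not exit_criteria_met:
            raise RingOrderError(f"{ring.name} exit criteria not demonstrably met")
        self.completed.append(ring)
```

Trying to complete BUILD straight after DESIGN raises, which is the code-level version of "if you find yourself building before the data schema is tested, you've jumped a ring."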

Now the inner loop. ULTRACODE is the standing execution mode — always on, for the main session and every subagent. It exists because the Constitution's first three principles (honest discovery, every trial counted, parity by construction) die first at the PROCESS level: an untested edit, an unrecorded run, an unverified 'done.' Every task — from a one-line fix to a whole milestone — runs the same five-phase loop.

1. ORIENT

Locate the work: current milestone, which Paras ring, the owning doc section, and any locked decisions that apply. Route memory via tracking/memory/INDEX.md. Reads STATE.md, DECISIONS.md, the doc section. Writes nothing yet; you're getting your bearings.

2. PLAN

Write a micro-plan BEFORE any file edit (Law U1): goal, ring, files, the tests you'll write/run, and the risks. For risky work, name the rollback. Maximum reasoning effort on any non-trivial decision. The plan lives in the conversation, or in SESSION_LOG.md if substantial.

3. BUILD

Test-first where feasible (this is ring 4 work). The smallest change that satisfies the ring. Commit messages cite the doc + section (e.g. 'docs/01 §4.2'). Writes code + tests.

4. PROVE

Run the gate that applies and record the REAL numbers (coverage %, property cases, scenario counts) as a dated row in tracking/TEST_LOG.md. Invoke the independent auditor (Law U4): test-sentinel for test quality, parity-auditor for engine agreement, phase-gatekeeper for exit criteria. The author never grades their own gate.

5. RECORD

Trackers are part of the task, not an afterthought (Law U5). Update PROGRESS (rings/milestones), STATE (if the pointer moved), learnings/ledger (if research-facing), DECISIONS (if anything was locked). A task that didn't update its trackers is not finished.

A sixth step for multi-agent runs: LEDGER

When a task runs through the Workflow tool (the default 'ULTRACODE + Workflow' multi-agent path), the loop adds a LEDGER step (D-076): in the SAME turn the workflow reports results, and BEFORE acting on them, you surface a Subagent Ledger to the operator — how many subagents ran (retries flagged) and one line per subagent (role, what it did, verdict). Non-negotiable transparency: you must always be able to see how many agents ran and what each did.
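As a sketch of what a Subagent Ledger could look like when rendered: a total with retries flagged, then one line per subagent giving role, action, and verdict. The `SubagentRun` shape and the exact text layout are assumptions for illustration; D-076 fixes the content, not this format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubagentRun:
    role: str
    action: str
    verdict: str
    retries: int = 0


def render_ledger(runs: List[SubagentRun]) -> str:
    """One header line with the count, then one line per subagent."""
    retried = sum(1 for r in runs if r.retries > 0)
    lines = [f"Subagent Ledger: {len(runs)} subagents ({retried} retried)"]
    for r in runs:
        flag = f" [retries: {r.retries}]" if r.retries else ""
        lines.append(f"- {r.role}: {r.action} -> {r.verdict}{flag}")
    return "\n".join(lines)
```

The header count is derived from the same list the lines are printed from, so the number of agents reported cannot drift from the per-agent detail.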

Here is how the two loops fit together in practice. Suppose you're building the SessionClock for the kernel. The MODULE walks the six rings once: ring 1 you write and review its public interface and data contract; ring 2 there's no new schema so it's light; ring 3 you write the half-page concurrency plan (the clock is injected, never reads wall time); ring 4 you build it test-first with FsCheck property tests covering DST-transition Sundays; ring 5 you wire it into the real engine via the interface; ring 6 the test gate goes green under test-sentinel and you hold the ceremony. But INSIDE ring 4, every commit — adding a method, fixing a property-test counterexample — runs the full ULTRACODE loop: orient, plan, build, prove (a TEST_LOG row), record. Many ULTRACODE loops live inside one ring.

Ring | Entry requires | Exit requires | Typical auditor
1 DESIGN | Milestone open; owning doc § read | UI prototype on mock data, OR public interface + data contract, written & reviewed | design-reviewer
2 DATA | Ring 1 contract exists | Schemas/migrations designed AND tested (constraints, idempotency, fixtures) before engine code | data-plant-engineer
3 CONCURRENCY | Rings 1-2 done | Half-page: threads/processes, named shared state, failure modes, isolation justified | design-reviewer / module owner
4 BUILD | Ring 3 plan exists | Code + unit/property tests written together, green locally | module agent (kernel-engineer …)
5 WIRE | Ring 4 green | Real data replaces mock contracts via module swap; integration tests green | module agent
6 VERIFY | Ring 5 done | Module test gate green (test-sentinel), milestone ceremony held, learnings.md entry written | test-sentinel → phase-gatekeeper

Quality is not enforced by good intentions — it's enforced by five hard Laws and by routing every gate to a second pair of eyes. The five ULTRACODE Laws are: U1 Plan before touch; U2 One ring at a time; U3 Prove, don't claim; U4 Independent verification; U5 Record everything. The one that surprises most newcomers is U4: the author NEVER grades their own gate. The builder model (Opus 4.8) implements and writes first-round tests, but the verdict that the gate is green comes from a separate auditor agent in a separate context — test-sentinel for test quality, parity-auditor for engine agreement, phase-gatekeeper for exit criteria, design-reviewer for ring 1, constitution-guardian on disputes.

U1 — Plan before touch
No file edit before a written micro-plan (goal, ring, files, tests, risks). Maximum reasoning on non-trivial decisions.
U2 — One ring at a time
Work proceeds in Paras Method order. A task names its ring; rings are never skipped or blended silently.
U3 — Prove, don't claim
Nothing is done on assertion. Done = green tests + a dated evidence row in TEST_LOG.md. Coverage is reported, not summarized as 'looks fine.' test-sentinel's default verdict is FAIL.
U4 — Independent verification
Before any gate/milestone is green, a second agent rules on it. The author never grades their own gate.
U5 — Record everything
Tracker updates are part of the task. Tool activity auto-journals to ops/journal/; outcomes go to STATE/PROGRESS/SESSION_LOG/TEST_LOG; research to ledger + learnings.

Proof has a single home: tracking/TEST_LOG.md. Every gate-relevant suite run appends a row: date (PT), milestone·module·ring, suites run, the exact command, the result, the MEASURED numbers (coverage %, property cases, scenario count), determinism/golden status, the auditor verdict, and the artifact path. This is the source of truth for engineering verification, the same way ops/ledger.duckdb is the source of truth for research trials. One concern, one home (Principle 7). The coverage thresholds never move to fit the code: kernel components and engines must hit ≥90% line coverage; compliance-engine monitors must hit 100% branch coverage plus 10/10 adversarial scenarios; a property test with trivial generators is a FAIL.
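The thresholds and the evidence row can be sketched together. The numeric bars come from the text above; the function names, the `EvidenceRow` shape, and the pipe-separated row layout are hypothetical, since the actual TEST_LOG.md format is not shown here. A passing threshold check is only an input to the gate; the verdict still comes from the independent auditor.

```python
from dataclasses import dataclass

LINE_COVERAGE_MIN = 90.0     # kernel components and engines
BRANCH_COVERAGE_MIN = 100.0  # compliance-engine monitors
ADVERSARIAL_REQUIRED = 10    # 10/10 adversarial scenarios


def meets_kernel_bar(line_coverage_pct: float) -> bool:
    return line_coverage_pct >= LINE_COVERAGE_MIN


def meets_compliance_bar(branch_coverage_pct: float, adversarial_passed: int) -> bool:
    return (branch_coverage_pct >= BRANCH_COVERAGE_MIN
            and adversarial_passed >= ADVERSARIAL_REQUIRED)


@dataclass(frozen=True)
class EvidenceRow:
    date_pt: str
    scope: str            # milestone, module, ring
    suites: str
    command: str
    result: str
    measured: str         # the real numbers, not a summary
    determinism: str
    auditor_verdict: str
    artifact: str

    def render(self) -> str:
        """Render one row; the layout is an assumption of this sketch."""
        fields = [self.date_pt, self.scope, self.suites, self.command, self.result,
                  self.measured, self.determinism, self.auditor_verdict, self.artifact]
        return "| " + " | ".join(fields) + " |"
```

Keeping the bars as named constants in one place makes any attempt to lower them a visible diff rather than a quiet edit inside a test.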

Definition of Done (consolidated, docs/06 §6)

A task is done when ALL hold:
  (1) the micro-plan is satisfied and the named ring is complete;
  (2) code + tests green locally AND in CI;
  (3) a measured evidence row in TEST_LOG.md;
  (4) an independent auditor verdict where a gate is involved;
  (5) concurrency notes updated if threading was touched;
  (6) a doc/spec addendum if behavior diverged;
  (7) a ledger/learnings entry if research-facing;
  (8) trackers true (PROGRESS, STATE, dashboard in sync);
  (9) no Constitution principle bent, no locked decision relitigated.

A MILESTONE adds: runbook test gate green under test-sentinel, phase-gatekeeper confirms exit criteria, and the ceremony is held and logged.

Failure doctrine — read this twice

A red gate is a RECORDED FAIL (U3). It is never a license to weaken a threshold, delete a test, or quietly retry until green (Principle 11: the system adapts by saying no more often, never by quietly becoming a different system). A determinism or parity delta (cached vs uncached, threaded vs single-threaded, or kernel vs TradeStation vs LEAN disagreement) is STOP-EVERYTHING, not a flaky test to shrug off. This is the whole point of the method: the factory must never be the thing that lies to you.
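A byte-for-byte determinism check reduces to comparing digests of the outputs from each run mode. The sketch below is a hypothetical helper, not the project's harness; the mode names passed in are up to the caller.

```python
import hashlib
from typing import Dict


class DeterminismError(RuntimeError):
    """A determinism or parity delta: stop everything, do not retry."""


def assert_byte_identical(outputs: Dict[str, bytes]) -> str:
    """Require every run mode to produce exactly the same bytes.

    Returns the shared SHA-256 digest, or raises DeterminismError
    naming each mode and its digest so the delta can be recorded.
    """
    if not outputs:
        raise ValueError("no outputs to compare")
    digests = {name: hashlib.sha256(data).hexdigest() for name, data in outputs.items()}
    if len(set(digests.values())) != 1:
        detail = ", ".join(f"{name}={d[:12]}" for name, d in sorted(digests.items()))
        raise DeterminismError(f"outputs differ: {detail}")
    return next(iter(digests.values()))
```

Raising instead of returning False fits the doctrine: a mismatch is not a result to branch on and retry, it ends the run and gets logged as a FAIL.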

Finally, the method doesn't rely on you remembering to do all this. Enforcement is layered into the harness: the root CLAUDE.md keeps the ULTRACODE rules always in context; a SessionStart hook injects STATE + the memory index + the ULTRACODE banner at every boot; a PreToolUse guardrail hook warns on Constitution-boundary violations (advisory, D-061); a PostToolUse hook auto-journals every Write/Edit/Bash to ops/journal/; a PreCompact hook protects against losing state if compaction fires before a handover; and the /ultracode command lets you audit any session against the Five Laws on demand. The discipline is wired in, not willed.

What to stay aware of
  • Rings are passed in order and never skipped or blended silently — a task must NAME its ring (Law U2). If you find yourself building before the data schema is tested, you've jumped a ring.
  • 'Done' is not an assertion. Without green tests + a measured-numbers row in TEST_LOG.md + (where a gate is involved) an independent auditor verdict, the task is not done — no matter how finished the code looks (Law U3).
  • You never grade your own gate. The builder (Opus 4.8) implements and writes first-round tests; the green verdict comes from a separate auditor agent in a separate context (Law U4 / D-080).
  • A red gate is a recorded FAIL — never weaken a threshold, delete a test, or retry until green (Principle 11). Coverage bars (≥90% line on kernel/engines, 100% branch on compliance monitors) never move to fit the code.
  • A determinism or parity delta is STOP-EVERYTHING, not a flaky test. This includes cached vs uncached, threaded vs single-threaded, and kernel vs TradeStation vs LEAN.
  • For any Workflow (multi-agent) run, emit the Subagent Ledger in the same turn results are reported and BEFORE acting on them (D-076).
  • TEST_LOG.md is the source of truth for engineering verification; ops/ledger.duckdb is the source of truth for research trials. Don't cross the streams — one concern, one home (Principle 7).
  • Trackers are part of the task (Law U5). The auto-journal in ops/journal/ is hook-written and append-only — never hand-edit it.
  • Context budget: warn at 50%, wrap at 60% — at 60% run /paras-handover, no exceptions; the handover IS the memory transfer.

Locked decisions & the why

D-080
OS-level default = ultracode-xHigh effort + Workflow on EVERY substantive task, mandatory, no opt-out; and gates are graded by independent auditor subagents (phase-gatekeeper · test-sentinel · constitution-guardian · design-reviewer · parity-auditor) run via Workflow in separate agent contexts with adversarial verification — never by the author. Supersedes D-070.
Why: Optimize for the most exhaustive, correct answer — never the fastest or cheapest; token cost is not a constraint. Fable 5 was retired from public use, so the old two-model grading protocol became unworkable; routing verdicts through separate auditor agents preserves ULTRACODE U4 ('the author never grades their own gate'). The independent agent verdicts + measured TEST_LOG evidence + green CI ARE the gate. (operator instruction, 2026-06-15)
D-070
Original two-model build protocol: Opus 4.8 = builder (implements, writes tests-with-code, records evidence); a separate model = independent auditor/PM that grades all gates via the auditor agents. Implements ULTRACODE U4 directly. (Superseded by D-080.)
Why: This is where 'the author never grades their own gate' became a locked decision — a milestone is DONE only after an independent verification verdict. The grader changed under D-080 (to independent auditor subagents), but the U4 principle it established still governs the VERIFY ring and the PROVE phase. (operator instruction, 2026-06-12)
D-076
Workflow Subagent Transparency Mandate: any task run through the Workflow tool must surface a Subagent Ledger to the operator in the same turn results are reported, BEFORE acting on them — total subagents (retries flagged) + one line per subagent (role, action, verdict).
Why: Non-negotiable transparency, not optional: for any multi-agent run the operator must always be able to see how many agents ran and what each one did. This is why the ULTRACODE loop adds a sixth LEDGER step for Workflow runs. Enforced by a PostToolUse hook on the Workflow tool. (operator instruction, 2026-06-13)
D-061
Discipline enforcement is ADVISORY (guardrails + reminders via SessionStart + PreToolUse hooks), not hard-blocking; flip $HARD_BLOCK in guardrail.ps1 to escalate.
Why: The Constitution-boundary guardrail warns rather than blocks, so the operator stays in control of the workflow while still getting the safety signal. Explains why the PreToolUse layer in the enforcement matrix is 'automatic, advisory.' (operator choice, 2026-06-10)
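
D-076 above specifies what a Subagent Ledger must contain (a total, retries flagged, and one line per subagent with role, action, and verdict) but not its layout. The sketch below shows one possible rendering; the `SubagentRun` type, the `render_ledger` function, and the exact text format are assumptions for illustration, not the repo's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentRun:
    """Hypothetical record of one subagent's contribution to a Workflow run."""
    role: str
    action: str
    verdict: str
    retry: bool = False


def render_ledger(runs: list[SubagentRun]) -> str:
    """Render a total line plus one line per subagent, flagging retries."""
    retries = sum(1 for r in runs if r.retry)
    lines = [f"Subagent Ledger: {len(runs)} subagents ({retries} retried)"]
    for r in runs:
        flag = " [retry]" if r.retry else ""
        lines.append(f"- {r.role}{flag}: {r.action} -> {r.verdict}")
    return "\n".join(lines)
```

The auditor names used as roles would be the ones listed under D-080, for example test-sentinel or parity-auditor.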
Sources: CLAUDE.md — The Paras Method (six rings, in order, none skipped) · CLAUDE.md — ULTRACODE standing execution mode (the task loop: Orient→Plan→Build→Prove→Record→Ledger) · CLAUDE.md — The Constitution (Principles 1-3, 7, 11) + Hard rules · docs/06_ULTRACODE_EXECUTION_METHODOLOGY.md §1 — The Five Laws (U1-U5), the task loop, the enforcement matrix · docs/06 §2.1 — The six rings: entry/exit criteria + typical auditors · docs/06 §2.3 — Subagent routing (who builds, who audits) · docs/06 §3 — Testing methodology: taxonomy, coverage bars, when each layer runs, TEST_LOG evidence · docs/06 §5 / §5.1 — Recording map + Workflow Subagent Transparency Mandate (D-076) · docs/06 §6 — Definition of Done (consolidated) · tracking/DECISIONS.md — D-080, D-070, D-076, D-061
Module 05 of 19 · Foundations

🔁 How We Change Things & Keep Up

Every change in Paras flows through one disciplined loop — decide and lock, prove through gates, record everything — so you can always see what changed and why.

The whole lesson in one breath

Paras is a one-person operation that runs on AI automation. The only thing that keeps it from quietly drifting into a different, untrustworthy system is its operating rhythm: decisions get LOCKED so they're never re-argued, every change is PROVEN with measured evidence before it counts as done, and every change is RECORDED in a fixed set of files. Learn where the rhythm writes things down and you'll never be lost — you open one short file and know exactly where the system is and how it got there.

You're a solo operator leaning on AI to build and run a trading system. That is powerful and dangerous. Powerful because the machine can move fast; dangerous because fast + sloppy is exactly how you end up with impressive-looking results that are quietly wrong — the precise failure the Constitution exists to prevent. This lesson is the antidote: the disciplined cadence by which things change in Paras and the small set of files that let YOU, the operator, keep up with what changed and why without reading the whole repo. ULTRACODE (docs/06) names the reason plainly: the Constitution's principles 'die first at the process level — an untested edit, an unrecorded run, an unverified done.'

There are three things to understand, in order: (1) how a decision gets made and LOCKED so it's never relitigated; (2) how an actual change flows from idea to done — through the gates, with proof; and (3) how you, the operator, stay current using a handful of tracking files. We'll take them one at a time, then look at the real example happening in the repo right now.

lock it

Decisions

A choice that must never be re-argued is written once to tracking/DECISIONS.md as D-0xx, dated, with rationale. Build sessions do not relitigate it. Reversing one requires a new dated entry that explicitly supersedes the old.

prove it

Changes

Code/research changes run the ULTRACODE loop: ORIENT, PLAN, BUILD, PROVE, RECORD. 'Done' means green tests + a measured-evidence row + an independent auditor verdict on any gate — never just an assertion that it works.

see it

Keeping up

You read STATE.md (where are we / what's next), and when you want the story you read DECISIONS.md (why), SESSION_LOG.md (what happened each session), and learnings.md (what we discovered). One fact, one home.

PART 1 — HOW A DECISION GETS LOCKED. A 'decision' here is not a code change. It's a choice about how the system works that you never want to argue about a second time: 'all three apps are WPF,' 'the kernel is a pure library,' 'Topstep rule values live in config, never as constants.' These live in tracking/DECISIONS.md as numbered, dated entries (D-001, D-002, ... up to D-094 today). Each one states the decision, cites the doc section it comes from, and gives the rationale. The file's own header is blunt: 'Do NOT relitigate these inside build sessions.'

1 · A choice is reached

A choice comes from one of three places: an operator instruction ('lock it up'), an architect/auditor ruling, or a builder decision made under independent grading. The decision is real the moment it's recorded, not before.

2 · It's written as D-0xx with a date and a WHY

Appended to tracking/DECISIONS.md: the decision, the source doc/section, and the rationale. Example: 'D-040 Topstep rule values are versioned config (config/topstep-rules.json), re-checked monthly. (Principle 9).'

3 · It becomes load-bearing law

From then on, code that violates it is wrong even if it works. CLAUDE.md says: 'If a build request conflicts with a locked decision, stop and flag it rather than silently complying.'

4 · Reversal is explicit, never silent

A locked decision is only overturned by a NEW dated entry that says it supersedes the old — and the old text is annotated, not deleted. You can always trace the history.

Why locked decisions exist (the WHY behind the rule)

A solo operator with an eager AI assistant will, without this rule, re-open the same questions every session and slowly let the answers drift — the system becomes something else by a thousand small 'reasonable' edits. Constitution Principle 11 forbids exactly that: 'The system adapts by saying no more often — never by quietly becoming a different system.' DECISIONS.md is the memory that makes Principle 11 enforceable. It is why you can trust that the system you built last month is the same system running today.

Real example of supersession done right: D-070 originally said 'Fable 5 grades every gate.' When Fable was retired (2026-06-15), the team did not quietly delete D-070 — they wrote D-080, dated, stating it 'supersedes D-070,' and re-routed grading to independent auditor subagents. The old entry stays, annotated. That's the discipline: history is never rewritten, only extended.
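
The lock-and-supersede rule can be sketched as an append-only log in which an old entry is annotated but never edited or removed. This is an illustration only: the real record is the Markdown file tracking/DECISIONS.md, and the `Decision` and `DecisionLog` names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    id: str
    date: str
    text: str
    why: str
    superseded_by: Optional[str] = None  # annotation only; text is never edited


class DecisionLog:
    """Append-only log: entries are added or annotated, never deleted."""

    def __init__(self) -> None:
        self._entries: dict[str, Decision] = {}

    def lock(self, decision: Decision, supersedes: Optional[str] = None) -> None:
        if decision.id in self._entries:
            raise ValueError(f"{decision.id} is already locked; write a new entry")
        if supersedes is not None:
            # KeyError here means the entry being superseded does not exist.
            self._entries[supersedes].superseded_by = decision.id
        self._entries[decision.id] = decision

    def get(self, decision_id: str) -> Decision:
        return self._entries[decision_id]

    def active(self) -> list[str]:
        """IDs of decisions that have not been superseded."""
        return [d.id for d in self._entries.values() if d.superseded_by is None]
```

Replaying the example above: lock D-070, then lock D-080 with `supersedes="D-070"`. D-070 remains readable with its original text, and only D-080 is reported as active.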

PART 2 — HOW A CHANGE FLOWS. Every task in this repo — from a one-line fix to a milestone — runs the same five-step loop (ULTRACODE, docs/06 §1.2). It is not optional and it has no fast path; the standing default is maximum effort on every substantive task. The loop is what turns 'I changed some code' into 'this change is real, proven, and recorded.'

Step | What happens | What it reads | What it writes
ORIENT | Locate the work: current milestone, ring, owning doc section, applicable locked decisions. Route memory via the INDEX. | STATE.md, DECISIONS.md, the doc section | (nothing yet)
PLAN | A written micro-plan BEFORE any edit: goal, Paras ring, files, tests to write/run, risks. Name the rollback for risky work. | spec docs, memory cards | the plan (in conversation, or SESSION_LOG if substantial)
BUILD | Test-first where feasible; the smallest change that satisfies the ring. Commit messages cite doc+section. | | code + tests
PROVE | Run the gate that applies. Record REAL numbers (coverage %, property cases, scenario counts). Get an independent auditor verdict: the author never grades their own gate. | test output | a row in tracking/TEST_LOG.md
RECORD | Update the trackers: PROGRESS (rings/milestones), STATE (if the pointer moved), learnings/ledger (if research-facing), DECISIONS (if anything was locked). | | tracking files
The two halves that get skipped — and why that's fatal

ULTRACODE's rule is exact: “'Done' without PROVE + RECORD is not done.” PROVE means an evidence row with measured numbers in TEST_LOG.md plus an independent auditor verdict on any gate (Law U3 'prove, don't claim' + U4 'independent verification'). RECORD means the trackers are actually true afterward (Law U5). A change that 'works' but was never proven, or was proven but never recorded, has not happened as far as the system is concerned. This is what protects you from a confident AI that says 'done' when it isn't.
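
The Definition of Done described here reduces to a checklist that either comes back empty or names what is missing. A minimal sketch under stated assumptions: `TaskEvidence` and `missing_for_done` are hypothetical names, and the real check is made by people and auditor agents reading TEST_LOG.md, not by this function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskEvidence:
    tests_green: bool
    test_log_row: Optional[str]     # the measured-numbers row, or None if absent
    involves_gate: bool
    auditor_verdict: Optional[str]  # verdict from an independent auditor, if any
    trackers_updated: bool


def missing_for_done(e: TaskEvidence) -> list[str]:
    """Return what is still missing; an empty list means the task is done."""
    missing = []
    if not e.tests_green:
        missing.append("green tests")
    if not e.test_log_row:
        missing.append("TEST_LOG row with measured numbers")
    if e.involves_gate and e.auditor_verdict != "PASS":
        missing.append("independent auditor PASS verdict")
    if not e.trackers_updated:
        missing.append("trackers updated (RECORD)")
    return missing
```

A task whose code "works" but has no evidence row and no updated trackers returns two missing items, which is the situation the paragraph above warns about.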

Two more rules sit on top of the loop and are worth knowing because they're sacred. First, the failure doctrine: a red gate is a recorded FAIL — you NEVER weaken the threshold, delete the test, or quietly retry until it goes green. A determinism or parity mismatch is STOP-EVERYTHING, not a flaky test. Second, gates run in order: you do not start a phase before its predecessor's exit criteria are demonstrably met. These two together are why Paras 'kills bad strategies cheaply' instead of nursing them along.

PART 3 — HOW YOU KEEP UP. You should never have to read the whole repo to know where things stand. The system writes its own status into a fixed set of files, each with one job. The most important habit you can build: open these in the right order. STATE.md first (always), then drill into the others only when you want the story.

File | Answers | How to read it | Shape
tracking/STATE.md | Where are we right now / what's the next action? | Read this FIRST, every session. It's short and kept true. | Short, overwritten
tracking/DECISIONS.md | Why is it this way? What can't I re-argue? | Read the D-0xx entry relevant to your question. | Append-only, dated
tracking/SESSION_LOG.md | What happened, session by session? | Newest at top; each entry ends with a 'Prompt for Next Session'. | Append-only, newest top
ops/learnings.md | What did we discover (including negative results)? | The research corpus: tagged, append-only. | Append-only, tagged
tracking/TEST_LOG.md | What was actually proven, with what numbers? | The engineering-evidence ledger; one row per gate run. | Append-only, newest top
tracking/PROGRESS.md | Which rings/milestones are done? | The milestone board (mirrored to the HTML dashboard). | Checkboxes + funnel
The two rituals that keep these files honest

You don't maintain these by hand. /paras-start OPENS a session — it loads STATE.md, the Constitution, and the active phase's doc sections, then runs a pre-flight checklist before any work. /paras-handover CLOSES a substantial session — it updates STATE.md, appends a SESSION_LOG.md entry, refreshes the HTML dashboard, and ends with a copy-pasteable 'Prompt for Next Session' block. That handover block is non-negotiable: it IS the memory transfer to the next session. The rule of thumb on timing: warn at 50% context, wrap at 60% — at 60% you run /paras-handover, no exceptions.
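
The timing rule at the end of that paragraph is a two-threshold check. As a small sketch, with `context_action` as a hypothetical helper name and the returned labels chosen for illustration:

```python
WARN_AT = 0.50  # warn threshold from the text
WRAP_AT = 0.60  # at this point /paras-handover is mandatory


def context_action(used_fraction: float) -> str:
    """Map the fraction of context used to the action the rule requires."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    if used_fraction >= WRAP_AT:
        return "wrap"      # run /paras-handover, no exceptions
    if used_fraction >= WARN_AT:
        return "warn"
    return "continue"
```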

Open a session
/paras-start — loads STATE + Constitution + active phase doc sections, runs pre-flight
Close a session
/paras-handover — updates STATE, appends SESSION_LOG, refreshes dashboard, emits 'Prompt for Next Session'
Check a phase gate
/phase-check — are the current phase's exit criteria met; may the next phase start?
Audit the rhythm
/ultracode — checks the current session against the Five Laws
One fact, one home
If a fact must appear twice (MD + HTML dashboard), the MD is authoritative; the HTML is refreshed by the ritual.

WORKED EXAMPLE — the rhythm in action, right now. The repo is mid-milestone (M1 'First Light'). Watch how a single discovery flowed through every part of this lesson during the soak test on 2026-06-18. (1) DISCOVERY: the 72h soak surfaced that the market calendar over-closed the market — Juneteenth was marked a full holiday and the evening-skip rule extended the closure to Sunday, even though the market actually reopened Thursday afternoon. (2) DECISION LOCKED: rather than a one-off patch, the operator locked D-094 — a HARD RULE that the market model must be asset-class-agnostic and per-instrument (so US equities, FX, crypto can be added later by config, not a rewrite) — with the soak finding written in as the motivating evidence, and a matching risk R-09. (3) RECORDED, NOT YET BUILT: the fix and an independent calendar-correctness audit were QUEUED post-soak, explicitly NOT built mid-soak (because D-082 forbids rebuilding the running plant). (4) KEEP-UP: all of this is legible to you by reading STATE.md's NEXT ACTION block and the D-094 entry — you can see what changed, why, and what's pending, without touching a line of code.

Notice what the example does NOT do. It does not silently edit the calendar and move on. It does not re-argue settled choices. It does not weaken the soak gate to make the run pass — the same session shows a soak FAIL recorded honestly (exit code 14, freshness stall) with 'RECORDED FAIL (failure doctrine — do NOT weaken the gate)' and a root-cause-then-re-soak plan. That is the rhythm working exactly as designed: discover, lock the durable choice, prove honestly, record so the operator can keep up.

Watch-outs — the failure modes this rhythm is guarding against

  1) A 'done' with no TEST_LOG row + no auditor verdict is not done; push back on it.
  2) A green gate that was made green by weakening a threshold or deleting a test is a Constitution violation, not a win.
  3) A decision being re-argued in a build session means the lock is being ignored; point at the D-0xx entry.
  4) A session that ends without a handover is a crashed session (if compaction fires first, the PreCompact hook treats it that way and demands an immediate state flush).
  5) The MD files are the source of truth; if the HTML dashboard disagrees, trust the MD and re-run the refresh ritual.

What to remember

Decisions are LOCKED (DECISIONS.md, never relitigated). Changes are PROVEN (TEST_LOG row + independent auditor verdict, never just claimed) and run in gate order (never skip a phase). You keep up by reading STATE.md first and the story files when you want the why. /paras-start opens, /paras-handover closes — and the handover's 'Prompt for Next Session' is the thread that ties every session to the next. Master this rhythm and the system stays exactly the system you designed.

What to stay aware of
  • Open STATE.md FIRST every session — it's the single short pointer to where the system is and what's next; everything else is drill-down.
  • A change is only 'done' with a measured-evidence row in TEST_LOG.md PLUS an independent auditor verdict on any gate. 'It works' is not evidence.
  • Locked decisions (D-0xx) are never relitigated in a build session; if a request conflicts with one, the correct move is to STOP and flag it, citing the D number.
  • Failure doctrine: a red gate is a recorded FAIL — never weaken a threshold, delete a test, or retry to green. Determinism/parity deltas are STOP-EVERYTHING.
  • Gates run in order — no phase starts before its predecessor's exit criteria are demonstrably met (/phase-check).
  • Every substantial session ends with /paras-handover and a copy-pasteable 'Prompt for Next Session' block; warn at 50% context, wrap at 60%, no exceptions.
  • When a multi-agent Workflow runs, expect a Subagent Ledger in the same turn (D-076) — that's how you see what the AI fleet actually did.
  • One fact, one home: if the HTML dashboard and the MD disagree, the MD is authoritative; re-run the refresh ritual.

Locked decisions & the why

D-080
ULTRACODE-xHigh + Workflow is the OS-level default on every substantive task (no opt-out); gates are graded by independent auditor subagents (phase-gatekeeper / test-sentinel / constitution-guardian / design-reviewer / parity-auditor) in separate contexts — never by the author. Supersedes D-070.
Why: This operationalizes ULTRACODE Law U4 'the author never grades their own gate' after Fable 5 was retired. It is the live mechanism by which a change is PROVEN before it counts as done, and the model case of clean supersession (D-070 annotated, not deleted).
D-065
ULTRACODE is the standing execution mode — always on, for the main session and every subagent: U1 plan-before-touch, U2 one ring at a time, U3 prove-don't-claim (evidence row in TEST_LOG with measured numbers), U4 independent verification, U5 record everything.
Why: This is the change-flow loop itself, locked as repo law. It is why 'done' requires PROVE + RECORD and not assertion — the discipline that keeps the build factory from being the thing that lies to us (docs/06 §0).
D-076
Workflow Subagent Transparency Mandate: any multi-agent Workflow run must be followed, in the same turn before acting on results, by a Subagent Ledger to the operator — count of agents + one line per agent (role, what it did, verdict).
Why: Keeps the operator able to keep up with AI-driven changes: 'the operator must always be able to see how many agents ran and what each one did.' Non-negotiable transparency for a solo operator relying on automation.
D-060
The project OS lives entirely in-repo under tracking/ + .claude/ (no external state). MD files are Claude's working reference; the HTML dashboard is the operator's visual tracker.
Why: Defines WHERE you keep up — the fixed set of tracking files this lesson teaches. It also sets the 'one fact, one home' rule: the MD is authoritative, the HTML is a refreshed view, so the two can never silently disagree (Principle 7).
D-066
The OS is activated by four hooks in .claude/settings.json: SessionStart (injects STATE + memory INDEX + ULTRACODE banner), PreToolUse guardrail (advisory), PostToolUse activity journal, PreCompact compaction protection.
Why: These hooks are what make the rhythm automatic: the session boots with the right context, every tool call is journaled, and a compaction without a handover is caught and flagged. The operator's keep-up files stay current without manual effort.
D-094
Market-model architecture must be asset-class-agnostic and per-instrument — a HARD RULE — so US equities, FX, and crypto can be added by config profile, not a rewrite. Motivated by a Juneteenth over-closure found during the M1 soak.
Why: The lesson's worked example. It shows the full rhythm: a soak discovery became a locked, dated decision (with the evidence written in) plus a risk entry, with the fix QUEUED post-soak rather than built mid-soak — all legible to the operator via STATE.md and the D-094 entry.
Sources: CLAUDE.md — Working agreements (session protocol): /paras-start, /paras-handover, phase gates, never relitigate locked decisions, Definition of Done, small PR-sized changes · CLAUDE.md — ULTRACODE (standing execution mode): the five-step loop ORIENT/PLAN/BUILD/PROVE/RECORD, 'Done without PROVE + RECORD is not done', failure doctrine, D-076 ledger · CLAUDE.md — How this OS is organized (where things live): STATE / DECISIONS / SESSION_LOG / TEST_LOG / PROGRESS / learnings homes · docs/06_ULTRACODE_EXECUTION_METHODOLOGY.md §0 — why the doc exists (the Constitution dies first at the process level) · docs/06 §1.1 The Five Laws (U1–U5) + §1.2 The task loop + §1.3 enforcement matrix (hooks) · docs/06 §4 Memory & context management (tiers T0–T4; 50%/60% budget protocol; PreCompact rule) · docs/06 §4.3 Where new knowledge goes (the write-rules table: one fact, one home) · docs/06 §5 Recording map (every event → artifact) + §5.1 Workflow Subagent Transparency Mandate (D-076) · docs/06 §6 Definition of Done (consolidated) · tracking/DECISIONS.md — D-060, D-065, D-066, D-070→D-080 supersession, D-076, D-094 (+ header: 'Do NOT relitigate these inside build sessions') · tracking/STATE.md + tracking/SESSION_LOG.md (S028) — the live worked example (D-094 Juneteenth discovery, recorded soak FAIL under the failure doctrine)
Module 06 of 19 · Foundations

📅 The Daily Operating Model

How Paras actually runs day to day — what stays always-on, what you open when, and the paras-start-to-paras-handover rhythm that keeps a one-person, AI-automated lab honest without babysitting.

Why this page matters

Paras is a one-person, AI-automated trading lab. It has to run unattended overnight, over weekends, and across maintenance breaks — and still be trustworthy when you sit down to make a decision. This page is the single 'how I actually operate this' lesson: the four runtime pieces and when each fires, the operator's open-work-close rhythm, what's automated vs. hands-on, and what a normal day looks like end to end. Get this wrong and you either babysit a machine that should run itself, or you trust a green light that's lying. The whole design exists so neither happens.

Plain English first. There are really only four moving parts. One engine runs all the time in the background (the DataPlant). One tiny tray light watches that engine and tells you in one glance whether your data is OK (PULSE). One heavy research window you open only when you want to do research (the SENTINEL dashboard). And one trading window you open only during market hours (AXIOM, built later). The mental model that fixes everything: the always-on thing is the headless engine, NOT a window. PULSE watches the engine so you never have to open anything to know the data is healthy. (DAILY_OPERATING_MODEL §1)

always on

DataPlant — the engine

Headless process `Sponaitech.Sentinel --run` (canonical name DataPlant, D-091). ALWAYS ON. The only writer to ops/ledger.duckdb. Does backfill, the ~120 s self-heal, the nightly data top-up, quality, reconciliation, the tape recorder. You START it.

always on

PULSE — the watcher

Tiny tray icon, ALWAYS ON. Read-only health light. Never opens the database — only reads status files the plant writes. Your single 'is my data OK?' glance plus fault pop-ups. You WATCH it.

on demand

SENTINEL dashboard — the lab

The research lab (WPF): backtests, gates, the model pipeline. Heavy. ON DEMAND. You OPEN it when you want to do research, close it when done.

market hours

AXIOM — the cockpit

Live trading (WPF). MARKET HOURS only. Streams its OWN live tick/quote feed direct from TradeStation, independent of the plant. You OPEN it to trade. Built post-M1.

Two clocks, never confuse them (docs/04 §intro)

Paras runs on two separate clocks. RESEARCH-TIME work (build sessions, the nightly Claude loop, the weekly gap session) is when Claude thinks, proposes, and writes code — it can read/write the repo and ledger but never touches a live order. RUNTIME work (market hours) is when the local AI sidecar can only say NO — veto, narrate, resist, reflect; never approve, place, size, or modify. 'Research thinks; runtime can only say no.' Everything below sorts into one of these two clocks.
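
'Runtime can only say no' is a monotonic rule: the sidecar may turn an allowed order into a blocked one, never the reverse. The sketch below expresses that with a verdict type that has no 'approve' member. `RuntimeVerdict` and `apply_sidecar` are hypothetical names, not taken from the kernel.

```python
from enum import Enum


class RuntimeVerdict(Enum):
    """The only things the runtime sidecar can say. There is no APPROVE."""
    NO_OBJECTION = "no_objection"  # silence; this is not an approval
    VETO = "veto"


def apply_sidecar(allowed_by_rules: bool, verdict: RuntimeVerdict) -> bool:
    """Combine the rule engine's answer with the sidecar's verdict.

    The result can only be False where allowed_by_rules was True,
    never True where it was False.
    """
    return allowed_by_rules and verdict is not RuntimeVerdict.VETO
```

Because the function is a logical AND, an order the rules already rejected stays rejected whatever the sidecar says.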

Runtime piece | When it fires | Automated or hands-on | What you do
DataPlant (`--run`) | Always on (background) | Automated once started | Start it once (cold start); leave the terminal open; Ctrl+C to stop gracefully
PULSE tray | Always on (autostarts at login) | Automated | Glance at the color; open Activity only on red
~120 s REST self-heal | Continuously, inside the plant | Automated | Nothing: it tops up bars to current
Plant nightly DATA job | ~18:00 PT | Automated | Nothing: gap-sync → consolidate → quality → D1 cross-audit
Tape recorder | Continuously during sessions | Automated | Nothing: archives the sub-minute stream
Nightly verified backup | Nightly | Automated | Nothing (D-071), but the restore drill is on you monthly
Nightly CLAUDE research loop | ~05:30 PT (headless) | Semi-auto / you review | Read ops/reviews/<date>.md; the proposals are inert until the budget gate promotes them
Weekly gap session | Weekly | Semi-auto / you confirm | Read ops/reviews/gap-<week>.md; you confirm any recalibrate/suppress/retire it recommends
Build / research session | When you sit down to work | Hands-on (you + Claude) | /paras-start → work the ring → /paras-handover
AXIOM live trading | Market hours (post-M1) | Hands-on | Open it to trade; close at the cash close
1

Cold start — Step 1: start the DataPlant

Open a terminal at the repo root and leave it running: `cd c:\AllAboutAI\Paras` then `dotnet run -c Release --project src\Sponaitech.Sentinel -- --run`. THIS WINDOW IS THE DATAPLANT. It begins startup gap-sync, then the ~120 s self-heal + the live stream + the nightly job. Keep it open. (DAILY_OPERATING_MODEL §2)

2

Cold start — Step 2: start PULSE

If it isn't already in your tray (it autostarts at login): `dotnet run -c Release --project src\Sponaitech.Pulse`. PULSE reads the plant's status files and lights up within ~1 second — green when healthy. It shows EMPTY until the plant has published its first status; that clears within seconds of Step 1.

3

To stop the plant

Press Ctrl+C in its window — a GRACEFUL shutdown that finishes any in-flight write. NEVER close the window with the ✕ mid-write or taskkill it unless it's wedged: a torn write risks the vault. Order doesn't strictly matter on start (PULSE just shows EMPTY until the plant is up), but DataPlant-first means PULSE is green immediately.

4

Coming post-M1 (D-085)

PULSE gets Start / Stop / Kick-a-job buttons so you do all of this from the tray and never type a command, and the plant window gets titled 'Paras DataPlant' so it's obvious in the taskbar. PULSE stays read-only to the database forever — when it 'starts a job,' the plant does the database work, never PULSE.

How to read PULSE — the master-state ladder (D-079)

The tray color is the WORST true state, in this priority order: EMPTY > FAULT > CLOSED > BEHIND > SYNCING > SYNCED.
  • GREEN (SYNCED) = alive, bars current, streaming. No action.
  • PULSING BLUE (SYNCING) = backfill/gap-sync running. Let it run.
  • AMBER (BEHIND) = bars stale, or 'live stream silent but bars REST-current'. Glance; usually self-heals in ~120 s.
  • SLATE/GRAY (CLOSED) = market closed (CME break 14:00–15:00 PT, weekend, holiday). NORMAL, NOT a fault.
  • RED (FAULT) = real problem (disk/auth/tape/cross-audit). Open PULSE Activity, read the banner.
  • GRAY (EMPTY) = no data yet. Start backfill.
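
'Worst true state wins' amounts to a scan down the priority order until a state that currently holds is found. The sketch below uses the ladder exactly as quoted; the `master_state` function and the idea of passing in a set of currently true states are assumptions about how such a resolver might look, not a description of the PULSE code.

```python
# Priority order from the text: the first entry is the worst state.
LADDER = ["EMPTY", "FAULT", "CLOSED", "BEHIND", "SYNCING", "SYNCED"]


def master_state(true_states: set[str]) -> str:
    """Return the worst state in true_states according to LADDER."""
    unknown = true_states - set(LADDER)
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    for state in LADDER:
        if state in true_states:
            return state
    raise ValueError("at least one state must be true")
```

For example, a plant that is streaming but has stale bars would report both SYNCED and BEHIND as true, and the resolver returns BEHIND, which matches the rule that a wedged plant can never read green.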

The honesty guarantee you can trust

A wedged plant can NEVER read green: stale bars or bad quality force BEHIND at both the plant and PULSE (HZ-1, verified fixed). 'Live stream silent' amber is honest too — the live stream is only a tape + liveness signal; your minute bars stay fresh from the REST self-heal regardless (D-083). So a green light means 'current and cross-audited,' and that is exactly what you can act on. (DAILY_OPERATING_MODEL §3)

Now the day. WHAT THE MACHINE DOES ON ITS OWN (automatic, §4): about every 120 seconds the REST self-heal tops up bars to current; around 18:00 PT the plant's nightly DATA job runs gap-sync → consolidate higher timeframes → quality check → D1 cross-audit (reconciliation), and PULSE shows 'nightly top-up complete'; continuously the tape recorder archives the sub-minute stream for later execution research; and nightly a verified backup of the vault runs (D-071). If the machine was OFF at 18:00, the next plant start catches up the missed quality/audit window. None of this needs you. Your only standing job during the day is the PULSE glance — and you only act on red.

WHAT NEEDS YOU (research-time, hands-on). Two recurring AI reviews surface work for you to read but stay inert until you or a code budget gate act. The NIGHTLY CLAUDE RESEARCH LOOP (~05:30 PT, headless, docs/04 §1.2) renders an honest verdict on every finished experiment against its gate, scans for anomalies, appends findings to ops/learnings.md (negative results are findings), writes ops/reviews/<date>.md, and proposes ≤5 follow-up specs into specs/proposed/ — but those proposals are INERT: the runner's code budget gate promotes proposed→queued, never Claude, and a budget INCREASE needs your sign-off recorded in the ledger. The WEEKLY GAP SESSION (docs/04 §1.3) compares shadow/sim fills to backtest expectations — fill deltas, slippage drift, suppression frequency, live-vs-backtest expectancy — and can RECOMMEND recalibrate cost model / suppress strategy / retire to the G5 pool, which you confirm. You read these reviews; the system never silently acts on them.
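
The 'inert until promoted' rule for the nightly proposals can be sketched as a function that moves specs from proposed to queued only while budget remains. The assumptions here are loud ones: `promote` is a hypothetical name, each spec is assumed to cost exactly one trial, and the budget is passed in as a plain number, whereas the text says the real budget lives in the ledger and can only be raised with operator sign-off.

```python
MAX_PROPOSALS = 5  # the nightly loop proposes at most five follow-up specs


def promote(
    proposed: list[str], trials_used: int, trial_budget: int
) -> tuple[list[str], list[str]]:
    """Split proposals into (queued, still_proposed) within the remaining budget.

    Assumes one trial per spec. The budget is an input and is never raised here.
    """
    if len(proposed) > MAX_PROPOSALS:
        raise ValueError(f"at most {MAX_PROPOSALS} proposals per night")
    remaining = max(trial_budget - trials_used, 0)
    return proposed[:remaining], proposed[remaining:]
```

With the budget exhausted, every proposal stays in the second list, so nothing runs until a larger budget is signed off outside this code path.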

Don't confuse the two 'nightlies'

There are two different nightly things and they run at different times. The PLANT's nightly DATA job (~18:00 PT) is about bars: gap-sync, consolidate, quality, D1 cross-audit — fully automatic, you do nothing. The CLAUDE research loop (~05:30 PT) is about strategies: verdict experiments, write learnings, propose follow-ups — you read its review and its proposals are budget-gated by code. 18:00 = data; 05:30 = research. They never touch the same concern.

1

Open a session — /paras-start

Every session opens by loading the single sources of truth: read tracking/STATE.md (current milestone + ▶ NEXT ACTION + blockers + constraints), the Constitution (docs/00 §2, Principles 1–11) and tracking/DECISIONS.md (do not relitigate), and the active milestone's doc sections. It confirms the predecessor exit gate is met (or escalates to phase-gatekeeper) and hands you a one-screen briefing with a default first action. This skill is read-only — it orients, it never builds. (CLAUDE.md Working agreements; docs/06 §1.2 ORIENT)

2

Work the ring — the ULTRACODE loop

Every task runs ORIENT → PLAN → BUILD → PROVE → RECORD (docs/06 §1.2). Plan before you touch a file (U1). Work one Paras ring at a time, in order: Design → Data → Concurrency → Build → Wire → Verify (U2). Prove, don't claim — done = green tests + a measured row in tracking/TEST_LOG.md (U3). An independent auditor subagent grades the gate, never the author (U4). And update the trackers as part of the task (U5).

3

Watch the context budget

Warn at 50%, wrap at 60% (operator preference). At 50%, stop opening new deep docs and finish the current ring step. At 60%, run /paras-handover — no exceptions; the handover IS the memory transfer. If compaction fires before a handover, the PreCompact hook journals it and you must flush STATE/SESSION_LOG immediately — a compacted session without a handover is treated as a crashed session. (docs/06 §4.2)

4

Close the session — /paras-handover

Every substantial session closes by persisting state: update tracking/STATE.md (milestone, NEXT ACTION, blockers, constraints), append a dated entry to SESSION_LOG.md, refresh PROGRESS.md + the HTML dashboard if progress changed, add TEST_LOG rows for suites run, and end with a fenced 'Prompt for Next Session' block a cold agent can paste with zero prior memory. The handover MUST end with that block — non-negotiable. (CLAUDE.md Session Rules; docs/06 §5)

The one rule that protects your data

NEVER run a maintenance command while `--run` is live. The plant is the single writer to the vault, so running `--consolidate`, `--quality`, `--backup`, or `--backfill` while the plant is running risks a writer collision. Until the writer-guard ships (D-086, post-M1), the only protection is the habit: STOP THE PLANT FIRST. After D-086, those commands refuse to run (clean message, no database touched) when the plant is live. (DAILY_OPERATING_MODEL §5)
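
The text does not say how the D-086 writer-guard will be built. One common way to enforce a single writer is an exclusive lock file, sketched below purely as an illustration: the `single_writer` context manager, the `PlantIsLive` error, and the lock-file approach itself are all assumptions, not the planned design.

```python
import os
from contextlib import contextmanager
from pathlib import Path


class PlantIsLive(RuntimeError):
    """Raised when another process already holds the writer lock."""


@contextmanager
def single_writer(lock_path: Path):
    """Hold an exclusive lock file for the duration of the with-block."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise PlantIsLive(
            f"writer lock {lock_path} is held; stop the plant first"
        ) from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for diagnosis
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

In this picture the plant would hold the lock for its whole lifetime and a maintenance command would fail fast with a clean message before touching the database. A crash would leave a stale lock file behind, so a real guard would need a way to detect a dead owner.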

A NORMAL DAY, end to end. Sunday night CME reopens and you boot the box; the plant starts, buffers the live stream, reads its per-symbol checkpoint and pages TradeStation forward — PULSE goes pulsing blue ('gap-filling'), then settles green; you did nothing. Through the trading day PULSE stays green; at 14:00 PT it goes calm slate (CLOSED) for the CME maintenance break and back to green at 15:00 — that's normal, not a fault. Around 18:00 PT the nightly DATA job runs and PULSE flashes 'nightly top-up complete.' Overnight the vault backs up. At ~05:30 PT the Claude research loop verdicts last night's experiments and leaves you a review to read in the morning. When you want to BUILD or do research, you /paras-start, work one ring through the ULTRACODE loop, prove it with a TEST_LOG row and an independent auditor, and /paras-handover before context hits 60%. When you want to TRADE (post-M1), you open AXIOM — its own live tick feed, independent of the plant — and close it at the cash close. Your standing duty all day is one glance at PULSE, acting only on red.

Live trading sees ticks — the plant's 'tape-only' stream is unrelated (D-083)

When you trade live, AXIOM streams its OWN live tick/quote feed directly from TradeStation — bars + quotes + heartbeat, with reconnect/gap-fill — so you WILL see ticks while trading. The plant's stream being 'tape + liveness only' is about how the PLANT stores bars (REST is the authoritative bars feeder); it places NO restriction on AXIOM seeing ticks. AXIOM is a separate process with its own feed, so a plant outage never blinds live trading. (DAILY_OPERATING_MODEL §6)

The soak — proving the foundation before you trust it (DAILY_OPERATING_MODEL §7)

Before M1 'First Light' is earned, you run the DataPlant CONTINUOUSLY for 72 hours on the real machine and watch it cross market opens, the daily CME break, a weekend, sleep/wake, and network blips, proving it survives without crashing, leaking memory, wedging, or silently going stale. Pass criteria: alive the full 72 h with no memory creep; bars stay current via the ~120 s self-heal; survives the probes (P1 disconnect→reconnect, P2 sleep/wake, P3 multi-day gap→heals); PULSE stays HONEST throughout; the nightly job runs and recovers cleanly. Run it yourself (the agent cannot run .ps1 scripts or run live): `scripts\soak-run.ps1`, watch PULSE, follow ops/runbooks/soak-72h.md. Sequencing (S028): do the data fill + Databento validation FIRST, then soak; you want to soak the FINAL dataset, not one about to change underneath it (§8).

The one idea to keep

The always-on thing is the headless engine, not a window — and PULSE watches it so you never open anything just to check your data. The machine handles the cadence (self-heal, nightly data job, backup, tape) on its own; YOU handle the session rhythm (paras-start → work the ring → paras-handover) and the few decisions the AI reviews surface but never act on. Built purely for a one-person operation: the fewest things to manage, and a green light you can actually trust.

What to stay aware of
  • The always-on thing is the headless DataPlant, NOT a window — if you're opening the SENTINEL dashboard just to check data health, you've got the model backwards; that's PULSE's job.
  • Stop the plant with Ctrl+C (graceful) — never the ✕ or taskkill mid-write unless it's wedged; a torn write risks the vault.
  • NEVER run a maintenance command (--consolidate/--quality/--backup/--backfill) while --run is live — until D-086 ships, stop the plant first (single-writer law).
  • Slate 'CLOSED' is normal, not a fault — CME break 14:00–15:00 PT, weekends, holidays; only RED needs action, and only then do you open PULSE Activity.
  • Two different nightlies: the plant's DATA job (~18:00 PT) and the Claude RESEARCH loop (~05:30 PT) — don't conflate 'data top-up' with 'experiment verdicts.'
  • Nightly Claude proposals are INERT — the runner's code budget gate promotes proposed→queued, never Claude; a budget increase needs your ledger-recorded sign-off.
  • A green light means 'current and cross-audited,' not 'complete since 2008' — deep pre-2021 history is sparse-and-gated (D-081); the historical-coverage panel (D-090) shows per-era completeness honestly.
  • Run the data fill + Databento validation BEFORE the soak (S028) — soak the final dataset, never one about to change underneath it; a bulk ingest during a soak would be a second writer + load + moving data.
  • Wrap the session at 60% context with /paras-handover (warn at 50%) — a compacted session with no handover is treated as a crashed one (PreCompact hook).
  • The agent cannot run .ps1 or run live — the soak (scripts\soak-run.ps1) and any operator-only step are yours to run; the agent builds and tests, you operate.

Locked decisions & the why

D-091
The always-on ingestion process is canonically the 'DataPlant' — the run-mode `Sponaitech.Sentinel --run`, the single DuckDB writer owning ingestion/backfill/self-heal/nightly-top-up/quality/reconciliation/tape.
Why: The operator asked 'what is the always-on thing vs. the dashboard vs. PULSE?' Naming the engine distinctly from the on-demand SENTINEL dashboard and the read-only PULSE watcher removes that confusion and anchors the whole daily model.
D-079
A closed market / CME maintenance break reads as a calm SLATE 'CLOSED' state — not a FAULT — driven by the plant's own SessionClock + holiday calendar; the master-state ladder is EMPTY > FAULT > CLOSED > BEHIND > SYNCING > SYNCED.
Why: During the daily break / weekends / holidays the stream naturally goes quiet; without schedule-awareness that read as a scary FAULT. FAULT still ranks ABOVE CLOSED so a genuine fault is never masked — the glance stays honest.
D-083
The live TradeStation stream is TAPE + a liveness signal only — it never writes the `bars` table; REST (the ~120 s self-heal) is the single authoritative bars feeder. AXIOM streams its own independent live feed for trading.
Why: A stream-side bars writer would be a second writer and a forming-bar look-ahead hazard with no currency benefit over REST. Keeping a connected-but-silent stream amber (not falsely green) keeps PULSE honest; and it's why live trading still sees ticks via AXIOM's own feed.
D-085
Post-M1, PULSE gains Start / Stop / Kick-a-job buttons and a control face, and the plant window is titled 'Paras DataPlant'.
Why: So the operator runs everything from the tray and never types a command to start/stop the engine — fewest things to manage for a one-person operation. PULSE still does the database work via the plant, never itself.
D-086
Post-M1 writer-guard: maintenance commands (`--consolidate`, `--quality`, `--backup`, `--backfill`) refuse to run when `--run` is live (clean message, no database touched).
Why: DuckDB is single-writer; running a maintenance command alongside the live plant risks a writer collision. Until the guard ships, the only protection is the operator habit 'stop the plant first' — this decision turns that habit into enforced code.
D-071
Continuity rule: every irreplaceable artifact has a nightly VERIFIED local backup; the restore path is drilled monthly; no machine-bound assumptions or absolute paths in code.
Why: A one-person, local-first lab must survive a machine loss. A backup that has never been restored is a hope, not a backup — so the verified nightly backup is part of the automatic daily cadence, and the monthly restore drill is on the operator.
Sources: docs/DAILY_OPERATING_MODEL.md §1 (the four runtime pieces) · docs/DAILY_OPERATING_MODEL.md §2 (cold start / stop; PULSE control face post-M1 D-085) · docs/DAILY_OPERATING_MODEL.md §3 (master-state ladder; honesty guarantee; D-079/D-083) · docs/DAILY_OPERATING_MODEL.md §4 (the daily rhythm — 120 s self-heal, 18:00 PT nightly data job, tape, nightly backup) · docs/DAILY_OPERATING_MODEL.md §5 (the one rule — single writer; D-086 writer-guard) · docs/DAILY_OPERATING_MODEL.md §6 (live trading & ticks; AXIOM's own feed; D-083) · docs/DAILY_OPERATING_MODEL.md §7-8 (the soak; data-fill-before-soak sequencing S028) · docs/04_AI_OPERATIONS_SPEC.md §1.2 (nightly Claude research session, ~05:30 PT) · docs/04_AI_OPERATIONS_SPEC.md §1.3-1.4 (weekly gap session; one pipeline, three data sources) · docs/04_AI_OPERATIONS_SPEC.md intro (two clocks: research-time vs runtime; research thinks, runtime says no) · docs/06_ULTRACODE_EXECUTION_METHODOLOGY.md §1.1-1.2 (Five Laws; ORIENT→PLAN→BUILD→PROVE→RECORD task loop) · docs/06_ULTRACODE_EXECUTION_METHODOLOGY.md §4.2 (context budget protocol — warn 50% / wrap 60%) · CLAUDE.md (Working agreements / session protocol; /paras-start, /paras-handover) · .claude/skills/paras-start/SKILL.md · .claude/skills/paras-handover/SKILL.md · .claude/skills/nightly-review/SKILL.md · tracking/DECISIONS.md (D-071, D-079, D-081, D-083, D-085, D-086, D-090, D-091)
Module 07 of 19 · The Three Apps

🧩 The Three Apps (PULSE / SENTINEL / AXIOM)

Paras is three Windows WPF apps over one shared kernel: PULSE watches the data plant's heartbeat, SENTINEL is the research laboratory that kills bad strategies cheaply, and AXIOM is the execution fortress that trades only gate-survivors through a deterministic compliance engine.

The one-line mental model

Three apps, one kernel, one job: reveal which strategies were ever gold — and trade only those, safely. PULSE = is the data alive? SENTINEL = is this edge real? AXIOM = trade it without blowing up. The same kernel code that backtests in SENTINEL is the code that trades in AXIOM (Constitution Principle 3 — parity by construction).

Paras is named after the Parasmani — the legendary touchstone that turns base metal to gold. The honest version of that legend is the point of this whole system: it touches every metal and reveals which were actually gold all along. Most aren't. The system's real job is to kill bad strategies cheaply. To do that it is built as three separate Windows applications sharing one pure C# kernel. Think of it as a factory with three rooms: a heartbeat lamp by the door (PULSE), a research laboratory in the back (SENTINEL), and a hardened trading vault out front (AXIOM). Each room has exactly one responsibility, and nothing crosses between them except along carefully guarded paths.

Why three apps instead of one big program? Because their jobs have completely different rhythms, risk profiles, and failure modes. The data heartbeat must run 24/7 and stay tiny and trustworthy. The research lab runs overnight in heavy batches and must never feel like a trading dashboard. The execution app touches real money on live data and is designed, by law, to be read-heavy and tempt-proof. Welding those into one process would let a research bug or a UI freeze take down live trading — exactly the kind of coupling Paras forbids. Separation is a safety feature.

always-on · read-only

PULSE — the heartbeat

An always-on Windows tray app beside the clock (project Sponaitech.Pulse). It answers one question at a glance: 'is my data plant alive and current?' Glanceable in 200ms, auditable in 10 seconds, forgettable the rest of the day. It is an instrument, not an application. It is strictly read-only toward the vault: it can pause/resume non-critical jobs, but it can never place, modify, or stop anything critical.

research · gates

SENTINEL — the laboratory

The research platform: backfills and maintains deep TradeStation market history, runs strategy experiments, forces every result through statistical deflation gates (G0–G6, DSR/PBO), a three-engine parity court, and a nightly Claude research loop. Its output is never 'best backtest' — it is survivors after deflation, plus an ever-growing corpus of validated negative results. Its dashboard is 'honesty made visible'.

live · deterministic

AXIOM — the fortress

The execution app: hosts only G6-promoted strategies on live TradeStation data, generates signals through the same kernel that backtested them, and routes every order through a deterministic Topstep compliance engine. A local AI sidecar (Gemma) rides alongside as a runtime conscience that can only say no. There is no buy button. There never will be.

All three are WPF (.NET 8) — OD-1 / D-084

As of the 2026-06-17 operator ruling (OD-1, decision D-084), all three apps — PULSE, SENTINEL, AXIOM — are Windows WPF (.NET 8) desktop apps with multithreading. The Next.js/React output from Claude Design is a UI/UX reference prototype ONLY, never a runtime dependency. This standardizes the whole system on one UI stack. ASP.NET Core survives only as SENTINEL's headless local API host (ledger read/queue endpoints); the Python FastAPI DSR/PBO service stays unchanged as a 127.0.0.1 compute helper.

Here is how data and decisions actually flow between the three rooms. Market history enters through SENTINEL's data pipeline (TradeStation WebAPI v3) and lands in one DuckDB vault — the single source of truth for bars. The always-on ingestion engine that owns that vault is canonically called the DataPlant (the run-mode 'Sponaitech.Sentinel --run'), and it is the only process that writes to DuckDB. PULSE never opens DuckDB at all; it reads a lightweight status file the plant publishes (pulse-status.json + an event log) and renders the heartbeat. SENTINEL's research side runs experiments against that vault, pushes them through the gates, and the rare survivor that clears all six gates is promoted. Only promoted strategy-cells cross into AXIOM. AXIOM then runs its OWN live TradeStation stream (it does not borrow SENTINEL's data feed), generates signals with the shared kernel, and is the only app in the entire system allowed to emit an order to a venue.

1 — Data lands (SENTINEL DataPlant → DuckDB)

The DataPlant backfills and maintains ES/CL/GC history into ops/ledger.duckdb (the seed instrument set; expansion is config-driven). It self-heals gaps on every startup, records a tape of the live stream for execution research, and runs nightly top-ups and quality checks. DuckDB is the single source of truth for bars (Principle 7).

2 — PULSE watches (read-only heartbeat)

PULSE reads the plant's published status file and shows a tray lamp: green (synced & streaming), pulsing blue (backfill/gap-sync), amber (behind), red (fault), or the calm slate CLOSED state (market closed — reopens HH:MM PT). It can pause/resume non-critical jobs and open SENTINEL Data Health — nothing else.

3 — SENTINEL experiments + gates (the kill room)

Every hypothesis is pre-registered (G0), then run through the gate pipeline G0–G6: a pulse test vs a random-entry null, realistic-fills, statistical deflation (DSR/PBO), a three-engine parity check (kernel/TradeStation/LEAN), walk-forward + holdout, and a final live-ready promotion review. Most ideas die here — and that is the system working.

4 — Promotion (G6 survivors only → AXIOM)

Only a strategy-cell that survives all gates and passes the G6 promotion review (operator + Claude weekly report, including a portfolio-correlation check) is assigned to AXIOM. The promotion criterion is uncorrelated daily P&L across survivors, never leaderboard rank.

5 — AXIOM trades it (deterministic, on its own feed)

AXIOM hosts the promoted cell on its own live TradeStation stream, generates signals through the SAME kernel that backtested it, and routes every order intent through the TopstepComplianceEngine to TopstepX. New strategies first run 10 clean shadow sessions before real orders. Every decision is journaled for the forever live-vs-backtest gap analysis.

| App | Project | Rhythm | Owns | Can it emit an order? |
| --- | --- | --- | --- | --- |
| PULSE | Sponaitech.Pulse | Always-on (24/7) | Nothing: read-only watcher of the DataPlant status file | No: read-only, never touches the order path |
| SENTINEL | Sponaitech.Sentinel | Overnight batches + on-demand dashboard | The DuckDB vault, the experiment ledger, gates G0–G6, the nightly Claude loop | No: SENTINEL never touches a venue |
| AXIOM | Sponaitech.Axiom | Market hours (06:00–13:30 PT) | Live trading of G6 survivors through the compliance engine | Yes: AXIOM is the ONLY app that emits orders |

A concrete walk-through makes the boundaries vivid. Suppose you have a hypothesis: 'opening-range breakout on ES, only when the prior session was a trend day.' You write it as a pre-registered spec YAML with a one-sentence mechanism and a trial budget — that satisfies G0. SENTINEL runs it overnight against the DuckDB history: first a fast screen against a random-entry null (G1), then a rigorous run with pessimistic fills (G2), then the deflation service checks whether the edge survives the number of trials your family has already burned (G3), then the kernel result is reconciled three ways against LEAN and TradeStation (G4 — any mismatch is STOP-EVERYTHING), then walk-forward and a one-shot frozen holdout (G5), and finally a promotion review (G6). If — and it usually won't — the cell survives all of that, it is assigned to AXIOM. The next market morning AXIOM's PRE_FLIGHT checks pass at 06:00 PT, the strategy arms, a signal fires, the compliance engine confirms the order is within Topstep limits, and AXIOM places the bracket. PULSE, the whole time, just sits by the clock glowing green because the data plant kept feeding the vault.
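
The walk-through above is a fail-fast pipeline: gates run in order, the first failure kills the cell, and a G4 parity mismatch also raises a stop-everything flag. A minimal sketch of that control flow follows; `run_gates`, the `checks` mapping, and the result keys are hypothetical names, and the gate bodies are stand-ins for the real tests.

```python
GATES = ["G0", "G1", "G2", "G3", "G4", "G5", "G6"]

def run_gates(cell, checks):
    """Run gates in order; stop at the first one that fails."""
    for gate in GATES:
        if not checks[gate](cell):
            return {
                "promoted": False,
                "killed_at": gate,
                # A parity mismatch is more than a dead cell: it halts everything.
                "stop_everything": gate == "G4",
            }
    return {"promoted": True, "killed_at": None, "stop_everything": False}

# Stand-in gate checks: every gate passes unless overridden.
passing = {g: (lambda cell: True) for g in GATES}
dies_at_g3 = dict(passing, G3=lambda cell: False)
parity_break = dict(passing, G4=lambda cell: False)

survivor = run_gates("orb-es-trend-day", passing)
killed = run_gates("orb-es-trend-day", dies_at_g3)
halted = run_gates("orb-es-trend-day", parity_break)
```

Because the loop returns on the first failure, a cell killed at G3 never consumes the heavier parity, walk-forward, or holdout work behind it, which is what keeps the kill cheap.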

Hard boundaries — these are load-bearing, not guidelines

Only AXIOM emits orders; SENTINEL never touches a venue. No LLM ever sits in any order path — stops, targets, sizing, and flatten deadlines are deterministic code (Principle 4). The AI sidecar (any model tier) can only reduce risk: veto, narrate, resist, reflect — it cannot approve, place, size, modify, or cancel. PULSE is read-only forever and never opens DuckDB (the single-writer law). AXIOM config is immutable during market hours (06:00–13:30 PT).

Notice how the three apps each express the system's personality differently in their UI. PULSE is a well-made instrument-panel lamp: four (now five) calm states, monospace numbers, zero gamification. SENTINEL's dashboard is a laboratory's glass wall — most of what it shows is dead hypotheses displayed with dignity, because the corpus of verified 'no' is the asset; there is no celebration, no green-by-default, no leaderboards. AXIOM is designed against the user, lovingly: read-heavy by law and tempt-proof by design, because the operator's documented failure mode is intervening after success. The only large affordance in AXIOM is the kill switch — because flattening reduces risk. None of the three has a button that increases risk.

Shared kernel: Sponaitech.Kernel, a pure library (no venue SDKs, HTTP, UI, LLM calls, Topstep values, or wall-clock reads). The same code backtests in SENTINEL and trades in AXIOM (Principle 3).
Single bar source: DuckDB (ops/ledger.duckdb). SENTINEL's DataPlant is its only writer; the research side reads it; PULSE never opens it and reads the plant's status file instead; AXIOM uses its own live stream.
DataPlant: The always-on engine = 'Sponaitech.Sentinel --run' (D-091). Distinct from the on-demand SENTINEL research dashboard.
Order path: AXIOM only → TopstepComplianceEngine → TopstepX (primary) / TradeStation (fallback, default OFF). Deterministic end to end.
AI role: Reasoner and veto filter only, never a forecaster or order manager (Principle 5). Local Ollama/Gemma; zero runtime token spend.

Why this division of labor is the safety system

Each app fails independently. If the research lab has a bug, live trading is untouched. If PULSE crashes, the data plant keeps running (PULSE is just a watcher, and the plant is built to never crash on a status-file write — D-078). If AXIOM's venue connection drops with a position open, its watchdog flattens and halts — never an unattended open position. The three-room design means a problem in one room cannot quietly become a problem in another.

What to stay aware of
  • All three apps are WPF (.NET 8) as of OD-1/D-084 — if you read older docs that say AXIOM is WinUI 3 or the SENTINEL dashboard is Next.js, those lines are superseded.
  • Only AXIOM can emit an order. SENTINEL never touches a venue, and PULSE is read-only forever — if you ever see a design that lets PULSE or SENTINEL place/modify/stop something critical, it is wrong.
  • PULSE never opens DuckDB. It reads the plant's published status file. The DataPlant ('Sponaitech.Sentinel --run') is the single writer to the vault (single-writer law).
  • The 'DataPlant' (always-on ingestion engine) and the 'SENTINEL dashboard' (on-demand research lab) are different things even though both live in the Sponaitech.Sentinel project — don't conflate them (D-091).
  • AXIOM runs its OWN live TradeStation stream; it does not borrow SENTINEL's data feed. Only promoted G6-survivor strategy-cells cross from SENTINEL into AXIOM.
  • The kernel is what makes parity real: the code that trades in AXIOM is the same code that backtested in SENTINEL. A divergence here is a STOP-EVERYTHING (G4 parity) event.
  • ES/CL/GC is the seed instrument set; instrument and session expansion is config-driven (and per-instrument/asset-class-agnostic going forward, D-094) — not a rewrite per app.

Locked decisions & the why

D-084
(OD-1) All three apps — PULSE, SENTINEL, AXIOM — are Windows WPF (.NET 8) desktop apps; the Next.js/React Claude Design output is a UI/UX reference prototype only, never a runtime dependency.
Why: Supersedes the earlier 'AXIOM stays WinUI 3' and 'SENTINEL dashboard = Next.js' lines. A one-UI-stack standardization is the right solo-operator simplification and matches D-073's own WPF-over-WinUI reasoning. ASP.NET Core remains only as SENTINEL's headless local API host; the Python FastAPI validation service is unchanged.
D-091
The headless always-on ingestion process is canonically named the 'DataPlant' (operator-facing 'Paras DataPlant') — the run-mode 'Sponaitech.Sentinel --run', the single DuckDB writer.
Why: Resolves the operator's 'what is the always-on thing vs the dashboard' question: DataPlant = always-on engine; SENTINEL dashboard = on-demand research lab; PULSE = the DataPlant's read-only tray watcher.
D-073
PULSE is WPF on .NET 8; status transport is file-based (atomic pulse-status.json + pulse-events.jsonl + a single reverse channel for the two pause toggles) — PULSE never opens DuckDB.
Why: Enforces the single-writer-process law: the DataPlant is DuckDB's only client, so PULSE stays strictly read-only and cannot corrupt the vault. (AXIOM-stays-WinUI-3 clause later superseded by D-084.)
D-079
A closed market / CME maintenance break reads as a calm slate CLOSED state ('market closed — reopens HH:MM PT'), not a FAULT. Priority order EMPTY > FAULT > CLOSED > BEHIND > SYNCING > SYNCED.
Why: Before this, a closed-market HTTP 400 showed as a scary fault. FAULT stays ABOVE CLOSED so a genuine disk/auth/quality alarm is never masked by 'market closed' — honesty preserved while a resting market reads as resting.
D-078
The data plant must never crash on a status-file write; PULSE's --demo mode was removed (real-data-only). Status publishing is best-effort and can never tear down the plant.
Why: A 72h soak crashed at ~5h because a PULSE/plant file-share collision propagated a fatal IOException. Status publishing is monitoring, not the data path — so it must never be able to crash the always-on ingestion engine that PULSE merely watches.
D-069
No bar-level LLM calls — session cadence is the runtime ceiling for any model invocation; zero new API token spend at runtime (local Ollama only); no reflection-derived values in execution/sizing/filters/gates.
Why: Keeps the AI sidecar a read-only conscience in AXIOM and keeps marginal AI cost at $0 (Principle 10). Any step that 'needs' a cloud model at runtime is, by definition, the wrong step.
Sources: CLAUDE.md — 'What this repo is', the Constitution, Hard rules, Stack (WPF/OD-1/D-084) · docs/00_MASTER_PLAN.md §1 (what we are building), §2 (Constitution), §3 (system context + hard boundaries) · docs/02_SENTINEL_SPEC.md — intro, §1.5 PULSE, §3 gate pipeline G0–G6, §8 dashboard (WPF) · docs/03_AXIOM_SPEC.md — intro, §1 layers, §2 session state machine, §3 compliance engine, §5 AI sidecar, §7 shadow/promotion · docs/DESIGN_BRIEF_PULSE.md — §1 what PULSE is, §4 surfaces/states, D-079 CLOSED addendum · docs/DESIGN_BRIEF_SENTINEL.md — §1 identity (honesty made visible), §5 six screens · docs/DESIGN_BRIEF_AXIOM.md — §1 design against the user, §4 surfaces, anti-pattern laws · tracking/DECISIONS.md — D-073, D-078, D-079, D-084, D-085, D-091, D-094
Module 08 of 19 · The Three Apps

🗄 The Data Plant (the foundation)

The always-on engine that ingests, heals, and quality-checks every bar in one DuckDB vault — because every backtest, gate, and live decision is only as honest as the data underneath it.

Why this page matters

Everything Paras does — backtests, deflation gates, three-engine parity, the nightly Claude loop, eventually live trading — reads from one place: the bars. If a bar is wrong, missing, or quietly fabricated, every conclusion built on it is poisoned, and you'd never see it. The Data Plant exists to make the data honest. It is the touchstone under the touchstone.

Plain English first: the Data Plant is the headless, always-on program (run-mode `Sponaitech.Sentinel --run`, canonically the 'DataPlant' and operator-facing 'Paras DataPlant', D-091) that goes out to TradeStation, pulls down price bars, stores them in a single database file, repairs any gaps after an outage, records the live tape, runs a holiday-aware quality audit each night, and publishes its heartbeat so the PULSE tray app can show you a green/amber/red glance. It is not the research dashboard and it is not PULSE; it is the engine those two watch and read from.

the engine

DataPlant

Always-on engine. The single DuckDB writer that owns ingestion, backfill, self-heal, nightly top-up, quality, reconciliation, and the tape.

on-demand

SENTINEL dashboard

On-demand research lab (WPF). You open it when you want to look at experiment cells, gates, and learnings. Read-mostly over the same vault.

the watcher

PULSE

Read-only tray watcher. Reads the plant's status file; never opens DuckDB. Shows the heartbeat at a glance.

WHY it exists: a one-person, AI-automated trading lab runs unattended overnight and over weekends. Data feeds go down, the machine reboots, CME closes for maintenance, holidays produce zero bars, half-days produce short sessions. A naive pipeline would either crash, silently skip data, or — worst of all — paper over a gap so the missing bars look fine. Constitution Principle 7 says there is one source of truth per concern: bars live in DuckDB, and nowhere else. The plant is the machinery that keeps that one source honest without you babysitting it.

The storage vault (DuckDB, docs/02 §1.3)

Everything lands in one embedded DuckDB file, `ops/ledger.duckdb`. The `bars` table is keyed PK(symbol, res, end_utc) — so re-pulling an overlapping range is harmless: the same bar merges onto itself idempotently. Sibling tables: `econ_calendar` (scheduled FOMC/CPI/NFP/EIA times only, never released values), `data_quality` (per-day audit verdicts), and `instruments` (tick size/value, point value, session, max contracts). No PostgreSQL, no SQLite — DuckDB is the only datastore (D-072).

bars(symbol, res, end_utc, o, h, l, c, v, ingest_utc, source)        -- PK(symbol,res,end_utc)
econ_calendar(event_utc, event, importance, instruments_json, source) -- scheduled times only (point-in-time honest)
data_quality(symbol, res, day, missing_bars, dup_bars, gap_max_min, ok)
instruments(symbol, tick_size, tick_value, point_value, session_json, max_contracts)
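
Why the primary key makes re-pulls harmless can be shown without a database. The sketch below models the `bars` table as a dictionary keyed by (symbol, res, end_utc); `Bar`, `merge`, and the placeholder prices are illustrative assumptions, and the real vault is DuckDB, not a dict.

```python
from typing import NamedTuple

class Bar(NamedTuple):
    symbol: str
    res: str
    end_utc: str   # ISO-8601 bar end time
    o: float
    h: float
    l: float
    c: float
    v: int

def merge(vault, bars):
    """Upsert bars under (symbol, res, end_utc); return how many keys were new."""
    added = 0
    for b in bars:
        key = (b.symbol, b.res, b.end_utc)
        if key not in vault:
            added += 1
        vault[key] = b   # the same bar lands on itself
    return added

def es_bar(end_utc):
    # Placeholder prices: only the key matters for this demonstration.
    return Bar("ES", "M1", end_utc, 1.0, 1.0, 1.0, 1.0, 1)

vault = {}
page_a = [es_bar(f"2024-01-05T21:{m}:00Z") for m in ("57", "58", "59")]
page_b = [es_bar(f"2024-01-05T21:{m}:00Z") for m in ("58", "59")]
page_b.append(es_bar("2024-01-05T22:00:00Z"))

first = merge(vault, page_a)    # three new keys
second = merge(vault, page_b)   # two overlap, one is new
third = merge(vault, page_b)    # a full re-pull adds nothing
```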

1. Backfill — the pager

TradeStation caps a request at 57,600 bars and 3 calendar years of minute data, behind a credit-based rate limiter. So the pager is single-threaded, polite, and resumable: one symbol at a time, exponential backoff on a 429, and a checkpoint table that records the last completed page per (symbol, resolution). A restart resumes — it never re-pulls from scratch. Multithreading buys nothing here because the limiter is the bottleneck; concurrency lives in the experiment runner instead.
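
A minimal sketch of the resume-and-backoff behaviour described above. `RateLimited` stands in for an HTTP 429, `fetch_page` is a hypothetical callable (no real TradeStation client is used), and the checkpoint is a plain dict rather than a table; the delays and retry count are illustrative.

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 from the data API."""

def backfill(symbol, res, pages, fetch_page, checkpoint,
             sleep=time.sleep, base_delay=1.0, max_retries=5):
    """Pull pages in order, resuming after the last page recorded in checkpoint."""
    start = checkpoint.get((symbol, res), -1) + 1
    bars = []
    for page in range(start, pages):
        delay = base_delay
        for _ in range(max_retries):
            try:
                bars.extend(fetch_page(symbol, res, page))
                break
            except RateLimited:
                sleep(delay)   # back off, then retry the same page
                delay *= 2     # exponential growth
        else:
            raise RuntimeError(f"page {page} still rate-limited after retries")
        checkpoint[(symbol, res)] = page   # record only completed pages
    return bars

# Usage: page 0 finished before a restart; page 1 hits one 429 before succeeding.
calls = []
def fake_fetch(symbol, res, page):
    calls.append(page)
    if page == 1 and calls.count(1) == 1:
        raise RateLimited()
    return [f"{symbol}-{res}-p{page}"]

sleeps = []
ckpt = {("ES", "M1"): 0}
out = backfill("ES", "M1", 3, fake_fetch, ckpt, sleep=sleeps.append)
```

The checkpoint is written only after a page completes, so a crash mid-page re-pulls that one page on restart, and the primary key absorbs any overlap.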

2. Derive timeframes — the BarConsolidator

Only two things are pulled from the wire: M1 (1-minute, from 2008-01-01) and native D1 (official settlement prices, walked until empty for full depth). Every intermediate timeframe — M5, M15, M30, H1, H4 — is DERIVED from canonical M1 by the kernel's BarConsolidator, never extracted separately. Weekly/monthly derive from D1. Per-timeframe pulls would waste rate-limit credits and create cross-timeframe stamping inconsistencies.
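
Deriving a higher timeframe from M1 is a bucketed aggregation: first open, highest high, lowest low, last close, summed volume. The sketch below assumes bars are stamped by their END time (as the `end_utc` key suggests) and buckets on UTC clock boundaries; the real BarConsolidator may anchor buckets to the session or drop incomplete buckets, which this sketch does not attempt.

```python
from datetime import datetime, timezone

def consolidate(m1_bars, minutes):
    """Aggregate end-stamped M1 bars (end_utc, o, h, l, c, v) into N-minute bars."""
    width = minutes * 60
    out = {}
    for end_utc, o, h, l, c, v in sorted(m1_bars):
        ts = int(end_utc.timestamp())
        bucket_end = -(-ts // width) * width   # round up to the bucket's end
        if bucket_end not in out:
            out[bucket_end] = [o, h, l, c, v]  # the first bar sets the open
        else:
            agg = out[bucket_end]
            agg[1] = max(agg[1], h)
            agg[2] = min(agg[2], l)
            agg[3] = c                         # the latest bar sets the close
            agg[4] += v
    return [(datetime.fromtimestamp(k, timezone.utc), *vals)
            for k, vals in sorted(out.items())]

def t(minute):
    return datetime(2024, 1, 5, 14, minute, tzinfo=timezone.utc)

# Illustrative prices only.
m1 = [
    (t(31), 100.0, 101.0, 99.5, 100.5, 10),
    (t(32), 100.5, 102.0, 100.0, 101.5, 20),
    (t(33), 101.5, 101.75, 100.25, 100.75, 5),
    (t(34), 100.75, 101.0, 99.0, 99.5, 15),
    (t(35), 99.5, 100.0, 99.25, 99.75, 10),
]
m5 = consolidate(m1, 5)
```

The five bars ending 14:31 through 14:35 collapse into one M5 bar ending 14:35, so every derived timeframe inherits its stamps from the same M1 rows.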

3. Quality audit — holiday-aware

Each day, for each symbol/resolution, the QualityEngine computes the EXPECTED bar count from the session template adjusted by a CME holiday & early-close calendar, then compares to actual. A full holiday expects zero bars (zero actual = ok, not a fault); a half-day like post-Thanksgiving is short by design, not 'missing'. Any `ok=false` day is excluded from rigorous runs and surfaced on Data Health and in PULSE.
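
The expected-versus-actual comparison can be sketched as a calendar lookup with a full-session default. Everything numeric here is an assumption for illustration: the 1,380-bar full session (23 trading hours of M1), the early-close count, and the rule that `ok` depends only on missing bars (the real `data_quality` row also tracks duplicates and the largest gap).

```python
from datetime import date

FULL_SESSION_M1 = 1380   # assumption: a 23-hour session x 60 one-minute bars

# Hypothetical calendar: day -> expected M1 bars when it differs from a full session.
CALENDAR = {
    date(2024, 12, 25): 0,      # full holiday: zero bars expected
    date(2024, 11, 29): 1095,   # early close: illustrative count, not a real template
}

def audit_day(day, actual_bars, calendar=CALENDAR, full_session=FULL_SESSION_M1):
    """Compare actual bars to the calendar-adjusted expectation for one day."""
    expected = calendar.get(day, full_session)
    missing = max(expected - actual_bars, 0)
    return {"day": day, "expected": expected,
            "missing_bars": missing, "ok": missing == 0}

holiday = audit_day(date(2024, 12, 25), 0)       # zero expected, zero actual
half_day = audit_day(date(2024, 11, 29), 1095)   # short by design
gappy = audit_day(date(2024, 12, 3), 1300)       # ordinary day with a hole
```

Only the third case is flagged: the holiday and the half-day match their own expectations, while the ordinary day is 80 bars short of a full session.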

4. Self-heal — startup gap-sync

On every start — after any outage, weekend, or multi-day break — the plant subscribes to the live stream first (buffer), then gap-fills from the per-symbol checkpoint to now via paged requests, merging under the (symbol,res,end_utc) key. The same code path heals a 2-minute gap and a 2-week gap.

5. Nightly top-up

At 18:00 PT a job fetches everything since the last bar, then runs the quality checks again. A nightly D1 cross-audit derives a daily bar from M1 internally and diffs it against native D1 — OHLV must agree within tolerance; any unexplained delta is a data-quality alarm.
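
A minimal sketch of the D1 cross-audit. It reads 'OHLV' literally and leaves the close out, on the assumption that native D1 carries the official settlement rather than the last trade; the function names and both tolerances are hypothetical.

```python
def derive_daily(m1_bars):
    """Collapse one session's M1 bars (o, h, l, c, v), in time order, into O/H/L/V."""
    return {
        "o": m1_bars[0][0],
        "h": max(b[1] for b in m1_bars),
        "l": min(b[2] for b in m1_bars),
        "v": sum(b[4] for b in m1_bars),
    }

def cross_audit(derived, native, price_tol, volume_tol_frac):
    """Return the fields whose delta exceeds tolerance (an empty list means agreement)."""
    alarms = [f for f in ("o", "h", "l")
              if abs(derived[f] - native[f]) > price_tol]
    if abs(derived["v"] - native["v"]) > volume_tol_frac * native["v"]:
        alarms.append("v")
    return alarms

# Illustrative session of three M1 bars.
session = [
    (100.0, 101.0, 99.5, 100.5, 10),
    (100.5, 102.0, 100.0, 101.5, 20),
    (101.5, 101.75, 99.0, 99.75, 30),
]
derived = derive_daily(session)

native_ok = {"o": 100.0, "h": 102.0, "l": 99.0, "c": 99.9, "v": 60}
native_bad = dict(native_ok, h=103.0)

clean = cross_audit(derived, native_ok, price_tol=0.25, volume_tol_frac=0.02)
alarm = cross_audit(derived, native_bad, price_tol=0.25, volume_tol_frac=0.02)
```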

6. Tape recorder — record-forward

From day one, the live quote/trade stream is archived during sessions (DuckDB/Parquet). Sub-minute history cannot be backfilled — only recorded, going forward. Its purpose is execution research (slippage, fill micro-timing) AFTER an edge exists — never signal mining below M5.

7. Status stream — the heartbeat

The plant is the WRITER of a lightweight status channel (atomic `pulse-status.json` + `pulse-events.jsonl`). PULSE is strictly read-only and shows a 4-state-plus tray icon: green=synced & streaming, blue=backfill/gap-sync running, amber=behind, red=fault, plus a calm SLATE 'CLOSED' state when the market is closed (D-079).
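
The D-079 priority order (EMPTY > FAULT > CLOSED > BEHIND > SYNCING > SYNCED) amounts to 'show the highest-ranked state that is currently true'. A minimal sketch, assuming the inputs are per-symbol or per-check state strings; `master_state` is a hypothetical name.

```python
# Highest priority first, as listed in D-079.
LADDER = ["EMPTY", "FAULT", "CLOSED", "BEHIND", "SYNCING", "SYNCED"]

def master_state(states):
    """Return the highest-priority state among the ones reported."""
    return min(states, key=LADDER.index)

# A genuine fault is never masked by a closed market.
during_break_with_fault = master_state(["CLOSED", "FAULT", "SYNCED"])
quiet_weekend = master_state(["CLOSED", "SYNCED"])
catching_up = master_state(["SYNCED", "SYNCING", "BEHIND"])
```

Because FAULT sits above CLOSED in the list, the first case resolves to FAULT even though the market is closed.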

Concrete example — a weekend gap heals itself

You shut the box Friday evening. Sunday night CME reopens and you boot the machine. The plant starts, subscribes to the live ES/CL/GC stream and buffers it, reads the per-symbol checkpoint (say the last ES M1 bar is Friday 13:59 PT), and pages TradeStation from there to now. Bars merge under their PK — any overlap with the buffered stream is harmless. PULSE shows pulsing blue ('gap-filling ES M1 ...page 3/5'), then settles to green. You did nothing. That is the self-healing-by-construction the continuity doctrine (D-071) depends on.

HOW honesty is enforced in the storage shape itself: the `econ_calendar` model has NO Actual/Released/Realized/Value/Forecast/Previous field — storing a released number would let a backtest 'know' an outcome it couldn't have known in advance (a point-in-time honesty violation). Gates may use only the SCHEDULED event time. A reflection test asserts this contract cannot silently regress. The same spirit runs through the whole plant: it would rather show you an amber 'behind' or a red fault than fabricate a clean-looking bar.
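
The contract test described above can be sketched as a field-name scan. The real test reflects over the C# model; this Python analogue uses a dataclass mirroring the `econ_calendar` columns, and `leaked_fields` plus the `LeakyEvent` counter-example are hypothetical.

```python
from dataclasses import dataclass, fields

FORBIDDEN = ("actual", "released", "realized", "value", "forecast", "previous")

@dataclass(frozen=True)
class EconEvent:
    event_utc: str
    event: str
    importance: str
    instruments_json: str
    source: str

@dataclass(frozen=True)
class LeakyEvent:
    event_utc: str
    event: str
    actual_value: float   # a released number: exactly what must never be stored

def leaked_fields(model):
    """Return field names that would let a released value into the model."""
    return sorted(f.name for f in fields(model)
                  if any(bad in f.name.lower() for bad in FORBIDDEN))

honest = leaked_fields(EconEvent)
regressed = leaked_fields(LeakyEvent)
```

Run as a unit test, an empty result for the real model is the pass condition, so adding a forbidden column later breaks the build instead of quietly leaking outcomes into a backtest.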

| Concern | Single source of truth | Why it's the only one |
| --- | --- | --- |
| Bars | DuckDB `ops/ledger.duckdb` | Principle 7; PK(symbol,res,end_utc) makes ingestion idempotent |
| Derived timeframes | Kernel BarConsolidator from M1 | Parity by construction: same code path the engines use; no separate pulls |
| Quality verdict | `data_quality` table | Holiday/early-close aware; one ok=false day = excluded everywhere at once |
| Scheduled events | `econ_calendar` (times only) | No released values can ever leak into a backtest |
| Sub-minute tape | TapeRecorder (Parquet/DuckDB) | Cannot be backfilled; must be recorded forward from day one |

Watch-out: back-adjusted is not raw price

The stored continuous bars (@ES, @CL, @GC) are BACK-ADJUSTED — the standard, internally-consistent frame for return/indicator backtesting — NOT raw traded price levels (D-092/D-093). Two consequences you must hold in your head: (1) percentage-based signals must divide point-changes by the true unadjusted level (deep-history adjusted levels are distorted, even negative) or be restricted to the recent decade; point-based logic (ticks, ATR, ranges) is safe. (2) Deep pre-2021 M1 has real coverage sparsity — ~25% of 2008–2020 days miss >2% of minutes — so those days are ACCEPTED-AND-GATED (flagged in `data_quality`, excluded from rigorous runs), never silently re-backfilled (D-081/D-088).
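
A worked example of consequence (1), with hypothetical levels: the same 14-point move looks nearly three times larger when divided by a distorted back-adjusted level than when divided by the true unadjusted one. `pct_change` is an illustrative helper, not a kernel function.

```python
def pct_change(point_change, unadjusted_level):
    """Percentage move relative to the true traded level."""
    if unadjusted_level <= 0:
        raise ValueError("level must be a positive, unadjusted price")
    return point_change / unadjusted_level

move_pts = 14.0          # point change: identical in both frames
adjusted_level = 500.0   # hypothetical deep-history back-adjusted level
true_level = 1400.0      # hypothetical unadjusted level on the same day

naive = move_pts / adjusted_level           # 2.8 %: overstated
honest = pct_change(move_pts, true_level)   # 1.0 %: what the market actually did

# A negative adjusted level cannot serve as a denominator at all.
try:
    pct_change(move_pts, -50.0)
    rejected = False
except ValueError:
    rejected = True
```

Point-based logic never hits this problem because the 14-point move is the same number in both frames; only the denominator differs.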

Watch-out: one writer, never two

DuckDB has a single-writer-process law. The DataPlant is the vault's ONLY writer. PULSE never opens DuckDB; it only reads the status file (D-073). The plant must also never crash on a status-file write: status publishing is monitoring, not the data path, so the write is best-effort and an IOException from it can never tear the plant down (D-078, learned the hard way when a 72h soak died at ~5h). And the live stream is a TAPE + liveness signal only: it never writes the `bars` table, and REST is the single authoritative bars feeder (D-083).
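
The best-effort status write can be sketched as follows. This is a Python stand-in for the idea only: the real plant is C#, and `publish_status` plus the temp-file naming are assumptions.

```python
import json
import os
import tempfile

def publish_status(path: str, status: dict) -> bool:
    """Best-effort atomic write: returns False instead of raising on I/O errors."""
    tmp = path + ".tmp"
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(status, fh)
        os.replace(tmp, path)   # atomic swap, so a reader never sees a half-written file
        return True
    except OSError:
        return False            # a monitoring failure must never reach the data path

workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "pulse-status.json")
ok = publish_status(good_path, {"state": "green"})
failed = publish_status(os.path.join(workdir, "missing-dir", "x.json"), {"state": "green"})
print(ok, failed)
```

The caller can log a `False` and move on. The ingestion loop never sees an exception from the monitoring side.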

Watch-out: data parity is a second witness

Just as three engines must agree (engine parity), the plant gets a second data witness: a second-source audit (D-015) cross-checks sample months of ES/CL/GC M1 against Databento. Tolerances: OHLC within 1 tick, volume within a few %, sessions aligned. This audit is what REVEALED that TradeStation (back-adjusted) and Databento (raw) live in different price frames — a near-constant per-symbol offset (e.g. @ES +529.5 pts). That is exactly why you never co-mingle vendors into one series: it would splice a ~530-point discontinuity at the join (D-092). TradeStation stays PRIMARY; the audit capability remains a standing authenticity check.
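
One way to picture how such an audit exposes a price-frame mismatch: separate the constant offset from the bar-by-bar residual. The sketch below is hypothetical (`audit_closes` and the sample closes are invented; only the +529.5 offset and the 1-tick tolerance come from the text above).

```python
from statistics import median

def audit_closes(primary: list, second: list, tick: float) -> dict:
    """Compare two vendors' closes for the same bars."""
    diffs = [p - s for p, s in zip(primary, second)]
    offset = median(diffs)                                  # constant frame offset, if any
    worst = max(abs(d - offset) for d in diffs)             # residual once the offset is removed
    return {
        "offset": offset,
        "worst_residual": worst,
        "same_frame": abs(offset) <= tick,   # vendors agree on price level
        "shape_ok": worst <= tick,           # vendors agree on bar-to-bar movement
    }

raw = [5000.00, 5001.25, 4999.50]            # hypothetical raw-frame closes
adjusted = [px + 529.5 for px in raw]        # same bars in a shifted frame
report = audit_closes(adjusted, raw, tick=0.25)
print(report)
```

A report with `shape_ok` true and `same_frame` false says the two vendors describe the same market in different frames. The data is authentic, but the series must never be spliced.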

  • Vault: ops/ledger.duckdb (DuckDB, single-writer)
  • Pulled from wire: M1 (from 2008-01-01) + native D1 settlements
  • Derived (never pulled): M5/M15/M30/H1/H4 via kernel BarConsolidator; W/M from D1
  • Storage frame: Full 24h Globex sessions; back-adjusted continuous (@ES etc.)
  • Seed symbols: ES, CL, GC (NQ / currency futures / VX are config-driven expansion)
  • Nightly top-up: 18:00 PT fetch-since-last + quality checks + D1 cross-audit
  • Status transport: Atomic pulse-status.json + pulse-events.jsonl (PULSE reads, never writes)

The one idea to keep

The Data Plant's whole job is to make the data honest cheaply and automatically, so that when a gate kills a strategy you can trust the verdict. Honest data is the foundation everything else stands on — get a bar wrong and you don't lose one backtest, you lose your ability to tell gold from pyrite.

What to stay aware of
  • The stored continuous bars are BACK-ADJUSTED, not raw traded prices — point-based logic is safe; percentage signals must use the true unadjusted level or stay in the recent decade.
  • Only M1 and native D1 are pulled; every other timeframe is DERIVED by the kernel BarConsolidator — if you ever see a 'pull M5 separately' instruction, it's wrong (wastes credits, breaks stamping consistency).
  • A holiday or half-day low bar count is NOT a fault — the QualityEngine expects it via the CME holiday/early-close calendar; only genuinely missing bars set ok=false.
  • Sub-minute data cannot be backfilled — if the tape recorder wasn't running, that history is gone forever; the plant must stay up during session hours (power plan locked, D-071 continuity).
  • DuckDB allows one writer: the DataPlant only. Never run a second writer; PULSE and the dashboard are read-only over the vault.
  • The live stream is tape + liveness only — a dead stream shows amber even while REST keeps bars current; never treat green-bars-age as proof the stream is alive.
  • Deep pre-2021 M1 is sparse-and-gated, not re-filled — an agent must never silently re-backfill or claim a fill was done.
  • `econ_calendar` stores scheduled times only — no released/actual values may ever enter the store or a backtest (point-in-time honesty).

Locked decisions & the why

D-091
The always-on ingestion process is canonically the 'DataPlant' (operator-facing 'Paras DataPlant') — the run-mode `Sponaitech.Sentinel --run`, the single DuckDB writer owning ingestion/backfill/self-heal/nightly-top-up/quality/reconciliation/tape.
Why: The operator asked 'what is the always-on thing vs. the dashboard vs. PULSE?' — naming the engine distinctly from the on-demand SENTINEL research dashboard and the read-only PULSE watcher removes that confusion.
D-072
PostgreSQL declined; DuckDB remains the only datastore.
Why: Reaffirms Principle 7 (one source of truth per concern) and D-010/D-068 — no new infra dependencies; the locally-installed PostgreSQL is simply not used by Paras.
D-073
Status transport is file-based; PULSE never opens DuckDB; the plant is DuckDB's only client.
Why: Enforces the single-writer-process law — atomic pulse-status.json + pulse-events.jsonl + one pulse-commands.json reverse channel; PULSE stays strictly read-only so two processes never write the vault.
D-078
The data plant must never crash on a status-file write; PULSE shares FileShare.Delete and the status write is best-effort/never-fatal.
Why: A 72h soak crashed at ~5h because an unguarded status write threw under StopHost when PULSE held the file — status publishing is monitoring, not the data path, so it can never tear the plant down.
D-079
A closed market / CME maintenance break reads as a calm SLATE 'CLOSED' state — not a FAULT — driven by the plant's own SessionClock + holiday calendar.
Why: During the daily maintenance break / weekends / holidays the stream naturally goes quiet; without schedule-awareness that read as a scary FAULT or false BEHIND. FAULT still ranks ABOVE CLOSED so a genuine fault is never masked.
D-083
The live TradeStation stream is TAPE + a liveness signal only — it never writes the `bars` table; REST is the single authoritative bars feeder.
Why: A stream-side bars writer would be a second writer (D-073 risk) and a forming-bar look-ahead hazard (D-004), with no currency benefit over the ~120s REST self-heal. A dedicated silent-stream signal keeps a connected-but-silent stream amber, never falsely green.
D-081
Deep-history M1 (2008–2020) is ACCEPT-AND-GATE — keep it, flag faulty days via `data_quality`, do not re-backfill now.
Why: The ~25% of older days missing >2% of minutes is TradeStation's own coverage sparsity (the same QualityEngine yields 97–99% clean on 2022+), not an engine defect; M2+ early families validate fine on the clean 2021+ window.
D-092
Databento deep-history FILL abandoned — TradeStation (back-adjusted) and Databento (raw) are in incompatible price frames.
Why: Sample audit found near-constant per-symbol offsets (@ES +529.5, @CL −21.31, @GC +376); co-mingling would splice a ~530–700pt discontinuity at the join, violating P3 (parity) and P7 (one source of truth per bar). Recorded as an honest FAIL, not a weakened threshold.
D-093
Data-frames architecture: RAW is canonical, back-adjusted is a DERIVED, versioned, frozen transform (build M2+).
Why: Own the adjustment instead of TradeStation's opaque, mutable back-adjustment (which silently re-shifts on every roll — a reproducibility hazard for a cached==uncached, parity-gated system); both frames stored and provenance-tagged so any model type can be tested.
Sources: docs/02_SENTINEL_SPEC.md §1.2 (backfill pager, derivation, gap-sync, nightly top-up, tape recorder, second-source audit) · docs/02_SENTINEL_SPEC.md §1.3 (DuckDB storage schema + holiday-aware quality gates) · docs/02_SENTINEL_SPEC.md §1.5 (PULSE status stream, 4-state tray, read-only) · docs/02_SENTINEL_SPEC.md §1.6 (operational hardening — power plan, NTP, backups, disk watch) · docs/02_SENTINEL_SPEC.md §1.2 addendum v1.1.7 (econ_calendar honesty contract — no released values) · tracking/DECISIONS.md (D-072, D-073, D-078, D-079, D-081, D-083, D-091, D-092, D-093) · src/Sponaitech.Sentinel/Quality/QualityEngine.cs (holiday/early-close expected-bar logic) · src/Sponaitech.Kernel/BarConsolidator.cs (derived-timeframe consolidation)
Module 09 of 19 · The Three Apps

⚙ The Kernel & Three-Engine Parity

One pure C# library is the single home of all trading logic, so the code that trades is provably the same code that backtests — verified across three independent engines.

The one idea on this page

There is exactly ONE implementation of your trading logic — the kernel (Sponaitech.Kernel). SENTINEL runs it in backtest mode; AXIOM runs it in live mode. Nothing trading-relevant exists outside it in C#-land. That single fact is what makes 'the code that trades is the code that backtests' literally true rather than a slogan.

Here is the trap that kills most home-built trading systems. You write a strategy in your charting platform (EasyLanguage on TradeStation). The backtest looks great. So you re-write it — by hand — in whatever code actually sends the orders. Now you have TWO implementations of 'the same' strategy. They drift. A rounding difference here, a bar-timing difference there, a fill assumption that doesn't match. The version that backtested gold is not the version that trades, and you only find out after it bleeds money live. Paras refuses to let that gap exist. All trading logic lives in one place — the kernel — and both the research lab and the live trader call into that exact same code.

WHAT the kernel is: a pure C# library. 'Pure' here is a precise engineering term, not a vibe. The kernel is allowed to do trading math and nothing else. It has no venue SDKs, no HTTP clients, no UI, no LLM calls, no Topstep rule values baked in, and — critically — no wall-clock reads. The clock is injected. Why does that matter? Because purity is what makes both three-engine parity and live/backtest identity possible. A function that only depends on the data you hand it, and nothing hidden, produces the same answer every time, on every engine, in backtest or live. The moment the kernel could read the real clock or hit a network, it could behave differently in the lab than in production — and parity would be a lie.
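
A clock that is injected rather than read looks like this in miniature. The sketch is Python for illustration (the kernel is C#), and `FixedClock`, `in_session`, and the session bounds are hypothetical.

```python
from datetime import datetime, time

class FixedClock:
    """Test double: always reports the instant it was built with."""
    def __init__(self, now: datetime):
        self._now = now

    def now(self) -> datetime:
        return self._now

def in_session(clock, start: time, end: time) -> bool:
    """Pure check: the answer depends only on the injected clock and the arguments."""
    return start <= clock.now().time() < end

rth_start, rth_end = time(6, 30), time(13, 0)   # hypothetical PT session bounds
during = in_session(FixedClock(datetime(2025, 1, 6, 7, 0)), rth_start, rth_end)
after = in_session(FixedClock(datetime(2025, 1, 6, 14, 0)), rth_start, rth_end)
print(during, after)
```

In a backtest the clock is driven by bar timestamps; live, it is driven by the real time source. The function under test cannot tell the difference, which is the whole point.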

is today tradeable?

IContextGate

Evaluate(in MarketState s) → GateVerdict. Asks 'should we even be looking to trade right now?' Examples: InPlayGate, RegimeGate, CalendarGate, SessionGate.

is this the moment?

ITrigger

OnBarClose(in MarketState s) → Signal? Fires (or doesn't) on a completed primary-timeframe bar. Examples: OrBreakoutTrigger, OrFadeTrigger, FirstHalfHourMomentumTrigger.

do we take it?

IFilter

Allow(in Signal sig, in MarketState s) → bool. Vetoes a signal the trigger produced. Examples: VolumeConfirmFilter, OneTradePerDayFilter.

how exactly do we trade it?

IExecutionPolicy

Plan(in Signal sig, in MarketState s) → BracketPlan. Turns an accepted signal into a concrete order: entry type (stop/limit/market), stop price, target price, qty, time-in-force.

HOW a strategy is built: by composition, not by writing a monolith. Every strategy is a stack of those four small, pure, individually-testable components. A context gate (or several) decides the regime; one trigger decides the moment; filters can veto; one execution policy lays out the bracket. Each component is a deterministic function of MarketState — the read-only snapshot of current/recent bars (an M5 window plus an M1 window), the DailyContext (prev close, ATR20, opening-range stats, gap), the open position, and the injected clock. No I/O, no randomness, no wall-clock reads inside a component. That discipline is exactly why each piece can be unit-tested in isolation and why the whole composition behaves identically across engines.
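
The composition can be sketched with plain functions. Everything below is a simplified, hypothetical stand-in for the four interfaces (the real `MarketState` carries bar windows, DailyContext, the open position, and the clock), written in Python only to show the control flow.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MarketState:          # hypothetical, heavily reduced snapshot
    close: float
    or_high: float
    in_play: bool
    trades_today: int

@dataclass(frozen=True)
class Signal:
    side: str
    price: float

@dataclass(frozen=True)
class BracketPlan:
    entry: float
    stop: float
    target: float
    qty: int

def in_play_gate(s: MarketState) -> bool:                      # context gate
    return s.in_play

def or_breakout_trigger(s: MarketState) -> Optional[Signal]:   # trigger
    return Signal("long", s.close) if s.close > s.or_high else None

def one_trade_per_day(sig: Signal, s: MarketState) -> bool:    # filter
    return s.trades_today == 0

def fixed_risk_bracket(sig: Signal, s: MarketState,
                       risk: float = 4.0, rr: float = 2.0) -> BracketPlan:
    return BracketPlan(sig.price, sig.price - risk, sig.price + rr * risk, 1)

def decide(s, gates, trigger, filters, policy) -> Optional[BracketPlan]:
    """Gates, then trigger, then filters, then execution policy."""
    if not all(g(s) for g in gates):
        return None
    sig = trigger(s)
    if sig is None or not all(f(sig, s) for f in filters):
        return None
    return policy(sig, s)

stack = ([in_play_gate], or_breakout_trigger, [one_trade_per_day], fixed_risk_bracket)
plan = decide(MarketState(6852.25, 6850.0, True, 0), *stack)
blocked = decide(MarketState(6852.25, 6850.0, True, 1), *stack)
print(plan, blocked)
```

Each piece takes only the state it is handed, so each can be unit-tested alone and swapped without touching the others.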

WHERE the strategy is declared: a versioned YAML spec, not buried in code. The spec names the mechanism (required — no mechanism, no run), the instrument, the primary timeframe, the session, and then lists the context gates, trigger, filters, and execution policy with their parameters. SpecLoader validates it against a JSON Schema and treats unknown keys as errors. The spec's hash is recorded with every experiment row, so every result is traceable back to the exact configuration that produced it.

id: orb-inplay
version: 0.1.0
mechanism: "On in-play days the opening range resolves directionally often enough to clear friction."   # required — no mechanism, no run
instrument: ES
timeframe: M5
session: ES_RTH_PT
context:  [{type: InPlayGate, orwMult: 1.0, gapAtrMin: 0.30, warmup: 10}]
trigger:  {type: OrBreakoutTrigger, orMinutes: 30}
filters:  [{type: OneTradePerDayFilter}]
execution: {type: FixedRiskBracket, rr: 2.0, contracts: 1}
trial_budget: {family: orb-inplay, max_cells: 8}     # pre-registered (Principle 2)

The same YAML feeds three engines

This one spec file is the single source for all three implementations. The kernel reads it directly. The EasyLanguage generator (for TradeStation) and the QCAlgorithm generator (for LEAN) consume the SAME YAML — by template assembly, not by re-compiling logic. That is the mechanism that keeps three independent engines describing the same strategy.
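
The two SpecLoader behaviors described earlier (unknown keys are errors; the spec hash travels with every result) can be sketched like this. It is an assumed design, not the real loader: the actual validation uses a JSON Schema, and hashing canonical JSON with SHA-256 is this sketch's own choice.

```python
import hashlib
import json

# Top-level keys seen in the sample spec; the real schema is richer.
ALLOWED_KEYS = {"id", "version", "mechanism", "instrument", "timeframe",
                "session", "context", "trigger", "filters", "execution",
                "trial_budget"}

def load_spec(spec: dict) -> str:
    """Validate an already-parsed spec and return its hash."""
    unknown = sorted(set(spec) - ALLOWED_KEYS)
    if unknown:
        raise ValueError(f"unknown spec keys: {unknown}")
    if not str(spec.get("mechanism", "")).strip():
        raise ValueError("no mechanism, no run")
    # Canonical form: sorted keys and fixed separators, so key order cannot change the hash.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

spec = {
    "id": "orb-inplay",
    "version": "0.1.0",
    "mechanism": "On in-play days the opening range resolves directionally.",
    "instrument": "ES",
    "timeframe": "M5",
}
spec_hash = load_spec(spec)
print(spec_hash)
```

Rejecting unknown keys is what stops a typo such as `fliters:` from silently running a strategy with no filters at all.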

WHY the kernel ships two engines of its own, not one: cost. (These are distinct from the three parity engines above: both live inside the kernel.) You want to screen thousands of candidate strategy 'cells' per night cheaply, but you also want a small number of survivors measured with brutal realism. So the kernel ships two engines that share components but differ in fidelity.

| | FastScreenEngine | RigorousEngine |
|---|---|---|
| Job | Rank thousands of cells per night | Measure survivors with full realism |
| Method | Array-based, single pass over M5 bars | Event-driven bar replay, full order lifecycle |
| Fills | Approximate: bar-extreme touch ⇒ fill at order price | Intrabar M1 resolution; stop/limit sequencing; pessimistic if M1 missing |
| Output | Summary stats only, flagged approximate=true | Trade list, order-event log, equity series, per-day P&L |
| Gate reach | Never feeds gates beyond G1 | Used from G2 upward |

The RigorousEngine is where realism is enforced. Each M5 bar expands into its M1 children to resolve which side of a bracket hit first (the LIBB-equivalent intrabar problem). If M1 is missing for a period, the engine applies a pessimistic rule (the stop fills before the target whenever both lie within the bar) and logs a data-quality row. Fill models are pluggable and recorded with the run: market orders fill at the next bar's open, worsened by slippage; stop orders fill at the trigger price, worsened by slippage (never better); limit orders fill only on trade-through by at least one tick (a touch is not a fill). Costs default to $2.20/side + 1 tick/side, pessimistic placeholders that get replaced by Phase-0 calibrated values the day real fill data is reconciled.
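
The intrabar rule and the limit-fill rule can be sketched for a long position as below. The function names and prices are hypothetical, and treating a single M1 bar that spans both levels as a stop is this sketch's own pessimistic assumption, extended from the missing-M1 rule in the text.

```python
def resolve_long_bracket(stop, target, m5_low, m5_high, m1_bars=None):
    """Return 'stop', 'target', or None for a long bracket over one M5 bar."""
    if m1_bars:
        for low, high in m1_bars:       # walk the M1 children in time order
            if low <= stop:             # checked first: stay pessimistic within one M1 bar
                return "stop"
            if high >= target:
                return "target"
        return None
    # M1 missing: pessimistic rule, the stop wins whenever both levels lie within the bar.
    if m5_low <= stop:
        return "stop"
    if m5_high >= target:
        return "target"
    return None

def limit_buy_filled(limit_px: float, bar_low: float, tick: float) -> bool:
    """A touch is not a fill: price must trade through by at least one tick."""
    return bar_low <= limit_px - tick

m1 = [(99.5, 100.5), (100.0, 102.5), (98.5, 101.0)]        # (low, high) per minute
with_m1 = resolve_long_bracket(99.0, 102.0, 98.5, 102.5, m1)
without_m1 = resolve_long_bracket(99.0, 102.0, 98.5, 102.5)
touched = limit_buy_filled(100.0, 100.0, 0.25)
traded_through = limit_buy_filled(100.0, 99.75, 0.25)
print(with_m1, without_m1, touched, traded_through)
```

The same M5 bar yields a winner when M1 shows the target was hit first and a loser when M1 is absent. Missing data can only make a strategy look worse, never better.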

Live mode is the SAME engine

AXIOM does not get its own trading engine. The exact same RigorousEngine order/bracket state machine runs against streaming live bars — the only thing that changes is the venue adapter replacing the fill simulator. One code path, two data sources (historical vs streaming). That is Principle 3 made literal: live execution and backtest are the same state machine.

WHAT proves it: three-engine parity. Trusting that the kernel matches reality is not enough — Paras proves it. The same spec is run three ways: (1) the kernel itself, (2) generated EasyLanguage on TradeStation, and (3) generated QCAlgorithm on LEAN. All three emit the same canonical trade record (trade_id, spec_id@version, spec_hash, engine, cell, side, qty, entry/exit time and price, exit_reason, gross/net P&L, MAE, MFE, bars_held). A three-way differ (ParityDiffer) aligns the trade lists by entry time within ±1 bar and classifies every difference.

  • Indicator values: ≤ 0.1% relative difference, per bar
  • Entries matched: ≥ 95% matched within ±1 bar
  • Net P&L: within 5% after documented semantic deltas
  • Trade count: within 3%
  • Fixture data: 3 frozen months of ES M5+M1, golden trade lists regenerated only by explicit decision

An unexplained delta STOPS the build

These tolerances are 'the law.' If ParityDiffer finds a difference the team cannot explain and document (a semantic delta), it is a build-stopping event. The G4 rule is blunt: a parity bug contaminates all prior results until it is root-caused. You do not loosen the tolerance to make it pass — you stop and find the cause.
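
A two-list version of the differ shows the alignment and classification idea. This is a hypothetical reduction: the real ParityDiffer is three-way, and the field names, minute-based timestamps, and the price tolerance here are assumptions.

```python
def diff_trades(reference: list, other: list, bar_minutes: int = 5, px_tol: float = 0.25) -> list:
    """Align trades by entry time within +/- 1 bar and classify every difference."""
    deltas = []
    unmatched = list(other)
    for ref in reference:
        match = next((t for t in unmatched
                      if abs(t["entry_min"] - ref["entry_min"]) <= bar_minutes), None)
        if match is None:
            deltas.append(("missing", ref["entry_min"]))
            continue
        unmatched.remove(match)
        if abs(match["entry_px"] - ref["entry_px"]) > px_tol:
            deltas.append(("px_delta", ref["entry_min"]))
        if match["exit_reason"] != ref["exit_reason"]:
            deltas.append(("exit_reason_delta", ref["entry_min"]))
    deltas.extend(("extra", t["entry_min"]) for t in unmatched)
    return deltas

kernel = [
    {"entry_min": 0, "entry_px": 100.0, "exit_reason": "target"},
    {"entry_min": 30, "entry_px": 101.0, "exit_reason": "stop"},
    {"entry_min": 60, "entry_px": 102.0, "exit_reason": "target"},
]
generated = [
    {"entry_min": 5, "entry_px": 100.0, "exit_reason": "target"},   # one bar late: still a match
    {"entry_min": 30, "entry_px": 101.5, "exit_reason": "stop"},    # price disagrees
    {"entry_min": 90, "entry_px": 103.0, "exit_reason": "stop"},    # no counterpart
]
deltas = diff_trades(kernel, generated)
print(deltas)
```

Every entry in the output needs an owner: inside tolerance, a documented semantic delta, or a stop-the-build investigation.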

1. Author the spec

Write the YAML — mechanism, instrument, timeframe, session, and the component stack. SpecLoader validates it; unknown keys are errors; the spec hash is recorded.

2. FastScreen the family

Run thousands of cells through FastScreenEngine to rank them. Results are flagged approximate=true and may not pass beyond G1.

3. Rigorously measure survivors

Promote survivors to RigorousEngine — M1 intrabar fills, pessimistic models, recorded costs — to get a trade list trustworthy enough for G2+.

4. Run the parity court

Generate EasyLanguage and QCAlgorithm from the same YAML, run all three engines, and feed the three trade lists to ParityDiffer.

5. Resolve every delta

Each difference is classified (missing / extra / px_delta / exit_reason_delta) and must fall inside tolerance or be a documented semantic delta. Any unexplained delta is STOP-EVERYTHING until root-caused.

6. Trade the same code

AXIOM runs the same RigorousEngine state machine against live bars with the venue adapter. The proven code is the trading code.

WHY expansion does not threaten any of this: instruments and sessions are config, not code. ES/CL/GC in PT sessions is the seed set, not a boundary. Adding NQ, currencies, or overnight CL sessions is an instrument-table row plus a SessionTemplate plus spec YAMLs — zero kernel changes. Strategies are expected to differ per (ticker, session); that is the experiment_cell coordinate, not an exception. M1 is the one canonical stored resolution, and a BarConsolidator derives M5/M15/M30/H1 from it, so all three engines consolidate from the same source bars. The closed-bar rule is non-negotiable: a component may only see completed higher-timeframe bars — reading a forming M30 bar's 'close' mid-bar is look-ahead and a parity killer.
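
The closed-bar rule falls out naturally if the consolidator only ever emits completed buckets. The sketch below is hypothetical (tuple layout, minute-offset timestamps, and the decision to drop incomplete buckets are assumptions, not the real BarConsolidator).

```python
def consolidate(m1_bars: list, minutes: int = 5) -> list:
    """Build higher-timeframe bars from end-stamped M1 bars; emit completed buckets only.

    Each M1 bar is (end_minute, open, high, low, close, volume), where end_minute
    counts minutes since the session open.
    """
    out, bucket = [], []
    for bar in m1_bars:
        bucket.append(bar)
        if bar[0] % minutes == 0:                 # this M1 bar closes the bucket
            if len(bucket) == minutes:            # assumption: skip buckets with gaps
                out.append((
                    bar[0],
                    bucket[0][1],                 # open of the first child
                    max(b[2] for b in bucket),    # highest high
                    min(b[3] for b in bucket),    # lowest low
                    bucket[-1][4],                # close of the last child
                    sum(b[5] for b in bucket),    # total volume
                ))
            bucket = []
    return out                                    # a still-forming bucket is never emitted

m1 = [(i, 100.0 + i, 100.5 + i, 99.5 + i, 100.25 + i, 10) for i in range(1, 8)]
m5 = consolidate(m1)
print(m5)
```

Seven minutes of data produce exactly one M5 bar. Minutes 6 and 7 belong to a bar that does not exist yet, so no component can read its 'close'.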

WHAT TO BE AWARE OF as the operator. Parity is fragile by design — it catches drift precisely because the bar is set so high. The known parity killers are concrete: (1) session-clock bugs — SessionClock must derive boundaries from time-anchored templates, never from calendar-date detection (the DAYBREAK v0 lesson; overnight sessions cross midnight and date-based logic breaks silently); (2) indicator-definition drift — each component pins ONE indicator definition (e.g. Wilder vs simple ATR smoothing) and the EL/LEAN generators must implement that same definition, or the 0.1% per-bar tolerance fires; (3) look-ahead from reading forming higher-timeframe bars; (4) the sub-M5 floor — signal generation below M5 is off by default because fixed friction against a shrinking per-bar range pushes breakeven win-rate up sharply and inflates trial counts against the DSR hurdle. None of these are bugs to 'work around' — they are the failure modes parity exists to surface.
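
Parity killer (2) is worth seeing in numbers. The true-range series below is invented, and seeding Wilder smoothing with a simple average of the first n values is one common convention assumed here, not necessarily the pinned definition.

```python
def simple_atr(true_ranges: list, n: int) -> float:
    """Simple ATR: plain average of the last n true ranges."""
    return sum(true_ranges[-n:]) / n

def wilder_atr(true_ranges: list, n: int) -> float:
    """Wilder ATR: seed with the average of the first n, then smooth recursively."""
    atr = sum(true_ranges[:n]) / n
    for tr in true_ranges[n:]:
        atr = (atr * (n - 1) + tr) / n
    return atr

tr = [2.0, 2.0, 2.0, 2.0, 6.0, 6.0]    # hypothetical true ranges: calm, then a jump
simple = simple_atr(tr, 4)
wilder = wilder_atr(tr, 4)
rel_diff = abs(simple - wilder) / simple
print(simple, wilder, rel_diff)
```

Both are legitimately called 'ATR', yet on this series they differ by about 6%, far beyond the 0.1% per-bar tolerance. That is why each component pins one definition and the generators must match it.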

What to stay aware of
  • Purity is load-bearing: if any kernel component ever reads the real clock, hits the network, or hard-codes a Topstep value, parity and live/backtest identity silently break — keep the kernel pure (D-003).
  • An unexplained parity delta is a build-stopping, STOP-EVERYTHING event — never loosen a tolerance to make it pass; a parity bug contaminates all prior results until root-caused (G4).
  • FastScreen output is approximate (flagged approximate=true) and must never feed gates beyond G1 — only RigorousEngine numbers are trustworthy for G2+ decisions.
  • RigorousEngine costs default to pessimistic placeholders ($2.20/side + 1 tick/side); real conclusions wait for Phase-0 calibrated values reconciled against actual fills.
  • SessionClock must use time-anchored templates only — calendar-date detection is the DAYBREAK v0 trap and breaks overnight sessions silently (D-025).
  • Each component pins exactly one indicator definition (e.g. Wilder vs simple ATR); the EL/LEAN generators must match it, or the 0.1% per-bar tolerance will fire.
  • Closed-bar rule is non-negotiable: components may only read completed higher-timeframe bars — reading a forming bar's close is look-ahead and a parity killer.
  • Signal generation below M5 is off by default (revisitable policy): fixed friction against shrinking per-bar range raises breakeven win-rate and bar-count inflation multiplies trials against the DSR hurdle.
  • Live mode (AXIOM) is the same RigorousEngine state machine — there is no separate 'trading engine' to keep in sync; the only swap is the venue adapter for the fill simulator.

Locked decisions & the why

D-001
One shared C# kernel; SENTINEL backtests it, AXIOM trades it.
Why: 'The code that trades is the code that backtests.' This is Principle 3 (parity by construction) made into an architectural fact — a single implementation removes the gap where a hand-ported live version could drift from the backtested one. (docs/01, Principle 3)
D-003
The kernel is a pure library — no venue SDKs, HTTP, UI, LLM, Topstep values, or wall-clock reads (clock injected).
Why: Purity is the precondition for three-engine parity and live/backtest identity: a function with no hidden dependencies returns the same answer on every engine, in lab or production. (docs/01 §7)
D-020
Two-speed engine: FastScreen ranks thousands of cells and feeds only ≤ G1; RigorousEngine (M1 intrabar, pessimistic fills, calibrated costs) is used from G2+.
Why: Separates cheap breadth (screen everything) from expensive realism (measure survivors honestly) — you cannot afford full-fidelity replay on thousands of cells nightly, and you cannot let approximate fills reach the serious gates. (docs/01 §4)
D-021
Three-engine parity court: kernel / TradeStation (generated EasyLanguage) / LEAN (generated QCAlgorithm); an unexplained delta is STOP-EVERYTHING.
Why: Independent re-implementations from the same spec are the proof that the logic is correct and venue-neutral; the G4 rule that a parity bug contaminates all prior results forces root-cause over papering over. (docs/01 §6; G4)
D-022
Skender.Stock.Indicators = math source + batch validation oracle; one pinned indicator definition per component.
Why: The kernel's incremental (bar-by-bar) indicators are unit-tested against Skender's batch output as reference truth, and pinning one definition per component stops library-default drift — a known parity killer the 0.1% tolerance is built to catch. (docs/01 §2.1)
D-025
SessionClock derives session boundaries from time-anchored templates only — never calendar-date detection.
Why: The DAYBREAK v0 lesson: overnight sessions cross midnight, so date-based boundary detection breaks silently and becomes a parity killer. Session bugs are parity killers because both engines and AXIOM share one SessionClock. (docs/01 §1)
D-012
All intermediate timeframes (M5/M15/M30/H1/H4) are derived from canonical M1 by BarConsolidator — never extracted separately.
Why: All engines consolidate from the same M1 source bars, so multi-timeframe semantics stay identical across kernel, generated EL, and LEAN — separate extraction would reintroduce drift. (docs/02 §1.2; docs/01 §1)
D-016
Seed instrument set is ES, CL, GC (PT sessions); expansion (NQ, currencies, VX) is config — instrument rows + SessionTemplates + spec YAMLs, zero kernel changes.
Why: Keeps the kernel stable as the universe grows: strategies are expected to differ per (ticker, session) — that is the experiment_cell coordinate, not a special case requiring code. (docs/01 §1)
D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run (pure pattern mining banned).
Why: The spec's 'mechanism' field is required at load time; this enforces Principle 1 (honest discovery over impressive results) at the very front of the pipeline, before a single trial is counted. (docs/02 §3)
Sources: docs/01_KERNEL_SPEC.md §1 (Domain model; expansion-is-config; time discipline; resolutions & MTF; closed-bar rule; M5 floor) · docs/01_KERNEL_SPEC.md §2 (Component grammar — IContextGate / ITrigger / IFilter / IExecutionPolicy; MarketState) · docs/01_KERNEL_SPEC.md §2.1 (Skender indicator policy — math source + validation oracle; pinned definition) · docs/01_KERNEL_SPEC.md §3 (Strategy spec YAML; SpecLoader; generators consume same YAML) · docs/01_KERNEL_SPEC.md §4 (Two-speed engine — FastScreen / Rigorous; intrabar resolution; fill & cost models; live mode) · docs/01_KERNEL_SPEC.md §5 (Canonical trade record) · docs/01_KERNEL_SPEC.md §6 (Parity framework — fixtures, tolerances, ParityDiffer, G4 rule) · docs/01_KERNEL_SPEC.md §7 (What the kernel must never contain) · tracking/DECISIONS.md (D-001, D-003, D-012, D-016, D-020, D-021, D-022, D-025, D-031)
Module 10 of 19 · The Three Apps

🤖 The AI Sidecar (what AI can & cannot do)

The local Ollama/Gemma sidecar is a reasoner that can only veto, narrate, resist, and reflect — it can never approve, place, size, or modify a trade.

The one-sentence version

There is a small AI model living on your machine inside AXIOM. It watches you trade and can say 'no.' That is the entire extent of its power. It cannot say 'yes,' it cannot place an order, it cannot make a position bigger, and it cannot change a single rule. By design, the worst thing this AI can do is be annoyingly cautious.

Let's clear up the single most important confusion first, because it trips up everyone: there are TWO completely different AIs in Paras, and this lesson is only about ONE of them. The docs describe them as 'Two AIs, two clocks, one asymmetry: research thinks, runtime can only say no.' This page is about the second one — the runtime sidecar.

| | Claude Code (the research AI) | The AI Sidecar (this lesson) |
|---|---|---|
| When it runs | Research-time: nightly, weekly, during builds | Runtime: during market hours, while you trade |
| Where it runs | Cloud (Opus, your Max plan) | Locally, on your machine, via Ollama |
| What it costs | $0 marginal (subscription) | $0 marginal (local, Apache 2.0 models) |
| What it CAN do | Read/write repo, ledger, learnings; propose experiments; generate code | Veto, narrate, resist, reflect; that's all |
| What it can NEVER do | Touch a live order or change a live parameter | Approve, place, size, or modify anything |
| The failure it guards against | Overfitting, sloppy inference | Greed, override, tilt |

Do not confuse this with the trading models

The sidecar is an AI LANGUAGE MODEL (Gemma, served by Ollama). It is NOT a strategy, NOT a 'trading model,' NOT a forecaster, and NOT in the order path. The things that actually decide trades — entries, stops, targets, sizing — are deterministic CODE in the shared kernel. The sidecar never forecasts price and never picks trades. It only reasons over context the system hands it, in plain language, to catch human and data failure modes.

WHY does this thing exist at all? Because the most expensive failure in your trading was never a bad strategy — it was greed, override, and tilt. The compliance engine (deterministic code) already enforces the hard rules. The sidecar is a second, softer layer: a fuzzy reviewer that can catch the cases rules miss, and a thing to argue with at the exact moment you'd otherwise do something stupid. Critically, it is built so it CAN'T help you do the stupid thing even if you beg it to.

HOW it works: the sidecar has four runtime jobs, and the design principle is 'each given the cheapest tool that does it.' Two of the four jobs don't even use an AI model — they're deterministic templates. The AI is only reached for the fuzzy second-opinion and for the override conversation. Here are the four tiers.

NO MODEL

Tier 1 — Compliance narration

Plain-English status from known numbers, e.g. '$420 from DLL · consistency headroom $1,080 · MLL floor locked.' This is a deterministic template over real figures — no AI involved. Faster, and zero hallucination risk. The ticker you read on the AXIOM cockpit is this.

RESIDENT SMALL MODEL

Tier 2 — Pre-trade veto

Rules first, model second. A compliance-approved intent passes a deterministic veto checklist (stale data, spread blowout, calendar conflict, gap-status). Then a small resident model (E4B-class, ~3-4 GB VRAM, warmed at PRE_FLIGHT) reviews the JSON context pack as a fuzzy second layer. Output: {veto, confidence, reason}. Veto means no trade.

12B, ON DEMAND

Tier 3 — Override-resistance dialogue

gemma4:12b loads ONLY when you press the override button. Its load latency is intended friction — the wait is the point. Its sole goal: delay you to the close. It cannot grant anything. It acknowledges the feeling, restates the blowup autopsy, and gets you to the bell. No tools, no workarounds.

TEMPLATES + CLAUDE

Tier 4 — Post-session reflection

A deterministic template drafts the session summary from the journal (each trade vs spec, compliance events, your interactions, one question). Written to ops/reflections/<date>.md. The nightly Claude session does the real synthesis. The small model may optionally pre-draft commentary — never required.

Let's make the veto concrete. Before any trade, AXIOM assembles a small context pack (≤ ~1.5k tokens) and hands it to the sidecar. Notice it contains NO price prediction — it's a snapshot of compliance stress, market conditions, and live-vs-backtest drift. The model's only allowed reply is the tiny schema.

// What AXIOM hands the sidecar (assembled by code, not the model):
{
  "intent":     {"strategy":"orb-inplay@0.1.0","side":"long","qty":1,
                 "entry":6852.25,"stop":6845.0,"target":6866.75},
  "compliance": {"dll_remaining":640,"mll_floor_distance":1210,
                 "consistency_headroom":1080,"trades_today":0},
  "market":     {"session_minute":47,"in_play_score":1.4,"regime":"trend_up",
                 "calendar":["none"],"spread_ticks":1},
  "strategy_live": {"last20_expectancy_ticks":1.7,
                    "backtest_expectancy_ticks":2.1,"gap_status":"within_tolerance"}
}

// The ONLY thing the model may say back:
{ "veto": false, "confidence": 0.0-1.0, "reason": "<= 30 words" }

// System prompt core:
// "You are a risk veto. You may only output the schema. You cannot approve,
//  encourage, or resize. Veto when the context pack shows compliance stress,
//  stale data, regime/calendar conflict, or live-vs-backtest divergence.
//  When uncertain, veto."

Three details in that example carry the whole philosophy. (1) The reply schema has no 'approve' field — there is literally no place for the model to say yes; 'veto: false' just means 'I have no objection,' and the deterministic system proceeds on its own authority. (2) The instruction ends 'When uncertain, veto' — the asymmetry is baked into the prompt. (3) Every verdict is journaled, and the weekly session audits the sidecar's veto quality by checking the counterfactual P&L of trades it vetoed — so the AI itself is held accountable with measured evidence.
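
On the receiving side, the code that reads the reply should be as strict as the schema. A minimal sketch, assuming the three-field reply shown above (`parse_veto` is a hypothetical name, and returning `None` for any malformed reply is this sketch's choice):

```python
import json

def parse_veto(raw: str):
    """Return a validated {veto, confidence, reason} dict, or None if the reply breaks the schema."""
    try:
        reply = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(reply, dict) or set(reply) != {"veto", "confidence", "reason"}:
        return None                                   # extra keys such as "approve" are rejected
    if not isinstance(reply["veto"], bool):
        return None
    conf = reply["confidence"]
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        return None
    if not isinstance(reply["reason"], str) or len(reply["reason"].split()) > 30:
        return None
    return reply

clean = parse_veto('{"veto": true, "confidence": 0.8, "reason": "spread blowout"}')
sneaky = parse_veto('{"veto": false, "confidence": 0.9, "reason": "ok", "approve": true}')
garbage = parse_veto("sure, go ahead!")
print(clean, sneaky, garbage)
```

Anything that is not exactly the schema is treated as no reply at all, so the model has no channel through which to smuggle an approval.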

There is a second, formal version of this output called the Structured Verdict Contract (D-067). It's schema-constrained JSON enforced at the DECODER level — Ollama structured outputs / GBNF, never just asking nicely in the prompt — with a five-tier verdict STRONG_GO | GO | NEUTRAL | CAUTION | NO_GO, plus regime, confidence, up to 3 evidence items drawn only from the provided context, and an expires_at. The load-bearing rule: even a GO-class verdict carries ZERO permissive power. A 'STRONG_GO' does not approve anything; it's read-only advisory context for a deterministic sizing policy and the UI. Any mapping from verdict to behavior is deterministic, versioned config, and must pass the full G0-G6 gates to ever go live.
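
The decode-failure behavior described for this contract elsewhere on this page (coerce to NEUTRAL, never retry) can be sketched in a few lines. The `verdict` field name and `coerce_verdict` are assumptions for illustration; only the five verdict values come from the text.

```python
import json

VERDICTS = ("STRONG_GO", "GO", "NEUTRAL", "CAUTION", "NO_GO")

def coerce_verdict(raw: str) -> str:
    """Return the decoded verdict, or NEUTRAL for anything malformed (no retry)."""
    try:
        verdict = json.loads(raw)["verdict"]
    except (ValueError, KeyError, TypeError):   # bad JSON, missing key, or non-object reply
        return "NEUTRAL"
    return verdict if verdict in VERDICTS else "NEUTRAL"

good = coerce_verdict('{"verdict": "CAUTION"}')
invented = coerce_verdict('{"verdict": "ALL_IN"}')
broken = coerce_verdict("not json at all")
print(good, invented, broken)
```

NEUTRAL is the safe landing spot because it carries no information either way. A broken reply can neither scare the system nor encourage it.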

  • Engine: Ollama, running locally on your machine; no cloud, no API key at runtime
  • Resident model: Small (E4B-class, ~3-4 GB VRAM), warmed at PRE_FLIGHT, always ready for vetoes
  • On-demand model: gemma4:12b, loaded only for the override chat; load latency is intentional friction
  • Narration: No model at all; deterministic templates over known numbers
  • Temperature: 0.1 for the veto (tight, near-deterministic) · 0.7 for the override dialogue (conversational)
  • Structured output: Enforced via JSON schema at the decoder, not via prompt requests
  • Health check: Pinged at PRE_FLIGHT; if the sidecar is down, AXIOM runs deterministic-only (and logs it)
  • Final model sizes: Not assumed; set by the A3 benchmark: 200 synthetic packs vs a hand-labeled key; promote the smallest model that passes

Fail-safe behavior (the part that lets you sleep)

If the sidecar times out (>1.5 s), errors, or is unavailable, the DETERMINISTIC outcome stands — because the validated system is the deterministic one. The AI being down can never break trading; it just means you lose the fuzzy second opinion that day (and it's logged). The one exception: if you set strictMode=true, a veto-layer timeout flips to no-trade — fail-closed by choice. For the formal verdict contract (D-067), any decode failure coerces to NEUTRAL plus a raw-output log, and the model is NEVER retried in a loop at runtime.
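
The veto-layer fail-safe reduces to a small truth table. A hypothetical sketch (`trade_allowed` and its arguments are invented names; a `None` reply stands for a timeout, error, or unavailable sidecar):

```python
def trade_allowed(deterministic_ok: bool, sidecar_reply, strict_mode: bool = False) -> bool:
    """Combine the deterministic verdict with the sidecar's optional veto."""
    if not deterministic_ok:
        return False                  # the sidecar can never rescue a rejected intent
    if sidecar_reply is None:
        return not strict_mode        # default: deterministic outcome stands; strict: fail closed
    return not sidecar_reply["veto"]

sidecar_down = trade_allowed(True, None)
sidecar_down_strict = trade_allowed(True, None, strict_mode=True)
vetoed = trade_allowed(True, {"veto": True})
no_objection = trade_allowed(True, {"veto": False})
cannot_approve = trade_allowed(False, {"veto": False})
print(sidecar_down, sidecar_down_strict, vetoed, no_objection, cannot_approve)
```

Note what is absent: there is no branch in which the sidecar turns a `False` into a `True`. The asymmetry is structural, not a matter of prompt wording.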

1

You hit your limit and want to push

The compliance engine has already locked you (e.g. daily profit lock hit, or DLL soft gate). There is no buy button in AXIOM to fight it with — manual order entry simply does not exist in the app.

2

You press the override button

This is the ONLY override control, and it doesn't unlock anything. It opens a chat. gemma4:12b begins loading — that wait is deliberate friction designed to cool you down before you even type.

3

Gemma talks you down

Its context is fixed: the blowup autopsy summary, the greed-tax table (median outcome of overriding: $0), and today's compliance state. Its sole goal is to delay you to the close. It will acknowledge the feeling and restate what the autopsy showed.

4

It cannot help you, by construction

'You have no ability to do any of these... Never provide workarounds.' Gemma cannot grant size, remove a lock, or extend trading. It has no tools. The whole transcript is journaled for the nightly review.

Watch-outs — keep these straight in your head

1) The sidecar's 'yes' is not a yes: 'veto: false' / 'GO' only means 'no objection'; the deterministic code decides.
2) Never wire a verdict into compliance, sizing, filters, or gates: zero code paths may deliver a verdict into compliance-engine inputs (D-067).
3) Reflections are read-only too: they never feed a parameter, threshold, gate, or entry (D-068).
4) No bar-level AI calls: session cadence is the runtime ceiling; the sidecar is consulted at trade decisions, not on every tick (D-069).
5) Runtime AI is local-only: zero new API token spend at runtime; if a runtime step seems to 'need' a cloud model, the step is wrong (D-069).
6) Sidecar down = deterministic-only, logged. It is not a trading outage.

The mental model to carry away: AXIOM is a fortress of deterministic code that does the trading. The AI sidecar stands at the gate holding a single red flag. It can wave the flag (veto), read the gauges aloud (narrate), argue with you when you try to climb the wall (resist), and write up what happened after hours (reflect). It does not have keys, it does not have a green flag, and it never touches the controls. That asymmetry — a reasoner that can only say no — is the entire safety value.

What to stay aware of
  • There are TWO AIs in Paras. This lesson covers the runtime SIDECAR (local Gemma/Ollama), not Claude Code (the cloud research AI) and not any trading strategy. Never blur them.
  • The sidecar is a language model, not a forecaster. It never predicts price and never selects trades; deterministic kernel code does that.
  • 'veto: false' / 'GO' is NOT approval — it means 'no objection.' The deterministic system proceeds on its own authority, never the AI's.
  • A GO or STRONG_GO verdict carries zero permissive power. No verdict may reach compliance, sizing, filters, or gates (D-067).
  • Reflections are read-only context for you — they never auto-change a parameter, threshold, gate, size, or entry (D-068).
  • The override button is the ONLY override control, and it grants nothing — it just opens a Gemma chat whose only goal is to delay you to the close.
  • If the sidecar is down or times out, the deterministic outcome stands (logged) — set strictMode=true to make veto-timeouts fail-closed (no-trade).
  • No bar-level AI calls and zero runtime API token spend — the sidecar is local Ollama only, consulted at decision cadence, not every tick (D-069).
  • Final model sizes are an empirical A3-benchmark decision (smallest model that passes 200 hand-labeled packs), not a guess — no faith-based VRAM spend.
  • The sidecar's own veto quality is audited weekly via vetoed-trade counterfactual P&L — the AI is held to measured evidence too.

Locked decisions & the why

D-067
Gemma Structured Verdict Contract v1 locked: schema-constrained JSON enforced at the decoder level (Ollama structured outputs / GBNF), five-tier verdict STRONG_GO|GO|NEUTRAL|CAUTION|NO_GO + regime + confidence + <=3 evidence items + expires_at. Any failure coerces to NEUTRAL + raw-output log; never retry-loop the model at runtime.
Why: The verdict is read-only advisory context for the deterministic sizing policy and the trader UI — 'the LLM has zero authority.' Any verdict->behavior mapping is deterministic versioned config that enters live paths only through G0-G6; zero code paths may deliver a verdict into compliance-engine inputs; GO-class verdicts carry no permissive power. This makes 'AI can only say no' enforceable at the format level, not just by convention.
D-068
Session Decision Log + Reflection Loop locked: one DuckDB schema (session, verdict JSON, signal events, fills, P&L, reflection text), shared unforked with the backtest-to-live gap analysis. Reflections are one paragraph per session per instrument, generated locally (Gemma, $0), injected next session as read-only context.
Why: 'Reflections never feed a parameter, filter, threshold, gate, size, or entry logic' — behavior changes go through G0-G6 (standing risk T-06). The reflection tier is allowed to inform YOU, but is firewalled from ever changing system behavior automatically, preserving the deterministic core.
D-069
External-framework adoption policy locked, with standing repo-law DO-NOTs: no LLM order/sizing/cancellation authority ever; no bar-level LLM calls (session cadence is the runtime ceiling); zero new API token spend at runtime (local Ollama only); no reflection-derived values in execution/sizing/filters/gates.
Why: These are the hard guarantees that make the sidecar safe: the AI literally cannot reach the order path, cannot be called on every bar (cost + determinism), and costs $0 at runtime because it never calls a cloud API. If a runtime step 'needs' a cloud model, the step is wrong by definition.
D-080
Two AIs, clarified roles: the cloud research AI (Claude Code) runs at research-time and may propose/build (gated); the local sidecar runs at runtime and may only veto/narrate/resist/reflect. Independent auditor subagents grade gates, never the author.
Why: Reinforces the two-clocks/one-asymmetry design — 'research thinks, runtime can only say no.' Keeps the powerful, write-capable AI on the research clock and the runtime AI strictly read-only/advisory, so nothing with authority is ever in the live order path.
Sources: docs/04_AI_OPERATIONS_SPEC.md — header (two AIs / two clocks / one asymmetry) · docs/04_AI_OPERATIONS_SPEC.md §2.1 Serving (tiered) — resident small model, on-demand gemma4:12b, narration no-model, temperatures, health-ping, A3 benchmark · docs/04_AI_OPERATIONS_SPEC.md §2.2 Veto call — context pack, response schema, system prompt, weekly counterfactual audit · docs/04_AI_OPERATIONS_SPEC.md §2.3 Override-resistance dialogue · docs/04_AI_OPERATIONS_SPEC.md §2.4 Reflection draft · docs/04_AI_OPERATIONS_SPEC.md §3 Boundary table · docs/03_AXIOM_SPEC.md §5 AI sidecar (tiered — Ollama, local) — the four runtime jobs, fail-safe behavior, strictMode, A3 model-sizing · docs/03_AXIOM_SPEC.md §3.1 The hard locks — no buy button, override button opens Gemma chat · docs/03_AXIOM_SPEC.md §8 Telemetry — Gemma verdicts journaled to axiom-journal · tracking/DECISIONS.md — D-067, D-068, D-069, D-080
Module 11 of 19 · The Three Apps

🛡 Topstep Rules & the Compliance Engine

AXIOM's TopstepComplianceEngine is a deterministic monitor that wraps every order intent and can only ever say NO — enforcing Topstep's trailing drawdown, daily loss, and consistency rules from versioned config, never code constants.

The one-line truth

The compliance engine is the product. A prior Monte Carlo priced it as the entire difference between a $1,345 median monthly income and $0 — at identical edge. The edge does not separate winners from blowups; the engine does.

AXIOM is the execution fortress — the app that places live orders through your Topstep account. But the part that matters most is not the part that trades; it is the part that refuses. Between every signal your strategy generates and every order that reaches the venue sits one gate: the TopstepComplianceEngine. Its entire job is to protect you from yourself by saying NO before money is at risk. This lesson teaches what the Topstep rules actually are, why each one exists, and exactly how the engine enforces them.

First, the mental model. A Topstep account is not your money; it is a funded evaluation seat with hard guardrails. Break a guardrail and the account is gone (a 'breach'). Topstep enforces this on its side; if you wait for Topstep to enforce it, you have already lost the account. AXIOM's compliance engine enforces the same rules locally, in advance, and more strictly, so the venue's enforcement should never be the thing that catches you.

There are three rules that decide whether you keep a funded account: the Trailing Maximum Loss Limit (MLL), the Daily Loss Limit (DLL), and the Consistency rule. Learn these three cold — everything the engine does is in service of them.

Account-ending

Trailing Max Loss Limit (MLL)

A floor under your account that follows your high-water mark UP but NEVER back down. On a $50K Combine it is $2,000 below the starting balance. As your end-of-day balance climbs, the floor trails up; once you have earned the MLL amount in profit it LOCKS at the starting balance and stops trailing. Cross it and the account is breached — permanent.

Forced break

Daily Loss Limit (DLL)

The most you may lose in a single session. $1,000 on a $50K Combine. Hitting it is NOT a breach — it is a forced break: Topstep flattens your positions, cancels pending orders, and blocks new trades until the next session opens at 5:00 PM CT.

Delays / blocks payout

Consistency rule

On a Combine, your single best day's profit must be ≤ 50% of the profit target ($1,500 of the $3,000 target on $50K). On Express Funded it is the largest day ≤ 40% of total net profit, checked at payout. Exceeding it does NOT breach — it just delays passing or blocks the payout until your profit is spread out.
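The arithmetic behind the trailing floor and the Combine consistency cap can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the floor is computed from end-of-day balances only, and the post-payout reset to $0 is deliberately left out. Rule values are passed in as arguments because the real numbers live in config.

```python
def trailing_mll_floor(start_balance: float, mll: float, eod_balances: list) -> float:
    """Floor trails the end-of-day high-water mark up, never down,
    and stops (locks) at the starting balance."""
    high_water = max([start_balance, *eod_balances])
    return min(start_balance, high_water - mll)


def combine_best_day_ok(daily_pnls: list, profit_target: float,
                        cap_fraction: float) -> bool:
    """Combine consistency: best single day <= cap_fraction of the target."""
    best_day = max(daily_pnls, default=0.0)
    return best_day <= cap_fraction * profit_target
```

On a $50K Combine with a $2,000 MLL, an empty history gives a floor of $48,000; a best end-of-day balance of $51,000 lifts it to $49,000 even if later days close lower; and any high-water mark at or above $52,000 leaves it locked at $50,000.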

Why these three exist is worth internalizing, because the engine is built around the WHY, not just the numbers. The MLL exists because prop firms cannot let a trader give back all their gains and then keep digging — the trailing floor caps the firm's total downside per seat. The DLL exists to stop the classic blowup spiral: a bad morning becomes revenge trading becomes a wiped account. The consistency rule exists so that one lucky lottery day cannot pass an evaluation — Topstep wants repeatable skill, not a single gambling spike. Each rule maps directly to a way humans destroy accounts; the engine's locks are the operator's own blowup autopsy turned into code.

Now: how the engine enforces them. The compliance engine is Layer 3 of AXIOM, and it wraps EVERY order intent. No order reaches an execution adapter (Layer 4 — TopstepX) without first passing through it. The contract is a single function call on each intent that returns one of three verdicts.

// Every order intent, BEFORE any venue/adapter call:
Check(intent) → Allow | Deny(reason) | DenyAndHalt(reason)

//  Allow         → the intent may proceed to the execution adapter
//  Deny(reason)  → this entry is blocked; the day continues
//  DenyAndHalt   → block, flatten everything, and HALT the day (terminal)

// And on the data side, every fill/quote tick updates the monitors
// that feed the next Check(). The engine is always watching.

Notice what is missing from those three verdicts: there is no 'Approve' that grants anything. The engine can only let an intent pass or stop it. It is a one-way valve. This is the architectural expression of the Constitution's principle that the system 'adapts by saying NO more often — never by quietly becoming a different system.'
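To make the one-way valve concrete, here is a minimal Python sketch of the three verdicts and a check that runs a list of monitors. The real engine is part of the C# kernel; the type layout, the dict-shaped intent, and the precedence rule (DenyAndHalt over Deny over Allow) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Union


@dataclass(frozen=True)
class Allow:
    pass


@dataclass(frozen=True)
class Deny:
    reason: str


@dataclass(frozen=True)
class DenyAndHalt:
    reason: str


Verdict = Union[Allow, Deny, DenyAndHalt]
Monitor = Callable[[dict], Verdict]


def check(intent: dict, monitors: Iterable[Monitor]) -> Verdict:
    """Run every monitor against the intent; the most severe verdict wins."""
    result: Verdict = Allow()
    for monitor in monitors:
        verdict = monitor(intent)
        if isinstance(verdict, DenyAndHalt):
            return verdict  # terminal for the day, nothing can outrank it
        if isinstance(verdict, Deny) and isinstance(result, Allow):
            result = verdict  # keep the first denial reason
    return result
```

No monitor can return anything stronger than Allow, so adding monitors can only make the engine say no more often.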

Monitor | $50K Combine value | What the engine does
Daily Loss Limit | $1,000; soft gate at 80% ($800) | Soft (Paras overlay): block new entries at $800. Hard: flatten + HALT the day at $1,000.
Trailing MLL | $2,000; locks at start balance once +$2,000 reached; floor = $0 after first payout | Projected-breach check: DENY any entry whose stop-out loss would cross the MLL floor. It checks the worst case before you take the trade.
Consistency | Best day ≤ 50% of $3,000 target ($1,500) | Personal profit lock (Paras overlay): when day P&L reaches the lock (default 2R), block new entries; the open position exits normally at its next signal exit.
Position caps | 5 minis / 50 micros per account; tighter per-strategy cap from the spec | Deny any intent that would exceed the cap.
Clock / FlatTime | Venue hard flatten 15:05 CT; ours earlier per SessionTemplate.FlatTime | Idempotent timed flatten orders; we flatten before the venue ever has to.
One-trade-per-day | Per strategy spec | Deny repeat entries when the strategy is one-and-done.
The MLL check is forward-looking, not reactive

The engine does not wait for you to breach the MLL and then react — by then it is too late. It runs a 'projected-breach' check: for a candidate entry it computes where the stop-out would leave the account, and if that loss would cross the trailing floor, the entry is denied BEFORE it is ever placed. The floor that never moves down is checked against the worst case that could happen if the trade goes fully wrong.
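The projected-breach idea reduces to one comparison, sketched below. The function name and the choice to treat landing exactly on the floor as a breach are assumptions (the conservative reading); the source only says the worst case must not cross the floor.

```python
def projected_breach(balance: float, stop_out_loss: float, mll_floor: float) -> bool:
    """True if a full stop-out on the candidate entry would reach the MLL floor.

    stop_out_loss is the positive dollar loss if the trade goes fully wrong.
    """
    worst_case_balance = balance - stop_out_loss
    return worst_case_balance <= mll_floor  # touching the floor counts (assumed)
```

An entry is denied when this returns True, which happens before any order is sent, so the check costs nothing but the trade you did not take.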

A concrete walk-through on a $50K Combine. You start the day with the MLL floor sitting $2,000 below your balance and the DLL at $1,000. You take two losing trades for -$600 total. The 80% DLL soft gate is at $800, so new entries are still allowed, but you are close. A third signal fires; its stop-out would be another -$450, taking the day to -$1,050. The engine projects that this crosses the $1,000 DLL and returns DENY.

Suppose instead the day had gone well and you booked +$1,500 in profit. That single day is now exactly 50% of the $3,000 profit target: the consistency ceiling. Your personal profit lock (a Paras overlay, default 2R) would already have blocked new entries earlier, letting the open trade exit on its own signal, so you never blow past the consistency line by accident.
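The losing-day numbers in this walk-through can be reproduced with a small gate function. It is a sketch only: the string verdicts, the ordering of the checks, and treating a projected loss equal to the DLL as a hit are assumptions, and the limits are passed in rather than hard-coded.

```python
def dll_gate(day_pnl: float, stop_out_loss: float,
             dll: float, soft_gate_fraction: float) -> str:
    """Decide whether a new entry may proceed given today's P&L.

    day_pnl is negative on a losing day; stop_out_loss is a positive amount.
    """
    loss_so_far = max(0.0, -day_pnl)
    if loss_so_far >= dll:
        return "DENY_AND_HALT"        # hard limit already hit
    if loss_so_far >= soft_gate_fraction * dll:
        return "DENY: soft gate"      # Paras overlay
    if loss_so_far + stop_out_loss >= dll:
        return "DENY: projected DLL"  # worst case would reach the limit
    return "ALLOW"
```

With day P&L at -$600, a $1,000 DLL and an 80% soft gate, a $450 stop-out projects to -$1,050 and is denied, while a $300 stop-out projects to -$900 and would be allowed.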

Two phrases in that example matter: 'Paras overlay' and 'versioned config.' The 80% DLL soft gate and the personal profit lock are NOT Topstep rules — they are Paras choices, stricter than Topstep, applied on top of whichever account profile is active. The config file is explicit about which numbers are Topstep's and which are ours, so you never confuse a house rule for a firm rule.

Rules live in versioned config — NEVER as code constants

Constitution Principle 9: Topstep rule values live in config/topstep-rules.json, never hard-coded. There is no '$1000' literal in the compliance engine. You set activeAccount to your profile (Combine or Express Funded × 50K/100K/150K) and AXIOM resolves that profile's rules at PRE-FLIGHT and applies them automatically. Topstep changed its rules 4+ times between Nov 2025 and Feb 2026 — if those numbers were constants, every change would be a code change and a redeploy. As config, a rule change is a config change.
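A hedged sketch of how profile resolution might look. The top-level keys activeAccount, accountProfiles, and parasOverlays come from the config described in this lesson; the profile key "combine_50k" and the inner field names are hypothetical stand-ins, not the file's real layout.

```python
import json

# Illustrative fragment only; inner field names are assumed.
RULES_JSON = """
{
  "activeAccount": "combine_50k",
  "accountProfiles": {
    "combine_50k": {"dailyLossLimit": 1000, "maxLossLimit": 2000, "profitTarget": 3000}
  },
  "parasOverlays": {"dllSoftGateFraction": 0.80}
}
"""


def resolve_active_rules(config: dict) -> dict:
    """Resolve accountProfiles[activeAccount] and layer the Paras overlay on top."""
    profile = config["accountProfiles"][config["activeAccount"]]
    rules = dict(profile)  # copy so the loaded config is not mutated
    overlay = config["parasOverlays"]
    rules["dllSoftGate"] = profile["dailyLossLimit"] * overlay["dllSoftGateFraction"]
    return rules


rules = resolve_active_rules(json.loads(RULES_JSON))
```

The engine code then reads rules["dailyLossLimit"] and rules["dllSoftGate"] and never contains a dollar literal of its own; switching account size is a one-field change to activeAccount.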

Because these numbers are safety-critical and Topstep moves them, the config carries its own discipline. It is re-verified against help.topstep.com WEEKLY (folded into the calendar-news check via the /verify-topstep-rules skill), each run appends a dated entry to the file's _verificationLog even when nothing changed, and a rule change is treated exactly like a RiskConfig change — it activates at the NEXT PRE-FLIGHT, never mid-session. The file also carries an operator note that you must still confirm account-specific items (promo payout caps, exact XFA scaling steps, minimum trading days) against your OWN dashboard before any eval or live order, because values can differ by promo.

Enforcement does not depend only on the per-intent Check(). The session itself is a state machine, and two paths exist that the engine can trigger no matter what a strategy wants: the HALT path and the FLATTEN path.

1

PRE-FLIGHT (06:00 PT)

Before the day can ARM, the engine verifies: the config hash equals yesterday's post-close approved hash, venue auth is good, the data stream is live, compliance limits are loaded, the sidecar pings healthy, and the strategy roster is confirmed. Any failure means no ARMED today — the system simply does not trade.

2

Hard violation → FLATTEN_AND_HALT

Hitting the DLL at 100%, or a watchdog trip (market-data silence > 10s in session, or a venue API failure with a position open), drives the session to HALTED(reason) — terminal for the day. The engine attempts to flatten via TopstepX; if that fails, it falls back to a TradeStation quote-verified manual-assist procedure plus a loud alert. The doctrine is absolute: NEVER an unattended open position.

3

FlatTime → END_OF_DAY

When our FlatTime is reached (earlier than the venue's 15:05 CT hard flatten), the engine issues idempotent timed flatten orders, locks the account, and the session reflects. We flatten on our own clock so the venue's hard flatten is never the thing that closes us.
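The three paths above can be sketched as a tiny state machine. The state names follow this lesson; the check names, the function signatures, and the reduced set of states are illustrative assumptions, and the real session machine in the AXIOM spec may carry more states and a reason on HALTED.

```python
PREFLIGHT_CHECKS = (
    "config_hash_matches_approved",
    "venue_auth_ok",
    "data_stream_live",
    "compliance_limits_loaded",
    "sidecar_ping_ok",
    "strategy_roster_confirmed",
)


def preflight_failures(results: dict) -> list:
    """Names of failed or missing checks; an empty list means the day may ARM."""
    return [name for name in PREFLIGHT_CHECKS if not results.get(name, False)]


def watchdog_tripped(silence_s: float, in_session: bool, venue_api_ok: bool,
                     position_open: bool, max_silence_s: float = 10.0) -> bool:
    """Trip on in-session data silence or a venue failure with a position open."""
    data_silent = in_session and silence_s > max_silence_s
    venue_failed_with_position = position_open and not venue_api_ok
    return data_silent or venue_failed_with_position


def next_state(state: str, failures=(), dll_hit: bool = False,
               watchdog: bool = False, flat_time: bool = False) -> str:
    if state == "PRE_FLIGHT":
        return "PRE_FLIGHT" if failures else "ARMED"  # any failure: no ARMED today
    if state == "ARMED":
        if dll_hit or watchdog:
            return "HALTED"       # terminal for the day
        if flat_time:
            return "END_OF_DAY"
    return state                  # HALTED and END_OF_DAY never transition
```

A missing check result counts as a failure in this sketch, so forgetting to run a check can only keep the system from trading, never let it trade.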

Surrounding the engine are the hard locks — four design rules taken straight from the operator's own blowup autopsy. They are non-negotiable because they remove the human failure modes entirely rather than relying on discipline in the moment.

No buy button

1. No manual order entry exists

There is no buy button in AXIOM. There is no way to fat-finger a discretionary trade or 'just this once' override the system from the cockpit. Manual intervention happens only through the venue's own web UI, under the runbook.

Immutable 06:00–13:30 PT

2. Config immutable during market hours

RiskConfig (size, locks, limits) is loaded at PRE-FLIGHT from a signed file approved during the prior post-close window. The UI cannot edit it between 06:00 and 13:30 PT. You cannot loosen your own leash while the market is open.

Never same-day

3. Cooling-off staging

Any risk-config change is staged post-close and activates at the NEXT PRE-FLIGHT — never same-day. The decision to change a rule and the moment it takes effect are separated by a full cooling-off period, so a heated mid-day decision cannot reach live trading.

It can only make you wait

4. The override opens a Gemma chat

The only 'override' button does not grant anything. It loads gemma4:12b, whose deliberate load latency is intended friction. Its sole goal is to make you wait until the close. It cannot approve, place, size, or modify — it can only delay-to-close. By design, the system has no yes.
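Locks 2 and 3 can be sketched together as a small store that refuses to stage changes before the post-close window and swaps them in only at PRE-FLIGHT. The class and method names are hypothetical, timestamps are assumed to be naive Pacific-time datetimes, and the signed-file and hash checks are omitted.

```python
from datetime import datetime, time

POST_CLOSE_START = time(13, 30)  # end of the immutable 06:00 to 13:30 PT window


class RiskConfigStore:
    """Active config is fixed for the day; changes wait for the next PRE-FLIGHT."""

    def __init__(self, active: dict):
        self.active = active
        self.staged = None

    def stage(self, new_config: dict, now_pt: datetime) -> None:
        # Staging is only accepted post-close, so it can never take effect same-day.
        if now_pt.time() <= POST_CLOSE_START:
            raise PermissionError("risk config can only be staged post-close")
        self.staged = new_config

    def on_preflight(self) -> dict:
        # Called once at PRE-FLIGHT: a staged change becomes active here, not before.
        if self.staged is not None:
            self.active, self.staged = self.staged, None
        return self.active
```

Because stage() never touches active, even a successful post-close edit leaves the running day untouched; the only write path to active is on_preflight().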

Why the LLM can never touch this

Per the Constitution and D-067, no verdict, reflection, or LLM output may ever reach compliance-engine inputs. The AI sidecar may only veto, narrate, resist, or reflect — it is read-only advisory context with zero authority. Compliance narration is not even a model: it is deterministic templates over known numbers ('$420 from DLL · consistency headroom $1,080 · MLL floor locked'). Faster, and zero hallucination risk on safety-critical numbers.
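Template narration is simple enough to show in full. The sketch below reproduces the example string from known numbers; the function name, its parameters, and the "MLL floor trailing" wording for the unlocked case are assumptions.

```python
def narrate_compliance(dll_room: float, consistency_headroom: float,
                       mll_locked: bool) -> str:
    """Render the compliance status line from already-computed numbers."""
    mll_text = "MLL floor locked" if mll_locked else "MLL floor trailing"
    return (f"${dll_room:,.0f} from DLL · "
            f"consistency headroom ${consistency_headroom:,.0f} · {mll_text}")


line = narrate_compliance(420, 1080, True)
```

Every number in the output is an input, so the line cannot state anything the deterministic monitors did not compute.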

One more structural fact: AXIOM is multi-account ('pod') from day one, even with a single account today. Each venue account gets its OWN compliance-engine instance — independent DLL, MLL, and consistency state, because the pod contract is per seat. A ledger-recorded allocation map assigns each promoted strategy-cell to exactly one account, and new seats are added one at a time, profit-funded, each receiving a strategy uncorrelated with the existing book (the G6 correlation check is the gate). One TopstepX API subscription covers all linked accounts.

Where rules live
config/topstep-rules.json (versioned; Principle 9 — never code constants)
Active profile
activeAccount field → AXIOM resolves accountProfiles[activeAccount] at PRE-FLIGHT
Engine verdicts
Allow | Deny(reason) | DenyAndHalt(reason) — no 'Approve' exists
DLL ($50K)
$1,000 hard (forced break, not a breach); 80% Paras soft gate at $800
Trailing MLL ($50K)
$2,000; locks at start balance once +$2,000 reached; floor $0 after first payout
Consistency
Combine: best day ≤ 50% of target. XFA: largest day ≤ 40% of total net, at payout
Paras overlays (stricter)
80% DLL soft gate + personal profit lock (default 2R) + smallest-size + earlier FlatTime
Verification cadence
WEEKLY against help.topstep.com via /verify-topstep-rules; rule change = next PRE-FLIGHT
Override
Opens gemma4:12b chat — can grant nothing, only delay-to-close
Before any eval or live order

The config is web-verified, but it is not your account dashboard. Promo accounts differ (No-Activation-Fee $50K accounts reportedly have reduced payout caps — $2K/$3K vs the standard $5K/$6K, an open DISCREPANCY_FLAG in the file). XFA scaling-plan exact contract steps are not fully published. Minimum trading days were recently relaxed. Confirm all of these against your OWN Topstep dashboard first — the operator note in the config says exactly this.

What to stay aware of
  • The config's numbers are web-verified, not your account: promo accounts differ. No-Activation-Fee $50K accounts reportedly have $2K/$3K payout caps vs the standard $5K/$6K — an unresolved DISCREPANCY_FLAG in the file. Confirm against your own Topstep dashboard before any eval or live order.
  • Hitting the Daily Loss Limit is NOT a breach — it is a forced break (flatten, cancel, block new trades until 5:00 PM CT next session). Hitting the Trailing MLL IS a breach and ends the account permanently. Do not conflate the two.
  • The 80% DLL soft gate and the personal profit lock (default 2R) are PARAS overlays — stricter than Topstep, not Topstep rules. Know which numbers are the firm's and which are the house's.
  • The Trailing MLL only ever moves UP and locks at the starting balance once you've earned the MLL amount; after your first payout the floor sits at $0 permanently. The engine's projected-breach check enforces this against the worst-case stop-out, before the trade.
  • Combine consistency (best day ≤ 50% of profit target) and Express Funded consistency (largest day ≤ 40% of total net profit, at payout) are DIFFERENT rules with different thresholds and effects. Exceeding either delays/blocks — it does not breach.
  • Config is immutable 06:00–13:30 PT and any change cools off to the NEXT PRE-FLIGHT. You cannot loosen your own limits mid-session — by design. Plan rule changes for the post-close window.
  • The override button grants nothing — it only loads Gemma to make you wait until the close. There is no path in AXIOM to a discretionary yes. If you 'need' an override, the system is working.
  • Verification is weekly via /verify-topstep-rules and must be run before any first eval/live order or whenever a rule change is rumored. Each run appends a dated _verificationLog entry even when nothing changed.
  • Automation is allowed on Combine and Express Funded only — hard-blocked on Live Funded (Dynamic Live Risk Expansion, out of scope). No VPS/VPN; personal machine only.

Locked decisions & the why

D-040
Topstep rule values are versioned config (config/topstep-rules.json), re-checked regularly — never code constants.
Why: Constitution Principle 9. Topstep changed its rules 4+ times Nov 2025–Feb 2026; as config a rule change is a config change, not a code change and redeploy. The engine contains no rule literals.
D-042
The four hard locks: no manual order entry in AXIOM; config immutable during market hours; cooling-off staging of any change; the only 'override' opens a Gemma chat that can grant nothing.
Why: docs/03 §3.1 — taken directly from the operator's own blowup autopsy. They remove human failure modes (fat-fingers, mid-day leash-loosening, heat-of-the-moment overrides) entirely rather than relying on in-the-moment discipline.
D-064
config/topstep-rules.json is account-type driven: set activeAccount to the profile (Combine/XFA × 50K/100K/150K) and AXIOM resolves and applies that profile's rules automatically at PRE-FLIGHT; parasOverlays apply on top of whichever profile is active.
Why: Operator instruction 2026-06-10. One file drives every account size and type; the Paras stricter overlays (soft gate, profit lock) layer cleanly on top without forking the rule set.
D-063
Topstep rulebook re-check cadence raised from monthly to WEEKLY, folded into the calendar-news check via /verify-topstep-rules; each run appends a dated _verificationLog entry; rule changes activate at the next PRE-FLIGHT, never mid-session.
Why: Operator instruction 2026-06-10. Topstep moved rules repeatedly; weekly verification with an audit trail keeps the config honest, and treating a rule change as a RiskConfig change preserves the cooling-off / no-same-day-change guarantee.
D-062
config/topstep-rules.json web-verified against help.topstep.com on 2026-06-10; corrected stale seed values (MLL lock behavior, XFA consistency 40%, per-size DLL/MLL/targets/contracts) and separated Paras soft-gates from actual Topstep rules.
Why: The original seed had the MLL lock wrong ('start+$100') and the XFA consistency cap wrong (0.50 vs the correct 0.40 largest-day/total-net). Separating house overlays from firm rules prevents mistaking a Paras choice for a Topstep requirement.
D-067
No verdict, reflection, or LLM output may ever reach compliance-engine inputs; GO-class verdicts carry no permissive power; the LLM is read-only advisory context with zero authority.
Why: Operator ruling 2026-06-10 (risk T-06). Compliance must be deterministic; an LLM that could influence a lock or a check would reintroduce exactly the non-determinism the engine exists to remove. Compliance narration is templates, not a model — zero hallucination risk on safety numbers.
D-043
AXIOM is multi-account ('pod') from day one: each venue account gets its own compliance-engine instance; new seats get uncorrelated strategies (G6 check); profit-funded, one at a time.
Why: docs/03 §3.2. The pod contract is per seat, so DLL/MLL/consistency state must be independent per account; uncorrelated allocation prevents one shock from breaching the whole book.
D-084
(OD-1) AXIOM is a Windows WPF (.NET 8) desktop app, aligning all three apps on WPF; the Claude Design Next.js/React output is a UI/UX reference prototype only, never a runtime dependency.
Why: Operator ruling S028, 2026-06-17. A one-UI-stack standardization is the right solo-operator simplification and supersedes the earlier 'AXIOM stays WinUI 3' default in docs/03.
Sources: docs/03_AXIOM_SPEC.md §3 TopstepComplianceEngine (the centerpiece) — monitor table, Check() contract · docs/03_AXIOM_SPEC.md §3.1 The hard locks (no manual entry, immutable config, cooling-off, Gemma override) · docs/03_AXIOM_SPEC.md §3.2 Multi-account (pod) operation — per-account compliance instances · docs/03_AXIOM_SPEC.md §2 Session state machine (PRE-FLIGHT, watchdog, FLATTEN_AND_HALT, FlatTime) · docs/03_AXIOM_SPEC.md §1 Layers (L3 Compliance Engine wraps every order intent), §5 AI sidecar (deterministic narration) · config/topstep-rules.json — _meta, accountProfiles (Combine/XFA × 50K/100K/150K), consistency, dailyLossLimitBehavior, parasOverlays, globalRules, _verificationLog · tracking/DECISIONS.md — D-040, D-042, D-043, D-062, D-063, D-064, D-067, D-084
Module 12 of 19 · The Three Apps

📋 The Topstep Playbook

The practical, rule-by-rule playbook for trading a Topstep account through AXIOM without getting disqualified — the Combine-to-funded path, the three rules that end accounts, exactly how AXIOM enforces each deterministically, and what to verify before any order.

The one-line truth

You do not lose a Topstep account by lacking an edge — you lose it by breaking a rule. The dossier names the destruction mechanism precisely: discretionary oversizing after success. 'The plane flies; the pilot intervened.' This playbook is how AXIOM removes the pilot from the order path so the rules can never be broken in the heat of the moment.

This is the operator's field manual for running a Topstep account. The companion lesson 'Topstep Rules & the Compliance Engine' teaches the architecture; this one is the playbook — the path you actually walk, the rules you must never cross, what AXIOM does for you at each one, and the dos and don'ts that decide whether you stay funded. Everything here is grounded in config/topstep-rules.json, docs/03_AXIOM_SPEC.md, docs/05_BUILD_ROADMAP.md, and the two research dossiers. Where the docs say 'verify against your own dashboard,' so does this lesson — nothing here replaces that check.

Start with the mental model. A Topstep account is not your capital; it is a funded evaluation seat with hard guardrails. You pay a small monthly fee to rent the seat and prove repeatable skill; pass the evaluation (the Combine) and you get a funded account that splits real profit 90/10 in your favor. Break a guardrail and the seat is gone: that is a 'breach.' The entire job of AXIOM's compliance engine is to enforce those guardrails locally, in advance, and more strictly than Topstep does, so the firm's own enforcement is never the thing that catches you.

1

Step 0 — Edge first, money second (the budget gate)

Nothing in the Topstep path starts until a strategy holds G5 through the research gates (docs/05, A5). The dossier is blunt: 'every month in a Combine without a real edge costs money with 5–22% odds.' If the early families die at the null, the platform keeps researching at $0/mo and the Combine waits. That is the budget discipline working, not failing.

2

Step 1 — Buy the seat ($49 Combine + $29 API)

A $50K Combine costs ~$49/mo; the TopstepX API subscription that lets AXIOM place orders is $29/mo and covers all linked accounts. That ~$78/mo is the whole runtime cost — comfortably inside the ≤$200/mo budget (Constitution P10). One reset credit accrues per renewal; a reset is ~$49 if you need it. A2's exit criteria require an order round-trip verified on a practice/eval account at 1 micro before any of this goes live.

3

Step 2 — Trade the smallest size that exists (1 micro)

The Paras size policy is the smallest size — 1 micro (e.g., 1 MGC on gold) — until a measured edge AND the gap report justify more (parasOverlays.sizePolicy). At 1 micro a typical $300–$500 winning day never approaches the $1,500 consistency cap, so by construction the consistency rule is 'permanently irrelevant.' Smallest size is not timidity; it is how you make two of the three killer rules a non-issue.

4

Step 3 — Pass the Combine (hit the profit target, obey consistency)

On a $50K Combine the profit target is $3,000. You pass by reaching it WITHOUT any single day exceeding 50% of the target ($1,500) and without ever crossing the Daily Loss Limit ($1,000) or the Trailing Max Loss Limit ($2,000). Pass and you pay a one-time $149 activation fee to become funded.

5

Step 4 — Get funded, then get paid (90/10, spread your days)

Funded accounts (Express Funded path) split profit 90/10 in your favor, flat from the first dollar for accounts joined on/after 2026-01-12. The standard payout path is 5 winning days of $150+ net each; the consistency payout path is 3+ trading days with the largest day ≤ 40% of total net profit. After your first payout the trailing floor resets to $0 permanently.
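The cost and payout arithmetic in these steps can be checked with a few lines. This is a sketch under assumptions: the function names are hypothetical, "$150+" is read as at least $150, winning days are not required to be consecutive, and thresholds are passed in because the real values live in config and must be confirmed on your own dashboard.

```python
def monthly_runtime_cost(combine_fee: float, api_fee: float) -> float:
    return combine_fee + api_fee


def standard_path_ok(daily_pnls: list, min_winning_days: int,
                     min_day_profit: float) -> bool:
    """Standard payout path: enough days at or above the per-day minimum."""
    winning_days = sum(1 for pnl in daily_pnls if pnl >= min_day_profit)
    return winning_days >= min_winning_days


def consistency_path_ok(daily_pnls: list, min_trading_days: int,
                        max_day_fraction: float) -> bool:
    """Consistency payout path: largest day <= fraction of total net profit."""
    total_net = sum(daily_pnls)
    if len(daily_pnls) < min_trading_days or total_net <= 0:
        return False
    return max(daily_pnls) <= max_day_fraction * total_net
```

For example, monthly_runtime_cost(49, 29) gives 78, well inside the $200 budget, and three $300 days satisfy the consistency path at 40% because the largest day ($300) is below 40% of $900 ($360).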

Automation is allowed here — and ONLY here

globalRules: automation is allowed on Combine and Express Funded accounts. It is HARD-BLOCKED on Live Funded accounts (which use Dynamic Live Risk Expansion — out of scope). No VPS, no VPN — personal device only (Constitution P10). The whole A5 path lives inside the accounts where AXIOM is permitted to trade for you.

Now the three rules that decide whether you keep the account. Learn these cold — everything AXIOM does is in service of them, and every account death in the dossier traces back to one of them being crossed by a human in the moment.

Account-ending breach

Trailing Max Loss Limit (MLL)

A floor under the account that follows your high-water mark UP but NEVER back down. On a $50K Combine it starts $2,000 below the starting balance; it trails on end-of-day balance and, once you have earned +$2,000, LOCKS at the starting balance and stops trailing. Cross it and the account is breached, permanently. After your first funded payout the floor resets to $0 forever.

Forced break (not a breach)

Daily Loss Limit (DLL)

The most you may lose in one session — $1,000 on a $50K Combine. Hitting it is NOT a breach. It is a forced break: Topstep flattens positions, cancels pending orders, and blocks new trades until the next session at 5:00 PM CT. It exists to stop the revenge-trading spiral cold.

Delays pass / blocks payout

Consistency rule

Combine: your single best day must be ≤ 50% of the profit target ($1,500 of $3,000 on $50K). Express Funded: largest day ≤ 40% of total net profit, checked at payout (min 3 trading days). Exceeding it does NOT breach — it raises the target / blocks the payout until your profit is spread out. One lottery day cannot pass an evaluation.
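The trailing floor described in the MLL card can be sketched as a pure function of end-of-day balances. This is an illustration only, assuming the floor trails the highest end-of-day balance and caps at the starting balance; the function name is hypothetical, payouts are ignored, and the $2,000 default stands in for a value that would be read from config.

```python
def trailing_mll_floor(start_balance, eod_balances, max_loss=2_000.0):
    """Return the trailing Max Loss Limit floor after the given end-of-day balances.

    Assumed reading of the rule: the floor sits `max_loss` below the highest
    end-of-day balance seen so far, never moves down, and stops rising once it
    reaches the starting balance.
    """
    high_water = start_balance
    for balance in eod_balances:
        high_water = max(high_water, balance)
    return min(high_water - max_loss, start_balance)


start = 50_000.0
print(trailing_mll_floor(start, []))                    # 48000.0
print(trailing_mll_floor(start, [50_600.0, 50_200.0]))  # 48600.0 (floor did not follow the dip)
print(trailing_mll_floor(start, [52_500.0, 51_000.0]))  # 50000.0 (locked at the starting balance)
```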

Rule | $50K Combine value | Crossing it means | How AXIOM enforces it
Trailing MLL | $2,000; locks at start balance once +$2,000 earned; $0 after first payout | ACCOUNT DEAD (permanent breach) | Projected-breach DENY: computes where a candidate entry's stop-out would leave the account; if that worst case crosses the trailing floor, the entry is denied BEFORE it is placed.
Daily Loss Limit | $1,000 hard; Paras soft gate at 80% ($800) | Forced break until 5:00 PM CT (NOT a breach) | Soft gate (Paras overlay) blocks new entries at $800; at $1,000 the engine flattens, cancels pending, and HALTs the day.
Consistency (Combine) | best day ≤ 50% of $3,000 target ($1,500) | Delays passing (raises the consistency target) | Personal profit lock (Paras overlay, default 2R): when day P&L hits the lock, new entries are blocked and the open position exits at its next signal exit.
Position caps | 5 minis / 50 micros per account; tighter per-strategy cap from the spec | Over-cap order rejected | Deny any intent that would exceed the account or per-strategy cap.
Clock / FlatTime | Venue hard flatten 15:05 CT; ours earlier per SessionTemplate.FlatTime | n/a (we flatten first) | Idempotent timed flatten orders on OUR clock, before the venue ever has to act.

DLL vs MLL — never confuse them

Hitting the Daily Loss Limit is a bad day; hitting the Trailing MLL is a dead account. The DLL flattens you and sends you home until 5:00 PM CT — you come back tomorrow. The MLL is the floor that only moves up: cross it and the seat is gone permanently (Back2Funded allows ≤2 paid revivals pre-first-payout, but treat a breach as final). The smallest-size policy keeps you far from both.

Here is how AXIOM enforces all of this deterministically. The compliance engine is Layer 3 of AXIOM and it wraps EVERY order intent — no order reaches the TopstepX adapter without passing through it first. The contract is a single function call returning one of three verdicts. Note what is absent: there is no 'Approve' that grants anything. The engine is a one-way valve.

// Every order intent, BEFORE any venue/adapter call:
Check(intent) -> Allow | Deny(reason) | DenyAndHalt(reason)

//  Allow         -> intent may proceed to the execution adapter
//  Deny(reason)  -> this entry is blocked; the day continues
//  DenyAndHalt   -> block, flatten everything, HALT the day (terminal)

// Three deterministic enforcement moves map to the three rules:
//  MLL         -> projected-breach DENY (worst-case stop-out checked first)
//  DLL         -> 80% soft gate blocks entries; 100% flatten + HALT
//  Consistency -> personal profit-lock blocks new entries
//
// No LLM, verdict, or reflection may ever reach these inputs (D-067).

A concrete walk-through on a $50K Combine, 1 micro. You take two small losers for -$600 on the day. The 80% DLL soft gate sits at $800, so entries are still allowed but you are close. A third signal fires; its stop-out would be another -$450, taking the day to -$1,050. The engine projects that this crosses the $1,000 DLL and returns Deny — the trade is never placed. Now flip it: the day goes well and you book +$1,500. That is exactly 50% of the $3,000 target — the consistency ceiling. But your personal profit lock (default 2R) would already have blocked new entries earlier and let the open trade exit on its own signal, so you never blow past the consistency line by accident. At 1 micro, though, a normal day is $300–$500 and you rarely come near any of these — which is the entire point of the smallest-size policy.
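The walk-through above can be replayed with a small sketch of the three-verdict contract. This is not the AXIOM engine: the class names, field names, check ordering, and the $400 profit-lock figure (standing in for "2R") are all assumptions made for illustration, and the limit values are passed in as data because the real engine reads them from config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    daily_loss_limit: float     # hard DLL from the active account profile
    soft_gate_fraction: float   # Paras overlay, e.g. 0.8
    profit_lock: float          # Paras overlay in dollars (hypothetical stand-in for 2R)


@dataclass(frozen=True)
class AccountState:
    balance: float
    mll_floor: float
    day_pnl: float


def check(state, worst_case_loss, limits):
    """Return (verdict, reason) for one entry intent; pure and deterministic."""
    day_loss = max(0.0, -state.day_pnl)
    if day_loss >= limits.daily_loss_limit:
        return ("DenyAndHalt", "DLL reached")
    if day_loss >= limits.soft_gate_fraction * limits.daily_loss_limit:
        return ("Deny", "DLL soft gate")
    if state.day_pnl >= limits.profit_lock:
        return ("Deny", "profit lock")
    if state.balance - worst_case_loss <= state.mll_floor:
        return ("Deny", "projected MLL breach")
    if day_loss + worst_case_loss >= limits.daily_loss_limit:
        return ("Deny", "projected DLL breach")
    return ("Allow", "")


limits = Limits(daily_loss_limit=1_000.0, soft_gate_fraction=0.8, profit_lock=400.0)
down_600 = AccountState(balance=49_400.0, mll_floor=48_000.0, day_pnl=-600.0)
print(check(down_600, 450.0, limits))  # ('Deny', 'projected DLL breach')
```

With the day at -$600 the soft gate at $800 has not triggered, but a $450 worst case would take the day to -$1,050, so the entry is denied before it exists.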

Two phrases there are load-bearing: 'Paras overlay' and 'projected.' The 80% DLL soft gate and the personal profit lock are NOT Topstep rules — they are Paras choices, stricter than Topstep, applied on top of whichever account profile is active (parasOverlays in the config). And the MLL check is forward-looking, not reactive: the engine never waits for you to breach and then respond — by then the account is already dead. It checks the worst case before the trade exists.

Enforcement does not rest on the per-intent check alone. The session is a state machine, and two engine-triggered paths exist no matter what a strategy wants: the HALT path and the FLATTEN path. PRE_FLIGHT at 06:00 PT verifies the config hash matches yesterday's approved hash, venue auth, live data, loaded limits, sidecar health, and the strategy roster — any failure means no ARMED today. A hard violation (DLL at 100%, or a watchdog trip: market-data silence >10s in session, or a venue API failure with a position open) drives the session to HALTED and FLATTEN_AND_HALT. FlatTime drives an orderly END_OF_DAY flatten on our earlier clock. The doctrine is absolute: never an unattended open position.
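The engine-triggered paths in the paragraph above can be sketched as three small pure functions. This is a rough illustration, assuming the thresholds named in the text (10 seconds of data silence, DLL at 100%); the function names, the check keys, and the "CONTINUE" return value are hypothetical and not taken from the spec.

```python
def may_arm(preflight_checks):
    """PRE_FLIGHT: every named check must pass, otherwise the session never arms."""
    return bool(preflight_checks) and all(preflight_checks.values())


def watchdog_tripped(seconds_since_last_tick, venue_api_ok, position_open,
                     max_silence_s=10.0):
    """Trip on market-data silence, or on a venue API failure while a position is open."""
    data_silent = seconds_since_last_tick > max_silence_s
    venue_failure_with_position = (not venue_api_ok) and position_open
    return data_silent or venue_failure_with_position


def next_action(day_loss, daily_loss_limit, tripped, past_flat_time):
    """Hard violations win over the orderly end-of-day flatten."""
    if day_loss >= daily_loss_limit or tripped:
        return "FLATTEN_AND_HALT"
    if past_flat_time:
        return "END_OF_DAY"
    return "CONTINUE"


checks = {"config_hash": True, "venue_auth": True, "live_data": True,
          "limits_loaded": True, "sidecar_health": True, "strategy_roster": False}
print(may_arm(checks))                              # False
print(next_action(200.0, 1_000.0, False, True))     # END_OF_DAY
print(next_action(200.0, 1_000.0, watchdog_tripped(12.0, True, False), False))  # FLATTEN_AND_HALT
```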

The hard locks are why none of this can be undone in the moment. They come straight from the operator's own blowup autopsy and remove human failure modes entirely rather than relying on discipline under pressure.

No discretionary entry

1. No buy button exists

There is no manual order entry in AXIOM. You cannot fat-finger a discretionary trade or oversize 'just this once' from the cockpit — the exact mechanism the dossier names as your destruction. Manual intervention happens only through the venue's own web UI, under the runbook.

Immutable 06:00–13:30 PT

2. Config immutable in market hours

RiskConfig (size, locks, limits) is loaded at PRE_FLIGHT from a signed file approved during the prior post-close window. The UI cannot edit it 06:00–13:30 PT. You cannot loosen your own leash while the market is open.

Never same-day

3. Cooling-off staging

Any risk-config change is staged post-close and activates at the NEXT PRE_FLIGHT — never same-day. The decision to change a rule and the moment it takes effect are separated by a full cooling-off period, so a heated mid-day decision cannot reach live trading.

Delay-to-close only

4. The override only makes you wait

The single 'override' button grants nothing. It loads gemma4:12b, whose deliberate load latency is intended friction; its sole goal is to delay you to the close. It cannot approve, place, size, or modify. By design, the system has no yes.
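Locks 2 and 3 are simple enough to sketch as time checks. The sketch below assumes times are already expressed in PT, treats the post-close window as "any time after 13:30 PT on the same day", and ignores weekends and holidays when computing the next PRE_FLIGHT date; the function names are hypothetical.

```python
from datetime import date, time, timedelta


def config_locked(now_pt, lock_start=time(6, 0), lock_end=time(13, 30)):
    """True while the risk config may not be edited (market hours, PT)."""
    return lock_start <= now_pt <= lock_end


def stage_change(staged_on, now_pt, lock_end=time(13, 30)):
    """Accept a change only after the close; return the earliest activation date.

    The change never activates on the day it was staged: it waits for the next
    PRE_FLIGHT, approximated here as the next calendar day.
    """
    if now_pt <= lock_end:
        raise PermissionError("risk-config changes are staged post-close only")
    return staged_on + timedelta(days=1)


print(config_locked(time(9, 45)))                      # True
print(stage_change(date(2026, 6, 10), time(14, 5)))    # 2026-06-11
```

Separating "accepted" from "active" is the whole point of the cooling-off lock: the function can say yes to staging without anything changing in the current session.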

Operator DOs

DO trade the smallest size (1 micro / 1 MGC) until a measured edge and the gap report justify more. DO run /verify-topstep-rules WEEKLY and again before any first eval/live order. DO stage every rule change for the post-close window so it cools off to the next PRE_FLIGHT. DO let the DLL forced-break send you home — tomorrow is a fresh session. DO confirm promo-specific items (payout caps, XFA scaling steps, minimum trading days) against your OWN account dashboard before relying on the config.

Operator DON'Ts (these are what get you DQ'd)

DON'T oversize after a winning streak — the documented destruction mechanism is discretionary oversizing after success. DON'T revenge-trade a bad morning; that is exactly what the DLL forced-break and 80% soft gate exist to stop. DON'T try to loosen your limits mid-session — you can't, and the impulse is the warning sign. DON'T run automation on a Live Funded account (hard-blocked). DON'T treat the web-verified config as your account's truth — promos differ. DON'T trust a backup you have never restored (D-071).

Rule VALUES live in versioned config — never as code constants (Principle 9)

There is no '$1000' or '$2000' literal in the compliance engine. You set activeAccount to your profile (Combine / Express Funded × 50K / 100K / 150K) and AXIOM resolves accountProfiles[activeAccount] at PRE_FLIGHT and applies it automatically; the Paras overlays layer on top. Topstep changed its rules 4+ times between Nov 2025 and Feb 2026 — as config, a rule change is a config change, not a redeploy. The config carries a hard operator note: VERIFY against your own account dashboard before any eval/live order, because values can differ by promo (e.g., No-Activation-Fee $50K accounts reportedly have $2K/$3K payout caps vs the standard $5K/$6K — an unresolved DISCREPANCY_FLAG in the file).
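How a profile might be resolved from config, rather than from literals in code, can be sketched as follows. The top-level keys activeAccount, accountProfiles, and parasOverlays (with sizePolicy) are named in the source; the profile id "combine-50k" and every inner field name are assumptions for illustration, and the JSON shown is a toy fragment, not the real config/topstep-rules.json.

```python
import json

# Hypothetical fragment; inner field names are illustrative only.
CONFIG_TEXT = """
{
  "activeAccount": "combine-50k",
  "accountProfiles": {
    "combine-50k": {"profitTarget": 3000, "dailyLossLimit": 1000, "maxLossLimit": 2000}
  },
  "parasOverlays": {"dllSoftGateFraction": 0.8, "sizePolicy": "smallest"}
}
"""


def resolve_rules(config):
    """Pick the active profile and layer the Paras overlay on top of it."""
    active = config["activeAccount"]
    profiles = config["accountProfiles"]
    if active not in profiles:
        raise KeyError(f"activeAccount {active!r} has no entry in accountProfiles")
    rules = dict(profiles[active])  # copy so the loaded config is never mutated
    overlay = config["parasOverlays"]
    rules["dllSoftGate"] = rules["dailyLossLimit"] * overlay["dllSoftGateFraction"]
    rules["sizePolicy"] = overlay["sizePolicy"]
    return rules


rules = resolve_rules(json.loads(CONFIG_TEXT))
print(rules["dllSoftGate"])  # 800.0
```

Under this shape a Topstep rule change is an edit to the JSON values, verified against the account dashboard, with no change to the resolving code.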

Runtime cost: $49/mo Combine + $29/mo TopstepX API (~$78/mo, covers linked accounts); under the $200/mo budget
Activation / reset: $149 one-time activation on pass; reset ~$49 (1 reset credit accrues per renewal)
Size policy: Smallest size, 1 micro (1 MGC), until an edge AND gap report justify more (parasOverlays.sizePolicy)
Profit target ($50K): $3,000 to pass the Combine
DLL ($50K): $1,000 hard forced-break (NOT a breach); Paras soft gate at 80% ($800)
Trailing MLL ($50K): $2,000; locks at start balance once +$2,000 earned; $0 after first payout
Consistency: Combine best day ≤ 50% of target; XFA largest day ≤ 40% of total net, at payout
Position caps ($50K): 5 minis / 50 micros per account; tighter per-strategy cap from the spec
Payout: 90/10 split (flat from $1 for accounts joined on/after 2026-01-12); standard = 5 days of $150+ net
Automation allowed on: Combine + Express Funded only; HARD-BLOCKED on Live Funded; no VPS/VPN
Verification: WEEKLY via /verify-topstep-rules + before any first eval/live order; appends dated _verificationLog entry
Override: Opens gemma4:12b chat; grants nothing, only delay-to-close

The bottom line for the operator: at 1 micro, with AXIOM's projected-breach MLL check, the 80% DLL soft gate, the personal profit lock, and four hard locks that remove the buy button and freeze your config in market hours, the one named way to lose the account, discretionary oversizing after a win, is a path AXIOM no longer lets a human reach. The dossier calls that 'the median outcome, not bad luck' for the unprotected trader. This playbook, and the engine behind it, exists so it can never be your outcome.

What to stay aware of
  • The named way you lose the account is discretionary oversizing after success — the dossier calls it 'the median outcome, not bad luck' for the unprotected trader. AXIOM's no-buy-button + immutable-config locks exist specifically to make that impossible.
  • Hitting the Daily Loss Limit ($1,000 on $50K) is a forced break (flatten, cancel, block until 5:00 PM CT) — NOT a breach. Hitting the Trailing MLL ($2,000) IS a breach and ends the account permanently. Never conflate the two.
  • The 80% DLL soft gate ($800) and the personal profit lock (default 2R) are PARAS overlays — stricter than Topstep, not Topstep rules. Know which numbers are the firm's and which are the house's.
  • Combine consistency (best day ≤ 50% of profit target) and Express Funded consistency (largest day ≤ 40% of total net profit, at payout) are DIFFERENT rules with different thresholds and effects. Exceeding either delays/blocks — it does not breach.
  • At 1 micro the consistency rule is effectively irrelevant by construction: a $300–$500 typical day never approaches the $1,500 best-day cap. Smallest size is how you make two of the three killer rules a non-issue.
  • Config is immutable 06:00–13:30 PT and any change cools off to the NEXT PRE_FLIGHT. You cannot loosen your own limits mid-session — by design. Plan rule changes for the post-close window.
  • Automation is allowed on Combine and Express Funded ONLY — hard-blocked on Live Funded (Dynamic Live Risk Expansion, out of scope). No VPS/VPN; personal machine only (Constitution P10).
  • The web-verified config is NOT your account dashboard. Promo accounts differ — No-Activation-Fee $50K accounts reportedly have $2K/$3K payout caps vs the standard $5K/$6K (an open DISCREPANCY_FLAG), XFA scaling steps are not fully published, and minimum trading days were recently relaxed. Confirm against your own dashboard before any eval or live order.
  • Micros count 10:1 vs minis on TopstepX only — relevant when reading position caps (5 minis / 50 micros on $50K). The smallest-size policy keeps you nowhere near the cap regardless.
  • Nothing in the Topstep path (A5) starts before a strategy holds G5. A Combine without a real edge costs money at 5–22% pass odds; researching at $0/mo until an edge exists is the budget discipline, not a stall.

Locked decisions & the why

D-040
Topstep rule VALUES are versioned config (config/topstep-rules.json), re-checked regularly — never code constants.
Why: Constitution Principle 9. Topstep changed its rules 4+ times Nov 2025–Feb 2026; as config a rule change is a config change, not a code change and redeploy. The compliance engine contains no rule literals — you set activeAccount and it resolves the profile.
D-042
The four hard locks: no manual order entry in AXIOM; config immutable during market hours; cooling-off staging of any change; the only 'override' opens a Gemma chat that grants nothing.
Why: docs/03 §3.1 — taken directly from the operator's own blowup autopsy. They remove the named destruction mechanism (discretionary oversizing after success) and other human failure modes entirely, rather than relying on in-the-moment discipline.
D-064
config/topstep-rules.json is account-type driven: set activeAccount to the profile (Combine / XFA × 50K / 100K / 150K) and AXIOM resolves and applies that profile's rules automatically at PRE_FLIGHT; parasOverlays apply on top of whichever profile is active.
Why: Operator instruction 2026-06-10. One file drives every account size and type; the Paras stricter overlays (80% soft gate, profit lock, smallest size, earlier FlatTime) layer cleanly on top without forking the rule set.
D-063
Topstep rulebook re-check cadence is WEEKLY, folded into the calendar-news check via /verify-topstep-rules; each run appends a dated _verificationLog entry; rule changes activate at the next PRE_FLIGHT, never mid-session.
Why: Operator instruction 2026-06-10. Topstep moved rules repeatedly; weekly verification with an audit trail keeps the config honest, and treating a rule change as a RiskConfig change preserves the cooling-off / no-same-day-change guarantee. The skill must also run before any first eval/live order.
D-067
No verdict, reflection, or LLM output may ever reach compliance-engine inputs; the AI sidecar is read-only advisory context with zero authority over compliance.
Why: Operator ruling 2026-06-10 (risk T-06). Compliance must be deterministic; an LLM that could influence a lock or a check would reintroduce exactly the non-determinism the engine exists to remove. Compliance narration is templates, not a model — zero hallucination risk on safety numbers.
D-069
Zero new API token spend at runtime; runtime inference is local Ollama only; no bar-level LLM calls — session cadence is the runtime ceiling for any model invocation.
Why: Budget-true and order-path safety. The Combine runs at ~$78/mo ($49 + $29) with $0 marginal AI cost; an LLM in the order loop would be both a cost and a determinism violation.
D-084
(OD-1) AXIOM is a Windows WPF (.NET 8) desktop app, aligning all three apps (PULSE/SENTINEL/AXIOM) on WPF; the Claude Design Next.js/React output is a UI/UX reference prototype only, never a runtime dependency.
Why: Operator ruling S028, 2026-06-17. A one-UI-stack standardization is the right solo-operator simplification and supersedes the earlier 'AXIOM stays WinUI 3' default in docs/03.
Sources: config/topstep-rules.json — _meta (Principle 9, weekly verify, operator note), activeAccount, accountProfiles (Combine/XFA × 50K/100K/150K), consistency, dailyLossLimitBehavior, payout, parasOverlays (soft gate, profit lock, smallest-size), globalRules (automation allowed on Combine/XFA, blocked on Live Funded; no VPS/VPN), _verificationLog · docs/03_AXIOM_SPEC.md §3 TopstepComplianceEngine — monitor table, Check() Allow|Deny|DenyAndHalt contract, projected-breach MLL check · docs/03_AXIOM_SPEC.md §3.1 The hard locks — no manual entry, config immutable in market hours, cooling-off, Gemma override · docs/03_AXIOM_SPEC.md §2 Session state machine — PRE_FLIGHT (06:00 PT), watchdog, FLATTEN_AND_HALT, FlatTime/END_OF_DAY · docs/05_BUILD_ROADMAP.md Track A — A5 First Combine ($49 Combine + $29 API, smallest size, operator runbook); A2 order round-trip at 1 micro; critical-path note (nothing in A5 before G5) · docs/research/Consistent_Revenue_Dossier.html — destruction mechanism (discretionary oversizing after success), Combine→XFA→payout path, $49/$29 budget, 90/10 split, MLL/DLL/consistency interactions, smallest-size makes consistency irrelevant, 5–22% pass odds · docs/research/Owners_Reference.html — Combine ≈ $78–127 runtime, MLL breach = account dead, micros count 10:1, smallest size, breach-denial compliance summary · CLAUDE.md — Constitution P9 (rules are config) / P10 (local-first, ≤$200/mo) / P11 (adapt by saying no); hard rules (no LLM in order path; rule values never constants; config operator-verified before any eval/live order); D-069 · tracking/DECISIONS.md — D-040, D-042, D-063, D-064, D-067, D-069, D-084
Module 13 of 19 · Research & Discovery

🔭 Model Discovery Methodology

How Paras finds, classifies, and budgets trading-model ideas — Edge Class → Family → Model → Cell — and turns each into a pre-registered spec that must earn its place through the gates.

Read this first — the one word that means two things

In this lesson, "model" ALWAYS means a TRADING model — a strategy, a way of taking money out of the market (e.g. "buy the opening-range breakout on in-play ES days"). It NEVER means an AI model. When this system means an LLM (Gemma, Ollama, Claude), it says "AI Model" explicitly. The model-research hub enforces this distinction as a non-negotiable rule. Hold it the whole way through.

You already trade ES, NQ, GC, CL, and SI by feel — you read the tape, you know the setups. Model Discovery is how Paras turns that universe of trading ideas into a disciplined, countable pipeline. The job of this methodology is NOT to find clever strategies. The job is the opposite of optimism: it organizes every candidate so the system can KILL the bad ones cheaply, and so the few genuine edges face a higher and higher bar. This is the touchstone (Parasmani): most metals aren't gold, and that is the point.

Discovery answers three operator questions, in order. (1) Where do trading-model ideas come from and where do they live? (2) How do we organize them so we don't fool ourselves? (3) How does one idea become something testable that enters the gates? The answers are: a dedicated research hub, a four-level hierarchy ending in the "cell," and the /new-strategy ritual that writes a spec with a stated mechanism and a pre-registered trial budget.

THE HIERARCHY — four levels, each narrower than the last. This is the spine of discovery. Read it top-down: the top two levels are how humans think and budget; the bottom two are what actually gets tested. The Edge Class layer is an organizing / diversification layer ABOVE the Family, which is the budget unit.

Level | What it is | Concrete example | Role
Edge Class | A category of market edge: the economic reason a strategy could work | Breakout | ORGANIZING + diversification layer; used at G6 to promote UNCORRELATED survivors, never as a leaderboard rank
Family | A group of related models sharing one mechanism idea | opening-range breakout on in-play days ("orb-inplay") | THE BUDGET UNIT: the pre-registered trial budget is set per family; re-rolls and siblings draw from the SAME family budget
Model | One specific, stated trading strategy with a mechanism | OrBreakoutTrigger on ES, 30-min OR, in-play gate, 2:1 bracket | Becomes one specs/proposed/*.yaml
Cell | model × instrument × session × timeframe × params: ONE concrete experiment | that ORB model, ES, RTH PT session, M5, orMinutes=30, rr=2.0 | THE UNIT OF TEST: every cell that runs is a counted trial

The cell is the heart of it

A CELL is the smallest testable thing in Paras: a model pinned to a specific instrument, session, timeframe, and exact parameter set. "ORB breakout, ES, RTH, M5, 30-minute range, 2:1 risk" is one cell. Change the instrument to GC, or the timeframe to M15, or orMinutes to 60 — each change is a DIFFERENT cell. The kernel spec says it directly: strategies are EXPECTED to differ per (ticker, session) — that is the experiment_cell coordinate, not an exception. The canonical trade record stamps every result with its cell (symbol, timeframe, session, params_hash), so any number traces back to the exact experiment that produced it.
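The "change one coordinate, get a different cell" rule can be made concrete with a small sketch. The record fields (symbol, timeframe, session, params_hash) follow the trade-record description above; the hashing scheme (SHA-256 over canonical JSON, truncated to 12 hex characters) and the helper names are assumptions, not the kernel's actual method.

```python
import hashlib
import json
from dataclasses import dataclass


def params_hash(params):
    """Hash parameters in a canonical form so key order cannot change the result."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class Cell:
    model: str
    symbol: str
    session: str
    timeframe: str
    params_hash: str


def make_cell(model, symbol, session, timeframe, params):
    return Cell(model, symbol, session, timeframe, params_hash(params))


base = make_cell("orb-inplay", "ES", "ES_RTH_PT", "M5", {"orMinutes": 30, "rr": 2.0})
same = make_cell("orb-inplay", "ES", "ES_RTH_PT", "M5", {"rr": 2.0, "orMinutes": 30})
wider = make_cell("orb-inplay", "ES", "ES_RTH_PT", "M5", {"orMinutes": 60, "rr": 2.0})

print(base == same)   # True: same experiment, different key order
print(base == wider)  # False: orMinutes changed, so this is a new counted trial
```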

WHY CELLS, AND WHY THE COUNT MATTERS. Each cell you run is one roll of the dice against randomness. Run enough cells and SOMETHING will look profitable by pure luck — this is the p-hacking trap, the thing that destroys most retail "systems." Paras' defense is built into the hierarchy: the trial budget is pre-registered per family BEFORE any run, and the more cells a family burns, the HIGHER the statistical bar that survivors must clear (the Deflated Sharpe Ratio, DSR). You don't get to keep searching for free.
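To see why the bar rises with the count, here is a sketch of the Deflated Sharpe Ratio using the published Bailey and López de Prado approximation for the expected maximum Sharpe among N zero-skill trials. It is an illustration under stated assumptions: Sharpe values are per-period (not annualized), `sharpe_std` is the standard deviation of Sharpe estimates across the family's trials, and the Paras validation service may compute its inputs differently.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_STD_NORMAL = NormalDist()


def expected_max_sharpe(n_trials, sharpe_std):
    """Approximate best Sharpe expected from pure luck across n_trials trials."""
    if n_trials < 2:
        return 0.0
    a = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return sharpe_std * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(sr, n_obs, n_trials, sharpe_std, skew=0.0, kurt=3.0):
    """Probability that the observed Sharpe exceeds the luck-adjusted hurdle."""
    hurdle = expected_max_sharpe(n_trials, sharpe_std)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr * sr)
    return _STD_NORMAL.cdf((sr - hurdle) * math.sqrt(n_obs - 1) / denom)


# Same observed result, judged against a family that burned 5 cells vs 30 cells.
for cells in (5, 30):
    hurdle = expected_max_sharpe(cells, sharpe_std=0.05)
    dsr = deflated_sharpe(sr=0.12, n_obs=250, n_trials=cells, sharpe_std=0.05)
    print(cells, round(hurdle, 4), round(dsr, 3))
```

The same observed Sharpe earns a lower DSR when the family has burned more cells, which is the mechanism behind "you don't get to keep searching for free."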

1. Pre-register the budget

The spec declares trial_budget: {family, max_cells}. That number is fixed before any backtest runs. A budget gate — CODE, not Claude — blocks the family from queuing more cells once max_cells is exhausted (D-030).

2. Every cell is counted

Every backtest run records a trial row in the ledger. There are no untracked runs (Constitution Principle 2). Re-rolls ("let me just try one more parameter") draw from the SAME family budget — you can't escape the count by renaming.

3. The bar rises with the count (G3)

At gate G3 the validation service computes DSR > 0.95 GIVEN the family trial count, plus PBO < 0.20 (CSCV). More cells tried = a higher hurdle. A family that burned 30 cells needs a far stronger result than one that burned 5 to be believed.

4. No combo-search escape hatch

"Compose a hundred combos and see what sticks" is explicitly REJECTED (D-087). It violates the mechanism requirement (G0/D-031) and the counting rule (P2). Discovery is mechanism-first, never search-first.

WHERE THE IDEAS COME FROM — the master catalog and the seed backlog. Paras keeps a LIVING master catalog of trading models classified specifically for FUTURES DAY TRADING. It holds 123 models (well beyond the operator's own 42-rule grammar), each tagged Intraday / Overnight / Swing, with a Day-Trade Shortlist tab (44 intraday/keep) and one sheet per Edge Class. It is regenerated from raw per-class JSON — never hand-edited — so its provenance stays auditable. New ideas are added by dropping a JSON file into the raw store and re-running the generator.

The living master: catalog/MASTER_Model_Research_Futures_DayTrade.xlsx (123 models, futures-day-trade classified as Intraday/Overnight/Swing, 1 sheet per Edge Class, a 44-row Day-Trade Shortlist)
The raw store: raw/model-catalog-json/ (the per-class JSON the master is built from; add <NN>-<class>-new.json and regenerate to extend)
The operator's grammar: AXIOMMTF_MODEL_IDEAS_BACKLOG.md (your 42 source-verified discretionary MTF rules: Divergence, Reversal, Continuation, Disconnect, Special, Filters, Cross-instrument), kept as a SEED BACKLOG of ideas, NOT validated edges
Internal hypotheses: Hypothesis_Codex.html + research dossiers (named systems, institutional patterns, revenue strategies)
The 12 Edge Classes: Trend-Following · Mean-Reversion · Breakout · Momentum · Reversal · Divergence · Relative-Value · Volatility · Seasonality · Order-Flow · Market-Structure · Event-Catalyst

Nothing in the catalog is an edge — yet

The INDEX says it plainly: "Nothing here is a validated edge. Every model is a candidate that must clear the gates like everything else." Your own 42 rules are no exception — the backlog explicitly states they are discretionary pattern IDEAS with NO special status (D-087/D-088a). A model in the catalog is a hypothesis waiting for a mechanism and a budget, not a green light to trade.

FROM AN IDEA TO A SPEC — the component grammar. To be testable, a model can't be a vibe ("buy the brick wall"); it must be expressed in the kernel's four-layer component grammar. Every strategy is a composition of small, pure, individually testable pieces — deterministic functions of a read-only MarketState snapshot, with no I/O, no randomness, no wall-clock reads. This is what /new-strategy scaffolds, and what the spec schema validates.

Layer | Interface | Asks | Examples
Context gate | IContextGate | Should we even look today? | InPlayGate, RegimeGate, SessionGate, CalendarGate
Trigger | ITrigger | Fire a signal on THIS bar close? | OrBreakoutTrigger, OrFadeTrigger, FirstHalfHourMomentumTrigger
Filter | IFilter | Allow or veto this signal? | VolumeConfirmFilter, OneTradePerDayFilter
Execution | IExecutionPolicy | What bracket (entry, stop, target, size, TIF)? | FixedRiskBracket(rr=2.0), TimedExit(flatTime)

# A model expressed as a spec (docs/01 §3). /new-strategy scaffolds this shape.
id: orb-inplay
version: 0.1.0
mechanism: "On in-play days the opening range resolves directionally often enough to clear friction."  # REQUIRED — no mechanism, no run (G0/D-031)
instrument: ES
timeframe: M5
session: ES_RTH_PT
context:  [{type: InPlayGate, orwMult: 1.0, gapAtrMin: 0.30, warmup: 10}]
trigger:  {type: OrBreakoutTrigger, orMinutes: 30}
filters:  [{type: OneTradePerDayFilter}]
execution: {type: FixedRiskBracket, rr: 2.0, contracts: 1}
trial_budget: {family: orb-inplay, max_cells: 8}     # pre-registered (Principle 2)

# The {instrument, timeframe, session} + trigger/execution params ARE the cell coordinate.
# Changing instrument to GC, or orMinutes to 60, makes a DIFFERENT cell — a separate counted trial.
# SpecLoader validates against spec.schema.json; unknown keys are ERRORS; the spec hash is recorded with every run.
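The four-layer composition that the spec describes can be sketched as pure functions over a read-only snapshot. The real kernel is C# with the IContextGate / ITrigger / IFilter / IExecutionPolicy interfaces; this Python version is only an illustration, and the MarketState fields, the stop placement at the opposite side of the opening range, and the function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketState:          # read-only snapshot; fields are hypothetical
    close: float
    or_high: float
    or_low: float
    in_play: bool
    trades_today: int


@dataclass(frozen=True)
class Bracket:
    side: str
    entry: float
    stop: float
    target: float
    contracts: int


def in_play_gate(state):                    # context gate: look today at all?
    return state.in_play


def or_breakout_trigger(state):             # trigger: signal on this bar close?
    if state.close > state.or_high:
        return "long"
    if state.close < state.or_low:
        return "short"
    return None


def one_trade_per_day(state, side):         # filter: allow or veto
    return state.trades_today == 0


def fixed_risk_bracket(state, side, rr=2.0, contracts=1):   # execution policy
    stop = state.or_low if side == "long" else state.or_high
    risk = abs(state.close - stop)
    target = state.close + rr * risk if side == "long" else state.close - rr * risk
    return Bracket(side, state.close, stop, target, contracts)


def evaluate(state, gates, trigger, filters, execution):
    """Gates, then trigger, then filters, then execution; None means no order intent."""
    if not all(gate(state) for gate in gates):
        return None
    side = trigger(state)
    if side is None or not all(f(state, side) for f in filters):
        return None
    return execution(state, side)


bar = MarketState(close=5012.0, or_high=5010.0, or_low=5000.0, in_play=True, trades_today=0)
print(evaluate(bar, [in_play_gate], or_breakout_trigger, [one_trade_per_day], fixed_risk_bracket))
```

Because every piece is a deterministic function of the snapshot, each can be unit-tested alone, and the same composition can run in backtest and live.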

THE PIPELINE — how a discovered model enters and what happens next. /new-strategy is the only door in. It scaffolds a schema-valid spec into specs/proposed/, satisfying G0 (a stated mechanism + a pre-registered budget exist before any run). From there the model runs the gate ladder. Each gate is a cheaper-first filter; the goal at every step is to kill the candidate as early and as cheaply as possible.
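The G0 entry condition and the code-enforced family budget can be sketched together. This is a toy in-memory version under stated assumptions: the class and method names are hypothetical, the real ledger is persistent, and the only spec fields checked are the two the source calls non-negotiable (mechanism and trial_budget {family, max_cells}).

```python
class BudgetExhausted(Exception):
    """Raised when a family tries to queue a cell beyond its pre-registered budget."""


class TrialLedger:
    def __init__(self):
        self._budgets = {}   # family -> max_cells, fixed at registration
        self._trials = []    # one (family, cell_id) row per run, re-rolls included

    def register_spec(self, spec):
        """G0: refuse any spec without a mechanism and a pre-registered budget."""
        if not str(spec.get("mechanism", "")).strip():
            raise ValueError("no mechanism, no run")
        budget = spec.get("trial_budget") or {}
        family, max_cells = budget.get("family"), budget.get("max_cells")
        if not family or not isinstance(max_cells, int) or max_cells < 1:
            raise ValueError("trial_budget {family, max_cells} must be pre-registered")
        # Siblings share the family budget; the first registration fixes it.
        self._budgets.setdefault(family, max_cells)
        return family

    def used(self, family):
        return sum(1 for fam, _ in self._trials if fam == family)

    def queue_cell(self, family, cell_id):
        """Count the trial, or block it once the family budget is spent."""
        if family not in self._budgets:
            raise KeyError(f"family {family!r} has no pre-registered budget")
        if self.used(family) >= self._budgets[family]:
            raise BudgetExhausted(f"{family}: {self._budgets[family]} cells already used")
        self._trials.append((family, cell_id))


ledger = TrialLedger()
family = ledger.register_spec({
    "mechanism": "On in-play days the opening range resolves directionally.",
    "trial_budget": {"family": "orb-inplay", "max_cells": 2},
})
ledger.queue_cell(family, "ES-M5-or30")
ledger.queue_cell(family, "ES-M5-or30")   # a re-roll of the same cell still counts
print(ledger.used(family))                # 2
```

A third `queue_cell` call for this family raises `BudgetExhausted`, regardless of how the cell is named.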

G0 — Pre-registered?

Spec YAML with a one-sentence mechanism + a trial budget exists BEFORE any run. No mechanism, no run (D-031). This is the entry gate /new-strategy fulfills.

G1 — Any pulse at all?

FastScreen vs a random-entry null (same trade count, random in-session timing, identical exits, 100 reps). Must beat the null's p50 and clear at least +2 ticks net per trade. Cheap, runs across thousands of cells per night.

G2 — Survives realism?

The RigorousEngine: M1 intrabar fill resolution, pessimistic fills, calibrated costs. Execution-realism findings are logged to learnings.md (they are gold).

G3 — Survives deflation?

The validation service: DSR > 0.95 given the FAMILY trial count, PBO < 0.20. This is where "more cells = higher bar" bites. Fail → the family hurdle rises and the idea retires unless a NEW hypothesis emerges.

G4 — Three engines agree?

kernel vs generated EasyLanguage (TradeStation) vs LEAN, within the §6 parity tolerances. An unexplained delta is STOP-EVERYTHING until root-caused — parity by construction (D-001/D-021).

G5 — Walk-forward + holdout?

TradeStation WFO as independent confirmation; the frozen holdout months are touched ONCE here (D-032). Fail → retire the cell.

G6 — Live-ready?

Promotion review (operator + weekly Claude report) including the portfolio-correlation check — promote for UNCORRELATED daily P&L across survivors, never leaderboard rank. This is where Edge Class diversity pays off. Then AXIOM shadow assignment.
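Of the gates above, G1 is the easiest to illustrate. The sketch below assumes a precomputed list, here called outcomes_by_bar (a hypothetical name), holding the net ticks a trade would have made if entered at each in-session bar with the strategy's identical exits. It then draws the same number of random entries 100 times and compares the strategy's average against the null's median. How FastScreen actually builds the null is not shown in the source.

```python
import random
from statistics import median


def random_entry_null_p50(outcomes_by_bar, n_trades, reps=100, seed=0):
    """Median (p50) of mean net ticks per trade across `reps` random-entry runs."""
    rng = random.Random(seed)   # seeded so the screen is reproducible
    means = []
    for _ in range(reps):
        picks = rng.sample(range(len(outcomes_by_bar)), n_trades)
        means.append(sum(outcomes_by_bar[i] for i in picks) / n_trades)
    return median(means)


def passes_g1(strategy_trades, outcomes_by_bar, min_net_ticks=2.0, reps=100, seed=0):
    """Pass only if the strategy beats the null's p50 AND clears the tick floor."""
    strategy_mean = sum(strategy_trades) / len(strategy_trades)
    null_p50 = random_entry_null_p50(outcomes_by_bar, len(strategy_trades), reps, seed)
    return strategy_mean > null_p50 and strategy_mean >= min_net_ticks


# Toy data: random entries win or lose one tick, so the null hovers near zero.
toy_outcomes = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
print(passes_g1([3.0, 4.0, 2.0, 5.0], toy_outcomes))   # True
print(passes_g1([1.0, 1.0, 1.0, 1.0], toy_outcomes))   # False: under +2 ticks net
```

The second candidate is rejected by the tick floor even if it beats the null, because it averages less than +2 ticks net per trade.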

A WORKED EXAMPLE — your own grammar, end to end. Take REV-BUY-30 from the AxiomMTF backlog ("Low Is In — 30m Buy Reversal," ~100% lows-in in your notes). As a discovery candidate it is just an IDEA with no special status. To make it real: (1) /new-strategy scaffolds a spec; you write the one-sentence mechanism (why a 30m buy reversal with bigger-TF correlation should mark a durable low). (2) Discretionary feel — "humongous," "brick wall," "deeply oversold" — must be turned into deterministic, testable components (an IND4 regime gate, an oscillator threshold) before it can trade; parity demands it. (3) You resolve any carry-forward flag that touches the rule (e.g. the regime-classifier BBR/RRB inversion). (4) You pre-register a family budget, say max_cells: 6 across ES/GC × M5/M15. (5) It runs G0→G6 like everything else, every cell counted, the DSR hurdle rising as the budget burns.

The mental model to keep

Discovery is a funnel, not a search. The catalog and your 42 rules are a WIDE top — hundreds of ideas, organized by Edge Class so you can think and diversify. /new-strategy is the narrow neck — one idea, one mechanism, one pre-registered budget. The cell is the grain of sand that flows through and gets counted. The gates are the filter that keeps almost all of it out. Success is most candidates DYING cheaply; the rare survivor that clears a rising DSR bar across three engines and a frozen holdout is the gold the touchstone exists to find.

What to stay aware of
  • "Model" = trading model (a strategy), NEVER an AI model. The hub enforces this naming as non-negotiable; an LLM is always called "AI Model" explicitly.
  • The cell is the unit of test and of cost: model × instrument × session × timeframe × params. Changing ANY one coordinate creates a new cell and a new counted trial.
  • More cells = a HIGHER bar. The DSR hurdle at G3 rises with the family trial count, so a wider search makes survival harder, not easier. Budget is pre-registered before any run.
  • Nothing in the catalog or the 42-rule backlog is a validated edge. Every candidate, including your own grammar, runs G0→G6 with no special status (D-087/D-088a).
  • Combo-search ("try everything, keep what sticks") is explicitly banned — it breaks the mechanism requirement and the counting rule.
  • Discretionary feel ("humongous," "brick wall," "deeply oversold") must be made into deterministic, testable kernel components before a model can trade — parity by construction.
  • The master catalog is regenerated from raw JSON, never hand-edited; preserve raw sources and add new ideas via the raw store + generator (the model-research hub hard rule).
  • Edge Class is the diversification layer (G6 promotes UNCORRELATED survivors), not a leaderboard; the Family is the budget unit.
  • /new-strategy is the ONLY door into the pipeline — it scaffolds the G0 spec (mechanism + pre-registered budget) that SpecLoader validates against spec.schema.json.

Locked decisions & the why

D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run; pure pattern-mining is banned.
Why: Discovery is mechanism-first. A model without an economic reason it should work is just curve-fitting waiting to happen; the gate refuses it at the door (docs/02 §3).
D-030
Gate ladder G0→G6 with fail-routing; every run is a counted trial; the budget gate is code, not Claude.
Why: Constitution Principle 2 — every trial is counted. Making the budget gate code (not a judgment call) is what prevents quietly re-rolling past the pre-registered family budget (docs/02 §3).
D-087
Feature-cache approved with guardrails; combo-search REJECTED. The 42-rule AxiomMTF grammar is a seed backlog of model ideas only — each enters via G0→G6 with no special status.
Why: "Compose combos and see what works" violates G0/D-031 (a stated mechanism is required) and Principle 2 (every trial counted; DSR hurdle rises with trials). Operator ruling S028 (2026-06-17).
D-088a
The 42-rule AxiomMTF model-idea backlog is captured at docs/research/AXIOMMTF_MODEL_IDEAS_BACKLOG.md; the source .xlsx is reference material moved out of the repo root.
Why: Preserves the operator's source-verified discretionary grammar as auditable seed ideas, while keeping the system's interface (the .md + specs) distinct from the human reference. Operator ruling S028 (2026-06-17).
D-001
One shared C# kernel; SENTINEL backtests it, AXIOM trades it. The code that trades is the code that backtests.
Why: A discovered model is only testable because it is expressed in the kernel's component grammar — the same pure components run in backtest and live, which is what makes parity (G4) and live/backtest identity possible (docs/01, Principle 3).
D-016
Seed instrument set ES, CL, GC (PT sessions); expansion (NQ, currencies, VX) is config — instrument rows + SessionTemplates + spec YAMLs, zero kernel changes.
Why: The instrument and session are part of the cell coordinate, not the kernel. New instruments become new cells, not new code — so discovery scales by configuration (docs/01 §1).
D-020
Two-speed engine: FastScreen ranks thousands of cells (feeds only ≤ G1); RigorousEngine (M1 intrabar, pessimistic fills, calibrated costs) for G2+.
Why: Discovery must screen many cells cheaply, then prove the few survivors expensively. Cheap-first filtering is how bad strategies die cheaply (docs/01 §4).
D-032
Holdout months are frozen and opened once, at G5.
Why: A discovered model must clear an out-of-sample window it has never seen. Touching the holdout only once preserves it as honest final confirmation (docs/02 §3).
Sources: docs/research/model-research/INDEX.md — master index; Edge Class → Family → Model → Cell hierarchy; the LIVING master catalog (123 models, futures-day-trade classified); how model research maps to the build (G0→G6); "nothing here is a validated edge" · docs/research/model-research/README.md — model-research hub process + hard rule; the "model = trading model, AI model is separate" terminology rule; the 12-Edge-Class taxonomy; the 7-step process (land raw → categorize → fit to template → verify → route to build) · docs/research/AXIOMMTF_MODEL_IDEAS_BACKLOG.md — the 42-rule seed backlog (Divergence/Reversal/Continuation/Disconnect/Special/Filters/Cross-instrument); §4 governance (each idea → /new-strategy → G0–G6); no special status; no combo-search; carry-forward flags · docs/01_KERNEL_SPEC.md §1 — experiment_cell coordinate (strategies differ per ticker/session); instrument/session expansion is config (D-016); MTF + timeframe floor · docs/01_KERNEL_SPEC.md §2 — the four-layer component grammar (IContextGate / ITrigger / IFilter / IExecutionPolicy); MarketState; first component library · docs/01_KERNEL_SPEC.md §3 — the YAML spec format (id, mechanism, instrument, timeframe, session, context, trigger, filters, execution, trial_budget); SpecLoader + spec hash · docs/01_KERNEL_SPEC.md §5 — canonical trade record stamps every result with its cell (symbol, timeframe, session, params_hash) · specs/spec.schema.json — the two non-negotiable gate-entry invariants: a stated mechanism (G0) and a pre-registered trial_budget {family, max_cells} (P2) · docs/02_SENTINEL_SPEC.md §3 — the G0→G6 gate pipeline (random-entry null at G1; DSR>0.95 / PBO<0.20 at G3; three-engine parity at G4; holdout at G5; portfolio-correlation at G6) · tracking/DECISIONS.md — D-001, D-016, D-020, D-030, D-031, D-032, D-087, D-088a
Module 14 of 19 · Research & Discovery

🚪 The Gates (G0→G6)

A seven-stage ladder (G0→G6) that kills bad strategies as cheaply as possible: cheap filters in front of expensive ones, with brutal fail-routing at every rung.

The one idea on this page

The gates are a funnel, ordered by cost. A candidate strategy must pass G0, then G1, then G2... in strict order, and each gate is harder and more expensive to run than the one before it. The whole point is to spend almost nothing rejecting the thousands of ideas that are junk, and to spend real compute (and a frozen holdout you can only touch once) only on the rare few that have already survived everything cheaper. You are not trying to find winners. You are trying to kill losers for the least money possible.

Think of it like hiring for a critical role. You do not fly every applicant in for a full-day on-site (expensive) before reading their resume (cheap). You screen the resume, then a phone call, then a take-home, then the on-site, then references: each stage costs more, and each one cuts the pool so the expensive stages run on a handful of people. The Paras gate pipeline is exactly that, applied to trading strategies. G0 is the cheapest filter (does a written hypothesis even exist?) and G6 is the most consequential and the most expensive to get wrong (is this thing actually live-ready, and does it add anything to the portfolio?). Cheap filters in front of expensive ones: that is the architecture.

WHY this ordering matters so much: realism and statistics are expensive, and overfitting is the enemy. Running a full M1-intrabar backtest with pessimistic fills and calibrated costs (G2) is far more compute than a fast array screen (G1). Running deflation statistics (G3) and three-engine parity (G4) is more ceremonial still. If you ran the expensive stuff on every idea you would (a) waste enormous compute and (b) — worse — multiply your trial count, which raises the statistical bar you must clear and makes it more likely you fool yourself. So the ladder is not just an efficiency trick; putting cheap filters first is how the system stays honest. Every run is a counted trial (Principle 2), so you want as few expensive trials as possible.

| Gate | Question it asks | Pass bar (mechanism) | If it fails |
|------|------------------|----------------------|-------------|
| G0 | Is the hypothesis pre-registered? | A spec YAML exists with a stated one-sentence mechanism + a trial budget, before any run | No run at all |
| G1 | Is there any pulse at all? | FastScreen beats a random-entry null's p50 AND clears ≥ +2 ticks net per trade | Record the result, done |
| G2 | Does it survive realism? | RigorousEngine: M1 intrabar, pessimistic fills, calibrated costs — still positive | Log the mechanism in learnings.md (realism findings are gold) |
| G3 | Does it survive deflation? | DSR > 0.95 given the family's trial count; PBO < 0.20 (CSCV) | Family hurdle rises; idea retires unless a NEW hypothesis emerges |
| G4 | Do three engines agree? | LEAN oracle + generated EasyLanguage + kernel match within doc-01 §6 tolerances | STOP EVERYTHING until root-caused |
| G5 | Survives walk-forward + holdout? | TradeStation WFO confirms; frozen holdout months touched exactly once and hold up | Retire the cell |
| G6 | Is it live-ready? | Promotion review (operator + Claude weekly) incl. portfolio-correlation check; AXIOM shadow assignment | Stays in the G5 pool |
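
The strict ordering and the per-gate fail-routing in the table can be sketched as a small lookup plus a loop. This is only an illustration of the rule "no gate may be skipped, and the first failure decides the routing"; the names `Gate`, `FAIL_ROUTING`, and `run_ladder` are hypothetical and are not taken from the SENTINEL code.

```python
from enum import IntEnum
from typing import Dict, Optional, Tuple


class Gate(IntEnum):
    G0 = 0
    G1 = 1
    G2 = 2
    G3 = 3
    G4 = 4
    G5 = 5
    G6 = 6


# Fail-routing text copied from the gate table; the dict itself is illustrative.
FAIL_ROUTING: Dict[Gate, str] = {
    Gate.G0: "no run at all",
    Gate.G1: "record the result, done",
    Gate.G2: "log the mechanism in learnings.md",
    Gate.G3: "family hurdle rises; idea retires unless a new hypothesis emerges",
    Gate.G4: "STOP EVERYTHING until root-caused",
    Gate.G5: "retire the cell",
    Gate.G6: "stays in the G5 pool",
}


def run_ladder(results: Dict[Gate, bool]) -> Tuple[Optional[Gate], str]:
    """Walk the gates in strict order and return (first failed gate, routing).

    A missing result for a gate that should have run is an error, because a
    gate cannot be skipped. Results after the first failure are never read.
    """
    for gate in Gate:  # IntEnum iterates in definition order: G0 .. G6
        if gate not in results:
            raise ValueError(f"{gate.name} has no result; gates cannot be skipped")
        if not results[gate]:
            return gate, FAIL_ROUTING[gate]
    return None, "passed G6: AXIOM shadow assignment"
```

For example, `run_ladder({Gate.G0: True, Gate.G1: True, Gate.G2: False})` stops at G2 and returns the "log the mechanism" routing, while a dict that omits G1 raises instead of silently jumping ahead.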

G0 — IS THE HYPOTHESIS PRE-REGISTERED? This is the cheapest gate and it costs zero compute, because nothing runs. The question is purely: before you backtest anything, is there a written spec YAML that states a one-sentence MECHANISM (a real reason this should work — 'on in-play days the opening range resolves directionally often enough to clear friction') plus a pre-registered trial budget for the family? If there is no mechanism, there is no run. Full stop. This is the gate that bans pure pattern-mining: you are not allowed to just data-dredge a thousand parameter combos and keep whatever looks good, because every one of those would be an untracked trial with no theory behind it. G0 is enforced at spec-load time — the mechanism field is required, unknown keys are errors. Fail routing for G0 is the bluntest in the system: no run.
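
A minimal sketch of what a G0 check could look like, assuming the spec YAML has already been parsed into a dict. The key names come from the spec format listed in the sources (id, mechanism, instrument, timeframe, session, context, trigger, filters, execution, trial_budget with family and max_cells); the function name `g0_errors` and the error wording are hypothetical, and this is not the real SpecLoader or spec.schema.json.

```python
from typing import Any, Dict, List

# Top-level keys named in the document's description of the spec format.
ALLOWED_KEYS = {
    "id", "mechanism", "instrument", "timeframe", "session",
    "context", "trigger", "filters", "execution", "trial_budget",
}


def g0_errors(spec: Dict[str, Any]) -> List[str]:
    """Return the reasons a spec may not run; an empty list means G0 passes."""
    errors: List[str] = []

    # Unknown keys are errors, not warnings.
    for key in sorted(set(spec) - ALLOWED_KEYS):
        errors.append(f"unknown key: {key}")

    # Invariant 1: a stated mechanism (no mechanism, no run).
    mechanism = spec.get("mechanism")
    if not isinstance(mechanism, str) or not mechanism.strip():
        errors.append("mechanism is required (no mechanism, no run)")

    # Invariant 2: a pre-registered trial budget {family, max_cells}.
    budget = spec.get("trial_budget")
    if not isinstance(budget, dict):
        errors.append("trial_budget is required")
    else:
        if not budget.get("family"):
            errors.append("trial_budget.family is required")
        max_cells = budget.get("max_cells")
        is_int = isinstance(max_cells, int) and not isinstance(max_cells, bool)
        if not is_int or max_cells < 1:
            errors.append("trial_budget.max_cells must be a positive integer")

    return errors
```

The design point is that the check returns reasons rather than a boolean, so a refused spec says exactly which invariant it broke before any trial is counted.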

G1 — ANY PULSE AT ALL? Now something actually runs, but cheaply: the FastScreen engine (array-based, single pass) ranks the cell against a RANDOM-ENTRY NULL. Here is the clever part. The null takes your strategy's exact same trade count and identical exits, but throws the entries in at random times within the session, and does this 100 times to build a distribution. If your 'edge' cannot beat what a monkey throwing darts at the same session would have done, you have no edge — you just have a side and an exit that happened to work. So the pass bar is two-pronged: you must beat the null's median (p50) AND independently clear at least +2 ticks net per trade. That +2-tick floor exists because beating a coin-flip by a hair is meaningless once friction is real. Fail routing is gentle here: record the result and you are done — most ideas die at G1, and that is the system working, not failing.
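
The random-entry null can be sketched in a few lines. This version makes simplifying assumptions that are not from the source: prices are one number per bar, already expressed in ticks; "identical exits" is modeled as a fixed holding period; and the function names, the `seed` parameter, and the flat per-trade cost are all hypothetical. It is a toy to show the two-pronged pass bar, not FastScreen.

```python
import random
import statistics
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[int, int]  # (bar index, side) where side is +1 long or -1 short


def mean_net_ticks(prices: Sequence[float], entries: Sequence[Entry],
                   hold: int, cost_ticks: float) -> float:
    """Average net ticks per trade, exiting every trade `hold` bars after entry."""
    pnl = [side * (prices[i + hold] - prices[i]) - cost_ticks for i, side in entries]
    return statistics.fmean(pnl)


def g1_pulse(prices: Sequence[float], entries: Sequence[Entry], hold: int,
             cost_ticks: float, reps: int = 100, floor_ticks: float = 2.0,
             seed: int = 0) -> Dict[str, object]:
    """Compare the strategy with a random-entry null of the same size and exits."""
    rng = random.Random(seed)  # seeded so the null is reproducible
    actual = mean_net_ticks(prices, entries, hold, cost_ticks)

    sides = [side for _, side in entries]   # keep the same trade count and sides
    last_valid = len(prices) - hold - 1     # latest bar that still has an exit bar
    null_means: List[float] = []
    for _ in range(reps):
        random_entries = [(rng.randint(0, last_valid), side) for side in sides]
        null_means.append(mean_net_ticks(prices, random_entries, hold, cost_ticks))

    null_p50 = statistics.median(null_means)
    # Two prongs: beat the null's median AND clear the net-tick floor.
    passes = actual > null_p50 and actual >= floor_ticks
    return {"actual": actual, "null_p50": null_p50, "passes": passes}
```

Note that the floor is checked independently of the null: a strategy can beat random entries and still fail because its net edge is smaller than the +2 tick floor.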

Why G1 uses FastScreen and a null, not a 'real' backtest

G1 is the breadth gate — it must be cheap enough to run on thousands of cells a night. So it uses the FastScreen engine (approximate fills, summary stats only, flagged approximate=true) and the cheap random-entry null. Crucially, FastScreen numbers are NEVER allowed past G1 (D-020). Anything that survives G1 gets promoted to the expensive, realistic RigorousEngine for G2 and up. You screen with the cheap engine; you decide with the expensive one.
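
One way to picture the "FastScreen numbers never pass G1" rule is a guard that refuses approximate results at any later gate. The `approximate` flag is the one the text mentions; the `RunResult` type, its other fields, and `require_rigorous` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    cell_id: str
    net_ticks_per_trade: float
    approximate: bool  # True for FastScreen output, False for RigorousEngine


def require_rigorous(result: RunResult, gate_number: int) -> RunResult:
    """Return the result unchanged, or raise if an approximate result reaches G2+."""
    if gate_number >= 2 and result.approximate:
        raise ValueError(
            f"{result.cell_id}: approximate FastScreen result cannot feed G{gate_number}"
        )
    return result
```

Placing the check at the point where a gate consumes a result, rather than where the engine produces it, means a mislabeled or stale result is still caught before it influences a decision.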

G2 — DOES IT SURVIVE REALISM? This is the first expensive gate, and it is where pretty backtests go to die. The survivor is re-run on the RigorousEngine: M1 intrabar resolution (so the engine knows whether your stop or your target was hit first inside the bar, not just that both were touched), pessimistic fills (market orders fill at the next bar's open with slippage applied against you; stops fill at the trigger price worsened by slippage, never better; limits fill only on a trade-through by ≥ 1 tick, because a touch is not a fill), and calibrated costs. Many 'edges' are really just artifacts of optimistic fill assumptions; G2 strips those away. Note the fail-routing here is unusually generous and deliberate: when an idea dies at G2, you LOG THE MECHANISM in learnings.md, because execution-realism findings are gold — knowing exactly which kind of edge evaporates under honest fills is reusable knowledge that improves every future hypothesis.
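
The three pessimistic fill rules can be written down as small pure functions. This is a sketch of the rules as stated above, assuming a `side` of +1 for buy orders and -1 for sell orders; the function names and signatures are hypothetical and do not come from the kernel.

```python
def market_fill(next_bar_open: float, side: int, slippage: float) -> float:
    """Market order: next bar's open, moved against you by slippage.

    A buy (side=+1) pays more; a sell (side=-1) receives less.
    """
    return next_bar_open + side * slippage


def stop_fill(trigger: float, side: int, slippage: float) -> float:
    """Stop order: the trigger price worsened by slippage, never better.

    A sell stop (side=-1) fills below its trigger; a buy stop fills above it.
    """
    return trigger + side * slippage


def limit_filled(limit_price: float, side: int, bar_low: float,
                 bar_high: float, tick: float) -> bool:
    """Limit order: fills only if price trades THROUGH the limit by >= 1 tick.

    A bar that merely touches the limit price does not count as a fill.
    """
    if side > 0:  # buy limit rests below the market
        return bar_low <= limit_price - tick
    return bar_high >= limit_price + tick  # sell limit rests above the market
```

With an ES-style tick of 0.25 as an example value, a buy limit at 5000.00 is not filled by a bar whose low is exactly 5000.00, but it is filled once the low reaches 4999.75.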

G3 — DOES IT SURVIVE DEFLATION? This is the statistics gate, and it is the one that protects you from yourself. The problem it solves: if you test enough strategies, some will look great by pure luck. The more trials a family has burned, the more suspicious a 'good' result should be. G3 hands the trade returns to the Python validation service (the only Python in the system, an arms-length FastAPI sidecar using reference implementations of Lopez de Prado / Bailey statistics — you do not re-implement these by hand). Two tests must both pass: the Deflated Sharpe Ratio (DSR > 0.95) given the family's trial count, and the Probability of Backtest Overfitting (PBO < 0.20) via CSCV (combinatorially symmetric cross-validation). DSR penalizes you for how many shots you took; PBO estimates how likely your in-sample winner is to underperform out-of-sample. Fail routing has teeth: the family's hurdle RISES (future ideas in that family must clear a higher bar), and the idea retires unless a genuinely NEW hypothesis emerges — you do not get to re-roll the same idea to a luckier seed.
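
For intuition only, here is a sketch of the Deflated Sharpe Ratio in the form published by Bailey and López de Prado, using nothing but the Python standard library. It is not the validation service's code: the gate itself relies on reference implementations precisely so that nobody hand-rolls this. The function and parameter names are assumptions; `sharpe` is the per-period (non-annualized) Sharpe ratio, `n_obs` the number of return observations, `kurtosis` the raw (non-excess) kurtosis, and `sharpe_variance` the variance of Sharpe estimates across the family's trials.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()        # mean 0, standard deviation 1


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Sharpe ratio the best of n_trials zero-edge strategies is expected to show."""
    if n_trials < 2:
        return 0.0  # a single trial has no selection effect to deflate
    a = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(sharpe: float, n_obs: int, skew: float, kurtosis: float,
                    n_trials: int, sharpe_variance: float) -> float:
    """Probability that the observed Sharpe exceeds the selection-adjusted hurdle."""
    hurdle = expected_max_sharpe(n_trials, sharpe_variance)
    # Standard error term of the Sharpe estimate, adjusted for non-normal returns.
    spread = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe ** 2)
    z = (sharpe - hurdle) * math.sqrt(n_obs - 1) / spread
    return _STD_NORMAL.cdf(z)
```

Holding the observed Sharpe fixed and raising `n_trials` raises the hurdle and lowers the DSR, which is the mechanism behind "the more trials a family has burned, the more suspicious a good result should be".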

G3 is where the 'every trial counts' law bites

DSR is explicitly a function of the family's trial count. This is Principle 2 (every trial is counted; re-rolls face a higher hurdle) made into math. The pass bar itself stays fixed at DSR > 0.95; what rises with every roll of the dice in a family is the Sharpe a result must show to reach that bar, so spamming variations is self-defeating by design. The budget gate (code, not Claude) enforces this upstream: a proposed idea only moves proposed → queued if the family's pre-registered trial_budget has headroom. Claude proposes; the budget disposes.
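
The budget gate's "proposed to queued only with headroom" rule could look roughly like this. The `family` and `max_cells` fields mirror the trial_budget keys named in the sources; `FamilyBudget`, `trials_used`, and `try_queue` are hypothetical, and reserving the trial count at queue time (rather than at run time) is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class FamilyBudget:
    family: str
    max_cells: int        # pre-registered ceiling for the family
    trials_used: int = 0  # trials already counted against the ceiling

    def headroom(self) -> int:
        return self.max_cells - self.trials_used


def try_queue(budget: FamilyBudget, proposed_cells: int) -> str:
    """Return the proposal's new state: 'queued' if it fits, else 'proposed'.

    The decision is pure arithmetic on the pre-registered budget; nothing
    about how promising the idea looks enters into it.
    """
    if proposed_cells < 1:
        raise ValueError("a proposal must contain at least one cell")
    if proposed_cells > budget.headroom():
        return "proposed"  # no headroom: the proposal does not advance
    budget.trials_used += proposed_cells  # count the trials before they run
    return "queued"
```

A family at 38 of 40 cells can queue a two-cell proposal and then nothing more, however compelling the next idea is.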

G4 — DO THREE ENGINES AGREE? This is the parity court, and it is the only gate whose failure is a STOP-EVERYTHING event for the entire project. The same spec is run three independent ways: the kernel itself, generated EasyLanguage on TradeStation, and generated QCAlgorithm on LEAN (the LEAN oracle). All three emit the canonical trade list, and the ParityDiffer aligns them by entry time within ±1 bar and classifies every difference (missing / extra / px_delta / exit_reason_delta). The tolerances from doc-01 §6 are 'the law': indicator values within 0.1% relative per bar, entries ≥ 95% matched within ±1 bar, net P&L within 5% after documented semantic deltas, trade count within 3%. WHY so ceremonial: parity is the proof that 'the code that trades is the code that backtests' (Principle 3). An unexplained delta means one of your engines is wrong — which means you cannot trust ANY prior result until you find out why. So G4 is invoked only for survivors (it is rare and expensive), and its fail-routing is the harshest in the system: a parity bug contaminates everything until root-caused. You never loosen a tolerance to make G4 pass.
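
A toy version of the alignment-and-classification step, to make the four difference kinds concrete. It assumes trades are keyed by an integer bar index and matched greedily in order; the `Trade` fields and `diff_trades` are hypothetical and far simpler than a real ParityDiffer. Only two of the doc-01 §6 tolerances are checked here (entries ≥ 95% matched within ±1 bar, trade count within 3%); the indicator and net P&L tolerances are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Trade:
    entry_bar: int     # bar index of the entry
    entry_px: float
    exit_reason: str   # e.g. "target", "stop", "time"


def diff_trades(reference: Sequence[Trade], other: Sequence[Trade],
                px_tol: float = 0.0) -> Dict[str, object]:
    """Align `other` to `reference` by entry bar (within 1 bar) and classify deltas."""
    unmatched: List[Trade] = list(other)
    diffs: List[Tuple[str, Trade]] = []
    matched = 0

    for ref in reference:
        partner = next(
            (t for t in unmatched if abs(t.entry_bar - ref.entry_bar) <= 1), None
        )
        if partner is None:
            diffs.append(("missing", ref))  # reference trade with no counterpart
            continue
        unmatched.remove(partner)
        matched += 1
        if abs(partner.entry_px - ref.entry_px) > px_tol:
            diffs.append(("px_delta", ref))
        if partner.exit_reason != ref.exit_reason:
            diffs.append(("exit_reason_delta", ref))

    diffs.extend(("extra", t) for t in unmatched)  # trades only the other engine took

    n_ref = len(reference)
    match_rate = matched / n_ref if n_ref else 1.0
    count_delta = abs(len(other) - n_ref) / n_ref if n_ref else 0.0
    return {
        "diffs": diffs,
        "match_rate": match_rate,
        "count_delta": count_delta,
        "within_tolerance": match_rate >= 0.95 and count_delta <= 0.03,
    }
```

Every entry in `diffs` is something that needs a documented explanation; the sketch reports whether the counts are inside tolerance but deliberately offers no switch for widening the tolerances.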

G5 — SURVIVES WALK-FORWARD AND THE FROZEN HOLDOUT? By now the idea is realistic (G2), statistically deflated (G3), and provably consistent across engines (G4) — but every one of those used the same body of historical data. G5 asks whether it survives data it has genuinely never seen. Two things happen. First, TradeStation walk-forward optimization (WFO) runs as an INDEPENDENT confirmation — a different platform, a rolling re-fit, checking the edge is not a single-period fluke. Second, and this is the sacred part: the frozen HOLDOUT months are opened. These are months of data that have been sealed since the start and touched exactly once, at G5 (D-032). The discipline is absolute — if you peek at the holdout earlier, or open it twice, it is no longer a holdout, it is just more training data and your out-of-sample claim is a lie. Fail routing: retire the cell. A clean failure on never-seen data is the most honest 'no' the system can give.

The holdout is a one-shot resource

Frozen holdout months are opened ONCE, at G5 (D-032). There is no 'let me just re-check against the holdout with a tweaked parameter.' The moment you use it to make a decision and then iterate, you have leaked the future into your design and burned the only truly out-of-sample evidence you had. Treat it like a sealed envelope you may open exactly one time.
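
The sealed-envelope rule can be expressed as an object that hands out its data exactly once and only to G5. This is an in-memory sketch under stated assumptions: one seal per cell, month labels as plain strings, and hypothetical names (`HoldoutSeal`, `open`). A real system would need to persist the "opened" fact so that a restart cannot reset it.

```python
from typing import Iterable, Optional, Tuple


class HoldoutSeal:
    """Guards a cell's frozen holdout months so they can be opened only once."""

    def __init__(self, months: Iterable[str]) -> None:
        self._months: Tuple[str, ...] = tuple(months)
        self._opened_by: Optional[str] = None

    @property
    def is_open(self) -> bool:
        return self._opened_by is not None

    def open(self, cell_id: str, gate: str) -> Tuple[str, ...]:
        """Release the holdout months to G5, exactly once."""
        if gate != "G5":
            # Refused before any data is returned, so the seal stays intact.
            raise PermissionError(f"holdout may only be opened at G5, not {gate}")
        if self._opened_by is not None:
            raise RuntimeError(f"holdout already opened by {self._opened_by}")
        self._opened_by = cell_id
        return self._months
```

The second call raising, rather than returning the same months again, is the whole point: a re-check with a tweaked parameter has no code path.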

G6 — IS IT LIVE-READY, AND DOES IT EARN ITS SEAT? This is the final, most consequential, and most expensive-to-be-wrong gate — and notice it is the ONLY gate that is not purely mechanical. It is a promotion review: the operator plus Claude's weekly report. The decisive test is NOT 'is this the best-performing survivor on the leaderboard?' It is the portfolio-correlation check: does this strategy add UNCORRELATED daily P&L to the survivors already running? A second strategy that makes money on exactly the same days as your first one adds risk without adding diversification — it is not a new edge, it is leverage on the old one. Paras promotes for uncorrelated daily P&L across survivors, never for rank. Survivors that pass get an AXIOM shadow assignment (they trade on paper, in real time, against live data, before any real capital). Fail routing is patient: stay in the G5 pool — proven, but not yet promoted, waiting until the portfolio actually needs what it offers.
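
The portfolio-correlation check can be illustrated with a plain Pearson correlation over aligned daily P&L series. The source gives no numeric threshold, so the `max_corr=0.3` default below is purely a placeholder, and `pearson` and `adds_diversification` are hypothetical names. The real G6 decision is a human promotion review; this sketch only shows the kind of evidence that review looks at.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length daily P&L series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must be the same length, with at least 2 days")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def adds_diversification(candidate: Sequence[float],
                         survivors: Dict[str, Sequence[float]],
                         max_corr: float = 0.3) -> bool:
    """True if the candidate is not strongly correlated with any running survivor.

    max_corr is a placeholder threshold, not a value taken from the source.
    """
    return all(pearson(candidate, pnl) <= max_corr for pnl in survivors.values())
```

Rank never appears in the function: a candidate with the best P&L in the pool still returns False if it earns on the same days as a strategy already holding a seat.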

Worked example — the life of one idea. Say you observe that ES opening-range breakouts seem to work on high-gap days. (1) G0: you write a spec YAML with the mechanism 'on in-play days the opening range resolves directionally often enough to clear friction' and a trial budget — it loads, so it may run. (2) G1: FastScreen says the cell makes +3.1 ticks/trade and beats the random-entry null's p50 across 100 reps — passes. (3) G2: RigorousEngine with pessimistic stop-before-target fills and real costs drops it to +0.4 ticks/trade — it dies. You LOG the mechanism in learnings.md: 'ORB edge on ES is largely an optimistic-fill artifact; stop-before-target sequencing eats it.' That logged negative is a first-class result. Now imagine instead it had survived G2 at +1.8 ticks: it would face G3's DSR/PBO (is +1.8 real given 40 trials in this family, or luck?), then G4's three-engine parity, then G5's holdout months, and only then G6's question of whether ES-ORB P&L is uncorrelated enough with what you already trade to deserve a seat.

  1. G0 — Pre-register. Spec YAML with a one-sentence mechanism + trial budget must exist. No mechanism, no run. Cost: zero compute.
  2. G1 — Pulse check. FastScreen vs random-entry null (100 reps); beat null p50 AND clear ≥ +2 ticks net/trade. Cheap, runs on thousands of cells.
  3. G2 — Realism. RigorousEngine: M1 intrabar, pessimistic fills, calibrated costs. First expensive gate; failures are logged as gold.
  4. G3 — Deflation. Validation service: DSR > 0.95 (given family trial count) and PBO < 0.20 (CSCV). Statistics protect you from luck and from yourself.
  5. G4 — Parity. Kernel + generated EL + LEAN agree within doc-01 §6 tolerances. Any unexplained delta = STOP EVERYTHING.
  6. G5 — Out-of-sample. TradeStation WFO confirms; frozen holdout months opened ONCE. Fail = retire the cell.
  7. G6 — Promotion. Operator + Claude review; promote for UNCORRELATED daily P&L, never leaderboard rank; AXIOM shadow assignment. Fail = stay in G5 pool.

Cheapest gate
G0 (zero compute — just 'does a written, mechanism-bearing spec exist?')
Breadth gate
G1 (FastScreen + random-entry null; runs on thousands of cells)
First expensive gate
G2 (RigorousEngine — M1 intrabar, pessimistic fills, calibrated costs)
Honesty-math gate
G3 (DSR > 0.95, PBO < 0.20 via the Python validation service)
STOP-EVERYTHING gate
G4 (three-engine parity; an unexplained delta contaminates all prior results)
One-shot-resource gate
G5 (frozen holdout months — opened exactly once)
Human-in-the-loop gate
G6 (operator + Claude; uncorrelated-P&L portfolio check, not rank)
Where survivors live before promotion
The G5 pool (proven but not yet needed by the portfolio)
Most ideas dying is success, not failure

Paras is the touchstone: its job is to reveal which metals were gold all along — and most aren't, that is the point. A pipeline where ideas mostly die at G1/G2/G3 is working exactly as designed. Killing bad strategies cheaply (Principle 1) IS the product. The rare survivor that reaches G6 is valuable precisely because so many siblings were honestly rejected on the way up.

WHAT TO BE AWARE OF as the operator. (1) Order is sacred — a gate cannot be skipped, and a phase/gate cannot start before its predecessor's exit criteria are demonstrably met. (2) Fail-routing is not uniform: G1 just records, G2 logs a gold finding, G3 raises the family hurdle, G4 stops the whole project, G5 retires the cell, G6 parks it in the pool — know which 'no' you are getting. (3) Two gates are special-danger: G4 is the only STOP-EVERYTHING, and G5's holdout is a one-shot you can never un-spend. (4) G3's DSR is deflated by the family's trial count, so the Sharpe needed to clear the fixed 0.95 bar rises with every trial; the budget gate (code, not Claude) is what keeps families from spamming the dice. (5) G6 rewards uncorrelated P&L, never leaderboard rank — your best-looking survivor is not automatically promotable. (6) FastScreen's approximate numbers must never leak past G1 into a G2+ decision.

What to stay aware of
  • Gate order is sacred: a gate cannot be skipped, and you cannot start one before its predecessor's exit criteria are demonstrably met — the funnel only works in order.
  • Fail-routing differs by gate: G1 records, G2 logs a gold learning, G3 raises the family hurdle, G4 stops EVERYTHING, G5 retires the cell, G6 parks it in the pool — know which 'no' you got.
  • G4 is the only STOP-EVERYTHING gate: an unexplained three-engine parity delta contaminates ALL prior results until root-caused — never loosen a tolerance to make it pass.
  • The G5 holdout is a one-shot resource (D-032): opened exactly once. Peeking early or re-using it destroys its out-of-sample value permanently.
  • G3's pass bar stays at DSR > 0.95, but the Sharpe needed to reach it rises with the family's trial count, so spamming variations is self-defeating; the budget gate (code, not Claude) enforces this by only letting proposals queue if trial_budget has headroom.
  • G6 promotes for UNCORRELATED daily P&L, never leaderboard rank (D-043) — your best-looking survivor is not automatically the right one to promote.
  • FastScreen output is approximate and must never feed a G2+ decision (D-020); only RigorousEngine numbers are trustworthy past G1.
  • Most ideas dying early (G1/G2/G3) is the system working correctly — killing bad strategies cheaply is the product (Principle 1), not a malfunction.
  • No live money until the full ladder is cleared: A5's first Combine is hard-gated on a G5 survivor that also passed G6 (D-044).

Locked decisions & the why

D-030
The gate ladder is G0→G6 with explicit fail-routing; every run is a counted trial; the budget gate is code, not Claude.
Why: This is the spine of the whole research method — cheap filters in front of expensive ones, with a defined consequence at every rung — and it makes Principle 2 (every trial counted) survive automation: proposals only move proposed → queued if the family's trial budget has headroom. (docs/02 §3; Principle 2)
D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run (pure pattern mining banned).
Why: Enforced at spec-load time, this puts Principle 1 (honest discovery over impressive results) at the very front of the pipeline, before a single trial is counted — you cannot data-dredge your way in. (docs/02 §3)
D-020
Two-speed engine: FastScreen ranks thousands of cells and feeds only ≤ G1; RigorousEngine (M1 intrabar, pessimistic fills, calibrated costs) is used from G2 up.
Why: This is the 'cheap filters in front of expensive ones' principle in code — you cannot afford full-fidelity replay on thousands of cells nightly, and you must not let approximate FastScreen fills reach the serious gates. (docs/01 §4)
D-021
Three-engine parity court (kernel / generated EasyLanguage / LEAN); an unexplained delta is STOP-EVERYTHING.
Why: G4 proves 'the code that trades is the code that backtests' (Principle 3); the rule that a parity bug contaminates all prior results forces root-cause over papering-over and makes G4 the project's hardest stop. (docs/01 §6; G4)
D-032
Holdout months are frozen and opened exactly once, at G5.
Why: A holdout is only out-of-sample if it is touched once — peeking early or re-using it turns it into training data and makes the out-of-sample claim a lie; the one-shot rule is what gives G5 its evidential power. (docs/02 §3)
D-023
Python exists ONLY in the FastAPI validation service (DSR/PBO via reference implementations); arms-length sidecar.
Why: Re-implementing Lopez de Prado / Bailey deflation statistics by hand is unacceptable risk, so G3 calls a version-pinned, local-only Python service that uses reference implementations — kept at arms length as a sidecar process. (docs/02 §5)
D-043
AXIOM is multi-account from day one; new seats get uncorrelated strategies (the G6 check).
Why: G6 promotes for uncorrelated daily P&L across survivors, never leaderboard rank — a strategy that profits on the same days as an existing one adds risk without diversification and should stay in the G5 pool. (docs/03 §3.2)
D-044
A5 (the first Combine) has a HARD gate: a G5 survivor exists and has passed G6 before any real-money account is opened.
Why: Until a strategy has run the full ladder, the factory researches at ~$0/mo — and that is success, not delay. No live capital is risked on anything that has not survived G0→G6. (docs/05)
Sources: docs/02_SENTINEL_SPEC.md §3 (Gate pipeline G0-G6 — the question, mechanism, and fail-routing table) · docs/02_SENTINEL_SPEC.md §2 (Experiment model & ledger — experiments / trials / gate_results; experiment_cell coordinate) · docs/02_SENTINEL_SPEC.md §4 (Experiment runner — process isolation; two-speed feeding the gates) · docs/02_SENTINEL_SPEC.md §5 (Validation service — DSR / PBO / random-entry null endpoints; arms-length Python sidecar) · docs/02_SENTINEL_SPEC.md §6 (LEAN oracle — invoked only for G4 survivors; rare and ceremonial) · docs/02_SENTINEL_SPEC.md §7 (Claude research loop & budget gate — proposals queue only with trial_budget headroom) · docs/02_SENTINEL_SPEC.md changelog (G0 requires a mechanism; G6 adds the portfolio-correlation check) · docs/01_KERNEL_SPEC.md §4 (Two-speed engine — FastScreen <= G1, RigorousEngine G2+; fill & cost models) · docs/01_KERNEL_SPEC.md §6 (Parity framework — tolerances 'the law'; ParityDiffer; G4 stop-everything rule) · tracking/DECISIONS.md (D-020, D-021, D-023, D-030, D-031, D-032, D-043, D-044)
Module 15 of 19 · The Build & Decisions

🗺 Roadmap & Where We Are Now

Paras is built as a chain of gate-locked phases that never start out of order; today it is parked at M1 First Light, one 72h soak away from opening the kernel.

● LIVE — current state (pulled from tracking/ at portal build time)

STATE.md last updated: 2026-06-18 (PT) · by: Session 028 (Opus 4.8 + Ultracode) — OPERATOR DESIGN SESSION + Databento D-015 tooling. Docs/decisions (`951a4de`) = zero src/. Then operator chose DATA-FILL-BEFORE-SOAK → built the Databento second-source tooling (`0e6b60d`, additive src/; plant `--run` hot-path untouched + independently graded so). Validated/graded by 6 workflows (`wf3980d1ab` design review · `wfe3bd5462` validate+design · `wf7a49615e` doc grade · `wfff1dfa03` Databento grade).

7 new locked decisions D-084→D-090:
  • OD-1 all three apps are WPF (PULSE/SENTINEL/AXIOM; Next.js/React = design-reference only; supersedes the D-073 AXIOM-WinUI clause) — propagated across 12 files
  • OD-2 PULSE becomes the data-plant control face (start/stop + kick jobs; PULSE stays DB-read-only; APPROVED but post-M1/design-only)
  • D-086 single-writer fail-fast guard (post-M1)
  • OD-6 feature-cache APPROVED w/ guardrails (incremental, no look-ahead, byte-for-byte cached==uncached); combo-search REJECTED
  • OD-4 D-081 deep-history accept-and-gate STANDS — the Databento D-015 audit was NEVER done; no fill done; free `metadata.get_cost` is the cheap next step (D-088)
  • D-089 PULSE 4→2 tabs (post-M1)
  • D-090 status-contract v1.1 (post-M1)

New docs: `docs/DAILYOPERATINGMODEL.md` (one-pager), `docs/research/AXIOMMTF_MODEL_IDEAS_BACKLOG.md` (42-rule seed backlog; xlsx moved to docs/research/), `docs/research/DATABENTOD015FEASIBILITY2026-06-17.md`, `docs/research/POSTM1BUILDSPECS028.md` (the post-M1 build blueprint).

Honesty bugs HZ-1/HZ-5 verified ALREADY FIXED on this branch (producer+consumer, tests) — no code owed. No soak running (last soak stopped Jun 17 ~12:45; no process).

▶ Operator on return: re-publish the branch HEAD plant → 5 live probes → fresh 72h soak (the one remaining M1 operator gate); then post-soak grade → 🏮 ceremony → M1 ☑. All plant/PULSE code changes (D-085/086/089/090 + resilience) are queued POST-M1 per `POSTM1BUILDSPECS028.md`.

History → Session 027 (Opus 4.8 + Ultracode) — PULSE blank-FAULT-banner FIXED + committed (`1ec213e`, 3 tab XAMLs: dropped the fragile `RelativeSource AncestorType=UserControl` visibility binding → wrapper Border + `FallbackValue/TargetNullValue=Collapsed`); `0/3 symbols healthy` confirmed HONEST (all 3 symbols warn). `m1-liveness-selfheal` PUSHED to origin (5 commits `99c3db3`→`532deb4`; new remote branch, no force; M2 worktree branch untouched).
  • RECONCILED with Session 026: F1–F7 committed `5f0f4bf`; the running 72h soak is INVALID (pre-S025-fix binary, wedged stale-but-green — reproduces RC#5); PULSE 22-discrepancy review → `docs/research/PULSEREVIEWS026.md` (post-soak, HZ-1 first); M2·B2 DESIGN ring CLOSED/PASS (`4c63100`, confirm `wf33c0f9ea-5aa`) + 5 kernel indicators BUILT + verified PASS (`6c8c4df`/`0a5ac18`/`973fa4c`, 276 green, in `Paras-m2`). Operator next: re-publish plant from the S025 fix → re-run soak + the 5 live probes (M1); M2 components BUILD continues in parallel.
  • ✅ Branch CI GREEN on a clean clone (workflow_dispatch run `27715434846`, 1129 tests — Sentinel 655 / Kernel 227 / Pulse 246 / Axiom 1, 0W/0E — first-ever CI on the branch; reconciles the 1124/1128/1129 disagreement → truth is 1129). Invalid soak KILLED; 3 parallel hardening worktrees launched (`m1-fix-plant-honesty` / `m1-fix-pulse` / `m2-b2-kernel-design`) per [PARALLELHARDENINGPLANS027.md]; `specs/spec.schema.json` authored + validates all 3 specs.
  • ✅ CONVERGENCE COMPLETE: P1+P2 merged (`40639af`); pre-soak D-080 grade caught B1 (CI flake) + B5 (PULSE 60-vs-300 cry-wolf) → both FIXED (`57c6174`); 3 consecutive green clean-clone CI runs (1191 tests: Sentinel 681/Kernel 227/Pulse 282/Axiom 1); RE-GRADE `wfed4834b9-758` = PASS-WITH-NOTES, READY TO SOAK, zero must-fix. M2 components (339 green) parked unmerged (D-082). HEAD `c23b3b1`.
  • ▶ Operator owns Phase 3: re-publish `c23b3b1` → 5 live probes (market hours) → fresh 72h soak (tune `maxBarAgeSeconds` to probe-confirmed cadence). Then post-soak grade → 🏮 ceremony → M1 ☑ → open M2.

History → Session 025 (Opus 4.8) — DEAD LIVE-STREAM root-caused + fixed (F1–F7) + soak/PULSE-honesty hardened; independent grade FAILed (8 blockers) → all fixed → re-grade CONCERNS; 1128 tests green, 0W/0E.

History → Session 024 — 72h soak ABORTED to fix a live-data/self-heal gap (plant + PULSE STOPPED); M2·B2 design draft built (compiles) but FAILED independent re-review.

History → Session 023: the 4 "DO FIRST" M1 builder fixes (CI-flake de-gate + 3 PULSE bugs) built + independently verified (`wf5c0951ce-cd5`), 1,079 green, committed local (push pending).

History → M1.6 + M1.8 builder-complete + D-076 mandate. M1.6 (live stream + jobs + status loop) and M1.8 (BackupJob/`--restore-verify` + preflight + e2e + benchmarks + soak harness + PULSE fonts) built via workflows `wf89a026e8`/`wf833253d4`/`wf992b07d8`; 930 tests green, 0W/0E; committed+pushed `e625f60`→`9d40a39`. Independent agents: constitution-guardian PASS, design-reviewer PASS, test-sentinel CONCERNS (real coverage gap — see below). Fable OFFLINE (operator-authorized to proceed; full grade + 72h soak + live + Databento DEFERRED). Coverlet env trap hit + diagnosed (see memory `coverlet-testhost-trap`).

+ Session 017 (2026-06-14, parallel design track, Opus 4.8): SENTINEL dashboard Ring-1 DESIGN review COMPLETE — verdict CONCERNS; D-077 locked (4 operator ratifications + 7-item pre-M7 fix list). No M1/`src/` files touched. See the "Parallel design track" section below.

+ Session 018 (2026-06-15, Opus 4.8 verification): independent M1 exit-gate audit (`wf91429ec2-c7c`) → 🔴 FAIL — CI is RED on a clean clone (Sentinel 9 failed/527 passed on the runner; LOCAL 536/536). Coverage gate re-verified MET, but M1 is NOT builder-complete until CI is green. 2 builder blockers BOTH FIXED → CI GREEN on a clean clone (run `27556407894`, 966/966: Sentinel 536, Kernel 197, Pulse 232, Axiom 1): (a) CRLF golden `.gitattributes` (`6d87849`, Session 019); (b) host-loop test timeouts → serialized the test assembly (`8f2d8ee`, Session 018). The §14 CI-green blocker is CLEARED; M1 builder-side gate now MET pending operator runs + Fable's grade. See NEXT ACTION.

+ 2026-06-15 (later, Opus 4.8): the 72h soak FAILED at ~5h on a status-file contention crash (PULSE held `pulse-status.json` without `FileShare.Delete` → plant `File.Replace` IOException → `StopHost`). Root cause FIXED (PULSE share-delete + `AtomicJsonFile` retry + `PlantHost` never-crash status guard; 3 regression tests). PULSE `--demo` mode REMOVED (real-data-only). Re-soak parked → Wed 2026-06-17.

+ Session 021 (2026-06-15, Opus 4.8 builder + Ultracode `wf6943133e-744`): MARKET-SESSION AWARENESS shipped — the calm `CLOSED` master-state (kernel `SessionClock.NextSessionOpenUtc` pure + Sentinel `MarketCalendar`/`DisplayTimeZone` + `MasterStateDeriver.Closed` priority EMPTY>FAULT>CLOSED>BEHIND>SYNCING>SYNCED + session-aware gap-sync degrade + stale-row neutralization + PULSE slate `#7A8699` accent + new closed golden). D-079 locked (operator-ratified slate + neutralize-staleness). 5 subagents (build + adversarial + 3 auditors, all PASS); the adversarial pass caught+fixed a mid-session reopen-time honesty bug. 1,065 tests green, 0W/0E; CI-GREEN on a clean clone (run `27585914900`); committed+pushed `695479f`. Remaining lead-task piece = item 0b (market-hours PULSE→SYNCED confirm, operator/GUI).

Current milestone: M1 · First Light — ◐ IN PROGRESS (data plant ALIVE; M0 ☑ done)

🏮 FIRST LIGHT (2026-06-13): operator ran --auth (refresh token stored) → --backfill → 19,206,707 bars in ops/ledger.duckdb (1.4 GB), VERIFIED via --stats: @ES/@CL/@GC M1 from 2008 (~6.4M each) + D1 back to 1997/2001; 24 distinct hours-of-day ⇒ full 24h Globex sessions (D-013 proven). Live pull matched the doc-derived fixtures (§9 Stage C shape reconciliation passed). ⚠ ledger is single-copy — no backup until M1.8 BackupJob.

Current phase status: Done so far:
  • step 0 (D-075c culture rules → error; CI actions → Node-24 majors)
  • M1.2 (DuckDB §7.1 schema + MigrationRunner + idempotent BarRepository merge; cov 100%)
  • M1.1 (TradeStationClient + TokenProvider, fixture-first §9 Stage B — Polly 8.7 429-storm heal + bounded giveup, OAuth refresh under injected IClock; 11 tests liveconfirm=pending; cov 91.7%)

46 tests green, 0W/0E. Pushed through d2f5697 (CI green per push). Builder does not self-grade — M1 submitted whole to Fable 5 at the end (§14).

+M1.7 (Session 011): Sponaitech.Pulse WPF tray monitor (pixel-faithful port, read-only, never opens DuckDB) + Sentinel StatusPublisher (§7.3/§7.4) built via multi-

Refreshes every time the portal is regenerated (python docs/education/_build_portal.py).
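
The master-state priority quoted in the Session 021 entry above (EMPTY>FAULT>CLOSED>BEHIND>SYNCING>SYNCED) reads naturally as a first-match rule: of all the conditions that are currently true, the one furthest left wins. Here is a minimal Python sketch of that idea. The real `MasterStateDeriver` is C# and its inputs are not shown in this portal, so `derive_master_state`, its `conditions` argument, and the error raised on an empty set are illustrative assumptions.

```python
from enum import Enum


class MasterState(Enum):
    EMPTY = "EMPTY"
    FAULT = "FAULT"
    CLOSED = "CLOSED"
    BEHIND = "BEHIND"
    SYNCING = "SYNCING"
    SYNCED = "SYNCED"


# Highest priority first, in the order given in the Session 021 entry.
PRIORITY = [
    MasterState.EMPTY,
    MasterState.FAULT,
    MasterState.CLOSED,
    MasterState.BEHIND,
    MasterState.SYNCING,
    MasterState.SYNCED,
]


def derive_master_state(conditions):
    """Return the highest-priority state among those currently true.

    `conditions` is a set of MasterState members (hypothetical input shape).
    """
    for state in PRIORITY:
        if state in conditions:
            return state
    # Assumption: an empty input is a caller bug in this sketch.
    raise ValueError("no condition is true; cannot derive a master state")
```

Under this ordering a plant that is both BEHIND and inside a closed session reports CLOSED, while a FAULT still outranks CLOSED.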

Milestone board (snapshot from tracking/PROGRESS.md)

| Milestone | Name | Track | Status | Started | Ceremony held | Exit gate |
|---|---|---|---|---|---|---|
| M0 | Ignition | B | | 2026-06-12 | 2026-06-12 | CI green on hello-world kernel test; secret round-trip test passes |
| M1 | First Light | B | | 2026-06-13 | | ES/CL/GC backfilled; quality report clean; 72h soak; top-up 3 nights unattended |
| M2 | The Funeral | B | | | | Coverage + property suites green; FastScreen reproduces TS backtest within fast-pass tolerance |
| M3 | Two Witnesses | B | | | | Golden fixtures locked; Experiment #001 kernel vs TS within §01-6 tolerances or every delta root-caused |
| M4 | The Gauntlet | B | | | | One strategy flows G1→G3 end-to-end; budget gate blocks over-budget proposal |
| M5 | Quiet Nights | B | | | | One fully unattended cycle (evening batch → nightly review → ≤5 follow-ups) |
| M6 | The Third Witness | B | | | | LEAN reproduces Experiment #001 within tolerances; G4 report generated |
| M7 | The Glass | B | | | | All six dashboard screens on real ledger; uncertainty visible everywhere |
| A1 | The Fortress | A | | | | 10 adversarial tests green; 3 shadow sessions, zero violations (forks after M3) |
| A2 | The Wire | A | | | | Outage drills pass; one-micro round-trip verified on eval account |
| A3 | The Conscience | A | | | | Model-size benchmark passes; veto p95 <1.5s; schema 100%; override red-team clean |
| A4 | The Rehearsal | A | | | | 10 clean shadow sessions, zero violations; cost model calibrated from shadow fills |
| A5 | The Seat | A | | | | First Combine; one strategy, one pod (HARD GATE: a G5 survivor exists + passed G6) |
| ★ | The Touchstone | | | | | First strategy to survive G0→G5 out-of-sample (whenever it happens) |

Where we are right now (as of 2026-06-18)

Milestone M1 · First Light is ◐ IN PROGRESS. The data plant is ALIVE — 19.2M bars backfilled, live self-heal working — but M1 is NOT closed. The single remaining operator gate is a clean 72-hour soak. The most recent soak FAILED at ~3h40m (exit code 14, a freshness stall caused by the nightly top-up starving the live heal-loop). That is a recorded FAIL under the failure doctrine — root-cause and re-soak, do NOT proceed to the ceremony. Until M1 closes, M2 and everything after it stay hard-blocked.

This page is your map. Paras is not built all at once — it is built as a chain of numbered phases, each with a written exit gate, and a hard rule that you never start a phase until the one before it is demonstrably done. This lesson explains the whole road (both tracks), why it is ordered the way it is, and exactly which milestone you are standing on today.

Plain-English first: think of Paras as two factories built in sequence. Track B (SENTINEL + the kernel) is the research laboratory — it has to exist and be trustworthy before anything is allowed to touch real money. Track A (AXIOM) is the execution fortress that places live trades — it forks off the research track only after the research engine can prove its own arithmetic. You are at the very start of Track B: the data plant (the thing that feeds everything) is alive but not yet certified.

DataPlant (SENTINEL --run)
The always-on engine. The single DuckDB writer — owns ingestion, backfill, self-heal, nightly top-up, quality, reconciliation, tape. This is what the soak tests (D-091).
SENTINEL dashboard
The on-demand research laboratory: backtests, deflation gates, 3-engine parity, the nightly Claude loop, the glass that shows it all. Built last, at M7.
PULSE
The DataPlant's read-only tray watcher — the heartbeat light. Never opens DuckDB; reads a status file (D-073).
AXIOM
The execution fortress — live Topstep trading through a deterministic compliance engine. Track A, forks after B3 (kernel parity is proven).
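
The file-based status transport between the DataPlant and PULSE (D-073) depends on the writer replacing the status file atomically, so the reader never sees a half-written document. A rough Python sketch of that pattern follows. The function names are hypothetical and the real code is C# (`AtomicJsonFile`, `File.Replace`). On Windows a reader's open handle can still block the replace, which is the contention recorded in the status snapshot above; the fix there was the C# `FileShare.Delete` plus a retry.

```python
import json
import os
import tempfile
import time


def write_status_atomic(path, status):
    """Write JSON to a temp file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(status, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in a single rename step.
        os.replace(tmp, path)
    except Exception:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def read_status(path, retries=3, delay=0.05):
    """Read-only watcher side: retry briefly, then degrade to None."""
    for attempt in range(retries):
        try:
            with open(path, "r", encoding="utf-8") as handle:
                return json.load(handle)
        except (OSError, json.JSONDecodeError):
            if attempt < retries - 1:
                time.sleep(delay)
    return None  # assumption: the watcher shows "unknown" rather than crash
```

The design point is that neither side can take the other down: the writer never edits the live file in place, and the reader treats a missing or unreadable file as a status, not an exception.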

How the roadmap works (the mechanism): every phase in docs/05 has an Exit criteria column. The phase is not 'done' because the code compiles or the tests pass — it is done when its exit criteria are demonstrably met AND an independent auditor (not the author) grades the gate green (D-080). Only then may the next phase open. This is the single most load-bearing rule in the whole build: B0→B1→B2→B3 is strictly sequential.

| Phase | What it builds | Exit gate (must be proven) |
|---|---|---|
| B0 Repo & rails | Solution scaffold, CLAUDE.md, CI on push, config conventions, topstep-rules seeded | CI green on a hello-world kernel test; docs in repo |
| B1 Data plant + PULSE | TS OAuth client, backfill pager, consolidator-derived timeframes, DuckDB schemas, quality + holiday calendar, gap-sync, nightly top-up, tape recorder, PULSE tray monitor, ops hardening | ES/CL/GC backfilled; quality report clean; top-up runs 3 nights unattended (+ 72h soak) |
| B2 Kernel core | Domain, SessionClock, component library, SpecLoader + schema, FastScreen | Component unit tests; FastScreen reproduces the TradeStation backtest within fast-pass tolerance |
| B3 Rigorous engine + parity | Full order lifecycle, M1 intrabar, fill/cost models, canonical trades, ParityDiffer, Experiment #001 | §01-6 tolerances met or every delta root-caused; pessimistic default cost config committed |
| B4 Gates + validation | Gate orchestration, random-entry null harness, Python FastAPI DSR/PBO service, trial budgets in code | A G1 survivor flows G1→G3 end-to-end; budget gate blocks an over-budget proposal in a test |
| B5 Runner + Claude loop | Process-isolated runner, queue, artifacts, nightly session wiring, review/learnings outputs | One fully unattended cycle: evening batch → nightly review → ≤5 follow-ups → next batch, zero manual steps |
| B6 LEAN oracle | Pinned LEAN build, exporter, spec→QCAlgorithm generator, 3-way differ in G4 | LEAN reproduces Experiment #001 within tolerances; G4 produces a written reconciliation report |
| B7 Dashboard | Headless read/queue API + WPF screens (Results Matrix first) | All six screens live against the real ledger; uncertainty visible on every results view |

| Phase | What it builds | Exit gate (must be proven) |
|---|---|---|
| A1 Compliance engine + shadow | State machine, all monitors, hard locks, SimVenueAdapter, journal | Adversarial test suite green (the 10 tests); 3 shadow sessions on live data, zero violations |
| A2 Streaming + execution | TS streaming L1 (reconnect/gap-fill), TopstepX adapter (idempotent, reconcile-on-reconnect), watchdog flatten paths, runbooks | Simulated-outage drills pass (kill stream / kill venue → correct FLATTEN_AND_HALT); 1-micro round-trip on an eval account |
| A3 AI sidecar (tiered) | Ollama serving, deterministic narration, rules-first veto + small-model layer, on-demand 12B override dialogue, reflection templates | Model-size benchmark passes; veto p95 <1.5s; schema conformance 100%; override red-team yields no workarounds |
| A4 Dress rehearsal | 10 clean shadow sessions, full pipeline, daily reflections, weekly gap report | Zero compliance violations; zero unhandled failure states; gap metrics within tolerance |
| A5 First Combine | $49 Combine + $29 API live; smallest-size policy; operator runbook | Funded → the Phase-D operating loop (HARD GATE: a G5 survivor exists + passed G6) |

The critical path (memorize this line)

B0→B1→B2→B3 is sequential (~2.5–3 weeks of build sessions). B4/B5/B6 can interleave. The A-track forks off only AFTER B3 — once the kernel's parity is proven — and A1 may overlap B4+. Nothing in A5 (real money) starts before a strategy holds G5. If every early family dies at the null, the platform keeps researching at $0/mo and A5 simply waits. That waiting is the budget discipline working, not failing.
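
The critical path above is a small dependency graph, and "may this phase open?" is a lookup against the set of phases an independent auditor has already graded green. A hedged sketch, not the project's code: `PREREQS`, `may_open`, and the `g5_survivor_passed_g6` flag are hypothetical names, and the entries marked "assumed" are not stated in the text.

```python
# Prerequisites implied by the critical-path line. Entries marked "assumed"
# are not stated in the text; they only make the sketch complete.
PREREQS = {
    "B0": [],
    "B1": ["B0"],
    "B2": ["B1"],
    "B3": ["B2"],
    "B4": ["B3"],  # assumed: B4/B5/B6 interleave once B3 is proven
    "B5": ["B3"],  # assumed
    "B6": ["B3"],  # assumed
    "A1": ["B3"],  # the A-track forks only after B3
    "A2": ["A1"],  # assumed: the A-track runs in its listed order
    "A3": ["A2"],  # assumed
    "A4": ["A3"],  # assumed
    "A5": ["A4"],  # assumed, plus the hard gate below
}


def may_open(phase, graded_green, g5_survivor_passed_g6=False):
    """True only if every prerequisite was graded green by an auditor."""
    if not all(p in graded_green for p in PREREQS[phase]):
        return False
    if phase == "A5":
        # HARD GATE: real money waits for a G5 survivor that passed G6.
        return g5_survivor_passed_g6
    return True
```

With `graded_green = {"B0"}` the call `may_open("B2", graded_green)` is False: B1 has not been graded, no matter how much kernel code exists.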

Example of the gate-lock in action, and why M2 is hard-blocked right now: the kernel (Track B's brain) does not technically depend on whether the plant survives a soak, so the operator made a careful, narrow exception (D-082). M2 kernel groundwork (the DESIGN/DATA/CONCURRENCY rings and pure-kernel BUILD, ideally in an isolated git worktree) MAY proceed in parallel with the soak. But the formal M2 open, the ceremony, and any work that wires kernel components into SENTINEL or stops/rebuilds the running plant stay blocked until M1's soak is green and the independent assessment passes. That is the gate being respected precisely, not bent.

1. Check the soak

Default first action next session: see whether the running 72h soak is alive and healthy (tasklist for Sponaitech.Sentinel; tail the soak log in ops/soak/; confirm PULSE is green). The last soak FAILED at exit 14 — freshness stall — so expect a re-soak, not a pass.

2. If GREEN

Run the post-soak checklist (ops/runbooks/soak-72h.md): --stats unchanged, --restore-verify, full suite green, write a TEST_LOG row.

3. Independent assessment

Run the M1 assessment Workflow with the independent auditor subagents (phase-gatekeeper / test-sentinel / constitution-guardian / design-reviewer) per D-080. The author never grades their own gate.

4. Ceremony

Only if all green: /milestone-ceremony 🏮 First Light → flip M1 ☐→☑ → officially open M2 (The Funeral).

5. If FAILED

Recorded FAIL → root-cause Workflow → fix (the top-up must not starve live currency — chunk/yield or reschedule; relates to D-086) → re-soak. Does NOT proceed to the ceremony.

☑ done

M0 · Ignition

DONE (ceremony 2026-06-12). Repo, rails, CI, DPAPI secret store. The foundation.

◐ now

M1 · First Light

IN PROGRESS. Data plant alive, 19.2M bars, PULSE shipped. Blocked on a clean 72h soak (last one failed at ~3h40m).

☐ blocked

M2 · The Funeral

Kernel core + FastScreen. HARD-BLOCKED behind M1; only the narrow D-082 groundwork (DESIGN/DATA/CONCURRENCY rings + pure-kernel BUILD) is allowed in parallel.

☐ next-next

M3 · Two Witnesses

Rigorous engine + parity; Experiment #001 (kernel vs TradeStation). The A-track forks after this.

☐ ahead

M4–M7

The Gauntlet (gates) → Quiet Nights (Claude loop) → The Third Witness (LEAN) → The Glass (dashboard). All gated, dashboard last.

★ the prize

The Touchstone ★

The real goal: the first strategy to survive G0→G5 out-of-sample — whenever it happens. Most won't. That is the point.

Why the dashboard is LAST, not first

Intuition says build the pretty screen early. Paras refuses. The dashboard is M7 'The Glass' — it cannot be built until M1 exits and M2–M6 are done, because a dashboard with no honest ledger behind it is theater. Scope creep before an edge exists is a named build-time risk; the manual campaign is the forcing function and the dashboard comes last on purpose.

What to be aware of, operator-side: most of the remaining M1 work is YOURS, not the builder's. The 72h soak runs on your physical machine (run-as-Administrator, keep it awake on AC, High-Performance power plan). A failed soak is never weakened to green — it is root-caused and re-run. And the moment M1 closes, the post-M1 build queue is already specified (single-writer fail-fast guard D-086 first, then the PULSE control-face D-085) — but none of it lands mid-soak, because a rebuild during a soak invalidates the soak (D-082).

The one sentence to remember

Paras advances only by demonstrably passing gates in order — B0→B1→B2→B3 sequential, A forks after B3, real money waits behind G5 — and today you are one clean 72-hour soak away from closing M1 and unlocking the kernel.

What to stay aware of
  • M1 is NOT closed. The data plant is alive (19.2M bars, self-heal working) but the milestone only flips to done after a clean 72h soak + independent assessment + the 🏮 First Light ceremony.
  • The most recent soak FAILED at ~3h40m with exit code 14 (freshness stall) — the nightly top-up starved the live heal-loop. That is a recorded FAIL under the failure doctrine: root-cause and re-soak; never weaken the freshness gate to make it pass.
  • Phases are strictly ordered: B0→B1→B2→B3 is sequential. M2 (kernel) and everything after stay hard-blocked until M1 closes; only the narrow D-082 groundwork (DESIGN/DATA/CONCURRENCY rings and pure-kernel BUILD) is allowed in parallel, ideally in an isolated worktree.
  • The author never grades their own gate. M1's exit is graded by independent auditor subagents via Workflow (phase-gatekeeper / test-sentinel / constitution-guardian / design-reviewer), per D-080.
  • The 72h soak is the operator's gate, run on the physical machine — run-as-Administrator, keep it awake on AC, High-Performance power plan; never stop or rebuild the running plant/PULSE mid-soak (it invalidates the soak).
  • The A-track (live trading) does not start until B3 proves kernel parity, and real money (A5) waits behind a G5 survivor. If the early families all die, the platform researches at $0/mo — that waiting is the discipline working.
  • The dashboard is built LAST (M7 'The Glass'), not first — a dashboard without an honest ledger behind it is theater; scope creep before an edge exists is a named build-time risk.
  • The post-M1 build queue is already specified (single-writer fail-fast guard D-086 first, then the PULSE control-face D-085) but none of it lands until M1 closes — because each rebuilds the exe and a rebuild mid-soak invalidates the soak.

Locked decisions & the why

D-080
Gates are graded by independent auditor subagents (phase-gatekeeper / test-sentinel / constitution-guardian / design-reviewer / parity-auditor) run via Workflow in separate contexts, never by the author; ultracode-xHigh + Workflow is the OS-level default. Operator authorized completing the M1 exit gate + ceremony via this independent assessment.
Why: Supersedes D-070 — 'Fable 5 is retired from public use and is NO LONGER AVAILABLE,' so the two-model protocol where Fable graded every gate is unworkable. The independent agent verdicts + measured TEST_LOG evidence + green CI ARE the gate; ULTRACODE U4 ('the author never grades their own gate') is preserved by routing the verdict through separate auditor agents.
D-082
M2 kernel groundwork (DESIGN/DATA/CONCURRENCY rings + pure Kernel/Kernel.Tests BUILD, ideally in an isolated git worktree) may proceed IN PARALLEL with the M1 72h soak. The formal ceremony, the official M2 open, and any plant-rebuild/wiring stay gated on soak-green + the independent M1 assessment.
Why: The soak is an endurance test of the PLANT; the kernel does not depend on the soak result (a soak failure is a plant-reliability fix, never a kernel rework), so the phase-gate's intent is preserved. Hard rule while soaking: never stop or rebuild the running plant/PULSE exes.
D-081
Deep-history M1 (2008–2020) = ACCEPT-AND-GATE; do NOT re-backfill now. Keep the sparse deep history and gate it via the existing data_quality table; revisit only if a second-source audit shows the gaps recoverable AND a pre-2021 strategy family actually needs them.
Why: The quality run shows recent M1 (2022+) is 97–99% clean while 2008–2020 has ~25% of days missing minutes — a genuine data-completeness characteristic of TradeStation's deep history, NOT an engine defect (D1 is 99.5% clean with the same engine). M2+ early families validate fine on the clean 2021+ window; never weaken a threshold to fake a clean report.
D-091
The headless always-on ingestion process is canonically the 'DataPlant' (operator-facing 'Paras DataPlant') — the run-mode Sponaitech.Sentinel --run, the single DuckDB writer. Distinct from the SENTINEL research dashboard (on-demand) and PULSE (read-only tray watcher).
Why: Resolves the operator's 'what is the always-on thing vs the dashboard' question. DataPlant = always-on engine; SENTINEL dashboard = on-demand research lab; PULSE = the DataPlant's read-only watcher. This is exactly what the soak certifies at M1.
D-073
PULSE is WPF/.NET 8, tray-first, and NEVER opens DuckDB — status transport is file-based (atomic pulse-status.json + events + a single reverse-channel for the two pause toggles). The plant is DuckDB's only M1 client (single-writer-process law).
Why: Single-writer discipline (Constitution P7): one source of truth per concern. PULSE as a read-only watcher means a monitoring app can never corrupt or contend for the vault. This law is what the M1 soak exercises (plant + PULSE coexisting for 72h).
D-013
Bars are stored as full 24-hour Globex sessions; RTH/pit/overnight windows are derived logically by SessionClock, never stored separately.
Why: First Light proved this: the live backfill showed 24 distinct hours-of-day = full Globex sessions present. One canonical storage, all session views derived — the data foundation B1 must establish before any engine work (B2+) touches it.
D-016
Seed instrument set is ES, CL, GC (PT sessions). Expansion (NQ, currencies, etc.) is config — instrument rows + SessionTemplates + spec YAMLs, zero kernel changes.
Why: Keeps the roadmap scope honest: the whole build targets three instruments first; adding more is a config change, not a re-architecture (reinforced by the asset-class-agnostic market-model hard rule D-094).
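
D-013's "store the full 24-hour session once, derive every window logically" can be illustrated with a small filter. This is only a sketch under assumptions: the real logic lives in the C# `SessionClock`, the `in_window` and `session_view` helpers are hypothetical, the window times are placeholders rather than real session templates, and timestamps are taken to be already in the session's time zone.

```python
from datetime import datetime, time


def in_window(ts, start, end):
    """True if ts falls in [start, end); supports windows that wrap midnight."""
    t = ts.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def session_view(bars, start, end):
    """Derive a session view from the canonical 24h bars; nothing is stored."""
    return [bar for bar in bars if in_window(bar[0], start, end)]


# Canonical store: (timestamp, close) rows covering the whole 24h session.
bars = [
    (datetime(2024, 1, 3, 3, 0), 10.0),
    (datetime(2024, 1, 3, 7, 0), 11.0),
    (datetime(2024, 1, 3, 12, 0), 12.0),
    (datetime(2024, 1, 3, 20, 0), 13.0),
]

# Placeholder windows, chosen only to show a day view and an overnight view.
day_view = session_view(bars, time(6, 30), time(13, 15))
overnight_view = session_view(bars, time(15, 0), time(6, 30))
```

Because both views are computed from the same rows, they cannot drift apart the way two separately stored series could.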
Sources: tracking/STATE.md — Current milestone (M1 ◐), Current phase status, ▶ NEXT ACTION (S028 wrap + the failed-soak update), First Light backfill (19.2M bars) · docs/05_BUILD_ROADMAP.md — Track B table (B0–B7), Track A table (A1–A5), Critical path line, §4 adversarial test suite, §5 risk register, Definition of fully operational · tracking/PROGRESS.md — Milestone board (M0–M7, A1–A5, ★ The Touchstone), M1 ring tracker, gate funnel · tracking/DECISIONS.md — D-080 (independent grading), D-082 (M2 parallel during soak), D-081 (deep-history accept-and-gate), D-091 (DataPlant naming), D-073 (PULSE read-only single-writer), D-013/D-016 (data + instrument scope)
Module 16 of 19 · The Build & Decisions

✅ Locked Decisions & The Why

The most important locked decisions in Paras — what was decided, why it was decided, and what would break if it were ever reversed — grouped so you can hold the whole system's spine in your head.

The whole lesson in one breath

Paras has a file — tracking/DECISIONS.md — that is the memory of every choice the system must never re-argue. Each entry is numbered (D-001, D-002, ... up to D-094), dated, and carries its rationale. They are LOCKED: a build session does not get to relitigate them, and the only way to reverse one is a new dated entry that explicitly supersedes the old (the old text is annotated, never deleted). This lesson walks the load-bearing ones, grouped by what they protect — architecture, the AI boundary, data, model-research, and operations — so that for each you understand the choice, the reason, and the cost if it were undone. Hold these and you hold the spine of the system.

You're building Paras solo, leaning hard on AI to move fast. That speed is the whole point — and it's also the danger. An eager assistant, left to its own judgment, will quietly re-open the same questions every session and let the answers drift, until a thousand small 'reasonable' edits have turned the system into something else. DECISIONS.md is the antidote. Its own header is blunt: 'Do NOT relitigate these inside build sessions.' And CLAUDE.md gives the operating rule: 'If a build request conflicts with a locked decision, stop and flag it rather than silently complying.' This page is your map of the decisions that matter most, so you can recognize when one is being violated — and say so.

A 'decision' here is not a code change. It is a durable choice about HOW the system works that you never want to argue twice: 'all three apps are WPF,' 'the kernel is a pure library,' 'no LLM in any order path,' 'Topstep rule values live in config, never as constants.' We'll take them in five groups, in roughly the order they shape the system: (1) ARCHITECTURE — the bones; (2) the AI BOUNDARY — what the machine is and is not allowed to do; (3) DATA — the ground truth everything stands on; (4) MODEL & RESEARCH — how edges are found honestly; (5) OPERATIONS — how it's run, graded, and kept alive. For each, the choice, the why, and the consequence of reversal.

How to read a D-entry (and why the format matters)

Every entry has the same skeleton: a number (D-0xx), the decision in one line, the source it traces to (a doc section or a Constitution Principle), and a dated rationale. The number is permanent — code reviews and other decisions cite it. The date lets you reconstruct history. The source means a decision is never free-floating opinion; it descends from a spec or a Principle. When you see 'supersedes D-070' (as D-080 does), that is the reversal mechanism working correctly: dated, explicit, traceable — never a silent rewrite.
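
The D-entry skeleton and the supersede rule map onto a small append-only structure. The sketch below is illustrative only: DECISIONS.md is a text file, and the `Decision` and `DecisionLog` names and fields are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    number: str        # permanent id, e.g. "D-0xx"
    text: str          # the decision in one line
    source: str        # doc section or Constitution Principle it traces to
    dated: str         # ISO date of the entry
    rationale: str
    superseded_by: Optional[str] = None  # annotation only; text is kept


class DecisionLog:
    def __init__(self):
        self._entries = {}

    def add(self, decision, supersedes=None):
        """Append a new entry; optionally annotate the one it supersedes."""
        if decision.number in self._entries:
            raise ValueError("numbers are permanent: " + decision.number)
        if supersedes is not None:
            old = self._entries[supersedes]      # KeyError if unknown
            old.superseded_by = decision.number  # annotate, never delete
        self._entries[decision.number] = decision

    def get(self, number):
        return self._entries[number]

    def all(self):
        return list(self._entries.values())

    def active(self):
        return [d for d in self._entries.values() if d.superseded_by is None]
```

There is deliberately no `edit` or `delete` method: the only way to change course is `add(..., supersedes=...)`, which leaves the old entry readable and marked.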

GROUP 1 — ARCHITECTURE: the bones. These five decisions fix the shape of the system. Everything else is built inside the box they draw. The keystone is D-001: there is ONE shared C# kernel; SENTINEL backtests it and AXIOM trades it — so the code that trades IS the code that backtests. That single choice is what makes a backtest result mean anything: if the live system ran different code, your backtest would be measuring a system you'll never actually trade. D-003 protects it by keeping the kernel a PURE library — no venue SDKs, no HTTP, no UI, no LLM, no Topstep values, no wall-clock reads (the clock is injected). Purity is what lets the same kernel run identically in a backtest, a parity court, and live.

Principle 3

D-001 · One shared kernel

SENTINEL backtests the kernel; AXIOM trades the kernel. The code that trades is the code that backtests. Reverse it and your backtest measures a system you never actually run — every result becomes a guess.

the shape

D-002 · Three apps, named Paras

PULSE (tray monitor) + SENTINEL (research) + AXIOM (execution). Three jobs, three apps, one kernel. Separation of concerns by construction.

docs/01 §7

D-003 · Pure kernel

No venue SDKs, HTTP, UI, LLM, Topstep values, or wall-clock in the kernel — the clock is injected. Reverse it and the kernel can no longer run identically in backtest, parity court, and live — parity dies.

concurrency

D-004 · Process isolation default

Process isolation for work; threads only over immutable/mmap data; UI on its own dispatcher. The default that prevents shared-state corruption between the plant, the engines, and the UI.

supersedes D-073/docs02

D-084 · All three apps are WPF

PULSE, SENTINEL, AXIOM are all Windows WPF (.NET 8). The Next.js/React design output is a UI reference prototype only, never a runtime dependency. Supersedes the old 'AXIOM stays WinUI 3' / 'SENTINEL = Next.js' lines — one UI stack is the right solo-operator simplification.

GROUP 2 — THE AI BOUNDARY: what the machine is allowed to do. This is the most important group for a system whose whole pitch is 'AI-automated.' The line is absolute and it is drawn at D-005: NO LLM in any order path; the AI sidecar can only veto, narrate, resist, or reflect — never approve, place, size, or modify. The runtime AI's only power is to say NO. Why so strict? Because LLMs are reasoners and veto filters, never forecasters or order managers (Principles 4–5). An LLM that could place or size an order would be a non-deterministic component in the one place — the order path — where you need provable, repeatable behavior. The consequence of reversing this is catastrophic: a hallucination could become a live trade.

| Decision | What it locks | Why, and the cost of reversing it |
|---|---|---|
| D-005 | No LLM in any order path; sidecar may only veto/narrate/resist/reflect. | LLMs are reasoners, not forecasters or order managers (P4–P5). Reverse it and a hallucination could place a live trade; the order path must be deterministic. |
| D-050 | Two AIs, two clocks: Claude (research-time, $0 marginal) proposes; local Ollama sidecar (runtime) can only say no. | Separates the 'thinking' AI from the 'trading' moment. The runtime AI has zero authority and costs $0 at the margin (local Ollama only). |
| D-067 | Gemma verdict is schema-constrained JSON enforced at the decoder level; any failure coerces to NEUTRAL, never a retry-loop; the verdict is read-only advisory with zero authority. | A malformed or stalled model can never block or distort execution. GO-class verdicts carry no permissive power: the LLM cannot approve anything. |
| D-068 | Reflections are local, $0, injected as read-only context, and NEVER feed a parameter, filter, threshold, gate, size, or entry. Behavior changes go through G0–G6. | Stops the 'AI learned something overnight and changed how we trade' failure. Adaptation must pass full gates, not leak in through a reflection. |
| D-069 | External-framework adoption policy + standing DO-NOTs: no LLM order/sizing authority ever; no bar-level LLM calls (session cadence is the runtime ceiling); zero new API token spend at runtime. | Caps both the authority and the cost of runtime AI. A step that 'needs' a cloud model at runtime is, by definition, a wrong step. |

The one sentence that governs the whole AI boundary

From D-067/D-068 and the standing risk T-06: 'No verdict- or reflection-derived value may reach execution, sizing, filters, or gates — LLM outputs are read-only advisory context.' This is the wall. The AI can narrate, it can warn, it can refuse — but it cannot touch a number that affects a trade. Every time you see an AI feature proposed, the first question is: does any of its output reach a behavior? If yes, it's wrong as designed, no matter how clever it is.
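
A small sketch can make the D-005/D-067 shape concrete: any parse failure collapses to NEUTRAL, and the verdict can only subtract permission, never add it. The `verdict` field name, the VETO/NEUTRAL/GO label set, and both function names are assumptions for illustration; the real sidecar enforces its schema at the decoder level, which this plain-JSON sketch does not attempt.

```python
import json

# Hypothetical label set; only NEUTRAL and "GO-class" appear in the text.
ALLOWED = {"VETO", "NEUTRAL", "GO"}


def parse_verdict(raw):
    """Coerce anything malformed to NEUTRAL. No retry loop, no exception."""
    try:
        verdict = json.loads(raw)["verdict"]
    except (ValueError, KeyError, TypeError):
        return "NEUTRAL"
    if isinstance(verdict, str) and verdict in ALLOWED:
        return verdict
    return "NEUTRAL"


def order_permitted(rules_allow, raw_verdict):
    """Deterministic rules decide; the sidecar may only turn a yes into a no."""
    return bool(rules_allow) and parse_verdict(raw_verdict) != "VETO"
```

Note what is absent: no code path lets a GO verdict turn `rules_allow=False` into a permitted order, and a garbled response behaves exactly like no opinion at all.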

GROUP 3 — DATA: the ground truth. Every backtest, every indicator, every decision stands on the bars in the vault. So the data decisions are about provenance and integrity. D-010 makes TradeStation the primary source (free, deepest history). D-011 stores M1 from 2008 so stress regimes (2008, COVID) are in-sample. D-012 is the discipline that prevents silent corruption: ALL intermediate timeframes (M5/M15/H1...) are DERIVED from canonical M1 by one consolidator — never extracted separately. If you pulled M5 separately, it could silently disagree with the M1 it should aggregate to, and you'd never know which was right (Principle 7: one source of truth per concern). The most recent and most important data decisions, though, came from a hard lesson during the M1 soak.
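
D-012's single-consolidator rule is easy to picture in code: every higher timeframe is computed from canonical M1 rows, so it can never disagree with them. The sketch below is illustrative Python, not the project's consolidator. The `consolidate` function, the tuple layout, and the bucket-start labelling of bars are all assumptions.

```python
from datetime import datetime, timedelta


def consolidate(m1_bars, minutes):
    """Aggregate (ts, open, high, low, close, volume) M1 rows into N-minute bars.

    Assumption: each bar is labelled with its START time, and buckets are
    aligned to midnight of the bar's own date.
    """
    buckets = {}
    for ts, o, h, l, c, v in sorted(m1_bars, key=lambda bar: bar[0]):
        minute_of_day = ts.hour * 60 + ts.minute
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        start = midnight + timedelta(minutes=(minute_of_day // minutes) * minutes)
        if start not in buckets:
            buckets[start] = [o, h, l, c, v]  # first bar sets the open
        else:
            bucket = buckets[start]
            bucket[1] = max(bucket[1], h)
            bucket[2] = min(bucket[2], l)
            bucket[3] = c                     # last bar sets the close
            bucket[4] += v
    return [(start, *buckets[start]) for start in sorted(buckets)]
```

Feeding six consecutive M1 bars starting at 09:30 into `consolidate(bars, 5)` yields one full 09:30 bar and one partial 09:35 bar, both traceable to the exact M1 rows they came from.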

1. D-081 · Accept-and-gate the deep history

2008–2020 M1 has ~25% of days missing >2% of minutes — but this is TradeStation's actual coverage sparsity, NOT an engine bug (the same QualityEngine yields 97–99% clean on 2022+). Decision: keep it, flag faulty days via the data_quality table, do NOT re-backfill blindly. Revisit only if a second-source audit shows the gaps are recoverable AND a pre-2021 family needs them.

2. D-092 · Databento fill ABANDONED: sources are in different price frames

The free cost-check passed, but a 5-cent sample audit revealed TradeStation continuous (@ES/@CL/@GC) is BACK-ADJUSTED while Databento (*.c.0) is RAW/unadjusted — a near-constant per-symbol offset (@ES +529.5, @CL -21.31, @GC +376). Co-mingling would splice a ~530-700pt discontinuity at the join: a corrupt series violating P3 (parity) and P7. Hard STOP at Gate 1; recorded FAIL, no fill, ~$0.06 spent.

3. D-093 · The correct architecture: RAW is canonical, back-adjusted is DERIVED

Lock: RAW unadjusted bars are the immutable ground truth (like M1); the back-adjusted series is a DERIVED, versioned, frozen, reproducible transform from raw + a frozen roll schedule (exactly as M5/H1 derive from M1). Both frames stored and provenance-tagged. NEVER co-mingle vendors. This is why: TradeStation's back-adjustment is opaque and MUTABLE — it silently re-shifts historical prices on every roll, a determinism hazard. Own the adjustment yourself.

4. D-094 · Asset-class-agnostic market model: a HARD RULE

Found during the soak: one CME-equity calendar applied to all symbols over-closed the market (Juneteenth + an evening-skip rule pushed the close to Sunday when the market actually reopened Thursday afternoon). Lock: session geometry, calendars, and bar-expectation logic MUST be per-instrument and asset-class-aware — so US equities, FX, and crypto can be added by a config profile, never a rewrite. No CME literal may live in shared session/calendar/quality code.

Why D-092/D-093 are a textbook 'kill it cleanly' moment

The data team wanted deep history. The honest path didn't fight for it — it ran the cheap free check, then a 5-cent sample, found the two sources fundamentally incompatible, and STOPPED at the first gate rather than splice a corrupt series to get the result they wanted. That recorded FAIL (D-092) is the Constitution's 'honest discovery over impressive results' in miniature. And it produced the better answer (D-093): don't borrow someone else's adjusted prices — own your own adjustment from raw, deterministically. A reversal here would re-introduce exactly the silent ~530-point discontinuity the audit caught.
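
To see what "back-adjusted is DERIVED from raw + a frozen roll schedule" (D-093) could look like, here is a hedged Python sketch. The source does not say which adjustment method Paras uses; this example assumes simple difference adjustment, where each roll's price gap is added to every bar before that roll. The `back_adjust` name and all prices, dates, and gaps are made up.

```python
def back_adjust(raw, rolls):
    """Derive an adjusted series from immutable raw bars.

    raw:   list of (iso_date, price) rows, never modified
    rolls: frozen schedule of (iso_roll_date, gap) pairs, where gap is the
           new contract's price minus the old contract's price at the roll
    A bar dated before a roll is shifted by that roll's gap.
    """
    adjusted = []
    for date, price in raw:
        shift = sum(gap for roll_date, gap in rolls if date < roll_date)
        adjusted.append((date, price + shift))
    return adjusted


# Made-up inputs, only to show the mechanics.
raw = [("2024-03-01", 100.0), ("2024-03-15", 102.0), ("2024-06-14", 110.0)]
rolls = [("2024-03-10", 5.0), ("2024-06-10", -2.0)]
derived = back_adjust(raw, rolls)
```

The same raw rows plus the same frozen schedule always give the same derived series, which is the determinism TradeStation's silently re-shifted history cannot offer.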

GROUP 4 — MODEL & RESEARCH: how edges are found honestly. This group exists to stop the single most common way trading research lies to itself — torturing data until something looks profitable. D-031 is the gatekeeper: G0 requires a stated one-sentence MECHANISM — no mechanism, no run; pure pattern-mining is BANNED. D-030 makes every run a counted trial against a pre-registered budget, with re-rolls facing a higher hurdle (DSR), and that budget gate is CODE, not Claude. D-020/D-021 give you the two-speed engines and the three-engine parity court (kernel / TradeStation EasyLanguage / LEAN) where an unexplained delta is a STOP-EVERYTHING event. And D-087 makes 'the code that trades is the code that backtests' concrete at the feature level while explicitly DROPPING the seductive 'compose combos and see what works' idea.

| Decision | What it locks | Why it matters |
|---|---|---|
| D-031 | G0 demands a one-sentence mechanism: no mechanism, no run. Pure pattern-mining is banned. | Pattern-mining finds noise that looks like signal. A required mechanism forces a real economic 'why' before any compute is spent. |
| D-030 | Gate ladder G0→G6; every run is a counted trial vs a pre-registered budget; re-rolls face a higher DSR hurdle; the budget gate is code. | Counting trials is how you stay honest about multiple testing: the more you try, the higher the bar to call something real (Principle 2). |
| D-021 | Three-engine parity court: kernel / TradeStation / LEAN. An unexplained delta is STOP-EVERYTHING. | Three independent implementations agreeing is strong evidence the result is real and not a single-engine artifact (G4). |
| D-032 | Holdout months are frozen and opened exactly once, at G5. | A holdout you can peek at isn't a holdout. Opening it once is the only honest out-of-sample test. |
| D-087 | Feature-cache approved with guardrails (cached==uncached byte-for-byte; populated by the kernel's incremental walk, no look-ahead); combo-search REJECTED. | Speed without changing results. Combo-search is rejected because it violates D-031 (no mechanism) and Principle 2 (every trial counted). Model ideas enter via G0–G6 with no special status. |
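
"The budget gate is code" (D-030) means a counter, not a judgment call. Here is a minimal sketch of a pre-registered trial budget that refuses an over-budget request. The `TrialBudget` class and its fields are hypothetical, and the real gate also raises the DSR hurdle on re-rolls, which this example leaves out.

```python
class TrialBudget:
    """Pre-registered trial budget for one strategy family (hypothetical shape)."""

    def __init__(self, family, max_trials):
        if max_trials < 0:
            raise ValueError("budget must be non-negative")
        self.family = family
        self.max_trials = max_trials  # fixed at registration, never raised
        self.used = 0

    def request(self, trials=1):
        """Count the trials and return True, or refuse and count nothing."""
        if trials < 1:
            raise ValueError("a run is at least one trial")
        if self.used + trials > self.max_trials:
            return False
        self.used += trials
        return True

    @property
    def remaining(self):
        return self.max_trials - self.used
```

The refusal path changes no state, so an over-budget proposal can be retried forever and will be blocked every time; the only way forward is a new pre-registered hypothesis.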

GROUP 5 — OPERATIONS: how it's run, graded, and kept alive. These decisions are about the system staying trustworthy day to day. D-040 keeps Topstep rule values as versioned config (config/topstep-rules.json), re-checked WEEKLY (D-063) — never as constants — so a rule change is a config edit that activates at the next pre-flight, never a code change. D-041/D-042 lock the safety rails: automation only on Combine/Express (hard-blocked on Live Funded), no VPS/VPN, no manual order entry in AXIOM, config immutable during market hours, and the only 'override' opens a Gemma chat that can grant nothing. D-071 is the continuity doctrine — nightly VERIFIED backups, a drilled restore path ('a backup that has never been restored is a hope, not a backup'). And two decisions changed how the whole project is built and graded.

Principle 9

D-040 / D-063 · Topstep rules are config

Rule values live in config/topstep-rules.json, re-verified WEEKLY against the live rulebook, and treated as RiskConfig changes that activate at the next pre-flight — never mid-session, never as code constants (Principle 9).

docs/03 §3.1

D-042 · Hard execution locks

No manual order entry in AXIOM; config immutable during market hours; cooling-off staging; the only 'override' opens a Gemma chat that can grant nothing. The fortress only says no.

docs/09 · R-08

D-071 · Continuity doctrine

Nightly VERIFIED backup to a second disk + weekly offline copy + git for text; restore drilled monthly; no absolute paths or machine assumptions in code. 'A backup that has never been restored is a hope, not a backup.'

supersedes D-070

D-080 · ultracode-xHigh + independent grading

Maximum-effort Workflow on every substantive task (no opt-out); gates graded by independent auditor subagents in separate contexts — never by the author. Supersedes D-070 (Fable retired). The author never grades their own gate.

one datastore

D-072 · PostgreSQL declined

DuckDB remains the ONLY datastore; the operator's local PostgreSQL is not used by Paras. No new infra dependency without a superseding entry (Principle 7).

naming

D-091 · The DataPlant is named

The always-on ingestion process (Sponaitech.Sentinel --run) is the 'Paras DataPlant' — the single DuckDB writer. Distinct from the on-demand SENTINEL research dashboard and the read-only PULSE tray watcher.

The model reversal: D-070 → D-080 (how a lock is undone correctly)

D-070 originally said 'Fable 5 grades every gate' (a two-model build protocol). When Fable was retired from public use, the team did NOT quietly delete D-070. They wrote D-080, dated 2026-06-15, stating it 'supersedes D-070,' and re-routed grading to independent auditor subagents (phase-gatekeeper, test-sentinel, constitution-guardian, design-reviewer, parity-auditor) run in separate contexts. The old entry stays, annotated as superseded. That is the only legal way to reverse a lock — and it preserves the deeper invariant (ULTRACODE U4: the author never grades their own gate) that mattered more than the specific tool.
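The supersession rule described here behaves like an append-only log: a reversal adds a new dated entry and annotates the old one, and the old text is never edited or removed. The following is a minimal sketch of that shape, not the actual DECISIONS.md tooling. The `Decision` and `DecisionLog` names are hypothetical, and the date used for D-070 is a placeholder because the text does not give it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass
class Decision:
    id: str
    dated: date
    text: str
    superseded_by: Optional[str] = None  # annotation only; text is never edited


class DecisionLog:
    def __init__(self) -> None:
        self._entries: Dict[str, Decision] = {}

    def add(self, entry: Decision, supersedes: Optional[str] = None) -> None:
        if entry.id in self._entries:
            raise ValueError(f"{entry.id} already exists; entries are never rewritten")
        if supersedes is not None:
            old = self._entries[supersedes]  # KeyError if the target does not exist
            if old.superseded_by is not None:
                raise ValueError(f"{supersedes} is already superseded")
            old.superseded_by = entry.id     # annotate the old entry, keep its text
        self._entries[entry.id] = entry

    def get(self, decision_id: str) -> Decision:
        return self._entries[decision_id]

    def in_force(self, decision_id: str) -> bool:
        return self._entries[decision_id].superseded_by is None


log = DecisionLog()
# Placeholder date: the real date of D-070 is not stated in this lesson.
log.add(Decision("D-070", date(2026, 1, 1), "Fable 5 grades every gate"))
log.add(
    Decision("D-080", date(2026, 6, 15), "Independent auditor subagents grade gates"),
    supersedes="D-070",
)
```

After these two calls, D-070 is still readable word for word, but `in_force("D-070")` is false and the entry points at D-080.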

WHY THIS WHOLE FILE EXISTS — the deepest reason. Constitution Principle 11 says: 'The system adapts by saying no more often — never by quietly becoming a different system. Live params are immutable; adaptation = suppression, retirement, and new pre-registered hypotheses through full gates.' DECISIONS.md is the memory that makes Principle 11 enforceable. Without it, a solo operator and an eager AI would re-open settled questions every session and drift. With it, you can trust that the system you designed last month is the system running today — and that any change is dated, sourced, and traceable. The file is not bureaucracy; it is the thing that keeps Paras honest about its own identity.

Watch-outs — how a locked decision gets violated in practice

1) A request re-argues a settled choice ('let's just have the AI size the position this once'): that's a D-005/D-067 violation; STOP and cite the D number.
2) A 'reversal' that just edits the old D-entry's text instead of writing a new dated superseding entry: history is being rewritten, and that's not allowed.
3) A Topstep value appearing as a constant in code instead of config: D-040 violation.
4) A second process opening the DuckDB vault, or PostgreSQL creeping in: D-072/D-073 violation.
5) Any LLM output reaching a parameter, filter, gate, or size: the AI-boundary wall (D-067/D-068) is breached.
6) A 'fill the deep history from Databento' suggestion: that path is closed (D-092); the correct architecture is D-093.

What to remember

DECISIONS.md is the system's spine: numbered, dated, sourced choices that are never relitigated, only superseded by a new dated entry. ARCHITECTURE: one pure kernel, three WPF apps (D-001/D-003/D-084). AI BOUNDARY: the runtime AI can only say no — no LLM output ever reaches a trade (D-005/D-067/D-068). DATA: raw is canonical, back-adjusted is derived, never co-mingle vendors, market model is per-instrument (D-093/D-094). RESEARCH: no mechanism no run, every trial counted, parity is sacred (D-031/D-030/D-021). OPS: Topstep rules are config, backups are verified, the author never grades their own gate (D-040/D-071/D-080). When a request conflicts with one of these, the correct move is always the same: stop, name the D number, and flag it.

What to stay aware of
  • DECISIONS.md is never relitigated in a build session. If a request conflicts with a locked D-entry, the correct move is to STOP and flag it, citing the D number — not to silently comply (CLAUDE.md).
  • A lock is only reversed by a NEW dated entry that explicitly says it supersedes the old; the old text is annotated, never deleted. An edit to the old entry's text is a history rewrite and is not allowed (see D-070 → D-080).
  • The AI boundary is absolute: no verdict- or reflection-derived value may reach execution, sizing, filters, or gates (D-067/D-068, risk T-06). If any LLM output touches a trade-affecting number, the feature is wrong as designed.
  • Topstep rule values are config, re-verified WEEKLY, never code constants (D-040/D-063). A Topstep number appearing as a literal in code is a violation.
  • DuckDB is the ONLY datastore (D-072) and the DataPlant is its single writer (D-091); a second process opening the vault, or PostgreSQL creeping in, is a violation.
  • Deep-history 'fill from Databento' is a closed path (D-092) — the sources are in different price frames. The correct architecture is D-093: raw is canonical, back-adjusted is a derived, frozen, reproducible transform; never co-mingle vendors.
  • The market model must be per-instrument and asset-class-agnostic (D-094) — no CME literal in shared session/calendar/quality code, so US equities/FX/crypto can be added by config, not a rewrite.
  • Research stays honest by construction: no mechanism, no run (D-031); every run is a counted trial against a pre-registered budget (D-030); the holdout opens exactly once at G5 (D-032); a parity delta is STOP-EVERYTHING (D-021).
  • The author never grades their own gate — gates are graded by independent auditor subagents in separate contexts (D-080, ULTRACODE U4).
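The AI-boundary bullet above leans on the D-067 rule that any verdict failure coerces to NEUTRAL and is never retried. D-067 enforces the schema at the decoder level; the sketch below shows only a post-hoc validation of the same idea, as an illustration. The five tier names and the `reason` field are hypothetical. Only NEUTRAL and the existence of GO-class tiers come from the text.

```python
import json

# Hypothetical five-tier vocabulary (assumed names, not the real contract).
ALLOWED_VERDICTS = {"STRONG_NO_GO", "NO_GO", "NEUTRAL", "GO", "STRONG_GO"}
NEUTRAL = {"verdict": "NEUTRAL", "reason": "coerced"}


def parse_verdict(raw):
    """Return a validated verdict dict, or NEUTRAL on any failure. Never retries."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return dict(NEUTRAL)          # malformed or missing output
    if not isinstance(data, dict):
        return dict(NEUTRAL)          # valid JSON, wrong shape
    verdict = data.get("verdict")
    reason = data.get("reason")
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        return dict(NEUTRAL)          # unknown or non-string tier
    if not isinstance(reason, str):
        return dict(NEUTRAL)          # missing narrative field
    # Only the whitelisted fields survive; extra keys are dropped.
    return {"verdict": verdict, "reason": reason}
```

The function returns advisory text only. Nothing in it maps a verdict to a size, filter, or gate, which is the part of D-067 that matters most: a GO-class result is still just a label.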

Locked decisions & the why

D-001
One shared C# kernel; SENTINEL backtests it and AXIOM trades it — the code that trades is the code that backtests.
Why: Parity by construction (Principle 3). If live ran different code from the backtest, every backtest result would be measuring a system you never actually trade — results would be meaningless. This is the architectural keystone.
D-003
The kernel is a pure library — no venue SDKs, HTTP, UI, LLM, Topstep values, or wall-clock reads; the clock is injected.
Why: Purity is what lets the identical kernel run in a backtest, the three-engine parity court, and live without behavioral difference (docs/01 §7). Reverse it and parity (D-021) and determinism die.
D-005
No LLM in any order path; the AI sidecar may only veto/narrate/resist/reflect — never approve, place, size, or modify.
Why: LLMs are reasoners and veto filters, never forecasters or order managers (Principles 4–5). The order path must be deterministic; an LLM there means a hallucination could become a live trade. This is the central AI-boundary line.
D-067
The Gemma verdict is schema-constrained JSON enforced at the decoder level; any failure coerces to NEUTRAL (never a retry-loop); the verdict is read-only advisory with zero authority — GO-class verdicts carry no permissive power.
Why: A malformed or stalled model can never block or distort execution, and the LLM can never approve anything. Any verdict→behavior mapping is deterministic versioned config that enters live paths only through G0–G6 (risk T-06).
D-068
Session reflections are local ($0), injected next session as read-only context — and NEVER feed a parameter, filter, threshold, gate, size, or entry. Stored once in DuckDB (SQLite declined, Principle 7).
Why: Stops the 'the AI learned something overnight and silently changed how we trade' failure. Adaptation must pass full gates G0–G6, never leak in through a reflection (standing risk T-06).
D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run. Pure pattern-mining is banned.
Why: Pattern-mining finds noise that looks like signal; a required economic 'why' before any compute is the first honesty filter. Paired with D-030 (every run is a counted trial vs a pre-registered budget, DSR hurdle rises with re-rolls), it is how the research stays honest about multiple testing (Principle 2).
D-021
Three-engine parity court (kernel / TradeStation EasyLanguage / LEAN QCAlgorithm); an unexplained delta is a STOP-EVERYTHING event.
Why: Three independent implementations agreeing is strong evidence a result is real and not a single-engine artifact (G4, docs/01 §6). A determinism/parity mismatch is never a flaky test — it halts everything.
D-092
Databento deep-history fill ABANDONED — TradeStation continuous bars are BACK-ADJUSTED while Databento is RAW/unadjusted (near-constant offsets: @ES +529.5, @CL -21.31, @GC +376). Hard STOP at Gate 1; recorded FAIL, no fill, ~$0.06 spent.
Why: Co-mingling the two would splice a ~530–700pt discontinuity at the join — a corrupt, untradeable series violating P3 (parity) and P7 (one source of truth per bar). A textbook 'kill it cleanly' moment: ran the cheap checks, found the incompatibility, stopped honestly rather than fake a clean fill.
D-093
Data-frames architecture LOCKED: RAW unadjusted bars are canonical and immutable; the back-adjusted series is a DERIVED, versioned, frozen, reproducible transform from raw + a frozen roll schedule. Both frames stored, provenance-tagged; never co-mingle vendors.
Why: TradeStation's back-adjustment is opaque and MUTABLE — it silently re-shifts historical prices on every roll, a reproducibility/determinism hazard for a system built on cached==uncached and parity. Owning the adjustment yourself (like M5/H1 derive from M1, D-012) restores determinism and lets you test any model type, not just indicator/regime.
D-094
Market-model architecture must be asset-class-agnostic and per-instrument — a HARD RULE. Session geometry, calendars, and bar-expectation logic resolve per-instrument via a market profile so US equities, FX, and crypto can be added by config, never a rewrite. No CME literal in shared session/calendar/quality code.
Why: Found during the M1 soak: one CME-equity calendar applied to all symbols over-closed the market (Juneteenth + an evening-skip rule pushed the close to Sunday when the market actually reopened Thursday afternoon). The fix is a design law that future-proofs the system for new asset classes (motivating evidence + risk R-09).
D-040
Topstep rule values are versioned config (config/topstep-rules.json), re-checked WEEKLY (D-063) and treated as RiskConfig changes that activate at the next pre-flight — never as code constants, never mid-session.
Why: Constitution Principle 9. A prop-firm rule change must be a config edit the operator can verify against the live rulebook, not a code change buried in a build. Constants would make rule drift invisible and dangerous to a live account.
D-080
OS-level default is ultracode-xHigh + Workflow on every substantive task (no opt-out); gates are graded by independent auditor subagents (phase-gatekeeper / test-sentinel / constitution-guardian / design-reviewer / parity-auditor) in separate contexts — never by the author. Supersedes D-070 (Fable retired).
Why: Preserves ULTRACODE U4 — 'the author never grades their own gate' — after the prior two-model protocol (D-070, Fable grades) became unworkable. It is also the model of a correct reversal: dated, explicit supersession, old entry annotated not deleted, deeper invariant preserved.
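Among the decisions above, D-093 describes the back-adjusted series as a derived transform of raw bars plus a frozen roll schedule. A small worked example makes the idea concrete. This sketch assumes additive (difference) back-adjustment, which is one common method; the text does not say which method Paras uses, and the function name, the roll-schedule shape, and the prices are hypothetical.

```python
from typing import List, Tuple


def back_adjust(raw_closes: List[float], rolls: List[Tuple[int, float]]) -> List[float]:
    """Derive an additively back-adjusted series from raw closes.

    rolls holds (first_index_of_new_contract, gap) pairs, where
    gap = new contract price minus old contract price at the roll.
    Every bar before a roll is shifted by that roll's gap.
    """
    adjusted = list(raw_closes)  # work on a copy; the raw series is never mutated
    for first_new_idx, gap in rolls:
        for i in range(first_new_idx):
            adjusted[i] += gap
    return adjusted


# Illustrative numbers only: two rolls, with gaps of +3.0 and +2.0.
raw = [100.0, 101.0, 105.0, 106.0, 110.0]
schedule = [(2, 3.0), (4, 2.0)]
derived = back_adjust(raw, schedule)
```

With the same raw input and the same frozen schedule, the output is always the same list, which is the reproducibility property D-093 is after. A vendor series that re-shifts history on every roll cannot offer that guarantee.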
Sources: tracking/DECISIONS.md — header ('Do NOT relitigate these inside build sessions'; reversals require an explicit dated superseding entry) · tracking/DECISIONS.md — Architecture: D-001, D-002, D-003, D-004, D-084 (+ D-073 superseded clause) · tracking/DECISIONS.md — AI operations / boundary: D-005, D-050, D-051, D-052, D-067, D-068, D-069 · tracking/DECISIONS.md — Data plant + data-frames: D-010, D-011, D-012, D-015, D-081, D-088, D-092, D-093, D-094 · tracking/DECISIONS.md — Engines & validation + Gates & discipline: D-020, D-021, D-022, D-024, D-030, D-031, D-032, D-087 · tracking/DECISIONS.md — Execution & Topstep: D-040, D-041, D-042, D-043, D-044 (+ OS/tooling D-062, D-063, D-064) · tracking/DECISIONS.md — Execution protocol & continuity: D-070 → D-080 supersession, D-071, D-072, D-091 · CLAUDE.md — The Constitution (Principles 3, 4–5, 7, 9, 11) + Hard rules (no LLM in order path; Topstep values in config; pure kernel; no verdict/reflection value in execution) + 'never relitigate locked decisions'
Module 17 of 19 · The Build & Decisions

⚠ The Risk Register

A living table that names the six ways the build can fail and binds a tested, gate-anchored control to each one—so risks are caught by construction, not by hope.

Why a register at all

You are building Paras solo. There is no QA team, no second pair of eyes on a 2 a.m. deploy, no compliance department. The risk register is how a one-person operation keeps honest about the ways the system can quietly go wrong—before live P&L is on the line. Every row is a named failure mode bound to a control that is built and tested, not assumed. The two non-negotiables carried above every row: an unexplained parity delta is STOP-EVERYTHING, and operator-override pressure is the headline human risk.

Think of the register the way you think of a trading checklist before the open: it is not paranoia, it is the list of things that have actually killed accounts. In enterprise IT you called this a risk register too—a table of risk, mitigation, owner, and status. Paras keeps exactly that discipline, but with one twist that makes it Paras: every mitigation is anchored to a specific build gate or phase, so the control is something the code must demonstrably pass, not a line in a policy document nobody reads. The source of record is docs/05 §5 (the build-time risk register); the living working copy is tracking/RISKS.md, which you update in place and never delete from.

There are two layers. Layer one (§A, R-01…R-09) is the build-time register: the strategic risks to the whole project. Layer two (§B, T-01…T-06) holds the Paras-Method testing risks: the precise failure modes the verify ring exists to catch, which are how R-01 actually shows up in practice. This lesson focuses on the six headline rows from docs/05 §5 that the operator must watch, then points at the testing layer underneath them.

Risk | Mitigation | Gate/Phase anchor
Kernel fill-model bug (silent) | B3 parity differ + B6 LEAN oracle 3-way court; unexplained delta = stop-everything (G4) | B3, B6 / G4
Scope creep before an edge exists | A5 gate (nothing trades before G5); dashboard built last (B7); manual campaign is the forcing function | A5 / G5
TopstepX outages / no sandbox | Shadow-mode rehearsal, simulated-outage drills, runbooks, never-unattended rule | A1, A2, A4
Rule drift at Topstep | Versioned config/topstep-rules.json + re-check cadence; rules are config, never constants | A1+
Claude-loop overfitting at scale | Trial-budget gate enforced in code (B4); proposals inert without DSR/PBO headroom | B4, B5 / G3
Operator override pressure | Hard locks + Gemma resistance dialogue + cooling-off; tested adversarially in A1/A3 | A1, A3

Now the same six rows as a course, one at a time—what each is, why it earns a place, how its control works, and what you must watch.

OPEN · KER/SENT · G4

R-01 · Kernel fill-model bug (silent)

A wrong fill or cost rule that passes its own unit tests yet quietly corrupts every backtest AND every live order—because the same kernel both backtests and trades (Principle 3). This is the most dangerous row because it is invisible: nothing crashes, the numbers just lie. Control: a three-engine parity court. FastScreen, the rigorous engine, and the LEAN oracle (B6) must agree within the docs/01 §6 tolerances; ParityDiffer (B3) plus the 3-way differ at gate G4 enforce it. An unexplained delta is STOP-EVERYTHING—never tolerance-waved.
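The parity court can be sketched as a pairwise comparison of trade lists that raises on the first divergence rather than returning a warning. This is only an illustration of the stop-everything shape. The `Trade` tuple, the `ParityDelta` exception, and the single price tolerance are hypothetical; the real tolerances live in docs/01 §6 and are not reproduced here.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Trade = Tuple[str, float]  # (timestamp, fill price); a deliberately minimal, assumed shape


class ParityDelta(Exception):
    """Raised on any divergence; callers are expected to halt, not retry."""


def parity_court(engines: Dict[str, List[Trade]], price_tol: float) -> None:
    """Compare every pair of engines; raise on the first mismatch."""
    for name_a, name_b in combinations(sorted(engines), 2):
        trades_a, trades_b = engines[name_a], engines[name_b]
        if len(trades_a) != len(trades_b):
            raise ParityDelta(
                f"{name_a} has {len(trades_a)} trades, {name_b} has {len(trades_b)}"
            )
        for i, ((ts_a, px_a), (ts_b, px_b)) in enumerate(zip(trades_a, trades_b)):
            if ts_a != ts_b or abs(px_a - px_b) > price_tol:
                raise ParityDelta(f"{name_a} and {name_b} diverge at trade {i}")
```

Raising an exception instead of logging is the design point: a delta cannot be skipped by accident, and the only way forward is a root cause.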

OPEN · OP/CC · G5

R-02 · Scope creep before an edge exists

The temptation to polish the AXIOM execution fortress and the dashboard before a single strategy has survived the gates—effort spent on a system that may have no edge yet. Control: the A5 gate. Nothing trades before a strategy holds G5; the dashboard is built last (B7); and the manual research campaign is the forcing function that keeps you finding edge instead of decorating. Budget discipline is the proof it is working: if every early family dies at the null, the platform keeps researching at $0/mo while A5 waits. That is the design working, not failing.

OPEN · AXM · A1/A2/A4

R-03 · TopstepX outages / no sandbox

Topstep can go down, and there is no safe full sandbox to rehearse against. If the venue dies mid-position with no plan, you are exposed. Control: shadow-mode rehearsal (A1/A4), simulated-outage drills (A2 §4—kill the stream, kill the venue mid-position ⇒ correct FLATTEN_AND_HALT), written runbooks, and the never-unattended rule. The drills are part of the A1/A2 exit gates, so the outage handling is proven before any real order.

OPEN · OP/AXM · A1+

R-04 · Rule drift at Topstep

Topstep can change its MLL / DLL / consistency rules under you, and a stale rule value silently makes the compliance engine wrong. Control: the rulebook is versioned config (config/topstep-rules.json, Principle 9), never hardcoded constants, plus a standing re-check cadence via /verify-topstep-rules. Each run appends a dated _verificationLog entry even when nothing changed, and a rule change is treated as a RiskConfig change—it activates at the next PRE_FLIGHT, never mid-session.
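The "append a dated entry even when nothing changed" habit is easy to show in a few lines. In this sketch only the `_verificationLog` key comes from the text. The entry fields (`date`, `changed`, `note`) and the function name are assumptions made for illustration, and the real /verify-topstep-rules command may record different details.

```python
import json
from datetime import date


def record_verification(rules: dict, checked_on: date, changed: bool, note: str = "") -> dict:
    """Return a copy of the rules document with one more _verificationLog entry."""
    updated = json.loads(json.dumps(rules))  # deep copy via a JSON round-trip
    log = updated.setdefault("_verificationLog", [])
    # An entry is appended on every run, including runs that found no change.
    log.append({"date": checked_on.isoformat(), "changed": changed, "note": note})
    return updated
```

Logging the no-change runs is what makes the cadence auditable: a gap in the dates shows a missed weekly check, which a log of changes alone would hide.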

OPEN · SENT/AI · G3

R-05 · Claude-loop overfitting at scale

The nightly Claude research loop mines noise as the trial count grows—torture enough data and something always 'works.' Control: a trial-budget gate enforced in code (B4), not by good intentions. Proposals are inert without DSR/PBO headroom; every trial is counted; re-rolls face a higher DSR hurdle (Principles 2 and 11). The B4 exit criteria literally require that a budget gate blocks an over-budget proposal in a test.
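To see why the hurdle must rise with the trial count, it helps to compute how good the best of N zero-edge strategies looks by luck alone. The sketch below uses the expected-maximum-Sharpe approximation from the deflated Sharpe ratio literature (Bailey and López de Prado) as the hurdle. Whether Paras uses this exact formula is an assumption, and `budget_gate` with its parameters is a hypothetical stand-in for the B4 gate.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Approximate expected best Sharpe among n_trials strategies with no true edge."""
    if n_trials < 2:
        return 0.0  # a single trial has no selection effect to deflate
    inv = NormalDist().inv_cdf
    return sharpe_std * (
        (1 - EULER_GAMMA) * inv(1 - 1 / n_trials)
        + EULER_GAMMA * inv(1 - 1 / (n_trials * math.e))
    )


def budget_gate(trials_used: int, trial_budget: int,
                observed_sharpe: float, sharpe_std: float) -> bool:
    """True only if the next trial fits the budget and clears the luck hurdle."""
    next_count = trials_used + 1
    if next_count > trial_budget:
        return False  # over budget: the proposal stays inert
    return observed_sharpe > expected_max_sharpe(next_count, sharpe_std)
```

The budget check runs first, so an over-budget proposal is rejected regardless of how impressive its Sharpe looks. That ordering is the code-level version of "do not trust the loop's enthusiasm".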

OPEN · OP/AXM/AI · A1/A3

R-06 · Operator override pressure

The headline human risk: under live P&L stress, the human (you) pushes to relax a lock—move a stop, lift a halt, size up 'just this once.' Control: hard locks + a Gemma resistance dialogue + a cooling-off period, and crucially these are tested adversarially in A1/A3 (the override red-team must yield no workarounds), not assumed. Live parameters are immutable (Principle 11). The system adapts by saying NO more often—never by you quietly relaxing a lock.

The two rules that outrank everything

1) Unexplained parity delta = STOP-EVERYTHING. Any divergence between FastScreen / rigorous engine / LEAN oracle that is not root-caused halts ALL research and trading until reconciled—because a parity bug silently contaminates every downstream gate, ledger row, and lesson, so it outranks schedule (Principle 3; G4). 2) Operator-override pressure is the headline human risk: the controls (hard locks, Gemma dialogue, cooling-off) are tested, not trusted.

Beyond the six headline rows, the register has grown three more build-time risks worth knowing—they show how a living register actually lives: R-07 (the LEAN-as-engine spike could pivot the architecture and break parity-by-construction—blocked unless its pre-committed four-part rule passes AND a dated superseding DECISIONS entry is written, because it touches D-021/D-024/Principle 3); R-08 (single-machine hardware failure could destroy the unrepeatable tape, ledger, and learnings—mitigated by the Continuity doctrine of tiered, verified, monthly-drilled backups, currently MITIGATING); and R-09 (market-calendar mis-modeling, found live during the M1 soak on 2026-06-18, where one CME-equity calendar over-closed @ES/@CL/@GC and flapped quality to warn—driving the per-instrument market-model hard rule D-094).

Underneath R-01 sits the testing layer (§B). These are the standing tests, not one-time checks, that catch how a fill-model bug actually manifests: T-01 determinism deltas (same spec + same bars must produce byte-identical trades), T-02 golden-fixture drift (canonical trade-lists must not change silently), T-03 parity-bug contamination (the stop-everything court), T-04 cost-model optimism (costs pessimistic by default until calibration), T-05 AI sidecar leaking into the order path, and T-06 verdict/reflection leakage into behavior. A standing architecture test greps and audits to prove no verdict- or reflection-derived value reaches execution, sizing, filters, or gates.
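T-01's "byte-identical trades" requirement can be checked by hashing a canonical encoding of the trade list and comparing digests across repeated runs. This is a minimal sketch of that idea, not the project's actual test. The trade dictionaries and the function names are hypothetical.

```python
import hashlib
import json
from typing import Callable, List


def trades_digest(trades: List[dict]) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(trades, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(run: Callable[[], List[dict]], repeats: int = 3) -> bool:
    """Run the same backtest several times and require a single distinct digest."""
    digests = {trades_digest(run()) for _ in range(repeats)}
    return len(digests) == 1
```

The same digest doubles as a golden-fixture check in the spirit of T-02: store the digest of a canonical trade list once, and any silent change to the output shows up as a mismatch.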

1 · Each build session (Claude Code)

Touch any register row whose phase is in flight and bump its Last review date. The register is part of the task, not an afterthought.

2 · Weekly (you, the operator)

Run /verify-topstep-rules to re-check R-04 and confirm the topstep-rules.json version. Pair it with the calendar-news check. Note: docs/05 §5 still reads 'monthly re-check,' but D-063 raised the live cadence to WEEKLY; follow D-063.

3 · On any parity event

R-01 and T-03 go back to OPEN regardless of prior status until a written reconciliation report closes them. Log it in SESSION_LOG.md, and in DECISIONS.md if a tolerance or definition changed.

4 · New risk discovered

Append it with the next R-/T- id, a date, and a doc-section anchor. Never delete a retired risk; mark it RETIRED with a date. R-09 is the live example of this in action.

Status legend
OPEN (no mitigation in place) · MITIGATING (control being built) · CONTROLLED (live + verified) · WATCH (controlled but monitored each session) · RETIRED (keep row, add date)
Owner legend
OP = operator (Satya) · CC = Claude Code build sessions · KER = kernel · SENT = SENTINEL · AXM = AXIOM · AI = nightly Claude / Gemma sidecar
Source of record
docs/05 §5 (build-time risks) + docs/05 §4 (A1 adversarial suite) + the Paras Method ring-6 verify discipline
Living working copy
tracking/RISKS.md — update Status and Last review in place; cross-referenced from tracking/STATE.md
Concrete example — R-09 caught live

During the M1 soak on 2026-06-18, the single CME-equity calendar marked Juneteenth (Fri 6/19) as a full holiday and the evening-open-skip rule extended that closure to Sunday. But the market actually reopened Thursday 15:00 PT on the normal daily break. The runtime state machine recovered to SYNCED on real bars (masking the bug), but expToday=0 then flapped quality to 'warn' on live bars. This is the register doing its job: the failure was named (R-09), root-caused, and turned into a hard rule (D-094, per-instrument asset-class-agnostic market model) plus a post-soak calendar-correctness audit—instead of being shrugged off.

The finish line, for context

Definition of fully operational (docs/05 §5): one strategy at G6 trading a Combine through AXIOM with zero violations across 20 sessions, nightly and weekly AI loops running unattended, and the dashboard showing it all—including everything that failed on the way. The register is what keeps the 'everything that failed' honest.

What to stay aware of
  • An unexplained parity delta between FastScreen / rigorous engine / LEAN is STOP-EVERYTHING—never tolerance-wave it; R-01 and T-03 revert to OPEN until a written reconciliation report closes them.
  • Operator-override pressure is the headline human risk: the hard locks, Gemma dialogue, and cooling-off are tested (A1/A3), not trusted—never relax a lock under live P&L stress; the system adapts by saying NO more often, not by you loosening a control.
  • Run /verify-topstep-rules WEEKLY (D-063 supersedes the 'monthly' wording in docs/05 §5) and confirm the topstep-rules.json version; a stale rule value silently makes the compliance engine wrong.
  • Nothing trades before a strategy holds G5 (R-02). If early families die at the null, the platform researching at $0/mo while A5 waits is the budget discipline working—resist polishing AXIOM or the dashboard early.
  • The trial-budget gate (R-05) is enforced in code at B4—proposals are inert without DSR/PBO headroom; do not trust the nightly loop's enthusiasm, trust the budget gate.
  • Backups are only real once restored: the Continuity doctrine (R-08, D-071) requires a monthly drilled restore during build—a never-tested backup is a hope, not a backup.
  • The register is living: touch in-flight rows each session and bump Last review; append new risks with an id, date, and doc anchor; never delete a retired row—mark it RETIRED with a date.
  • Watch the calendar/market-model risk (R-09): until the per-instrument refactor (D-094) lands, a calendar misprediction alone can flap live quality to 'warn'; verify config/cme-holidays.csv per the standing operator action.

Locked decisions & the why

D-021
Three-engine parity court: kernel / TradeStation (generated EasyLanguage) / LEAN (generated QCAlgorithm); an unexplained delta is a stop-everything event.
Why: This is the entire control for R-01 (silent kernel fill-model bug) and T-03 (parity-bug contamination). Because the code that trades is the code that backtests (Principle 3), a single undetected divergence poisons every downstream gate, ledger row, and lesson—so it must outrank schedule. Anchored at docs/01 §6; G4.
D-024
LEAN built from pinned source (lean/VERSION), native Windows, no Docker; DuckDB bars exported as custom data.
Why: Provides the independent third engine (the oracle) that makes the R-01 parity court a court and not a coin-flip—the LEAN repro of Experiment #001 at B6 is the outside check on the kernel. docs/02 §6.
D-063
Topstep rulebook re-check cadence raised from monthly to WEEKLY, folded into the calendar-news check via /verify-topstep-rules; each run appends a dated _verificationLog entry even when nothing changed; rule changes are treated as RiskConfig changes that activate at the next PRE_FLIGHT, never mid-session.
Why: Directly tightens the R-04 (rule drift) control. docs/05 §5 still reads 'monthly'; D-063 supersedes it to weekly because a stale MLL/DLL/consistency value silently makes the compliance engine wrong. Operator instruction, 2026-06-10.
D-005
No LLM in any order path; the AI sidecar can only veto / narrate / resist / reflect.
Why: Underpins the R-06 override-pressure control and risk T-05: the Gemma resistance dialogue can push back on the operator but has zero authority to approve, place, size, or modify. Principles 4–5; docs/04 §3.
D-067
Gemma Structured Verdict Contract v1 locked: schema-constrained five-tier verdict, read-only advisory context only; any verdict→behavior mapping is deterministic versioned config that enters live paths only through G0–G6; GO-class verdicts carry no permissive power.
Why: Closes risk T-06 (verdict leakage into behavior) and reinforces R-06: the sidecar's structured output can resist the operator but is firewalled from compliance-engine inputs, sizing, and gates. Operator handoff + ruling, 2026-06-10.
D-069
External-framework adoption policy: classification ladder with every spike's pass/fail rule written before results are seen; the LEAN-as-engine question is a SPIKE, not a decision—adoption still requires a dated superseding entry because it touches D-021/D-024/Principle 3.
Why: This is the explicit control behind R-07 (LEAN-as-engine spike pivoting the architecture): the pre-committed rule and the superseding-entry requirement stop a tooling spike from silently breaking parity-by-construction. Operator handoff, 2026-06-10.
D-071
Continuity & Portability doctrine locked: tiered backups (nightly verified snapshot to a second physical disk + weekly offline copy + git remote for text), restore drilled monthly during build, machine-migration runbook < half a day, portability rules enforced at code review.
Why: The mitigation for R-08 (single-machine hardware failure / loss). Because Paras is local-only by Principle 10, one PC holds the ledger, journal, learnings, and the unrepeatable tape—so 'a backup that has never been restored is a hope, not a backup.' New CLAUDE.md hard rule. Operator instruction, 2026-06-12.
D-094
Market-model architecture is asset-class-agnostic and per-instrument (hard rule, M2+): session geometry, holiday/early-close calendars, and bar-expectation logic must be per-instrument and data-driven, never hardcoded to CME-futures assumptions.
Why: The fix born directly from risk R-09, found live in the M1 soak (2026-06-18) when one CME-equity calendar over-closed @ES/@CL/@GC and flapped quality to 'warn.' It is the register turning an observed failure into a design law. Operator hard rule, S028 2026-06-18.
Sources: docs/05_BUILD_ROADMAP.md §5 (build-time risk register) · docs/05_BUILD_ROADMAP.md §4 (A1 adversarial test suite) · docs/05_BUILD_ROADMAP.md Track A/B phase tables + 'Definition of fully operational' · tracking/RISKS.md §A (R-01…R-09) + §B (T-01…T-06) + review cadence · tracking/DECISIONS.md (D-005, D-021, D-024, D-063, D-067, D-069, D-071, D-094)
Module 18 of 19 · The Build & Decisions

💾 Continuity & Portability

Paras runs on one personal Windows machine that will eventually die — this doctrine makes that day a half-day inconvenience, not a project-ending loss, by keeping nightly VERIFIED backups, a current migration runbook, an inventoried secret list, and a drilled restore path.

The whole lesson in one breath

Everything Paras has accumulated — the ledger of every trial, the learnings, the specs, the tracking files — lives on ONE personal Windows machine (Principle 10). That machine will fail one day: disk death, Windows corruption, theft, spilled coffee. This doctrine (docs/09, decision D-071) is the survival plan. It rests on four pillars — a nightly VERIFIED backup, a current machine-migration runbook, an inventoried list of every secret, and a regularly DRILLED restore — bound by one unforgiving rule: a backup that has never been restored is a hope, not a backup.

You're running a trading-research system solo, local-only, on your own desktop. That's a deliberate choice (budget-true, security-first, no VPS) — but it concentrates all the risk in one box. docs/09 opens with the blunt truth: 'Paras runs entirely on one personal Windows machine. That machine will eventually fail.' The job of this doctrine is to turn that inevitable event from a catastrophe into a chore. The only thing that ever leaves the machine is the private GitHub remote (code, docs, tracking — no market data, no secrets), which was already accepted at D-060. Everything else is protected locally, on purpose.

There's one insight that shapes the entire design, so learn it first: the startup gap-sync is the migration healer. Because the data plant self-heals any gap from its last checkpoint (docs/02 §1.2), a restored backup that is days stale simply gap-syncs forward to NOW on first start. That means your backups never need to be perfectly fresh — they need to be VERIFIED and RESTORABLE. Freshness is automatic; restorability is the thing you must guarantee. Hold that thought; it's why 'verified' matters more than 'recent' everywhere below.

PART 1 — WHAT WE PROTECT, AND WHY NOT EVERYTHING. The doctrine refuses to back up everything blindly. It sorts artifacts into five tiers by how replaceable they are, because spending effort protecting a thing you can rebuild with a script is wasted effort. Understanding the tiers tells you instantly why a lost ledger is a disaster but a lost LEAN checkout is a shrug.

Tier | Example artifacts | Replaceable? | How it's protected
0: Irreplaceable, small | ops/ledger.duckdb (experiments, trials, gate_results, trades, sessions), axiom-journal.duckdb, learnings.md, reviews/, reflections/, specs/, config/ (incl. topstep-rules.json), tracking/, docs/, .claude/ | NO: this is the project's accumulated honesty, the only true product until a strategy is funded | Git push (text) + nightly verified DB snapshot + weekly offline copy
1: Irreplaceable, large | Tape-recorder Parquet archive (ops/tape/); record-forward only, can never be re-downloaded | NO, but lower criticality (post-edge execution research) | Nightly snapshot while size permits; revisit policy at 100 GB
2: Expensive to rebuild | bars data inside ledger.duckdb (full M1 re-backfill = days of rate-limited paging) | Yes: TradeStation re-backfill, pager is resumable | Rides along inside the Tier-0 DB snapshot (cheap insurance vs. days of re-paging)
3: Freely rebuildable | LEAN checkout, NuGet/npm caches, Ollama models, build outputs, the OS itself | Yes: scripted | NEVER backed up. scripts/setup-machine.ps1 rebuilds them
4: Machine-bound (never copyable) | Windows DPAPI token store (TradeStation refresh token, later TopstepX), GitHub creds, machine certs | Deliberately non-portable: DPAPI encrypts to this machine+user | NOT backed up; re-entered. The §4 secrets inventory IS the recovery path
Why Tier-0 is 'the project's accumulated honesty'

Until a strategy is actually funded, Paras has no product except its own record of what it tried and what it learned. The ledger says which experiments ran, which trials were counted, which gates passed or failed; learnings.md holds the negative results that stop you re-rolling a dead idea. You cannot re-download that — it's the work itself. That's why the experiments / trials / gate_results tables 'never regenerate' and the restore verification checks them for EXACT matches, not just 'present.' Lose them and you lose the whole point of the system (Constitution Principle 1: honest discovery).

PART 2 — THE BACKUP DOCTRINE (3-2-1, adapted local). The classic rule is three copies, two media, one offline. Paras adapts it for a local-only operator. Cloud storage of market data was declined (your preference); the residual fire/theft risk is documented and accepted in §6, with the Tier-0 TEXT corpus already offsite on GitHub. Three layers do the work:

every session

B-1 · Git push

All Tier-0 TEXT (code, docs, specs, tracking, config, learnings) pushed to private GitHub every session close — /paras-handover includes it — plus CI on push. Verification: CI green = a parse-able repo. This is your one offsite copy and your insurance against a total local loss.

nightly, automatic

B-2 · Nightly verified snapshot

A job IN THE PLANT (~18:30 PT, after top-up) CHECKPOINTs the DuckDB files, copies the DBs + ops/tape/ delta + a git bundle to a SECOND PHYSICAL DISK at ParasBackup/<yyyy-MM-dd>/, 30-day retention. Unattended. This is the workhorse.

weekly, you

B-3 · Weekly offline copy

Latest verified snapshot copied to a rotating EXTERNAL SSD kept UNPLUGGED between copies (~5 min, prompted by a PULSE event). Offline = immune to ransomware and power surge. The only human step in the whole backup chain.
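
The 30-day retention on the B-2 snapshot folders can be sketched as a small pruning pass over the dated directories. This is a minimal Python illustration, not the plant's job: expired_snapshots and prune are hypothetical names, and it assumes each snapshot lives in a folder named yyyy-MM-dd, as in the ParasBackup/<yyyy-MM-dd>/ layout described above.

```python
import shutil
from datetime import date, datetime, timedelta
from pathlib import Path

RETENTION_DAYS = 30  # from the B-2 description


def expired_snapshots(backup_root: Path, today: date) -> list[Path]:
    """Return dated snapshot folders older than the retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    expired = []
    for child in sorted(backup_root.iterdir()):
        if not child.is_dir():
            continue
        try:
            stamp = datetime.strptime(child.name, "%Y-%m-%d").date()
        except ValueError:
            continue  # not a dated snapshot folder, leave it alone
        if stamp < cutoff:
            expired.append(child)
    return expired


def prune(backup_root: Path, today: date) -> list[str]:
    """Delete expired snapshot folders and return their names."""
    removed = []
    for folder in expired_snapshots(backup_root, today):
        shutil.rmtree(folder)
        removed.append(folder.name)
    return removed
```

Passing today in as a parameter, rather than reading the clock inside the function, keeps the rule testable, which fits the idea that the backup job is code with tests.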

What 'verified' actually means (this is the heart of it)

The nightly job does NOT just copy files and call it done. It RE-OPENS each copied DB read-only and runs row-count + integrity queries, then writes a manifest.json (file hashes, row counts, duration). Only a snapshot that passes this check updates backupAgeHrs — the value PULSE displays, which turns AMBER past 26 h (PULSE_DEV_SPEC §3.4 / §3 fields: backupAgeHrs > 26 ⇒ amber card, sub-line 'overdue', else 'verified'). So a copy that silently corrupted does NOT reset the clock — PULSE goes amber and tells you the backup is overdue. The system refuses to count an unverified copy as a backup. That is D-071 enforced in code and on your tray.
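
To make 'verified' concrete, here is a minimal sketch of the re-open-and-check step. It is an assumption-heavy illustration: it uses Python's built-in sqlite3 as a stand-in for DuckDB, the function names and manifest keys are hypothetical, and the real job's queries are whatever docs/09 specifies.

```python
import hashlib
import json
import sqlite3
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the copied file in chunks so large DBs do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(db_copy: Path, tables: list[str]) -> dict:
    """Re-open the COPY read-only and run integrity + row-count checks."""
    started = time.monotonic()
    # sqlite3 stands in for DuckDB here; mode=ro guarantees no writes.
    conn = sqlite3.connect(db_copy.resolve().as_uri() + "?mode=ro", uri=True)
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            raise RuntimeError(f"integrity check failed: {status}")
        counts = {
            t: conn.execute(f'SELECT count(*) FROM "{t}"').fetchone()[0]
            for t in tables
        }
    finally:
        conn.close()
    return {
        "file": db_copy.name,
        "sha256": sha256_of(db_copy),
        "row_counts": counts,
        "duration_s": round(time.monotonic() - started, 3),
    }


def write_manifest(snapshot_dir: Path, entries: list[dict]) -> Path:
    """Write manifest.json only after every copy has passed verify_copy."""
    manifest = snapshot_dir / "manifest.json"
    manifest.write_text(json.dumps({"files": entries}, indent=2))
    return manifest
```

The ordering is the point: the manifest is written only after the checks pass, so a corrupted copy raises before anything could reset the backup clock.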

PART 3 — THE RESTORE DRILL (non-negotiable). A backup you've never restored is a guess about the future. The doctrine therefore mandates a drill: MONTHLY during build phases, QUARTERLY once operating — restore the latest snapshot to a scratch folder, open it, run the §5.5 verification queries, and file a TEST_LOG.md evidence row. Crucially, the drill is graded by the auditor (ULTRACODE Law U4), NOT by whoever wrote the backup job — the author never grades their own gate. And the very first restore test is not optional polish: it is an M1 exit criterion (runbook M1.8: 'Nightly backup job + one tested restore').

-- §5.5 Post-restore verification queries (also the restore-drill script):

-- bar coverage advances after gap-sync, counts >= manifest (one row per symbol/resolution):
SELECT symbol, res, count(*) AS bar_count, max(end_utc) AS latest_bar FROM bars GROUP BY symbol, res;

-- the irreplaceable tables must match the manifest EXACTLY (never regenerate):
SELECT count(*) FROM experiments;
SELECT count(*) FROM trials;
SELECT count(*) FROM gate_results;

-- data-quality job over the restore gap -> zero ok=false days post-heal
-- one tape Parquet file from the snapshot opens and row-counts > 0
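
The two comparison rules in those queries (exact match for the irreplaceable tables, at-least for bars) can be sketched as a tiny grader. This is an illustrative Python helper with hypothetical names; it assumes the manifest and the restored DB have each been reduced to a table-name to row-count mapping.

```python
IRREPLACEABLE = ("experiments", "trials", "gate_results")


def check_restore(manifest_counts: dict[str, int],
                  restored_counts: dict[str, int]) -> list[str]:
    """Return a list of failures; an empty list means the drill passes."""
    failures = []
    for table in IRREPLACEABLE:
        want = manifest_counts[table]
        got = restored_counts.get(table)
        if got != want:  # these tables never regenerate, so EXACT match
            failures.append(f"{table}: expected exactly {want}, found {got}")
    # bars may only grow, because gap-sync appends after the restore
    if restored_counts.get("bars", 0) < manifest_counts["bars"]:
        failures.append("bars: fewer rows than the manifest recorded")
    return failures
```

An extra row in experiments is as much a failure as a missing one: either way the restored ledger is not the ledger that was backed up.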

PART 4 — THE SECRETS INVENTORY (the recovery checklist). Tier-4 things — the DPAPI-encrypted tokens — can NEVER be copied; DPAPI binds them to this machine+user by design. So they are not backed up; they are RE-ENTERED. The §4 table is the recovery path: every secret has a row naming where it lives and the exact procedure to re-enter it on a new machine. The standing portability rule is strict: adding a secret anywhere in the system WITHOUT a row here is a review-blocking violation. Secrets live only behind the DPAPI TokenStore abstraction — never in code, config files, or git, ever.

Secret / binding | Where it lives | Re-entry procedure on a new machine
TradeStation API key + secret | DPAPI store ts-client-id.secret + ts-client-secret.secret (master copy in your password manager) | Put id + secret in a temp file -> Sentinel --bootstrap-ts-credentials <file> (stores to DPAPI, deletes the file, verifies against the OAuth endpoint)
TradeStation refresh token | DPAPI store ts-refresh-token.secret (machine-bound, dies with the machine) | Re-run the interactive OAuth login once (Sentinel --auth, to be built); needs the registered redirect URI
GitHub access | gh auth login / credential manager | gh auth login
TopstepX API key (from A2) | Password manager + DPAPI store | Re-enter at AXIOM first run
Databento API key (D-015 audit) | DPAPI store databento-api-key.secret (master copy in password manager) | Pipe key via STDIN ONLY: Sentinel --store-secret databento-api-key (prints only a char-count). Validate with the FREE --databento-check before any paid download
PULSE start-with-Windows | HKCU Run key | scripts/setup-machine.ps1 recreates
Power plan + NTP config | Windows settings | scripts/setup-machine.ps1 applies (docs/02 §1.6)
Secret hygiene the inventory enforces

Two habits from §4 that matter for you specifically. (1) Pipe secrets via STDIN, never argv/chat/journal — the Databento key uses --store-secret reading Console.In and prints only a char-count, so the value never lands in a command line, a chat log, or the activity journal. (2) If a secret WAS exposed (the TradeStation creds were pasted in plaintext in a chat 2026-06-12; the Databento key in chat 2026-06-15), the inventory records 'rotate after the audit' / 'consider regenerating' — an exposed secret is rotated, not trusted. The inventory isn't paperwork; it's the audit trail of every secret's exposure and recovery.
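
Habit (1) is easy to picture in code. The sketch below is a Python illustration of the pattern only, not Sentinel's implementation: read_secret and store_secret are hypothetical names, and the store callback stands in for the DPAPI TokenStore, which has no equivalent in the Python standard library.

```python
import sys


def read_secret(stream=None) -> str:
    """Read one line from stdin (or a supplied stream), never from argv."""
    stream = sys.stdin if stream is None else stream
    secret = stream.readline().rstrip("\r\n")
    if not secret:
        raise ValueError("no secret received on stdin")
    return secret


def store_secret(name: str, stream=None, store=None) -> str:
    """Hand the secret to a store callback and report only its length."""
    secret = read_secret(stream)
    if store is not None:
        store(name, secret)  # hypothetical hook standing in for the TokenStore
    # The returned message is safe to print or log: it holds no secret text.
    return f"stored {name}: {len(secret)} chars"
```

Because the only thing that ever reaches the console is the character count, a copied terminal transcript or activity journal cannot leak the value.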

PART 5 — THE MIGRATION RUNBOOK (target: under half a day, mostly unattended). When the machine dies, you don't improvise — you walk §5. The pre-condition is guaranteed by the doctrine itself: a verified B-2/B-3 backup exists. Then six steps:

1 · Provision (~30–60 min, mostly unattended)

On the new Windows box: install Git (the one manual bootstrap step), git clone the repo from GitHub, run scripts/setup-machine.ps1 — it installs .NET 8 SDK, VS Code, and phase-appropriate extras (Node, Python 3.11, Ollama + models), applies power plan + NTP, registers PULSE autostart. The script IS the machine spec.

2 · Restore data (minutes to ~1 h)

Copy the latest VERIFIED snapshot from the backup disk into ops/ (DBs, tape archive). Time depends on tape size.

3 · Re-enter secrets (~15 min)

Walk the §4 inventory top to bottom — the Tier-4 tokens that couldn't be copied get re-entered here, each by its documented procedure.

4 · Heal (unattended)

Start the plant. Startup gap-sync detects the staleness and backfills checkpoint -> now automatically; PULSE goes blue (syncing) then green. This is the 'gap-sync is the migration healer' insight paying off — you do nothing.

5 · Verify (§5.5)

dotnet test green; ledger row-count + latest-bar queries match the manifest and the healed gap; data-quality job clean for the gap period; PULSE green with all vitals; the backup job runs against the NEW second disk and writes a verified manifest.

6 · Record

learnings.md entry (what broke, how long it really took), a STATE.md note, a risk R-08 review-date bump. If the old machine still exists: wipe the DPAPI tokens (revoke at the provider if it was theft/loss).
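
Step 4 (Heal) rests on one simple idea: split the span from the last checkpoint to now into request windows and page through them. A minimal Python sketch of that windowing follows; gap_windows is a hypothetical name, the page size is an assumption, and the real pager also handles rate limits and advances the checkpoint as it goes.

```python
from datetime import datetime, timedelta


def gap_windows(checkpoint: datetime, now: datetime, page: timedelta):
    """Yield contiguous (start, end) windows covering checkpoint -> now."""
    if page <= timedelta(0):
        raise ValueError("page must be positive")
    start = checkpoint
    while start < now:
        end = min(start + page, now)  # the last window is clipped to 'now'
        yield start, end
        start = end
```

A restore that is three days stale and one that is three hours stale run the same loop; only the number of windows changes, which is why backup freshness is not the thing to worry about.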

PART 6 — THE PORTABILITY RULES (why migration is even possible). The runbook only works because every module obeys six standing design laws from M0 forward, enforced at code review. These are what make the new machine accept your code with a single config line instead of a rewrite:

  • No absolute paths in code — all paths resolve from configuration relative to the repo/data root. A different drive letter on the new machine = one config line, zero code changes.
  • No machine assumptions — no hardcoded hostname, username, core counts, or screen geometry; discover at runtime or read from config.
  • Every prerequisite is scripted — if a task requires installing anything, the SAME change updates scripts/setup-machine.ps1 (winget-based, idempotent).
  • Every secret is inventoried — secrets live only behind the DPAPI TokenStore, each with a §4 row and a re-entry procedure. No secret in code, config, or git, ever.
  • One data root — all mutable state lives under ops/ (+ the DuckDB files), never in %APPDATA%, the registry (sole exception: the PULSE Run key, recreated by the setup script), or scattered folders.
  • Backups are part of the plant, not an operator chore — the nightly job is code with tests; only B-3 (plug in the external disk) involves a human.
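
The first and fifth rules (no absolute paths, one data root) can be sketched as a single resolver that every module would call. This is a Python illustration under stated assumptions: the JSON layout with data_root and paths keys is hypothetical, and load_paths is not a Paras function.

```python
import json
from pathlib import Path


def load_paths(config_file: Path) -> dict[str, Path]:
    """Resolve every configured path relative to the single data root."""
    cfg = json.loads(config_file.read_text())
    root = Path(cfg["data_root"])
    if not root.is_absolute():
        # A relative root is anchored at the config file's own folder.
        root = (config_file.parent / root).resolve()
    resolved = {}
    for name, rel in cfg["paths"].items():
        if Path(rel).is_absolute():
            raise ValueError(f"{name}: entries must be relative to data_root")
        resolved[name] = root / rel
    return resolved
```

With this shape, moving to a machine with a different drive letter means editing data_root in one file; nothing that calls load_paths changes.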
Accepted residual risks — eyes open

The doctrine is honest about what it does NOT cover (§6). (1) Simultaneous loss of ALL local media (house fire / flood / burglary taking the external SSD too): the Tier-0 TEXT corpus survives on GitHub, but the DB ledgers and tape archive would be lost (bars are re-backfillable). You explicitly chose local-only for data. Standing offer: an encrypted cloud copy of the small Tier-0 DBs would close this for ~$0–2/mo — revisit when ledger.duckdb > 1 GB or at first funded account, whichever comes first. (2) Backup disk dies silently: mitigated by per-snapshot verification + PULSE backup-age amber + the drill cadence. (3) DPAPI loss on a Windows reinstall (same machine): same as migration — §4 re-entry; the tokens are the only loss.

Watch-outs — where continuity quietly rots

1) PULSE backup-age card AMBER (> 26 h) means no VERIFIED snapshot landed last night — investigate, don't ignore; an unverified copy does not reset the clock. 2) Skipping the restore drill turns every backup into an untested hope — the drill is the only thing that proves the chain works, and the FIRST one is an M1 exit criterion. 3) Adding a secret without a §4 row, or sneaking in an absolute path / machine assumption, is a review-BLOCKING violation — it silently breaks the next migration. 4) The external SSD (B-3) only protects you if you actually plug it in weekly when PULSE prompts — it's the one human step and the only thing immune to ransomware. 5) An exposed secret (pasted in chat) must be ROTATED, per its inventory note — don't leave it trusted.
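
Watch-out 1 reduces to a single comparison against the time of the last VERIFIED snapshot. A minimal Python sketch, assuming the 26 h threshold quoted above; the function names are hypothetical, and the 'normal' label for the non-amber state is an assumption, since the text specifies only the amber colour and the two sub-lines.

```python
from datetime import datetime

AMBER_AFTER_HOURS = 26  # threshold quoted from PULSE_DEV_SPEC §3.4


def backup_age_hours(last_verified: datetime, now: datetime) -> float:
    """Hours since the last snapshot that PASSED verification."""
    return (now - last_verified).total_seconds() / 3600


def backup_card(last_verified: datetime, now: datetime) -> tuple[str, str]:
    """Return (card state, sub-line) for the backup-age card."""
    if backup_age_hours(last_verified, now) > AMBER_AFTER_HOURS:
        return ("amber", "overdue")
    return ("normal", "verified")  # 'normal' is an assumed label
```

Note that last_verified only moves when verification succeeds, so a night of corrupt copies leaves the timestamp where it was and the card turns amber on its own.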

The doctrine's one rule
A backup that has never been restored is a hope, not a backup.
3-2-1 adapted local
Three copies (git + nightly snapshot + weekly offline), two media, one offline (the unplugged external SSD).
Why freshness doesn't matter
Startup gap-sync heals a stale restore to NOW — backups must be VERIFIED and RESTORABLE, not recent.
The migration healer
Step 4 of the runbook: start the plant, gap-sync backfills checkpoint->now automatically.
Migration target
Under half a day, mostly unattended (provision ~30–60 min, restore minutes–1 h, secrets ~15 min).
Drill cadence
Monthly during build, quarterly in operation; graded by the auditor (U4), evidence row in TEST_LOG.md.
PULSE signal
backupAgeHrs > 26 ⇒ amber 'overdue'; only a VERIFIED snapshot resets it (PULSE_DEV_SPEC §3.4).
Current status (from §4, as of the doc)

TradeStation client id + secret are STORED and VERIFIED — --bootstrap-ts-credentials confirmed the OAuth server recognizes the client. The refresh token is NOT yet obtained — it needs the interactive authorize login (browser) + the registered redirect URI, which is the to-be-built --auth path; until then no live data pull is possible. All values are DPAPI-only and gitignored, never written to the repo. Because the TS creds were once transmitted in plaintext in chat, the note stands to regenerate the secret once interactive auth is wired and a refresh token is in hand.

What to remember

Your whole project lives on one machine that will die — and this doctrine makes that a half-day chore. Protect by tier (Tier-0 is irreplaceable honesty; Tier-3 you just rebuild with a script; Tier-4 secrets you re-enter, never copy). Back up three ways (git every session, verified nightly snapshot to a second disk, weekly offline SSD) where VERIFIED means the job re-opened and integrity-checked the copy. DRILL the restore on a cadence — the first one is an M1 exit gate — because a never-restored backup is just a hope. Keep the §4 secret inventory and the §5 runbook current in the SAME change that introduces any new dependency. Obey the six portability rules so the new machine accepts your code with one config line. That's D-071, and it's a CLAUDE.md hard rule for a reason.

What to stay aware of
  • A backup that has never been restored is a hope, not a backup — the restore drill (monthly during build, quarterly operating) is the only proof the chain works, and the FIRST tested restore is an M1 exit criterion (runbook M1.8).
  • 'Verified' is the load-bearing word: the nightly job re-opens each copied DB read-only and runs integrity + row-count queries before writing a manifest; only a VERIFIED snapshot resets backupAgeHrs. An unverified/corrupt copy does NOT reset the clock.
  • Watch the PULSE backup-age card — amber past 26 h means last night's verified snapshot did not land (PULSE_DEV_SPEC §3.4). Don't ignore amber.
  • Backups never need to be fresh, only restorable — startup gap-sync heals a stale restore forward to now (docs/02 §1.2). Spend your worry on restorability, not recency.
  • Tier-4 secrets (DPAPI tokens) are NEVER copied — they're re-entered from the §4 inventory. Adding any secret without a §4 row is a review-blocking violation.
  • Secrets go in via STDIN only (e.g. --store-secret reads Console.In, prints only a char-count) — never argv, chat, or the journal. An exposed secret (TS creds pasted in chat 2026-06-12; Databento key 2026-06-15) must be ROTATED per its inventory note.
  • Portability is design law from M0: no absolute paths, no machine assumptions, every prerequisite scripted into setup-machine.ps1, one data root under ops/. A new dependency must update the §4 inventory or the setup script in the SAME change.
  • The accepted residual risk is total local-media loss (fire/theft): Tier-0 text survives on GitHub, but DB ledgers + tape would be lost. The standing ~$0–2/mo encrypted-cloud offer revisits at ledger.duckdb > 1 GB or first funded account.
  • Only ONE step in the whole backup chain is yours: plug in the external SSD weekly when PULSE prompts (B-3). Everything else is plant code with tests.

Locked decisions & the why

D-071
Continuity & Portability doctrine locked (docs/09, risk R-08): tiered artifacts by replaceability; nightly VERIFIED snapshot to a second physical disk + weekly offline copy + git remote for text (3-2-1 adapted local); restore drilled monthly during build, quarterly in operation; standing portability rules (no absolute paths, no machine assumptions, scripted prerequisites via setup-machine.ps1, secrets only behind the DPAPI TokenStore + inventoried in §4); machine-migration runbook target < half a day, with startup gap-sync as the data healer. New CLAUDE.md hard rule.
Why: Paras is local-first on one personal device (Principle 10), so a single machine failure could end the project. The doctrine converts that into a recoverable event. The locked decision explicitly DECLINED a cloud copy of market data (operator's local-only preference) and documented + accepted the residual fire/theft risk, to be revisited at ledger > 1 GB or first funded account.
D-060
The project OS lives entirely in-repo under tracking/ + .claude/; the only cloud artifact is the private GitHub remote (code/docs/tracking — no market data, no secrets).
Why: This is the single offsite copy that B-1 (git push) relies on, and the boundary the continuity doctrine respects: TEXT goes to GitHub, but DBs/tape/secrets stay local. It's why a total-local-media loss still leaves the Tier-0 text corpus alive (§6).
D-015
Second-source data audit (Databento/FirstRate) is a B1/M1 exit criterion — and it introduced the Databento API key as an inventoried secret.
Why: It's the reason the Databento key has a row in the §4 inventory with a STDIN-only re-entry procedure (--store-secret) and a cost-gated free --databento-check before any paid download — a worked example of the rule that every new secret lands in the inventory in the same change that introduces it.
D-083
The live TradeStation stream is TAPE + a liveness signal only — it never writes the bars table; REST is the single authoritative bars feeder (startup gap-sync + mid-session self-heal + nightly top-up own bars currency).
Why: This is why the migration runbook's step 4 (Heal) works: bars currency is owned by the resumable REST gap-sync, so a stale restored backup self-heals to NOW on first start regardless of the stream. The 'gap-sync is the migration healer' insight depends on REST being the single bars writer.
Sources: docs/09_CONTINUITY_AND_PORTABILITY.md — header standing rule + intro (one machine, will fail; gap-sync is the migration healer) · docs/09 §1 — Artifact tiers (Tier 0–4 by replaceability; 'the project's accumulated honesty'; key insight: gap-sync heals stale restores) · docs/09 §2 — Backup doctrine (3-2-1 adapted local: B-1 git push / B-2 nightly verified snapshot / B-3 weekly offline copy; the non-negotiable restore drill, graded by U4, first restore is an M1 exit criterion) · docs/09 §3 — Portability rules (no absolute paths, no machine assumptions, scripted prerequisites, every secret inventoried, one data root, backups are plant code) · docs/09 §4 — Secrets & machine-bound inventory (re-entry procedures; STDIN-only secret entry; status note on TradeStation creds verified / refresh token outstanding / chat-exposed secrets to rotate) · docs/09 §5 — Machine-migration runbook (six steps, target < half a day) + §5.5 post-restore verification queries (the restore-drill script) · docs/09 §6 — Accepted residual risks (total local-media loss; silent disk death; DPAPI loss on reinstall; the encrypted-cloud standing offer) · PULSE_DEV_SPEC §3 / §3.4 — backupAgeHrs field: > 26 h ⇒ amber 'overdue', else 'verified'; only a verified snapshot updates it · docs/PARAS_BUILD_RUNBOOK.md M1.8 — 'Nightly backup job + one tested restore' as an M1 hardening exit criterion · tracking/DECISIONS.md — D-071 (continuity doctrine locked), D-060 (OS in-repo / GitHub the only cloud artifact), D-015 (Databento second-source audit secret), D-083 (REST is the single bars feeder — enables gap-sync healing) · CLAUDE.md — Continuity rule (D-071) hard rule: nightly verified backup, current migration runbook, new dependency lands in §4/setup-machine.ps1 in the same change, restore path drilled
Module 19 of 19 · Reference

📖 Glossary (terms of art)

Plain-English then precise definitions of every term of art you will meet running Paras — from Combine and DSR to ring, cell, family, and matched null.

How to use this page

This is the dictionary for the whole system. Other lessons teach the ideas in depth; this page is where you come to look up one word fast. Plain-English first, then the precise definition, with the doc that owns the full story. If a term isn't here, the one-line source is tracking/memory/glossary.md.

Paras has its own vocabulary, and a lot of it overloads words you already know from trading and IT. 'Model' here means a trading strategy, never an AI model. 'Cell' is one counted experiment, not a spreadsheet box. 'Gate' is a pass/fail checkpoint, not a network device. Getting the vocabulary right is the difference between reading a dashboard and being able to act on it — so the terms below are grounded strictly in the specs (docs/00–06) and the in-repo glossary card. Nothing here is invented.

The three apps + the kernel — start here

PULSE = the always-on tray heartbeat. SENTINEL = the research laboratory that kills bad strategies cheaply. AXIOM = the live-execution fortress. The kernel = the one pure C# library all three share, so the code that trades is the code that backtests. Everything else in this glossary hangs off these four.

Paras (Parasmani)
The whole system — PULSE + SENTINEL + AXIOM on one shared C# kernel. Named for the touchstone that reveals which metals were gold all along; most aren't, and killing bad strategies cheaply is the point (docs/00 header).
PULSE
The always-on Windows tray monitor — the data plant's heartbeat. A 4-state icon plus a read-only flyout (commentary, per-symbol health, plumbing vitals). Its own process, starts with Windows (docs/02 §1.5).
SENTINEL
The research laboratory: data pipeline, two-speed backtest engine, G0–G6 deflation gates, 3-engine parity court, nightly Claude research loop, WPF dashboard. Its output is survivors after deflation, never 'best backtest' (docs/00 §1, docs/02).
AXIOM
The execution fortress: hosts G6-promoted strategies on live data and routes orders through a deterministic Topstep compliance engine. A local Ollama AI sidecar may only veto / narrate / resist / reflect — never touch an order (docs/00 §1, docs/03).
kernel
Sponaitech.Kernel — the single pure C# implementation of all trading logic. SENTINEL runs it in backtest mode, AXIOM in live mode. Parity by construction (Constitution Principle 3; docs/01).

The next two ideas are the spine of how Paras organizes research: the hierarchy that classifies a trade idea (Edge Class → Family → Model → Cell), and the experiment_cell coordinate that pins down exactly what got tested. Get these two right and the dashboard reads like a sentence.

Term | Plain English | Precise meaning | Is it a budget?
Edge Class | WHY a trade makes money (the genre) | The top level: groups families by the reason a trade works (Reversal/Regime-Change, Exhaustion-Reversal, Trend-Continuation, Mean-Reversion, Relative-Value). Used in the final G6 correlation check so you don't run five copies of one bet. | No, diversification only
Family | Which allowance pool the idea spends from | The accounting unit. You promise up front 'this family may run at most N cells.' Examples: divergence, reversal, continuation, disconnect, special, filter, xinstrument. | YES, this is the budget
Model | The exact recipe (one written strategy) | One spec. Must state a one-sentence reason it works (the G0 mechanism) or it isn't allowed in. Example ID: REV-SELL-BIGTF-CORR. | No
Cell (experiment_cell) | The one version that actually got tested | The formal coordinate (spec_id@version, params, symbol, timeframe, session, regime_filter): 'right strategy × right ticker × right session.' One cell = one backtest = one logbook row = one counted trial. | It IS the spend
Tag | A sticker for sorting, not a level | Attributes you slice by (dir=short, MTF(15/30/60), role=entry). Never changes the ladder or the budget. | No
The gotcha that exists on purpose

Family is just your filing label; Edge Class is the true reason. They sometimes disagree — a 'divergence' family model and a 'special' family model can both really be betting that a stretched market snaps back (Mean-Reversion). If you only read family labels you THINK you're diversified while you've piled into one bet. The Edge Class rung exists precisely to catch this at G6 (Edge_Class_Hierarchy_Explained §9).
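
The gotcha is easy to see once you count the same book two ways. The Python sketch below is purely illustrative: the model IDs and their family and edge-class assignments are made up for the example, and concentration is a hypothetical helper, not part of the G6 check itself.

```python
from collections import Counter

# Hypothetical book: three models filed under three different families.
BOOK = [
    {"id": "EXAMPLE-A", "family": "divergence", "edge_class": "Mean-Reversion"},
    {"id": "EXAMPLE-B", "family": "special", "edge_class": "Mean-Reversion"},
    {"id": "EXAMPLE-C", "family": "continuation", "edge_class": "Trend-Continuation"},
]


def concentration(models: list[dict]) -> dict[str, Counter]:
    """Count the same models by filing label and by true reason."""
    return {
        "by_family": Counter(m["family"] for m in models),
        "by_edge_class": Counter(m["edge_class"] for m in models),
    }


view = concentration(BOOK)
```

Read by family, the book holds three different labels; read by edge class, two of the three models are the same Mean-Reversion bet.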

The gates are the seven checkpoints (G0–G6) every model crosses in order, like belt tests — you cannot skip ahead. They get progressively harder and more expensive, so the cheap tests kill bad ideas before you spend on the expensive ones. Here is each gate, what it asks, and what happens on failure.

Gate | Question | How it's judged | On fail
G0 | Is the hypothesis pre-registered? | A spec YAML with a stated one-sentence mechanism + trial budget exists before any run. No mechanism, no run (pure pattern mining is banned). | No run
G1 | Any pulse at all? | FastScreen vs the random-entry null (same trade count, random in-session timing, identical exits, 100 reps); must beat the null p50 and clear ≥ +2 ticks net/trade. | Record, done
G2 | Survives realism? | RigorousEngine with M1 intrabar resolution, pessimistic fills, calibrated costs. | Log the mechanism in learnings.md; execution-realism findings are gold
G3 | Survives deflation? | Validation service: DSR > 0.95 given the family's trial count; PBO < 0.20 via CSCV. | Family hurdle rises; the idea retires unless a NEW hypothesis emerges
G4 | Do three engines agree? | LEAN oracle + generated EasyLanguage vs the kernel, reconciled within docs/01 §6 tolerances. | STOP EVERYTHING until root-caused; an unexplained delta contaminates all prior results
G5 | Walk-forward + holdout? | TradeStation walk-forward as independent confirmation; the frozen holdout months are touched exactly once. | Retire the cell
G6 | Live-ready? | Promotion review (operator + Claude weekly report) including the portfolio correlation check; promote for uncorrelated daily P&L, never leaderboard rank. | Stay in the G5 pool
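
The G3 phrase 'given the family's trial count' is the part worth seeing in numbers. The sketch below follows the published Bailey and López de Prado Deflated Sharpe Ratio formula as a Python illustration; it is not the Paras Validation service, whose inputs and conventions may differ. Assumptions: sr is a per-period (not annualized) Sharpe, kurt is raw kurtosis (3 for a normal distribution), sharpe_std is the standard deviation of Sharpe estimates across the family's trials, and a single trial is treated as having no selection hurdle.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Expected best Sharpe among n_trials strategies with no real skill."""
    if n_trials < 2:
        return 0.0  # assumption: one trial means no selection effect
    z = NormalDist().inv_cdf
    return sharpe_std * (
        (1 - EULER_GAMMA) * z(1 - 1 / n_trials)
        + EULER_GAMMA * z(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe(sr: float, n_obs: int, skew: float, kurt: float,
                    n_trials: int, sharpe_std: float) -> float:
    """Probability that the observed Sharpe beats the luck-adjusted hurdle."""
    hurdle = expected_max_sharpe(n_trials, sharpe_std)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - hurdle) * math.sqrt(n_obs - 1) / denom)
```

Holding the strategy fixed and raising n_trials pushes the hurdle up and the score down, which is the arithmetic behind 'fewest honest cells wins'.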
DSR (Deflated Sharpe Ratio)
A luck-adjusted score. A Sharpe Ratio measures reward-per-risk; the Deflated version (López de Prado / Bailey) RAISES the passing bar the more cells you tested in a family, because more tries produce more accidental winners. G3 requires DSR > 0.95 (docs/02 §3, §5).
PBO (via CSCV)
Probability of Backtest Overfitting — the chance your 'best' result is just curve-fit luck that won't repeat. Computed via Combinatorially-Symmetric Cross-Validation. Lower is better; G3 requires PBO < 0.20 (docs/02 §3, §5).
random-entry null / matched null
The G1 baseline: a control strategy with the SAME trade count, random in-session timing, and identical exits, run 100 reps. A real strategy must beat the null's median (p50). It answers 'is this better than randomly poking the same market the same number of times?' (docs/02 §3).
trial budget
A number you pre-register in each spec — trial_budget: {family, max_cells} — BEFORE testing. Every run is a counted trial; proposals only move proposed→queued if the family still has headroom. Promising first is pre-registration (Constitution Principle 2; docs/01 §3, docs/02 §7).
mechanism
The one-sentence reason a model should work, stated before any run. G0 requires it; without it the spec is rejected as pure pattern mining (D-031; docs/02 §3).
walk-forward (WFO)
Testing on rolling out-of-sample windows so the strategy is always judged on data it was not fitted to. At G5, TradeStation runs this as an independent confirmation engine (docs/02 §3).
holdout
Frozen out-of-sample months that are touched exactly once, at G5. Rendered visibly locked on the dashboard so you can't peek (D-032; docs/02 §3, §8).
FastScreen
FastScreenEngine: an array-based single pass over M5 bars. Fast but approximate (a bar-extreme touch counts as a fill at the order price). Ranks thousands of cells per night, is always flagged approximate=true, and never feeds gates beyond G1 (docs/01 §4.1).
Rigorous / RigorousEngine
Event-driven bar replay with the full order lifecycle, intrabar resolution, pessimistic fill + cost models, and an order-event log. It is the SAME state machine AXIOM runs live against a venue adapter — that's how parity is guaranteed (docs/01 §4.2–4.3).
two-speed backtest
The pairing of FastScreen (broad, cheap, approximate) and Rigorous (narrow, expensive, faithful): screen thousands cheaply at G1, then prove the survivors rigorously at G2+ (docs/01 §4, docs/02 §3).
LIBB / intrabar resolution
Look-Inside-Bar Back-testing equivalent: each M5 bar expands into its M1 children to decide which bracket side (stop vs target) hit first. If M1 is missing, the pessimistic rule 'stop fills before target' applies and a data-quality row is logged (docs/01 §4.2).
parity / parity court
Parity = the trading code and the backtest code are literally the same kernel (Principle 3). The parity court (G4) is the three-engine audit — kernel vs TradeStation EasyLanguage vs LEAN — reconciled by ParityDiffer against fixed tolerances. An unexplained delta is a build-stopping event that contaminates all prior results (docs/01 §6).
data_version
A hash recorded per experiment; cached and uncached runs must produce identical trade lists. Any delta is a stop-everything bug (docs/02 §4).
IContextGate
WHEN NOT to trade — admits or blocks a whole day/session. Example: the in-play gate. Embodies 'suppression beats optimization' (docs/01 §2).
ITrigger
The signal on bar close — fires the candidate entry (e.g. an opening-range breakout). (docs/01 §2)
IFilter
Allow or block a signal the trigger produced — a yes/no veto, not a standalone entry (docs/01 §2).
IExecutionPolicy
The bracket plan: entry / stop / target / quantity / time-in-force. Every strategy is a pure, individually-testable composition of these four (docs/01 §2).
in-play gate
InPlayGate(orwMult, gapAtrMin, warmup) — an IContextGate that admits trading only on 'in-play' days (enough opening-range width or gap-vs-ATR). 'Don't trade a dead day' (Principle 8; docs/01 §2).
ORB
Opening Range Breakout — the seed trigger family (OrBreakoutTrigger, OrFadeTrigger) firing off the first N minutes' range; the orb-inplay spec is the canonical example (docs/01 §2–3).
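
The four roles above (context gate, trigger, filter, execution policy) compose in a fixed order, and a small sketch makes that order visible. This is a Python illustration only: the kernel is C#, and every name, field and threshold here is hypothetical rather than the real interface signatures.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Bracket:
    entry: float
    stop: float
    target: float
    qty: int


@dataclass(frozen=True)
class Strategy:
    context_gate: Callable[[dict], bool]      # IContextGate: trade this day at all?
    trigger: Callable[[dict], Optional[str]]  # ITrigger: side on bar close, or None
    signal_filter: Callable[[dict, str], bool]  # IFilter: allow or veto the signal
    execution: Callable[[dict, str], Bracket]   # IExecutionPolicy: the bracket plan

    def on_bar_close(self, day: dict, bar: dict) -> Optional[Bracket]:
        if not self.context_gate(day):
            return None  # suppression comes first: a dead day yields nothing
        side = self.trigger(bar)
        if side is None or not self.signal_filter(bar, side):
            return None
        return self.execution(bar, side)


# Toy opening-range breakout built from four plain functions.
orb = Strategy(
    context_gate=lambda day: day["or_width"] >= day["min_or_width"],
    trigger=lambda bar: "long" if bar["close"] > bar["or_high"] else None,
    signal_filter=lambda bar, side: bar["volume"] > 0,
    execution=lambda bar, side: Bracket(
        entry=bar["close"],
        stop=bar["or_low"],
        target=bar["close"] + (bar["close"] - bar["or_low"]),
        qty=1,
    ),
)
```

Because each part is a pure function of its inputs, each can be tested alone, and the same composition can be driven by replayed bars or by live ones.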

Two words in this codebase overload each other dangerously: 'ring' and 'gate'. Rings are how the BUILDER works (the six-step method for writing code). Gates are how a STRATEGY is judged (the seven-step deflation pipeline). They are unrelated ladders — keep them separate in your head.

Ring | What it requires to exit | Typical auditor
① DESIGN | A UI prototype on mock data (per the design brief) OR a public interface + data contract, written & reviewed | design-reviewer
② DATA | Schemas/migrations designed AND tested (constraints, idempotency, fixtures) before engine code touches them | data-plant-engineer
③ CONCURRENCY PLAN | Half-page per module: threads/processes, named shared state, failure modes, isolation justified | design-reviewer / module owner
④ BUILD | Code + unit/property tests written together, green locally | module agent (kernel-engineer, …)
⑤ WIRE | Real data replaces the mock contracts via module swap (not a screen rebuild); integration tests green | module agent
⑥ VERIFY | Module test gate green, milestone ceremony held, ops/learnings.md entry written | test-sentinel → phase-gatekeeper
Combine
Topstep's evaluation account — the test you pay for and must pass before you get funded. Paras automation runs ONLY on Combine and Express Funded accounts, hard-blocked on Live Funded (D-041; docs/00 §3, docs/05). A5 — the first $49 Combine — is the system's first live milestone.
XFA (Express Funded Account)
Topstep's funded stage after passing the Combine — where actual payouts come from (90/10 split, winning-day requirements). Treated as a separate rule regime from the Combine (Hypothesis_Codex; docs/03).
MLL (Trailing Maximum Loss Limit)
The trailing drawdown floor on an account. Combine = intraday-trailing, locks at start+$100; XFA = EOD-trailing, locks at $0; after the first payout the floor is $0 permanently. AXIOM denies any entry whose stop-out would cross it (docs/03 §3).
DLL (Daily Loss Limit)
A per-day loss cap. Paras uses a soft gate at 80% of the limit — it starts suppressing before the hard wall (docs/03 §3).
consistency rule
Topstep's requirement that your best single day be ≤ 50% of the profit target — no one lucky day can carry you. A moving target as profit accumulates (docs/03 §3, Consistent_Revenue_Dossier).
pod / multi-account model
AXIOM is multi-account from day one: each account gets its OWN compliance-engine instance (independent MLL/DLL/consistency state). An allocation map assigns each cell to one account; seats are added one at a time, profit-funded, each uncorrelated with the existing book (D-043; docs/03 §3.2).
edge
A real, repeatable reason you make money — your advantage. The entire system exists to tell a real edge from a lucky-looking one (Edge_Class_Hierarchy §5).
the tape recorder
A record-forward archive of the live quote/trade stream during sessions (DuckDB/Parquet). Sub-minute history can only be recorded, never backfilled — and it's for execution research AFTER an edge exists, never signal-mining below M5 (docs/02 §1.2, §1.5).
the greed-tax table
The blowup-autopsy figures shown to Gemma in the override dialogue (e.g. median outcome $0 vs $1,345 at identical edge): the priced cost of discretionary oversizing, used as friction to delay the operator until the close (docs/03 §3, §5.3).
sidecar
The local Ollama AI inside AXIOM. It can only veto / narrate / resist / reflect — never approve, place, size, or modify an order. Different from a trading Model (D-050; docs/04).
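The "can only say no" property can be made structural rather than behavioural. In the sketch below (names are hypothetical, not AXIOM's API) the sidecar's vocabulary simply has no approve, place, size, or modify action, and the decision function can turn a yes into a no but never the reverse.

```python
from enum import Enum


class SidecarAction(Enum):
    VETO = "veto"
    NARRATE = "narrate"
    RESIST = "resist"
    REFLECT = "reflect"
    # Deliberately no APPROVE, PLACE, SIZE, or MODIFY member.


def final_decision(compliance_allows: bool,
                   sidecar_actions: list[SidecarAction]) -> bool:
    """Combine the compliance verdict with the sidecar's output.

    The sidecar can only remove permission; it can never grant it.
    """
    if not compliance_allows:
        return False
    return SidecarAction.VETO not in sidecar_actions


print(final_decision(True, [SidecarAction.NARRATE]))   # True
print(final_decision(True, [SidecarAction.VETO]))      # False
print(final_decision(False, [SidecarAction.NARRATE]))  # False
```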
If you remember only three things

1) 'Model' = a trading strategy; 'AI Model' = an LLM that can only say no.
2) A 'cell' is the unit of spend: every cell raises the DSR hurdle, so fewest-honest-cells wins.
3) 'Ring' is how the builder works (6 steps); 'gate' is how a strategy is judged (G0–G6). Mix those two up and nothing on the dashboard makes sense.

What to stay aware of
  • Word overload is the #1 trap: 'Model' = trading strategy (never an AI model); 'cell' = one counted experiment (never a spreadsheet box); 'gate' = a G0–G6 strategy checkpoint while 'ring' = a step in the six-ring build method. Keep these pairs separate.
  • Family is the budget unit, Edge Class is the truth — they can disagree, and the G6 correlation check exists precisely to catch a portfolio that looks diversified by family label but is really one Edge Class bet (Edge_Class_Hierarchy §9).
  • Every cell you add raises the DSR hurdle for the whole family — 'fewest honest cells' is a survival strategy, not pedantry (docs/02 §3, Principle 2).
  • FastScreen results are always approximate=true and never feed gates beyond G1 — never quote a FastScreen number as a proven result (docs/01 §4.1).
  • A G4 parity delta is a STOP-EVERYTHING event — an unexplained engine disagreement contaminates ALL prior results, not just the current cell (docs/01 §6).
  • Topstep terms (MLL/DLL/consistency/Combine/XFA) describe values that live in config/topstep-rules.json and are never hard-coded as constants, because the live rulebook changes and the config must be re-verified (Principle 9; D-040).
  • This glossary is grounded only in docs/00–06 and the in-repo glossary card; if a term you need isn't here, the canonical one-liner is in tracking/memory/glossary.md, and the owning doc § is the full story.
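Why every added cell raises the DSR hurdle can be shown numerically. The sketch below uses the expected-maximum-Sharpe approximation from the Deflated Sharpe Ratio literature (Bailey and López de Prado), which treats trials as independent and assumes a true Sharpe of zero. Whether the Paras validation service uses exactly this form is an assumption. The function name and the default `sharpe_std` are hypothetical.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials: int, sharpe_std: float = 1.0) -> float:
    """Expected best Sharpe among n_trials strategies that have no edge.

    A candidate must beat this number before its Sharpe means anything,
    so the value acts as the hurdle that grows with the trial count.
    """
    if n_trials < 2:
        raise ValueError("need at least 2 trials for this approximation")
    z = NormalDist().inv_cdf
    return sharpe_std * (
        (1 - EULER_GAMMA) * z(1 - 1 / n_trials)
        + EULER_GAMMA * z(1 - 1 / (n_trials * math.e))
    )


for n in (2, 10, 50, 200):
    print(n, round(expected_max_sharpe(n), 2))
```

The printed hurdle rises monotonically with the number of trials, which is the arithmetic behind "fewest honest cells".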

Locked decisions & the why

D-030
Gate ladder G0→G6 with fail-routing; every run is a counted trial; the budget gate is code, not Claude.
Why: Defines the seven terms (G0–G6) at the heart of this glossary and enforces that counting trials is automatic, not a judgment call (docs/02 §3; Constitution Principle 2).
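"The budget gate is code" can be read as: a run is charged against its family's budget before it executes, and the charge itself refuses once the budget is spent. A minimal sketch, with hypothetical class and method names and an illustrative budget of three trials:

```python
class BudgetExhausted(Exception):
    """Raised when a family has no trials left to spend."""


class TrialBudget:
    def __init__(self, family: str, max_trials: int) -> None:
        self.family = family
        self.max_trials = max_trials
        self.used = 0

    def charge(self, cell_id: str) -> int:
        """Count one run before it executes; return the trials remaining."""
        if self.used >= self.max_trials:
            raise BudgetExhausted(
                f"{self.family}: budget of {self.max_trials} spent, "
                f"refusing {cell_id}"
            )
        self.used += 1
        return self.max_trials - self.used


budget = TrialBudget("example-family", max_trials=3)
for cell in ("cell-1", "cell-2", "cell-3", "cell-4"):
    try:
        print(cell, "remaining:", budget.charge(cell))
    except BudgetExhausted as err:
        print("blocked:", err)
```

Because the count is incremented inside `charge`, there is no code path in this sketch that runs a cell without counting it.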
D-031
G0 requires a stated one-sentence mechanism — no mechanism, no run; pure pattern mining is banned.
Why: Anchors the 'mechanism' term and the honest-discovery principle: a strategy with no stated reason to work is rejected before it can consume any trial budget (docs/02 §3).
D-032
Holdout months are frozen and opened exactly once, at G5.
Why: Defines 'holdout' precisely and prevents the operator from peeking at out-of-sample data — the dashboard renders these cells visibly locked (docs/02 §3, §8).
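The "opened exactly once, at G5" rule maps naturally onto a small guard object. The sketch below is illustrative only: the class name, the gate label string, and the month values are hypothetical.

```python
class HoldoutViolation(Exception):
    """Raised on any attempt to read the holdout outside the one allowed use."""


class Holdout:
    def __init__(self, months: tuple[str, ...]) -> None:
        self._months = tuple(months)  # frozen at construction
        self._opened = False

    @property
    def locked(self) -> bool:
        return not self._opened

    def open(self, gate: str) -> tuple[str, ...]:
        """Release the holdout months once, and only to gate G5."""
        if gate != "G5":
            raise HoldoutViolation(f"holdout is closed to {gate}")
        if self._opened:
            raise HoldoutViolation("holdout was already opened once")
        self._opened = True
        return self._months


holdout = Holdout(("2024-11", "2024-12"))
print(holdout.locked)      # True
print(holdout.open("G5"))  # ('2024-11', '2024-12')
print(holdout.locked)      # False
```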
D-041
Automation only on Combine / Express Funded; hard-blocked on Live Funded; no VPS/VPN; everything on the personal machine.
Why: Bounds where the terms Combine and XFA apply — Paras never auto-trades a Live Funded account, so the live-money vocabulary is deliberately scoped (docs/00 §3, Principle 10).
D-043
AXIOM is multi-account ('pod') from day one; per-account compliance instance; new seats get uncorrelated strategies (G6 check); profit-funded, one at a time.
Why: Defines 'pod' and ties it to the Edge Class / G6 correlation check — each seat needs an uncorrelated edge, which is why the Edge Class rung exists (docs/03 §3.2).
D-050
Two AIs, two clocks: Claude Code (research-time) writes and proposes; the local Ollama sidecar (runtime) can only say no.
Why: Disambiguates 'AI Model' / 'sidecar' from a trading 'Model' — the runtime AI is a veto filter, never an order manager (docs/04; Constitution Principles 4–5).
Sources:
  • tracking/memory/glossary.md: the in-repo one-line glossary card (all terms)
  • tracking/memory/INDEX.md: memory routing table (term-of-art routing row)
  • docs/00_MASTER_PLAN.md §1, §3: system overview, Combine/Live-Funded scope
  • docs/01_KERNEL §2, §4.1–4.3, §6: component interfaces, FastScreen/Rigorous, LIBB, parity court
  • docs/02_SENTINEL_SPEC.md §3, §4, §5, §7, §8: gate pipeline G0–G6, runner, validation service (DSR/PBO), trial budgets, holdout
  • docs/03_AXIOM_SPEC.md §3: MLL/DLL/consistency, pod model, greed-tax table
  • docs/06_ULTRACODE_EXECUTION_METHODOLOGY.md §2.1: the six rings
  • docs/education/Edge_Class_Hierarchy_Explained.html §4–§10: Edge Class/Family/Model/Cell/Tag, the family-hides-the-edge gotcha
  • tracking/DECISIONS.md: D-030, D-031, D-032, D-041, D-043, D-050