A backtesting research platform for three systematic trading strategies sharing a common metrics spine: FTR (Failed To Return), OB-Pullback Trend, and ORB (Opening Range Breakout).
Stack: Python, YAML configs, argparse CLI, stdlib-only CSV ledger for logged runs.
Status: Active research.
Source: private.
The story
The three strategies — supply/demand zones with a failed-to-return confirmation, order blocks with an EMA trend filter, and opening-range breakouts — are implemented as parallel entry points (backtest.py, backtest_ob.py, backtest_orb.py) that plug into a shared metrics and simulation spine. What matters for this project isn’t any single strategy; it’s the scaffolding.
The primary metric is confidence-adjusted expectancy: raw expectancy minus a two-standard-error penalty (2σ/√n), so a promising-looking edge on a thin sample doesn’t get to pretend it’s robust. There’s an R:R-scaled minimum-trade gate that raises the evidence bar when the payoff is skewed. Walk-forward validation is a gate rather than a chart — a run that fails walk-forward gets an overfit flag attached to its ledger entry. Higher-timeframe bias (e.g. a 1h EMA trend) is applied without look-ahead, via an explicit index shift at merge time. The data snapshot is MD5-hashed and attached to the run record, so a result can be traced back to the exact bars that produced it.
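The metric and the gate are small enough to sketch in stdlib Python. This is an illustrative reconstruction from the description above, not the project's code; the function names and the base-30 trade floor are assumptions.

```python
import math

def confidence_adjusted_expectancy(returns_r: list[float]) -> float:
    """Raw expectancy in R, penalised by two standard errors (2σ/√n)."""
    n = len(returns_r)
    if n < 2:
        return float("-inf")  # one trade is not evidence
    mean = sum(returns_r) / n
    var = sum((r - mean) ** 2 for r in returns_r) / (n - 1)  # sample variance
    return mean - 2.0 * math.sqrt(var) / math.sqrt(n)

def min_trades_required(rr: float, base: int = 30) -> int:
    """R:R-scaled evidence gate: the more skewed the payoff, the rarer
    the wins, and the more trades it takes before the mean is trustworthy.
    The base-30 floor is a placeholder, not the project's value."""
    return int(base * max(1.0, rr))
```

The penalty shrinks as the sample grows, so the same raw edge scores higher with more trades behind it — which is the whole point of the adjustment.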
Configs are YAML, resolved through a named-then-scratch-then-explicit-path lookup chain: configs/ for named configs, configs-scratch/ for scratch experiments, and a --config flag for an explicit path. Runs are logged to a stdlib-only CSV ledger — no pandas, no database — so the logging survives independently of the research stack. An additional researcher.md defines an autonomous parameter-optimisation workflow: a METRIC: <float> stdout contract and a gitignored .lab/ for experiment state.
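A minimal sketch of that lookup chain, assuming a .yaml suffix convention; the function name is hypothetical and only the directory names come from the text.

```python
from pathlib import Path

def resolve_config(spec: str) -> Path:
    """Resolve a config by name from configs/, then configs-scratch/,
    then fall back to treating the --config value as an explicit path."""
    for root in (Path("configs"), Path("configs-scratch")):
        candidate = root / f"{spec}.yaml"
        if candidate.exists():
            return candidate
    explicit = Path(spec)
    if explicit.exists():
        return explicit
    raise FileNotFoundError(
        f"no config {spec!r} in configs/, configs-scratch/, or as a path"
    )
```

The ordering matters: a named config in configs/ always shadows a scratch copy, so promoting an experiment is just moving the file up one directory.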
The repo is young. The git history is a compressed sprint over two days in April 2026, and there are no tests. I’d call this a research-platform build — the scaffolding is there, the strategies themselves are early.
Architecture in one breath
data_fetch → zones / OB / ORB detectors → backtest (shared spine: metrics, exits, slippage) → CSV ledger with MD5 snapshot hash → run.py / run_multi.py entry points.
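The ledger end of that pipeline fits in a few stdlib lines. Function names and the row schema here are hypothetical; only the CSV-plus-MD5 design comes from the description.

```python
import csv
import hashlib
from pathlib import Path

def snapshot_md5(data_path: Path) -> str:
    """MD5 of the cached data file, so a run record pins the exact bars."""
    return hashlib.md5(data_path.read_bytes()).hexdigest()

def log_run(ledger: Path, row: dict) -> None:
    """Append one run to the CSV ledger — stdlib csv only, no pandas."""
    is_new = not ledger.exists()
    with ledger.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Hashing the snapshot rather than recording a filename means a regenerated or renamed data file can't silently masquerade as the bars that produced a logged result.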
Proof points
- 2,708 LOC across 14 Python files.
- Three strategy variants on one shared spine.
- 56 YAML configs (14 / 15 / 27 across the three strategies).
- 65 logged runs across 17 cached instruments.
- 7 commits over 2026-04-09 → 2026-04-10 (research sprint).
- Confidence-adjusted expectancy as primary metric; R:R-scaled min-trade gate; MD5 snapshot hashing.
What this proves
- Quant Engineering — confidence-adjusted primary metric, walk-forward as a gate, no-lookahead HTF filtering, explicit overfit flagging.
- Python Services & Data Pipelines — linear data flow, stdlib-only ledger decoupled from research stack, YAML config resolution, CLI ergonomics.
Decisions worth a deeper read
- Why I publish negative results — walk-forward-as-gate is the same stance in a different register.
- Why stdlib over pandas for the scanner core — the stdlib-only CSV ledger here is the same habit.