Equity Research & Backtesting
arb_bot/research/ is a fully isolated research and
backtesting subsystem. It tests, compares, persists, reports, and reviews
equity portfolio strategies before any live execution.
⚠️ Simulation only. The research subsystem never places real orders, modifies live trades, touches go-live state, or sends broker orders. It reads market data read-only and writes only to research_* tables. Isolation is enforced by tests/research/test_import_isolation.py and a place_order grep guard in tests/research/test_provider_safety.py.
Pipeline
Read-only provider (yfinance live / Dhan prod / fake tests) → DB price cache →
point-in-time universe (niftyindices.com CSVs, effective-dated) →
ResearchStrategy (signals → rank → select) → per-stock rupee
targets K = capital/top_n (fixed|dynamic, with sector cap) —
dynamic cash forces K = portfolio_value/top_n × equity_weight
so the equity sleeve scales with the growing book and freed/idle cash is
redeployed equally across slots (winners still run under drift) →
drift/rebalance trades with costs & FIFO tax → leftover cash swept into the
Nifty Liquid Overnight Fund, then redeemed only when later buys or tax payments
need cash → daily three-level marks
(gross / net-of-costs / net-of-tax) → metrics (CAGR, XIRR, Sharpe, Sortino,
drawdown) → persisted by run_id for CLI/UI review.
Each daily row also records per-level returns and running drawdowns
(daily_return_*, drawdown_*), benchmark/alpha,
turnover, and the running cost/tax ledger
(transaction_cost / tax_accrued / tax_paid).
Migration db/005_equity_aggregate.sql adds the allocation columns
the UI reads to separate equity / liquid-fund / cash —
equity_market_value, liquid_fund_value,
cash_balance (previously bundled into cash_weight).
Equity BUY/SELL quantities are rounded down to whole shares; only liquid-fund
units are fractional.
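The sizing rules above can be illustrated with a minimal sketch. The function names and the example prices are hypothetical and not taken from the engine; only the two formulas for K and the round-down rule come from the text.

```python
import math


def target_per_stock(initial_capital, portfolio_value, top_n,
                     sizing="fixed", equity_weight=1.0):
    """Per-stock rupee target K (hypothetical helper, not the engine's API).

    fixed:   K = initial_capital / top_n
    dynamic: K = portfolio_value / top_n * equity_weight
    """
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    if sizing == "fixed":
        return initial_capital / top_n
    if sizing == "dynamic":
        return portfolio_value / top_n * equity_weight
    raise ValueError(f"unknown sizing mode: {sizing}")


def whole_share_qty(target_rupees, price):
    """Equity quantities are rounded down to whole shares."""
    return math.floor(target_rupees / price)


# Book has grown from 10,00,000 to 12,50,000; regime equity_weight is 0.5.
k_fixed = target_per_stock(1_000_000, 1_250_000, 20)
k_dynamic = target_per_stock(1_000_000, 1_250_000, 20, "dynamic", 0.5)
qty = whole_share_qty(k_fixed, 1234.5)
```

With fixed sizing K stays at the initial-capital level, while dynamic sizing follows the current portfolio value scaled by the regime weight.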
Key config (RESEARCH_*)
| Key | Default | Meaning |
|---|---|---|
| RESEARCH_DATA_PROVIDER | yfinance | yfinance / dhan / broker / fake |
| RESEARCH_DEFAULT_UNIVERSE | nifty500 | index universe |
| RESEARCH_UNIVERSE_MODE | point_in_time | point_in_time / current |
| RESEARCH_TOP_N | 20 | portfolio size; per-stock target K = capital/top_n |
| RESEARCH_POSITION_SIZING | fixed | fixed (K from initial capital) / dynamic (K from current PV). Forced to dynamic when dynamic cash is on (see RESEARCH_DYNAMIC_CASH_ENABLED below). |
| RESEARCH_REBALANCE_POLICY | drift | drift (only exit losers) / rebalance (also trim winners back to K) |
| RESEARCH_DYNAMIC_CASH_ENABLED | false | opt-in de-risk-to-cash overlay on weak benchmark trend. Implies pv-based sizing: targets become K = portfolio_value/top_n (multiplied by the regime equity_weight), so freed/idle cash is redeployed equally across the top_n slots and existing holdings are topped up as the book grows, instead of capping the equity sleeve at the initial capital and stranding gains in the liquid fund. With drift, winners still run uncapped. |
| RESEARCH_TAX_PAYMENT_TIMING | quarterly | tax cash timing: quarterly (paid at quarter-end) / on_realization; liability always accrued at realization |
| RESEARCH_LIQUID_FUND_ANNUAL_YIELD | 0.065 | idle-cash parking yield (daily-compounded) |
| RESEARCH_STCG_RATE / RESEARCH_LTCG_RATE | 0.15 / 0.10 | equity tax; LTCG after RESEARCH_LTCG_HOLDING_DAYS with ₹1L fiscal-year exemption |
| RESEARCH_LIQUID_FUND_TAX_RATE | 0.30 | debt-fund tax on liquid-fund redemptions |
| Fundamentals layer (quarterly earnings data) | | |
| RESEARCH_FUNDAMENTALS_ENABLED | True | Master switch for fundamentals-gated strategies |
| RESEARCH_FUNDAMENTALS_FILING_LAG_DAYS | 45 | Days after period-end before quarterly earnings are available; prevents look-ahead bias |
| momentum_quality strategy | | |
| RESEARCH_MQ_MIN_ROE | 0.12 | Minimum Return on Equity for gating |
| RESEARCH_MQ_MAX_DE | 1.0 | Maximum Debt/Equity ratio for gating |
| RESEARCH_MQ_QUALITY_WEIGHT | 0.5 | Blend weight for quality vs momentum z-scores |
| quality_alpha strategy | | |
| RESEARCH_QA_MIN_ROE | 0.15 | Minimum Return on Equity for gating |
| RESEARCH_QA_MIN_ROCE | 0.15 | Minimum Return on Capital Employed for gating |
| RESEARCH_QA_MAX_DE | 0.6 | Maximum Debt/Equity ratio for gating |
| RESEARCH_QA_MIN_MARGIN | 0.08 | Minimum net profit margin for gating |
| RESEARCH_QA_TREND_FILTER | True | Require close > SMA200 to avoid value traps (set False to ignore price trend) |
| RESEARCH_QA_TREND_SMA | 200 | SMA period (days) for trend filter |
| value_trend strategy | | |
| RESEARCH_VT_MIN_PASS | 2 | Minimum number of {E/P, B/P, S/P, div yield} above universe median |
| RESEARCH_VT_TREND_SMA | 200 | SMA period (days) for trend filter (close must be above this to avoid value traps) |
Data providers
Research ships four swappable providers behind RESEARCH_DATA_PROVIDER:
yfinance (default, adjusted), dhan, broker,
and fake (tests). All implement the read-only
base.MarketDataProvider contract — OHLCV DataFrames indexed by
date with lowercase open, high, low, close, volume
— and never expose an order surface.
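As a rough illustration of that contract, the sketch below checks the two properties named above: the lowercase OHLCV columns and the absence of an order surface. Both helper names are hypothetical; the real checks live in the test files mentioned earlier and may look quite different.

```python
REQUIRED_COLUMNS = ("open", "high", "low", "close", "volume")


def missing_ohlcv_columns(columns):
    """Return the required lowercase OHLCV names absent from `columns`.

    `columns` can be any iterable of names, e.g. a DataFrame's `.columns`.
    The comparison is case-sensitive, so "Open" does not satisfy "open".
    """
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def exposes_order_surface(provider, forbidden=("place_order",)):
    """True if the provider object has any order-placing attribute."""
    return any(hasattr(provider, name) for name in forbidden)


class FakeReadOnlyProvider:
    """Stand-in provider used only for this example."""

    def get_history(self, symbol):
        return []
```

A provider that passes both checks matches the shape described above; it says nothing about data quality or adjustment.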
RESEARCH_DATA_PROVIDER=dhan is functional.
DhanProvider resolves research symbols to Dhan
security_ids via InstrumentMaster (Dhan's public
compact security-list CSV), fetches historical OHLCV through the read-only
DhanHistoricalData accessor (Dhan /charts/historical),
and resolves benchmark indices (^NSEI, ^NSEBANK,
^NSMIDCAP, ^CNXSML) to their NSE_INDEX
security_ids. It reuses the shared DB-backed OHLCV cache (the
adjusted flag is part of the cache key, so raw and adjusted
requests never collide) and is read-only by design.
⚠️ Raw OHLC caveat. Dhan returns raw OHLC. The default NoOpAdjustmentProvider is a passthrough that returns raw-equivalent prices, so splits/bonuses are not back-adjusted and long-horizon backtests will show discontinuities around corporate actions until a real NSE corp-action source is wired into the AdjustmentProvider seam. Until then prefer adjusted=False (explicit raw) or keep yfinance for adjusted, long-horizon equity backtests.
Running
CLI (synchronous):
venv/bin/python -m arb_bot.research.run_backtest \
--strategy nifty_trend_momentum --universe nifty500 \
--provider yfinance --start-date 2020-01-01 --end-date 2024-12-31 \
--top-n 20 --rebalance-frequency monthly --persist --export-csv ./out
UI (asynchronous): the dashboard Equity Research page
posts to /api/research/runs; a daemon ResearchRunWorker
claims the queued run and executes it, and the page polls the run every 3s while
it is QUEUED or RUNNING. On completion a tabbed report renders:
a KPI strip; Overview charts (equity curve with click-to-day drill-down,
allocation, drawdown); and Holdings / Trades / Tax / Risk / Universe tabs.
The run form is populated from the read-only strategies endpoint and
GET /api/research/universes (loaded universes with member counts,
earliest_effective_from, and supports_point_in_time).
The report tabs read the trades, signals, and holdings?date= endpoints;
holdings returns the latest snapshot by default, or the nearest snapshot
on or before date when supplied. Holdings include unrealized P&L; Universe shows raw
signal pass/fail changes, not executed entries/exits. All
report reads are simulation-only and never place orders. A read-only
animated example at /research/demo (sidebar →
Equity Research → Animated Example) renders a bundled sample run through the
same report UI with entrance animations (KPI count-up, chart draw-in); it
fetches nothing and places no orders.
Active research runs expose operator recovery controls. Restart
requeues a QUEUED or RUNNING run only when no result
artifacts have been persisted. Stop marks an active,
artifact-free run as CANCELLED. If a worker is already inside
long-running provider/backtest work, that Python call may return later, but
the runner checks the cancelled status before saving and discards the result.
The strategy selector also exposes the virtual combined runner.
Selecting it opens combination controls for Mode A (composite) and
Mode B (sleeve), with checkboxes for every registered member strategy
and per-member weight inputs. pair_statarb and
options_vol_premium are sleeve-only in the UI because they do not
produce comparable long-only composite scores.
Strategies
Registered research strategies: nifty_trend_momentum,
vol_adjusted_momentum, low_volatility,
jensen_alpha, sector_rotation,
breakout_volume, mean_reversion,
momentum_quality, quality_alpha, value_trend,
pair_statarb, options_vol_premium,
earnings_pead, and event_driven. The
/api/research/strategies endpoint also returns the virtual
combined runner for dashboard combination runs.
The engine hands each strategy a market_data dict keyed by
symbol; market_data['benchmark'] is the benchmark's close
pd.Series (or None when no benchmark is configured).
The jensen_alpha, sector_rotation, and
breakout_volume strategies consume it, and all of them degrade
gracefully when it is absent.
| Strategy | Rule | Data Needed |
|---|---|---|
| momentum_quality | Momentum names (close > SMA200 and either 6M or 12M return > 0) that also pass ROE ≥ 0.12 and D/E ≤ 1.0. Ranked by blended momentum+quality z-scores. | Price (close), ROE, Debt/Equity (fundamentals) |
| quality_alpha | High-quality names: ROE ≥ 0.15, ROCE ≥ 0.15, D/E ≤ 0.6, net margin ≥ 0.08, FCF > 0, optionally filtered by close > SMA200 to avoid value traps. Ranked by blended ROE/ROCE/D/E/margin z-scores. | Price (close), ROE, ROCE, Debt/Equity, net margin, FCF (fundamentals) |
| value_trend | Cheap stocks (at least 2 of E/P, B/P, S/P, dividend yield above universe median) confirmed by close > SMA200 to avoid value traps. Ranked by blended value z-scores. | Price (close), E/P yield, B/P ratio, S/P ratio, dividend yield (fundamentals) |
Tip: pair mean_reversion with a daily or weekly
rebalance_frequency so its exits realize quickly — a long
rebalance window lets the mean-revert opportunity drift before the next exit.
Long/short pairs (pair_statarb)
pair_statarb is a market-neutral, dedicated engine that trades
same-sector cointegrated pairs on mean-reverting spreads. Unlike the long-only backtest
engine, PairBacktestEngine (in arb_bot/research/engine/pair_engine.py)
handles both long and short legs, daily short-borrow financing, and dollar-neutral position sizing.
Entry & Exit Rules:
- Find pairs: daily scan of same-sector stocks with closing-returns correlation ≥ RESEARCH_PAIR_CORR_MIN (default 0.8).
- Hedge ratio: linear regression, beta = slope(y_close, x_close) over the lookback window.
- Spread z-score: z = (spread - mean) / std where spread = y - beta * x, computed over the last RESEARCH_PAIR_LOOKBACK days (default 120).
- Entry: when |z| ≥ RESEARCH_PAIR_Z_IN (default 2.0); if z > 0 the spread is rich, so short the first leg and long the second; if z < 0, flip.
- Exit: at mean reversion (|z| ≤ RESEARCH_PAIR_Z_OUT, default 0.5), extreme drawdown (|z| ≥ RESEARCH_PAIR_Z_STOP, default 3.5), or holding cap (RESEARCH_PAIR_MAX_HOLDING_DAYS, default 30 days).
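The hedge ratio, z-score, and entry/exit thresholds above can be sketched in plain Python. This is an illustration under stated assumptions, not PairBacktestEngine itself: the regression includes an intercept, the standard deviation is the population form, the exit checks run in the order listed, and all function names and return labels are hypothetical.

```python
from statistics import fmean, pstdev


def hedge_ratio(y, x):
    """OLS slope of y on x (intercept included; an assumption)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def spread_zscore(y, x, lookback=120):
    """Return (beta, z) for spread = y - beta * x over the last `lookback` bars."""
    y, x = y[-lookback:], x[-lookback:]
    beta = hedge_ratio(y, x)
    spread = [yi - beta * xi for yi, xi in zip(y, x)]
    sd = pstdev(spread)  # population std; the engine may use the sample form
    if sd == 0:
        return beta, 0.0
    return beta, (spread[-1] - fmean(spread)) / sd


def pair_action(z, in_position, days_held,
                z_in=2.0, z_out=0.5, z_stop=3.5, max_days=30):
    """Map a z-score to an action using the documented default thresholds."""
    if in_position:
        if abs(z) <= z_out:
            return "exit_mean_reversion"
        if abs(z) >= z_stop:
            return "exit_stop"
        if days_held >= max_days:
            return "exit_time"
        return "hold"
    if abs(z) >= z_in:
        # z > 0: spread is rich, so short the first leg (y) and long the second (x).
        return "short_y_long_x" if z > 0 else "long_y_short_x"
    return "flat"
```

The text does not say what happens when a fresh signal already sits beyond the stop threshold; this sketch simply applies the entry rule as written.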
Costs & Financing:
- Per-leg transaction costs: both legs incur brokerage, STT, exchange, SEBI, and stamp fees using the standard equity_trade_cost() function (entry and exit).
- Short-borrow financing: accrues daily at RESEARCH_SHORT_BORROW_BPS_ANNUAL / 365 (default 50 bps/year ≈ 0.137 bps/day) on the notional value of the short leg.
- STCG treatment: realized gains on closed pairs are taxed at the STCG rate (RESEARCH_STCG_RATE, default 15%), applied only to positive realized P&L. Open positions are marked to market daily but not taxed until realization.
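The financing and tax arithmetic above is small enough to write out. The two helpers below are hypothetical names used only for illustration; the rates are the documented defaults.

```python
def daily_short_borrow_cost(short_notional, bps_annual=50.0):
    """One day's borrow charge: annual bps / 365 applied to the short notional."""
    return short_notional * (bps_annual / 10_000.0) / 365.0


def stcg_tax(realized_pnl, rate=0.15):
    """STCG applies only to positive realized P&L; losses are taxed at zero."""
    return max(realized_pnl, 0.0) * rate


# A ₹10,00,000 short leg costs 5,000 / 365 rupees per day to carry.
cost_per_day = daily_short_borrow_cost(1_000_000)
```

At the default rate a ₹10,00,000 short leg accrues a little under ₹14 per day, so financing is small next to transaction costs on short holding periods.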
Dollar-neutral sizing:
capital_per_pair = initial_capital / pair_max_pairs
long_qty = floor(capital_per_pair / long_price)
short_qty = floor((long_qty * long_price) / short_price)
This ensures each pair slot has roughly equal gross notional exposure (long dollars ≈ short dollars at entry) across all active pairs.
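A worked instance of the sizing formulas, as a sketch: pair_max_pairs = 5 and both prices are made-up example values, and size_pair is a hypothetical helper.

```python
import math


def size_pair(initial_capital, pair_max_pairs, long_price, short_price):
    """Apply the three sizing formulas and return (long_qty, short_qty)."""
    capital_per_pair = initial_capital / pair_max_pairs
    long_qty = math.floor(capital_per_pair / long_price)
    # The short leg is sized to the realized long notional, not the raw budget.
    short_qty = math.floor((long_qty * long_price) / short_price)
    return long_qty, short_qty


long_qty, short_qty = size_pair(1_000_000, 5, long_price=1_500.0, short_price=640.0)
long_notional = long_qty * 1_500.0    # 133 shares
short_notional = short_qty * 640.0    # 311 shares
```

Here the long leg is ₹1,99,500 and the short leg ₹1,99,040; the gap is the whole-share rounding residue, which is why the match is approximate rather than exact.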
Running pair_statarb:
venv/bin/python -m arb_bot.research run pair_statarb \
--universe nifty500 \
--start-date 2022-01-01 --end-date 2024-12-31 \
--rebalance-frequency daily \
--persist --export-csv ./out
Always use --rebalance-frequency daily for pairs — daily scans
find new cointegrated pairs and re-evaluate exits without artificial waiting.
Options Volatility Premium (options_vol_premium)
Signal: IV-rank ≥ RESEARCH_OVP_MIN_IVRANK (default 0.5) computed over a
RESEARCH_OVP_IVRANK_LOOKBACK-day IV history window. Flat-regime check via DTE band
(RESEARCH_OVP_DTE_MIN–RESEARCH_OVP_DTE_MAX).
Structure: Defined-risk iron condor — sell OTM strangle at ±RESEARCH_OVP_WING_POINTS
points from spot, buy protective wings RESEARCH_OVP_WING_POINTS further out. Credit received = net
premium after wing cost. Never a naked short.
Exit: Close when profit ≥ RESEARCH_OVP_PROFIT_TARGET × credit, or loss ≥
RESEARCH_OVP_STOP_MULT × credit, or DTE ≤ 1.
Pricing: EOD NSE F&O bhavcopy settlement prices (BhavcopyOptionsSource);
missing strikes fall back to Black-Scholes via arb_bot.backtest.simulation.option_pricer.
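The structure and exit rules can be sketched as follows. This is an illustration only: the function names are hypothetical, strikes are not snapped to listed strike intervals, the order of the exit checks is an assumption, and the profit-target and stop multiples are passed in explicitly because their defaults are not stated here.

```python
def iron_condor_strikes(spot, wing_points):
    """Short strangle at spot ± wing_points, long wings wing_points further out."""
    return {
        "long_put": spot - 2 * wing_points,
        "short_put": spot - wing_points,
        "short_call": spot + wing_points,
        "long_call": spot + 2 * wing_points,
    }


def exit_reason(credit, pnl, dte, profit_target, stop_mult):
    """Return why the position should close, or None to keep holding.

    credit: net premium received at entry; pnl: current mark-to-market P&L.
    """
    if pnl >= profit_target * credit:
        return "profit_target"
    if -pnl >= stop_mult * credit:
        return "stop_loss"
    if dte <= 1:
        return "expiry"
    return None


strikes = iron_condor_strikes(spot=22_000, wing_points=200)
```

Because both short strikes are covered by a long wing, the maximum loss per side is bounded by the wing width minus the credit.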
Run example:
venv/bin/python -m arb_bot.research.cli.run_backtest \
--strategy options_vol_premium --start 2023-01-01 --end 2023-12-31
Approximation note: Uses EOD settlement prices — no intraday fills. IV is from bhavcopy implied vol or ATM-IV estimate. Results represent daily mark-to-market, not intraday P&L.
Strategy Combination (Mode A — composite)
Purpose: Blend any two or more registered research strategies into a single
portfolio by z-scoring each member's cross-sectional composite_score and weighting
the blend. This diversifies factor exposure without rebuilding a new strategy from scratch.
How the z-blend works:
- On each rebalance date, every member strategy runs its full generate_signals → rank_candidates pipeline and returns a composite_score for each symbol it selects.
- CombinedStrategy.rank_candidates z-scores each member's composite scores cross-sectionally (symbols absent from a member get 0.0), then blends the z-scores by the supplied weights.
- The final composite_score drives portfolio construction: the top-N by blend score enter the portfolio.
- No lookahead: each member receives the same as_of_date boundary, so point-in-time discipline is preserved.
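The z-blend described above can be sketched as below. It is not CombinedStrategy.rank_candidates: the function names are hypothetical, the population standard deviation is an assumption, and ties are broken alphabetically only to make the example deterministic.

```python
from statistics import fmean, pstdev


def zscores(scores):
    """Cross-sectional z-scores for one member's {symbol: composite_score}."""
    values = list(scores.values())
    if not values:
        return {}
    mu, sd = fmean(values), pstdev(values)
    if sd == 0:
        return {sym: 0.0 for sym in scores}
    return {sym: (v - mu) / sd for sym, v in scores.items()}


def blend(member_scores, weights, top_n):
    """Blend member z-scores by normalized weights and return the top_n."""
    total = sum(weights[m] for m in member_scores)
    z_by_member = {m: zscores(s) for m, s in member_scores.items()}
    symbols = set().union(*member_scores.values())
    blended = {
        # A symbol a member did not select contributes 0.0 for that member.
        sym: sum(weights[m] / total * z_by_member[m].get(sym, 0.0)
                 for m in member_scores)
        for sym in symbols
    }
    ranked = sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


members = {
    "momentum_quality": {"X": 3.0, "Y": 1.0},
    "value_trend": {"Y": 5.0, "Z": 1.0},
}
picks = blend(members, {"momentum_quality": 0.5, "value_trend": 0.5}, top_n=2)
```

In this example Y is the weakest pick for one member and the strongest for the other, so its blended score nets to zero and it ranks behind X.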
CLI flags:
--combine-mode {off,composite,sleeve} (default: off)
--combine-members STRATEGY1,STRATEGY2,... comma-separated member strategy names
--combine-weights 0.5,0.5,... comma-separated weights (auto-normalized in composite mode; used as-is in sleeve mode)
Example run:
venv/bin/python -m arb_bot.research.cli.run_backtest \
--strategy combined \
--combine-mode composite \
--combine-members momentum_quality,value_trend \
--combine-weights 0.5,0.5 \
--start-date 2022-01-01 --end-date 2024-12-31 \
--universe nifty500 --persist
Each rebalance date's signals include a metadata_json.members object containing
the per-member composite_score so you can audit which factors drove selection.
Combination membership is persisted to research_combination_members for
every run where combine_mode != "off".
Strategy Combination (Mode B — Capital Sleeves)
Purpose: Run each member strategy as an independent sub-backtest with its own capital allocation, then merge the equity curves day by day. Use this when member strategies are heterogeneous — for example, a long-only equity strategy paired with a market-neutral pairs strategy — where z-blending their signals into one portfolio (Mode A) does not make sense.
When to use vs Mode A (composite)
| Mode A (composite) | Mode B (sleeves) |
|---|---|
| Blends signals from all members into a single portfolio via z-scored cross-sectional ranks. | Each member runs through its own engine and owns its allocated capital independently. |
| Works best for homogeneous long-only strategies that share the same universe and produce comparable composite_score values. | Designed for heterogeneous strategy types: e.g., a long-only equity strategy alongside a market-neutral pair_statarb strategy. |
| One BacktestEngine run; signals merged before portfolio construction. | Separate engine runs (BacktestEngine for long-only members; PairBacktestEngine for pair_statarb); results merged after simulation. |
How it works
- Capital split: each sleeve receives initial_capital × weight. Weights are taken directly from --combine-weights and are not auto-normalized, so they must sum ≤ 1.0.
- Independent runs: each member is dispatched to the correct engine. Long-only strategies use BacktestEngine; pair_statarb uses PairBacktestEngine. Each engine receives only its sleeve's capital and the member's strategy config.
- Equity curve merge: the per-sleeve equity curves (gross / net-of-costs / net-of-tax) are aligned by date and summed day by day. Days where a sleeve has no mark (e.g., before it makes its first trade) contribute zero to the combined total for that day; they are not forward-filled.
- Trade tagging: all trades are concatenated into a single trade list. Each trade carries a sleeve=<member_name> tag so you can filter by strategy in the Trades tab of the report UI.
- Metrics: computed from the merged three-level equity curves, so CAGR, Sharpe, drawdown, etc., reflect the combined portfolio.
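The merge and tagging steps above might look like the following sketch. The data shapes (dicts keyed by ISO date strings, trades as dicts) and both function names are assumptions made for illustration; the same merge would be applied once per level (gross, net-of-costs, net-of-tax).

```python
def merge_sleeve_curves(sleeve_curves):
    """Sum per-sleeve equity by date.

    sleeve_curves: {sleeve_name: {date: value}}. A date missing from a
    sleeve contributes 0.0 for that sleeve; nothing is forward-filled.
    """
    dates = sorted(set().union(*sleeve_curves.values()))
    return {
        d: sum(curve.get(d, 0.0) for curve in sleeve_curves.values())
        for d in dates
    }


def tag_trades(trades_by_sleeve):
    """Concatenate all trades, adding a sleeve=<member_name> field to each."""
    return [
        {**trade, "sleeve": name}
        for name, trades in trades_by_sleeve.items()
        for trade in trades
    ]


curves = {
    "momentum_quality": {"2023-01-02": 600_000.0, "2023-01-03": 606_000.0},
    "pair_statarb": {"2023-01-03": 400_000.0},
}
merged = merge_sleeve_curves(curves)
```

Note the consequence of the zero-fill rule: on 2023-01-02 the combined total shows only the first sleeve, so a late-starting sleeve produces a visible step in the merged curve.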
Idle remainder
Sleeve weights must sum ≤ 1.0. Any remainder (i.e., 1.0 − Σ weights) stays
as flat cash at the combined level — it earns no liquid-fund yield and is not swept.
Example: weights [0.6, 0.3] leave 10% idle.
⚠️ No liquid-fund sweep for the idle remainder. If you want idle capital to earn the liquid-fund yield, reduce the idle fraction by sizing your weights to sum to 1.0, or run each strategy through its own full-capital backtest and compare them separately.
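The capital split and idle remainder reduce to a few lines of arithmetic. The helper below is hypothetical, and the small tolerance on the sum check is an assumption added to absorb floating-point error.

```python
def split_capital(initial_capital, weights):
    """Return ({member: sleeve_capital}, idle_cash) for un-normalized weights."""
    total = sum(weights.values())
    if total > 1.0 + 1e-9:
        raise ValueError(f"sleeve weights sum to {total}, which exceeds 1.0")
    sleeves = {m: initial_capital * w for m, w in weights.items()}
    idle = initial_capital - sum(sleeves.values())
    return sleeves, idle


sleeves, idle = split_capital(
    1_000_000, {"momentum_quality": 0.6, "pair_statarb": 0.3}
)
```

With weights of 0.6 and 0.3 on ₹10,00,000, about ₹1,00,000 stays as flat cash for the whole run.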
Example: momentum_quality + pair_statarb (60/40)
Pair a fundamentals-gated long-only momentum strategy (60% sleeve) with a market-neutral pairs strategy (40% sleeve) for a diversified combined portfolio:
venv/bin/python -m arb_bot.research.cli.run_backtest \
--strategy combined \
--combine-mode sleeve \
--combine-members momentum_quality,pair_statarb \
--combine-weights 0.6,0.4 \
--start 2023-01-01 \
--end 2024-12-31 \
--persist
Assuming ₹10,00,000 of total capital, the momentum_quality sleeve runs through
BacktestEngine with ₹6,00,000 and pair_statarb runs through
PairBacktestEngine with the remaining ₹4,00,000. The combined equity
curve, drawdown, and metrics are computed over both sleeves together. Trades are
tagged sleeve=momentum_quality and sleeve=pair_statarb
in the Trades tab.
Combination membership is persisted to research_combination_members for
every run where combine_mode != "off", so you can audit which member
contributed to each run.
Fundamentals layer
The three fundamentals-gated strategies (momentum_quality,
quality_alpha, value_trend) consume quarterly
fundamental metrics from a point-in-time research_fundamentals
table. Fundamentals are sourced from yfinance's quarterly earnings statements
(income statement, balance sheet, cash flow) and persist fields: ROE, ROCE,
Debt/Equity, net margin, TTM FCF, TTM E/P yield, B/P ratio, S/P ratio, and dividend
yield. Each row is keyed by (symbol, period_end, effective_from)
with a 45-day filing lag (period_end → effective_from); the
engine's point-in-time filter ensures the strategy never sees a filing before
it is actually available. The engine only loads fundamentals for strategies
that declare fundamental fields; combined factor runs inherit the union of
their member requirements.
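The filing-lag rule can be sketched as follows. The row shape (dicts with period_end and effective_from) and the helper names are assumptions for illustration; only the 45-day lag and the "never before effective_from" rule come from the text.

```python
from datetime import date, timedelta

FILING_LAG_DAYS = 45  # mirrors RESEARCH_FUNDAMENTALS_FILING_LAG_DAYS


def effective_from(period_end, lag_days=FILING_LAG_DAYS):
    """First date on which a filing for `period_end` may be used."""
    return period_end + timedelta(days=lag_days)


def visible_row(rows, as_of):
    """Latest row with effective_from <= as_of, or None if none is visible."""
    eligible = [r for r in rows if r["effective_from"] <= as_of]
    return max(eligible, key=lambda r: r["effective_from"], default=None)


rows = [
    {"period_end": pe, "effective_from": effective_from(pe)}
    for pe in (date(2023, 3, 31), date(2023, 6, 30))
]
seen = visible_row(rows, as_of=date(2023, 8, 1))
```

On 2023-08-01 the June quarter has ended but its row only becomes effective on 2023-08-14, so the strategy still sees the March quarter.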
Refresh fundamentals with a weekly cron (or on-demand):
venv/bin/python -m arb_bot.research.cli.refresh_fundamentals \
--universe nifty500 --provider yfinance
This fetches yfinance quarterly statements for each universe member, derives the
fundamental metrics listed above for the most recent period, calculates effective_from as
period_end + 45 days, and upserts the rows into
research_fundamentals. Repeat weekly or monthly to capture new
filings. The CLI is read-only with respect to live trading state and safe to
run in parallel with the backtest engine.
Events Layer + PEAD / Event-Driven
Two strategies use a point-in-time corporate-events table
(research_events) to capture earnings drift and corporate-action
catalysts without lookahead bias.
Storage: research_events table
Rows are keyed by (symbol, event_type, announce_date).
The engine's no-lookahead gate is: announce_date ≤ as_of_date.
On each rebalance date the engine calls provider.get_events()
with a rolling window of RESEARCH_EVENT_LOOKBACK_DAYS
(default 5), so only events announced in the last N calendar days are
visible to the strategy. Event loading is skipped for non-event strategies,
including combined factor portfolios that do not include
earnings_pead or event_driven. Enable/disable the
injection with RESEARCH_EVENTS_ENABLED.
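A minimal sketch of the lookback gate, assuming events are dicts with an announce_date field and that both ends of the window are inclusive (the text does not specify the boundary). The function name is hypothetical and is not provider.get_events().

```python
from datetime import date, timedelta


def visible_events(events, as_of, lookback_days=5):
    """Keep events announced within the last `lookback_days` calendar days.

    The upper bound announce_date <= as_of is the no-lookahead gate.
    """
    start = as_of - timedelta(days=lookback_days)
    return [e for e in events if start <= e["announce_date"] <= as_of]


events = [
    {"symbol": "AAA", "announce_date": date(2024, 1, 2)},   # too old
    {"symbol": "BBB", "announce_date": date(2024, 1, 8)},   # inside the window
    {"symbol": "CCC", "announce_date": date(2024, 1, 11)},  # in the future
]
window = visible_events(events, as_of=date(2024, 1, 10))
```

With a rebalance frequency longer than the lookback, events announced early in the gap fall outside the window by the next rebalance, so the two settings need to be chosen together.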
Carry-forward holding clock
Neither strategy uses exit_rules. Instead, each rebalance the
strategy emits carry-forward signals for positions whose
holding_days < max_hold. When a position ages past the
threshold it is simply omitted — the engine's normal
"sell anything not in the target set" path realizes the exit.
Strategy #8 — earnings_pead
Thesis: Indian Nifty 500 shows significant post-earnings drift (~64 days) — stocks with positive earnings surprises tend to continue outperforming.
- Entry: surprise_pct ≥ RESEARCH_PEAD_MIN_SURPRISE (default 0.05, i.e. 5%) from an earnings event in the lookback window.
- Hold: up to RESEARCH_PEAD_HOLDING_DAYS (default 40 days); carry-forward signals keep the position in the target set. Aged positions exit naturally on the next rebalance.
- Ranking: by surprise_pct descending; carry-forwards score 0.0 (below new entries).
- Main risk: gap-up at announcement may already be priced; estimate-history coverage from yfinance is shallow for small caps.
Recommended rebalance frequency: weekly or
daily to catch freshly announced earnings promptly.
venv/bin/python -m arb_bot.research.cli.run_backtest \
--strategy earnings_pead --start 2022-01-01 --end 2024-12-31 \
--rebalance-frequency weekly --universe nifty500 --persist
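The entry filter, carry-forward clock, and ranking for earnings_pead can be sketched together. The input shapes and the function name are assumptions, as is the choice to keep the fresh score when a held symbol also has a new qualifying surprise.

```python
def pead_targets(surprises, holdings, min_surprise=0.05, max_hold=40):
    """Build the ranked target set for one rebalance.

    surprises: {symbol: surprise_pct} from events in the lookback window.
    holdings:  {symbol: holding_days} for currently held positions.
    """
    scores = {s: sp for s, sp in surprises.items() if sp >= min_surprise}
    for symbol, days in holdings.items():
        if days < max_hold:
            # Carry-forward: stay in the target set with score 0.0.
            scores.setdefault(symbol, 0.0)
        # Aged positions are simply omitted, so the engine sells them.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


targets = pead_targets(
    surprises={"A": 0.12, "B": 0.03, "C": 0.07},
    holdings={"C": 5, "D": 10, "E": 45},
)
```

B fails the surprise threshold and E has aged past 40 days, so neither appears; D is carried forward behind the two new entries.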
Strategy #12 — event_driven
Thesis: Markets often misprice complex corporate events (buybacks, index inclusions, pledge reductions, block deals).
- Enabled event types (comma-separated config): RESEARCH_EVENT_TYPES_ENABLED (default: buyback,index_inclusion,pledge_reduction,block_deal,earnings).
- Ranking weight map (hard-coded multipliers): index_inclusion 1.2 × value_num, buyback 1.0 ×, pledge_reduction 0.8 ×, block_deal 0.6 ×, and earnings 1.0 × positive surprise fraction. Event types not in the enabled set are silently ignored.
- Hold: up to RESEARCH_EVENT_HOLDING_DAYS (default 30 days); carry-forward same mechanism as earnings_pead.
- Main risk: depends on manual event ingestion; the yfinance earnings seam only populates earnings. Non-earnings event types require custom data injection via events_store.upsert_events().
Recommended rebalance frequency: weekly.
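The weight map above can be written as a small scoring function. This is a sketch: the event dict shape and the function name are assumptions, and treating a missing value_num or surprise_pct as 0.0 is a choice made for the example rather than documented behavior.

```python
EVENT_WEIGHTS = {
    "index_inclusion": 1.2,
    "buyback": 1.0,
    "pledge_reduction": 0.8,
    "block_deal": 0.6,
    "earnings": 1.0,
}


def event_score(event, enabled_types):
    """Score one event, or return None if its type is not enabled."""
    event_type = event["event_type"]
    if event_type not in enabled_types or event_type not in EVENT_WEIGHTS:
        return None  # silently ignored
    if event_type == "earnings":
        # Only the positive part of the surprise counts.
        base = max(event.get("surprise_pct") or 0.0, 0.0)
    else:
        base = event.get("value_num") or 0.0
    return EVENT_WEIGHTS[event_type] * base


enabled = {"buyback", "index_inclusion", "pledge_reduction", "block_deal", "earnings"}
score = event_score({"event_type": "index_inclusion", "value_num": 1.0}, enabled)
```

A negative earnings surprise scores 0.0 rather than a negative number, so it can never outrank a positive catalyst.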
refresh_events CLI
Populate research_events from yfinance earnings data:
venv/bin/python -m arb_bot.research.cli.refresh_events \
--universe nifty500 --provider yfinance
The CLI calls YFinanceProvider.get_events() which iterates over
each symbol's Ticker.earnings_dates, computes
surprise_pct = (actual − estimate) / |estimate|, and upserts
into research_events via events_store.upsert_events().
The store uses ON CONFLICT … DO UPDATE so re-runs are idempotent.
Run weekly to capture new quarters. The CLI is read-only relative to live
trading state.
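The surprise formula and the idempotent upsert can be demonstrated with an in-memory SQLite table. This is only a sketch: the column set is reduced to what the example needs, the project's real database and events_store.upsert_events() signature are not shown here, and returning None for a zero estimate is an assumption.

```python
import sqlite3


def surprise_pct(actual, estimate):
    """(actual - estimate) / |estimate|, or None when it cannot be computed."""
    if actual is None or estimate is None or estimate == 0:
        return None
    return (actual - estimate) / abs(estimate)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE research_events (
        symbol        TEXT NOT NULL,
        event_type    TEXT NOT NULL,
        announce_date TEXT NOT NULL,
        surprise_pct  REAL,
        PRIMARY KEY (symbol, event_type, announce_date)
    )
    """
)

UPSERT_SQL = """
    INSERT INTO research_events (symbol, event_type, announce_date, surprise_pct)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (symbol, event_type, announce_date)
    DO UPDATE SET surprise_pct = excluded.surprise_pct
"""


def upsert_event(conn, symbol, announce_date, actual, estimate):
    """Insert or refresh one earnings row keyed by the documented triple."""
    value = surprise_pct(actual, estimate)
    conn.execute(UPSERT_SQL, (symbol, "earnings", announce_date, value))


# Running the same refresh twice leaves exactly one row per key.
upsert_event(conn, "AAA", "2024-01-15", actual=11.0, estimate=10.0)
upsert_event(conn, "AAA", "2024-01-15", actual=12.0, estimate=10.0)
rows = conn.execute("SELECT symbol, surprise_pct FROM research_events").fetchall()
```

Dividing by the absolute estimate keeps the sign meaningful when the estimate is negative: a smaller loss than expected is a positive surprise.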
Universe membership (point-in-time)
Membership is captured as a forward-only weekly snapshot of the
niftyindices.com CSVs, stored as effective-dated universe_constituents
rows. point_in_time (with historical accepted as a
legacy alias) resolves membership as of each rebalance date, and dates before the first tracked
snapshot fall back to the earliest known membership with a survivorship-bias
warning. After the first snapshot, each refresh records entrants and leavers via
loader.diff_apply. Any of the following triggers runs the refresh, and all are isolation-safe:
- System crontab: 5 18 * * 5 cd /path/to/project && venv/bin/python -m arb_bot.research.cli.refresh_universe --all (--all snapshots nifty50/100/200/500). Use this on a host that can reach niftyindices.com.
- In-process: set RESEARCH_UNIVERSE_REFRESH_CRON="5 18 * * 5" (5-field cron). Empty/unset = off (default); ResearchRunWorker fires it at most once per calendar day.
- Deployed VPS: the Hostinger box cannot reach niftyindices.com, so fetch the CSVs on a machine that can and run the diff on the server: make sync-universe-pit (scripts/sync_universe_pit.sh), which fetches locally, rsyncs to tmp_csvs/, and runs python -m arb_bot.research.refresh_universe --all --from-dir tmp_csvs inside the arb-bot container. Same diff_apply semantics; --from-dir also works with --universe for a single index.
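The point-in-time lookup and its pre-tracking fallback can be sketched as below. The row shape is an assumption: in particular the effective_to field (None while a symbol is still a member) is hypothetical, since only effective-dating is described here, and the function is not the loader's actual API.

```python
from datetime import date


def members_as_of(rows, as_of):
    """Return (members, used_fallback) for one rebalance date.

    rows: dicts with symbol, effective_from, effective_to (None = current).
    Dates before the first snapshot resolve to the earliest known membership
    and set used_fallback so the caller can emit a survivorship-bias warning.
    """
    if not rows:
        return set(), False
    first = min(r["effective_from"] for r in rows)
    lookup = max(as_of, first)
    members = {
        r["symbol"]
        for r in rows
        if r["effective_from"] <= lookup
        and (r["effective_to"] is None or lookup < r["effective_to"])
    }
    return members, as_of < first


rows = [
    {"symbol": "AAA", "effective_from": date(2020, 1, 3), "effective_to": None},
    {"symbol": "BBB", "effective_from": date(2020, 1, 3), "effective_to": date(2021, 6, 4)},
    {"symbol": "CCC", "effective_from": date(2021, 6, 4), "effective_to": None},
]
current, _ = members_as_of(rows, date(2022, 1, 1))
early, warned = members_as_of(rows, date(2019, 1, 1))
```

The 2019 lookup returns the January 2020 membership with the fallback flag set, which is exactly the case where results carry survivorship bias.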
Backfill (pre-tracking history): historical membership is
loaded from the Wayback Machine, which archives the
niftyindices CSVs (earliest ~2017-10; density is uneven, ~annual in places).
Run make backfill-universe-wayback
(scripts/backfill_universe_wayback.sh): it lists archived
snapshots via the CDX API and fetches them locally (the VPS cannot reach
web.archive.org), then runs backfill_universe --all --from-dir
inside the deployed container, which merges Wayback snapshots with the forward
timeline and atomically rebuilds each universe via
loader.rebuild_timeline (effective-dated, idempotent). Pre-2017
has no free source and stays on the survivorship-bias fallback; commercial
feeds (Refinitiv, Bloomberg, CMIE Prowess, GDFL/TrueData) give clean history
for a fee.
Deleting a run
The dashboard can permanently delete a research run and all of
its data: the equity curve, signals, holdings, trades, tax lots, cash flows,
dynamic-cash decisions, run logs, exports, the resolved config, and the run
record itself (every research_* table scoped to that
run_id), plus any on-disk CSV/JSON files recorded in
research_exports. The wipe runs in a single transaction.
DELETE /api/research/runs/{run_id} returns
{deleted, files_removed, files_missing} on success. Runs that are
still QUEUED or RUNNING cannot be deleted — the API
returns 409 Conflict and the dashboard disables the delete action
until the run finishes, so the worker never races a vanishing run. The
operator must type the literal DELETE to confirm. Shared tables
(universe_constituents, research_market_data_cache)
are never affected — they are keyed by universe/provider+symbol, not
run_id. Deletion is isolated to research_* and never
touches live trades.
See docs/research_backtest.md for the full reference.