Pipeline, explained
High-level view of how data moves through the system: ingest, label, feature, train/select, and publish.
Ingest: Prices, fundamentals, calendars
↓
Clean: Gaps, splits, outliers
↓
Label: Targets, regimes
↓
Features: Signals, risk proxies
↓
Train & Select: Fit, CV, calibration
↓
Policy: PID, constraints
↓
Publish: Curves, tables, status
Stages
Ingest & Clean
- Consolidate vendor feeds
- Normalize splits/dividends
- QC on gaps and anomalies
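The split normalization and gap QC steps above could look roughly like the sketch below. It assumes daily closes as `(date, close)` pairs and a hypothetical `splits` mapping of effective date to ratio; the function names and the 4-day gap threshold are illustrative, not taken from the pipeline.

```python
from datetime import date

def adjust_for_splits(prices, splits):
    """Back-adjust closes so bars before a split are comparable to later bars.

    prices: list of (date, close) tuples.
    splits: dict of effective date -> ratio (2.0 means a 2-for-1 split). Assumed shape.
    """
    adjusted = []
    for day, close in prices:
        factor = 1.0
        # Every split that takes effect after this bar scales the bar down.
        for effective, ratio in splits.items():
            if day < effective:
                factor *= ratio
        adjusted.append((day, close / factor))
    return adjusted

def find_gaps(days, max_calendar_gap=4):
    """Return (previous, current) date pairs spaced further apart than the threshold.

    The default of 4 calendar days tolerates a weekend plus one holiday (assumption).
    """
    return [(a, b) for a, b in zip(days, days[1:]) if (b - a).days > max_calendar_gap]
```

Back-adjusting keeps the most recent prices untouched, so downstream return calculations do not see an artificial drop on the split date.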
Label & Regimes
- Forward returns & horizons
- Regime timeline and transitions
- Sane defaults for ties/missing
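As an illustration of forward-return labels with explicit handling of ties and missing data, here is a minimal sketch. The choice of `None` for undefined labels and `0` for ties is an assumption made for the example; the pipeline's actual defaults are not specified here.

```python
def forward_returns(closes, horizon):
    """Return r[t] = closes[t + horizon] / closes[t] - 1, or None where undefined."""
    labels = []
    for t, price in enumerate(closes):
        future = closes[t + horizon] if t + horizon < len(closes) else None
        if price is None or future is None or price == 0:
            # Missing input or no future bar yet: leave the label empty.
            labels.append(None)
        else:
            labels.append(future / price - 1.0)
    return labels

def direction_label(ret, tie_band=0.0):
    """Map a return to +1 or -1; returns inside the tie band map to 0 (assumed default)."""
    if ret is None:
        return None
    if abs(ret) <= tie_band:
        return 0
    return 1 if ret > 0 else -1
```

Calling `forward_returns(closes, h)` once per horizon `h` gives one label column per horizon; the last `h` rows are always `None` because their future price is not yet known.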
Features
- Price/volume transforms
- Sector & market context
- Risk greek proxies
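A small sketch of two such features: log returns as a price transform, and a market beta as one possible risk proxy. Which proxies the pipeline actually computes is not stated, so treat `beta` and both function names as illustrative assumptions.

```python
import math

def log_returns(closes):
    """Log return between consecutive closes: ln(p[t] / p[t-1])."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]

def beta(asset_returns, market_returns):
    """Slope of asset returns regressed on market returns (covariance / variance)."""
    n = len(asset_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two aligned series with at least 2 points")
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    if var == 0:
        raise ValueError("market returns have zero variance")
    return cov / var
```

The same `beta` function can be pointed at a sector index instead of the broad market to capture sector context.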
Model & Selection
- Cross-validation by regime
- ECE calibration
- Consistency checks
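For reference, expected calibration error (ECE) bins predictions by predicted probability and averages the gap between mean predicted probability and observed hit rate, weighted by bin size. The sketch below uses equal-width bins for binary outcomes; the bin count is an assumption. `regime_folds` shows one reading of "cross-validation by regime" (hold out one regime at a time), which is also an assumption.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """ECE for binary outcomes: sum over bins of (n_b / N) * |hit_rate_b - mean_prob_b|."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        hit_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(hit_rate - mean_prob)
    return ece

def regime_folds(regimes):
    """Yield (held_out_regime, train_indices, test_indices), one fold per regime."""
    for held_out in sorted(set(regimes)):
        test = [i for i, r in enumerate(regimes) if r == held_out]
        train = [i for i, r in enumerate(regimes) if r != held_out]
        yield held_out, train, test
```

An ECE of 0 means predicted probabilities match observed frequencies within every bin; larger values mean the model is over- or under-confident somewhere.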
Policy
- PID trace & clamps
- Position sizing rules
- Trade eligibility filters
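A minimal sketch of a PID step with output clamps and a per-step trace record. What the controller regulates (for example, exposure relative to a risk target) is not specified here, so the gains, limits, and the `ClampedPID` name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ClampedPID:
    kp: float
    ki: float
    kd: float
    out_min: float
    out_max: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error, dt=1.0):
        """Advance one step and return a trace record for this step."""
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        candidate_integral = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        clamped = max(self.out_min, min(self.out_max, raw))
        saturated = raw != clamped
        # Anti-windup: do not accumulate the integral while the output is clamped.
        if not saturated:
            self.integral = candidate_integral
        self.prev_error = error
        return {"error": error, "raw": raw, "output": clamped, "saturated": saturated}
```

Collecting the returned records over a run yields a trace that shows when the clamps were active, which helps explain why a position did not move as far as the raw signal suggested.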
Publish
- Equity windows
- Rollups and leaders
- Status JSON for live
Data I/O
| Artifact | Where | Used by |
|---|---|---|
| Equity windows | /data/models/eqx-m1/equity/* | Model Card, Compare, Pipelines |
| Calibration | /data/models/eqx-m1/calibration.json | Model Card, Workstation |
| Drivers/Sectors | /data/models/eqx-m1/top_drivers.json | Home, Workstation |
| Daily picks | /data/daily/<date>/model_selection.json | Home, Live |
| Pipeline status | /data/pipeline/status.json | Pipelines, Live |
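Consumers can build these locations from a couple of helpers, sketched below. The helper names are hypothetical, and the example assumes `<date>` is an ISO `YYYY-MM-DD` string, which the table does not confirm.

```python
from datetime import date
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/data")

def daily_picks_path(day: date) -> PurePosixPath:
    """Path to the daily picks file; assumes <date> is formatted as YYYY-MM-DD."""
    return DATA_ROOT / "daily" / day.isoformat() / "model_selection.json"

def model_artifact_path(model_id: str, name: str) -> PurePosixPath:
    """Path to a per-model artifact such as calibration.json or top_drivers.json."""
    return DATA_ROOT / "models" / model_id / name
```

For example, `model_artifact_path("eqx-m1", "calibration.json")` resolves to the Calibration row in the table.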
Scheduling & Ops
- Deterministic runs; write once, read many
- Health JSON per module; simple green/amber/red
- Idempotent publishers for window updates
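One way to get idempotent publishing is to serialize deterministically, skip the write when the content is unchanged, and otherwise swap the file in atomically. The sketch below also rolls per-module health colours up to a single status. The function names and the rule that unknown states count as red are assumptions for the example.

```python
import json
import os
import tempfile

def publish_json(path, payload):
    """Write payload to path atomically; return False if identical content is already there."""
    body = json.dumps(payload, sort_keys=True, indent=2).encode("utf-8")
    if os.path.exists(path):
        with open(path, "rb") as fh:
            if fh.read() == body:
                return False  # re-running the publisher changes nothing
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(body)
        os.replace(tmp, path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return True

SEVERITY = {"green": 0, "amber": 1, "red": 2}
COLOURS = ["green", "amber", "red"]

def overall_health(module_states):
    """Return the worst colour across modules; unknown values are treated as red."""
    worst = max((SEVERITY.get(s, 2) for s in module_states.values()), default=0)
    return COLOURS[worst]
```

Writing to a temporary file in the same directory and then calling `os.replace` means a reader never observes a half-written file, which fits a write-once, read-many layout.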
Evaluation
- Windowed comparisons vs baselines
- ECE and drift monitors
- Stability checks across regimes
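A windowed comparison against a baseline can be sketched as follows: compound each series over non-overlapping windows and report the difference. The window layout (non-overlapping, fixed length) and the function names are assumptions for illustration.

```python
def compound(returns):
    """Compound a sequence of simple period returns into one total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0

def windowed_excess(model_returns, baseline_returns, window):
    """Compare model and baseline over consecutive non-overlapping windows."""
    if len(model_returns) != len(baseline_returns):
        raise ValueError("return series must be aligned")
    if window <= 0:
        raise ValueError("window must be positive")
    rows = []
    # A trailing partial window is dropped rather than compared on fewer points.
    for start in range(0, len(model_returns) - window + 1, window):
        m = compound(model_returns[start:start + window])
        b = compound(baseline_returns[start:start + window])
        rows.append({"start": start, "model": m, "baseline": b, "excess": m - b})
    return rows
```

Grouping the resulting rows by regime gives a simple stability check: excess return that is positive in only one regime is a warning sign.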