Alpha family pipeline
15 families aren't yet producing live candidates: the upstream feed is still being ingested, the original source went dark, or the data sits behind a paid API. Each row is honest about why and what the live-data alternative is.
Implementing now
10 families. Upstream feed identified, alt source verified, ingest script in flight.
- jolts hiring momentum (Momentum)
- filing text delta (Text & Filings)
- sunshine mood local hq (Weather)
- election political cycle rotation (Geopolitical)
- ag basis signal (Commodities)
Requires an AMS Datamart key for live data
deps: usda_grain_basis
- clarkson shipping index (Commodities)
Clarkson ClarkSea index is paid; switching to Trading Economics free BDI puller (in progress)
deps: clarkson_shipping
- product review velocity (Alt-Data)
Amazon page-scrape is fragile; HF McAuley-Lab backfill pending for historical (cutoff Sep-2023)
deps: amazon_reviews
- mobile app dau panel (Alt-Data)
- nasa black marble nightlights (Alt-Data)
NASA Earthdata token approved; VNP46A2 LAADS pipeline (rasterio + Census MSA zonal aggregation) build in progress
deps: nasa_black_marble_msa
- usda basis inversion ag long (Commodities)
USDA AMS key is configured, but the live grain-basis feed still needs a valid MARS slug/source mapping; current usda_grain_basis rows are a partial stub.
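The Census MSA zonal aggregation mentioned for the nightlights family can be illustrated with a minimal sketch. It is not the pipeline's actual code: plain Python lists stand in for the raster that rasterio would read, the zone ids are hypothetical, and the fill value is an assumption that should be checked against the product's metadata.

```python
from collections import defaultdict

# Assumed fill marker for missing pixels; verify against the product metadata.
FILL_VALUE = -999.9


def zonal_mean(radiance, zones, fill_value=FILL_VALUE, no_zone=0):
    """Return the mean radiance per zone id.

    radiance and zones are equally shaped 2-D grids (lists of rows).
    Pixels carrying the fill value, or not assigned to any zone, are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rad_row, zone_row in zip(radiance, zones):
        for value, zone in zip(rad_row, zone_row):
            if zone == no_zone or value == fill_value:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Toy 2x3 grid: zone ids 101 and 202 are hypothetical MSA codes.
radiance = [
    [1.0, 3.0, -999.9],
    [5.0, 7.0, 9.0],
]
zones = [
    [101, 101, 202],
    [101, 202, 202],
]
msa_means = zonal_mean(radiance, zones)
print(msa_means)
```

In a real build the zone grid would come from rasterizing the MSA polygons onto the raster's grid, and the loop would normally be vectorized.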
Paid-only
4 families. Best source requires a paid API key. Tracked but not on the free roadmap.
- ofi microstructure (Microstructure)
Polygon NBBO data is paid
deps: nbbo_ticks
- heat wave power gen (Weather)
PJM/ERCOT API keys needed
deps: iso_load_forecasts
- linkedin employee growth (Alt-Data)
Revelio Labs is request-only/WRDS-gated; LinkedIn ToS blocks scraping
deps: linkedin_headcount
- store traffic safegraph (Alt-Data)
SafeGraph → Advan/Dewey requires .edu academic email; commercial path is paid
deps: safegraph_visits
Exploratory
1 family. Prototype-stage; signal not yet validated.
- wayback homepage change (Sentiment)
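The rows above share one shape: family, category, status, an optional note, and optional deps. A minimal sketch of that shape as a record, with a tally per status, is shown here. The class name, field names, and status labels are hypothetical and not taken from the platform; notes are omitted for brevity, and deps appear only where the listing gives them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRow:
    """One row of the pipeline listing (hypothetical structure)."""
    family: str
    category: str
    status: str          # "implementing", "paid", or "exploratory"
    deps: tuple = ()     # upstream dependencies, where the listing names them


ROWS = [
    PipelineRow("jolts hiring momentum", "Momentum", "implementing"),
    PipelineRow("filing text delta", "Text & Filings", "implementing"),
    PipelineRow("sunshine mood local hq", "Weather", "implementing"),
    PipelineRow("election political cycle rotation", "Geopolitical", "implementing"),
    PipelineRow("ag basis signal", "Commodities", "implementing", ("usda_grain_basis",)),
    PipelineRow("clarkson shipping index", "Commodities", "implementing", ("clarkson_shipping",)),
    PipelineRow("product review velocity", "Alt-Data", "implementing", ("amazon_reviews",)),
    PipelineRow("mobile app dau panel", "Alt-Data", "implementing"),
    PipelineRow("nasa black marble nightlights", "Alt-Data", "implementing", ("nasa_black_marble_msa",)),
    PipelineRow("usda basis inversion ag long", "Commodities", "implementing"),
    PipelineRow("ofi microstructure", "Microstructure", "paid", ("nbbo_ticks",)),
    PipelineRow("heat wave power gen", "Weather", "paid", ("iso_load_forecasts",)),
    PipelineRow("linkedin employee growth", "Alt-Data", "paid", ("linkedin_headcount",)),
    PipelineRow("store traffic safegraph", "Alt-Data", "paid", ("safegraph_visits",)),
    PipelineRow("wayback homepage change", "Sentiment", "exploratory"),
]

# Tally rows per status; the totals should match the section headers.
status_counts = Counter(row.status for row in ROWS)
print(dict(status_counts))
```

Keeping the rows as data makes it easy to confirm that the per-section counts and the overall total stay in step with the listing.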
Why publish the pipeline?
Most quant platforms show only what currently works. We also list what doesn't: the upstream feeds that went dark, the paid datasets we're routing around, and the families we're actively unblocking. That way you can see whether the factor you care about is on the way.
Back to the live 278 families