Alpha family pipeline

15 families aren't yet producing live candidates: either the upstream feed is still being ingested, the original source went dark, or the data sits behind a paid API. Each row states why the family is blocked and what the live-data alternative is.

Implementing now

10 families

Upstream feed identified, alt source verified, ingest script in flight.

  • jolts hiring momentum (Momentum)
  • filing text delta (Text & Filings)
  • sunshine mood local hq (Weather)
  • election political cycle rotation (Geopolitical)
  • ag basis signal (Commodities)

    AMS Datamart key for live data

    deps: usda_grain_basis

  • clarkson shipping index (Commodities)

    The Clarkson ClarkSea index is paid; switching to a free Baltic Dry Index (BDI) puller from Trading Economics (in progress)

    deps: clarkson_shipping

  • product review velocity (Alt-Data)

    Amazon page scraping is fragile; a historical backfill from the Hugging Face McAuley-Lab dataset (cutoff Sep 2023) is pending

    deps: amazon_reviews

  • mobile app dau panel (Alt-Data)
  • nasa black marble nightlights (Alt-Data)

    NASA Earthdata token approved; the VNP46A2 LAADS pipeline (rasterio + zonal aggregation to Census MSAs) is being built

    deps: nasa_black_marble_msa

  • usda basis inversion ag long (Commodities)

    USDA AMS key is configured, but the live grain-basis feed still needs a valid MARS slug/source mapping; current usda_grain_basis rows are a partial stub.
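
The "zonal aggregation" step named in the nasa black marble nightlights entry can be illustrated on a toy grid. This is a minimal sketch, not the pipeline's actual code: it assumes the radiance raster and a same-shaped grid of MSA labels have already been produced (in practice the raster would be read with rasterio and the labels rasterized from Census MSA polygons). The function name `zonal_mean`, the zone ids, and the `nodata` value are all hypothetical.

```python
from collections import defaultdict


def zonal_mean(radiance, zones, nodata=None):
    """Return the mean radiance per zone id.

    `radiance` and `zones` are same-shaped lists of rows. Pixels whose
    zone is None (outside every MSA) or whose value equals `nodata`
    are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rad_row, zone_row in zip(radiance, zones):
        for value, zone in zip(rad_row, zone_row):
            if zone is None or value == nodata:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Toy 3x4 radiance grid; -1.0 is a hypothetical fill value.
radiance = [
    [10.0, 12.0, 3.0, -1.0],
    [8.0, 14.0, 5.0, 4.0],
    [0.5, 0.5, 6.0, 2.0],
]
# Hypothetical zone labels: "A" and "B" stand in for two MSAs,
# None marks pixels that fall outside any MSA boundary.
zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    [None, None, "B", "B"],
]

means = zonal_mean(radiance, zones, nodata=-1.0)
print(means)
```

On this toy input, zone "A" averages four pixels and zone "B" averages five, because the fill-value pixel is excluded. A real build would likely vectorize this with numpy rather than loop over pixels.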

Paid-only

4 families

Best source requires a paid API key. Tracked but not on the free roadmap.

  • ofi microstructure (Microstructure)

    Polygon NBBO feed is paid

    deps: nbbo_ticks

  • heat wave power gen (Weather)

    PJM/ERCOT API keys needed

    deps: iso_load_forecasts

  • linkedin employee growth (Alt-Data)

    Revelio Labs is request-only/WRDS-gated; LinkedIn ToS blocks scraping

    deps: linkedin_headcount

  • store traffic safegraph (Alt-Data)

    SafeGraph → Advan/Dewey requires .edu academic email; commercial path is paid

    deps: safegraph_visits
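
To show why the ofi microstructure family depends on NBBO ticks, here is a minimal sketch of an order flow imbalance calculation. It assumes "ofi" refers to order flow imbalance in the commonly used best-quote form, where each quote update contributes a signed size change at the bid and the ask; the page does not confirm which definition the family uses. The `Quote` record and both function names are hypothetical and are not tied to any vendor's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """One NBBO snapshot (hypothetical record layout)."""
    bid: float
    bid_size: int
    ask: float
    ask_size: int


def ofi_increment(prev: Quote, cur: Quote) -> int:
    """Signed order-flow contribution of a single quote update."""
    # Bid side: size added at an unchanged or higher bid counts as buying
    # pressure; size lost at an unchanged or lower bid counts against it.
    bid_term = 0
    if cur.bid >= prev.bid:
        bid_term += cur.bid_size
    if cur.bid <= prev.bid:
        bid_term -= prev.bid_size
    # Ask side mirrors the bid side and enters with the opposite sign.
    ask_term = 0
    if cur.ask <= prev.ask:
        ask_term += cur.ask_size
    if cur.ask >= prev.ask:
        ask_term -= prev.ask_size
    return bid_term - ask_term


def ofi(quotes: list[Quote]) -> int:
    """Sum the increments over consecutive quote pairs."""
    return sum(ofi_increment(p, c) for p, c in zip(quotes, quotes[1:]))


# Toy sequence: bid size grows, the bid ticks up, then the ask retreats.
quotes = [
    Quote(100.00, 500, 100.02, 400),
    Quote(100.00, 700, 100.02, 400),
    Quote(100.01, 300, 100.02, 250),
    Quote(100.01, 300, 100.03, 600),
]
total = ofi(quotes)
print(total)
```

Every update in the toy sequence reflects net buying pressure, so the total is positive. The calculation needs every best-quote change, which is why a tick-level NBBO feed is listed as the dependency.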

Exploratory

1 family

Prototype-stage; signal not yet validated.

  • wayback homepage change (Sentiment)

Why publish the pipeline?

Most quant platforms show only what currently works. We also list what doesn't: the upstream feeds that went dark, the paid datasets we're routing around, and the families we're actively unblocking, so you can see whether the factor you care about is on the way.

Back to the live 278 families
For informational and educational purposes only. Not financial advice.