Markets · Benchmarks · March 2026

Do Benchmark Breakthroughs Actually Matter?

A 7-phase event study asking whether SOTA events move token consumption or stock prices. Usage shifts within two weeks. Markets don't care.

SOTA Events: 392 (2023-02-24 to 2026-03-05)
Benchmarks: 46 (Reasoning, coding, agents, knowledge)
Companies: 5 (Google, OpenAI, Anthropic, Meta, +)
Tickers Tested: 5 (GOOGL, AMZN, META, NVDA, MSFT)

Stock Price Impact

Event Study Results: Cumulative Abnormal Returns

Company    Events  Mean CAR  Median CAR  t-stat  p-value  Significant?
OpenAI     169     -1.344%   -1.732%     -3.73   0.0003   YES
Google     64      +2.425%   +2.443%     3.47    0.0009   YES
Microsoft  4       -1.968%   -1.968%     -inf    0.0000   YES
Meta       6       +11.316%  +15.104%    2.99    0.0306   YES
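
For readers who want to see how a cumulative abnormal return of this kind is typically computed, the following is a minimal sketch and not the study's actual code. It assumes daily returns for the stock and for SPY held in two aligned lists, treats the windows [-120, -30] and [-2, +10] as trading-day offsets from the event, and fits the market model by ordinary least squares. The function names `fit_market_model` and `cumulative_abnormal_return` are hypothetical.

```python
from statistics import mean


def fit_market_model(stock, market):
    """OLS fit of stock = alpha + beta * market (illustrative helper)."""
    mx, my = mean(market), mean(stock)
    sxx = sum((x - mx) ** 2 for x in market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market, stock))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def cumulative_abnormal_return(stock, market, event_idx,
                               est=(-120, -30), win=(-2, 10)):
    """Sum of abnormal returns over the event window.

    stock, market: aligned lists of daily returns.
    event_idx: position of the event day in those lists.
    est, win: inclusive offsets relative to the event day
              (assumed here to be trading days).
    """
    e0, e1 = event_idx + est[0], event_idx + est[1]
    w0, w1 = event_idx + win[0], event_idx + win[1]
    if e0 < 0 or w1 >= len(stock):
        raise ValueError("not enough data around the event")
    # Fit alpha and beta on the estimation window only, so the event
    # itself cannot contaminate the expected-return model.
    alpha, beta = fit_market_model(stock[e0:e1 + 1], market[e0:e1 + 1])
    # Abnormal return = actual return minus the model's expected return.
    return sum(stock[t] - (alpha + beta * market[t])
               for t in range(w0, w1 + 1))
```

Averaging such per-event values within a company gives a mean CAR comparable in form to the column above. How overlapping events are handled is not stated in the source and is not modeled here.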

API Usage Impact

Usage Event Study Results: Token Share Change

Company    Events  Mean Share Δ  Median Share Δ  t-stat  p-value  Significant?
Google     64      +4.23%        +1.69%          1.59    0.1166   no
Anthropic  91      +49.05%       +35.72%         7.39    0.0000   YES
OpenAI     110     +17.01%       -2.38%          2.87    0.0049   YES
Meta       5       -11.96%       -11.96%         -inf    0.0000   YES
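
The share change for a single event can be sketched as below, comparing an 8-week pre-event average with a 4-week post-event average. This is an illustration under stated assumptions, not the study's code: the source does not say whether Share Δ is a relative change or a percentage-point difference, nor whether the event week counts as "post". Here the default is a relative change and the event week starts the post window. The function name `share_change` is hypothetical.

```python
from statistics import mean


def share_change(weekly_share, event_week, pre_weeks=8, post_weeks=4,
                 relative=True):
    """Change in token share around an event.

    weekly_share: list of a company's weekly token share (0..1).
    event_week: index of the week containing the event
                (assumed to be the first week of the post window).
    relative: if True return (post - pre) / pre, otherwise post - pre.
    """
    if event_week - pre_weeks < 0:
        raise ValueError("not enough pre-event weeks")
    pre = weekly_share[event_week - pre_weeks:event_week]
    post = weekly_share[event_week:event_week + post_weeks]
    if len(post) < post_weeks:
        raise ValueError("not enough post-event weeks")
    pre_avg, post_avg = mean(pre), mean(post)
    diff = post_avg - pre_avg
    return diff / pre_avg if relative else diff
```

For example, a share that averages 10% before an event and 15% after it yields a relative change of 0.5 (+50%) or an absolute change of 0.05 (5 percentage points), which shows how much the choice of definition affects the reported figure.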

Interactive Charts

Charts: Event Timeline; Benchmark Dominance; CAR by Company; CAR Distribution; CAR by Benchmark Category; CAR vs Score Improvement; Gain vs Lose SOTA; NVIDIA Cross-Company; Usage Timeline; Usage Change by Company; Usage Change by Category; Usage Impact Heatmap; Tokens vs USD Comparison; SDK Downloads; Granger Causality; Cross-Correlations (Returns); Cross-Correlations (USD Share); Events vs Returns & USD Share

Methodology

Approach

  • Benchmark events auto-detected from 48 Epoch AI benchmark datasets: when a model sets a new highest score, it's a SOTA event.
  • Stock event study uses the classic market model (R = α + β·R_SPY, with SPY returns as the market factor). Estimation window [-120, -30] days; event window [-2, +10] days, both relative to the event date.
  • Usage analysis compares OpenRouter token share: 8-week pre-average vs 4-week post-average around events.
  • Granger causality tests whether lagged SOTA event counts predict stock returns (and vice versa).
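
The event-detection rule in the first bullet amounts to tracking a running maximum per benchmark. The sketch below illustrates that idea only. The `Result` record, its field names, and `detect_sota_events` are hypothetical and do not reflect the Epoch AI data schema. It also assumes that the first result on a benchmark is a baseline and not an event, and that ties do not count, neither of which the source specifies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Result:
    # Hypothetical record layout, for illustration only.
    benchmark: str
    model: str
    company: str
    released: date
    score: float


def detect_sota_events(results):
    """Return (result, improvement) pairs for every new per-benchmark high."""
    best = {}    # benchmark -> best score seen so far
    events = []
    for r in sorted(results, key=lambda r: (r.benchmark, r.released)):
        prev = best.get(r.benchmark)
        if prev is None:
            # Assumption: the first score on a benchmark is a baseline.
            best[r.benchmark] = r.score
        elif r.score > prev:
            # Strictly higher than every earlier score: a SOTA event.
            events.append((r, r.score - prev))
            best[r.benchmark] = r.score
    # Present events chronologically across benchmarks.
    events.sort(key=lambda e: e[0].released)
    return events
```

Each returned pair carries the score improvement, which is the quantity a comparison such as "CAR vs Score Improvement" would need alongside the event date and company.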

Key Limitations

  • Model launches come with marketing, pricing, and partnership announcements — hard to isolate the benchmark signal alone.
  • AI revenue is a small fraction of GOOGL/AMZN/META market cap — benchmark events may be immaterial.
  • OpenRouter usage is a proxy, not total market. Enterprise API usage is invisible.
  • OpenAI is proxied via MSFT, which adds noise.