Markets · Benchmarks · March 2026
Do Benchmark Breakthroughs Actually Matter?
A seven-phase event study asking whether state-of-the-art (SOTA) benchmark events move token consumption or stock prices. Usage shifts within two weeks. Markets don't care.
March 25, 2026 · Event study using Epoch AI + OpenRouter + yfinance
392 SOTA Events: 2023-02-24 to 2026-03-05
46 Benchmarks: Reasoning, coding, agents, knowledge
5 Companies: Google, OpenAI, Anthropic, Meta, +
5 Tickers Tested: GOOGL, AMZN, META, NVDA, MSFT
Stock Price Impact
Event Study Results: Cumulative Abnormal Returns
| Company | Events | Mean CAR | Median CAR | t-stat | p-value | Significant? |
| --- | --- | --- | --- | --- | --- | --- |
| OpenAI | 169 | -1.344% | -1.732% | -3.73 | 0.0003 | YES |
| Google | 64 | +2.425% | +2.443% | 3.47 | 0.0009 | YES |
| Microsoft | 4 | -1.968% | -1.968% | -inf | 0.0000 | YES |
| Meta | 6 | +11.316% | +15.104% | 2.99 | 0.0306 | YES |
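As a rough illustration of how figures like these could be produced, here is a minimal sketch of the market-model CAR described under Methodology, followed by a one-sample t-statistic across events. The function names, the inclusive window convention, and the use of a one-sample t-test are assumptions for illustration, not the study's confirmed code. Under this sketch, a sample whose CARs are all identical has zero standard deviation, so the t-statistic is infinite; that would be consistent with a row where mean and median coincide and the t-stat reads -inf.

```python
import math
from statistics import mean, stdev

def fit_market_model(stock, market):
    """OLS fit of stock = alpha + beta * market; returns (alpha, beta)."""
    mx, my = mean(market), mean(stock)
    sxx = sum((x - mx) ** 2 for x in market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market, stock))
    beta = sxy / sxx
    return my - beta * mx, beta

def cumulative_abnormal_return(stock, market, event_idx,
                               est=(-120, -30), win=(-2, 10)):
    """CAR over the event window, with alpha/beta fit on the estimation window.

    stock and market are aligned lists of daily returns; event_idx is the
    position of the event day. Windows are inclusive offsets from event_idx
    (assumption: offsets count rows of the return series).
    """
    e0, e1 = event_idx + est[0], event_idx + est[1] + 1
    w0, w1 = event_idx + win[0], event_idx + win[1] + 1
    if e0 < 0 or w1 > len(stock):
        raise ValueError("not enough data around the event")
    alpha, beta = fit_market_model(stock[e0:e1], market[e0:e1])
    # Abnormal return = actual return minus the market-model prediction.
    return sum(r - (alpha + beta * m)
               for r, m in zip(stock[w0:w1], market[w0:w1]))

def t_stat(cars):
    """One-sample t-statistic for H0: mean CAR = 0."""
    m, s = mean(cars), stdev(cars)
    if s == 0:
        # Identical CARs: zero variance, so the statistic is +/- infinity.
        return math.copysign(math.inf, m) if m else 0.0
    return m / (s / math.sqrt(len(cars)))
```

Calling `cumulative_abnormal_return` once per event and passing the resulting list to `t_stat` gives one value per company. A sample that returns an infinite statistic is worth checking for duplicated events before reading the result as significant.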
API Usage Impact
Usage Event Study Results: Token Share Change
| Company | Events | Mean Share Δ | Median Share Δ | t-stat | p-value | Significant? |
| --- | --- | --- | --- | --- | --- | --- |
| Google | 64 | +4.23% | +1.69% | 1.59 | 0.1166 | no |
| Anthropic | 91 | +49.05% | +35.72% | 7.39 | 0.0000 | YES |
| OpenAI | 110 | +17.01% | -2.38% | 2.87 | 0.0049 | YES |
| Meta | 5 | -11.96% | -11.96% | -inf | 0.0000 | YES |
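The share changes in this table compare an 8-week pre-event average with a 4-week post-event average, as described under Methodology. A minimal sketch of that comparison for a single event follows. It assumes the change is expressed as a relative percentage of the pre-event average and that the event week counts as the first post-event week; neither detail is stated in the source, and the function name is hypothetical.

```python
from statistics import mean

def share_change(weekly_share, event_week, pre_weeks=8, post_weeks=4):
    """Relative change (%) in mean token share, post-event vs pre-event.

    weekly_share: a provider's token share per week (e.g. 0.12 for 12%).
    event_week: index of the week containing the SOTA event.
    Assumption: the event week is the first week of the post window.
    """
    if event_week < pre_weeks or event_week + post_weeks > len(weekly_share):
        raise ValueError("event too close to the edge of the series")
    pre_avg = mean(weekly_share[event_week - pre_weeks:event_week])
    post_avg = mean(weekly_share[event_week:event_week + post_weeks])
    if pre_avg == 0:
        raise ValueError("zero pre-event share; relative change undefined")
    return 100.0 * (post_avg - pre_avg) / pre_avg
```

For example, a provider whose share averages 0.10 over the eight weeks before an event and 0.15 over the four weeks from the event onward would show a change of about +50%.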
Interactive Charts
[Charts: Event Timeline; Benchmark Dominance; CAR by Company; CAR Distribution; CAR by Benchmark Category; CAR vs Score Improvement; Gain vs Lose SOTA; NVIDIA Cross-Company; Usage Timeline; Usage Change by Company; Usage Change by Category; Usage Impact Heatmap; Tokens vs USD Comparison; SDK Downloads; Granger Causality; Cross-Correlations (Returns); Cross-Correlations (USD Share); Events vs Returns & USD Share]
Methodology
Approach
- Benchmark events auto-detected from 48 Epoch AI benchmark datasets: when a model sets a new highest score, it's a SOTA event.
- Stock event study uses the classic market model (R = α + β·R_SPY, with SPY as the market proxy). Estimation window: [-120, -30] days. Event window: [-2, +10] days, both relative to the event date.
- Usage analysis compares OpenRouter token share: 8-week pre-average vs 4-week post-average around events.
- Granger causality tests whether lagged SOTA event counts predict stock returns (and vice versa).
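The detection rule in the first bullet can be sketched as a running maximum per benchmark. This is an illustrative sketch only: the tuple layout and function name are hypothetical, and it assumes that higher scores are better, that ties do not count as a new SOTA, and that the first recorded result on a benchmark counts as an event.

```python
from datetime import date

def detect_sota_events(results):
    """Return the results that set a new best score on their benchmark.

    results: iterable of (release_date, benchmark, model, score) tuples.
    Assumptions: higher is better, ties are not events, and the first
    result seen on a benchmark is treated as an event.
    """
    best = {}    # benchmark -> best score seen so far
    events = []
    for day, bench, model, score in sorted(results, key=lambda r: r[0]):
        prev = best.get(bench)
        if prev is None or score > prev:
            events.append({"date": day, "benchmark": bench, "model": model,
                           "score": score, "previous_best": prev})
            best[bench] = score
    return events
```

Keeping `previous_best` on each event makes it straightforward to relate the size of the score improvement to the market or usage response afterwards.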
Key Limitations
- Model launches come with marketing, pricing, and partnership announcements, so it is hard to isolate the benchmark signal alone.
- AI revenue is small relative to the market cap of GOOGL/AMZN/META, so benchmark events may be immaterial to their share prices.
- OpenRouter usage is a proxy, not total market. Enterprise API usage is invisible.
- OpenAI is privately held, so it is proxied via MSFT, which adds noise.