AMZN PRICES 2020-2025 - Daily AI Feature Feed

Finance & Banking Analytics

Tags and Keywords

AMZN, Amazon, AI, Stocks, Finance, Sentiment, Quant, Analysis, Stock Market, Machine Learning, Alternative Data, Trading, Quantitative, Technical Indicators, Magnificent 7, Cloud Computing, FAANG, Time Series, Backtesting

AMZN PRICES 2020-2025 - Daily AI Feature Feed Dataset on Opendatabay data marketplace

£199.99

About

Amazon AMZN 2020-2025 — Daily AI Feature Feed Dataset CSV
94 pre-engineered daily columns (84 features plus metadata and forward labels) for Amazon (AMZN), covering six full years of trading data from January 2020 to December 2025. Every row is one trading day. Apart from the date, day-of-week, and direction-label columns, every column is a calculated numerical value: no raw text, no copyrighted content.
Amazon runs a dual-engine business: e-commerce and AWS cloud computing generate fundamentally different sentiment dynamics on the same ticker. AWS is the profit driver while retail is the revenue driver, and the market reprices AMZN based on which narrative dominates in a given quarter. Add the advertising revenue surge, AI infrastructure investments (Bedrock, the Anthropic partnership), and the logistics transformation, and AMZN becomes one of the more multi-dimensional sentiment stories in the market.
Sentiment scores are derived from 100+ real financial articles analyzed daily through our proprietary Herodote AI pipeline. The inputs are not synthetic or LLM-generated text: the scores are market signals extracted from actual news coverage of AWS growth, retail margins, AI infrastructure bets, and Amazon's logistics and advertising evolution.

What You Get

  • 1,508 trading days from January 2, 2020 to December 31, 2025
  • 94 columns (3 metadata + 84 features + 7 forward labels)
  • Single CSV file, chronological order
  • One row per trading day — clean, ready for ML ingestion
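
A minimal loading sketch in Python with pandas, assuming the `date` column parses as YYYY-MM-DD as stated in the column reference. The function name `load_feed` and the three-row sample (made-up values, only 2 of the 94 columns besides the date) are illustrative stand-ins, not part of the dataset; in practice the path of the purchased CSV would be passed instead.

```python
import io
import pandas as pd

def load_feed(source):
    """Load the feature feed and check the basic shape promises of the listing."""
    df = pd.read_csv(source, parse_dates=["date"])
    # The listing promises one row per trading day in chronological order.
    if not df["date"].is_monotonic_increasing:
        raise ValueError("rows are not in chronological order")
    if df["date"].duplicated().any():
        raise ValueError("duplicate trading days found")
    return df.set_index("date")

# Tiny stand-in for the real file; the values are made up for illustration.
sample = io.StringIO(
    "date,price_close,sent_amzn\n"
    "2020-01-02,94.90,0.12\n"
    "2020-01-03,93.75,-0.05\n"
    "2020-01-06,95.14,0.30\n"
)
feed = load_feed(sample)
print(feed.shape)  # (3, 2)
```

Indexing by date keeps later rolling-window and train/test operations aligned to trading days rather than row positions.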

Feature Groups (94 columns)

Metadata (3): date, year, day of week
Price Action (11): closing price (USD), 1d/3d/5d/10d/20d returns, log returns, distance from rolling highs and lows
Technical Indicators (22): RSI-14, SMA/EMA levels and distances, Bollinger Bands (position, bandwidth, squeeze), MACD (line, signal, histogram), ADX, Rate of Change, rolling volatility, momentum quality, up/down streaks
AI Sentiment (14): overall market sentiment and AMZN-specific sentiment (-1 to +1), sentiment spread, 3d/5d moving averages, sentiment momentum, sentiment volatility
Sentiment x Price (3): rolling 10d/20d sentiment-price correlation, divergence flag (when sentiment and price disagree 3+ days)
News Volume (5): daily article count, 20-day moving average, z-score, momentum, spike detection flag
Cross-Asset (12): S&P 100 breadth and returns, market dispersion, sector returns (tech, finance, health, energy, consumer, industrial), AMZN vs sector/market relative performance, rolling beta
Volatility Regime (2): volatility-of-volatility (20d), regime classification (0=low, 1=normal, 2=high, 3=extreme)
Sentiment x Volatility (1): sentiment-volatility regime interaction term
Macro (6): VIX level, VIX change, VIX 5-day MA, treasury yield spread, credit spread proxy, USD Index change
Options (2): implied volatility ATM, IV-RV spread
Earnings (2): days to next earnings, earnings day flag
Social Attention (4): Google Trends index and change, Wikipedia pageviews and z-score
Forward Labels (7): 1d/3d/5d forward returns (%), direction labels (UP/DOWN/FLAT), flat zone flag
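
The groups above map onto the column-name prefixes used in the Complete Column Reference, so columns can be bucketed by name. The helper below is a hypothetical sketch (the name `column_group` is not part of the dataset); it relies only on the prefixes visible in the reference table and checks the two interaction groups before the generic `sent_` prefix.

```python
def column_group(name):
    """Map a column name to its feature group using the listing's name prefixes."""
    if name in ("date", "year", "day_of_week"):
        return "Metadata"
    # More specific names must be tested before the generic "sent_" prefix.
    if name.startswith("sent_price_"):
        return "Sentiment x Price"
    if name == "sent_vol_regime_interaction":
        return "Sentiment x Volatility"
    prefixes = [
        ("price_", "Price Action"),
        ("tech_", "Technical Indicators"),
        ("sent_", "AI Sentiment"),
        ("news_", "News Volume"),
        ("mkt_", "Cross-Asset"),
        ("vol_", "Volatility Regime"),
        ("macro_", "Macro"),
        ("options_", "Options"),
        ("earnings_", "Earnings"),
        ("attention_", "Social Attention"),
        ("label_", "Forward Labels"),
    ]
    for prefix, group in prefixes:
        if name.startswith(prefix):
            return group
    return "Unknown"

print(column_group("sent_price_corr_10d"))  # Sentiment x Price
print(column_group("tech_vol_5d"))          # Technical Indicators
```

Grouping this way makes it easy to run ablations, for example training with and without the AI Sentiment columns.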

Why Amazon Sentiment Is Different

  • AWS growth narrative — cloud market share battles vs Azure and Google Cloud dominate analyst sentiment; quarterly growth rate expectations drive multi-week sentiment trends
  • Retail margin trajectory — the shift from growth-at-all-costs to profitability creates sentiment regime changes around cost-cutting and efficiency announcements
  • AI infrastructure pivot — Bedrock, Trainium chips, and the Anthropic partnership create a new AI narrative layer tracked daily in tech and financial media
  • Advertising revenue surge — Amazon's third profit engine generates increasingly positive sentiment as ad revenue consistently beats expectations
  • Logistics transformation — same-day delivery, drone delivery, and fulfillment cost optimization create operational sentiment signals
  • Labor relations narrative — unionization efforts, warehouse conditions, and hiring/layoff cycles generate periodic negative sentiment waves
Our AI reads 100+ articles daily and distills these narratives into numerical sentiment scores, giving your models a signal that price-based technical analysis alone may miss.

Collection Methodology

Data is produced by MarketSignal Solutions using our proprietary Herodote AI pipeline:
  1. News Collection: GDELT (Global Database of Events, Language, and Tone) provides 100+ Amazon-related articles daily from global financial media
  2. AI Sentiment Analysis: Google Gemini processes each article batch, scoring overall market sentiment and AMZN-specific sentiment on a -1.0 to +1.0 scale
  3. Price Data: Yahoo Finance provides AMZN closing prices, S&P 100 cross-asset data, sector returns, VIX, yields, and credit spread inputs; options IV is approximated from VIX and beta rather than taken from options chains
  4. Feature Engineering: 91 columns (84 features and 7 forward labels) are computed from raw inputs using NumPy, including technical indicators, sentiment derivatives, cross-asset correlations, macro signals, and forward labels
  5. Quality Control: Automated audit checks coverage, NaN rates, column integrity, and data consistency
No copyrighted article text is included — only our own calculated numerical features derived from public market data and public news feeds.
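
To give a feel for the feature-engineering step, here is a hedged sketch of how a Bollinger position and bandwidth could be computed from closing prices. The listing does not state the window or band width, so the 20-day window, the 2-standard-deviation bands, and the population standard deviation are assumptions, and the function name `bollinger_features` is hypothetical; this is not the vendor's exact formula.

```python
import numpy as np
import pandas as pd

def bollinger_features(close, window=20, num_std=2.0):
    """Position within the bands (0=lower, 0.5=middle, 1=upper) and bandwidth (%)."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std(ddof=0)  # assumption: population std
    upper = mid + num_std * std
    lower = mid - num_std * std
    width = upper - lower
    # Leave the position undefined (NaN) when the bands collapse to zero width.
    pos = (close - lower) / width.where(width > 0)
    bandwidth_pct = 100.0 * width / mid
    return pd.DataFrame(
        {"tech_bollinger_pos": pos, "tech_bollinger_bw": bandwidth_pct}
    )

# Illustrative input: a steadily rising price series (made-up values).
demo_close = pd.Series(np.arange(1.0, 31.0))
demo = bollinger_features(demo_close)
print(demo.tail(3))
```

The first `window - 1` rows come out as NaN, which is the kind of warmup gap described under Data Quality.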

Complete Column Reference

# | Column | Type | Group | Description
---|--------|------|-------|------------
1 | date | date | Metadata | Trading date (YYYY-MM-DD)
2 | year | int | Metadata | Calendar year
3 | day_of_week | string | Metadata | Day name (Monday-Friday)
4 | price_close | float | Price Action | Closing price (USD)
5 | price_return_1d | float | Price Action | 1-day return (%)
6 | price_return_3d | float | Price Action | 3-day return (%)
7 | price_return_5d | float | Price Action | 5-day return (%)
8 | price_return_10d | float | Price Action | 10-day return (%)
9 | price_return_20d | float | Price Action | 20-day return (%)
10 | price_log_return_1d | float | Price Action | 1-day log return
11 | price_dist_high_10d | float | Price Action | Distance from 10-day high (%)
12 | price_dist_high_20d | float | Price Action | Distance from 20-day high (%)
13 | price_dist_low_10d | float | Price Action | Distance from 10-day low (%)
14 | price_dist_low_20d | float | Price Action | Distance from 20-day low (%)
15 | tech_rsi_14 | float | Technical | RSI 14-day (0-100, >70 overbought, <30 oversold)
16 | tech_sma_5 | float | Technical | 5-day Simple Moving Average (USD)
17 | tech_sma_20 | float | Technical | 20-day Simple Moving Average (USD)
18 | tech_sma_5_dist | float | Technical | Distance from 5-day SMA (%)
19 | tech_sma_20_dist | float | Technical | Distance from 20-day SMA (%)
20 | tech_ema_20 | float | Technical | 20-day Exponential Moving Average (USD)
21 | tech_ema_20_dist | float | Technical | Distance from 20-day EMA (%)
22 | tech_bollinger_pos | float | Technical | Position within Bollinger Bands (0=lower, 0.5=middle, 1=upper)
23 | tech_bollinger_bw | float | Technical | Bollinger Band width (%)
24 | tech_bollinger_squeeze | float | Technical | Squeeze indicator (1 = bandwidth below its 10th percentile; may precede a breakout)
25 | tech_macd | float | Technical | MACD line
26 | tech_macd_signal | float | Technical | MACD signal line
27 | tech_macd_hist | float | Technical | MACD histogram (positive = bullish momentum)
28 | tech_adx | float | Technical | ADX trend strength (>25 trending, <20 ranging)
29 | tech_roc_5 | float | Technical | 5-day Rate of Change (%)
30 | tech_roc_10 | float | Technical | 10-day Rate of Change (%)
31 | tech_roc_20 | float | Technical | 20-day Rate of Change (%)
32 | tech_streak | int | Technical | Consecutive up/down days (positive = up streak)
33 | tech_vol_5d | float | Technical | 5-day realized volatility (annualized %)
34 | tech_vol_10d | float | Technical | 10-day realized volatility (%)
35 | tech_vol_20d | float | Technical | 20-day realized volatility (%)
36 | tech_momentum_quality | float | Technical | Momentum consistency score (ROC adjusted for volatility)
37 | sent_overall | float | AI Sentiment | Overall market sentiment (-1.0 bearish to +1.0 bullish)
38 | sent_amzn | float | AI Sentiment | Amazon-specific sentiment (-1.0 bearish to +1.0 bullish)
39 | sent_spread | float | AI Sentiment | Ticker minus overall sentiment (positive = stock more bullish than market)
40 | sent_overall_ma3 | float | AI Sentiment | 3-day MA of overall sentiment
41 | sent_amzn_ma3 | float | AI Sentiment | 3-day MA of AMZN sentiment
42 | sent_overall_ma5 | float | AI Sentiment | 5-day MA of overall sentiment
43 | sent_amzn_ma5 | float | AI Sentiment | 5-day MA of AMZN sentiment
44 | sent_overall_mom3 | float | AI Sentiment | 3-day overall sentiment momentum
45 | sent_amzn_mom3 | float | AI Sentiment | 3-day AMZN sentiment momentum
46 | sent_overall_mom5 | float | AI Sentiment | 5-day overall sentiment momentum
47 | sent_amzn_mom5 | float | AI Sentiment | 5-day AMZN sentiment momentum
48 | sent_overall_vol5 | float | AI Sentiment | 5-day overall sentiment volatility
49 | sent_overall_vol10 | float | AI Sentiment | 10-day overall sentiment volatility
50 | sent_amzn_vol5 | float | AI Sentiment | 5-day AMZN sentiment volatility
51 | sent_price_corr_10d | float | Sent x Price | 10-day rolling sentiment-price correlation
52 | sent_price_corr_20d | float | Sent x Price | 20-day rolling sentiment-price correlation
53 | sent_price_diverge | float | Sent x Price | Divergence flag (1 when sentiment and price disagree 3+ consecutive days)
54 | news_count | int | News Volume | Articles collected that day
55 | news_count_ma20 | float | News Volume | 20-day article count moving average
56 | news_count_zscore | float | News Volume | Z-score vs 20-day window (>2 = unusual volume)
57 | news_count_mom5 | float | News Volume | 5-day article count momentum
58 | news_spike | float | News Volume | Binary flag for abnormal news volume (count > 2x MA)
59 | mkt_sp100_breadth | float | Cross-Asset | S&P 100 market breadth (% of stocks up, 0-100)
60 | mkt_sp100_return | float | Cross-Asset | S&P 100 equal-weight return (%)
61 | mkt_dispersion | float | Cross-Asset | Cross-sectional return dispersion (%)
62 | mkt_tech_return | float | Cross-Asset | Tech sector return (%)
63 | mkt_finance_return | float | Cross-Asset | Finance sector return (%)
64 | mkt_health_return | float | Cross-Asset | Healthcare sector return (%)
65 | mkt_energy_return | float | Cross-Asset | Energy sector return (%)
66 | mkt_consumer_return | float | Cross-Asset | Consumer sector return (%)
67 | mkt_industrial_return | float | Cross-Asset | Industrial sector return (%)
68 | mkt_amzn_vs_tech | float | Cross-Asset | Amazon minus tech sector return (%)
69 | mkt_amzn_vs_market | float | Cross-Asset | Amazon minus S&P 100 return (%)
70 | mkt_amzn_beta_20d | float | Cross-Asset | 20-day rolling beta vs S&P 100
71 | vol_of_vol_20d | float | Vol Regime | Volatility of volatility (20-day rolling)
72 | vol_regime | int | Vol Regime | Regime: 0=low, 1=normal, 2=high, 3=extreme
73 | sent_vol_regime_interaction | float | Interaction | Amazon sentiment x volatility regime (amplified signal in high-vol periods)
74 | macro_vix | float | Macro | VIX level (CBOE Volatility Index)
75 | macro_vix_change_1d | float | Macro | 1-day VIX percentage change
76 | macro_vix_ma5 | float | Macro | VIX 5-day moving average
77 | macro_yield_spread | float | Macro | 10Y minus short-term Treasury yield spread (%)
78 | macro_credit_spread | float | Macro | High-yield credit spread proxy (LQD/HYG ratio)
79 | macro_dxy_change | float | Macro | US Dollar Index 1-day percentage change (%)
80 | options_iv_atm | float | Options | ATM implied volatility (%) via VIX x beta approximation
81 | options_iv_rv_spread | float | Options | IV minus realized vol (positive = fear premium)
82 | earnings_days_to_next | float | Earnings | Days to next quarterly earnings (0 = earnings day, capped at 90)
83 | earnings_is_earnings_day | float | Earnings | Binary flag: 1 on earnings day, 0 otherwise
84 | attention_wikipedia_views | float | Social Attention | Daily Wikipedia pageviews (1-day lagged)
85 | attention_wikipedia_zscore | float | Social Attention | Wikipedia views z-score (>2 = unusual interest)
86 | attention_google_trends | float | Social Attention | Google Trends index (0-100)
87 | attention_google_trends_change | float | Social Attention | Google Trends change vs prior period
88 | label_return_1d | float | Forward Labels | Next-day return (%)
89 | label_dir_1d | string | Forward Labels | Next-day direction (UP/DOWN/FLAT)
90 | label_return_3d | float | Forward Labels | 3-day forward return (%)
91 | label_dir_3d | string | Forward Labels | 3-day forward direction
92 | label_return_5d | float | Forward Labels | 5-day forward return (%)
93 | label_dir_5d | string | Forward Labels | 5-day forward direction
94 | label_flat_1d | float | Forward Labels | Flat flag (1 if absolute next-day return < 0.3%)
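
The forward-label columns can be reproduced approximately from closing prices. The sketch below is an illustration under assumptions: the listing gives the 0.3% band only for `label_flat_1d`, so using the same band for the UP/DOWN/FLAT direction (and for longer horizons) is a guess, and the function name `forward_labels` and its output column names are hypothetical.

```python
import numpy as np
import pandas as pd

def forward_labels(close, horizon=1, flat_band=0.3):
    """Forward return (%), direction label, and flat flag over `horizon` trading days."""
    fwd = 100.0 * (close.shift(-horizon) / close - 1.0)
    # Assumption: FLAT means |forward return| < flat_band, matching the flat flag.
    raw_dir = np.where(fwd >= flat_band, "UP",
                       np.where(fwd <= -flat_band, "DOWN", "FLAT"))
    direction = pd.Series(raw_dir, index=close.index, dtype=object)
    known = fwd.notna()
    return pd.DataFrame({
        "label_return": fwd,
        # Rows whose future price is not yet known stay empty.
        "label_dir": direction.where(known),
        "label_flat": (fwd.abs() < flat_band).astype(float).where(known),
    })

# Made-up prices for illustration only.
labels = forward_labels(pd.Series([100.0, 101.0, 101.1, 99.0]))
print(labels)
```

Because every label looks `horizon` days ahead, the last `horizon` rows have no label, which matches the empty trailing labels noted under Data Quality.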

Data Quality

  • Apart from the cases listed below, NaN values are limited to the first ~30 rows (warmup period for rolling-window indicators)
  • Last 1-5 rows may have empty forward labels (not yet realized)
  • Zero gaps in sentiment coverage — every trading day has article data
  • Cross-asset features sourced from Yahoo Finance; occasional NaN on mismatched trading holidays
  • Quality rating: 5/5 (automated audit verified)
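
One way to act on these notes before training is to drop the warmup rows and any rows without realized labels, then report which columns still contain NaN. This is a sketch, not a prescribed workflow: the function name `trim_for_training` is hypothetical, the 30-row warmup is the approximate figure quoted above, and the demo frame uses made-up values.

```python
import numpy as np
import pandas as pd

def trim_for_training(df, label_cols, warmup_rows=30):
    """Drop warmup rows and unlabeled rows; report columns that still have NaN."""
    trimmed = df.iloc[warmup_rows:].dropna(subset=label_cols)
    nan_counts = trimmed.isna().sum()
    return trimmed, nan_counts[nan_counts > 0]

# Made-up 40-row frame: warmup NaN, one holiday-mismatch NaN, two unlabeled rows.
demo_df = pd.DataFrame({
    "tech_rsi_14": [np.nan] * 30 + [50.0] * 10,
    "mkt_tech_return": [0.1] * 35 + [np.nan] + [0.1] * 4,
    "label_return_1d": [0.2] * 38 + [np.nan] * 2,
})
clean, leftover = trim_for_training(demo_df, ["label_return_1d"])
print(len(clean))   # 8
print(leftover)     # mkt_tech_return    1
```

Reporting the leftover NaN counts instead of dropping those rows leaves the choice (impute, forward-fill, or drop) to the modeller.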

Known Limitations

  • Amazon-specific sentiment (sent_amzn) is derived from AI analysis of English-language financial news via GDELT. The pipeline does not distinguish between AWS-specific and retail-specific sentiment — both contribute to a single AMZN score.
  • Options IV (options_iv_atm) is approximated using VIX x AMZN beta, not from actual AMZN options chains. This may understate IV around earnings and Prime Day events.
  • Amazon's 20:1 stock split (June 2022) is reflected in adjusted prices, but sentiment dynamics around the split event may create a temporary regime break.
  • Earnings dates are pattern-estimated with yfinance confirmation. Accuracy is +/- 3 days for some historical quarters.
  • The COVID e-commerce boom period (2020-2021) represents an anomalous demand environment — models should be aware of structural breaks when training across the full 6-year period.
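
The options limitation is easier to see as arithmetic. The sketch below follows the "VIX x beta" description literally; which beta and which realized-volatility window the pipeline uses is not stated, so the inputs here are assumptions, and the function names are hypothetical.

```python
def approx_iv_atm(vix_level, beta):
    """Approximate ATM implied volatility (%) as VIX level times the stock's beta."""
    return vix_level * beta

def iv_rv_spread(iv_atm, realized_vol):
    """IV minus realized volatility; positive values indicate a fear premium."""
    return iv_atm - realized_vol

# Made-up inputs: VIX at 20, beta of 1.5, realized volatility of 24%.
iv = approx_iv_atm(20.0, 1.5)
spread = iv_rv_spread(iv, 24.0)
print(iv, spread)  # 30.0 6.0
```

Since the proxy moves only with VIX and beta, it cannot rise on its own ahead of an AMZN-specific event, which is why it may understate IV around earnings.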

Use Cases

  • ML model training for AMZN price direction prediction across 6 years of e-commerce and cloud computing cycles
  • Dual-engine analysis: studying how AWS-driven and retail-driven news cycles move the combined AMZN sentiment score (the dataset does not provide separate AWS and retail scores)
  • Event study: CEO transition impact, cost-cutting announcements, and AI investment sentiment waves
  • Cross-asset analysis: AMZN vs tech sector — when does Amazon decouple from the Nasdaq narrative?
  • Advertising revenue narrative tracking: sentiment as a leading indicator for the third profit engine
  • Feature engineering baseline for cloud computing, e-commerce, or mega-cap portfolio strategies
  • Earnings surprise prediction using pre-earnings sentiment-price divergence and the days-to-earnings features
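
For the model-training use cases, a chronological split that keeps every `label_` column out of the feature matrix avoids look-ahead leakage. The sketch below assumes the frame is indexed by trading date; the function name `chronological_split`, the split date, and the demo values are illustrative only.

```python
import pandas as pd

def chronological_split(df, split_date, label_col="label_dir_1d"):
    """Split by date and exclude all forward-label columns from the features."""
    split_ts = pd.Timestamp(split_date)
    usable = df.dropna(subset=[label_col])
    # Forward labels describe the future, so none of them may be used as inputs.
    feature_cols = [c for c in usable.columns if not c.startswith("label_")]
    train = usable[usable.index < split_ts]
    test = usable[usable.index >= split_ts]
    return train[feature_cols], train[label_col], test[feature_cols], test[label_col]

# Made-up four-day frame; the last row has no realized label yet.
idx = pd.to_datetime(["2023-12-28", "2023-12-29", "2024-01-02", "2024-01-03"])
demo_feed = pd.DataFrame({
    "sent_amzn": [0.1, 0.2, -0.1, 0.3],
    "label_return_1d": [0.5, -0.2, 0.1, None],
    "label_dir_1d": ["UP", "FLAT", "FLAT", None],
}, index=idx)
X_train, y_train, X_test, y_test = chronological_split(demo_feed, "2024-01-01")
print(len(X_train), len(X_test))  # 2 1
```

Non-numeric columns such as `day_of_week` remain in the feature set and would need encoding or removal before fitting most models.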

Pairs Well With

Amazon AMZN Live 2026 — subscribe for weekly updates and extend this dataset into 2026.

License

CUSTOM — Single User Commercial License. Full rights to use for internal trading research, analysis, ML model training, AI/LLM fine-tuning, and model commercialization. Dataset itself may not be resold or redistributed. Contact contact@marketsignal.solutions for multi-seat licensing.

AI Training Rights

Non-exclusive, worldwide, perpetual right to train, fine-tune, and evaluate ML models. Derivative works and commercialization of model outputs permitted. Dataset redistribution prohibited.

Not investment advice. This dataset is intended for quantitative research, ML model development, and academic analysis only. Past patterns do not guarantee future results. Amazon stock prices are influenced by company-specific developments, sector dynamics, macroeconomic conditions, and market sentiment that may not be fully captured in historical data.

Listing Stats

Views: 17
Downloads: 0
Listed: 08/03/2026
Updated: 13/03/2026
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
