GOOG PRICES 2020-2025 - Daily AI Feature Feed

Finance & Banking Analytics

Tags and Keywords

Goog, Google, AI, Stocks, Sentiment, Quant, Finance, Alphabet, Googlegoog, Analysis, Stock Market, Machine Learning, Alternative Data, Trading, Quantitative, Technical Indicators, Magnificent 7, FAANG, Time Series, Backtesting

GOOG PRICES 2020-2025 - Daily AI Feature Feed Dataset on Opendatabay data marketplace

"No reviews yet"

£199.99

About

Alphabet GOOG 2020-2025 — Daily AI Feature Feed Dataset CSV
94 pre-engineered daily columns (84 features plus 3 metadata fields and 7 forward labels) for Alphabet (GOOG) covering 6 full years of trading data from January 2020 to December 2025. Every row is one trading day. Every column is a calculated value (numeric, apart from the date, day name, and direction labels), with no raw text and no copyrighted content.
Alphabet is the search and digital advertising giant navigating the most significant disruption threat in its history: AI-powered search alternatives. The race between Gemini and ChatGPT/OpenAI, combined with DOJ antitrust lawsuits, Google Cloud growth ambitions, YouTube advertising revenue, and Waymo autonomous driving expansion, creates one of the most complex sentiment landscapes among mega-caps. Whether GOOG is "the AI winner" or "the search disruption victim" depends on which narrative dominates, and our sentiment features track that narrative at daily resolution.
Sentiment scores are derived from 100+ real financial articles analyzed daily through our proprietary Herodote AI pipeline. The inputs are not synthetic: the scores are AI-assigned readings of actual news coverage of Google Search, Gemini AI, Google Cloud, YouTube, antitrust proceedings, and Waymo autonomous driving.

What You Get

  • 1,508 trading days from January 2, 2020 to December 31, 2025
  • 94 columns (3 metadata + 84 features + 7 forward labels)
  • Single CSV file, chronological order
  • One row per trading day — clean, ready for ML ingestion
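A minimal loading sketch, assuming pandas and a local copy of the CSV. The function name `load_feed` and the tiny in-memory sample (with made-up prices and only 3 of the 94 columns) are illustrative, not part of the dataset. It checks the two structural promises above: chronological order and one row per trading day.

```python
import io
import pandas as pd

def load_feed(source):
    """Load the feed CSV and run basic structural checks."""
    df = pd.read_csv(source, parse_dates=["date"])
    # The listing promises chronological order ...
    if not df["date"].is_monotonic_increasing:
        raise ValueError("rows are not in chronological order")
    # ... and exactly one row per trading day.
    if df["date"].duplicated().any():
        raise ValueError("duplicate trading days found")
    return df

# Stand-in for the real file: hypothetical values, 3 of the 94 columns.
sample = io.StringIO(
    "date,year,price_close\n"
    "2020-01-02,2020,100.0\n"
    "2020-01-03,2020,101.5\n"
)
df = load_feed(sample)
print(df.shape)  # (2, 3)
```

On the full file, the same call would be expected to return a frame of shape (1508, 94), per the counts listed above.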

Feature Groups (94 columns)

Metadata (3): date, year, day of week
Price Action (11): closing price (USD), 1d/3d/5d/10d/20d returns, log returns, distance from rolling highs and lows
Technical Indicators (22): RSI-14, SMA/EMA crossovers, Bollinger Bands (position, bandwidth, squeeze), MACD (signal, histogram), ADX, Rate of Change, rolling volatility, momentum quality, up/down streaks
AI Sentiment (14): overall market sentiment and GOOG-specific sentiment (-1 to +1), sentiment spread, 3d/5d moving averages, sentiment momentum, sentiment volatility
Sentiment x Price (3): rolling 10d/20d sentiment-price correlation, divergence flag (when sentiment and price disagree 3+ days)
News Volume (5): daily article count, 20-day moving average, z-score, momentum, spike detection flag
Cross-Asset (12): S&P 100 breadth and returns, market dispersion, sector returns (tech, finance, health, energy, consumer, industrial), GOOG vs sector/market relative performance, rolling beta
Volatility Regime (2): volatility-of-volatility (20d), regime classification (0=low, 1=normal, 2=high, 3=extreme)
Sentiment x Volatility (1): sentiment-volatility regime interaction term
Macro (6): VIX level, VIX change, VIX 5-day MA, treasury yield spread, credit spread proxy, USD Index change
Options (2): ATM implied volatility (approximated as VIX x beta), IV-RV spread
Earnings (2): days to next earnings, earnings day flag
Social Attention (4): Google Trends index and change, Wikipedia pageviews and z-score
Forward Labels (7): 1d/3d/5d forward returns (%), direction labels (UP/DOWN/FLAT), flat zone flag
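Because the column names carry a group prefix (price_, tech_, sent_, and so on), the groups above can be recovered from the header alone. The sketch below is one way to do that; `group_columns` and the `METADATA` set are illustrative names, and note that the sent_ prefix lumps together the AI Sentiment, Sentiment x Price, and Sentiment x Volatility columns.

```python
from collections import defaultdict

# The three metadata columns have no shared prefix, so list them explicitly.
METADATA = {"date", "year", "day_of_week"}

def group_columns(columns):
    """Bucket column names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in columns:
        key = "metadata" if name in METADATA else name.split("_", 1)[0]
        groups[key].append(name)
    return dict(groups)

cols = ["date", "price_close", "tech_rsi_14", "sent_goog",
        "sent_price_corr_10d", "label_dir_1d"]
groups = group_columns(cols)

# Model inputs: everything except metadata and the forward labels.
feature_cols = [c for c in cols
                if c not in METADATA and not c.startswith("label_")]
print(groups["sent"])  # ['sent_goog', 'sent_price_corr_10d']
```

Keeping the label_ columns out of `feature_cols` matters: they are forward-looking and would leak the target into training.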

Why Alphabet Sentiment Is Different

  • Search disruption narrative — ChatGPT and AI-powered search alternatives created existential fear in late 2022; sentiment tracks the ongoing battle between "Google is doomed" and "Google is the AI winner" narratives
  • Gemini AI competition — model releases, benchmark performance, and product integration (AI Overviews) generate sentiment waves that directly impact valuation multiples
  • DOJ antitrust proceedings — the landmark search monopoly ruling (2024) and potential remedies create ongoing legal sentiment risk with massive revenue implications
  • Google Cloud growth — cloud market share dynamics vs AWS and Azure generate quarterly sentiment shifts tied to growth rate expectations
  • YouTube advertising — the second-largest digital ad platform generates its own sentiment cycle tied to creator economy, Shorts monetization, and connected TV growth
  • Waymo autonomous driving — commercial robotaxi expansion creates a high-optionality sentiment layer with periodic catalysts
Our AI reads 100+ articles daily and distills these narratives into numerical sentiment scores — giving your models signal that traditional technical analysis misses.

Collection Methodology

Data is produced by MarketSignal Solutions using our proprietary Herodote AI pipeline:
  1. News Collection: GDELT (Global Database of Events, Language, and Tone) provides 100+ Alphabet-related articles daily from global financial media
  2. AI Sentiment Analysis: Google Gemini processes each article batch, scoring overall market sentiment and GOOG-specific sentiment on a -1.0 to +1.0 scale
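  3. Price Data: Yahoo Finance provides GOOG closing prices, S&P 100 cross-asset data, sector returns, VIX, yields, and credit spreads; the options features are approximated from VIX and beta rather than read from GOOG options chains (see Known Limitations)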
  4. Feature Engineering: the 91 non-metadata columns (84 features plus 7 forward labels) are computed from raw inputs using numpy: technical indicators, sentiment derivatives, cross-asset correlations, macro signals, and forward labels
  5. Quality Control: Automated audit checks coverage, NaN rates, column integrity, and data consistency
No copyrighted article text is included — only our own calculated numerical features derived from public market data and public news feeds.
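To make step 4 concrete, here is a rough sketch of how a few of the derived columns could be rebuilt from the raw daily inputs. It uses pandas for brevity rather than the numpy code the pipeline is described as using, and the exact formulas are assumptions: momentum is taken as a simple 3-day difference and the z-score uses the sample standard deviation over the same 20-day window. Only the spike rule (count > 2x MA) is stated in the column reference.

```python
import pandas as pd

def add_derived(df):
    """Rebuild a handful of sentiment and news-volume columns (assumed formulas)."""
    out = df.copy()
    # Ticker minus overall sentiment, as described for sent_spread.
    out["sent_spread"] = out["sent_goog"] - out["sent_overall"]
    out["sent_goog_ma3"] = out["sent_goog"].rolling(3).mean()
    # Assumption: momentum = change versus 3 trading days earlier.
    out["sent_goog_mom3"] = out["sent_goog"].diff(3)
    ma20 = out["news_count"].rolling(20).mean()
    sd20 = out["news_count"].rolling(20).std()
    out["news_count_ma20"] = ma20
    out["news_count_zscore"] = (out["news_count"] - ma20) / sd20
    # Column reference: spike when count > 2x MA (0 during the warmup window here).
    out["news_spike"] = (out["news_count"] > 2 * ma20).astype(float)
    return out

# Hypothetical inputs: flat sentiment and volume, then a jump on the last day.
demo = pd.DataFrame({
    "sent_overall": [0.1] * 25,
    "sent_goog": [0.2] * 24 + [0.5],
    "news_count": [100] * 24 + [400],
})
result = add_derived(demo)
print(result[["sent_spread", "sent_goog_mom3", "news_spike"]].tail(1))
```

The first 19 rows of the 20-day columns come out as NaN, which is the same warmup effect noted under Data Quality.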

Complete Column Reference

# | Column | Type | Group | Description
---|--------|------|-------|------------
1 | date | date | Metadata | Trading date (YYYY-MM-DD)
2 | year | int | Metadata | Calendar year
3 | day_of_week | string | Metadata | Day name (Monday-Friday)
4 | price_close | float | Price Action | Closing price (USD)
5 | price_return_1d | float | Price Action | 1-day return (%)
6 | price_return_3d | float | Price Action | 3-day return (%)
7 | price_return_5d | float | Price Action | 5-day return (%)
8 | price_return_10d | float | Price Action | 10-day return (%)
9 | price_return_20d | float | Price Action | 20-day return (%)
10 | price_log_return_1d | float | Price Action | 1-day log return
11 | price_dist_high_10d | float | Price Action | Distance from 10-day high (%)
12 | price_dist_high_20d | float | Price Action | Distance from 20-day high (%)
13 | price_dist_low_10d | float | Price Action | Distance from 10-day low (%)
14 | price_dist_low_20d | float | Price Action | Distance from 20-day low (%)
15 | tech_rsi_14 | float | Technical | RSI 14-day (0-100, >70 overbought, <30 oversold)
16 | tech_sma_5 | float | Technical | 5-day Simple Moving Average (USD)
17 | tech_sma_20 | float | Technical | 20-day Simple Moving Average (USD)
18 | tech_sma_5_dist | float | Technical | Distance from 5-day SMA (%)
19 | tech_sma_20_dist | float | Technical | Distance from 20-day SMA (%)
20 | tech_ema_20 | float | Technical | 20-day Exponential Moving Average (USD)
21 | tech_ema_20_dist | float | Technical | Distance from 20-day EMA (%)
22 | tech_bollinger_pos | float | Technical | Position within Bollinger Bands (0=lower, 0.5=middle, 1=upper)
23 | tech_bollinger_bw | float | Technical | Bollinger Band width (%)
24 | tech_bollinger_squeeze | float | Technical | Squeeze indicator (1 = bandwidth at or below its 10th percentile; a squeeze can precede a breakout)
25 | tech_macd | float | Technical | MACD line
26 | tech_macd_signal | float | Technical | MACD signal line
27 | tech_macd_hist | float | Technical | MACD histogram (positive = bullish momentum)
28 | tech_adx | float | Technical | ADX trend strength (>25 trending, <20 ranging)
29 | tech_roc_5 | float | Technical | 5-day Rate of Change (%)
30 | tech_roc_10 | float | Technical | 10-day Rate of Change (%)
31 | tech_roc_20 | float | Technical | 20-day Rate of Change (%)
32 | tech_streak | int | Technical | Consecutive up/down days (positive = up streak)
33 | tech_vol_5d | float | Technical | 5-day realized volatility (annualized %)
34 | tech_vol_10d | float | Technical | 10-day realized volatility (%)
35 | tech_vol_20d | float | Technical | 20-day realized volatility (%)
36 | tech_momentum_quality | float | Technical | Momentum consistency score (ROC adjusted for volatility)
37 | sent_overall | float | AI Sentiment | Overall market sentiment (-1.0 bearish to +1.0 bullish)
38 | sent_goog | float | AI Sentiment | Alphabet-specific sentiment (-1.0 bearish to +1.0 bullish)
39 | sent_spread | float | AI Sentiment | Ticker minus overall sentiment (positive = stock more bullish than market)
40 | sent_overall_ma3 | float | AI Sentiment | 3-day MA of overall sentiment
41 | sent_goog_ma3 | float | AI Sentiment | 3-day MA of GOOG sentiment
42 | sent_overall_ma5 | float | AI Sentiment | 5-day MA of overall sentiment
43 | sent_goog_ma5 | float | AI Sentiment | 5-day MA of GOOG sentiment
44 | sent_overall_mom3 | float | AI Sentiment | 3-day overall sentiment momentum
45 | sent_goog_mom3 | float | AI Sentiment | 3-day GOOG sentiment momentum
46 | sent_overall_mom5 | float | AI Sentiment | 5-day overall sentiment momentum
47 | sent_goog_mom5 | float | AI Sentiment | 5-day GOOG sentiment momentum
48 | sent_overall_vol5 | float | AI Sentiment | 5-day overall sentiment volatility
49 | sent_overall_vol10 | float | AI Sentiment | 10-day overall sentiment volatility
50 | sent_goog_vol5 | float | AI Sentiment | 5-day GOOG sentiment volatility
51 | sent_price_corr_10d | float | Sent x Price | 10-day rolling sentiment-price correlation
52 | sent_price_corr_20d | float | Sent x Price | 20-day rolling sentiment-price correlation
53 | sent_price_diverge | float | Sent x Price | Divergence flag (1 when sentiment and price disagree 3+ consecutive days)
54 | news_count | int | News Volume | Articles collected that day
55 | news_count_ma20 | float | News Volume | 20-day article count moving average
56 | news_count_zscore | float | News Volume | Z-score vs 20-day window (>2 = unusual volume)
57 | news_count_mom5 | float | News Volume | 5-day article count momentum
58 | news_spike | float | News Volume | Binary flag for abnormal news volume (count > 2x MA)
59 | mkt_sp100_breadth | float | Cross-Asset | S&P 100 market breadth (% of stocks up, 0-100)
60 | mkt_sp100_return | float | Cross-Asset | S&P 100 equal-weight return (%)
61 | mkt_dispersion | float | Cross-Asset | Cross-sectional return dispersion (%)
62 | mkt_tech_return | float | Cross-Asset | Tech sector return (%)
63 | mkt_finance_return | float | Cross-Asset | Finance sector return (%)
64 | mkt_health_return | float | Cross-Asset | Healthcare sector return (%)
65 | mkt_energy_return | float | Cross-Asset | Energy sector return (%)
66 | mkt_consumer_return | float | Cross-Asset | Consumer sector return (%)
67 | mkt_industrial_return | float | Cross-Asset | Industrial sector return (%)
68 | mkt_goog_vs_tech | float | Cross-Asset | Alphabet minus tech sector return (%)
69 | mkt_goog_vs_market | float | Cross-Asset | Alphabet minus S&P 100 return (%)
70 | mkt_goog_beta_20d | float | Cross-Asset | 20-day rolling beta vs S&P 100
71 | vol_of_vol_20d | float | Vol Regime | Volatility of volatility (20-day rolling)
72 | vol_regime | int | Vol Regime | Regime: 0=low, 1=normal, 2=high, 3=extreme
73 | sent_vol_regime_interaction | float | Interaction | Alphabet sentiment x volatility regime (amplified signal in high-vol periods)
74 | macro_vix | float | Macro | VIX level (CBOE Volatility Index)
75 | macro_vix_change_1d | float | Macro | 1-day VIX percentage change
76 | macro_vix_ma5 | float | Macro | VIX 5-day moving average
77 | macro_yield_spread | float | Macro | 10Y minus short-term Treasury yield spread (%)
78 | macro_credit_spread | float | Macro | High-yield credit spread proxy (LQD/HYG ratio)
79 | macro_dxy_change | float | Macro | US Dollar Index 1-day percentage change (%)
80 | options_iv_atm | float | Options | ATM implied volatility (%) via VIX x beta approximation
81 | options_iv_rv_spread | float | Options | IV minus realized vol (positive = fear premium)
82 | earnings_days_to_next | float | Earnings | Days to next quarterly earnings (0 = earnings day, capped at 90)
83 | earnings_is_earnings_day | float | Earnings | Binary flag: 1 on earnings day, 0 otherwise
84 | attention_wikipedia_views | float | Social Attention | Daily Wikipedia pageviews (1-day lagged)
85 | attention_wikipedia_zscore | float | Social Attention | Wikipedia views z-score (>2 = unusual interest)
86 | attention_google_trends | float | Social Attention | Google Trends index (0-100)
87 | attention_google_trends_change | float | Social Attention | Google Trends change vs prior period
88 | label_return_1d | float | Forward Labels | Next-day return (%)
89 | label_dir_1d | string | Forward Labels | Next-day direction (UP/DOWN/FLAT)
90 | label_return_3d | float | Forward Labels | 3-day forward return (%)
91 | label_dir_3d | string | Forward Labels | 3-day forward direction
92 | label_return_5d | float | Forward Labels | 5-day forward return (%)
93 | label_dir_5d | string | Forward Labels | 5-day forward direction
94 | label_flat_1d | float | Forward Labels | Flat flag (1 if absolute next-day return < 0.3%)
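The forward-label columns at the end of the table can be illustrated with a short sketch. It assumes the direction labels use the same 0.3% band as label_flat_1d; the listing states that threshold only for the flat flag, so the band for UP/DOWN/FLAT (and for the 3d and 5d horizons) is an assumption, as is the function name `forward_labels`.

```python
import pandas as pd

def forward_labels(close, horizon=1, flat_band=0.3):
    """Forward return (%), direction, and flat flag from a close-price Series."""
    # Return from today's close to the close `horizon` trading days ahead.
    fwd = (close.shift(-horizon) / close - 1.0) * 100.0
    direction = pd.Series(index=close.index, dtype="object")
    direction[fwd > 0] = "UP"
    direction[fwd < 0] = "DOWN"
    # Assumption: moves inside the band override UP/DOWN and count as FLAT.
    direction[fwd.abs() < flat_band] = "FLAT"
    # Keep NaN where the forward return is not yet realized.
    flat = (fwd.abs() < flat_band).astype(float).where(fwd.notna())
    return pd.DataFrame({
        f"label_return_{horizon}d": fwd,
        f"label_dir_{horizon}d": direction,
        f"label_flat_{horizon}d": flat,
    })

# Hypothetical closes: +0.1%, then about +1.9%, then about -1.0%.
close = pd.Series([100.0, 100.1, 102.0, 101.0])
labels = forward_labels(close)
print(labels["label_dir_1d"].tolist()[:3])  # ['FLAT', 'UP', 'DOWN']
```

The last row has no label because its forward close does not exist yet, which matches the empty trailing labels described under Data Quality.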

Data Quality

  • NaN values limited to first ~30 rows (indicator warmup period for rolling windows)
  • Last 1-5 rows may have empty forward labels (not yet realized)
  • Zero gaps in sentiment coverage — every trading day has article data
  • Cross-asset features sourced from Yahoo Finance; occasional NaN on mismatched trading holidays
  • Quality rating: 5/5 (automated audit verified)
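Given the warmup and trailing-label gaps described above, a typical first step before training is to trim both ends and audit what is left. The sketch below is one hedged way to do that; the helper names are illustrative, and the default of 30 warmup rows simply mirrors the approximate figure quoted above.

```python
import numpy as np
import pandas as pd

def trim_for_training(df, label_cols, warmup=30):
    """Drop the rolling-window warmup rows and any rows without realized labels."""
    trimmed = df.iloc[warmup:]
    return trimmed.dropna(subset=label_cols)

def nan_report(df):
    """Share of missing values per column, highest first."""
    return df.isna().mean().sort_values(ascending=False)

# Hypothetical 40-row frame: 19 warmup NaNs in a 20-day feature,
# and 2 unrealized labels at the end.
demo = pd.DataFrame({
    "tech_sma_20": [np.nan] * 19 + list(range(21)),
    "label_return_1d": [0.1] * 38 + [np.nan] * 2,
})
clean = trim_for_training(demo, ["label_return_1d"])
print(len(clean))  # 8
print(nan_report(demo))
```

Running `nan_report` on the trimmed frame is a quick way to spot the occasional cross-asset NaN on mismatched trading holidays.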

Known Limitations

  • Alphabet-specific sentiment (sent_goog) is derived from AI analysis of English-language financial news via GDELT. The pipeline does not distinguish between Google Search, YouTube, Cloud, and Waymo sentiment — all contribute to a single GOOG score.
  • Options IV (options_iv_atm) is approximated using VIX x GOOG beta, not from actual GOOG options chains. This may understate event IV around earnings and major AI product announcements.
  • Alphabet's dual-class share structure (GOOG vs GOOGL) means some news sources reference GOOGL — our pipeline captures both but outputs under the GOOG ticker.
  • Earnings dates are pattern-estimated with yfinance confirmation. Accuracy is +/- 3 days for some historical quarters.
  • The post-ChatGPT period (late 2022 onward) represents a structural break in search sentiment — models should consider regime-aware features when training across the full 6-year window.
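The VIX x beta approximation mentioned in the limitations above can be sketched in a few lines. The column names come from the column reference; the choice of tech_vol_20d as the realized-volatility leg of the spread is an assumption, since the listing does not say which window is used.

```python
import pandas as pd

def approximate_iv(df):
    """Proxy ATM implied volatility as index-level VIX scaled by rolling beta."""
    out = df.copy()
    # Listing: options_iv_atm is VIX x beta, not read from an options chain.
    out["options_iv_atm"] = out["macro_vix"] * out["mkt_goog_beta_20d"]
    # Assumption: realized vol is the 20-day column; positive spread = fear premium.
    out["options_iv_rv_spread"] = out["options_iv_atm"] - out["tech_vol_20d"]
    return out

# Hypothetical inputs for two days.
demo = pd.DataFrame({
    "macro_vix": [20.0, 30.0],
    "mkt_goog_beta_20d": [1.2, 1.0],
    "tech_vol_20d": [22.0, 35.0],
})
result = approximate_iv(demo)
print(result[["options_iv_atm", "options_iv_rv_spread"]])
```

Because VIX is an index-level measure, a proxy built this way cannot reflect single-name event premium, which is why it may understate implied volatility around earnings.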

Use Cases

  • ML model training for GOOG price direction prediction across 6 years of search dominance, AI disruption fear, and competitive response
  • AI disruption event study: how ChatGPT launch, Gemini releases, and AI Overviews rollout shifted the sentiment narrative and repriced the stock
  • Antitrust analysis: DOJ ruling sentiment impact, remedy proposals, and market repricing of regulatory risk
  • Cross-asset analysis: GOOG vs tech sector — Alphabet as a digital advertising bellwether
  • YouTube revenue narrative tracking: sentiment as a leading indicator for ad revenue growth acceleration
  • Feature engineering baseline for digital advertising, AI infrastructure, or search/discovery strategies
  • Cloud infrastructure sentiment: Google Cloud vs AWS/Azure competitive dynamics as a growth narrative driver

Pairs Well With

Alphabet GOOG Live 2026 — subscribe for weekly updates and extend this dataset into 2026.

License

CUSTOM — Single User Commercial License. Full rights to use for internal trading research, analysis, ML model training, AI/LLM fine-tuning, and model commercialization. Dataset itself may not be resold or redistributed. Contact contact@marketsignal.solutions for multi-seat licensing.

AI Training Rights

Non-exclusive, worldwide, perpetual right to train, fine-tune, and evaluate ML models. Derivative works and commercialization of model outputs permitted. Dataset redistribution prohibited.

Not investment advice. This dataset is intended for quantitative research, ML model development, and academic analysis only. Past patterns do not guarantee future results. Alphabet stock prices are influenced by company-specific developments, sector dynamics, macroeconomic conditions, and market sentiment that may not be fully captured in historical data.
Data is produced by MarketSignal Solutions using our proprietary Herodote AI pipeline. All source data is derived from publicly available market prices and public news APIs/feeds. No copyrighted article text is included — only our own calculated numerical features.

Listing Stats

  • Views: 4
  • Downloads: 0
  • Listed: 11/03/2026
  • Updated: 13/03/2026
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
