Foundation Intelligence

Foundation Model Datasets

Tags and Keywords: Longitudinal, Workforce, Intelligence, Historical, Skills, Capability, Evolution, Humancapital

Foundation Intelligence Dataset on Opendatabay data marketplace

£15,000

About

Vivameda - Longitudinal Company Evolution Panel

A 70-year longitudinal panel of company workforce evolution. Built as training substrate for AI models that reason about how companies scale, fail, and recover.

Overview

  • 4.2M companies across 100+ countries
  • 48M observed company-year records, 1950–2020
  • 1.88B skill rows, 46.5M capability buckets
  • One row per company per year in the base panel, observed at year-end.
  • Real historical observation, not synthetic, not scraped from the web.
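The headline counts above imply some rough averages that help when sizing storage or planning sampling. The sketch below only divides the figures quoted in the listing; it says nothing about how observations are actually distributed across companies or years, and the "balanced panel" comparison assumes all 71 year-ends from 1950 through 2020 are in scope.

```python
# Headline figures copied from the listing. Everything derived here is an
# average only; the real distribution across companies and years is unknown.
COMPANIES = 4_200_000
COMPANY_YEARS = 48_000_000
SKILL_ROWS = 1_880_000_000
YEARS_COVERED = 2020 - 1950 + 1  # year-end snapshots, endpoints inclusive

# Average number of observed years per company.
years_per_company = COMPANY_YEARS / COMPANIES

# Average number of skill rows attached to one company-year record.
skill_rows_per_company_year = SKILL_ROWS / COMPANY_YEARS

# Share of a fully balanced panel (every company observed in every year)
# that the observed company-year records would fill.
panel_fill = COMPANY_YEARS / (COMPANIES * YEARS_COVERED)

print(f"{years_per_company:.1f} observed years per company")
print(f"{skill_rows_per_company_year:.1f} skill rows per company-year")
print(f"{panel_fill:.1%} of a balanced panel")
```

On these numbers the panel is unbalanced: an average company appears in roughly 11 of the 71 possible years.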

Why It Exists

Modern AI models reason about companies almost entirely from post-2020 web data. They never saw the dot-com bust, the 2008 recovery, or the full seventy-year arc of how organizational structure predicts outcomes.
That's not a knowledge gap. It's a substrate problem.
Vivameda is the substrate.

Era Coverage

1950–1969: Post-war industrial expansion
1970–1979: Stagflation, oil shocks
1980–1989: Deregulation, conglomerate era
1990–1999: Globalization, early internet
2000–2009: Dot-com cycle, 2008 crisis
2010–2017: Platform economy, mobile wave
2018–2020: Pre-COVID dense window (highest observation density)
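
For stratified sampling or per-era evaluation, it can help to tag each company-year record with its era. The following is a minimal sketch that uses the boundaries listed above; the function name `era_for_year` and the constant names are illustrative and not part of the dataset.

```python
from bisect import bisect_right

# (first year of the era, label) pairs, copied from the Era Coverage list.
ERAS = [
    (1950, "Post-war industrial expansion"),
    (1970, "Stagflation, oil shocks"),
    (1980, "Deregulation, conglomerate era"),
    (1990, "Globalization, early internet"),
    (2000, "Dot-com cycle, 2008 crisis"),
    (2010, "Platform economy, mobile wave"),
    (2018, "Pre-COVID dense window"),
]
ERA_STARTS = [start for start, _ in ERAS]
FIRST_YEAR, LAST_YEAR = 1950, 2020


def era_for_year(year: int) -> str:
    """Return the era label for a year inside the panel's 1950-2020 range."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    # bisect_right counts the era starts that are <= year, so the matching
    # era sits one position before that insertion point.
    return ERAS[bisect_right(ERA_STARTS, year) - 1][1]


print(era_for_year(2008))
```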

Product Tiers

Workforce Intelligence (Entry) — Base panel, 19 columns
Landscape Intelligence (Mid) — Workforce + market structure, 24 columns
Workforce and Capability Intelligence (Premium) — Full schema, 30 columns at company-year-bucket grain

Use Cases

  • Foundation model pre-training on company evolution patterns
  • Fine-tuning corpora for company-reasoning systems
  • Evaluation benchmarks for organizational reasoning
  • Ground-truth anchoring for agent systems
  • Alternative data signal for systematic strategies

Delivery

  • CSV / Parquet flat files (one-time license)
  • Snowflake Data Sharing (recurring access)
  • Custom delivery formats on request
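
After receiving a flat-file delivery, a buyer may want to confirm the stated grain before training on it. The sketch below checks a CSV for repeated keys using only the Python standard library. The column names `company_id` and `year` are assumptions, since the listing does not publish the schema, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_keys(csv_file, key_cols=("company_id", "year")):
    """Return the key tuples that occur more than once in an open CSV file.

    key_cols holds hypothetical column names; adjust them to the delivered schema.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(tuple(row[col] for col in key_cols) for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)


# Invented sample standing in for a delivered file.
sample = io.StringIO(
    "company_id,year,headcount\n"
    "c1,2018,120\n"
    "c1,2019,135\n"
    "c2,2019,40\n"
    "c2,2019,41\n"
)
print(duplicate_keys(sample))
```

An empty result means the file has at most one row per key. For the Premium tier, which is delivered at company-year-bucket grain, the bucket column would need to be added to `key_cols`.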

Data Provenance

Aggregated from commercial partnerships and publicly available workforce signals collected over the past decade. No PII. No individual employee records. All observations are aggregated to the company-year grain.

Note on Recency

The dataset terminates at 2020 by design. This is a feature: it provides a stable, reproducible historical training corpus that does not drift between training runs.

Listing Stats

Views: 7
Delivery: Instant download
Listed: 21/04/2026
Updated: 27/04/2026
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
