Opendatabay APP

Global Startup Activity Dataset

NLP / Natural Language Processing

Tags and Keywords

Startups

Investments

Acquisitions

Funding

Venture

Global Startup Activity Dataset on the Opendatabay data marketplace

Free

About

This dataset offers a snapshot of the startup world and the venture capital ecosystem around it. It covers organisations, individuals, company news, funding rounds, acquisitions, and initial public offerings (IPOs). The startup sector is vibrant: hundreds of new companies emerge daily, and venture capital has become a significant asset class, with annual investments exceeding $100 billion in the US alone. According to the listing, this dataset has not previously been published on platforms such as Kaggle.

Columns

The dataset comprises several tables. The acquisitions.csv table has the following columns:
  • id: An index for the record.
  • acquisition_id: The unique identifier for each acquisition event.
  • acquiring_object_id: The unique identifier for the entity that made the acquisition.
  • acquired_object_id: The unique identifier for the entity that was acquired.
  • term_code: The type of payment used in the acquisition, for example 'cash'. Approximately 80% of values are null.
  • price_amount: The monetary amount paid for the acquisition, expressed in the currency given by price_currency_code. Values range widely, from 0 to 2.6 trillion.
  • price_currency_code: The currency in which the transaction took place, with USD being the most common (98%).
  • acquired_at: The date on which the acquisition deal occurred. Dates range from 24 March 1966 to 12 December 2013.
  • source_url: The URL of the information source for the acquisition details. Approximately 10% of values are null.
  • source_description: A brief description of the information source. Approximately 10% of values are null.
  • created_at: The date when the record was first created in the dataset, ranging from 1 June 2007 to 12 December 2013.
  • updated_at: The date when the record was last updated, ranging from 6 February 2008 to 12 December 2013.
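The column list above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. It assumes that dates are stored in ISO form (YYYY-MM-DD, optionally followed by a time) and that nulls appear as empty strings; neither is confirmed by the listing. The two sample rows and the identifiers such as org-1 are hypothetical placeholders, not values taken from the dataset.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows that mimic the documented column layout.
SAMPLE_CSV = """id,acquisition_id,acquiring_object_id,acquired_object_id,term_code,price_amount,price_currency_code,acquired_at,source_url,source_description,created_at,updated_at
1,1,org-1,org-2,,20000000,USD,2007-05-30,http://example.com/a,Example source,2007-05-31 22:19:54,2008-05-21 19:23:44
2,7,org-3,org-4,cash,60000000,USD,2007-07-01,,,2007-07-03 08:14:50,2011-05-06 21:51:05
"""


def _optional(value):
    """Map the empty string (assumed null marker) to None."""
    return value if value != "" else None


def _optional_date(value):
    """Parse the leading YYYY-MM-DD part of a value, or return None if empty."""
    return date.fromisoformat(value[:10]) if value else None


def parse_acquisition(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    return {
        "id": int(row["id"]),
        "acquisition_id": row["acquisition_id"],
        "acquiring_object_id": row["acquiring_object_id"],
        "acquired_object_id": row["acquired_object_id"],
        "term_code": _optional(row["term_code"]),
        "price_amount": float(row["price_amount"]) if row["price_amount"] else None,
        "price_currency_code": _optional(row["price_currency_code"]),
        "acquired_at": _optional_date(row["acquired_at"]),
        "source_url": _optional(row["source_url"]),
        "source_description": _optional(row["source_description"]),
        "created_at": _optional_date(row["created_at"]),
        "updated_at": _optional_date(row["updated_at"]),
    }


def load_acquisitions(handle):
    """Read every row from an open CSV file handle."""
    return [parse_acquisition(row) for row in csv.DictReader(handle)]


records = load_acquisitions(io.StringIO(SAMPLE_CSV))
```

To read the real file, the same function could be called as `load_acquisitions(open("acquisitions.csv", newline="", encoding="utf-8"))`, adjusting the date handling if the actual format differs.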

Distribution

The dataset is provided in CSV format and comprises 11 tables that can be joined on their unique IDs. The acquisitions.csv file is 2.1 MB in size and contains approximately 9,500 records. Most of its columns are fully populated (id, acquisition_id, acquiring_object_id, price_amount, created_at, and updated_at each have 9,562 valid entries), but term_code (about 80% null), source_url, and source_description (about 10% null each) have notable gaps. No extensive data quality checks have been performed yet.
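Since no extensive quality checks have been run, it may be worth measuring the missing-value share per column and testing a join before relying on the data. The sketch below shows one way to do both with plain Python. The `organisations` table, its `name` column, and all row values are hypothetical: the listing states that the tables join on unique IDs but does not name the other tables or their columns.

```python
# Hypothetical rows; empty strings stand in for nulls (an assumption).
acquisitions = [
    {"acquisition_id": "1", "acquiring_object_id": "org-a", "term_code": ""},
    {"acquisition_id": "2", "acquiring_object_id": "org-b", "term_code": "cash"},
    {"acquisition_id": "3", "acquiring_object_id": "org-a", "term_code": ""},
    {"acquisition_id": "4", "acquiring_object_id": "org-z", "term_code": ""},
]

# Hypothetical lookup table keyed by the same kind of unique ID.
organisations = [
    {"id": "org-a", "name": "Example Acquirer A"},
    {"id": "org-b", "name": "Example Acquirer B"},
]


def null_share(rows, column):
    """Return the fraction of rows whose value in `column` is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


def attach_acquirer(acquisition_rows, organisation_rows):
    """Left-join acquisitions to organisations on acquiring_object_id.

    Rows without a match keep an `acquirer` value of None, so unmatched
    IDs stay visible instead of being dropped silently.
    """
    by_id = {org["id"]: org for org in organisation_rows}
    return [
        {**row, "acquirer": by_id.get(row["acquiring_object_id"])}
        for row in acquisition_rows
    ]


term_code_gap = null_share(acquisitions, "term_code")
joined = attach_acquirer(acquisitions, organisations)
unmatched = [row["acquisition_id"] for row in joined if row["acquirer"] is None]
```

Keeping unmatched rows in the result makes it easy to count how many IDs fail to resolve, which is a quick first check on referential integrity between tables.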

Usage

This dataset is ideal for various analytical applications, including:
  • Exploratory data analysis of the startup ecosystem.
  • Tracking and analysing investment trends over time.
  • Clustering venture capital funds based on their existing investments.
  • Predicting startup outcomes, such as which startups will secure further funding rounds, be acquired, or file for an IPO.
  • Mapping the network of individuals involved in the startup ecosystem.
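As an illustration of trend tracking over time, the sketch below counts acquisitions per year and sums disclosed prices per year and currency. It assumes rows have already been parsed into typed values (a `date` for acquired_at, a float for price_amount), and it treats a price of 0 as undisclosed, which is an assumption rather than something the listing states. The sample rows are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical, already-parsed rows for illustration only.
rows = [
    {"acquired_at": date(2010, 3, 1), "price_amount": 5_000_000.0, "price_currency_code": "USD"},
    {"acquired_at": date(2010, 9, 15), "price_amount": 0.0, "price_currency_code": "USD"},
    {"acquired_at": date(2012, 1, 20), "price_amount": 2_000_000.0, "price_currency_code": "EUR"},
    {"acquired_at": None, "price_amount": 1_000_000.0, "price_currency_code": "USD"},
]


def deals_per_year(rows):
    """Count acquisitions by calendar year, skipping rows without a date."""
    return Counter(
        row["acquired_at"].year for row in rows if row["acquired_at"] is not None
    )


def disclosed_value_per_year(rows):
    """Sum price_amount by (year, currency).

    Amounts in different currencies are kept apart rather than added
    together. Rows with no date, or with a missing or zero price, are skipped.
    """
    totals = defaultdict(float)
    for row in rows:
        if row["acquired_at"] is None or not row["price_amount"]:
            continue
        key = (row["acquired_at"].year, row["price_currency_code"])
        totals[key] += row["price_amount"]
    return dict(totals)


yearly_counts = deals_per_year(rows)
yearly_values = disclosed_value_per_year(rows)
```

Grouping by currency avoids mixing units; given the very wide range reported for price_amount, it may also be sensible to inspect extreme values before summing.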

Coverage

The dataset covers information up to December 2013. Acquisition dates (acquired_at) span 24 March 1966 to 12 December 2013, while records were created and updated between June 2007 and December 2013. The dataset is not restricted to a particular geography, although about 98% of transactions are denominated in USD. No demographic scope is specified.

License

Creative Commons Attribution 4.0 International (CC BY 4.0).

Who Can Use It

This dataset is suitable for:
  • Data Analysts and Scientists: For building predictive models and conducting market research.
  • Venture Capitalists and Investors: To identify investment patterns, analyse trends, and inform investment decisions.
  • Researchers and Academics: For studying economic trends, innovation, and network dynamics within the startup world.
  • Entrepreneurs: To understand market conditions, potential acquisition opportunities, and funding landscapes.

Dataset Name Suggestions

  • Crunchbase 2013 Startup & VC Data
  • Startup Investment & Acquisition Data 2013
  • Venture Capital Ecosystem Snapshot
  • Global Startup Activity Dataset (2013)

Attributes

Original Data Source: Global Startup Activity Dataset

Listing Stats

  • Views: 1
  • Downloads: 2
  • Listed: 14/07/2025
  • Region: Global
  • UDQS (Universal Data Quality Score) quality: 5 / 5
  • Version: 1.0
