Global Startup Activity Dataset
NLP / Natural Language Processing
Free
About
This dataset offers a snapshot of the startup world and the venture capital ecosystem around it. It covers organisations, individuals, company news, funding rounds, acquisitions, and initial public offerings (IPOs). The startup sector is active, with hundreds of new companies emerging daily, and venture capital has become a significant asset class, with annual investments exceeding $100 billion in the US alone. The dataset is described as unique because it has not previously been published on platforms like Kaggle.
Columns
The dataset includes several tables, with acquisitions.csv detailing the following columns:
- id: An index for the record.
- acquisition_id: The unique identifier for each acquisition event.
- acquiring_object_id: The unique identifier for the entity that made the acquisition.
- acquired_object_id: The unique identifier for the entity that was acquired.
- term_code: Indicates the type of payment used in the acquisition, such as 'cash' or others. Approximately 80% of values are null.
- price_amount: The monetary amount paid for the acquisition, expressed in the currency given by price_currency_code. Values range widely, from 0 to about 2.6 trillion.
- price_currency_code: The currency in which the transaction took place, with USD being the most common (98%).
- acquired_at: The date on which the acquisition deal occurred. Dates range from 24 March 1966 to 12 December 2013.
- source_url: The URL of the information source for the acquisition details. Approximately 10% of values are null.
- source_description: A brief description of the information source. Approximately 10% of values are null.
- created_at: The date when the record was first created in the dataset, ranging from 1 June 2007 to 12 December 2013.
- updated_at: The date when the record was last updated, ranging from 6 February 2008 to 12 December 2013.
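The null percentages quoted in the column list can be re-checked once the file is downloaded. The following is a minimal sketch using only the Python standard library. It assumes that a null is stored as an empty field, which the listing does not confirm, and the sample rows are made up for illustration rather than taken from the dataset. The helper name null_fractions is hypothetical.

```python
import csv
import io


def null_fractions(rows, columns):
    """Return the share of empty values per column, as a dict of floats."""
    counts = {name: 0 for name in columns}
    total = 0
    for row in rows:
        total += 1
        for name in columns:
            # Assumption: a null is stored as an empty or whitespace-only field.
            if not (row.get(name) or "").strip():
                counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in columns}


# Tiny made-up sample standing in for acquisitions.csv (not real records).
SAMPLE = """id,acquisition_id,term_code,price_amount,price_currency_code
1,1,cash,1000000,USD
2,2,,0,USD
3,3,,250000,USD
4,4,,0,EUR
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fractions = null_fractions(rows, ["term_code", "price_amount"])
print(fractions)
```

To run this against the real file, replace io.StringIO(SAMPLE) with an open file handle, for example open("acquisitions.csv", newline="", encoding="utf-8"). The encoding is an assumption. Note that a price_amount of 0 counts as a present value here, so zero prices may need separate handling if they stand for undisclosed amounts.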
Distribution
The dataset is provided in CSV format. The acquisitions.csv file is 2.1 MB in size and contains approximately 9,500 records. The full dataset comprises 11 tables, which can be joined using unique IDs. While the acquisitions.csv table has high data validity for most columns (e.g., 9,562 valid entries for id, acquisition_id, acquiring_object_id, price_amount, created_at, and updated_at), some columns, such as term_code, source_url, and source_description, have a notable percentage of missing values. No extensive data quality checks have been performed yet.
Usage
This dataset is ideal for various analytical applications, including:
- Exploratory data analysis of the startup ecosystem.
- Tracking and analysing investment trends over time.
- Clustering venture capital funds based on their existing investments.
- Predicting startup outcomes, such as which startups will secure further funding rounds, be acquired, or file for an IPO.
- Mapping the network of individuals involved in the startup ecosystem.
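As one illustration of tracking trends over time, acquisition records can be counted per calendar year of acquired_at. This is a minimal sketch, not a prescribed method. It assumes dates are stored as YYYY-MM-DD strings, which should be checked against the actual file. The function name acquisitions_per_year and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def acquisitions_per_year(rows, date_format="%Y-%m-%d"):
    """Count acquisition records per calendar year of acquired_at."""
    counts = Counter()
    for row in rows:
        raw = (row.get("acquired_at") or "").strip()
        if not raw:
            continue  # skip records without a deal date
        try:
            year = datetime.strptime(raw, date_format).year
        except ValueError:
            continue  # skip values that do not match the assumed format
        counts[year] += 1
    # Return years in ascending order for easy plotting or printing.
    return dict(sorted(counts.items()))


# Made-up rows standing in for acquisitions.csv (not real records).
SAMPLE = """id,acquired_at
1,2012-03-01
2,2012-11-15
3,2013-12-12
4,
5,not a date
"""

per_year = acquisitions_per_year(csv.DictReader(io.StringIO(SAMPLE)))
print(per_year)
```

Rows with a missing or unparseable date are skipped silently in this sketch. Since no extensive data quality checks have been performed on the dataset, it may be worth counting the skipped rows before drawing conclusions from the yearly totals.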
Coverage
The information in this dataset is available up to December 2013. Acquisition deal dates (acquired_at) span from March 1966 to December 2013, while record creation and update dates fall between 2007 and 2013. The description highlights US investment, and a high percentage of transactions are in USD, but the dataset's scope is not strictly limited geographically. There are no specific notes on demographic scope.
License
Attribution 4.0 International (CC BY 4.0) License.
Who Can Use It
This dataset is suitable for:
- Data Analysts and Scientists: For building predictive models and conducting market research.
- Venture Capitalists and Investors: To identify investment patterns, analyse trends, and inform investment decisions.
- Researchers and Academics: For studying economic trends, innovation, and network dynamics within the startup world.
- Entrepreneurs: To understand market conditions, potential acquisition opportunities, and funding landscapes.
Dataset Name Suggestions
- Crunchbase 2013 Startup & VC Data
- Startup Investment & Acquisition Data 2013
- Venture Capital Ecosystem Snapshot
- Global Startup Activity Dataset (2013)
Attributes
Original Data Source: Global Startup Activity Dataset