Official TED Talks Dataset
NLP / Natural Language Processing
Free
About
This dataset contains details of all TED Talks available on TED.com, designed primarily for beginner students of Data Analysis to explore real-life data. It enables users to analyse talks, find presentations by their favourite speakers, and practise data analytics concepts while learning from leading speakers. The creator's aim was to provide a resource for learning data analysis and, at the same time, for gaining insights from the best minds in various fields.
Columns
- title: The title of the TED Talk. There are 5,440 unique talk titles, all of which are valid.
- author: The author or speaker of the talk. There are 4,444 unique authors, with 5,439 valid entries and one missing value.
- date: The date when the talk took place. All 5,440 entries are valid, with dates ranging from 1st January 1970 to 1st February 2022.
- views: The total number of views the talk has received. All 5,440 entries are valid, with views ranging from 532 to 72.0 million, and an average of 2.06 million views.
- likes: The total number of likes the talk has received. All 5,440 entries are valid, with likes ranging from 15 to 2.10 million, and an average of 62.6 thousand likes.
- link: The direct URL to the talk on ted.com. All 5,440 entries are unique and valid.
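The per-column counts above (valid, missing, and unique values) can be re-derived with a short pandas check. The following is a minimal sketch, not the publisher's own tooling: the helper name `summarize_columns` and the constant `EXPECTED_COLUMNS` are hypothetical names introduced for illustration, and the column names are taken from the list above.

```python
import pandas as pd

# Column names as listed in the "Columns" section of this page.
EXPECTED_COLUMNS = ["title", "author", "date", "views", "likes", "link"]


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return valid (non-null), missing, and unique counts for each expected column."""
    absent = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if absent:
        raise ValueError(f"missing expected columns: {absent}")
    subset = df[EXPECTED_COLUMNS]
    return pd.DataFrame(
        {
            "valid": subset.notna().sum(),
            "missing": subset.isna().sum(),
            # nunique() ignores missing values by default.
            "unique": subset.nunique(),
        }
    )


# Typical use, assuming the CSV file has been downloaded to the working directory:
#   talks = pd.read_csv("data.csv")
#   print(summarize_columns(talks))
```

If the file matches the description above, the `author` row should report one missing value and every other row should report none.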
Distribution
The dataset is provided as a CSV file named data.csv, with a size of 863.62 kB. It comprises 6 columns and contains 5,440 records or rows, each representing a unique TED Talk.
Usage
This dataset is ideal for various analytical applications and learning exercises, including:
- Identifying the most popular TED Talks.
- Determining the most popular TED Talk speakers (based on number of talks or views).
- Performing month-wise and year-wise analysis of TED Talk frequency.
- Finding talks by a favourite author.
- Analysing talks based on their view-to-like ratio.
- Searching for talks by keyword, such as 'climate' (the dataset has no tags column, so the search runs against the talk title).
- General exploratory data analysis and data visualisation.
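Several of the exercises listed above can be sketched in a few lines of pandas. The block below is an illustrative starting point under stated assumptions, not a reference solution: all function names are hypothetical, the `date` column is assumed to parse with `pandas.to_datetime`, and the keyword search is assumed to run against the `title` column because the dataset has no tags column.

```python
import pandas as pd


def most_popular_talks(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n talks with the highest view counts."""
    return df.nlargest(n, "views")[["title", "author", "views"]]


def top_speakers(df: pd.DataFrame, by: str = "views", n: int = 10) -> pd.Series:
    """Rank speakers by total views, or by number of talks when by='talks'."""
    grouped = df.groupby("author")  # rows with a missing author are skipped
    ranking = grouped.size() if by == "talks" else grouped["views"].sum()
    return ranking.sort_values(ascending=False).head(n)


def talks_per_period(df: pd.DataFrame, freq: str = "year") -> pd.Series:
    """Count talks per calendar year, or per month of the year when freq='month'."""
    dates = pd.to_datetime(df["date"])  # assumption: the date strings are parseable
    key = dates.dt.year if freq == "year" else dates.dt.month
    return key.value_counts().sort_index()


def views_per_like(df: pd.DataFrame) -> pd.Series:
    """Return the view-to-like ratio of each talk (infinite if likes is zero)."""
    return df["views"] / df["likes"]


def search_titles(df: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Return talks whose title contains the keyword, ignoring case."""
    mask = df["title"].str.contains(keyword, case=False, regex=False, na=False)
    return df[mask]


# Typical use, assuming the CSV file has been downloaded to the working directory:
#   talks = pd.read_csv("data.csv")
#   print(most_popular_talks(talks, n=5))
#   print(search_titles(talks, "climate"))
```

Each helper takes the loaded DataFrame as input, so the functions can be tried on a small hand-made sample before running them on the full file.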
Coverage
The dataset covers TED Talks published between 1st January 1970 and 1st February 2022. The talks are from Ted.com, implying a global scope of content topics and speakers. There are no specific notes on data availability for particular groups or years beyond the general date range.
License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
Who Can Use It
This dataset is intended for:
- Beginner students of Data Analysis looking to gain practical experience with real-world data.
- Individuals interested in exploring TED Talks from an analytical perspective.
- Learners aiming to develop skills in Data Analytics, Data Visualisation, and Exploratory Data Analysis using tools like Python.
- Anyone wishing to learn from the speakers and content of TED Talks.
Dataset Name Suggestions
- TED Talks Global Archive
- TED.com Data Analytics Dataset
- Curated TED Talks Collection
- TED Talks Performance Metrics
- Official TED Talks Dataset
Attributes
Original Data Source: Official TED Talks Dataset