Popular Data Science Books on Amazon

Education & Learning Analytics

Tags and Keywords

Data, Science, Books, Amazon, Learning

Free

About

This dataset features 946 books sourced from Amazon, specifically focusing on popular titles related to data science, statistics, data analysis, Python, deep learning, and machine learning. It offers valuable insights into Amazon's offerings in these fields, making it a useful resource for academic study or consumer reference.

Columns

  • title: The full title of the book.
  • author: The author or authors of the book.
  • price: The price of a new copy in US dollars.
  • price (including used books): The price range for both new and used copies in US dollars.
  • pages: The total number of pages in the book.
  • avg_reviews: The average customer review rating, out of 5 stars.
  • n_reviews: The total number of reviews received for the book.
  • star5: The percentage of reviews that are 5-star ratings.
  • star4: The percentage of reviews that are 4-star ratings.
  • star3: The percentage of reviews that are 3-star ratings.
  • star2: The percentage of reviews that are 2-star ratings.
  • star1: The percentage of reviews that are 1-star ratings.
  • dimensions: The physical size of the book, specified in inches.
  • weight: The weight of the book, in pounds or ounces.
  • language: The language in which the book is written.
  • publisher: The publishing company of the book.
  • ISBN-13: The 13-digit International Standard Book Number.
  • link: A partial link to the Amazon product page for the book.
  • complete_link: The full URL to the Amazon product page for the book, including the domain https://amazon.com.
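Two of the columns above usually need light normalisation before analysis: weight mixes pounds and ounces, and link holds only a partial path. The sketch below is one possible approach, not the dataset author's method. It assumes weight values look like "1.5 pounds" or "12 ounces" and that link is a path to be appended to the domain; the helper names `to_ounces` and `full_link` are hypothetical.

```python
import re

AMAZON_DOMAIN = "https://amazon.com"  # domain given in the complete_link description

# Assumed layout: a number followed by a unit word, e.g. "1.5 pounds".
_WEIGHT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(pounds?|ounces?)\s*", re.IGNORECASE)


def to_ounces(weight):
    """Return the weight in ounces, or None if the value is missing or unparseable."""
    match = _WEIGHT.fullmatch(weight or "")
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2).lower()
    # 1 pound = 16 ounces; ounce values pass through unchanged.
    return value * 16 if unit.startswith("pound") else value


def full_link(link):
    """Join a partial product path onto the Amazon domain."""
    return AMAZON_DOMAIN + "/" + link.lstrip("/")
```

Values that do not match the assumed pattern come back as None rather than raising, so unexpected formats in the real file can be counted and inspected instead of halting the run.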

Distribution

The dataset is provided as a single CSV file, final_book_dataset_kaggle2.csv (499.04 kB), containing 18 columns and 946 rows, one per book. The data was obtained by web scraping Amazon. More than 1,700 books were scraped initially. After duplicates were removed, columns were formatted for ease of use, and rows with many missing values were either deleted or filled in, 946 books remained.
Data completeness varies across columns:
  • title, link, and complete_link have no missing values.
  • language is 9% missing, with 90% of books in English.
  • author is 21% missing, ISBN-13 is 20% missing.
  • avg_reviews and n_reviews are 15% missing.
  • star1 has the most missing values at 60%.
The mean average review is 4.47 out of 5 stars, and the mean number of reviews is 326. Book prices range from 0.99 to 1318.74 US dollars, with a mean price of 46.5 US dollars.
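Per-column missing-value percentages like those listed above can be recomputed with the standard library. This is a minimal sketch that treats empty or whitespace-only cells as missing, which is an assumption about how gaps are encoded in the file. The sample rows are made up for illustration and the function name `missing_share` is hypothetical.

```python
import csv
import io

# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """title,author,avg_reviews
Sample Book A,Author One,4.5
Sample Book B,,4.0
Sample Book C,Author Three,
Sample Book D,,
"""


def missing_share(rows, columns):
    """Return {column: fraction of rows where the value is empty or absent}."""
    if not rows:
        return {col: 0.0 for col in columns}
    return {
        col: sum(1 for row in rows if not (row.get(col) or "").strip()) / len(rows)
        for col in columns
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = missing_share(rows, ["title", "author", "avg_reviews"])
for col, share in shares.items():
    print(f"{col}: {share:.0%} missing")
```

To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` argument with a file handle opened on final_book_dataset_kaggle2.csv. If gaps are encoded with a placeholder such as "NaN", the emptiness check would need to be extended accordingly.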

Usage

This dataset is ideal for:
  • Exploratory data analysis using libraries like Pandas or NumPy.
  • Creating insightful visualisations of different features with Python libraries.
  • Practising data querying skills using SQL or Pandas.
  • Ranking books based on positive customer reviews.
  • Serving as a valuable reference for purchasing data science-related books.
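As an illustration of the SQL querying and review-based ranking uses above, the sketch below loads a few rows into an in-memory SQLite table and ranks them by avg_reviews, keeping only books with a minimum number of reviews so that a handful of ratings cannot dominate. The table name `books`, the function `top_books`, the threshold, and the sample rows are all assumptions made for the example.

```python
import sqlite3

# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_BOOKS = [
    ("Sample Book A", 4.8, 12),
    ("Sample Book B", 4.6, 900),
    ("Sample Book C", 4.7, 300),
    ("Sample Book D", None, None),  # missing review data, as in part of the dataset
]


def top_books(conn, min_reviews, limit):
    """Return titles ranked by average rating among books with enough reviews."""
    query = """
        SELECT title
        FROM books
        WHERE n_reviews >= ?
        ORDER BY avg_reviews DESC, n_reviews DESC
        LIMIT ?
    """
    return [row[0] for row in conn.execute(query, (min_reviews, limit))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, avg_reviews REAL, n_reviews INTEGER)")
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", SAMPLE_BOOKS)
print(top_books(conn, min_reviews=100, limit=2))
```

Rows with a missing n_reviews drop out of the result automatically, because a comparison against NULL is never true in SQL. A stricter ranking could also weight by the star5 percentage, which is left out here to keep the query short.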

Coverage

The dataset focuses on books available on Amazon, specifically those tagged with data science, statistics, data analysis, Python, deep learning, and machine learning. The vast majority of books (90%) are in English. While no specific time range for data collection is given, the dataset is expected to be updated annually. Missing data percentages vary per column, indicating that some book attributes are less consistently available than others.

License

CC0: Public Domain

Who Can Use It

  • Data scientists for analytical projects and deriving market insights.
  • Students and educators seeking to practise data analysis, programming, and database querying.
  • Researchers studying trends in data science publications or e-commerce book markets.
  • Individuals looking for recommendations or market information before buying data science books.

Dataset Name Suggestions

  • Amazon Data Science Books Metrics
  • Popular Data Science Books on Amazon
  • Machine Learning & Data Analysis Book Data
  • Amazon DS Books Collection

Attributes

Listing Stats

  • Listed: 22/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
