Historical Book Sales Records

E-commerce & Online Transactions

Tags and Keywords

Books, Sales, Authors, Literature, Fiction

Free

About

This dataset lists best-selling books and book series across multiple languages. 'Best-selling' refers to the estimated number of copies sold for each title, not the number of copies printed or currently owned. Comic books and textbooks are excluded. Books are ranked by the highest sales estimate reported by reliable, independent sources.

Columns

  • Book: The title of the best-selling book. All 174 titles are unique, one per record; "A Tale of Two Cities" is among them.
  • Author(s): The author or authors of the book. There are 157 unique authors, with J. K. Rowling and Dan Brown being the most common.
  • Original language: The original language in which the book was published. English is the predominant language, accounting for 75% of entries, followed by Russian at 3%. There are 16 unique original languages in total.
  • First published: The year the book was first published. Dates range from 1304 to 2018, with a mean publication year around 1960.
  • Approximate sales in millions: The estimated sales figures for the book, presented in millions of copies. Sales range from 10 million to 200 million copies, with a mean of approximately 30.1 million.
  • Genre: The genre classification of the book. While there are 80 unique genres, 32% of entries do not have a specified genre. Fantasy is the most common listed genre among the available entries.
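The column layout above can be sketched as a small typed loader. This is a minimal illustration, not the publisher's own tooling: the `BookRecord` class and `parse_books` function are hypothetical names, the sample rows are invented placeholders rather than real dataset values, and the sketch assumes the CSV header uses the column names exactly as listed above.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder rows that follow the documented column names.
# These are NOT real values from the dataset.
SAMPLE_CSV = """Book,Author(s),Original language,First published,Approximate sales in millions,Genre
Example Book A,Author One,English,1859,200.0,Historical fiction
Example Book B,Author Two,French,1943,150.5,
Example Book C,Author Three,English,1997,120.0,Fantasy
"""


@dataclass
class BookRecord:
    title: str
    authors: str
    original_language: str
    first_published: int      # year of first publication
    sales_millions: float     # estimated copies sold, in millions
    genre: Optional[str]      # None when the genre is not specified


def parse_books(text_stream):
    """Read CSV text and return a list of typed BookRecord objects."""
    records = []
    for row in csv.DictReader(text_stream):
        # Genre is the only column documented as partially empty.
        genre = (row["Genre"] or "").strip()
        records.append(
            BookRecord(
                title=row["Book"].strip(),
                authors=row["Author(s)"].strip(),
                original_language=row["Original language"].strip(),
                first_published=int(row["First published"]),
                sales_millions=float(row["Approximate sales in millions"]),
                genre=genre or None,
            )
        )
    return records


records = parse_books(io.StringIO(SAMPLE_CSV))
```

To read the real file instead of the sample, the same function could be given an open file handle, for example one opened with `open("best-selling-books.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, since the listing does not state it.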

Distribution

The dataset is provided as a single CSV file, 'best-selling-books.csv', with a file size of 12.9 kB. It contains 174 records, each representing one best-selling book. Every column except 'Genre' is fully populated, so the other five fields have no missing values.
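The completeness claim can be checked after download with a short script. The following is a sketch under stated assumptions: `column_completeness` is a hypothetical helper, the sample rows are invented placeholders, and an empty or whitespace-only cell is treated as missing.

```python
import csv
import io

# Hypothetical placeholder rows; one of the four has no genre.
SAMPLE_CSV = """Book,Author(s),Original language,First published,Approximate sales in millions,Genre
Example Book A,Author One,English,1859,200.0,Historical fiction
Example Book B,Author Two,French,1943,150.5,
Example Book C,Author Three,English,1997,120.0,Fantasy
Example Book D,Author Four,Russian,1869,36.0,Novel
"""


def column_completeness(text_stream):
    """Return the fraction of non-empty cells for each column."""
    reader = csv.DictReader(text_stream)
    filled = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in filled:
            # Treat None, empty strings, and whitespace as missing.
            if (row[name] or "").strip():
                filled[name] += 1
    return {
        name: (count / total if total else 0.0)
        for name, count in filled.items()
    }


completeness = column_completeness(io.StringIO(SAMPLE_CSV))
for name, share in completeness.items():
    print(f"{name}: {share:.0%} populated")
```

Run against the real file, a result consistent with the description would show every column at 100% except 'Genre', which would sit near 68%.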

Usage

This dataset is ideal for market analysis in the publishing industry, identifying trends in book sales and popularity. It can be used for literary studies, examining the characteristics of successful books across different eras and languages. Additionally, it serves as a valuable resource for educational content development related to literature and publishing, and for data analysis projects in the arts and entertainment sectors.

Coverage

The dataset is global in scope and includes books originally written in 16 languages, although English accounts for 75% of entries. First-publication years span 1304 to 2018, offering a historical perspective on bestsellers. All fields are complete except genre, which is missing for approximately 32% of entries.
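The coverage figures (language share and publication-year range) can be recomputed from the file itself. This sketch uses hypothetical helper names (`language_shares`, `year_range`) and invented placeholder rows, so its numbers will not match the real dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical placeholder rows, not real dataset values.
SAMPLE_CSV = """Book,Author(s),Original language,First published,Approximate sales in millions,Genre
Example Book A,Author One,English,1850,200.0,Historical fiction
Example Book B,Author Two,French,1900,150.5,
Example Book C,Author Three,English,1950,120.0,Fantasy
Example Book D,Author Four,Russian,2000,36.0,Novel
Example Book E,Author Five,English,2010,25.0,
"""


def language_shares(rows):
    """Return each original language's share of entries, largest first."""
    counts = Counter(row["Original language"].strip() for row in rows)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.most_common()}


def year_range(rows):
    """Return the earliest and latest first-publication years."""
    years = [int(row["First published"]) for row in rows]
    return min(years), max(years)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = language_shares(rows)
earliest, latest = year_range(rows)
```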

License

CC0: Public Domain

Who Can Use It

  • Data Scientists and Analysts: For exploring sales trends, author popularity, and language distribution.
  • Publishers and Marketing Professionals: To inform publishing strategies, identify market gaps, and understand what makes a book sell well.
  • Literary Scholars and Researchers: For academic studies on literary history, genre evolution, and the impact of language on global reach.
  • Educators: As a resource for teaching about literature, publishing, and data analysis.
  • Writers and Aspiring Authors: To gain insights into the characteristics of highly successful books.

Dataset Name Suggestions

  • Global Bestsellers Dataset
  • Historical Book Sales Records
  • Top-Selling Books by Sales Volume
  • World's Best-Selling Books Compilation

Attributes

Original Data Source: Historical Book Sales Records

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 14/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
