Book Ratings and Metadata
Product Reviews & Feedback
Free
About
This dataset provides a curated collection of book information, including titles, authors, ratings, publication years, and genres. It is ideal for analysing trends in the book industry, developing book recommendation systems, and conducting academic research into literary patterns and reader behaviour. The dataset offers insights into public sentiment towards various books and authors through its rating data, making it a valuable resource for market analysis and data visualisation projects.
Columns
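- Name: The title of the book. There are 350 unique book names in the dataset, all of which are valid and present. The source lists '10-Day Green Smoothie Cleanse' as the most common name, but since all 350 names are unique, each title appears exactly once and no title is actually more frequent than another.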
- Author: The name(s) of the book's author. The dataset contains 247 unique authors, with all 350 entries being valid. Jeff Kinney and Rick Riordan are the most frequently appearing authors, each accounting for 3% of the entries.
- Rating: A numerical rating assigned to the book, typically on a scale of 1 to 5. Ratings range from 3.3 to 4.9. The mean rating is 4.61 with a standard deviation of 0.23. The majority of ratings fall between 4.58 and 4.90. All 350 entries are valid.
- Year: The year the book was published. The publication years range from 2009 to 2019. The source reports the mean year as 2.01k, a value rounded to three significant figures, with a standard deviation of 3.28. All 350 entries are valid.
- Genre: The genre or category of the book (e.g., fiction, non-fiction, mystery, romance). The source reports 3 unique genre values: Non Fiction (51%), Fiction (37%), and other genres (11%); the shares sum to 99%, presumably because of rounding. All 350 entries are valid.
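The column descriptions above can be turned into a small loading and range check. The following is a minimal sketch, assuming the CSV header uses exactly the names Name, Author, Rating, Year, and Genre (the real header spelling is not confirmed by this page). The `Book` class, the `load_books` and `out_of_range` helpers, and the sample rows are hypothetical and exist only for illustration; the sample rows are placeholders, not records from the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Assumed header names, taken from the column list; the real file may differ.
COLUMNS = ("Name", "Author", "Rating", "Year", "Genre")


@dataclass(frozen=True)
class Book:
    name: str
    author: str
    rating: float
    year: int
    genre: str


def load_books(handle):
    """Parse CSV rows into Book records; raise ValueError on an empty field."""
    books = []
    for row in csv.DictReader(handle):
        if not all((row.get(col) or "").strip() for col in COLUMNS):
            raise ValueError(f"missing value in row: {row}")
        books.append(
            Book(
                name=row["Name"].strip(),
                author=row["Author"].strip(),
                rating=float(row["Rating"]),
                year=int(row["Year"]),
                genre=row["Genre"].strip(),
            )
        )
    return books


def out_of_range(books, rating_bounds=(3.3, 4.9), year_bounds=(2009, 2019)):
    """Return records whose rating or year lies outside the stated ranges."""
    return [
        b for b in books
        if not (rating_bounds[0] <= b.rating <= rating_bounds[1])
        or not (year_bounds[0] <= b.year <= year_bounds[1])
    ]


# Placeholder rows for illustration only; not taken from clean_books.csv.
SAMPLE = """Name,Author,Rating,Year,Genre
Placeholder Title A,Author One,4.7,2016,Non Fiction
Placeholder Title B,Author Two,3.3,2009,Fiction
Placeholder Title C,Author Three,5.0,2021,Fiction
"""

books = load_books(io.StringIO(SAMPLE))
print([b.name for b in out_of_range(books)])  # ['Placeholder Title C']
```

To run the same check on the real file, the `io.StringIO(SAMPLE)` argument could be replaced with a handle from `open("clean_books.csv", newline="", encoding="utf-8")`; the encoding is also an assumption.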
Distribution
The dataset is provided in a CSV format (clean_books.csv) and has a size of 30.05 kB. It contains 350 records across its 5 columns. All columns have 100% valid entries with no missing or mismatched data.
Usage
This dataset is well-suited for several applications, including:
- Building recommendation systems to suggest books based on user preferences.
- Conducting sentiment analysis by examining ratings to gauge public opinion on books and authors.
- Performing market analysis to identify popular genres, authors, and publication trends in the book industry.
- Creating data visualisations to explore relationships between book attributes, such as rating versus year or genre distribution.
- Supporting academic research on literary trends, author popularity, and reader behaviour.
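Two of the applications listed above, genre-level analysis and a very simple recommender, can be sketched with the standard library alone. This is only an illustration under stated assumptions: the dataset has no per-user data, so the "recommendation" here is just a ranking by rating within a genre, and the function names `mean_rating_by_genre` and `top_rated` as well as the rows in `ROWS` are hypothetical placeholders rather than dataset content.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records for illustration only; not rows from clean_books.csv.
ROWS = [
    {"Name": "Placeholder A", "Author": "Author One", "Rating": 4.8, "Year": 2018, "Genre": "Fiction"},
    {"Name": "Placeholder B", "Author": "Author Two", "Rating": 4.4, "Year": 2012, "Genre": "Fiction"},
    {"Name": "Placeholder C", "Author": "Author Three", "Rating": 4.6, "Year": 2015, "Genre": "Non Fiction"},
    {"Name": "Placeholder D", "Author": "Author One", "Rating": 4.9, "Year": 2019, "Genre": "Fiction"},
]


def mean_rating_by_genre(rows):
    """Group ratings by genre and return the mean per genre, rounded to 2 places."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Genre"]].append(row["Rating"])
    return {genre: round(mean(ratings), 2) for genre, ratings in grouped.items()}


def top_rated(rows, genre, n=2):
    """Naive ranking: the n highest-rated titles within one genre."""
    in_genre = [r for r in rows if r["Genre"] == genre]
    ranked = sorted(in_genre, key=lambda r: r["Rating"], reverse=True)
    return [r["Name"] for r in ranked[:n]]


print(mean_rating_by_genre(ROWS))  # {'Fiction': 4.7, 'Non Fiction': 4.6}
print(top_rated(ROWS, "Fiction"))  # ['Placeholder D', 'Placeholder A']
```

A recommender that reflects actual user preferences would need additional input, such as per-user ratings, which this dataset does not include.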
Coverage
The dataset covers book publications from 2009 to 2019, with a concentration of entries around 2009-2010 and 2018-2019. It includes books by 247 unique authors across a variety of genres, with Non Fiction and Fiction being the most prominent categories. The data represents 350 unique book titles.
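The coverage figures above (year range, concentration of entries per year, and number of unique authors) are the kind of summary that can be recomputed directly from the records. The sketch below shows one way to do that; `coverage_summary` is a hypothetical helper and the (author, year) pairs in `RECORDS` are placeholders, not data from the file.

```python
from collections import Counter

# Placeholder (author, year) pairs for illustration only.
RECORDS = [
    ("Author One", 2009),
    ("Author Two", 2009),
    ("Author One", 2014),
    ("Author Three", 2019),
    ("Author Two", 2019),
    ("Author Four", 2019),
]


def coverage_summary(records):
    """Summarise year range, records per year, and unique author count."""
    years = Counter(year for _, year in records)
    return {
        "year_range": (min(years), max(years)),
        "records_per_year": dict(sorted(years.items())),
        "unique_authors": len({author for author, _ in records}),
    }


summary = coverage_summary(RECORDS)
print(summary["year_range"])        # (2009, 2019)
print(summary["records_per_year"])  # {2009: 2, 2014: 1, 2019: 3}
print(summary["unique_authors"])    # 4
```

Run against the full dataset, the per-year counts would show whether entries do cluster around 2009-2010 and 2018-2019 as described.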
License
CC0: Public Domain
Who Can Use It
- Data scientists and machine learning engineers for developing recommendation algorithms.
- Market researchers interested in book industry trends and consumer preferences.
- Academics and literary scholars studying publishing patterns and reader engagement.
- Data analysts looking to visualise and explore book-related data.
- Software developers integrating book information into applications.
Dataset Name Suggestions
- Clean Books Dataset
- Book Ratings and Metadata
- Literary Insights Collection
- Global Book Data
- Publisher Trends Dataset
Attributes
Original Data Source: Book Ratings and Metadata