Online Book Ratings Dataset
Retail & Consumer Behavior
Free
About
This dataset is intended for building and evaluating book recommendation models. Recommender systems have become indispensable in online services, from e-commerce to online advertising, because they suggest relevant items to users. An effective recommender can significantly increase revenue and give a company a competitive edge. A notable example of their importance is the Netflix Prize challenge, which offered a substantial reward for a superior recommendation algorithm. This dataset was collected from the Book-Crossing community and serves as a resource for developing and evaluating book recommendation algorithms.
Columns
The dataset is structured across three key files:
- Books Data:
  - ISBN: The unique identifier for each book.
  - Book-Title: The title of the book.
  - Book-Author: The author of the book; if multiple authors exist, only the first is listed.
  - Year-Of-Publication: The year the book was published.
  - Publisher: The publisher of the book.
  - Image-URL-S: A URL linking to a small cover image of the book from Amazon.
  - Image-URL-M: A URL linking to a medium-sized cover image of the book from Amazon.
  - Image-URL-L: A URL linking to a large cover image of the book from Amazon.
- Users Data:
  - User-ID: Anonymised integer identifiers for users.
  - Location: The geographical location of the user, where available (can contain NULL values).
  - Age: The age of the user, where available (can contain NULL values).
- Ratings Data:
  - Book-Rating: The rating a user gave a book. Explicit ratings are given on a scale of 1 to 10 (higher values indicate greater appreciation); implicit ratings are recorded as 0.
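Because a Book-Rating of 0 marks an implicit interaction and not a low score, it usually makes sense to separate the two kinds of rating before computing averages. The following is a minimal sketch using only the Python standard library. The sample rows are hypothetical, and the User-ID and ISBN columns in the ratings file are an assumption (only Book-Rating is described above); the delimiter and encoding of the real files may also differ.

```python
import csv
import io

# Hypothetical sample in the shape of the ratings file. The User-ID and ISBN
# columns are assumed; only Book-Rating is documented in the listing.
SAMPLE_RATINGS = """User-ID,ISBN,Book-Rating
1,0001,0
1,0002,8
2,0001,10
3,0003,0
"""


def split_ratings(rows):
    """Separate explicit ratings (1-10) from implicit ones (0)."""
    explicit, implicit = [], []
    for row in rows:
        rating = int(row["Book-Rating"])
        if rating == 0:
            implicit.append(row)
        elif 1 <= rating <= 10:
            explicit.append(row)
        else:
            raise ValueError(f"unexpected rating: {rating}")
    return explicit, implicit


rows = list(csv.DictReader(io.StringIO(SAMPLE_RATINGS)))
explicit, implicit = split_ratings(rows)
print(len(explicit), "explicit;", len(implicit), "implicit")
```

To run this against the real ratings file, replace the in-memory sample with an opened file handle and adjust the delimiter to match the file.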
Distribution
The dataset consists of three files, typically in CSV format. It contains data from 278,858 anonymised users who provided 1,149,780 ratings across 271,379 books.
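The counts above give a rough sense of how sparse the user-book matrix is. The short calculation below is a sketch that assumes every rating corresponds to a distinct (user, book) pair and counts implicit ratings as interactions; under those assumptions the density comes out at roughly 1.5e-5, so the vast majority of user-book pairs have no rating.

```python
# Counts taken from the dataset description.
N_USERS = 278_858
N_RATINGS = 1_149_780
N_BOOKS = 271_379

# Fraction of all possible (user, book) pairs that carry a rating,
# assuming each rating is a distinct pair.
density = N_RATINGS / (N_USERS * N_BOOKS)

# Average number of ratings per listed user.
ratings_per_user = N_RATINGS / N_USERS

print(f"density: {density:.2e}")
print(f"mean ratings per user: {ratings_per_user:.2f}")
```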
Usage
This dataset is ideal for a variety of applications, including:
- Developing and training book recommendation systems.
- Building models for predicting user preferences for books.
- Conducting research into recommender algorithms.
- Exploratory data analysis and experimenting with simple recommendation techniques.
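As one example of a simple recommendation technique, a popularity baseline ranks books by their mean explicit rating and ignores books with too few ratings. This is a minimal sketch, not a method prescribed by the dataset: the function name `top_books`, the `min_count` threshold, and the sample tuples are all hypothetical.

```python
from collections import defaultdict


def top_books(ratings, min_count=2, k=3):
    """Rank books by mean explicit rating.

    `ratings` is an iterable of (user_id, isbn, rating) tuples.
    Implicit ratings (0) are skipped; books with fewer than
    `min_count` explicit ratings are excluded.
    """
    totals = defaultdict(lambda: [0, 0])  # isbn -> [sum of ratings, count]
    for _user_id, isbn, rating in ratings:
        if rating == 0:
            continue  # implicit interaction, not a score
        totals[isbn][0] += rating
        totals[isbn][1] += 1

    ranked = [
        (isbn, total / count, count)
        for isbn, (total, count) in totals.items()
        if count >= min_count
    ]
    # Highest mean first; ties broken by rating count, then ISBN.
    ranked.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return ranked[:k]


sample = [
    (1, "A", 8), (2, "A", 10), (3, "A", 0),
    (1, "B", 7), (2, "B", 7),
    (3, "C", 10),
]
print(top_books(sample))
```

In the sample, book "C" has a perfect score but only one rating, so the threshold excludes it. A baseline like this gives collaborative filtering models something to beat.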
Coverage
The data was collected over a four-week period in August and September 2004 from the Book-Crossing online community. It includes demographic information such as user location and age, though these fields may contain NULL values. Book publication years vary widely, with a notable concentration between 1968 and 2009.
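Since Age and Location may be NULL and publication years vary widely, a loader will typically need to tolerate missing or implausible values. The helpers below are a sketch under stated assumptions: that missing values appear as the literal string "NULL" or as an empty field, and that the 1968 to 2009 window mentioned above is used only as an illustrative bound. The names `parse_optional_int` and `in_range` are hypothetical.

```python
def parse_optional_int(value):
    """Return an int, or None when the field is missing or non-numeric."""
    if value is None:
        return None
    text = str(value).strip()
    if text == "" or text.upper() == "NULL":
        return None
    try:
        return int(text)
    except ValueError:
        return None


def in_range(value, low, high):
    """True when value is present and lies within [low, high]."""
    return value is not None and low <= value <= high


raw_ages = ["34", "NULL", "", "abc"]
ages = [parse_optional_int(a) for a in raw_ages]

raw_years = ["2002", "0", "1968"]
years = [parse_optional_int(y) for y in raw_years]
# Illustrative bounds only; choose limits that suit the analysis.
typical_years = [y for y in years if in_range(y, 1968, 2009)]

print(ages)
print(typical_years)
```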
License
CC0: Public Domain
Who Can Use It
This dataset is valuable for:
- Data scientists and machine learning engineers working on recommendation engines.
- Researchers in artificial intelligence and machine learning focusing on collaborative filtering or content-based recommendations.
- Academics studying online communities, literature, and cultural data.
- Developers creating applications that require personalised book suggestions.
Dataset Name Suggestions
- Book-Crossing Community Ratings
- Book Recommendation System Data
- Online Book Ratings Dataset
- Book-Crossing User Interactions
Attributes
Original Data Source: Online Book Ratings Dataset