Diamonds Price & Attributes
Retail & Consumer Behavior
Free
About
This dataset explores prices and various attributes of approximately 54,000 round-cut diamonds. It serves as an excellent resource for students and practitioners alike to practice and refine their data analysis skills using a sizeable real-world dataset. The primary aim is to facilitate learning and training in data exploration and analytical concepts.
Columns
The dataset contains 53,940 diamonds and features 10 distinct attributes. Most variables are numeric, while 'cut', 'color', and 'clarity' are ordered factor variables.
- carat: The weight of the diamond, measured as a float64. Values range from 0.2 to 5.01, with an average of 0.8.
- cut: Describes the quality of the diamond's cut. This is an ordered factor variable with 5 unique levels; 'Ideal' (40%) and 'Premium' (26%) are the most common.
- color: Refers to the diamond's colour grade. This is an ordered factor variable with 7 unique levels, with 'G' (21%) and 'E' (18%) being frequently observed.
- clarity: Indicates how clear the diamond is, based on its internal inclusions and external blemishes. This is an ordered factor variable with 8 unique levels; 'SI1' (24%) and 'VS2' (23%) are common.
- depth: The total depth percentage of the diamond. Measured as a float64, with values typically between 43 and 79, and a mean of 61.7.
- table: The width of the top facet of the diamond relative to its widest point. Measured as a float64, with values ranging from 43 to 95, and a mean of 57.5.
- price: The price of the diamond in US dollars ($). This is an integer variable, ranging from $326 to $18,800, with an average price of $3,930.
- x: The length of the diamond in millimetres (mm). This is a float64, ranging from 0 to 10.7 mm, with an average of 5.73 mm.
- y: The width of the diamond in millimetres (mm). This is a float64, ranging from 0 to 58.9 mm, with an average of 5.73 mm.
- z: The depth of the diamond in millimetres (mm). This is a float64, ranging from 0 to 31.8 mm, with an average of 3.54 mm.
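The column descriptions above can be turned into a small parsing sketch. It is offered under stated assumptions rather than as the publisher's method: the level orders for 'cut', 'color', and 'clarity' follow the conventions documented for ggplot2's `diamonds` data and are assumed to carry over to this file, the relation depth = 2 * z / (x + y), expressed as a percentage, also comes from that documentation, and the two sample rows are hypothetical values chosen for illustration. The names `parse_row` and `recomputed_depth` are likewise illustrative.

```python
import csv
import io

# Level orders from lowest to highest grade. These follow the conventions
# documented for ggplot2's `diamonds` data and are an assumption for this file.
CUT_LEVELS = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
COLOR_LEVELS = ["J", "I", "H", "G", "F", "E", "D"]
CLARITY_LEVELS = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]

FLOAT_COLUMNS = ["carat", "depth", "table", "x", "y", "z"]

# Hypothetical rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """carat,cut,color,clarity,depth,table,price,x,y,z
0.50,Ideal,G,VS2,61.5,56,1500,5.00,5.00,3.075
1.00,Premium,E,SI1,60.0,58,5000,0,0,0
"""


def parse_row(row):
    """Convert one CSV row of strings into typed values and ordinal ranks."""
    parsed = {name: float(row[name]) for name in FLOAT_COLUMNS}
    parsed["price"] = int(row["price"])
    # Ordered factors become integer ranks, so a higher rank means a better grade.
    parsed["cut"] = CUT_LEVELS.index(row["cut"])
    parsed["color"] = COLOR_LEVELS.index(row["color"])
    parsed["clarity"] = CLARITY_LEVELS.index(row["clarity"])
    return parsed


def recomputed_depth(row):
    """Return 100 * 2z / (x + y), or None when x + y is zero."""
    if row["x"] + row["y"] == 0:
        return None
    return 100.0 * 2.0 * row["z"] / (row["x"] + row["y"])


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for row in rows:
    print(row["cut"], row["price"], recomputed_depth(row))
```

Rows where `recomputed_depth` returns None correspond to the minimum of 0 listed for x and y, and may be worth inspecting before any analysis. On real data the recomputed value may differ slightly from the recorded depth because the dimensions are rounded.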
Distribution
This dataset contains 53,940 records related to round-cut diamonds [1]. The data is distributed as a CSV file, Diamonds Prices2022.csv, with a size of 2.82 MB [2]. All 10 columns are present and validated across the entire dataset, with no missing or mismatched entries reported [3-10]. Most variables are numeric, but the variables 'cut', 'color', and 'clarity' are ordered factor variables [1].
Usage
This dataset is ideally suited for learning data exploration and for practising various data analysis techniques [1, 11]. It provides a valuable resource for students and educators to apply theoretical concepts to a real-world, large dataset, facilitating the training of analytical ideas [11].
Coverage
The dataset focuses on round-cut diamonds and includes their physical attributes and pricing information [1]. The data may not be up to date, but it remains a robust resource for practice [11]. The collection years and the geographic and demographic scope are not detailed in the provided materials [11].
License
CC BY 4.0
Who Can Use It
This dataset is particularly useful for:
- Students: To practice data analysis, data visualization, and statistical analysis as part of their training [2, 11].
- Educators/Instructors: For developing and delivering practical exercises in college-level data science courses [11].
- Data Analysts and Scientists: For exploratory data analysis, data cleaning exercises, and building predictive models for diamond pricing [2].
- Researchers: For studying the factors influencing diamond valuations [11].
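As a starting point for the exploratory analysis and data cleaning exercises mentioned above, the following is a minimal sketch using only the Python standard library. The records are hypothetical and stand in for rows of the CSV file, and the helper names (`drop_zero_dimensions`, `level_shares`, `mean_price_by`) are illustrative, not part of any library.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records for illustration only; they are not taken from the dataset.
SAMPLE = [
    {"cut": "Ideal", "price": 1000, "x": 5.0, "y": 5.0, "z": 3.1},
    {"cut": "Ideal", "price": 2000, "x": 5.5, "y": 5.5, "z": 3.4},
    {"cut": "Premium", "price": 3000, "x": 6.0, "y": 6.0, "z": 3.7},
    {"cut": "Premium", "price": 500, "x": 0.0, "y": 0.0, "z": 0.0},
]


def drop_zero_dimensions(records):
    """Keep only records whose x, y, and z are all strictly positive."""
    return [r for r in records if min(r["x"], r["y"], r["z"]) > 0]


def level_shares(records, column):
    """Return the fraction of records that fall in each level of `column`."""
    counts = Counter(r[column] for r in records)
    total = sum(counts.values())
    return {level: count / total for level, count in counts.items()}


def mean_price_by(records, column):
    """Return the mean price for each level of `column`."""
    groups = defaultdict(list)
    for r in records:
        groups[r[column]].append(r["price"])
    return {level: mean(prices) for level, prices in groups.items()}


cleaned = drop_zero_dimensions(SAMPLE)
print(level_shares(cleaned, "cut"))
print(mean_price_by(cleaned, "cut"))
```

Whether to drop or repair rows with a zero dimension is a judgment call; this sketch drops them so that the summaries are not affected by physically impossible measurements.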
Dataset Name Suggestions
- Diamonds Price & Attributes
- Round-Cut Diamond Dataset
- Gemstone Price Analysis
- Diamond Characteristics Data
- Diamond Pricing Study
Attributes
Original Data Source: Diamonds Price & Attributes