Diamond Pricing Prediction Dataset

Tags and Keywords

Diamond, Price, Quality, Carat, Cut

Free

About

This dataset offers detailed attributes describing the quality and physical characteristics of diamonds, together with their retail prices in US dollars. It shows how factors such as cut quality, colour grade, and internal clarity combine to determine a stone's overall value and visual appeal. It is a useful resource for investigating the trade-off that consumers and industry professionals face between size, quality, and cost.

Columns

The dataset features 10 distinct columns, providing both grading classifications and physical measurements:
  • carat: Indicates the weight of the diamond.
  • cut: Describes the quality of the cut, which influences how brilliantly the diamond reflects light. Categories include 'Ideal', 'Premium', 'Very Good', 'Good', and 'Fair'.
  • color: Represents the diamond's colour grade, ranging from D (colourless, the best grade) down to J (near colourless, with a slight yellowish tint). Colour significantly affects the stone’s perceived value.
  • clarity: Defines the internal purity based on the presence of inclusions or blemishes, with grades such as VVS1, VVS2, VS1, VS2, SI1, SI2, and I1.
  • depth: The depth percentage, reflecting the stone's depth in relation to its average diameter. This percentage is a key indicator of cut quality.
  • table: The width of the flat facet on the top surface of the diamond, expressed as a percentage of the stone's widest point. It influences how light enters the stone.
  • price: The cost of the diamond, denominated in USD.
  • x: Represents the length measurement of the diamond, in mm.
  • y: Represents the width measurement of the diamond, in mm.
  • z: Represents the depth (height) measurement of the diamond, in mm.
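
The relationship between depth and the x, y, z measurements can be sketched in a few lines. This is a minimal illustration, assuming depth is the total depth percentage (z divided by the mean of x and y, times 100) and that all three measurements share one unit. The function name and the sample measurements are hypothetical and are not taken from the data file.

```python
def depth_percent(x: float, y: float, z: float) -> float:
    """Return z as a percentage of the mean of x and y."""
    mean_diameter = (x + y) / 2
    if mean_diameter <= 0:
        raise ValueError("x and y must give a positive mean diameter")
    return 100 * z / mean_diameter


# Hypothetical measurements for one stone (illustrative only).
sample = {"x": 3.95, "y": 3.98, "z": 2.43}
print(round(depth_percent(**sample), 1))
```

Comparing this computed value with the recorded depth column is one way to spot rows whose measurements look inconsistent; small differences are expected from rounding.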

Distribution

The data file, typically offered in CSV format, contains 53,940 individual records across 10 columns of diamond attributes. Key columns such as cut, color, clarity, depth, table, and price currently show no mismatched or missing values. The file size is approximately 2.52 MB.
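
The "no mismatched or missing values" claim can be re-checked after download. The sketch below uses only the Python standard library and assumes the CSV header uses the ten column names listed under Columns. The function name, the inline sample rows, and the file name `diamonds.csv` in the usage note are hypothetical. Clarity is only checked for presence, because the listing does not give an exhaustive list of grades.

```python
import csv
import io

EXPECTED_COLUMNS = ["carat", "cut", "color", "clarity", "depth",
                    "table", "price", "x", "y", "z"]
CUT_GRADES = {"Ideal", "Premium", "Very Good", "Good", "Fair"}
COLOR_GRADES = set("DEFGHIJ")
NUMERIC_COLUMNS = {"carat", "depth", "table", "price", "x", "y", "z"}


def check_rows(lines):
    """Return (line_number, column, issue) tuples for problem cells."""
    reader = csv.DictReader(lines)
    absent = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if absent:
        raise ValueError(f"missing columns: {absent}")
    problems = []
    for line_number, row in enumerate(reader, start=2):  # header is line 1
        for column in EXPECTED_COLUMNS:
            value = (row.get(column) or "").strip()
            if not value:
                problems.append((line_number, column, "missing"))
            elif column == "cut" and value not in CUT_GRADES:
                problems.append((line_number, column, "mismatched"))
            elif column == "color" and value not in COLOR_GRADES:
                problems.append((line_number, column, "mismatched"))
            elif column in NUMERIC_COLUMNS:
                try:
                    float(value)
                except ValueError:
                    problems.append((line_number, column, "mismatched"))
    return problems


# Made-up rows: the second one has an unknown cut and an empty depth.
SAMPLE = """carat,cut,color,clarity,depth,table,price,x,y,z
0.30,Ideal,E,VS1,61.5,55,500,4.3,4.3,2.6
0.70,Great,G,SI1,,57,2500,5.7,5.7,3.5
"""
problems = check_rows(io.StringIO(SAMPLE))
print(problems)
```

To run the same check on the downloaded file, pass an open file object instead of the sample, for example `check_rows(open("diamonds.csv", newline=""))`.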

Usage

This collection of data is ideally suited for predictive modelling, particularly regression analysis aiming to forecast diamond prices based on their physical and grading characteristics. Users can perform market research to evaluate the valuation hierarchy of cut quality versus carat weight. It is also excellent for educational purposes, demonstrating the application of statistical methods to real-world luxury goods pricing.
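
As a starting point for the regression use case, the sketch below fits a one-variable least-squares line of log(price) on log(carat) using only the standard library. The log-log form is an assumption chosen for illustration, not a method prescribed by the listing, and the data points are synthetic values generated from a known power law rather than records from the file. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_price_model(carats, prices):
    """Fit log(price) = intercept + slope * log(carat)."""
    return fit_line([math.log(c) for c in carats],
                    [math.log(p) for p in prices])


def predict_price(carat, slope, intercept):
    """Map a carat weight back to a price on the original scale."""
    return math.exp(intercept + slope * math.log(carat))


# Synthetic data from price = 4000 * carat ** 1.7 (NOT from the dataset).
carats = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
prices = [4000 * c ** 1.7 for c in carats]

slope, intercept = fit_price_model(carats, prices)
print(round(slope, 2), round(predict_price(1.2, slope, intercept)))
```

On the real data, a fuller model would also include cut, color, and clarity as predictors. The single-feature version here only shows the mechanics of fitting and predicting.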

Coverage

The data describes universal attributes used in the diamond trade (carat, cut, clarity, etc.). Specific geographic origin, time range, or demographic information regarding the collection of these records is not available in the provided source material.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For building machine learning models to predict diamond pricing.
  • Gemmologists and Industry Analysts: To study the impact of specific physical proportions (depth, table, x, y, z) on light performance and market value.
  • Consumers/Buyers: To understand the trade-offs involved in purchasing diamonds and determining fair market cost based on quality grades.

Dataset Name Suggestions

  • Diamond Pricing Prediction Dataset
  • Gemstone Quality Factors
  • Cut Colour Clarity Analysis
  • Luxury Goods Value Attributes

Listing Stats

  • Views: 3
  • Downloads: 0
  • Listed: 20/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0