Used Car Features and Price Data
Data Science and Analytics
Free
About
This dataset contains 10,000 realistic entries for used cars and is designed for regression tasks that predict vehicle resale prices in USD. Its features include mileage, engine size, number of previous owners, and fuel type. It is suited to practising regression modelling, exploring feature importance, building interactive dashboards, and training machine learning models for pricing prediction.
Columns
- make_year: The year the car was manufactured, ranging from 1995 to 2023.
- mileage_kmpl: The car's fuel efficiency (mileage) in kilometres per litre (kmpl).
- engine_cc: The engine capacity in cubic centimetres (cc).
- fuel_type: The type of fuel the car uses, such as Petrol, Diesel, or Electric.
- owner_count: The number of previous owners the car has had, from 1 to 5.
- price_usd: The target variable, representing the resale price of the car in US dollars.
- brand: The brand name of the car.
- transmission: The type of transmission, either Manual or Automatic.
- color: The exterior colour of the car.
- service_history: The maintenance record of the vehicle, categorized as Full, Partial, or None.
- accidents_reported: The number of accidents reported for the vehicle.
- insurance_valid: An indicator of whether the car's insurance is currently valid (Yes or No).
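The column descriptions above can be turned into a quick sanity check before modelling. The following is a minimal sketch using only the Python standard library. The function name `check_rows` is hypothetical, and the exact spelling of the category values (for example "Manual" and "Automatic") is an assumption based on the descriptions rather than on the file itself.

```python
import csv

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = {
    "make_year", "mileage_kmpl", "engine_cc", "fuel_type", "owner_count",
    "price_usd", "brand", "transmission", "color", "service_history",
    "accidents_reported", "insurance_valid",
}


def check_rows(lines):
    """Return a list of problems found in an iterable of CSV text lines."""
    reader = csv.DictReader(lines)
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]

    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        year = int(row["make_year"])
        owners = int(row["owner_count"])
        if not 1995 <= year <= 2023:
            problems.append(f"line {line_number}: make_year {year} outside 1995-2023")
        if not 1 <= owners <= 5:
            problems.append(f"line {line_number}: owner_count {owners} outside 1-5")
        # Assumption: values are spelled exactly as in the column description.
        if row["transmission"] not in {"Manual", "Automatic"}:
            problems.append(f"line {line_number}: unexpected transmission {row['transmission']!r}")
    return problems
```

To run it against the downloaded file, open the CSV with `open(path, newline="")` and pass the file object to `check_rows`. An empty list means every row passed these particular checks. It does not validate the remaining columns.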
Distribution
The data is provided in a single CSV file named used_car_price_dataset_extended.csv. It is a tabular dataset containing 10,000 rows and 12 columns.
Usage
- Regression Analysis: Predict price_usd using the other available features.
- Feature Engineering: Develop new features, such as car age or a fuel economy score, to improve model performance.
- Data Visualisation: Create analyses and build dashboards to understand how factors like brand or engine size influence price.
- Exploratory Data Analysis (EDA): Use this dataset for learning data cleaning and insight discovery techniques.
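As an illustration of the feature engineering and regression uses listed above, the sketch below derives a car age feature from make_year and fits a one-feature least-squares line to predict price_usd. It uses only the Python standard library. The helper names `add_car_age` and `fit_line` are hypothetical, and the reference year of 2023 is an assumption taken from the latest manufacturing year in the dataset. A real analysis would more likely use several features and a dedicated modelling library.

```python
from statistics import mean

REFERENCE_YEAR = 2023  # assumption: the latest make_year in the dataset


def add_car_age(rows, reference_year=REFERENCE_YEAR):
    """Return copies of the rows with a derived car_age feature (in years)."""
    return [
        {**row, "car_age": reference_year - int(row["make_year"])}
        for row in rows
    ]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has no variance; cannot fit a line")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_price_on_age(rows):
    """Fit price_usd as a linear function of the derived car_age feature."""
    aged = add_car_age(rows)
    ages = [row["car_age"] for row in aged]
    prices = [float(row["price_usd"]) for row in aged]
    return fit_line(ages, prices)
```

The rows can come from `csv.DictReader` over the dataset file. The returned slope estimates the change in price per additional year of age, and the intercept estimates the price of a car at age zero under this simple linear model.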
Coverage
This dataset is synthetically generated and contains no data about real individuals. The vehicles' manufacturing years span from 1995 to 2023.
License
CC BY-SA 4.0.
Who Can Use It
- Data Science Beginners: Ideal for learning about data cleaning, exploratory data analysis, and regression modelling.
- Machine Learning Engineers: Suitable for training and testing real-world pricing prediction models.
- Data Analysts: Perfect for creating visualisations and dashboards to explore the relationships between vehicle features and price.
Dataset Name Suggestions
- Used Car Price Prediction
- Vehicle Resale Value Dataset
- Automotive Pricing for Regression Analysis
- Used Car Features and Price Data
Attributes
Original Data Source: Used Car Features and Price Data