Utrecht Housing / Dutch housing market

Urban Planning & Infrastructure

Tags and Keywords

Housing Dataset

Machine Learning Dataset

Regression Analysis

Classification

Housing Market

Data Visualization

Educational Dataset

Feature Engineering

Free

About

The Utrecht Housing Dataset is a synthetic dataset designed for students and practitioners learning data science and machine learning. Derived from the Dutch housing market, it is high-quality and noise-free, which makes it suitable for a range of algorithms, including decision trees, linear regression, logistic regression, and neural networks. It was created specifically for educational purposes and emphasises responsible AI by being accessible to learners with diverse academic backgrounds.

Dataset Features:

id: Unique identifier for each house, ranging from 0 to 100,000 (not used in algorithms).
zipcode: Zip code of the house's location, indicating its area. Possible values: 3520, 3525, 3800.
lot-len: Length of the house plot in meters, ranging from 5.0 to 100.0.
lot-width: Width of the house plot in meters, ranging from 5.0 to 100.0.
lot-area: Total area of the house plot in square meters, derived from lot-len * lot-width.
house-area: The living area of the house in square meters (e.g., 30.0 for small houses, 200.0 for mansions).
garden-size: The size of the garden in square meters, with larger gardens being desirable.
balcony: Number of balconies (common values: 0, 1, 3).
x-coor: X-coordinate of the house's location (range: 2000 to 3000).
y-coor: Y-coordinate of the house's location (range: 5000 to 6000).
buildyear: The year the house was built (from as early as 1100 to modern times).
bathrooms: Number of bathrooms (common values: 1, 2, or 3).

Output/Target Features:

tax value: Estimated value of the house for taxation, ranging from 50,000 to 1,000,000 euros.
retail value: The market value of the house, also ranging from 50,000 to 1,000,000 euros.
energy-eff: Binary indicator (0 or 1) of whether the house is energy-efficient.
monument: Binary indicator (0 or 1) of whether the house has architectural or historical monumental value.
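
The documented ranges and the lot-area relationship above lend themselves to a quick sanity check before modelling. The following is a minimal sketch, not part of the dataset's own tooling: it assumes each record is a plain Python dict keyed by the feature names as written in this listing (the actual file headers may differ), and the `check_row` helper and the example record are hypothetical.

```python
import math

# Inclusive bounds taken from the feature descriptions in this listing.
RANGES = {
    "lot-len": (5.0, 100.0),
    "lot-width": (5.0, 100.0),
    "x-coor": (2000, 3000),
    "y-coor": (5000, 6000),
}
ZIPCODES = {3520, 3525, 3800}
BINARY_FEATURES = ("energy-eff", "monument")


def check_row(row):
    """Return a list of human-readable problems found in one record."""
    problems = []
    if row["zipcode"] not in ZIPCODES:
        problems.append(f"unexpected zipcode {row['zipcode']}")
    for name, (low, high) in RANGES.items():
        if not low <= row[name] <= high:
            problems.append(f"{name}={row[name]} outside [{low}, {high}]")
    # lot-area is described as lot-len * lot-width.
    expected_area = row["lot-len"] * row["lot-width"]
    if not math.isclose(row["lot-area"], expected_area, rel_tol=1e-6):
        problems.append(f"lot-area={row['lot-area']} but expected {expected_area}")
    for name in BINARY_FEATURES:
        if row[name] not in (0, 1):
            problems.append(f"{name}={row[name]} is not 0 or 1")
    return problems


# Hypothetical record, made up for illustration (not a row from the dataset).
example = {
    "zipcode": 3520,
    "lot-len": 20.0,
    "lot-width": 10.0,
    "lot-area": 200.0,
    "x-coor": 2500,
    "y-coor": 5500,
    "energy-eff": 1,
    "monument": 0,
}

print(check_row(example))
```

An empty list means the record is consistent with the documented ranges; any other result names the fields that need a closer look.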

Usage:

The dataset is ideal for:

Machine Learning Applications: Training and testing predictive models for tax valuation, market value, and energy efficiency.
Feature Analysis: Exploring the relationships between housing attributes and target values.
Educational Purposes: Teaching students about regression, classification, and feature engineering.
Visualisation: Creating plots and graphs from well-structured, interpretable data.
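
As an illustration of the regression use case, the sketch below fits a one-feature least-squares line that predicts retail value from house-area. It is only a sketch under stated assumptions: the number pairs are made up for the example and are not rows from the dataset, and `fit_line` and `predict` are hypothetical helper names. In practice a library such as scikit-learn would usually replace the hand-written fit.

```python
# Made-up (house-area in m2, retail value in euros) pairs for illustration only;
# these are NOT rows from the Utrecht Housing Dataset.
rows = [
    (60.0, 200_000.0),
    (80.0, 260_000.0),
    (100.0, 320_000.0),
    (120.0, 380_000.0),
    (150.0, 470_000.0),
]


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(rows)


def predict(area):
    """Predict retail value (euros) for a given house-area (m2)."""
    return intercept + slope * area


print(f"estimated euros per extra square meter: {slope:.0f}")
print(f"predicted retail value for 90 m2: {predict(90.0):.0f}")
```

The same pattern extends to tax value as the target, or to several input features once a matrix-based solver is used.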

Coverage:

The dataset provides a comprehensive representation of housing features relevant to the Dutch market, ensuring high usability for educational and experimental projects.

License:

CC0 (Public Domain)

Who Can Use It:

This dataset is designed for students, researchers, data scientists, and machine learning practitioners seeking to explore real-world applications of AI in housing markets.

How to Use It:

Develop predictive models for tax and retail value estimation.
Evaluate housing energy efficiency or monumental status using classification techniques.
Explore feature importance to understand what drives housing value.
Benchmark machine learning algorithms on a synthetic, high-quality dataset.
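
The classification use case can be sketched with the simplest possible tree: a single-threshold decision stump that predicts energy-eff from buildyear. This is an assumption-laden illustration rather than a recommended model. The pairs below are invented, the listing does not state how energy-eff relates to buildyear, and `fit_stump` is a hypothetical helper.

```python
# Made-up (buildyear, energy-eff) pairs for illustration only;
# these are NOT rows from the Utrecht Housing Dataset.
rows = [
    (1890, 0), (1935, 0), (1960, 0), (1985, 0),
    (1998, 1), (2005, 1), (2015, 1), (2021, 1),
]


def fit_stump(pairs):
    """Pick the buildyear threshold with the best training accuracy.

    The stump predicts 1 when buildyear >= threshold, otherwise 0.
    """
    best_threshold, best_accuracy = None, -1.0
    for threshold in sorted({year for year, _ in pairs}):
        correct = sum((year >= threshold) == bool(label) for year, label in pairs)
        accuracy = correct / len(pairs)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


threshold, accuracy = fit_stump(rows)
print(f"predict energy-eff=1 when buildyear >= {threshold}")
print(f"training accuracy: {accuracy:.2f}")
```

Accuracy here is measured on the training pairs only; a fair benchmark would hold out part of the data for evaluation.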

Listing Stats

LISTED

21/11/2024

REGION

GLOBAL

QUALITY

5 / 5