Opendatabay APP

Southeast Asia Property Price Dataset

Data Science and Analytics

Tags and Keywords

Malaysia, Condo, Housing, Prediction, Mudah

Free

About

This dataset captures pricing and attribute details for Malaysian condominium units, scraped from listings on the housing website mudah.my. With over 4,000 unit entries, it is designed for machine learning tasks, chiefly predicting house prices from unit characteristics, nearby amenities, and building specifications. Unlike highly curated benchmark datasets, it presents a realistic challenge for data scientists and analysts, with ample opportunity to apply and refine data cleaning and transformation techniques.

Columns

The dataset includes detailed fields describing the unit and its location:
  • description: The complete, unfiltered text description of the unit listing.
  • Ad List: The unique identification number for the listing on the source website.
  • Category: The type of listing, typically "Apartment / Condominium."
  • Facilities: A listing of amenities available, provided as a comma-separated list.
  • Building Name: The name of the structure.
  • Developer: The company responsible for the property development.
  • Tenure Type: The type of land ownership tenure for the property.
  • Address: The location details of the building.
  • Completion Year: The year the building was finished, listed as '-' if construction is ongoing.
  • # of Floors: The total number of levels in the building.
  • Total Units: The total quantity of units within the building.
  • Property Type: Classification of the property.
  • Bedroom: The count of bedrooms in the specific unit.
  • Bathroom: The count of bathrooms in the specific unit.
  • Parking Lot: The number of assigned parking spaces, if applicable.
  • Floor Range: The range of floors the unit occupies within the building.
  • Property Size: The physical size of the unit.
  • Land Title: The official title given to the land.
  • Firm Type/Number/REN Number: Identification details related to the firm that posted the listing.
  • price: The monetary value of the unit, serving as the key prediction target.
  • Nearby amenities: Includes specific columns detailing proximity to locations such as schools, parks, railway stations, bus stops, malls, and major highways.
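
Two of the column notes above translate directly into cleaning steps: Completion Year uses '-' for unfinished buildings, and Facilities is a comma-separated string. The following is a minimal sketch in Python with pandas, assuming the CSV column names match the list above exactly. The function name `clean_listing_columns`, the derived `facility_list` column, and the sample rows are hypothetical and used only for illustration.

```python
import pandas as pd


def clean_listing_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a numeric Completion Year and a parsed facility list."""
    out = df.copy()

    # '-' marks a building still under construction; any non-numeric
    # value (including '-') is coerced to NaN.
    out["Completion Year"] = pd.to_numeric(out["Completion Year"], errors="coerce")

    # Split the comma-separated Facilities string into a list of
    # trimmed names; missing values become an empty list.
    out["facility_list"] = out["Facilities"].fillna("").apply(
        lambda text: [item.strip() for item in text.split(",") if item.strip()]
    )
    return out


# Made-up rows, only to show the shape of the input.
sample = pd.DataFrame(
    {
        "Completion Year": ["2015", "-"],
        "Facilities": ["Swimming Pool, Gymnasium", None],
    }
)
cleaned = clean_listing_columns(sample)
print(cleaned[["Completion Year", "facility_list"]])
```

Other columns, such as price and Property Size, may need similar parsing, but their raw formats are not described here, so inspect them before writing a parser.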

Distribution

The data consists of records for more than 4,000 condominium units. It was scraped directly from the website and is offered for download as a flat CSV file. Because the data is raw scraper output, users should expect it to be less refined and less organized than prepared datasets.

Usage

This data is ideally suited for:
  • Developing and evaluating predictive models for real estate valuation.
  • Practising real-world data cleaning, feature engineering, and dealing with missing or inconsistently structured information.
  • Analysing price factors and locational influences within the Malaysian housing market.
  • Comparative analysis against standard housing datasets, such as the Melbourne Housing Dataset.
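
When evaluating predictive models on the price column, it helps to have a trivial baseline to beat. The sketch below is one possible baseline, not a method prescribed by the dataset: it predicts the training-set median for every held-out unit and reports the mean absolute error. It assumes prices have already been converted to plain numbers; the function name `median_baseline_mae` and its parameters are hypothetical.

```python
import numpy as np


def median_baseline_mae(prices, test_fraction=0.2, seed=0):
    """Mean absolute error of a 'predict the training median' baseline."""
    prices = np.asarray(prices, dtype=float)

    # Shuffle indices reproducibly, then hold out a fraction for testing.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(prices))
    n_test = max(1, int(round(len(prices) * test_fraction)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    # The "model" is a single number: the median of the training prices.
    baseline = np.median(prices[train_idx])
    return float(np.mean(np.abs(prices[test_idx] - baseline)))
```

Any model trained on the unit, building, and amenity columns should achieve a lower error than this baseline on the same split before it is considered useful.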

Coverage

The scope is limited to condominium properties located in Malaysia. All records are listings scraped from the mudah.my housing platform. No time range is specified for the listings; what was collected depends on the scraping process, which required deliberate time-outs to manage the website's protections.

License

CC BY-NC-SA 4.0

Who Can Use It

  • Machine Learning Engineers: To train models for predicting target variables like property price.
  • Aspiring Data Scientists: To build foundational skills by cleaning and preparing real-world, messy data for analysis.
  • Real Estate Market Researchers: To gain detailed insights into current Malaysian condominium market dynamics, pricing trends, and amenity valuation.

Dataset Name Suggestions

  • Malaysian Housing Price Prediction Data
  • Condominium Listings from mudah.my
  • Southeast Asia Property Price Dataset
  • Raw Malaysian Real Estate Data

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 07/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
