Opendatabay

Boston Suburb Home Values Dataset

Data Science and Analytics

Tags and Keywords

Housing, Boston, Prices, Realestate, Machinelearning

Free

About

This dataset is designed for housing price prediction using machine learning approaches. It concerns housing values in the suburbs of Boston, Massachusetts, USA. The data includes various attributes that describe socio-economic factors, environmental conditions, and property characteristics for different tracts. This dataset has been previously utilised in notable academic works, including 'Regression diagnostics…' by Belsley, Kuh & Welsch (1980) and Quinlan's 1993 study on combining instance-based and model-based learning. It is a suitable resource for developing and evaluating predictive models for real estate values.

Columns

The dataset comprises 14 attributes: 13 predictors and one target ("class") attribute, "MEDV" (Median Value). All attributes are continuous except the binary-valued "CHAS".
  • CRIM: Per capita crime rate by town.
  • ZN: Proportion of residential land zoned for lots over 25,000 sq.ft.
  • INDUS: Proportion of non-retail business acres per town.
  • CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
  • NOX: Nitric oxides concentration (parts per 10 million).
  • RM: Average number of rooms per dwelling.
  • AGE: Proportion of owner-occupied units built prior to 1940.
  • DIS: Weighted distances to five Boston employment centres.
  • RAD: Index of accessibility to radial highways.
  • TAX: Full-value property-tax rate per $10,000.
  • PTRATIO: Pupil-teacher ratio by town.
  • B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town.
  • LSTAT: Percentage lower status of the population.
  • MEDV: Median value of owner-occupied homes in $1000's.
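The 'B' attribute is the only column defined by a formula, so a small worked example may help. The sketch below evaluates 1000(Bk - 0.63)^2 for a given proportion; the function name `b_from_proportion` and the input values are illustrative and not part of the dataset.

```python
def b_from_proportion(bk: float) -> float:
    """Return B = 1000 * (Bk - 0.63)^2 for a proportion Bk in [0, 1]."""
    if not 0.0 <= bk <= 1.0:
        raise ValueError("Bk is a proportion and must lie in [0, 1]")
    return 1000.0 * (bk - 0.63) ** 2


# Illustrative inputs (not dataset values), equally far from 0.63.
low = b_from_proportion(0.53)
high = b_from_proportion(0.73)
print(round(low, 6), round(high, 6))
```

Because the expression is a square centred on 0.63, two different proportions can produce the same B, so Bk cannot be uniquely recovered from the stored column.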

Distribution

The dataset contains 509 instances (rows) and 14 attributes (columns). It is tabular data distributed as a CSV file, 'Housing.csv'. The attributes are continuous except for the binary 'CHAS'. Six values are missing in total: three in 'INDUS' and one each in 'AGE', 'RAD', and 'LSTAT'.
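The row and missing-value counts above can be checked with the standard library alone. The following is a minimal sketch: the inline sample uses made-up values rather than real rows, and the set of tokens treated as missing (empty string, "NA", "NaN") is an assumption, since the listing does not say how gaps are encoded in 'Housing.csv'.

```python
import csv
import io

# Hypothetical excerpt with made-up values; NOT real rows from Housing.csv.
SAMPLE = """CRIM,INDUS,CHAS,AGE,MEDV
0.10,5.0,0,60.0,20.0
0.20,,1,NA,25.0
0.30,7.5,0,80.0,30.0
"""

# Assumed encodings of a missing value; adjust after inspecting the real file.
MISSING_TOKENS = {"", "NA", "NaN"}


def count_missing(handle):
    """Return (row_count, {column: missing_count}) for an open CSV handle."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    counts = {name: 0 for name in fieldnames}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in fieldnames:
            value = row.get(name)
            # Short rows yield None; blank or placeholder cells count too.
            if value is None or value.strip() in MISSING_TOKENS:
                counts[name] += 1
    return n_rows, counts


n_rows, missing = count_missing(io.StringIO(SAMPLE))
print(n_rows, {name: n for name, n in missing.items() if n})
```

To run it against the real file, open it with `open("Housing.csv", newline="")` and pass the handle to `count_missing`. If the figures quoted here hold, the result should be 509 rows with 3 missing in 'INDUS' and 1 each in 'AGE', 'RAD', and 'LSTAT'.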

Usage

Typical applications of this dataset include:
  • Developing and testing machine learning models for housing price prediction.
  • Performing linear regression analysis to understand the impact of different factors on home values.
  • Engaging in model comparison to evaluate the performance of different predictive algorithms.
  • Practising data visualisation techniques to uncover patterns and relationships within the data.
  • Conducting data cleaning exercises due to the presence of a few missing entries.
  • Analysing real estate market trends and socio-economic influences on property values.
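As an illustration of the linear regression use case listed above, the sketch below fits a one-feature least-squares line of 'MEDV' on 'RM' and reports the in-sample root mean squared error. The four (RM, MEDV) pairs are made up for demonstration, and the helper names `fit_simple_ols` and `rmse` are hypothetical. A real analysis would load the columns from 'Housing.csv', handle the missing values first, and evaluate on held-out rows.

```python
from math import sqrt


def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(squared) / len(squared))


# Made-up illustrative values; NOT taken from the dataset.
rm = [5.0, 6.0, 7.0, 8.0]        # average rooms per dwelling
medv = [15.0, 21.0, 29.0, 35.0]  # median home value in $1000's

slope, intercept = fit_simple_ols(rm, medv)
error = rmse(rm, medv, slope, intercept)
print(f"MEDV ~ {slope:.2f} * RM + {intercept:.2f}, in-sample RMSE {error:.3f}")
```

A single-feature baseline like this gives a reference error against which multi-feature or non-linear models can be compared.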

Coverage

  • Geographic Scope: The data pertains to the suburbs of Boston, Massachusetts.
  • Time Range: No collection period is stated. The only temporal reference is the 'AGE' attribute, which measures the proportion of owner-occupied units built prior to 1940.
  • Demographic Scope: The dataset includes demographic information such as the proportion of blacks by town (attribute 'B') and the percentage of lower status population (attribute 'LSTAT').

License

CC0: Public Domain

Who Can Use It

This dataset is particularly valuable for:
  • Machine Learning Engineers and Data Scientists: For building and refining predictive models for real estate.
  • Real Estate Analysts: To gain insights into market drivers and assess property valuation factors in the Boston area.
  • Academics and Researchers: As a well-established benchmark for studies in econometrics, statistics, and machine learning.
  • Students and Educators: For learning and practical application of data analysis, statistical modelling, and machine learning concepts.

Dataset Name Suggestions

  • Boston Housing Prices Data
  • Boston Suburb Home Values Dataset
  • Boston Real Estate Prediction Data
  • Boston ML Housing Dataset

Attributes

Original Data Source: Boston Suburb Home Values Dataset

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 03/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
