Opendatabay APP

Historical Biometric Height Analysis Records

Data Science and Analytics

Tags and Keywords

Regression

Statistics

Pearson

Anthropometry

Heredity

Free

About

These historical records come from a renowned experiment conducted by Karl Pearson around 1903 and capture the heights of fathers and their sons. Comprising 1,078 paired observations, the data supports the study of heredity and statistical regression. To ensure privacy and standardisation, random noise was added to the original figures, producing height measurements reported to the nearest 0.1 inch. The records are intended primarily as a foundational resource for practising and understanding simple linear regression.

Columns

  • Father: Represents the height of the father in inches.
  • Son: Represents the height of the son in inches.
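As a minimal sketch of reading the two columns described above, the snippet below parses a CSV stream with Python's standard library. It assumes the header row uses exactly the names `Father` and `Son`; the helper name `load_heights` and the two sample rows are made up for illustration and are not taken from the real file.

```python
import csv
import io


def load_heights(text_stream):
    """Read (father, son) heights in inches from an open CSV text stream.

    Assumes a header row with columns named "Father" and "Son".
    """
    reader = csv.DictReader(text_stream)
    fathers, sons = [], []
    for row in reader:
        fathers.append(float(row["Father"]))
        sons.append(float(row["Son"]))
    return fathers, sons


# Made-up rows in the assumed layout, not values from the dataset.
sample = io.StringIO("Father,Son\n65.0,66.5\n70.2,69.8\n")
fathers, sons = load_heights(sample)
print(fathers, sons)
```

With the downloaded file, the same function could be called on a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was saved.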

Distribution

The data is tabular and available as a CSV file of approximately 11.87 kB. It contains 1,078 valid records (rows) across 2 columns. There are no missing values or mismatched entries, so the file is 100% valid. The mean height is 67.7 inches for fathers and 68.7 inches for sons.
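The figures above can be checked after download with a few lines of Python. The sketch below is one possible approach: `summarise` is a hypothetical helper, and the three input pairs are made-up values used only to show the call. On the real file, the result would be expected to report 1,078 rows, 0 missing values, and means of 67.7 and 68.7 inches.

```python
import math
from statistics import mean


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def summarise(fathers, sons):
    """Return row count, missing-value count, and column means (1 decimal)."""
    if len(fathers) != len(sons):
        raise ValueError("columns must have the same number of rows")
    missing = sum(is_missing(v) for v in fathers) + sum(is_missing(v) for v in sons)
    return {
        "rows": len(fathers),
        "missing": missing,
        "father_mean": round(mean(v for v in fathers if not is_missing(v)), 1),
        "son_mean": round(mean(v for v in sons if not is_missing(v)), 1),
    }


# Made-up heights for illustration only, not rows from the dataset.
report = summarise([64.0, 70.0, 67.5], [66.0, 70.0, 71.0])
print(report)
```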

Usage

  • Statistical Training: Ideal for students learning the fundamentals of simple linear regression and correlation.
  • Algorithm Validation: Useful for testing and validating basic regression algorithms and predictive models.
  • Educational Demonstrations: Suitable for academic lectures demonstrating the concept of regression to the mean.
  • Data Visualisation: Excellent for creating scatter plots and fitting trend lines to visualise intergenerational height relationships.
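To make the regression use cases above concrete, here is a minimal sketch of an ordinary least squares fit of son height on father height, together with Pearson's correlation coefficient. The function name `fit_line` and the five height pairs are illustrative assumptions, not values from the dataset. A fitted slope below 1 is what regression to the mean looks like in practice: a father well above average is predicted to have a son who is above average by a smaller margin.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Made-up heights in inches, for illustration only.
fathers = [64.0, 66.0, 68.0, 70.0, 72.0]
sons = [66.0, 67.0, 69.0, 69.0, 71.0]

slope, intercept, r = fit_line(fathers, sons)

# Predicted son height for a father 4 inches above the fathers' mean (68.0).
predicted_tall = intercept + slope * 72.0
print(slope, intercept, r, predicted_tall)
```

The same `fit_line` call works on the full `Father` and `Son` columns once they are loaded as lists of floats; the fitted line can then be drawn over a scatter plot with any plotting library.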

Coverage

  • Temporal Scope: The data originates from an experiment conducted circa 1903.
  • Demographic Scope: Covers male subjects (fathers) and their male offspring (sons).
  • Geographic Scope: Specific location metadata is not included. The data is associated with Karl Pearson's biometric research, which he carried out at University College London; this listing links the records to the University of California, Berkeley Department of Statistics archives.

License

CC0: Public Domain

Who Can Use It

  • Data Science Students: For practising regression techniques and data cleaning.
  • Statistics Educators: To provide clear, historical examples of linear relationships.
  • Machine Learning Beginners: For building introductory predictive models without complex feature engineering.
  • Researchers: Interested in historical anthropometric data and the history of statistics.

Dataset Name Suggestions

  • Pearson's 1903 Father-Son Height Regression Data
  • Historical Biometric Height Analysis Records
  • Linear Regression Training Data: Fathers and Sons
  • Pearson's Anthropometric Height Pairs

Attributes

Listing Stats

VIEWS

6

DOWNLOADS

1

LISTED

06/12/2025

REGION

GLOBAL

UDQS (Universal Data Quality Score) QUALITY

5 / 5

VERSION

1.0
