Parent-Child Height Study

Data Science and Analytics

Tags and Keywords

Height

Heredity

Galton

Family

Genetics

Free

About

This dataset captures Francis Galton's 1886 observations of the heights of parents and their adult children. The primary aim of the study was to investigate the relationship between children's heights and those of their parents. Galton also sought to determine whether marital partner selection produced a correlation between a husband's and a wife's heights. The dataset contains 934 individual observations, one per adult child, drawn from 205 families.

Columns

  • rownames: A unique identifier for each observation, ranging from 1 to 934.
  • family: Identifies individual families within the study, with 205 unique family units observed.
  • father: The father's height in inches, ranging from 62.0 to 78.5 (mean 69.2, standard deviation 2.48).
  • mother: The mother's height in inches, ranging from 58.0 to 70.5 (mean 64.1, standard deviation 2.29).
  • midparentHeight: The calculated mid-parent height in inches, derived using the formula (father + 1.08*mother)/2. Values range from 64.4 to 75.4 (mean 69.2, standard deviation 1.8).
  • children: Indicates the total number of children within each family included in the study, ranging from 1 to 15 (mean 6.17, standard deviation 2.73).
  • childNum: The sequential number of a child within their family. Within each family, boys are listed first and girls second, each group in decreasing order of height. Values range from 1 to 15 (mean 3.59, standard deviation 2.36).
  • gender: Specifies the gender of the child, with 51% male and 49% female observations.
  • childHeight: The height of the adult child in inches, ranging from 56.0 to 79.0 (mean 66.7, standard deviation 3.58).
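
The mid-parent formula given for the midparentHeight column can be sketched in a few lines of Python. The function name `midparent_height` and the sample heights are illustrative assumptions, not part of the dataset; the 1.08 factor is the one stated in the column description, which scales the mother's height before averaging.

```python
def midparent_height(father: float, mother: float) -> float:
    """Mid-parent height as described in the column list:
    (father + 1.08 * mother) / 2, with both heights in the same unit."""
    return (father + 1.08 * mother) / 2


# Illustrative inputs (hypothetical values within the ranges listed above).
print(round(midparent_height(78.5, 67.0), 2))  # 75.43
```

Recomputing this value for each row and comparing it with the stored midparentHeight column is a quick consistency check on a downloaded copy.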

Distribution

The dataset is provided as a single CSV file named GaltonFamilies.csv, with a size of 34.72 kB. It comprises 9 columns and 934 valid records (rows), representing observations from 205 families.
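
A minimal sketch of loading the file and checking it against the figures above, using only the Python standard library. The helper names `load_rows` and `summarize` are hypothetical, the header is assumed to match the nine column names listed under Columns exactly, and the three sample rows are made-up stand-ins so the snippet runs without the real file.

```python
import csv
import io

# Column names as listed in this document (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "rownames", "family", "father", "mother", "midparentHeight",
    "children", "childNum", "gender", "childHeight",
]


def load_rows(fileobj):
    """Read the CSV into a list of dicts, failing fast on an unexpected header."""
    reader = csv.DictReader(fileobj)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


def summarize(rows):
    """Count records and distinct families."""
    return {"rows": len(rows), "families": len({r["family"] for r in rows})}


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE = """rownames,family,father,mother,midparentHeight,children,childNum,gender,childHeight
1,001,70.0,65.0,70.1,2,1,male,71.0
2,001,70.0,65.0,70.1,2,2,female,66.0
3,002,68.0,62.5,67.75,1,1,male,69.5
"""

rows = load_rows(io.StringIO(SAMPLE))
print(summarize(rows))
```

With the real file, passing `open("GaltonFamilies.csv", newline="")` to `load_rows` should yield 934 rows and 205 families if the download matches this listing.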

Usage

This dataset is ideal for studies in heredity, particularly for establishing relationships between parental and offspring traits. It is well-suited for statistical analysis, including regression models to predict child height based on parental heights, and for research into marital selection patterns based on physical attributes. Researchers interested in historical demographic studies or foundational genetic studies will find this dataset valuable.
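
The two analyses mentioned above, regressing child height on parental height and correlating spouses' heights, can be sketched with ordinary least squares and Pearson's correlation coefficient. This is an illustrative standard-library implementation, not Galton's original procedure; the function names and the small lists of heights are assumptions made up for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


# Made-up heights for illustration; replace with columns read from the CSV.
midparent = [66.0, 68.0, 69.0, 70.0, 72.0]
child = [65.5, 67.0, 68.5, 68.0, 70.5]
father = [68.0, 70.0, 69.5, 71.0, 73.0]
mother = [63.0, 64.5, 64.0, 65.0, 64.5]

slope, intercept = fit_line(midparent, child)
print(f"childHeight ~ {slope:.3f} * midparentHeight + {intercept:.2f}")
print(f"father/mother correlation: {pearson_r(father, mother):.3f}")
```

On the real data, the first call would take the midparentHeight and childHeight columns, and the second would take one father and mother value per family rather than per child, since parents' heights repeat across siblings.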

Coverage

The data date from 1886, making this a historical dataset. The scope covers 934 adult children from 205 families. The listing does not specify a geographic location. Beyond heights, the dataset records family structure, the number of children per family, and each child's gender.

License

CC0: Public Domain

Who Can Use It

  • Researchers and Academics: For studies in biometrics, heredity, genetics, and statistical modelling.
  • Statisticians: To practise and demonstrate regression analysis and correlation studies.
  • Students: As a classical dataset for learning about data analysis, historical scientific methods, and the foundations of quantitative genetics.
  • Social Scientists: For exploring historical patterns of marriage and family dynamics related to physical characteristics.

Dataset Name Suggestions

  • Galton's Heredity Height Data
  • Parent-Child Height Study (1886)
  • Historical Human Height Dataset
  • Galton Family Heights
  • Adult Offspring Height Regression Data

Attributes

Original Data Source: Parent-Child Height Study

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 03/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
