Historical Biometric Height Analysis Records
Data Science and Analytics
Free
About
These historical records come from a well-known study conducted by Karl Pearson around 1903 and give the heights of fathers and their sons. The 1,078 paired observations support the study of heredity and statistical regression. To ensure privacy and standardisation, random noise was added to the original figures, and the resulting heights are reported to the nearest 0.1 inch; this is the reported precision, not the accuracy of the original measurements. The records are intended primarily as a foundational resource for practising and understanding simple linear regression.
Columns
- Father: Represents the height of the father in inches.
- Son: Represents the height of the son in inches.
Distribution
The data is tabular and is available as a CSV file of approximately 11.87 kB. It contains 1,078 valid records (rows) across 2 columns. There are no missing values or mismatched entries, so every row is usable as-is. The mean height is 67.7 inches for fathers and 68.7 inches for sons.
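The figures above can be checked after download with a short script. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses the column names Father and Son exactly as listed under Columns; the file name in the closing comment is hypothetical, and the three sample rows are invented for illustration and are not taken from the records.

```python
import csv
import io
from statistics import fmean


def load_height_pairs(stream):
    """Read (father, son) height pairs in inches from an open CSV text stream."""
    pairs = []
    for row in csv.DictReader(stream):
        # float() raises ValueError on a blank or non-numeric cell and
        # TypeError on a cell that is absent altogether, so a missing or
        # mismatched entry stops the load instead of passing silently.
        pairs.append((float(row["Father"]), float(row["Son"])))
    return pairs


def summarise(pairs):
    """Return the row count and the mean heights rounded to 0.1 inch."""
    fathers = [father for father, _ in pairs]
    sons = [son for _, son in pairs]
    return {
        "rows": len(pairs),
        "mean_father": round(fmean(fathers), 1),
        "mean_son": round(fmean(sons), 1),
    }


# Invented three-row sample, used only so the sketch runs without the file.
sample = io.StringIO("Father,Son\n64.0,66.0\n68.0,69.0\n72.0,70.5\n")
sample_summary = summarise(load_height_pairs(sample))
print(sample_summary)

# With the real file (hypothetical name), the same two calls apply:
# with open("father_son_heights.csv", newline="") as f:
#     print(summarise(load_height_pairs(f)))
```

If the file matches the description above, the summary for the real data should show 1,078 rows and means of 67.7 and 68.7 inches.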
Usage
- Statistical Training: Ideal for students learning the fundamentals of simple linear regression and correlation.
- Algorithm Validation: Useful for testing and validating basic regression algorithms and predictive models.
- Educational Demonstrations: Suitable for academic lectures demonstrating the concept of regression to the mean.
- Data Visualisation: Excellent for creating scatter plots and fitting trend lines to visualise intergenerational height relationships.
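The regression uses listed above can be illustrated with a hand-written least-squares fit. This is a minimal sketch, not a reference implementation: the function names fit_line and predict are illustrative, and the five height pairs are invented so the arithmetic is easy to follow. They are not drawn from the records.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted son's height for a father of height x, in inches."""
    return intercept + slope * x


# Invented heights in inches, for illustration only.
fathers = [64.0, 66.0, 68.0, 70.0, 72.0]
sons = [66.0, 68.0, 68.0, 70.0, 70.0]

slope, intercept = fit_line(fathers, sons)
tall_father = 72.0
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"predicted son for a {tall_father} inch father: "
      f"{predict(slope, intercept, tall_father):.1f}")
```

In this toy sample the fitted slope is below 1, so a father well above the fathers' average is predicted to have a son who is above the sons' average by a smaller margin. That pattern is what regression to the mean describes; the slope obtained from the actual records has to be computed from the file itself.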
Coverage
- Temporal Scope: The data originates from an experiment conducted circa 1903.
- Demographic Scope: Covers male subjects (fathers) and their male offspring (sons).
- Geographic Scope: No location metadata is included. The data comes from Karl Pearson's biometric research, which he carried out at University College London; this copy is associated with the University of California, Berkeley Department of Statistics archives.
License
CC0: Public Domain
Who Can Use It
- Data Science Students: For practising regression techniques on data that is already clean and needs no preprocessing.
- Statistics Educators: To provide clear, historical examples of linear relationships.
- Machine Learning Beginners: For building introductory predictive models without complex feature engineering.
- Researchers: For studying historical anthropometric data and the history of statistics.
Dataset Name Suggestions
- Pearson's 1903 Father-Son Height Regression Data
- Historical Biometric Height Analysis Records
- Linear Regression Training Data: Fathers and Sons
- Pearson's Anthropometric Height Pairs
Attributes
Original Data Source: Historical Biometric Height Analysis Records
