Gendered Naming Patterns Dataset

Data Science and Analytics

Tags and Keywords

Names, Gender, Social, Trends, Baby

Free

About

This dataset explores first name trends for babies across the United States, United Kingdom, Canada, and Australia by combining raw counts of male and female names from various government sources. It calculates the probability of a name being associated with a specific gender based on the aggregated counts, offering insights into naming patterns over different periods in these countries. The data is useful for tasks such as classification and clustering within the social sciences.

Columns

Name: The first or given name (String).
Gender: The gender associated with the name, indicated as 'M' for male or 'F' for female (Category/String).
Count: The total number of occurrences for the name (Integer).
Probability: The calculated probability of the name belonging to the specified gender (Float).
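
The column types above can be sketched as a small typed reader. This is an illustrative sketch only: it assumes the CSV header uses exactly the names Name, Gender, Count, and Probability, and `NameRecord` and `parse_records` are hypothetical helpers, not part of the dataset. The sample rows are made up.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    name: str
    gender: str         # 'M' or 'F'
    count: int
    probability: float


def parse_records(csv_file) -> list[NameRecord]:
    """Read rows from a file-like object, converting each column to its listed type."""
    records = []
    for row in csv.DictReader(csv_file):
        gender = row["Gender"].strip()
        if gender not in ("M", "F"):
            raise ValueError(f"unexpected gender code: {gender!r}")
        records.append(
            NameRecord(
                name=row["Name"].strip(),
                gender=gender,
                count=int(row["Count"]),
                probability=float(row["Probability"]),
            )
        )
    return records


# Made-up rows for illustration only; they are not taken from data.csv.
SAMPLE = (
    "Name,Gender,Count,Probability\n"
    "Alex,M,900,0.45\n"
    "Alex,F,100,0.05\n"
    "Maria,F,1000,0.5\n"
)

sample_records = parse_records(io.StringIO(SAMPLE))
print(sample_records[0])
```

To read the real file, the same function could be given a handle opened with `open("data.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.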

Distribution

The data is provided as a single tabular CSV file (data.csv, 3.77 MB) with 147,270 rows (instances) and 4 columns (features). No column contains missing values.
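
The row count, column count, and absence of missing values stated above can be checked after download. The following is a minimal sketch using only the Python standard library; `summarize` and `matches_listing` are hypothetical helper names, a cell is assumed to be "missing" when it is blank or absent, and the sample is made up.

```python
import csv
import io

EXPECTED_ROWS = 147_270      # row count stated in the listing
EXPECTED_COLUMNS = 4         # column count stated in the listing


def summarize(csv_file) -> dict:
    """Count data rows, columns, and missing cells in a CSV file-like object."""
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = 0
    missing = 0
    for row in reader:
        rows += 1
        # A cell counts as missing when it is blank after stripping...
        missing += sum(1 for cell in row if not cell.strip())
        # ...or when the row is shorter than the header.
        missing += max(0, len(header) - len(row))
    return {"rows": rows, "columns": len(header), "missing": missing}


def matches_listing(summary: dict) -> bool:
    """True when the summary agrees with the figures quoted in the listing."""
    return (
        summary["rows"] == EXPECTED_ROWS
        and summary["columns"] == EXPECTED_COLUMNS
        and summary["missing"] == 0
    )


# Made-up sample with one blank Count cell, for illustration only.
SAMPLE = "Name,Gender,Count,Probability\nAlex,M,900,0.45\nMaria,F,,0.5\n"
sample_summary = summarize(io.StringIO(SAMPLE))
print(sample_summary)
print(matches_listing(sample_summary))
```

On the downloaded file, `matches_listing(summarize(f))` with `f` opened on data.csv would be expected to return True if the file matches the description.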

Usage

Ideal applications for this data include social science research, onomastic studies (the study of names), and developing machine learning models for gender classification based on names. It can also be used for data cleaning tasks, trend analysis of popular names over time, and clustering names based on their characteristics.
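
As a rough illustration of name-based gender classification, the sketch below builds a majority-vote lookup from the Count column. It is a simple baseline under stated assumptions, not a trained model and not the provider's method: `build_lookup` and `predict_gender` are hypothetical helpers, names are matched case-insensitively, and the share is computed directly from Count because the listing does not spell out the formula behind the Probability column. The sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def build_lookup(csv_file) -> dict:
    """Aggregate Count per gender for each lower-cased name."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(csv_file):
        name = row["Name"].strip().lower()
        counts[name][row["Gender"].strip()] += int(row["Count"])
    return counts


def predict_gender(name: str, lookup: dict):
    """Return (gender, share) for the majority gender, or None for an unseen name."""
    by_gender = lookup.get(name.strip().lower())
    if not by_gender:
        return None
    total = sum(by_gender.values())
    if total == 0:
        return None
    gender, count = max(by_gender.items(), key=lambda item: item[1])
    return gender, count / total


# Made-up rows for illustration only; they are not taken from data.csv.
SAMPLE = (
    "Name,Gender,Count,Probability\n"
    "Alex,M,900,0.45\n"
    "Alex,F,100,0.05\n"
    "Maria,F,1000,0.5\n"
)

lookup = build_lookup(io.StringIO(SAMPLE))
print(predict_gender("alex", lookup))
print(predict_gender("Zed", lookup))
```

Returning the share alongside the label lets a caller set its own threshold for names used by both genders, and returning None for unseen names keeps the lookup from guessing.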

Coverage

Geographic Scope: The dataset includes data from four countries:
- United States: 1880 to 2019
- United Kingdom (England and Wales): 2011 to 2018
- Canada (British Columbia): 1918 to 2018
- Australia: 1944 to 2019
Time Range: Coverage varies by country, with the earliest records from 1880 and the most recent from 2019.
Demographic Scope: The data pertains to the first names of male and female babies.

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Who Can Use It

Social Scientists and Researchers: To study cultural trends, naming conventions, and demographic patterns.
Data Scientists and Analysts: For building predictive models for gender classification, performing cluster analysis, and practising data preparation techniques.
Marketing Professionals: To understand name popularity for personalised marketing campaigns or product development.
Genealogists and Historians: To analyse historical naming trends within specific geographic regions.

Dataset Name Suggestions

Cross-Country Naming Trends and Gender Probability
Historical Baby Names: US, UK, Canada & Australia
Gendered Naming Patterns Dataset
First Name Gender Statistics (1880-2019)
International Baby Name Counts and Probabilities

Attributes

Original Data Source: Gendered Naming Patterns Dataset

Listing Stats

Listed: 17/09/2025
Region: Global
Quality: 5 / 5
Version: 1.0
