Speed Dating Match Prediction Data

Data Science and Analytics

Tags and Keywords

Dating

Compatibility

Relationships

Match

Demographics

Free

About

This dataset captures insights from experimental speed dating events held between 2002 and 2004. Participants engaged in four-minute "first dates" and then indicated whether they would like to see their date again. The dataset's primary purpose is to enable predictions of compatibility and match outcomes between individuals. It includes participant ratings of their dates across six attributes: Attractiveness, Sincerity, Intelligence, Fun, Ambition, and Shared Interests. Additionally, it contains questionnaire data covering demographics, dating habits, self-perception on key attributes, beliefs about what others value in a mate, and lifestyle information. This rich collection of data allows for a deep exploration of factors influencing romantic compatibility and human interaction dynamics.

Columns

wave: Indicates the experimental speed dating wave.
gender: The gender of the participant (self), typically 'male' or 'female'.
age: The age of the participant (self). Valid ages range from 18 to 55.
age_o: The age of the partner. Valid ages range from 18 to 55.
d_age: The calculated difference in age between the participant and their partner. Values range from 0 to 37.
d_d_age: The binned difference in age. Common bins include '[1, 2]' and '[3-5]'.
race: The self-reported race of the participant. Common categories include 'European/Caucasian-American' and 'Asian/Pacific Islander/Asian-American'.
race_o: The race of the partner. Common categories include 'European/Caucasian-American' and 'Asian/Pacific Islander/Asian-American'.
samerace: A binary indicator (0 or 1) stating whether the two persons have the same race.
importance_same_race: A rating indicating how important it is to the participant that their partner is of the same race, on a scale of 0 to 10.
importance_same_religion: A rating indicating how important it is to the participant that their partner has the same religion, on a scale of 1 to 10.
d_importance_same_race: Binned importance rating for same race.
d_importance_same_religion: Binned importance rating for same religion.
field: The participant's field of study. 'Business' and 'MBA' are common examples, though many others exist.
pref_o_attractive: The importance the partner places on attractiveness. Values typically range from 0 to 100.
pref_o_sincere: The importance the partner places on sincerity. Values typically range from 0 to 60.
pref_o_intelligence: The importance the partner places on intelligence. Values typically range from 0 to 50.
pref_o_funny: The importance the partner places on being funny. Values typically range from 0 to 50.
pref_o_ambitious: The importance the partner places on ambition. Values typically range from 0 to 53.
pref_o_shared_interests: The importance the partner places on shared interests. Values typically range from 0 to 30.
d_pref_o_attractive: Binned preference rating for partner's attractiveness.
d_pref_o_sincere: Binned preference rating for partner's sincerity.
d_pref_o_intelligence: Binned preference rating for partner's intelligence.
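
The derived columns d_age and samerace can be recomputed from the raw columns as a sanity check. The following is a minimal sketch, assuming d_age is the absolute age difference (the listing only says values range from 0 to 37) and that samerace is 1 when race and race_o are identical strings. The helper name derive_pair_features and the example row are hypothetical and not taken from the dataset.

```python
def derive_pair_features(row):
    """Recompute d_age and samerace for one participant/partner row (a dict)."""
    # Assumption: d_age is the absolute difference between the two ages.
    d_age = abs(int(row["age"]) - int(row["age_o"]))
    # Assumption: samerace is 1 when both race labels match exactly.
    samerace = 1 if row["race"] == row["race_o"] else 0
    return {"d_age": d_age, "samerace": samerace}


# Made-up example row for illustration only.
example = {
    "age": "24",
    "age_o": "27",
    "race": "European/Caucasian-American",
    "race_o": "Asian/Pacific Islander/Asian-American",
}
features = derive_pair_features(example)
print(features)
```

Comparing these recomputed values with the stored d_age and samerace columns is one way to spot rows where the raw and derived fields disagree.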

Distribution

The dataset is provided in CSV format and is approximately 7.46 MB in size. It contains 24 of the 123 columns in the full dataset. Most columns have about 8,378 valid records, and some have a small share of missing values (around 1-2%). Summary statistics are available for numerical columns (mean, standard deviation, and quantiles) and for categorical columns (unique values and their frequencies).
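
The valid-record counts and missing-value shares described above can be checked with a short script. This is a sketch using only the Python standard library. The set of tokens treated as missing is an assumption, since the listing does not say how missing values are encoded, and the in-memory sample is made up so the example runs without the real file.

```python
import csv
import io

# Assumption: missing values appear as empty cells, "?" or "NA".
MISSING_TOKENS = {"", "?", "NA"}


def profile_columns(csv_file):
    """Return valid-record count and missing percentage for each column."""
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    valid = {name: 0 for name in fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in fieldnames:
            value = row[name]
            if value is not None and value.strip() not in MISSING_TOKENS:
                valid[name] += 1
    return {
        name: {
            "valid": count,
            "missing_pct": 100.0 * (total - count) / total if total else 0.0,
        }
        for name, count in valid.items()
    }


# Made-up sample rows; with the real download, pass an open file instead,
# e.g. open("speed_dating.csv", newline="") (the file name is hypothetical).
sample = io.StringIO(
    "wave,gender,age,age_o\n"
    "1,female,21,27\n"
    "1,male,,22\n"
    "2,female,25,?\n"
    "2,male,30,24\n"
)
profile = profile_columns(sample)
for name, stats in profile.items():
    print(name, stats)
```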

Usage

This dataset is ideal for:

Predictive Modelling: Building models to forecast whether two individuals will match in a speed dating scenario.
Behavioural Analysis: Investigating the significance of various attributes and preferences in dating outcomes.
Social Science Research: Studying human behaviour, attraction, and decision-making in a controlled social setting.
Educational Purposes: Serving as a practical example for students learning binary classification, data analysis, and feature engineering.
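
As an illustration of the binary classification use case, the sketch below fits a small logistic regression with plain gradient descent, using only the Python standard library. It rests on several assumptions: the target column is taken to be called match (it is not among the columns listed here), the two features are d_age scaled by 10 and samerace, and the training rows are made up for the example. It says nothing about which attributes actually predict a match in the real data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights and bias by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(weights, bias, x):
    """Predicted probability of a match for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up training rows: [d_age / 10, samerace]. Not real dataset records.
rows = [[0.0, 1], [0.1, 1], [0.2, 0], [0.3, 1],
        [0.8, 0], [1.0, 0], [1.2, 1], [1.5, 0]]
# Assumed target: 1 if both people wanted to meet again ("match"), else 0.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(rows, labels)
p_close = predict_proba(weights, bias, [0.0, 1])
p_far = predict_proba(weights, bias, [1.5, 0])
print(p_close, p_far)
```

For real work, a library implementation with a held-out test split would be a more reliable choice than this hand-rolled loop, which is kept short to show the mechanics.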

Coverage

The data was collected at experimental speed dating events held between 2002 and 2004. The geographic scope is not stated. Participants span a range of ages, genders, and racial backgrounds, and the data records their stated preferences and self-perceptions.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

This dataset is suitable for:

Data Scientists and Machine Learning Engineers: For developing and testing classification models.
Academics and Researchers: In fields such as psychology, sociology, and economics, interested in human interaction and relationship dynamics.
Students: As a valuable resource for projects and coursework in data science, statistics, and social sciences, particularly those focusing on classification tasks.
Anyone interested in human relationships: Individuals curious about the underlying factors that contribute to romantic compatibility and attraction.

Dataset Name Suggestions

Speed Dating Match Prediction Data
Compatibility Analysis Dataset
Dating Preferences & Outcomes
Human Attraction Factors Data
Experimental Dating Event Records

Attributes

Original Data Source: Speed Dating Match Prediction Data

Listing Stats

Listed: 07/08/2025
Region: Global
Quality: 5 / 5
Version: 1.0
