Opendatabay APP

Gender-Based Performance Mean Comparison Dataset

Education & Learning Analytics

Tags and Keywords

Students

Grades

Hypothesis

Gender

Performance

Free

About

Secondary school results can reveal which variables influence student success. This collection supports a comparison of the mean final grades (G3) of male and female students to identify potential gender disparities. Statistical hypothesis testing, specifically a two-sample t-test, can then show whether an observed difference in performance is statistically significant or plausibly due to chance. The records also include socio-economic factors, such as parental education and occupation, which give broader context for the academic outcomes. Particular attention is paid to the impact of data cleaning, such as excluding students who received a final grade of zero, on the resulting p-value and statistical conclusions.

Columns

  • school: The specific secondary school attended by the student, identified as either Gabriel Pereira (GP) or Mousinho da Silveira (MS).
  • sex: The gender of the student, recorded as female (F) or male (M).
  • age: The numerical age of the student, ranging from 15 to 22 years.
  • address: The category of the student's residential area, classified as Urban (U) or Rural (R).
  • famsize: The size of the student's family, divided into groups of greater than three (GT3) or less than or equal to three (LE3).
  • Pstatus: The cohabitation status of the parents, indicating whether they live together (T) or apart (A).
  • Medu: The mother's level of education, represented on a scale from 0 (none) to 4 (higher education).
  • Fedu: The father's level of education, represented on a scale from 0 (none) to 4 (higher education).
  • Mjob: The occupation of the student's mother, including categories such as services, health, or other.
  • Fjob: The occupation of the student's father, including categories such as services, health, or other.
  • G3: The final grade achieved by the student, which serves as the primary variable for performance comparison.
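As a minimal sketch of how the sex and G3 columns could be read and grouped for a comparison, the Python snippet below parses CSV text with the standard library. The sample rows are invented for illustration and are not records from student-mat.csv, the helper name grades_by_sex is hypothetical, and the comma delimiter is an assumption (some distributions of this file use semicolons).

```python
import csv
import io
from collections import defaultdict

# Invented sample rows for illustration only; NOT real records from student-mat.csv.
SAMPLE = """school,sex,age,G3
GP,F,18,6
GP,F,17,10
GP,M,16,15
MS,M,19,0
MS,F,18,11
"""

def grades_by_sex(handle, delimiter=","):
    """Return {sex: [G3, ...]} from a CSV file-like object (hypothetical helper)."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=delimiter):
        groups[row["sex"]].append(int(row["G3"]))
    return dict(groups)

groups = grades_by_sex(io.StringIO(SAMPLE))
```

For the real file, the same function could be given open("student-mat.csv", newline="") with the delimiter adjusted to match the download.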

Distribution

The data is delivered as a single CSV file, student-mat.csv, with a file size of 41.98 kB. It contains 395 valid records across 33 columns, of which the 11 described in the Columns section are detailed for demographic and school-related analysis. The resource has a usability score of 10.00 and 100% validity, with no mismatched or missing values in the primary fields. No future updates are planned.

Usage

This resource is suited to practising statistical inference and data visualisation with tools such as R and the ggplot2 library. Researchers can run t-tests to compare means between two groups, or conduct exploratory data analysis on the relationship between family background and academic achievement. It is also a useful case study in how data cleaning decisions, such as how to handle zero-value grades, can alter the result of a hypothesis test.
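The effect of dropping zero scores can be sketched with a Welch (unequal-variance) t statistic. The example below is written in Python using only the standard library rather than R, the grades are invented and not taken from the dataset, and welch_t is a hypothetical helper. It returns the t statistic and the Welch-Satterthwaite degrees of freedom; a p-value would additionally need the t distribution, for example via scipy.stats.ttest_ind with equal_var=False, or t.test in R.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    # Squared standard error of each group mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented final grades for illustration only; NOT taken from the dataset.
male = [0, 0, 10, 12, 14, 15, 16]
female = [8, 9, 10, 11, 12, 13]

# Comparison on all records, zeros included.
t_all, df_all = welch_t(male, female)

# Same comparison after excluding students with a final grade of zero.
male_nz = [g for g in male if g > 0]
female_nz = [g for g in female if g > 0]
t_nz, df_nz = welch_t(male_nz, female_nz)
```

In this made-up sample the male mean is lower than the female mean when zeros are kept and higher once they are removed, so the sign of the t statistic flips. Real data may behave less dramatically, but the cleaning choice should be reported alongside any test result.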

Coverage

The scope covers a population of 395 students aged between 15 and 22. The demographic profile is diverse, including students from both urban and rural environments and varying family structures. While the data represents a specific cohort of students, the variables included are standard for educational and sociological research. The coverage is static, representing the performance and demographic status at the time of the original study.

License

CC0: Public Domain

Who Can Use It

Data science students can use these records to refine their skills in hypothesis testing and data cleaning. Educational researchers may use the demographic fields to investigate the social determinants of academic success. Statisticians can also compute p-values and confidence intervals from the data to demonstrate the nuances of significance testing in real-world scenarios.

Dataset Name Suggestions

  • Student Academic Performance and Gender Disparity Analysis
  • Hypothesis Testing: Secondary School Grade Statistics
  • Socio-Economic Factors in Student Performance
  • Student-Mat: Academic Grades and Demographic Profiles
  • Gender-Based Performance Mean Comparison Dataset

Attributes

Listing Stats

  • Views: 4
  • Downloads: 2
  • Listed: 23/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
