Gender-Based Performance Mean Comparison Dataset
Education & Learning Analytics
About
Secondary school academic results can show which variables are associated with student success. This collection compares the mean final grades of male and female students to identify potential gender disparities. Statistical hypothesis testing, specifically t-tests, is used to assess whether observed differences in performance are statistically significant or could plausibly have arisen by chance. The records also include socio-economic factors, such as parental education and occupation, which provide broader context for the academic outcomes. Particular attention is paid to how data cleaning, such as excluding students who received a zero score, affects the final p-value and the statistical conclusions.
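The mean comparison described above can be sketched in Python. This is a minimal sketch, assuming Welch's unequal-variance form of the t-test (the description only says T-tests) and a normal approximation for the p-value, which is only reasonable for fairly large groups. The function name welch_t_test and the sample grades are hypothetical and not taken from the dataset.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Return (t, df, approx_p) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    # Assumption: two-sided p-value from a normal approximation.
    # An exact p-value would use the t distribution with df degrees of freedom.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p


# Made-up grades for illustration only; not taken from student-mat.csv.
male = [12, 14, 9, 15, 11, 13, 10, 16]
female = [10, 11, 8, 13, 9, 12, 10, 11]

t, df, p = welch_t_test(male, female)
print(f"t = {t:.3f}, df = {df:.1f}, approx p = {p:.3f}")
```

With groups this small the normal approximation is rough; it is used here only to keep the sketch free of third-party dependencies.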
Columns
- school: The specific secondary school attended by the student, identified as either Gabriel Pereira (GP) or Mousinho da Silveira (MS).
- sex: The gender of the student, recorded as female (F) or male (M).
- age: The numerical age of the student, ranging from 15 to 22 years.
- address: The category of the student's residential area, classified as Urban (U) or Rural (R).
- famsize: The size of the student's family, divided into groups of greater than three (GT3) or less than or equal to three (LE3).
- Pstatus: The cohabitation status of the parents, indicating whether they live together (T) or apart (A).
- Medu: The mother's level of education, represented on a scale from 0 (none) to 4 (higher education).
- Fedu: The father's level of education, represented on a scale from 0 (none) to 4 (higher education).
- Mjob: The occupation of the student's mother, including categories such as services, health, or other.
- Fjob: The occupation of the student's father, including categories such as services, health, or other.
- G3: The final grade achieved by the student, which serves as the primary variable for performance comparison.
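The coded values in the column descriptions above lend themselves to a simple validity check. The following is a minimal sketch, assuming a comma-delimited file with a header row (the actual delimiter of student-mat.csv is not stated here); the function name find_invalid_rows and the two sample rows are hypothetical.

```python
import csv
import io

# Allowed codes, taken from the column descriptions above.
ALLOWED = {
    "school": {"GP", "MS"},
    "sex": {"F", "M"},
    "address": {"U", "R"},
    "famsize": {"GT3", "LE3"},
    "Pstatus": {"T", "A"},
    "Medu": {"0", "1", "2", "3", "4"},
    "Fedu": {"0", "1", "2", "3", "4"},
}


def find_invalid_rows(rows):
    """Return (row_number, column, value) for each value outside its documented range."""
    problems = []
    for i, row in enumerate(rows, start=1):
        for col, allowed in ALLOWED.items():
            if row.get(col) not in allowed:
                problems.append((i, col, row.get(col)))
        # Age is documented as ranging from 15 to 22.
        age = row.get("age") or ""
        if not (age.isdigit() and 15 <= int(age) <= 22):
            problems.append((i, "age", age))
    return problems


# Hypothetical sample rows; the second row deliberately breaks two rules.
sample = io.StringIO(
    "school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,G3\n"
    "GP,F,18,U,GT3,A,4,4,health,services,6\n"
    "MS,X,23,R,LE3,T,1,1,other,other,10\n"
)
problems = find_invalid_rows(csv.DictReader(sample))
print(problems)
```

To run the check on the real file, the io.StringIO sample would be replaced by an opened file handle, with the delimiter adjusted if the file turns out not to be comma-separated.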
Distribution
The information is delivered in a CSV file titled student-mat.csv with a file size of 41.98 kB. It contains 395 valid records across 33 columns, of which the 11 listed above are detailed for demographic and school-related analysis. The resource has a usability score of 10.00 and 100% validity, with no mismatched or missing values in the primary fields. No future updates are planned.
Usage
This resource is ideal for practicing statistical inference and data visualisation using tools like R and the ggplot2 library. Researchers can use the data to perform T-tests to compare means between two groups or to conduct exploratory data analysis on the relationship between family background and academic achievement. It is also a valuable case study for understanding how data cleaning decisions, such as handling zero-value outliers, can alter the results of a hypothesis test.
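The effect of a cleaning decision on group summaries can be illustrated with a short Python sketch. It assumes the cleaning rule is simply "drop every final grade equal to 0"; the helper names summarise and drop_zeros and the grade values are hypothetical, chosen so that zeros are unevenly spread between the groups, and they say nothing about the real data.

```python
from statistics import mean, stdev


def summarise(grades):
    """Return (n, mean, sample standard deviation) for a list of grades."""
    return len(grades), mean(grades), stdev(grades)


def drop_zeros(grades):
    """Cleaning rule under study: exclude students whose final grade is 0."""
    return [g for g in grades if g != 0]


# Made-up G3 values for illustration only; not taken from student-mat.csv.
groups = {
    "M": [0, 12, 14, 9, 15, 11, 13],
    "F": [0, 0, 0, 11, 10, 13, 12],
}

# Compare the group summaries with and without the zero scores.
for label, cleaner in [("all records", list), ("zeros excluded", drop_zeros)]:
    stats = {sex: summarise(cleaner(g)) for sex, g in groups.items()}
    gap = stats["M"][1] - stats["F"][1]
    print(f"{label}: mean gap (M - F) = {gap:.2f}")
    for sex, (n, m, s) in stats.items():
        print(f"  {sex}: n={n}, mean={m:.2f}, sd={s:.2f}")
```

Because a t statistic depends on the group means, standard deviations, and sizes, a shift in any of them after cleaning changes the resulting p-value.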
Coverage
The scope covers a population of 395 students aged between 15 and 22. The demographic profile is diverse, including students from both urban and rural environments and varying family structures. While the data represents a specific cohort of students, the variables included are standard for educational and sociological research. The coverage is static, representing the performance and demographic status at the time of the original study.
License
CC0: Public Domain
Who Can Use It
Data science students can leverage these records to refine their skills in hypothesis testing and data cleaning. Educational researchers may utilise the demographics to investigate the social determinants of academic success. Additionally, statisticians can use the p-values and confidence intervals to demonstrate the nuances of significance testing in real-world scenarios.
Dataset Name Suggestions
- Student Academic Performance and Gender Disparity Analysis
- Hypothesis Testing: Secondary School Grade Statistics
- Socio-Economic Factors in Student Performance
- Student-Mat: Academic Grades and Demographic Profiles
- Gender-Based Performance Mean Comparison Dataset
Attributes
Original Data Source: Gender-Based Performance Mean Comparison Dataset
