Secondary School Student Achievement Dataset
Education & Learning Analytics
Free
About
This dataset describes student achievement in secondary education at two Portuguese schools, focusing on performance in the Portuguese language subject. Attributes include student grades along with demographic, social, and school-related features, gathered from school reports and questionnaires. The data supports analysis of how these attributes correlate with student performance, particularly the final grade (G3), which is provided alongside the first- and second-period grades (G1 and G2). Predicting G3 without G1 and G2 is harder, but such a prediction is more useful because it does not depend on grades issued earlier in the same school year.
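The two prediction settings described above differ only in which columns are allowed as predictors. The following is a minimal sketch of that choice; the helper name `feature_columns` is hypothetical, and the column list is shortened for illustration rather than being the full 33-column header.

```python
# Hypothetical helper: choose predictor columns for a model of G3.
PRIOR_GRADES = ("G1", "G2")
TARGET = "G3"


def feature_columns(all_columns, use_prior_grades):
    """Return predictor column names, never including the target G3."""
    excluded = {TARGET}
    if not use_prior_grades:
        # Harder setting: the model may not see the period grades.
        excluded.update(PRIOR_GRADES)
    return [name for name in all_columns if name not in excluded]


# Shortened column list for illustration; the real file has 33 columns.
columns = ["school", "sex", "age", "studytime", "failures",
           "absences", "G1", "G2", "G3"]

with_grades = feature_columns(columns, use_prior_grades=True)
without_grades = feature_columns(columns, use_prior_grades=False)
print(with_grades)
print(without_grades)
```

Keeping the exclusion in one place makes it harder to leak G1 or G2 into the "without prior grades" experiment by accident.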
Columns
- school: Student's school, binary: 'GP' (Gabriel Pereira) or 'MS' (Mousinho da Silveira).
- sex: Student's sex, binary: 'F' (female) or 'M' (male).
- age: Student's age, numeric: from 15 to 22.
- address: Student's home address type, binary: 'U' (urban) or 'R' (rural).
- famsize: Family size, binary: 'LE3' (less than or equal to 3 members) or 'GT3' (greater than 3 members).
- Pstatus: Parents' cohabitation status, binary: 'T' (living together) or 'A' (apart).
- Medu: Mother's education, numeric: 0 (none), 1 (primary education - 4th grade), 2 (5th to 9th grade), 3 (secondary education), or 4 (higher education).
- Fedu: Father's education, numeric: 0 (none), 1 (primary education - 4th grade), 2 (5th to 9th grade), 3 (secondary education), or 4 (higher education).
- Mjob: Mother's job, nominal: 'teacher', 'health' care related, civil 'services' (e.g., administrative or police), 'at_home', or 'other'.
- Fjob: Father's job, nominal: 'teacher', 'health' care related, civil 'services' (e.g., administrative or police), 'at_home', or 'other'.
- reason: Reason to choose this school, nominal: close to 'home', school 'reputation', 'course' preference, or 'other'.
- guardian: Student's guardian, nominal: 'mother', 'father', or 'other'.
- traveltime: Home to school travel time, numeric: 1 (<15 min.), 2 (15 to 30 min.), 3 (30 min. to 1 hour), or 4 (>1 hour).
- studytime: Weekly study time, numeric: 1 (<2 hours), 2 (2 to 5 hours), 3 (5 to 10 hours), or 4 (>10 hours).
- failures: Number of past class failures, numeric: n if 1 <= n < 3, else 4 (as given in the source documentation; the values stored in the file run from 0 to 3).
- schoolsup: Extra educational support, binary: yes or no.
- famsup: Family educational support, binary: yes or no.
- paid: Extra paid classes within the course subject (Math or Portuguese), binary: yes or no.
- activities: Extra-curricular activities, binary: yes or no.
- nursery: Attended nursery school, binary: yes or no.
- higher: Wants to pursue higher education, binary: yes or no.
- internet: Internet access at home, binary: yes or no.
- romantic: In a romantic relationship, binary: yes or no.
- famrel: Quality of family relationships, numeric: from 1 (very bad) to 5 (excellent).
- freetime: Free time after school, numeric: from 1 (very low) to 5 (very high).
- goout: Going out with friends, numeric: from 1 (very low) to 5 (very high).
- Dalc: Workday alcohol consumption, numeric: from 1 (very low) to 5 (very high).
- Walc: Weekend alcohol consumption, numeric: from 1 (very low) to 5 (very high).
- health: Current health status, numeric: from 1 (very bad) to 5 (very good).
- absences: Number of school absences, numeric: from 0 to 93.
- G1: First period grade, numeric: from 0 to 20.
- G2: Second period grade, numeric: from 0 to 20.
- G3: Final grade, numeric: from 0 to 20 (output target).
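The ranges and category codes listed above can double as a light sanity check when reading the file. The sketch below covers only a subset of the columns and uses hypothetical helper names (`validate_row`, `validate_csv`). The delimiter is an assumption: some distributions of this file separate fields with semicolons, in which case `delimiter=";"` would be needed.

```python
import csv
import io

# Inclusive ranges copied from the column descriptions (subset only).
NUMERIC_RANGES = {
    "age": (15, 22),
    "Medu": (0, 4),
    "Fedu": (0, 4),
    "traveltime": (1, 4),
    "studytime": (1, 4),
    "famrel": (1, 5),
    "health": (1, 5),
    "absences": (0, 93),
    "G1": (0, 20),
    "G2": (0, 20),
    "G3": (0, 20),
}
# Allowed codes for some of the binary columns.
BINARY_VALUES = {
    "school": {"GP", "MS"},
    "sex": {"F", "M"},
    "address": {"U", "R"},
    "famsize": {"LE3", "GT3"},
    "Pstatus": {"T", "A"},
}


def validate_row(row):
    """Return a list of human-readable problems found in one CSV row."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in row:
            continue  # column absent from this file; nothing to check
        try:
            value = int(row[name])
        except (TypeError, ValueError):
            problems.append(f"{name}: not an integer ({row[name]!r})")
            continue
        if not low <= value <= high:
            problems.append(f"{name}: {value} outside {low}..{high}")
    for name, allowed in BINARY_VALUES.items():
        if name in row and row[name] not in allowed:
            problems.append(f"{name}: unexpected value {row[name]!r}")
    return problems


def validate_csv(text, delimiter=","):
    """Yield (line_number, problems) for every row that fails a check."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        problems = validate_row(row)
        if problems:
            yield line_number, problems


# Made-up rows for demonstration only; they are not taken from the dataset.
sample = "school,sex,age,studytime,G3\nGP,F,17,2,12\nXX,M,30,2,25\n"
for line_number, problems in validate_csv(sample):
    print(line_number, problems)
```

For the real file, the text could be obtained with `open("student-por.csv", encoding="utf-8").read()` and passed to `validate_csv` in the same way.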
Distribution
The dataset is provided as a CSV file named student-por.csv, with a size of 93.22 kB. It contains 33 columns and 649 records.
Usage
This dataset is ideal for various analytical tasks, including:
- Developing predictive models for student final grades.
- Conducting binary or five-level classification tasks related to student success.
- Exploring the impact of demographic, social, and school-related factors on academic performance.
- Identifying key indicators that influence student achievement in secondary education.
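For the binary and five-level classification tasks mentioned in the list above, G3 has to be converted into class labels first. This listing does not define the cut-offs, so the pass mark of 10 and the five bands below are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Assumed cut-offs; the listing itself does not specify them.
PASS_MARK = 10

# (lower bound, label) pairs, checked from the highest band down.
FIVE_LEVEL_BANDS = [(16, "A"), (14, "B"), (12, "C"), (10, "D"), (0, "F")]


def check_grade(g3):
    """Reject values outside the documented 0..20 range for G3."""
    if not 0 <= g3 <= 20:
        raise ValueError(f"G3 must be between 0 and 20, got {g3}")


def binary_label(g3):
    """Map a final grade to 'pass' or 'fail' using the assumed pass mark."""
    check_grade(g3)
    return "pass" if g3 >= PASS_MARK else "fail"


def five_level_label(g3):
    """Map a final grade to one of five assumed bands, 'A' (best) to 'F'."""
    check_grade(g3)
    for lower_bound, label in FIVE_LEVEL_BANDS:
        if g3 >= lower_bound:
            return label


for grade in (8, 10, 13, 15, 18):
    print(grade, binary_label(grade), five_level_label(grade))
```

With these assumed bands, every passing grade falls in A to D and every failing grade in F, so the two labelings stay consistent with each other.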
Coverage
The data originates from two secondary schools in Portugal. It covers students aged 15 to 22, including various demographic and social attributes. Specific time ranges for data collection are not detailed.
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Scientists and Machine Learning Engineers: To build and evaluate models for predicting student performance or classifying student profiles.
- Educational Researchers: To analyse factors influencing academic success and develop insights into student achievement.
- Sociologists: To study the correlation between social and family backgrounds and educational outcomes.
- Policymakers and Educators: To inform decisions regarding educational support programmes and curriculum development.
Dataset Name Suggestions
- Portuguese Student Performance in Language Arts
- Secondary School Student Achievement Portugal (Portuguese)
- Portuguese Student Grades & Social Factors
- Student Performance Analytics Portugal
- Academic Success Factors in Portuguese Schools
Attributes
Original Data Source: Secondary School Student Achievement Dataset