Opendatabay APP

Secondary School Student Achievement Dataset

Education & Learning Analytics

Tags and Keywords

Student

Education

Grades

Portugal

Performance

Free

About

This dataset describes student achievement in secondary education at two Portuguese schools, focusing on performance in the Portuguese language subject. Its attributes include student grades, demographic information, social factors, and school-related features, gathered from school reports and questionnaires. The dataset supports analysis of how these attributes correlate with student performance, particularly with the final year grade (G3), which is provided alongside the first and second period grades (G1 and G2). Because G1 and G2 are earlier grades in the same subject, they are strongly correlated with G3; predicting G3 without them is more difficult, but such a prediction is more useful.
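As a rough illustration of the correlation analysis mentioned above, the following sketch computes a Pearson correlation coefficient between two numeric columns using only the standard library. The function name `pearson` is a hypothetical helper, not part of the listing, and the choice of Pearson correlation is an assumption; the listing does not prescribe a method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Once the records are loaded, a call such as `pearson(g1_values, g3_values)` (with the two lists taken from the G1 and G3 columns) would give one number between -1 and 1 for that pair of columns.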

Columns

  • school: Student's school, binary: 'GP' (Gabriel Pereira) or 'MS' (Mousinho da Silveira).
  • sex: Student's sex, binary: 'F' (female) or 'M' (male).
  • age: Student's age, numeric: from 15 to 22.
  • address: Student's home address type, binary: 'U' (urban) or 'R' (rural).
  • famsize: Family size, binary: 'LE3' (less than or equal to 3 members) or 'GT3' (greater than 3 members).
  • Pstatus: Parents' cohabitation status, binary: 'T' (living together) or 'A' (apart).
  • Medu: Mother's education, numeric: 0 (none), 1 (primary education - 4th grade), 2 (5th to 9th grade), 3 (secondary education), or 4 (higher education).
  • Fedu: Father's education, numeric: 0 (none), 1 (primary education - 4th grade), 2 (5th to 9th grade), 3 (secondary education), or 4 (higher education).
  • Mjob: Mother's job, nominal: 'teacher', 'health' (health care related), 'services' (civil services, e.g., administrative or police), 'at_home', or 'other'.
  • Fjob: Father's job, nominal: 'teacher', 'health' (health care related), 'services' (civil services, e.g., administrative or police), 'at_home', or 'other'.
  • reason: Reason for choosing this school, nominal: 'home' (close to home), 'reputation' (school reputation), 'course' (course preference), or 'other'.
  • guardian: Student's guardian, nominal: 'mother', 'father', or 'other'.
  • traveltime: Home to school travel time, numeric: 1 (<15 min.), 2 (15 to 30 min.), 3 (30 min. to 1 hour), or 4 (>1 hour).
  • studytime: Weekly study time, numeric: 1 (<2 hours), 2 (2 to 5 hours), 3 (5 to 10 hours), or 4 (>10 hours).
  • failures: Number of past class failures, numeric: n if 1<=n<3, else 4.
  • schoolsup: Extra educational support, binary: yes or no.
  • famsup: Family educational support, binary: yes or no.
  • paid: Extra paid classes within the course subject (Math or Portuguese), binary: yes or no.
  • activities: Extra-curricular activities, binary: yes or no.
  • nursery: Attended nursery school, binary: yes or no.
  • higher: Wants to take higher education, binary: yes or no.
  • internet: Internet access at home, binary: yes or no.
  • romantic: In a romantic relationship, binary: yes or no.
  • famrel: Quality of family relationships, numeric: from 1 (very bad) to 5 (excellent).
  • freetime: Free time after school, numeric: from 1 (very low) to 5 (very high).
  • goout: Going out with friends, numeric: from 1 (very low) to 5 (very high).
  • Dalc: Workday alcohol consumption, numeric: from 1 (very low) to 5 (very high).
  • Walc: Weekend alcohol consumption, numeric: from 1 (very low) to 5 (very high).
  • health: Current health status, numeric: from 1 (very bad) to 5 (very good).
  • absences: Number of school absences, numeric: from 0 to 93.
  • G1: First period grade, numeric: from 0 to 20.
  • G2: Second period grade, numeric: from 0 to 20.
  • G3: Final grade, numeric: from 0 to 20 (output target).
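
Several of the columns above store small integer codes or yes/no flags rather than readable values. A minimal sketch of decoding them is shown here; the code tables are transcribed from the descriptions above, while the names `CODE_LABELS`, `decode`, and `yes_no_to_bool` are hypothetical helpers, and the assumption that the flags appear as the strings 'yes' and 'no' in the file is not confirmed by the listing.

```python
# Code tables transcribed from the column descriptions in this listing.
EDUCATION_LEVELS = {
    0: "none",
    1: "primary education (4th grade)",
    2: "5th to 9th grade",
    3: "secondary education",
    4: "higher education",
}

CODE_LABELS = {
    "Medu": EDUCATION_LEVELS,
    "Fedu": EDUCATION_LEVELS,
    "traveltime": {1: "<15 min.", 2: "15 to 30 min.", 3: "30 min. to 1 hour", 4: ">1 hour"},
    "studytime": {1: "<2 hours", 2: "2 to 5 hours", 3: "5 to 10 hours", 4: ">10 hours"},
}

YES_NO_COLUMNS = [
    "schoolsup", "famsup", "paid", "activities",
    "nursery", "higher", "internet", "romantic",
]


def decode(column, code):
    """Return the text label for a coded value, or the value unchanged if no table applies."""
    return CODE_LABELS.get(column, {}).get(code, code)


def yes_no_to_bool(value):
    """Map 'yes'/'no' to True/False (assumes the file stores these as strings)."""
    normalized = value.strip().lower()
    if normalized not in ("yes", "no"):
        raise ValueError(f"expected 'yes' or 'no', got {value!r}")
    return normalized == "yes"
```

Columns without a code table, such as age or absences, pass through `decode` unchanged, so it can be applied to every field of a record without special cases.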

Distribution

The dataset is provided as a CSV file named student-por.csv, with a size of 93.22 kB. It contains 33 columns and 649 records.
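
A minimal sketch of reading the file with Python's standard `csv` module follows. The listing does not state the field delimiter, so the sketch guesses between ',' and ';' from the header line; that heuristic, the helper names `detect_delimiter` and `load_students`, and the two-row sample (made-up values, not real records) are all assumptions for illustration.

```python
import csv
import io

# Numeric columns as named in the Columns section of this listing.
NUMERIC_COLUMNS = [
    "age", "Medu", "Fedu", "traveltime", "studytime", "failures",
    "famrel", "freetime", "goout", "Dalc", "Walc", "health",
    "absences", "G1", "G2", "G3",
]


def detect_delimiter(header_line):
    """Guess the delimiter by counting ';' and ',' in the header line."""
    return ";" if header_line.count(";") > header_line.count(",") else ","


def load_students(handle):
    """Read records from an open text file and cast the numeric columns to int."""
    delimiter = detect_delimiter(handle.readline())
    handle.seek(0)  # rewind so DictReader sees the header again
    rows = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        for name in NUMERIC_COLUMNS:
            if row.get(name) not in (None, ""):
                row[name] = int(row[name])
        rows.append(row)
    return rows


# Made-up two-row sample, only to show the call shape.
SAMPLE = "school;sex;age;G1;G2;G3\nGP;F;16;12;13;13\nMS;M;17;9;10;10\n"
rows = load_students(io.StringIO(SAMPLE))

# With the real file (name taken from this listing):
# with open("student-por.csv", newline="") as handle:
#     rows = load_students(handle)
```

After loading the real file, checking that `len(rows)` is 649 and that each record has 33 keys is a quick way to confirm the copy matches the figures stated above.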

Usage

This dataset is ideal for various analytical tasks, including:
  • Developing predictive models for student final grades.
  • Conducting binary or five-level classification tasks related to student success.
  • Exploring the impact of demographic, social, and school-related factors on academic performance.
  • Identifying key indicators that influence student achievement in secondary education.
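
The classification tasks listed above need a class label derived from G3, and the listing does not define the class boundaries. The sketch below assumes a pass mark of 10 out of 20 for the binary task and five illustrative bands (A to F) for the five-level task; both sets of thresholds, and the helper names, are assumptions rather than definitions from the dataset. It also shows how G1 and G2 might be left out of the feature set for the harder prediction setting.

```python
def binary_target(g3, pass_mark=10):
    """Label a final grade as 'pass' or 'fail' (pass mark of 10 is an assumption)."""
    return "pass" if g3 >= pass_mark else "fail"


# Lower bound of each band, highest first (illustrative thresholds, not from the listing).
FIVE_LEVEL_BOUNDS = [(16, "A"), (14, "B"), (12, "C"), (10, "D")]


def five_level_target(g3):
    """Map a final grade (0 to 20) to one of five bands, 'A' (best) to 'F'."""
    for lower_bound, label in FIVE_LEVEL_BOUNDS:
        if g3 >= lower_bound:
            return label
    return "F"


def split_features_target(record, use_prior_grades=False):
    """Separate one record (a dict) into input features and the G3 target.

    With use_prior_grades=False, G1 and G2 are dropped from the features.
    """
    excluded = {"G3"} if use_prior_grades else {"G1", "G2", "G3"}
    features = {key: value for key, value in record.items() if key not in excluded}
    return features, record["G3"]
```

Keeping `use_prior_grades` as an explicit switch makes it easy to compare the two settings on the same records: one model that sees G1 and G2, and one that relies only on the demographic, social, and school-related attributes.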

Coverage

The data originates from two secondary schools in Portugal. It covers students aged 15 to 22 and records a range of demographic and social attributes. The listing does not specify when the data was collected.

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:
  • Data Scientists and Machine Learning Engineers: To build and evaluate models for predicting student performance or classifying student profiles.
  • Educational Researchers: To analyse factors influencing academic success and develop insights into student achievement.
  • Sociologists: To study the correlation between social and family backgrounds and educational outcomes.
  • Policymakers and Educators: To inform decisions regarding educational support programmes and curriculum development.

Dataset Name Suggestions

  • Portuguese Student Performance in Language Arts
  • Secondary School Student Achievement Portugal (Portuguese)
  • Portuguese Student Grades & Social Factors
  • Student Performance Analytics Portugal
  • Academic Success Factors in Portuguese Schools

Attributes

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 08/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
