Synthetic Student Profiles with Academic Outcomes Dataset
Education & Learning Analytics
£19.99
About
The Synthetic Student Profiles with Academic Outcomes Dataset supports research, analytics, and educational projects that examine how academic history, family background, and behavioral factors relate to student performance. The records are synthetic but follow the structure of real-world educational data, and the features listed below are varied enough to explore patterns in student success.
Dataset Features
- student_id: Unique identifier for each student.
- school: Attended school (e.g., GP or MS).
- sex: Gender of the student (F/M).
- age: Student's age in years.
- address_type: Urban or Rural home location.
- family_size: Family size (Less than or equal to 3 / Greater than 3).
- parent_status: Parental cohabitation status (Living together / Apart).
- mother_education / father_education: Highest education level completed (e.g., Primary, Secondary, Higher).
- mother_job / father_job: Occupation of the student's parents.
- school_choice_reason: Reason for choosing the school (e.g., Reputation, Proximity).
- guardian: Primary caregiver (e.g., Mother, Father, Other).
- travel_time: Daily travel time to school.
- study_time: Weekly study time outside school.
- class_failures: Number of past class failures.
- school_support / family_support: Extra academic support received at school and from family (Yes/No).
- extra_paid_classes: Attending paid private tutoring (Yes/No).
- activities: Participation in extracurricular activities (Yes/No).
- nursery_school: Attended preschool (Yes/No).
- higher_ed: Desire to pursue higher education (Yes/No).
- internet_access: Access to the internet at home (Yes/No).
- romantic_relationship: Currently in a romantic relationship (Yes/No).
- family_relationship: Quality of family relationships (numeric scale).
- free_time: Amount of free time after school (numeric scale).
- social: Frequency of social activities with peers (numeric scale).
- weekday_alcohol / weekend_alcohol: Alcohol consumption levels on weekdays and weekends.
- health: Current health status (1–5 scale).
- absences: Number of school absences.
- grade_1 / grade_2 / final_grade: First-period grade, second-period grade, and final grade.
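The feature list above can double as a schema check before any analysis. The following is a minimal sketch, assuming the data ships as a CSV file whose header uses exactly the names listed, with combined entries such as "mother_education / father_education" stored as two separate columns. The listing does not state the file format, so both points are assumptions; `check_columns` and the sample header are hypothetical and only illustrate the idea.

```python
import csv
import io

# Column names copied from the feature list; splitting the combined entries
# (e.g. "mother_job / father_job") into separate columns is an assumption.
EXPECTED_COLUMNS = [
    "student_id", "school", "sex", "age", "address_type", "family_size",
    "parent_status", "mother_education", "father_education", "mother_job",
    "father_job", "school_choice_reason", "guardian", "travel_time",
    "study_time", "class_failures", "school_support", "family_support",
    "extra_paid_classes", "activities", "nursery_school", "higher_ed",
    "internet_access", "romantic_relationship", "family_relationship",
    "free_time", "social", "weekday_alcohol", "weekend_alcohol", "health",
    "absences", "grade_1", "grade_2", "final_grade",
]


def check_columns(header):
    """Return (missing, unexpected) column names relative to EXPECTED_COLUMNS."""
    expected = set(EXPECTED_COLUMNS)
    present = set(header)
    missing = sorted(expected - present)
    unexpected = sorted(present - expected)
    return missing, unexpected


# Hypothetical, deliberately incomplete header used only to exercise the check.
sample_csv = io.StringIO("student_id,school,sex,age,notes\n")
header = next(csv.reader(sample_csv))
missing, unexpected = check_columns(header)
print(f"{len(missing)} missing, unexpected: {unexpected}")
```

With the real file, the header would come from `open(path, newline="")` instead of the in-memory string; any names reported as missing or unexpected indicate that the actual column naming differs from the assumption made here.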
Usage
This dataset is ideal for:
- Academic Performance Prediction: Predict final grades based on behavioral and background features.
- Feature Importance Analysis: Identify key influences on student success.
- Sociological Insights: Understand the impact of family, relationship, and lifestyle factors on education.
- Model Training: Suitable for classification, regression, and clustering tasks in educational data mining.
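As an illustration of the grade prediction use case above, the sketch below fits a one-feature least-squares line that predicts final_grade from grade_2 and reports the mean absolute error. It is a baseline under stated assumptions, not a recommended model: the rows are invented placeholder values rather than records from the dataset, the grade scale is not specified in the listing, and `fit_line` and `mean_absolute_error` are hypothetical helper names.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Invented placeholder rows (NOT taken from the dataset), keyed by the
# column names from the feature list.
rows = [
    {"grade_2": 8, "final_grade": 9},
    {"grade_2": 10, "final_grade": 10},
    {"grade_2": 12, "final_grade": 13},
    {"grade_2": 14, "final_grade": 14},
    {"grade_2": 16, "final_grade": 17},
]

xs = [row["grade_2"] for row in rows]
ys = [row["final_grade"] for row in rows]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"MAE={mean_absolute_error(ys, predictions):.3f}")
```

Because grade_1 and grade_2 are earlier measurements of the same outcome, a fit like this mostly shows how far the earlier grades alone go. Predicting final_grade from behavioral and background features only, with the interim grades left out, is the harder and usually more informative task, and it calls for a held-out test split rather than the in-sample error printed here.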
Coverage
The dataset covers a broad range of student life, including family background, academic history, health, and lifestyle. It supports multidisciplinary research across education, sociology, and data science.
License
CC0 (Public Domain)
Who Can Use It
- Educational Researchers: For testing interventions and identifying risk factors.
- Data Scientists and ML Practitioners: For building predictive models in education.
- Instructors and Students: For coursework in data analysis, machine learning, and statistics.