Opendatabay APP

University Student Retention Dataset

Education & Learning Analytics

Tags and Keywords

Education, Students, Dropout, Success, Academic

Free

About

This dataset describes the academic paths of students at a Portuguese higher education institution. It was developed as part of a national effort to reduce student dropout and academic failure in universities. It covers 4,424 undergraduate students enrolled in eight degree programmes: Agronomy, Design, Education, Nursing, Journalism, Management, Social Service, and Technologies. Its primary purpose is to support early intervention by predicting a student's academic outcome: dropout, continued enrolment, or graduation. This is a three-class classification problem with imbalanced classes, which makes it useful for predictive modelling and education analytics.

Columns

  • Marital Status: Student's marital status at enrolment.
  • Application mode: Method or channel of student’s application.
  • Application order: Priority order in application submission.
  • Course: Enrolled undergraduate programme.
  • Daytime/evening attendance: Study schedule (day or evening).
  • Previous qualification: Academic qualification before enrolment.
  • Previous qualification (grade): Grade or score from previous education.
  • Nacionality: Country of citizenship.
  • Mother's qualification: Educational level of student’s mother.
  • Father's qualification: Educational level of student’s father.
  • Mother's occupation: Job type or category of the mother.
  • Father's occupation: Job type or category of the father.
  • Admission grade: Entry grade into the university.
  • Displaced: Indicates whether the student is displaced, i.e., living away from their usual home area in order to study.
  • Educational special needs: Flags students requiring special education support.
  • Debtor: Indicates if the student owes fees.
  • Tuition fees up to date: Whether tuition fees are paid.
  • Gender: Student’s gender.
  • Scholarship holder: Whether the student received a scholarship.
  • Age at enrolment: Age (in years) at the time of enrolment.
  • International: Whether the student is international.
  • Curricular units 1st sem (credited): Courses credited in semester 1.
  • Curricular units 1st sem (enrolled): Courses enrolled in semester 1.
  • Curricular units 1st sem (evaluations): Exams or assessments taken in semester 1.
  • Curricular units 1st sem (approved): Courses passed in semester 1.
  • Curricular units 1st sem (grade): Average grade for semester 1.
  • Curricular units 1st sem (without evaluations): Courses without assessments in semester 1.
  • Curricular units 2nd sem (credited): Courses credited in semester 2.
  • Curricular units 2nd sem (enrolled): Courses enrolled in semester 2.
  • Curricular units 2nd sem (evaluations): Exams or assessments taken in semester 2.
  • Curricular units 2nd sem (approved): Courses passed in semester 2.
  • Curricular units 2nd sem (grade): Average grade for semester 2.
  • Curricular units 2nd sem (without evaluations): Courses without assessments in semester 2.
  • Unemployment rate: National unemployment rate at enrolment.
  • Inflation rate: Consumer inflation at the time of enrolment.
  • GDP: National economic indicator (e.g., GDP per capita).
  • target: Final academic status: Dropout, Enrolled, or Graduate.
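
The per-semester counts above lend themselves to simple derived features. As a minimal sketch, the share of enrolled curricular units that a student passed can be computed per semester. It assumes the CSV headers match the column names listed above exactly, and the output field names (`approval_rate_1st_sem`, `approval_rate_2nd_sem`) are hypothetical, chosen only for illustration.

```python
def approval_rate(approved: float, enrolled: float) -> float:
    """Share of enrolled curricular units that were passed; 0.0 when none were enrolled."""
    if enrolled <= 0:
        return 0.0
    return approved / enrolled


def add_approval_rates(row: dict) -> dict:
    """Return a copy of a row with one hypothetical approval-rate field per semester."""
    out = dict(row)
    for sem in ("1st", "2nd"):
        # Assumption: headers are spelled exactly as in the column list above.
        approved = float(row[f"Curricular units {sem} sem (approved)"])
        enrolled = float(row[f"Curricular units {sem} sem (enrolled)"])
        out[f"approval_rate_{sem}_sem"] = approval_rate(approved, enrolled)
    return out


# Toy row with made-up values, not taken from the dataset.
example = {
    "Curricular units 1st sem (enrolled)": "6",
    "Curricular units 1st sem (approved)": "3",
    "Curricular units 2nd sem (enrolled)": "0",
    "Curricular units 2nd sem (approved)": "0",
}
enriched = add_approval_rates(example)
```

Returning 0.0 for a semester with no enrolled units is one possible convention; treating such rows as undefined would be an equally reasonable choice.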

Distribution

The dataset is provided in CSV format. It comprises 4,424 instances (rows) and 37 columns: 36 features plus the target. The columns include a mix of integer, categorical, and real-valued data types. The dataset has been cleaned to handle outliers, inconsistent entries, anomalies, and missing values, so the final dataset contains no missing values.
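
The row count, column count, and absence of missing values stated above can be checked after download. The following is a minimal sketch using only the Python standard library. The comma delimiter is an assumption (some distributions of a CSV may use a different one), and the sample text is made up for illustration.

```python
import csv
import io


def summarize_csv(text: str, delimiter: str = ",") -> dict:
    """Count data rows, columns, and empty cells in CSV text with a header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    n_rows = 0
    n_missing = 0
    for record in reader:
        if not record:  # skip blank lines
            continue
        n_rows += 1
        # A record shorter than the header counts its absent cells as missing.
        n_missing += max(len(header) - len(record), 0)
        n_missing += sum(1 for cell in record if cell.strip() == "")
    return {"rows": n_rows, "columns": len(header), "missing": n_missing}


# Toy CSV with made-up values. According to the description above, the real
# file should yield rows=4424, columns=37, missing=0.
sample = "Course,Admission grade,target\nNursing,127.3,Graduate\nDesign,,Dropout\n"
summary = summarize_csv(sample)
```

For a downloaded file, the same function can be applied to the file contents, for example `summarize_csv(open(path, encoding="utf-8").read())`, where `path` is whatever location the CSV was saved to.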

Usage

This dataset is ideal for:
  • Educational Data Mining: Discovering patterns and insights from educational data.
  • Early Warning Systems for Student Dropout: Developing systems to identify students at risk of dropping out.
  • Classification Benchmarking: Evaluating the performance of various machine learning classification models.
  • Feature Importance & Interpretability Studies: Understanding which factors most influence student outcomes and interpreting model predictions.
  • Policy-making Simulations for Academic Retention: Informing and simulating the impact of policies aimed at improving student retention.
  • Predictive Modelling: Building models to predict student academic outcomes (dropout, enrolled, or graduate).
  • Education Analytics: Performing in-depth analysis of student performance and trends in higher education.
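
For classification benchmarking on imbalanced classes, accuracy alone can be misleading, so a per-class metric is a common companion. As a minimal sketch (not a method prescribed by the dataset authors), the code below scores a majority-class baseline with both accuracy and macro-averaged F1. The labels are toy values made up for illustration.

```python
from collections import Counter


def macro_f1(y_true: list, y_pred: list) -> float:
    """Unweighted mean of per-class F1 scores over the classes present in y_true."""
    labels = sorted(set(y_true))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_baseline(y_train: list, n: int) -> list:
    """Predict the most frequent training label for every one of n samples."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


# Toy labels, made up for illustration only.
y_train = ["Graduate", "Graduate", "Graduate", "Dropout", "Enrolled"]
y_test = ["Graduate", "Graduate", "Dropout", "Enrolled"]
y_pred = majority_baseline(y_train, len(y_test))
accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
score = macro_f1(y_test, y_pred)
```

On this toy split the baseline reaches 50% accuracy while its macro F1 is much lower, because two of the three classes are never predicted. Any real model trained on the dataset would need to beat both numbers computed on its own held-out split.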

Coverage

The dataset covers undergraduate students from a Portuguese higher education institution. It includes students across eight degree programmes: Agronomy, Design, Education, Nursing, Journalism, Management, Social Service, and Technologies. The data combines demographic and academic information with external economic factors at the time of enrolment, such as unemployment rate, inflation rate, and GDP. The target variable, 'target', classifies each student as Dropout, Enrolled, or Graduate. The classes are imbalanced, as is typical of real-world retention data.
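
The class imbalance mentioned above can be quantified, and one common way to compensate for it is inverse-frequency class weighting. The sketch below computes class shares and weights of the form n_samples / (n_classes * class_count), the same heuristic scikit-learn applies for `class_weight="balanced"`. The label counts are toy values and do not reflect the actual distribution in the dataset.

```python
from collections import Counter


def class_shares(labels: list) -> dict:
    """Fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def balanced_class_weights(labels: list) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels, made up for illustration; not the real class counts.
labels = ["Graduate"] * 6 + ["Dropout"] * 3 + ["Enrolled"] * 1
shares = class_shares(labels)
weights = balanced_class_weights(labels)
```

Rarer classes receive larger weights, so a weighted loss penalises mistakes on them more heavily. Whether weighting, resampling, or neither works best is an empirical question for each model.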

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Machine Learning Practitioners: For building and testing classification models focused on student outcomes.
  • Data Scientists: To conduct in-depth analyses and extract patterns from student data.
  • Educational Researchers: To study factors influencing student success, retention, and academic failure.
  • University Administrators & Policymakers: To develop and implement targeted interventions and academic retention strategies.
  • Students and Educators: For learning and applying data science techniques to real-world educational problems.

Dataset Name Suggestions

  • Student Academic Outcome Prediction
  • Higher Education Dropout & Success Data
  • University Student Retention Dataset
  • Academic Performance Predictive Analytics
  • Student Dropout Risk Dataset

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 19/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
