College Bound Student Data
Education & Learning Analytics
Free
About
This dataset contains synthetic high school student data, generated for a college project, for predicting whether a student will proceed to higher education. It is intended to support explainable machine learning, so that school counsellors can identify the factors that influence the outcome and give targeted assistance to students who may not go to college.
Columns
- type_school: Specifies the type of school the student attends, categorised as either Academic or Vocational.
- school_accreditation: Indicates the accreditation grade of the school, either 'A' or 'B', where 'A' signifies higher quality.
- gender: Represents the gender of the student, either Male or Female.
- interest: Describes the student's level of interest in attending college, including categories such as 'Very Interested' or 'Uncertain'.
- residence: Details the student's living environment, categorised as Urban or Rural.
- parent_age: Provides the age of the parent, ranging from 40 to 65 years, with a mean of 52.2 years.
- parent_salary: Represents the parent's monthly salary in Indonesian Rupiah (IDR), ranging from 1,000,000 to 10,000,000, with an average of 5.38 million.
- house_area: Denotes the parent's house area in square metres, ranging from 20 to 120 square metres, with an average of 74.5 square metres.
- average_grades: Displays the student's average grades on a scale of 0-100, with values between 75 and 98 and a mean of 86.1.
- parent_was_in_college: A boolean field indicating whether a parent ever attended college (True or False).
- will_go_to_college: The target variable to be predicted, indicating whether the student will attend college (True or False).
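The column descriptions above can be turned into a simple row check. The following is a minimal sketch using only the Python standard library. It assumes that the CSV header uses exactly the column names listed above, that the stated minimum and maximum values are inclusive, and that booleans are stored as the text True or False. The helper name parse_row and the two sample rows are hypothetical and are not taken from data.csv.

```python
import csv
import io

# Column names as listed above; numeric ranges come from the column descriptions.
EXPECTED_COLUMNS = [
    "type_school", "school_accreditation", "gender", "interest", "residence",
    "parent_age", "parent_salary", "house_area", "average_grades",
    "parent_was_in_college", "will_go_to_college",
]
NUMERIC_RANGES = {
    "parent_age": (40, 65),
    "parent_salary": (1_000_000, 10_000_000),
    "house_area": (20, 120),
    "average_grades": (75, 98),
}
BOOLEAN_COLUMNS = ("parent_was_in_college", "will_go_to_college")


def parse_row(row):
    """Convert one csv.DictReader row to typed values; raise ValueError on violations."""
    missing = [name for name in EXPECTED_COLUMNS if name not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    parsed = dict(row)
    for name, (low, high) in NUMERIC_RANGES.items():
        value = float(row[name])
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside [{low}, {high}]")
        parsed[name] = value
    for name in BOOLEAN_COLUMNS:
        text = row[name].strip().lower()
        if text not in ("true", "false"):
            raise ValueError(f"{name}={row[name]!r} is not True/False")
        parsed[name] = text == "true"
    return parsed


# Hypothetical sample rows for illustration only (assumption: not real dataset rows).
SAMPLE_CSV = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "Academic,A,Male,Very Interested,Urban,50,6000000,80.5,88.0,True,True\n"
    "Vocational,B,Female,Uncertain,Rural,45,3000000,60.0,79.5,False,False\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(len(rows), "rows parsed")
```

To run the same check on the real file, replace io.StringIO(SAMPLE_CSV) with an open file handle for data.csv, for example open("data.csv", newline="").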
Distribution
The dataset is provided as a CSV file, named data.csv, with a file size of approximately 70.67 KB. It comprises 11 distinct columns and contains 1000 individual records or rows.
Usage
This dataset is ideal for developing predictive models to forecast a high school student's likelihood of attending college. It can be particularly valuable for school counsellors to understand the various factors influencing students' decisions and to implement data-driven interventions to support those at risk of not pursuing higher education.
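As a starting point for such a predictive model, a single-threshold rule on average_grades gives a baseline that any more complex classifier should beat. The sketch below is one possible baseline, not the method used by the dataset's author. The function name best_grade_threshold and the grade and label values are hypothetical and chosen only for illustration.

```python
def best_grade_threshold(grades, labels):
    """Return (threshold, accuracy) for the rule: predict college if grade >= threshold.

    Every distinct grade is tried as a threshold; ties keep the lowest threshold.
    """
    best_threshold, best_accuracy = None, -1.0
    for threshold in sorted(set(grades)):
        correct = sum((g >= threshold) == y for g, y in zip(grades, labels))
        accuracy = correct / len(labels)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


# Hypothetical values for illustration only (assumption: not real dataset rows).
grades = [78.0, 80.5, 83.0, 86.0, 90.0, 95.0]
went_to_college = [False, False, True, False, True, True]

threshold, accuracy = best_grade_threshold(grades, went_to_college)
print(f"predict college if average_grades >= {threshold} (accuracy {accuracy:.3f})")
```

A rule like this is also easy to explain to a counsellor, which suits the explainability goal of the dataset. The threshold should be chosen on a training split and the accuracy reported on held-out rows.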
Coverage
This is a synthetic dataset created for a college project, so it does not represent specific real-world geographic or temporal coverage. It encompasses demographic details such as student gender, school type and accreditation, interest in college, and residence. Furthermore, it includes parental attributes like age, salary, house area, and prior college attendance.
License
CC0: Public Domain
Who Can Use It
- School Counsellors: To gain insights into student profiles and predict college attendance, enabling early intervention and support programmes.
- Machine Learning Developers: For building, training, and evaluating binary classification models focused on educational outcomes.
- Educational Researchers: To study the correlations between various socio-economic and academic factors and the decision to pursue higher education.
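For a first look at how a categorical factor relates to the outcome, the share of college-bound students can be compared across groups. This is a minimal sketch in plain Python. The function name college_rate_by and the sample records are hypothetical; only the column names residence and will_go_to_college come from the column list.

```python
from collections import defaultdict


def college_rate_by(records, column):
    """Return {category: share of records with will_go_to_college == True}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        key = record[column]
        totals[key] += 1
        positives[key] += bool(record["will_go_to_college"])
    return {key: positives[key] / totals[key] for key in totals}


# Hypothetical records for illustration only (assumption: not real dataset rows).
records = [
    {"residence": "Urban", "will_go_to_college": True},
    {"residence": "Urban", "will_go_to_college": True},
    {"residence": "Urban", "will_go_to_college": False},
    {"residence": "Rural", "will_go_to_college": False},
    {"residence": "Rural", "will_go_to_college": True},
]

rates = college_rate_by(records, "residence")
for category, rate in sorted(rates.items()):
    print(f"{category}: {rate:.2f}")
```

The same function works for any categorical column, such as type_school or interest. Because the data is synthetic, the resulting rates describe how the data was generated rather than real student populations.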
Dataset Name Suggestions
- College Propensity Dataset
- Student Higher Education Predictor
- Future Scholar Pathways
- College Bound Student Data
- Academic Continuation Factors
Attributes
Original Data Source: College Bound Student Data