University E-Learning Behaviour Dataset
Education & Learning Analytics
Free
About
This dataset is designed for predicting student course completion in online university courses. It aims to determine whether a student will successfully finish a course or drop out, providing valuable insights for educational institutions looking to enhance student retention and support.
Columns
The dataset comprises two main files: events_train.csv and submissions_train.csv.
events_train.csv:
- step_id: An identifier for a specific step within a course.
- user_id: An anonymised identifier for each student.
- timestamp: The time at which an event occurred, given as a Unix timestamp (seconds since 1 January 1970 UTC).
- action: Describes the type of event performed by the user. Possible values include: 'discovered' (user switched to step), 'viewed' (user viewed a step), 'started_attempt' (user began an attempt to solve a step), and 'passed' (user successfully solved a practical step).
submissions_train.csv:
- step_id: An identifier for the practical step related to the submission.
- timestamp: The time at which the solution was submitted, given as a Unix timestamp (seconds since 1 January 1970 UTC).
- submission_status: The status of the submitted solution, indicating its outcome.
- user_id: An anonymised identifier for the student who made the submission.
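As a minimal sketch of reading these columns, the snippet below parses a log with Python's standard csv module and converts each timestamp to a UTC datetime. The helper name read_log, the sample rows, and the column order in the sample are assumptions made for illustration; only the column names come from the description above. It also assumes the timestamps are in seconds.

```python
import csv
import io
from datetime import datetime, timezone


def read_log(source):
    """Yield one dict per CSV row, with 'timestamp' parsed into a UTC datetime.

    `source` is any open text file or file-like object (hypothetical helper).
    """
    for row in csv.DictReader(source):
        # Assumption: timestamps are Unix seconds, as the value range suggests.
        row["timestamp"] = datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc)
        yield row


# Invented sample rows standing in for events_train.csv.
sample_events = io.StringIO(
    "step_id,timestamp,action,user_id\n"
    "101,1434340848,discovered,1\n"
    "101,1434340850,viewed,1\n"
)
events = list(read_log(sample_events))
print(events[0]["action"], events[0]["timestamp"].isoformat())
```

For the real files, the same helper could be called on a handle from open("events_train.csv", newline=""); submissions_train.csv would work the same way, with submission_status in place of action.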
Distribution
The data is provided in CSV format. The event_data_train.csv file is 108.57 MB and contains 3.48 million records of student actions, timestamps, and anonymised user IDs. The record count for submissions_train.csv is not stated; it holds the same kinds of columns, recorded for submissions.
Usage
This dataset is ideal for developing and evaluating machine learning models focused on student success prediction. It can be used to:
- Predict student dropout rates in online learning environments.
- Identify students at risk of not completing their courses.
- Inform early intervention strategies by educational platforms and universities.
- Perform educational analytics to understand student engagement patterns.
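One possible starting point for the tasks above is to turn the event log into per-student features. The sketch below counts how often each student performed each of the four documented actions. The function name action_counts and the sample events are hypothetical, and the dataset description does not define a completion label, so none is constructed here.

```python
from collections import Counter, defaultdict

# The four action values listed in the column description.
ACTIONS = ("discovered", "viewed", "started_attempt", "passed")


def action_counts(events):
    """Return {user_id: {action: count}} for an iterable of event dicts."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["user_id"]][event["action"]] += 1
    # Fill in zeros so every user ends up with the same feature columns.
    return {user: {a: c[a] for a in ACTIONS} for user, c in counts.items()}


# Invented events for illustration only.
sample = [
    {"user_id": "1", "action": "viewed"},
    {"user_id": "1", "action": "viewed"},
    {"user_id": "1", "action": "passed"},
    {"user_id": "2", "action": "discovered"},
]
features = action_counts(sample)
print(features["1"])
```

Counts like these could feed a classifier once a target such as "completed" or "dropped out" has been defined for each student.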
Coverage
The dataset covers student interactions and submissions within online university courses. Recorded events span Unix timestamps 1434340848 to 1526772811, which corresponds to 15 June 2015 through 19 May 2018 (UTC), or a little under three years of student activity. Geographic and demographic scope is not detailed, and user IDs are anonymised.
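The two boundary timestamps can be checked with a short conversion. This sketch assumes the values are Unix seconds interpreted in UTC; the variable names are illustrative.

```python
from datetime import datetime, timezone

FIRST_EVENT = 1434340848  # earliest timestamp quoted in the coverage text
LAST_EVENT = 1526772811   # latest timestamp quoted in the coverage text

start = datetime.fromtimestamp(FIRST_EVENT, tz=timezone.utc)
end = datetime.fromtimestamp(LAST_EVENT, tz=timezone.utc)
span_days = (end - start).days

print(start.isoformat())  # 2015-06-15T04:00:48+00:00
print(end.isoformat())    # 2018-05-19T23:33:31+00:00
print(span_days)          # 1069
```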
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Scientists and Machine Learning Engineers: For building and refining predictive models for classification tasks related to student retention.
- Educational Researchers: To analyse student behaviour and factors influencing course completion in online learning.
- Online Learning Platforms and Universities: To implement proactive measures to support students and improve course completion rates.
Dataset Name Suggestions
- Online Student Course Completion Prediction
- Student Engagement and Dropout Data
- University E-Learning Behaviour Dataset
- Student Performance Prediction for MOOCs
Attributes
Original Data Source: University E-Learning Behaviour Dataset