Book Language Proficiency Levels

Education & Learning Analytics

Tags and Keywords

  • Education
  • Text
  • Literature
  • Books
  • Language

Free

About

A curated collection of books from the Kumon Institute of Education, categorised by language proficiency level. The dataset is organised into six main categories, from beginner to advanced, so it suits learners at different stages. It is particularly useful for language level prediction tasks such as CEFR level prediction, where the levels can serve as labels for the book texts. The file itself lists only titles, authors, and levels, so the book texts must be obtained separately.

Columns

  • Title: The title of the book.
  • Author: The author of the book.
  • Language Level: The language difficulty level assigned to the book.

Distribution

  • Format: CSV
  • File Name: book_levels.csv
  • Size: 11.42 kB
  • Structure: The dataset contains 236 records and 3 columns.
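
As a minimal sketch of working with the file described above, the following Python reads the CSV with the standard library and counts books per level. The header spellings ("Title", "Author", "Language Level") and the UTF-8 encoding are assumptions taken from the column descriptions, not confirmed against the file; the function names are hypothetical.

```python
import csv
from collections import Counter

# Assumed header spelling, based on the Columns section; adjust if the file differs.
EXPECTED_COLUMNS = ["Title", "Author", "Language Level"]


def load_book_levels(path):
    """Read the CSV into a list of row dicts, checking that the expected columns exist."""
    # "utf-8-sig" reads plain UTF-8 and also strips a byte-order mark if one is present.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in EXPECTED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return list(reader)


def count_by_level(rows):
    """Return a Counter mapping each level label to its number of books."""
    return Counter(row["Language Level"] for row in rows)
```

Calling `load_book_levels("book_levels.csv")` on the downloaded file should return one dict per record, and `count_by_level` then shows how evenly the records are spread across the levels.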

Usage

  • Language Level Prediction: Can be used to train models for predicting the language proficiency level of a given text.
  • Educational Content Curation: Ideal for educators and platforms developing personalised reading lists based on language skill.
  • Literary Analysis: Researchers can analyse characteristics of books across different language difficulty levels.
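
For the level prediction use case, one possible first step is a train/test split that keeps every level represented. The sketch below is an illustration under assumptions, not a prescribed workflow: it expects rows as dicts keyed by the assumed column name "Language Level", and `stratified_split` is a hypothetical helper. Text features would still have to come from the book texts, which this dataset does not include.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key="Language Level", test_fraction=0.2, seed=0):
    """Split rows into (train, test), sampling the test rows separately within each level."""
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for level in sorted(by_level):
        group = list(by_level[level])
        rng.shuffle(group)
        # Hold out at least one row per level, unless the level has a single row.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

With 236 records spread over six categories, some levels may be small, so a per-level split like this avoids a test set that misses a level entirely.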

Coverage

  • Geographic: Not specified.
  • Time Range: Not specified.
  • Demographic: The data is intended for language learners across various proficiency stages, from beginner to advanced.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Data Scientists: For building and training machine learning models to predict text difficulty.
  • Educators: To create customised reading lists and educational materials for students at different language levels.
  • App Developers: For integrating a recommended reading feature into language-learning applications.
  • Researchers: To study the linguistic features that define different reading levels in literature.
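
A recommended reading feature of the kind mentioned above could start from a simple filter by level. This is a rough sketch only: `recommend_titles` is a hypothetical helper, the column names are assumed from the Columns section, and a real application would likely rank or personalise the results rather than sort alphabetically.

```python
def recommend_titles(rows, level, limit=5, level_key="Language Level"):
    """Return up to `limit` (title, author) pairs whose level matches `level`.

    Matching ignores case and surrounding whitespace; results are sorted by title.
    """
    wanted = level.strip().lower()
    matches = [row for row in rows if row[level_key].strip().lower() == wanted]
    matches.sort(key=lambda row: row["Title"])
    return [(row["Title"], row["Author"]) for row in matches[:limit]]
```

The `rows` argument is a list of dicts, one per CSV record, such as the output of `csv.DictReader`.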

Dataset Name Suggestions

  • Kumon Recommended Reading List by Language Level
  • Graded Readers Dataset for Language Learners
  • Book Language Proficiency Levels
  • Educational Reading List for CEFR Prediction

Attributes

Original Data Source: Book Language Proficiency Levels

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 28/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
