Educational Bangla Text Dataset

Education & Learning Analytics

Tags and Keywords

Computer

Education

NLP

Textbook

Bangla

Answer

Free

About

This specialised dataset supports Bangla language processing, in particular the development of customisable Bangla question-answering systems. It comprises approximately 3,000 question-and-answer pairs curated by human annotators from NCTB textbooks for classes six to ten. Each passage averages 387 words, giving enough context for meaningful question answering. The annotators collected answers for various question types to keep the dataset reliable and relevant for Bangla. It is intended as a resource for researchers and developers building context-aware Bangla question-answering systems.

Columns

  • Passage: Contains textual content from NCTB textbooks, providing contextual information.
  • Question: A query formulated based on the corresponding passage.
  • Answer: The expertly annotated response to the question, derived from the passage.

Distribution

The dataset is provided in CSV format and consists of approximately 3,000 question-and-answer pairs, organised into training and validation subsets. Multiple passages are paired with their corresponding questions and annotated answers. Exact row counts per subset are not stated; passages average 387 words each.
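As a rough illustration of working with the CSV layout described above, the following sketch reads rows with Python's standard csv module and computes the mean passage length in words. It assumes the header names match the listed columns exactly (Passage, Question, Answer), which the listing does not confirm, and it uses a made-up one-row sample rather than real data from the dataset.

```python
import csv
import io

# Assumed header names, taken from the "Columns" section of the listing.
EXPECTED_COLUMNS = ("Passage", "Question", "Answer")


def load_qa_rows(handle):
    """Read question-answer rows from an open CSV text handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Keep only the three expected columns and trim surrounding whitespace.
    return [{c: (row[c] or "").strip() for c in EXPECTED_COLUMNS} for row in reader]


def mean_passage_words(rows):
    """Average whitespace-separated word count over distinct passages."""
    passages = {row["Passage"] for row in rows}
    if not passages:
        return 0.0
    return sum(len(p.split()) for p in passages) / len(passages)


# Made-up sample standing in for the real file (not taken from the dataset).
sample = io.StringIO(
    "Passage,Question,Answer\n"
    "ঢাকা বাংলাদেশের রাজধানী।,বাংলাদেশের রাজধানী কী?,ঢাকা\n"
)
rows = load_qa_rows(sample)
print(len(rows), mean_passage_words(rows))
```

For the downloaded file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. Passages are de-duplicated before averaging because several questions may share one passage.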

Usage

This dataset is suited to developing Bangla question-answering systems. Researchers and developers can use it for Natural Language Processing tasks such as answer extraction, reading comprehension, and text processing.
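For answer extraction, a common preprocessing step is to locate each answer inside its passage as a character span. The sketch below shows one way this might be done. It assumes answers appear verbatim in the passage, which the listing does not guarantee ("derived from the passage" may include paraphrases), so it returns None when no exact match is found. The function name is illustrative, and Unicode NFC normalisation is applied because Bangla text can encode the same visible character in more than one way.

```python
import unicodedata


def find_answer_span(passage, answer):
    """Return (start, end) offsets of the answer in the NFC-normalised passage.

    Returns None if the answer is empty or does not occur verbatim.
    """
    norm_passage = unicodedata.normalize("NFC", passage)
    norm_answer = unicodedata.normalize("NFC", answer).strip()
    if not norm_answer:
        return None
    start = norm_passage.find(norm_answer)
    if start == -1:
        return None  # answer may be paraphrased rather than extractive
    return start, start + len(norm_answer)


# Made-up example text (not taken from the dataset).
passage = "ঢাকা বাংলাদেশের রাজধানী।"
span = find_answer_span(passage, "রাজধানী")
print(span)
```

The offsets refer to the normalised passage, so the passage should be normalised the same way before slicing. Rows for which the function returns None would need separate handling, for example manual review or a fuzzy match.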

Coverage

The listing's region is given as global. The content is derived from NCTB textbooks for classes six to ten and focuses on material relevant to Bangla language education. No notes are provided on data availability for particular groups or years.

License

CC0

Who Can Use It

Intended users include researchers and developers in artificial intelligence (AI) and machine learning (ML), especially those working in Natural Language Processing (NLP), Bengali language studies, and educational technology. Use cases include training AI models for question answering, text comprehension, and information extraction in Bangla.

Dataset Name Suggestions

  • Bangla QA Textbook Dataset
  • NCTB Bengali Question-Answer Corpus
  • Educational Bangla Text Data
  • Bengali Reading Comprehension Dataset

Attributes

Original Data Source: Textbook Dataset from NCTB

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

16/06/2025

REGION

GLOBAL

UDQS (UNIVERSAL DATA QUALITY SCORE)

5 / 5

VERSION

1.0
