Rakhine Language Proverbs Dataset
Data Science and Analytics
Free
About
This dataset presents Rakhine/Arakan Proverbs, a collection of wisdom in the Rakhine language, primarily spoken by the Rakhine people in the Rakhine State of Myanmar [1]. Rakhine is considered a low-resource language, which limits research and applications [1]. The dataset aims to support further research and studies in the Rakhine language [1]. It is a valuable resource for promoting education and academic studies within the MyanmarGPT-Movement, an AI initiative in Myanmar [1]. The proverbs were summarised and extracted from the book "ဥပမာစုံ၊ ရခိုင်စကားပုံ။" by "အရှင်စက္ကိန္ဒ, အရှင်ဝါသဝ", originally published in August 1996 [1]. The dataset includes over 300 proverbs [2].
Columns
- proverbs: The text of each proverb [2].
- proverbs/စကားပုံများ: A column labelled in both English and Burmese (စကားပုံများ means "proverbs"); it appears to hold the proverbs in their original Burmese script, in which Rakhine is written [2].
Distribution
The dataset contains over 300 proverbs, with 221 unique values identified [2]. Data files are typically provided in CSV format [3]. The exact row count and file size are not specified.
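Since the row count is not stated, a quick check after download can confirm how many proverbs and how many distinct values a file actually holds. The following is a minimal sketch using only the Python standard library. It assumes the CSV has a header row with a column named `proverbs` (as listed under Columns); the helper names `load_proverbs` and `summarize` are illustrative, and the in-memory sample stands in for the real file, which is not reproduced here.

```python
import csv
import io


def load_proverbs(source, column="proverbs"):
    """Read one column from a CSV file object, skipping empty cells."""
    proverbs = []
    for row in csv.DictReader(source):
        text = (row.get(column) or "").strip()
        if text:
            proverbs.append(text)
    return proverbs


def summarize(proverbs):
    """Return (total entries, number of distinct entries)."""
    return len(proverbs), len(set(proverbs))


# Tiny in-memory stand-in for the real CSV (placeholder rows, not real data).
sample = io.StringIO("proverbs\nfirst proverb\nsecond proverb\nfirst proverb\n")
total, unique = summarize(load_proverbs(sample))
print(total, unique)
```

For a downloaded file, the same function can be given a file object opened with `encoding="utf-8"` and `newline=""`; the file name depends on the download and is not specified in this listing.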
Usage
This dataset is ideal for:
- Language research and studies focused on the Rakhine language [1].
- Natural Language Processing (NLP) applications, particularly for low-resource languages [1].
- Text-to-text generation tasks and other AI/ML data initiatives [1].
- Educational purposes within the Myanmar AI community [1].
- Developing text analysis tools and models for Burmese and Rakhine languages [1].
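As one possible first step for the text analysis and NLP uses listed above, rows can be screened so that only text written in Myanmar script is kept. The sketch below is an illustration under stated assumptions, not a method described by the dataset authors: the function names are hypothetical, the 0.8 threshold is an arbitrary starting point, and the check only tests Unicode code point ranges (the Myanmar block U+1000 to U+109F plus the Extended-A and Extended-B blocks). It does not distinguish Rakhine from Burmese, and it cannot detect text in the legacy Zawgyi encoding, which reuses the same code points.

```python
import unicodedata

# Unicode ranges for Myanmar script: main block, Extended-B, Extended-A.
MYANMAR_RANGES = ((0x1000, 0x109F), (0xA9E0, 0xA9FF), (0xAA60, 0xAA7F))


def is_myanmar_char(ch):
    """True if the character falls inside a Myanmar script block."""
    code = ord(ch)
    return any(low <= code <= high for low, high in MYANMAR_RANGES)


def myanmar_ratio(text):
    """Fraction of non-whitespace characters that are Myanmar script."""
    chars = [ch for ch in unicodedata.normalize("NFC", text) if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_myanmar_char(ch) for ch in chars) / len(chars)


def keep_myanmar_rows(rows, threshold=0.8):
    """Keep rows whose Myanmar-script ratio meets the (assumed) threshold."""
    return [row for row in rows if myanmar_ratio(row) >= threshold]
```

For example, `keep_myanmar_rows(["စကားပုံ", "hello"])` keeps only the first string, since every character of စကားပုံ lies in the main Myanmar block and none of the Latin letters do.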
Coverage
- Geographic Scope: Primarily pertains to the Rakhine State of Myanmar in Southeast Asia, with global applicability for research [1, 4].
- Time Range: The source book was published in August 1996; the dataset was released in February 2024 [1] and listed on the platform in June 2025 [4].
- Demographic Scope: Relevant to the Rakhine people who speak the language [1].
License
CC0
Who Can Use It
- Researchers and Academics: Those studying low-resource languages, particularly Rakhine and Burmese [1].
- Data Scientists and NLP Practitioners: Individuals working on text data, language modelling, and generation tasks [1].
- AI/ML Developers: Creating applications or models for linguistic analysis in Myanmar [1].
- Students: Engaging in educational projects related to linguistics or data science [1].
Dataset Name Suggestions
- Rakhine Proverbs
- Arakanese Proverbs Collection
- Myanmar Rakhine Wisdom
- Rakhine Language Proverbs Dataset
Attributes
Original Data Source: Rakhine Proverbs