Quranic Arabic-Indonesian Translation Dataset
About
This dataset is a structured collection of the Quran's parallel text in the original Arabic and its official Indonesian translation. It is sourced from the Quran Online portal of Kementerian Agama Republik Indonesia (the Indonesian Ministry of Religious Affairs), which publishes the complete Arabic text together with the Ministry's authorised Indonesian translation. Each row represents a single verse, with the Arabic text aligned to its Indonesian translation. The resource suits linguistic analysis, natural language processing (NLP) tasks, religious studies, and machine translation work that compares the two languages.
Columns
- Arabic: The original text of the Quran in Classical Arabic script, organised by Surah (Chapter) and Ayah (Verse).
- Bahasa: The official Indonesian translation of the corresponding Arabic verse, as supplied by Kementerian Agama Republik Indonesia.
Distribution
The dataset is a parallel corpus with two columns, likely distributed as a CSV file. It contains approximately 6,116 records, each pairing one Quranic verse with its translation.
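As a minimal sketch of reading the two-column layout described above, the following assumes the file is a UTF-8 CSV whose header row uses the column names `Arabic` and `Bahasa`. The function name `read_pairs` and the file name in the comment are hypothetical, and the sample rows are placeholders rather than real verses.

```python
import csv
import io


def read_pairs(handle):
    """Yield (arabic, bahasa) tuples from an open CSV file handle.

    Assumes a header row with the columns "Arabic" and "Bahasa".
    Rows where either side is empty are skipped.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        arabic = (row.get("Arabic") or "").strip()
        bahasa = (row.get("Bahasa") or "").strip()
        if arabic and bahasa:
            yield arabic, bahasa


# Placeholder rows stand in for real verses so the sketch runs without the file.
sample = io.StringIO(
    "Arabic,Bahasa\n"
    "<arabic verse 1>,<indonesian translation 1>\n"
    "<arabic verse 2>,<indonesian translation 2>\n"
    ",<translation with missing source>\n"
)
pairs = list(read_pairs(sample))
print(len(pairs))  # 2: the row with an empty Arabic cell is skipped

# With the real file (hypothetical name), the call would look like:
# with open("quran_ar_id.csv", encoding="utf-8", newline="") as f:
#     pairs = list(read_pairs(f))
```

Opening the file with an explicit `encoding="utf-8"` matters here because the Arabic column cannot be decoded reliably with a platform default encoding.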
Usage
This dataset is ideal for:
- Linguistic analysis of Arabic and Indonesian languages.
- Natural language processing (NLP) tasks, including corpus creation and model training.
- Religious studies, offering insights into textual interpretation across languages.
- Machine translation applications between Arabic and Indonesian.
- Comparative studies to explore translation alignment, semantic equivalence, and syntactical structures between Arabic and Indonesian.
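For the model training and machine translation uses listed above, verse pairs usually need to be divided into training and validation sets. The sketch below shows one reproducible way to do that with a seeded shuffle; the function name `split_pairs`, the 25% validation fraction, and the placeholder pairs are illustrative assumptions, not part of the dataset.

```python
import random


def split_pairs(pairs, validation_fraction=0.25, seed=0):
    """Shuffle verse pairs with a fixed seed and split them into two lists.

    Returns (train, validation). At least one pair is held out for validation.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder pairs stand in for real (Arabic, Bahasa) rows.
pairs = [
    (f"<arabic verse {i}>", f"<indonesian translation {i}>")
    for i in range(1, 21)
]
train, validation = split_pairs(pairs)
print(len(train), len(validation))  # 15 5
```

Fixing the seed keeps the split stable across runs, so evaluation scores from different experiments stay comparable.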
Coverage
The dataset covers the complete text of the Quran in Arabic together with its official Indonesian translation. Its scope is global rather than tied to a particular region.
License
CC0
Who Can Use It
- Researchers in linguistics, NLP, and religious studies.
- Developers working on machine translation systems or language models for Arabic and Indonesian.
- Academics and students focusing on comparative textual analysis or Islamic studies.
Dataset Name Suggestions
- Quran Arabic-Indonesian Parallel Corpus
- Quranic Arabic-Indonesian Translation Dataset
- Arabic-Indonesian Quranic Verse Corpus
- Indonesian Ministry of Religious Affairs Quran Dataset
Attributes
Original Data Source: Quran Arabic - Indonesian Parallel Corpus