Official Turkish Lexicon Data

Data Science and Analytics

Tags and Keywords

Turkish, Dictionary, Language, Linguistics, TDK

Free

About

Explore the richness of the Turkish language with this collection of Turkish dictionary definitions. The dataset serves as a valuable resource for those interested in linguistic research, language analysis, natural language processing tasks, and educational projects. It offers detailed definitions for a wide array of Turkish words and phrases, providing foundational material for understanding the intricacies of the language.

Columns

madde_id: A unique identifier for each dictionary entry.
word_id: An identifier for the specific word within an entry.
kac: A numerical column, possibly indicating a count or frequency related to the word.
kelime_no: A word number or sequence identifier.
cesit: A numerical value, potentially representing a classification or type of word.
anlam_gor: A numerical column, likely indicating whether a meaning is visible or checked.
on_taki: A prefix or qualifier placed before the entry (Turkish "ön takı"), with many missing values and diverse entries like '(birinin)'.
madde: The headword, that is, the entry word or phrase itself, with a high number of unique values.
cesit_say: A numerical count related to the classification or type.
anlam_say: The count of meanings associated with a word, ranging from 1 to 56.
taki: Represents a suffix, with 'ği' being a common example among others.
cogul_mu: A binary indicator, likely signifying if the word is plural.
ozel_mi: A binary indicator, likely signifying if the word is a proper noun.
lisan_kodu: A language code, with various numerical values.
lisan: The language of origin, with 'Rumca' (Greek) being a frequently occurring entry.
telaffuz: Pronunciation notes, such as 'l ince okunur' (the l is pronounced soft).
birlesikler: Compound words or phrases, with 'kerli ferli' as an example.
font: This column appears to be entirely empty.
madde_duz: A plain (normalised) form of the entry word.
gosterim_tarihi: The display date or last update date for the entry, ranging from 26 March 2019 to 31 July 2023.
anlamlarListe: A JSON-like structure containing a list of meanings for each entry, including 'anlam_id', 'madde_id', 'anlam_sira', 'fiil', 'tipkes', 'anlam', 'gos', and 'ozelliklerListe'.
atasozu: A JSON-like structure containing proverbs related to the entry.
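
The nested columns can be unpacked into ordinary Python objects before analysis. The following is a minimal sketch, assuming each 'anlamlarListe' cell holds either strict JSON or a Python-style literal (the listing only says "JSON-like", so the exact serialisation is an assumption). The helper names parse_nested and meanings are hypothetical, and the sample cell is invented for illustration rather than taken from the dataset.

```python
import ast
import json


def parse_nested(cell):
    """Parse a JSON-like cell into a list; return [] for empty or non-list cells."""
    if cell is None or not cell.strip():
        return []
    try:
        parsed = json.loads(cell)
    except json.JSONDecodeError:
        # Assumption: non-JSON cells use Python literal syntax (single quotes, None).
        parsed = ast.literal_eval(cell)
    return parsed if isinstance(parsed, list) else []


def meanings(cell):
    """Return the 'anlam' texts of one entry, ordered by 'anlam_sira'."""
    items = sorted(parse_nested(cell), key=lambda m: int(m.get("anlam_sira", 0)))
    return [m["anlam"] for m in items if m.get("anlam")]


# Invented sample cell using field names from the column description above.
sample_cell = (
    '[{"anlam_id": "2", "madde_id": "1", "anlam_sira": "2", "anlam": "ikinci anlam"},'
    ' {"anlam_id": "1", "madde_id": "1", "anlam_sira": "1", "anlam": "birinci anlam"}]'
)
print(meanings(sample_cell))
```

The same parse_nested helper should also apply to the 'atasozu' column, provided it is serialised the same way.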

Distribution

The dataset is provided as a single CSV file of 77.03 MB containing approximately 92,400 records. The listing states 21 columns, although the Columns section describes 22 fields.
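
A quick way to check the column and row counts after download is to stream the file with Python's standard csv module instead of loading it all at once. This is a sketch under stated assumptions: the file name tdk.csv and the UTF-8 encoding are guesses, the summarise helper is hypothetical, and the in-memory sample stands in for the real file. The field size limit is raised because nested cells such as 'anlamlarListe' may exceed the module's default limit.

```python
import csv
import io

# Nested JSON-like cells can be long; raise the per-field limit above the default.
csv.field_size_limit(10_000_000)


def summarise(fileobj):
    """Return (number_of_columns, number_of_rows) for an open CSV file object."""
    reader = csv.DictReader(fileobj)
    rows = sum(1 for _ in reader)
    return len(reader.fieldnames or []), rows


# Invented three-column sample standing in for the real file.
sample = "madde_id,madde,anlam_say\n1,a,3\n2,ab,1\n"
print(summarise(io.StringIO(sample)))

# With the downloaded file (name and encoding are assumptions):
# with open("tdk.csv", newline="", encoding="utf-8") as f:
#     print(summarise(f))
```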

Usage

This dataset is ideal for a variety of applications, including natural language processing tasks such as text analysis and sentiment analysis in Turkish. It can also be used for educational projects focused on the Turkish language, linguistic research, and the development of language learning tools. Researchers and linguists can leverage it for detailed language studies.

Coverage

The dataset focuses exclusively on the Turkish language, providing definitions from the Turkish Language Association (TDK). The content covers a range of dictionary entries with historical data updates spanning from March 2019 to July 2023. There are no specific demographic notes, as the data is linguistic in nature.

License

CC0: Public Domain

Who Can Use It

This dataset is primarily intended for researchers, linguists, language enthusiasts, and anyone with a keen interest in the Turkish language. It supports use cases such as:

Researchers: For academic studies on Turkish lexicography and etymology.
Linguists: For analysing word structures, semantic relationships, and language evolution.
Developers: For building natural language processing models, spell checkers, or translation tools for Turkish.
Educators and Students: For language learning resources and educational projects.
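
As an illustration of the spell-checker use case, the headwords in the 'madde' column could be turned into a lookup set. The sketch below is only a starting point: the function names are hypothetical, the sample headwords are invented, and it handles nothing beyond exact matches. It does show one detail that matters for Turkish: Python's default lower() maps 'I' to 'i', whereas Turkish casing pairs 'I' with 'ı' and 'İ' with 'i'.

```python
def tr_lower(text):
    """Lowercase using Turkish casing rules: I -> ı and İ -> i."""
    return text.replace("I", "ı").replace("İ", "i").lower()


def build_vocabulary(headwords):
    """Build a set of normalised headwords, skipping empty values."""
    return {tr_lower(w.strip()) for w in headwords if w and w.strip()}


def is_known(word, vocabulary):
    """Check whether a word matches a headword after Turkish lowercasing."""
    return tr_lower(word.strip()) in vocabulary


# Invented sample values standing in for the 'madde' column.
sample_headwords = ["ışık", "İzmir", "kitap"]
vocab = build_vocabulary(sample_headwords)
print(is_known("IŞIK", vocab), is_known("Kitab", vocab))
```

A dictionary lists base forms only, so a practical checker for an agglutinative language like Turkish would also need morphological analysis to recognise inflected words.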

Dataset Name Suggestions

TDK Turkish Dictionary Words
Turkish Language Definitions
Official Turkish Lexicon Data
Turkish Dictionary Entries

Attributes

Original Data Source: Official Turkish Lexicon Data

Listing Stats

Listed: 08/09/2025
Region: Global
Quality: 5 / 5
Version: 1.0
