Text-to-SQL Model Evaluation Data
Data Science and Analytics
Free
About
This dataset comprises 8,034 entries designed to assess the performance of text-to-SQL models. Each entry includes a natural language text query and its corresponding SQL command. It is a subset derived from the Spider dataset, focusing on diverse and complex queries to challenge machine learning models' understanding and generation capabilities. This is a free dataset, ideal for data science and analytics, particularly in natural language processing and deep learning applications.
Columns
- text_query: This column contains natural language queries in text format.
- sql_command: This column contains the SQL command that corresponds to each text query.
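As a minimal sketch of how the two columns might be read and sanity-checked, the snippet below parses a CSV with Python's standard `csv` module, confirms that both column names are present, and tallies entries and unique values per column. The helper names `load_pairs` and `summarize` are hypothetical, and the two inline rows are invented stand-ins for the real file, which is not reproduced here.

```python
import csv
import io

# Column names as given in the listing.
EXPECTED_COLUMNS = ("text_query", "sql_command")


def load_pairs(handle):
    """Read (text_query, sql_command) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [(row["text_query"], row["sql_command"]) for row in reader]


def summarize(pairs):
    """Count entries and unique values in each column."""
    return {
        "entries": len(pairs),
        "unique_text_query": len({query for query, _ in pairs}),
        "unique_sql_command": len({sql for _, sql in pairs}),
    }


# Invented sample rows (assumption): two phrasings that share one SQL command.
sample = io.StringIO(
    "text_query,sql_command\n"
    "How many singers are there?,SELECT count(*) FROM singer\n"
    "Count the singers.,SELECT count(*) FROM singer\n"
)
pairs = load_pairs(sample)
summary = summarize(pairs)
print(summary)
```

With the real file, the in-memory sample would be replaced by something like `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. The resulting counts can then be compared against the figures reported for the dataset.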
Distribution
The dataset consists of 8,034 entries. The text_query column features 7,990 unique values, whilst the sql_command column contains 4,525 unique values. Data files are typically provided in CSV format, and a sample file will be uploaded separately to the platform. The dataset is currently at version 1.0.
Usage
This dataset is ideal for evaluating the performance of text-to-SQL models. It can be utilised to challenge and enhance the understanding and generation capabilities of various machine learning models, especially within the domains of natural language processing and deep learning research and development.
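One simple way to score a model against the sql_command column is string-level exact match after light normalisation. The sketch below is an illustration under stated assumptions, not the official Spider evaluation: `normalize_sql` and `exact_match_accuracy` are hypothetical helpers, the example queries are invented, and the normalisation is deliberately crude (it lowercases everything, including string literals, and ignores semantically equivalent rewrites).

```python
import re


def normalize_sql(sql):
    """Lowercase, drop a trailing semicolon, and collapse runs of whitespace."""
    sql = sql.strip().rstrip(";")
    return re.sub(r"\s+", " ", sql).strip().lower()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalisation."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


# Invented examples (assumption): the first differs only in case and spacing,
# the second drops the WHERE clause and so counts as a miss.
references = [
    "SELECT count(*) FROM singer",
    "SELECT name FROM singer WHERE age > 30",
]
predictions = [
    "select COUNT(*)  from singer;",
    "SELECT name FROM singer",
]
score = exact_match_accuracy(predictions, references)
print(score)
```

String matching of this kind penalises queries that are correct but written differently, so it is best read as a lower bound. Comparing execution results against a database is a common complement, but it requires the underlying schemas and data, which the two-column file alone does not provide.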
Coverage
The dataset's regional scope is global. The listing date for this dataset is noted as 08/06/2025.
License
CC-BY-SA
Who Can Use It
This dataset is intended for:
- Machine learning researchers and developers who aim to train, test, or validate text-to-SQL models.
- Professionals and academics working in natural language processing (NLP) and deep learning.
- Data scientists and analysts focused on building and evaluating artificial intelligence models for natural language understanding and SQL generation.
Dataset Name Suggestions
- Text to SQL Dataset
- Natural Language to SQL Commands
- SQL Command Generation Dataset
- Text-to-SQL Model Evaluation Data
Attributes
Original Data Source: Text to SQL dataset