Text-to-SQL Model Evaluation Data

Data Science and Analytics

Tags and Keywords

  • Business
  • Intermediate
  • NLP
  • Deep
  • Hugging
  • Bart

Free

About

This dataset comprises 8,034 entries for assessing the performance of text-to-SQL models. Each entry pairs a natural language text query with its corresponding SQL command. It is a subset of the Spider dataset that focuses on diverse and complex queries, chosen to test how well machine learning models understand natural language and generate SQL. The dataset is free and suits data science and analytics work, particularly natural language processing and deep learning applications.

Columns

  • text_query: The natural language query, as text.
  • sql_command: The SQL command that corresponds to the text query.
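
As a minimal sketch, the two columns could be read into typed records as shown here. Only the column names text_query and sql_command come from the listing. The CSV layout (a header row, comma-separated), the `Example` class, the `read_examples` helper, and the sample rows are assumptions made for illustration and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One dataset entry: a natural language query and its SQL command."""
    text_query: str
    sql_command: str


def read_examples(handle):
    """Parse a CSV stream that has text_query and sql_command header columns."""
    reader = csv.DictReader(handle)
    return [Example(row["text_query"], row["sql_command"]) for row in reader]


# Hypothetical rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = (
    "text_query,sql_command\n"
    "How many singers are there?,SELECT count(*) FROM singer\n"
    '"List the name, age of each singer.","SELECT name, age FROM singer"\n'
)

examples = read_examples(io.StringIO(SAMPLE_CSV))
for example in examples:
    print(example.text_query, "->", example.sql_command)
```

For a real file, the same helper could be given a handle opened with `newline=""`, which the Python `csv` module documentation recommends so that quoted fields containing line breaks are parsed correctly.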

Distribution

The dataset consists of 8,034 entries. The text_query column has 7,990 unique values, whilst the sql_command column has 4,525 unique values, so some SQL commands are paired with more than one distinct text query. Data files are typically provided in CSV format, and a sample file will be uploaded separately to the platform. The dataset is currently at version 1.0.
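
The counts above could be checked against a downloaded copy with something like the following sketch. The `summarise` helper and the sample pairs are hypothetical. The pairs are invented for illustration and do not come from the dataset, so the printed numbers describe the sample only.

```python
from collections import defaultdict


def summarise(pairs):
    """Count entries and unique values per column, and collect the SQL
    commands that are paired with more than one distinct text query."""
    texts_by_sql = defaultdict(set)
    unique_texts = set()
    for text_query, sql_command in pairs:
        unique_texts.add(text_query)
        texts_by_sql[sql_command].add(text_query)
    shared = {sql: texts for sql, texts in texts_by_sql.items() if len(texts) > 1}
    return {
        "entries": len(pairs),
        "unique_text_query": len(unique_texts),
        "unique_sql_command": len(texts_by_sql),
        "sql_with_multiple_phrasings": shared,
    }


# Hypothetical (text_query, sql_command) pairs for illustration only.
PAIRS = [
    ("How many singers are there?", "SELECT count(*) FROM singer"),
    ("What is the total number of singers?", "SELECT count(*) FROM singer"),
    ("List all stadium names.", "SELECT name FROM stadium"),
    ("List all stadium names.", "SELECT name FROM stadium"),
]

stats = summarise(PAIRS)
print(stats["entries"], stats["unique_text_query"], stats["unique_sql_command"])
# prints: 4 3 2
```

On the full dataset, the same three numbers would be expected to read 8,034, 7,990 and 4,525 if the listing's figures hold for the downloaded file.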

Usage

This dataset is suited to evaluating text-to-SQL models. It can be used to test and improve the natural language understanding and SQL generation capabilities of machine learning models, especially in natural language processing and deep learning research and development.
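
One simple way to score a model against the sql_command column is normalised exact match, sketched below. This is an illustrative metric chosen for brevity, not one prescribed by the listing. The function names, the normalisation rules, and the sample predictions are all assumptions. Exact string matching marks semantically equivalent but differently written queries as wrong, so comparing execution results against a database is a common alternative.

```python
import re


def normalise_sql(sql):
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Lowercasing also changes string literals, so this comparison is
    deliberately crude (an assumption of this sketch).
    """
    collapsed = re.sub(r"\s+", " ", sql.strip())
    return collapsed.rstrip("; ").lower()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalisation."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalise_sql(pred) == normalise_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


# Hypothetical reference commands and model outputs, for illustration only.
references = ["SELECT count(*) FROM singer", "SELECT name FROM stadium"]
predictions = ["select COUNT(*)  from singer;", "SELECT name FROM concert"]

accuracy = exact_match_accuracy(predictions, references)
print(accuracy)  # prints: 0.5
```

The first prediction differs from its reference only in case, spacing and a trailing semicolon, so it counts as a match. The second names a different table and does not.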

Coverage

The dataset's regional scope is global. It was listed on 08/06/2025.

License

CC-BY-SA

Who Can Use It

This dataset is intended for:
  • Machine learning researchers and developers who aim to train, test, or validate text-to-SQL models.
  • Professionals and academics working in natural language processing (NLP) and deep learning.
  • Data scientists and analysts focused on building and evaluating artificial intelligence models for natural language understanding and SQL generation.

Dataset Name Suggestions

  • Text to SQL Dataset
  • Natural Language to SQL Commands
  • SQL Command Generation Dataset
  • Text-to-SQL Model Evaluation Data

Attributes

Original Data Source: Text to SQL dataset

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 08/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
