AI TRAINING DATA
Licensed AI training datasets for machine learning, LLM fine-tuning, NLP models, generative AI, and data-driven applications.
High-quality datasets for AI model training, LLM fine-tuning, NLP systems, RAG pipelines, computer vision, predictive analytics, and generative AI applications. Includes labeled datasets, benchmarking data, and privacy-safe training data designed to improve model accuracy and accelerate AI development.

Marie DeVox
Female Monologue Dataset: Tier 3 | Audio + Transcript Bundle
BEST FOR:
Enterprise AI Research Labs & Data Engineers who require a multi-seat department licens...
Number of records
32
Size
219.7 MB

Maria Radio Magyarorszag
The Franciscan deal
This dataset consists of 40 original MP3 audio files from my motivational radio show focused on sta...
Number of records
40
Size
105.0 MB

Marie DeVox
SaaS Corporate English Vocal Dataset
PROFESSIONAL AI VOICE DATASET - SAAS CORPORATE SERIES
Format: LJ Speech Standard Compliance | 24-b...
Number of records
80
Size
20.1 MB

Verbalscripts Transcription LLC
Global Conversational Audio Dataset 515,849 Hours Across 100 Languages
Overview
The Global Conversational Audio Dataset from Verbalscripts Transcription LLC provides acce...
Number of records
515.8K
Size
6.0 GB

Verbalscripts Transcription LLC
African MENA Conversational Audio Dataset 75807 Hours
Overview
The African MENA Conversational Audio Dataset provides access to 75,807 available audio ho...
Number of records
75.8K
Size
3.0 GB

Verbalscripts Transcription LLC
Top 25 Strategic Languages Conversational Audio Pack 174308 Hours
Overview
The Top 25 Strategic Languages Conversational Audio Pack provides access to 174,308 availa...
Number of records
174.3K
Size
4.0 GB

Verbalscripts Transcription LLC
Long Tail Multilingual Speech Dataset 423981 Hours 92 Languages
Overview
The Long Tail Multilingual Speech Dataset provides access to 423,981 available audio hours...
Number of records
424K
Size
2.0 GB

Verbalscripts Transcription LLC
Custom Conversational Audio Collection And Enrichment For AI Training
Overview
Verbalscripts Transcription LLC provides custom conversational audio collection, transcrip...
Number of records
1
Size
5.8 GB

OTL DATA S.R.L.
Real Industrial Video Dataset for Computer - wooden_window_factory_01
Overview
ORION WWF1, short for wooden_window_factory_01, is a certified industrial AI sample datas...
Number of records
620
Size
16.1 GB

Day By Day Recovery Resources
Pre AA Addiction Trajectory Dataset 100 JSONL Records for AI Training
This dataset provides 100 structured, scene-level behavioral records modeling the developmental traj...
Number of records
100
Size
354.0 KB

SimAIHub
Simaihub Expert Navigation Foundation Pack
Simaihub Expert Navigation Foundation Pack (V1)
Product: Reinforcement-learning-ready expert naviga...
Number of records
1K
Size
10.6 GB

Krampfstadt Studio
Sport Cars 2 sound recordings Audio Dataset for ML AI training
Shift into high gear with Sport Cars 2, a premium collection featuring 10 iconic sports cars and per...
Number of records
2.2K
Size
26.7 GB