Opendatabay APP

Midjourney AI Art Prompts from 2022 Dataset

Data Science and Analytics

Tags and Keywords

Midjourney

Prompts

Generative

AI

GenAI

Free

About

This dataset contains a collection of Midjourney user prompts and their corresponding generated image URLs from 2022. It has been reformatted from a previous "Midjourney User Prompts & Generated Images" dataset, making it particularly well-suited for text search applications designed to display associated images. The dataset offers multiple versions to cater to different analytical needs: a main version includes re-runs that may result in similar image outputs, while a reduced version excludes re-runs, though it might contain duplicate text with differing arguments. A raw version is also available but is generally not recommended due to the inclusion of errors, chat, and server messages.

Columns

  • timestamp: The precise date and time the message was recorded.
  • _message: The original message content from the user, which may include commands, arguments, and other textual elements.
  • thumb_url: A URL for a thumbnail version of the generated image.
  • img_url: The proxy URL for the generated image. This path requires a prefix of either https://cdn.discordapp.com/attachments/ or https://media.discordapp.net/attachments/ to form a complete, usable image URL.
  • cmd: The extracted command portion from the _message field.
  • job_id: A unique 36-character identifier for the specific Midjourney generation task, written as hexadecimal digits separated by hyphens (UUID style).
  • text: The cleaned text of the /imagine command, specifically excluding any arguments or input URLs.
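
To illustrate how the text column relates to _message, here is a minimal sketch that separates a prompt from its arguments and input URLs. The procedure actually used to build the dataset is not documented in this listing, so the function name split_prompt, the "/imagine prompt:" message layout, and the rule that arguments start at the first "--" token are all assumptions made for illustration.

```python
import re

# Matches any http(s) URL, e.g. an input image pasted into the prompt.
URL_RE = re.compile(r"https?://\S+")
# Assumption: arguments start at the first token that begins with "--".
ARG_START_RE = re.compile(r"(?:^|\s)--\w")


def split_prompt(message: str) -> tuple[str, list[str]]:
    """Split a message into (cleaned prompt text, list of argument tokens).

    Hypothetical helper: this is not the dataset author's cleaning code.
    """
    body = message.strip()
    # Drop the command and the "prompt:" label if they are present.
    for lead in ("/imagine", "prompt:"):
        if body.startswith(lead):
            body = body[len(lead):].strip()

    match = ARG_START_RE.search(body)
    if match:
        text_part, args_part = body[:match.start()], body[match.start():]
    else:
        text_part, args_part = body, ""

    # Remove input URLs and collapse repeated whitespace.
    text = " ".join(URL_RE.sub(" ", text_part).split())
    return text, args_part.split()


text, args = split_prompt(
    "/imagine prompt: https://example.com/a.png a red fox in snow --ar 16:9 --v 3"
)
```

Two messages that differ only in their arguments yield the same cleaned text under this sketch, which is consistent with the note that the reduced file may contain duplicate text with differing arguments.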

Distribution

The dataset is provided in CSV format and is split into several files with varying row counts and characteristics:
  • midjourney_2022_250k_raw.csv: Contains approximately 251,390 rows of raw messages, including commands and potentially unwanted content. URLs in this file are given in full.
  • midjourney_2022_250k.csv: Contains approximately 248,069 rows. This version is suitable for text search and includes re-runs, which might lead to similar output images. URLs are shortened to conserve memory.
  • midjourney_2022_reduced.csv: Contains approximately 130,407 rows. This version excludes re-runs but may feature duplicate text entries associated with different arguments. URLs are also shortened.
The URLs in midjourney_2022_250k.csv and midjourney_2022_reduced.csv are partial and need the base URL (https://cdn.discordapp.com/attachments/ or https://media.discordapp.net/attachments/) re-attached to be fully functional.
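
Re-attaching the base URL can be sketched as follows. The two prefixes and the img_url and text column names come from the description above; the helper names full_image_url and rows_with_full_urls, and the sample path, are hypothetical. The example reads from an in-memory sample so it runs without the dataset files; an opened CSV file can be passed in the same way.

```python
import csv
import io

CDN_PREFIX = "https://cdn.discordapp.com/attachments/"
MEDIA_PREFIX = "https://media.discordapp.net/attachments/"


def full_image_url(partial: str, prefix: str = CDN_PREFIX) -> str:
    """Return a complete URL for a shortened img_url value."""
    partial = partial.strip()
    if partial.startswith(("http://", "https://")):
        return partial  # already complete, e.g. a value from the raw file
    return prefix + partial.lstrip("/")


def rows_with_full_urls(handle, prefix: str = CDN_PREFIX):
    """Yield CSV rows as dicts with img_url expanded to a full URL."""
    for row in csv.DictReader(handle):
        row["img_url"] = full_image_url(row["img_url"], prefix)
        yield row


# Made-up sample row standing in for midjourney_2022_250k.csv.
sample = io.StringIO("text,img_url\na red fox,123/456/fox.png\n")
rows = list(rows_with_full_urls(sample))
```

Passing prefix=MEDIA_PREFIX selects the media.discordapp.net host instead.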

Usage

This dataset is ideally suited for data science and analytics projects focused on:
  • Developing text search functionalities to retrieve images based on descriptive prompts.
  • Analysing trends in user prompts and their impact on generated imagery.
  • Training or evaluating AI models for tasks that build on natural language processing (NLP), such as prompt engineering or text-to-image synthesis.
  • Exploring the relationship between textual prompts and visual outputs in generative AI systems.
  • Building recommender systems for creative content.
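
As a rough illustration of the text search use case, the sketch below builds a small keyword index over the text column and returns the img_url values of rows that contain every query word. It is one simple approach among many, not a method prescribed by the dataset; the function names and the sample rows are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows: list[dict]) -> dict:
    """Map each word to the set of row positions whose text contains it."""
    index = defaultdict(set)
    for position, row in enumerate(rows):
        for token in tokenize(row["text"]):
            index[token].add(position)
    return index


def search(query: str, rows: list[dict], index: dict) -> list[str]:
    """Return img_url values for rows that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return [rows[position]["img_url"] for position in sorted(hits)]


# Made-up rows using the dataset's column names.
rows = [
    {"text": "a red fox in snow", "img_url": "1/2/fox.png"},
    {"text": "red sports car at night", "img_url": "3/4/car.png"},
    {"text": "snowy mountain village", "img_url": "5/6/village.png"},
]
index = build_index(rows)
results = search("red fox", rows, index)
```

The returned values are still partial paths, so they need the base URL attached before an application can display the images.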

Coverage

The data spans the year 2022. Its geographic scope is global. While the primary dataset includes re-runs, which might show similar images, a dedicated reduced version is available that filters these out, potentially offering a cleaner set of unique prompt-image pairs. The raw version should be approached with caution as it contains uncleaned data.

License

The dataset is free to use. A specific license URL is not available in the provided materials.

Who Can Use It

This dataset is beneficial for a wide range of users, including:
  • Data scientists and analysts: For exploring and modelling generative AI data.
  • Machine learning engineers: For training and testing models in areas like computer vision, NLP, and recommender systems.
  • Researchers: Studying AI prompt engineering, image generation, and user behaviour in creative AI platforms.
  • Developers: Building applications that leverage text-to-image search or prompt analysis.
  • AI art enthusiasts: Curious about the prompts behind AI-generated images.

Dataset Name Suggestions

  • Midjourney Prompts and Images 2022
  • AI Image Generation Prompt Data
  • Midjourney AI Art Prompts (2022)
  • Text-to-Image Prompts Archive

Attributes

Original Data Source: Midjourney 2022 - 250k [CSV]

Dataset Category Suggestions

  • AI & ML Data
  • Computer Vision
  • Natural Language Processing
  • Generative AI
  • Data Science
  • Programming

Dataset SEO Keyword Suggestions

Midjourney, Prompts, AI, Images, Generative

Listing Stats

VIEWS

4

DOWNLOADS

0

LISTED

05/06/2025

REGION

GLOBAL

Universal Data Quality Score (UDQS)

5 / 5

VERSION

1.0
