Midjourney AI Art Prompts from 2022 Dataset
Data Science and Analytics
About
This dataset contains a collection of Midjourney user prompts and their corresponding generated image URLs from 2022. It has been reformatted from a previous "Midjourney User Prompts & Generated Images" dataset, making it particularly well-suited for text search applications designed to display associated images. The dataset offers multiple versions to cater to different analytical needs: a main version includes re-runs that may result in similar image outputs, while a reduced version excludes re-runs, though it might contain duplicate text with differing arguments. A raw version is also available but is generally not recommended due to the inclusion of errors, chat, and server messages.
Columns
- timestamp: The precise date and time the message was recorded.
- _message: The original message content from the user, which may include commands, arguments, and other textual elements.
- thumb_url: A URL for a thumbnail image, which is another form of an image URL.
- img_url: The proxy URL for the generated image. This path requires a prefix of either https://cdn.discordapp.com/attachments/ or https://media.discordapp.net/attachments/ to form a complete, usable image URL.
- cmd: The extracted command portion from the _message field.
- job_id: A unique 36-character hexadecimal identifier for the specific Midjourney generation task.
- text: The cleaned text of the /imagine command, specifically excluding any arguments or input URLs.
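Re-attaching the prefix to an img_url value can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's tooling: the helper name full_image_url and the sample path are hypothetical, and the choice of the cdn.discordapp.com prefix as the default is an assumption (either documented prefix should work).

```python
# The two base URLs named in the column description.
CDN_PREFIX = "https://cdn.discordapp.com/attachments/"
MEDIA_PREFIX = "https://media.discordapp.net/attachments/"


def full_image_url(partial: str, prefix: str = CDN_PREFIX) -> str:
    """Re-attach a base URL to a shortened img_url value (hypothetical helper)."""
    partial = partial.strip()
    # Values that are already full URLs are returned unchanged.
    if partial.startswith(("http://", "https://")):
        return partial
    # Drop a leading slash so the result has no double slash.
    return prefix + partial.lstrip("/")


# Hypothetical shortened path, for illustration only.
example = full_image_url("123/456/example.png")
```

Whether thumb_url needs the same treatment is not stated in the description, so check a few rows before applying the helper to that column.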
Distribution
The dataset is provided in CSV format and is split into several files with varying row counts and characteristics:
- midjourney_2022_250k_raw.csv: Contains approximately 251,390 rows, including raw messages with commands and potentially unwanted content. URLs in this file are in full length.
- midjourney_2022_250k.csv: Contains approximately 248,069 rows. This version is suitable for text search and includes re-runs, which might lead to similar output images. URLs are shortened to conserve memory.
- midjourney_2022_reduced.csv: Contains approximately 130,407 rows. This version excludes re-runs but may feature duplicate text entries associated with different arguments. URLs are also shortened.
The URLs in midjourney_2022_250k.csv and midjourney_2022_reduced.csv are partial and need the base URL (https://cdn.discordapp.com/attachments/ or https://media.discordapp.net/attachments/) re-attached to be fully functional.
Usage
This dataset is ideally suited for data science and analytics projects focused on:
- Developing text search functionalities to retrieve images based on descriptive prompts.
- Analysing trends in user prompts and their impact on generated imagery.
- Training or evaluating AI models related to natural language processing (NLP), such as prompt engineering or text-to-image synthesis.
- Exploring the relationship between textual prompts and visual outputs in generative AI systems.
- Building recommender systems for creative content.
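As a rough illustration of the text search use case, the sketch below matches query words against the text column and returns the prompt together with a rebuilt image URL. It uses only the standard library. The function name search_prompts and the sample rows are made up for the example, and plain substring matching is an assumption chosen for brevity, not a recommended ranking method.

```python
import csv
import io

# One of the two documented base URLs for shortened img_url values.
CDN_PREFIX = "https://cdn.discordapp.com/attachments/"


def search_prompts(rows, query, limit=5):
    """Return (text, full image URL) pairs whose prompt contains every query word."""
    words = query.lower().split()
    hits = []
    for row in rows:
        text = row.get("text") or ""
        lowered = text.lower()
        # Case-insensitive substring match on each query word.
        if all(word in lowered for word in words):
            hits.append((text, CDN_PREFIX + row["img_url"]))
            if len(hits) >= limit:
                break
    return hits


# Made-up rows standing in for the real CSV; only two columns are shown.
SAMPLE_CSV = """text,img_url
a red fox in the snow,1/2/fox.png
castle on a hill at sunset,3/4/castle.png
red castle in the clouds,5/6/red_castle.png
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
results = search_prompts(rows, "red castle")
```

With a downloaded file, the same function should work on csv.DictReader(open("midjourney_2022_reduced.csv", newline="", encoding="utf-8")), assuming the file is UTF-8 encoded and uses the column names listed above.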
Coverage
The data spans the year 2022. Its geographic scope is global. While the primary dataset includes re-runs, which might show similar images, a dedicated reduced version is available that filters these out, potentially offering a cleaner set of unique prompt-image pairs. The raw version should be approached with caution as it contains uncleaned data.
License
The dataset is free to use. No specific license URL is provided.
Who Can Use It
This dataset is beneficial for a wide range of users, including:
- Data scientists and analysts: For exploring and modelling generative AI data.
- Machine learning engineers: For training and testing models in areas like computer vision, NLP, and recommender systems.
- Researchers: Studying AI prompt engineering, image generation, and user behaviour in creative AI platforms.
- Developers: Building applications that leverage text-to-image search or prompt analysis.
- AI art enthusiasts: Curious about the prompts behind AI-generated images.
Attributes
Original Data Source: Midjourney 2022 - 250k [CSV]
Dataset Category Suggestions
- AI & ML Data
- Computer Vision
- Natural Language Processing
- Generative AI
- Data Science
- Programming