Star Wars Species and Planet Dataset

News & Media Articles

Tags and Keywords

Star, Wars, Characters, Homeworld, Prediction, Movies

Free

About

This dataset collects attributes of Star Wars characters, with the primary aim of supporting prediction of a character's homeworld. Drawn from George Lucas's universe, it covers a range of species, including droids, humans, and numerous alien life forms. It can be used to explore how character traits relate to planetary origin within the Star Wars galaxy.

Columns

  • name: The character's name. This column has 87 unique values and no missing data.
  • height: The character's height, measured in centimetres. Approximately 7% of entries in this column are missing.
  • mass: The character's mass, measured in kilograms. Around 32% of the data in this column is missing.
  • hair_color: The character's hair colour, when known. This column contains 12 unique colours and has approximately 6% missing entries.
  • skin_color: The character's skin colour, when known. There are 31 unique skin colours recorded.
  • eye_color: The character's eye colour, when known. This column features 15 unique eye colours.
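  • birth_year: The character's year of birth. The listing does not state the calendar; in Star Wars data of this kind the value is usually given in years BBY (Before the Battle of Yavin), so confirm this before interpreting it. Roughly 51% of this data is missing.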
  • sex: The character's sex, if recorded. This column has 4 unique categories and about 5% missing entries.
  • gender: The character's gender, if recorded. There are 2 unique gender categories, with approximately 5% of the data missing.
  • homeworld: The planet or world that the character calls home. This column has 48 unique homeworlds and around 11% missing entries.
  • species: The character's species. There are 37 unique species represented, with about 5% of the data missing.
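  • films: The titles of the films in which the character appears. This column has 24 unique values; because a character can appear in several films, these are likely distinct combinations of titles rather than 24 separate films.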
  • vehicles: Any vehicles associated with the character. Approximately 87% of this data is missing, with 10 unique vehicle types recorded.
  • starships: Any starships associated with the character. Around 77% of this data is missing, with 16 unique starship types recorded.
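
The missing-data percentages listed above can be re-checked after download with a few lines of standard-library Python. This is a minimal sketch, not the publisher's method: the inline sample rows are illustrative stand-ins for the real 87-row file, and the assumption that missing values appear as empty strings or "NA" should be verified against the actual CSV.

```python
import csv
import io

# Hypothetical three-row sample in the column layout described above;
# the real file has 87 rows and 14 columns.
SAMPLE_CSV = """name,height,mass,homeworld,species
Luke Skywalker,172,77,Tatooine,Human
C-3PO,167,75,Tatooine,Droid
Example Character,NA,,NA,Human
"""

# Assumption: missing values are encoded as empty strings or "NA".
MISSING_MARKERS = {"", "NA"}


def missing_share(rows, columns):
    """Return the fraction of missing entries for each column."""
    total = len(rows)
    return {
        col: sum(1 for row in rows if (row[col] or "").strip() in MISSING_MARKERS) / total
        for col in columns
    }


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
shares = missing_share(rows, reader.fieldnames)
for col, share in shares.items():
    print(f"{col}: {share:.0%} missing")
```

To run this against the downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.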

Distribution

The dataset is typically provided as a CSV file containing 87 records and 14 columns. The total file size is approximately 10.83 kB.

Usage

This dataset is ideally suited for machine learning projects, particularly classification tasks aimed at predicting a character's homeworld based on their attributes. It is also excellent for exploratory data analysis (EDA), allowing users to uncover patterns, distributions, and correlations within the Star Wars character data. It can be employed for academic research into character demographics or for developing interactive Star Wars-themed applications.
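
Before training a model for the homeworld classification task, it helps to have a simple reference point. The sketch below is one possible baseline, not a method prescribed by the dataset: it predicts the most common homeworld for each species and falls back to the overall most common homeworld for unseen species. The training pairs and the function names `fit_baseline` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (species, homeworld) pairs for illustration only.
train = [
    ("Human", "Tatooine"),
    ("Human", "Tatooine"),
    ("Human", "Alderaan"),
    ("Droid", "Tatooine"),
    ("Wookiee", "Kashyyyk"),
]


def fit_baseline(pairs):
    """Learn the most common homeworld per species, plus a global fallback."""
    by_species = defaultdict(Counter)
    overall = Counter()
    for species, homeworld in pairs:
        by_species[species][homeworld] += 1
        overall[homeworld] += 1
    lookup = {s: c.most_common(1)[0][0] for s, c in by_species.items()}
    fallback = overall.most_common(1)[0][0]
    return lookup, fallback


def predict(species, lookup, fallback):
    """Predict a homeworld; unseen species get the overall most common one."""
    return lookup.get(species, fallback)


lookup, fallback = fit_baseline(train)
print(predict("Wookiee", lookup, fallback))
print(predict("Gungan", lookup, fallback))
```

With 48 homeworlds spread over only 87 records, many classes have very few examples, so a baseline like this gives a useful floor against which to compare more elaborate classifiers.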

Coverage

The dataset covers Star Wars characters from homeworlds across the galaxy. Recorded birth years range from 8 to 896. Demographically, it includes multiple species, such as humans, droids, and various alien forms, with different sexes and genders represented. Note that several attributes, namely mass, birth year, vehicles, and starships, have a substantial proportion of missing data.
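
Because roughly half of the birth-year values are missing, any range check has to skip the gaps first. The helper below is a small sketch under the assumption that missing values appear as empty strings or "NA"; the function name `numeric_range` and the sample values are hypothetical.

```python
def numeric_range(values, missing=("", "NA")):
    """Return (min, max) over parseable numbers, skipping missing markers."""
    numbers = [float(v) for v in values if v.strip() not in missing]
    if not numbers:
        return None  # nothing usable in this column
    return min(numbers), max(numbers)


# Hypothetical raw column values, as strings read from a CSV.
birth_years = ["19", "NA", "896", "", "8"]
print(numeric_range(birth_years))
```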

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Beginner data scientists and analysts: To practice data preparation, feature engineering, and applying basic classification algorithms.
  • Machine learning practitioners: For developing and testing models that predict categorical outcomes.
  • Star Wars enthusiasts and academic researchers: For character studies, inter-species relationships, or the geographical spread of life within the Star Wars universe.
  • Educators: As a practical, engaging dataset for teaching data science principles and programming.

Dataset Name Suggestions

  • Star Wars Character Homeworld Predictor
  • Galactic Character Attribute Data
  • Star Wars Species and Planet Dataset
  • Homeworld Prediction from Star Wars Traits

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 22/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0