Wikia Comic Character Archive

Data Science and Analytics

Tags and Keywords

Comics

Marvel

Dc

Characters

Demographics

Wikia Comic Character Archive Dataset on Opendatabay data marketplace

Free

About

This dataset contains detailed information on comic book characters sourced from Marvel Wikia and DC Wikia. It was originally collected to analyze demographic trends in character creation within the comic industry, particularly with respect to gender, alignment, and physical attributes. It includes key character identifiers, physical attributes such as eye and hair color, status (living or deceased), and first appearance dates. Character appearance counts are as recorded on September 2, 2014.

Columns

The dataset is split into two files. The columns below describe dc-wikia-data.csv, which has 13 columns and 6,896 records:
  • page_id: The unique identifier for that character's page within the wikia. This field is 100% valid, with a mean value of 147,000.
  • name: The name of the character. This field is 100% valid, with 6,896 unique values.
  • urlslug: The unique URL component for the character within the wikia. This field is 100% valid.
  • ID: The identity status of the character (e.g., Public Identity, Secret Identity). This field is 71% valid, with Public Identity accounting for 36% of entries.
  • ALIGN: Indicates the character's moral alignment (Good, Bad, or Neutral). This field is 91% valid, with Bad Characters and Good Characters making up the majority of entries (42% and 41%, respectively).
  • EYE: The eye color of the character. This field is 47% valid, missing 53% of records, with Blue Eyes being the most common color (16%).
  • HAIR: The hair color of the character. This field is 67% valid, with Black Hair being the most common color (23%).
  • SEX: The character's sex (e.g., Male or Female). Male Characters account for 69% of the data. This field is 98% valid.
  • GSM: Indicates if the character belongs to a gender or sexual minority. This field is highly incomplete, with 99% of data missing.
  • ALIVE: Status indicating if the character is Living (75%) or Deceased (25%). This field is nearly 100% valid.
  • APPEARANCES: The total count of the character's appearances in comic books. Values range up to 3,093. This field is 95% valid.
  • FIRST APPEARANCE: The month and year of the character's first appearance in a comic book. This field is 99% valid.
  • YEAR: The year of the first appearance, ranging from 1935 to 2013. This field is 99% valid.
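The "valid" percentages above can be approximated by counting non-empty values per column. The sketch below assumes that "valid" simply means "not blank", which the listing does not define, and it runs on a few hypothetical rows rather than the real file. The helper name valid_share is illustrative.

```python
import csv
import io

# Hypothetical sample rows that mimic a subset of the columns described above.
# The values are illustrative and are not taken from the real file.
SAMPLE_CSV = """page_id,name,ALIGN,EYE,GSM
1,Character A,Good Characters,Blue Eyes,
2,Character B,Bad Characters,,
3,Character C,,Brown Eyes,
4,Character D,Good Characters,,
"""


def valid_share(rows, columns):
    """Return the percentage of non-empty values for each column."""
    rows = list(rows)
    total = len(rows)
    shares = {}
    for col in columns:
        # Assumption: a value counts as valid when it is not blank.
        filled = sum(1 for row in rows if (row.get(col) or "").strip())
        shares[col] = 100.0 * filled / total if total else 0.0
    return shares


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
shares = valid_share(reader, reader.fieldnames)
for col, pct in shares.items():
    print(f"{col}: {pct:.0f}% valid")
```

To run this against the real data, replace the io.StringIO source with an opened dc-wikia-data.csv file handle.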

Distribution

The material is organized into two files: dc-wikia-data.csv (1.11 MB) and marvel-wikia-data.csv. The DC file contains 6,896 records. Identifying fields such as name and page_id are 100% valid. However, the EYE field (53% missing) and the GSM field (99% missing) have substantial gaps. The data reflects a snapshot collected in 2014. The expected update frequency is annual.

Usage

This resource is suitable for analyzing character demographics, investigating alignment statistics (Good versus Bad), and studying how character attributes change over time using the YEAR field. It also enables comparative analysis between the two major comic universes, Marvel and DC.
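As one way to approach the alignment-over-time analysis, the sketch below groups characters by decade of first appearance and counts alignment labels. It uses hypothetical rows, and the function name alignment_by_decade is illustrative. The column names are parameters because the listing only documents the DC file's headers, so the Marvel file's headers should be checked before reusing the defaults.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical rows using the DC column names ALIGN and YEAR.
# The values are illustrative and are not taken from the real file.
DC_SAMPLE = """name,ALIGN,YEAR
Character A,Good Characters,1939
Character B,Bad Characters,1940
Character C,Good Characters,1987
Character D,,1987
"""


def alignment_by_decade(rows, align_col="ALIGN", year_col="YEAR"):
    """Count alignment labels per decade of first appearance."""
    counts = defaultdict(Counter)
    for row in rows:
        year = (row.get(year_col) or "").strip()
        align = (row.get(align_col) or "").strip()
        if not year or not align:
            continue  # skip rows with a missing year or alignment
        # float() tolerates years stored as "1939.0" as well as "1939".
        decade = int(float(year)) // 10 * 10
        counts[decade][align] += 1
    return counts


result = alignment_by_decade(csv.DictReader(io.StringIO(DC_SAMPLE)))
for decade in sorted(result):
    print(decade, dict(result[decade]))
```

Running the same function on each file separately gives per-universe tables that can then be compared decade by decade.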

Coverage

The scope covers characters derived from the Marvel and DC comic universes. The temporal span of character introductions ranges from 1935 up to 2013. The content includes biographical details, alignment, physical attributes, and appearance counts.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

The dataset is intended for researchers studying media representation, comic book historians, and data scientists interested in text and demographic analysis of fictional content.

Dataset Name Suggestions

  • Marvel and DC Character Demographics
  • Wikia Comic Character Archive
  • Comic Book Character Attributes

Attributes

Original Data Source: Wikia Comic Character Archive

Listing Stats

VIEWS

2

DOWNLOADS

0

LISTED

18/12/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
