
Animal Lifestyle and Traits Dataset

Data Science and Analytics

Tags and Keywords

Feline, Cats, Animal, Breeds, Data


Free

About

This dataset is an animal-focused collection designed for various analytical and machine learning tasks, including classification, regression, clustering, visualisation, and exploratory data analysis. It contains approximately 1000 records featuring data on three distinct cat breeds: Maine Coon, Ragdoll, and Angora. The information includes details such as breed, age, gender, body length, weight, fur colour and pattern, eye colour, sleeping and playing times, and geographical location (country, latitude, and longitude). The data was artificially generated and is available in both clean and dirty versions, making it suitable for data cleaning exercises.

Columns

  • Breed: Specifies the cat's breed (Ragdoll, Maine Coon, or Angora). Ragdoll accounts for 41% of the records and Maine Coon for 32%; the remaining 27% is reported as "other", which is presumably Angora given the three breeds listed.
  • Age_in_years: The cat's age, ranging from 0.08 to 11.3 years, with a mean of 4.85 years.
  • Age_in_months: The cat's age in months, spanning 1 to 135 months, with a mean of 58.1 months.
  • Gender: Indicates if the cat is male or female, with an equal distribution of 50% each.
  • Neutered_or_spayed: A boolean field showing whether the cat has been neutered or spayed (58% true, 42% false).
  • Body_length: The body length of the cat, ranging from 10.00 to 102.00, with a mean of 44 (units are not stated).
  • Weight: The cat's weight, ranging from 0.50 to 12.10, with a mean of 5.49 (units are not stated).
  • Fur_colour_dominant: The primary fur colour, with 'seal' being most common at 28% and 'white' at 25%.
  • Fur_pattern: Describes the fur pattern, with 'solid' being the most prevalent at 43% and 'colorpoint' at 32%.
  • Eye_colour: The colour of the cat's eyes, with 'blue' being most common at 51% and 'yellow' at 24%.
  • Allowed_outdoor: A boolean field indicating if the cat is allowed outdoors (9% true, 91% false).
  • Preferred_food: The cat's preferred food type, either 'wet' (70%) or 'dry' (30%).
  • Owner_play_time_minutes: The amount of time in minutes the owner plays with the cat, ranging from 0 to 60 minutes, with a mean of 23 minutes.
  • Sleep_time_hours: The cat's sleep duration in hours, ranging from 8 to 22 hours, with a mean of 15.9 hours.
  • Country: The country of residence, with USA (62%) and UK (13%) being the most frequent.
  • Latitude: The geographical latitude of the cat's location, ranging from 37.8 to 53.8, with a mean of 44.4.
  • Longitude: The geographical longitude of the cat's location, ranging from -123 to 13.4, with a mean of -60.2.
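
Because Age_in_years and Age_in_months describe the same quantity, the two columns can be cross-checked against each other. The following is a minimal sketch of such a check, assuming each record is available as a dictionary keyed by the column names above. The function name `age_mismatches`, the one-month tolerance, and the sample rows are illustrative assumptions and are not taken from the dataset.

```python
def age_mismatches(rows, tolerance_months=1.0):
    """Return the rows whose Age_in_months disagrees with Age_in_years * 12.

    The tolerance is an assumption: it allows for rounding of the years
    column to two decimals and of the months column to whole months.
    """
    mismatched = []
    for row in rows:
        years = float(row["Age_in_years"])
        months = float(row["Age_in_months"])
        if abs(years * 12 - months) > tolerance_months:
            mismatched.append(row)
    return mismatched


# Made-up illustrative rows, not taken from the dataset.
sample = [
    {"Age_in_years": "0.08", "Age_in_months": "1"},
    {"Age_in_years": "4.85", "Age_in_months": "58"},
    {"Age_in_years": "2.0", "Age_in_months": "40"},  # inconsistent on purpose
]

mismatched = age_mismatches(sample)
print(len(mismatched), "inconsistent row(s)")
```

A check like this is mainly of interest for the dirty version, where the two age columns may not agree.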

Distribution

The dataset is provided in CSV format and comprises approximately 1000 records. There are two versions: a clean version and a dirty version, the latter provided specifically for data cleaning practice. The clean version, cat_breeds_clean.csv, is 104.39 kB in size and contains 1071 valid records across its 17 columns.
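
A quick way to confirm the stated record and column counts is to read the file with Python's standard `csv` module. The sketch below assumes a comma-delimited file with a header row; the helper name `summarise_csv` is hypothetical, and the inline sample text is made up so that the example runs without the download.

```python
import csv
import io


def summarise_csv(handle):
    """Return (number of data rows, number of columns) for a CSV file object."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    columns = reader.fieldnames or []
    return len(rows), len(columns)


# Made-up two-row sample, used only so the sketch is self-contained.
sample_text = "Breed,Weight\nRagdoll,5.1\nAngora,3.9\n"
n_rows, n_cols = summarise_csv(io.StringIO(sample_text))
print(n_rows, n_cols)

# With the real file (UTF-8 encoding is an assumption):
# with open("cat_breeds_clean.csv", newline="", encoding="utf-8") as f:
#     print(summarise_csv(f))  # the listing states 1071 records, 17 columns
```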

Usage

This dataset is ideally suited for:
  • Machine Learning: Enabling tasks such as classification (e.g., breed identification), regression (e.g., predicting weight or age), and clustering.
  • Exploratory Data Analysis (EDA): For gaining insights into cat characteristics and behaviours.
  • Data Visualisation: To graphically represent trends and patterns within the feline data.
  • Data Cleaning: The 'dirty' version offers a practical challenge for refining data preprocessing skills.
  • Geospatial Analysis: Utilising the country, latitude, and longitude data to explore geographical distributions.
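
As a starting point for the data cleaning use case, numeric fields can be checked against the ranges documented in the Columns section. The listing does not describe what kinds of defects the dirty version contains, so the sketch below assumes three common ones: missing values, non-numeric text, and out-of-range numbers. The names `DOCUMENTED_RANGES` and `find_issues` are hypothetical.

```python
# Ranges copied from the column descriptions of the clean version.
DOCUMENTED_RANGES = {
    "Body_length": (10.0, 102.0),
    "Weight": (0.50, 12.10),
    "Owner_play_time_minutes": (0.0, 60.0),
    "Sleep_time_hours": (8.0, 22.0),
}


def find_issues(row):
    """List the problems found in one record (a dict of column -> string)."""
    issues = []
    for column, (low, high) in DOCUMENTED_RANGES.items():
        raw = (row.get(column) or "").strip()
        if not raw:
            issues.append(f"{column}: missing")
            continue
        try:
            value = float(raw)
        except ValueError:
            issues.append(f"{column}: not numeric")
            continue
        if not low <= value <= high:
            issues.append(f"{column}: out of range")
    return issues


# Made-up record with deliberate defects, not taken from the dataset.
dirty_row = {
    "Body_length": "44",
    "Weight": "45.0",
    "Owner_play_time_minutes": "",
    "Sleep_time_hours": "n/a",
}
for issue in find_issues(dirty_row):
    print(issue)
```

Flagging rather than silently correcting keeps the decision about each defect (drop, impute, or fix) with the analyst.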

Coverage

  • Geographic Scope: The dataset includes data from various countries, predominantly the USA (62%) and UK (13%), with corresponding latitude and longitude values provided.
  • Demographic Scope: It covers three specific cat breeds: Maine Coon, Ragdoll, and Angora. Information on gender (male and female) and neutered/spayed status is also included.
  • Time Range: No time range for data collection is given. Individual cat ages range from 0.08 to 11.3 years.
  • Data Availability: The data is artificially generated, which provides a controlled environment for analytical tasks. Gender is evenly balanced (50% each), but other attributes are not: for example, only 9% of cats are allowed outdoors.
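
For the geographic scope described above, the Latitude and Longitude columns can be used to compute distances between records. The following is a minimal sketch of the standard haversine great-circle formula, assuming coordinates in decimal degrees and a spherical Earth with a mean radius of 6371 km. The function name `haversine_km` is hypothetical, and the example points are simply the extremes of the documented coordinate ranges, not actual records.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Distance between the extremes of the documented latitude/longitude ranges.
span_km = haversine_km(37.8, -123.0, 53.8, 13.4)
print(round(span_km), "km")
```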

License

CC0: Public Domain

Who Can Use It

This dataset is beneficial for a wide range of users, including:
  • Data Scientists and Machine Learning Engineers for building and testing models (e.g., predicting breed or weight from the other attributes; note that the dataset has no health column).
  • Data Analysts and Researchers for exploring animal behaviours, characteristics, and distributions.
  • Students and Beginners in data science for learning data cleaning, EDA, visualisation, and fundamental ML techniques.
  • Anyone interested in geospatial studies related to animal populations or characteristics.

Dataset Name Suggestions

  • Feline Breed Characteristics Dataset
  • Domestic Cat Attributes for ML
  • Artificially Generated Cat Data
  • Animal Lifestyle and Traits Dataset
  • Cat Population Metrics

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 22/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
